China has proposed a set of rules for generative AI training data to promote the development of safe AI tools for businesses, according to a notice by the Cyberspace Administration of China (CAC). 

These rules, if passed, seek to control the training data that is used to create generative AI tools using large language models (LLMs). LLMs are trained to answer prompts by analysing the syntax and language of data sets. 

The CAC states that any content involving violence or terrorism, or data that could disrupt national unity, would be banned from use in the training of LLMs. 

In its reasoning, the CAC stated that these rules aim to proactively defend against the potential risks LLMs could bring about if not created ethically. Specifically, it cites data security and copyright as top concerns. 

Speaking on these potential rules, GlobalData principal analyst Laura Petrone gave a wider view of Chinese AI regulations. 

“Given the broad scope of information and data covered,” she begins, “China’s AI regulations are shaping up to be among the most stringent and detailed globally.” 


Petrone stated that this may be because China has “a lot at stake” in the content of the data used to train LLMs, as it seeks to guard against the responses those models may generate. 

“[The CAC] must ensure that all information produced aligns with Chinese Communist Party ideology and doesn’t threaten China’s national stability and security,” she stated. 

Petrone described China as a key “strategic player” in AI globally and believes that, despite its distinct political system, the deployment and regulation of AI within China could set the tone for other countries. 

“Alongside Europe, China is poised to set important standards on AI regulation that will be observed closely worldwide,” she concluded. 

The training data currently used within LLMs, including those made by Western companies, has come under scrutiny over a lack of transparency.

Writing for the Harvard Business Review, Reid Blackman and Beena Ammanath stated that data transparency can help mitigate algorithmic bias within LLMs.

Despite this, balancing transparency with content control will remain a defining theme for AI software developers within China.