China proposes rules for generative AI training data

China remains a key player in global AI regulation, according to GlobalData analyst Laura Petrone.

China has proposed a set of rules for generative AI training data to promote the development of safe AI tools for businesses, according to a notice by the Cyberspace Administration of China (CAC).

If passed, the rules would control the training data used to create generative AI tools built on large language models (LLMs). LLMs are trained to answer prompts by analysing the syntax and language of data sets.

The CAC states that any content involving violence or terrorism, or data that could disrupt national unity, would be banned from use in training LLMs.

Explaining its reasoning, the CAC stated that the rules aim to proactively defend against the risks that LLMs could pose if they are not created ethically. Specifically, it cites data security and copyright as top concerns.

Speaking on these potential rules, GlobalData principal analyst Laura Petrone gave a wider view of Chinese AI regulations.

“Given the broad scope of information and data covered,” she begins, “China’s AI regulations are shaping up to be among the most stringent and detailed globally.”

Petrone stated that this may be because China has “a lot at stake” in the data that goes into training LLMs, which it wants to control in order to guard against the responses that may be generated.

“[The CAC] must ensure that all information produced aligns with Chinese Communist Party ideology and doesn’t threaten China’s national stability and security,” she stated.

Petrone described China as a key “strategic player” in AI globally and believed that, despite its own political system, the deployment and regulation of AI within China could set the tone for other countries.

“Alongside Europe, China is poised to set important standards on AI regulation that will be observed closely worldwide,” she concluded.

The training data currently used within LLMs, including those made by Western companies, has come under scrutiny over a lack of transparency.

Writing for the Harvard Business Review, Reid Blackman and Beena Ammanath stated that data transparency can help mitigate algorithmic bias within LLMs.

Despite this, balancing transparency with content control will be a theme for AI software developers within China.
