The technology industry continues to be a hotbed of patent innovation. Activity is driven by growth in the market for cloud-enabled synthetic data, which reflects increasing demand for large-scale and diverse datasets, the need for privacy-preserving data generation, and the requirement for cost-effective and scalable data solutions. The growing importance of technologies such as generative models, machine learning algorithms, and cloud computing infrastructure is further driving innovation in the industry. These technologies enable the creation of high-quality synthetic data that closely resembles real data, facilitating data-driven innovation while addressing privacy and data protection concerns. In the last three years alone, over 4.1 million patents have been filed and granted in the technology industry, according to GlobalData’s report on Big data in technology: synthetic data.

According to GlobalData’s Technology Foresights, which uses over 1.5 million patents to analyze innovation intensity for the technology industry, there are 190+ innovation areas that will shape the future of the industry.

Synthetic data is a key innovation area in big data

Synthetic data refers to data that is generated artificially through computer programs or algorithms. Although not real, it serves the purpose of simulating real-world scenarios and situations for analysis and testing. Synthetic data finds application in diverse fields, including machine learning, artificial intelligence, and data visualization, enabling researchers and developers to explore and experiment with data-driven techniques in a controlled environment.

GlobalData’s analysis also uncovers the companies at the forefront of each innovation area and assesses the potential reach and impact of their patenting activity across different applications and geographies. According to GlobalData, there are 840 companies, spanning technology vendors, established technology companies, and up-and-coming start-ups, engaged in the development and application of synthetic data.

Key players in synthetic data – a disruptive innovation in the technology industry

‘Application diversity’ measures the number of applications identified for each patent. It broadly splits companies into either ‘niche’ or ‘diversified’ innovators.

‘Geographic reach’ refers to the number of countries each patent is registered in. It reflects the breadth of geographic application intended, ranging from ‘global’ to ‘local’.

Patent volumes related to synthetic data

Company                          Total patents (2010 - 2022)
eBay                             70
Fujifilm                         70
Microsoft                        748
Siemens                          344
Apple                            241
Alibaba Group                    102
Salesforce                       114
Hitachi                          92
SAP                              100
Bank of America                  144
Meta Platforms                   376
General Electric                 56
Adobe                            153
Koninklijke Philips              92
Mitsubishi Electric              70
Factual                          63
Sony Group                       105
Tencent                          141
Nippon Telegraph and Telephone   72
IBM                              2340
Wipro                            51
Toshiba                          70
NEC                              251
Intel                            133
Digital Global Systems           80
MondayCom                        66
Cortica                          57
(company name not shown)         186
Qomplx                           53
Primal Fusion                    65

Source: GlobalData Patent Analytics

IBM is the leading patent filer in the field of synthetic data. One of the company’s patents describes a system, method, and computer program for evaluating an analogical pattern. The approach uses natural language processing (NLP) to identify analogical pattern terms, refines them through deep analysis and semantic analysis to create metadata, generates interpretations of the pattern, and scores each interpretation, selecting the best one that surpasses a predetermined threshold.

In terms of geographic reach, Factual leads the pack. In terms of application diversity, Factual also ranks among the leading companies.

Big data innovation in synthetic data has transformed the way data is generated and utilized for various applications. Synthetic data refers to artificially generated data that mimics the characteristics and statistical properties of real-world data. It is used for tasks such as data augmentation, model training, and testing, without compromising the privacy and security of sensitive information.  
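To make the idea of mimicking statistical properties concrete, here is a minimal Python sketch. It is an illustration only, not a method described in GlobalData's report: it assumes a small, hypothetical table of "real" records (age and income), fits an independent Gaussian to each column, and samples new rows from those fits. The function names and the data are invented for this example.

```python
import random
import statistics


def fit_gaussian(column):
    """Return the (mean, sample standard deviation) of a numeric column."""
    return statistics.mean(column), statistics.stdev(column)


def synthesize(real_rows, n_samples, seed=0):
    """Draw synthetic rows from independent Gaussians fitted per column.

    Assumption: columns are treated as independent, so correlations
    between columns in the real data are not preserved.
    """
    rng = random.Random(seed)
    columns = list(zip(*real_rows))
    params = [fit_gaussian(col) for col in columns]
    return [
        tuple(rng.gauss(mu, sigma) for mu, sigma in params)
        for _ in range(n_samples)
    ]


# Hypothetical "real" records: (age in years, annual income in thousands).
real_rows = [(34, 52.0), (45, 61.5), (29, 48.2), (52, 75.0), (41, 58.3)]
synthetic_rows = synthesize(real_rows, n_samples=1000)

real_mean_age = statistics.mean(r[0] for r in real_rows)
synthetic_mean_age = statistics.mean(r[0] for r in synthetic_rows)
print(f"real mean age: {real_mean_age:.1f}")
print(f"synthetic mean age: {synthetic_mean_age:.1f}")
```

No synthetic row copies a real record, yet the per-column averages stay close to the originals. Production systems typically rely on richer generative models that also capture relationships between columns, and sampling alone does not amount to a formal privacy guarantee.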

To further understand the key themes and technologies disrupting the technology industry, access GlobalData’s latest thematic research report on Big Data.

GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData’s Patent Analytics tracks patent filings and grants from official offices around the world. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.