Because AI and ML are so effective at analyzing big data, large medical datasets are in high demand and can cost millions of dollars to buy.

During the Covid-19 pandemic, artificial intelligence (AI) and machine learning (ML) algorithms have become increasingly popular in the healthcare industry. AI has proved helpful in interpreting and analyzing medical images, such as mammograms and brain scans, much faster than a human can.

According to GlobalData forecasts, the market for AI platforms across the entire healthcare industry will reach $4.3B by 2024, up from $1.5B in 2019. This growth will be driven by the use of AI among healthcare providers and payers, a segment forecast to reach $2.9B by 2024.

A global repository of datasets

Recognizing the importance of public datasets containing medical information, Stanford’s Center for Artificial Intelligence in Medicine and Imaging (AIMI) has significantly enlarged its free repository of medical imaging datasets that can be used to train AI-powered algorithms. Stanford’s AIMI repository was already considered one of the largest free collections of its kind. Now that AIMI has partnered with Microsoft’s AI for Health program, the updated datasets will be larger, more accessible, and able to host images from institutions all over the world. Developing the platform into a global repository will create opportunities to share research and improve models.

With access to a large, refined medical dataset, researchers will be able to explore and find solutions to medical problems that affect small communities and are therefore often overlooked by big companies. A broader dataset also matters for another reason: while AI/ML-powered algorithms are incredibly powerful tools, they cannot escape the negative effects of underlying bias in their data, which arises when they are trained on datasets that cover only certain locations or demographics.

Because Stanford’s AIMI repository includes data from different populations, medical professionals and computer scientists will be able to expose these issues.