Today’s digital economy is powered by big data. It is produced in abundance by both individuals and enterprises and stored in vast data centres, some of which cover hundreds of thousands of square feet. Several prominent business people and a number of leading publications have described data as the new oil, capable of generating significant value if used in the right way.
Leading trends in big data
Listed below are the leading trends in big data, as identified by GlobalData.
Many big data vendors have had to contend with a growing market perception that data governance, security, and management have taken a back seat to accessibility and speed. In response, most companies are now accepting the challenge and openly prioritising data governance. This is expected to result in multiple disparate solutions being replaced by single data management platforms, leading to efficient scalability, collection, and distribution of data.
The transformative value of data-driven business insights has led to market demands that data be made available to the widest applicable base of users, enabling them to draw insights through self-service analytics models. This driver began with an emphasis on data consumers and has now expanded to target producers with new tools supporting data analysis and the creation of visualisations. This trend has already begun to transform the publishing industry.
Owing to market demands for data democratisation, enterprise buyers now need data integration and preparation tools that can retain access to disparate data sources without sacrificing data quality or security. Smart data integration tools enabled by machine learning and artificial intelligence (AI), such as SnapLogic’s Intelligent Integration platform, can replace extract, transform, load (ETL) processes and recommend the best solutions to an organisation’s data scientists.
AI for data quality
One benefit of AI is that it can improve data quality, an improvement that any analytics-driven organisation needs. The proliferation of personal, public, cloud, and on-premises data has made it nearly impossible for IT to keep up with user demand. Companies want to improve quality by taking advanced design and visualisation concepts typically reserved for the final product of a business intelligence solution and putting them to work at the very beginning of the analytics lifecycle. AI-based data visualisation tools, such as Qlik Sense and Google Data Studio, enable enterprises to identify the critical data sets that need attention for business decision-making, reducing human workloads.
In an effort to speed time-to-market for custom-built AI tools, technology vendors are introducing kits of pre-enriched, machine-readable data specific to given industries. Intended to help data scientists and AI engineers, these kits include the data necessary to speed up the creation of AI models. For example, the IBM Watson Data Kit for food menus includes 700,000 menus from across 21,000 US cities, along with menu dynamics such as price, cuisine, and ingredients. This information could be used directly in an AI travel app that enables users to locate nearby establishments catering to specific dietary requirements, such as gluten-free bakeries.
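As a rough illustration of how such menu data might be queried, the Python sketch below filters a handful of made-up records for bakeries with at least one gluten-free item. The record fields, establishment names, and the simple ingredient check are all assumptions made for this example; they do not reflect the actual schema or contents of the IBM Watson Data Kit.

```python
from dataclasses import dataclass

# Hypothetical record layout; the real data kit's schema is not described here.
@dataclass(frozen=True)
class MenuItem:
    establishment: str
    city: str
    cuisine: str
    price_usd: float
    ingredients: tuple

# Simplified assumption: an item counts as gluten-free if it lists none of these.
GLUTEN_SOURCES = {"wheat", "barley", "rye"}

# Made-up sample data standing in for pre-enriched menu records.
SAMPLE_MENU = [
    MenuItem("Sunrise Bakes", "Austin", "bakery", 4.5, ("rice flour", "sugar", "egg")),
    MenuItem("Wheatfield Oven", "Austin", "bakery", 3.0, ("wheat", "butter")),
    MenuItem("Almond House", "Austin", "bakery", 5.0, ("almond flour", "honey")),
    MenuItem("Almond House", "Austin", "bakery", 6.0, ("wheat", "almond")),
    MenuItem("Rye and Co", "Denver", "bakery", 4.0, ("rye", "salt")),
]


def is_gluten_free(item):
    """True if none of the item's ingredients is a listed gluten source."""
    return GLUTEN_SOURCES.isdisjoint(item.ingredients)


def gluten_free_places(items, city, cuisine="bakery"):
    """Sorted names of establishments in `city` offering a gluten-free item."""
    return sorted({
        item.establishment
        for item in items
        if item.city == city and item.cuisine == cuisine and is_gluten_free(item)
    })


if __name__ == "__main__":
    print(gluten_free_places(SAMPLE_MENU, "Austin"))
```

A production app would need verified allergen data rather than a keyword match on ingredients, but the sketch shows why pre-structured fields such as city, cuisine, and ingredients make this kind of query straightforward.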
Data as a service (DaaS)
Analytics and BI tools require data from a single, high-performance relational database. However, most organisations have multiple solutions, formats, and sources for data. Hence, IT teams typically apply custom ETL processes and proprietary tools to integrate data from various systems and improve the accessibility of analytics solutions. This approach leads to numerous challenges, including increased infrastructure costs, low architecture flexibility, increased data complexity, complex data governance, and longer data transfer times between systems.
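To make the ETL pattern concrete, here is a minimal Python sketch that pulls records from two differently formatted sources, normalises them, and loads them into one relational database that an analytics query can then use. The sample data, table names, and column names are all hypothetical, and an in-memory SQLite database stands in for whatever database an organisation actually runs.

```python
import csv
import io
import json
import sqlite3

# Hypothetical inputs standing in for two separate source systems.
CRM_CSV = """customer_id,name,country
1,Acme Ltd,UK
2,Globex,US
"""

ORDERS_JSON = """[
  {"customer": 1, "amount": "120.50"},
  {"customer": 2, "amount": "80.00"},
  {"customer": 1, "amount": "19.50"}
]"""


def extract():
    """Extract: read raw records from each source in its native format."""
    customers = list(csv.DictReader(io.StringIO(CRM_CSV)))
    orders = json.loads(ORDERS_JSON)
    return customers, orders


def transform(customers, orders):
    """Transform: convert strings to typed tuples matching the target tables."""
    customer_rows = [
        (int(c["customer_id"]), c["name"].strip(), c["country"]) for c in customers
    ]
    order_rows = [(int(o["customer"]), float(o["amount"])) for o in orders]
    return customer_rows, order_rows


def load(conn, customer_rows, order_rows):
    """Load: write the normalised rows into a single relational database."""
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customer_rows)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", order_rows)
    conn.commit()


def run_pipeline():
    """Run extract, transform, and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, *transform(*extract()))
    return conn


def revenue_by_customer(conn):
    """Example analytics query over the integrated data."""
    query = """
        SELECT c.name, ROUND(SUM(o.amount), 2)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(revenue_by_customer(run_pipeline()))
```

Even in this toy form, every new source needs its own extract and transform logic, which hints at how custom ETL pipelines add complexity and maintenance cost as the number of systems grows.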
DaaS – a cloud service that provides users with on-demand data access – helps enterprises address these challenges. It is typically deployed with data lakes, which are huge repositories of unstructured, semi-structured, and structured data. DaaS solutions store and manage enterprise data by compiling it into relevant streams. This helps enterprises reduce storage and management costs and improve data quality. Microsoft, Oracle, and MuleSoft (acquired by Salesforce in 2018) offer DaaS solutions.
This is an edited extract from the Big data – Thematic Research report produced by GlobalData Thematic Research.