Databricks has been granted a patent for a method and system that facilitates merging target and source tables. The process involves executing jobs to identify matching rows, perform merge operations, and manage deletion vectors, ultimately producing a resulting table based on these operations. GlobalData’s report on Databricks gives a 360-degree view of the company including its patenting strategy. Buy the report here.

According to GlobalData’s company profile on Databricks, was a key innovation area identified from patents. Databricks's grant share as of June 2024 was 57%. Grant share is based on the ratio of number of grants to total number of patents.

Table merging with deletion vector management system

Source: United States Patent and Trademark Office (USPTO). Credit: Databricks Inc

The patent US12045220B2 outlines a system and method for merging data from a target table and a source table, particularly when the source table contains files that differ from those in the target table. The system comprises a memory with encoded instructions and one or more processors that execute these instructions. The process begins by determining that the two tables need to be merged. A first job then identifies matching files between the target and source tables and stores information about those files and the rows that match in both tables. A second job executes a merge operation based on the identified rows and on any deletion vectors associated with previously removed rows in the target table. The outcome is a resulting table that integrates the relevant data from both tables.
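The two-job flow described above can be illustrated with a minimal Python sketch. This is not the patented implementation or any Databricks API: the `Table` class, the `find_matches` and `apply_merge` functions, the `"id"` key column, and the `"merged-0"` file name are all hypothetical names chosen for illustration, and the sketch assumes an upsert-style merge in which source keys are unique.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Table:
    # Hypothetical layout: file name -> rows, each row a dict with an "id" key.
    files: Dict[str, List[dict]]
    # File name -> positions of rows already deleted in that file.
    deletion_vectors: Dict[str, Set[int]] = field(default_factory=dict)

    def live_rows(self) -> List[dict]:
        """Return every row that is not masked by a deletion vector."""
        rows = []
        for name, file_rows in self.files.items():
            deleted = self.deletion_vectors.get(name, set())
            rows.extend(r for i, r in enumerate(file_rows) if i not in deleted)
        return rows


def find_matches(target: Table, source_rows: List[dict], key: str = "id") -> Dict[str, List[int]]:
    """First job: record which target files hold live rows matching the source."""
    source_keys = {row[key] for row in source_rows}
    matches: Dict[str, List[int]] = {}
    for name, file_rows in target.files.items():
        deleted = target.deletion_vectors.get(name, set())
        for pos, row in enumerate(file_rows):
            # Rows removed earlier are skipped so they cannot match again.
            if pos not in deleted and row[key] in source_keys:
                matches.setdefault(name, []).append(pos)
    return matches


def apply_merge(target: Table, source_rows: List[dict], matches: Dict[str, List[int]],
                new_file: str = "merged-0") -> Table:
    """Second job: mask matched rows and write the source rows to a new file."""
    # Copy the existing vectors so the input table is left unchanged.
    vectors = {name: set(pos) for name, pos in target.deletion_vectors.items()}
    for name, positions in matches.items():
        vectors.setdefault(name, set()).update(positions)
    files = dict(target.files)
    files[new_file] = [dict(row) for row in source_rows]
    return Table(files, vectors)


target = Table(
    files={
        "f1": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
        "f2": [{"id": 3, "v": "c"}, {"id": 4, "v": "d"}],
    },
    deletion_vectors={"f2": {1}},  # the row with id 4 was deleted earlier
)
source = [{"id": 2, "v": "B"}, {"id": 4, "v": "D"}, {"id": 5, "v": "e"}]

matches = find_matches(target, source)
result = apply_merge(target, source, matches)
print(sorted(row["id"] for row in result.live_rows()))  # [1, 2, 3, 4, 5]
```

In this sketch the existing files are never rewritten: an updated row is masked in place through its file's deletion vector, and its replacement lands in a new file alongside the inserted rows.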

Additionally, the patent details various functionalities related to managing deletion vectors, which track previously deleted rows in the target table. The system can determine and store these vectors based on metadata, ensuring that the merge operation accurately reflects the current state of the data. The instructions also allow for post-processing tasks, such as updating metadata statistics for the resulting table and ensuring the validity of these statistics. The patent emphasizes the importance of maintaining data integrity during the merging process, including the handling of files that may not be keyed to the target table. Overall, the invention provides a structured approach to data merging that enhances the efficiency and accuracy of data management systems.
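As a rough illustration of these two ideas, the Python sketch below derives deletion vectors from metadata entries and then recomputes and validates table statistics. The metadata format (`"action"`, `"file"`, `"positions"`), the function names, and the choice of statistics are assumptions made for this example; the patent summary does not specify them.

```python
from typing import Dict, List, Set


def deletion_vectors_from_metadata(log_entries: List[dict]) -> Dict[str, Set[int]]:
    """Collect per-file deleted row positions from hypothetical metadata entries."""
    vectors: Dict[str, Set[int]] = {}
    for entry in log_entries:
        if entry.get("action") == "delete_rows":
            vectors.setdefault(entry["file"], set()).update(entry["positions"])
    return vectors


def compute_stats(files: Dict[str, List[dict]], vectors: Dict[str, Set[int]],
                  key: str = "id") -> dict:
    """Post-processing: recompute statistics over live rows only."""
    live_keys = [
        row[key]
        for name, rows in files.items()
        for pos, row in enumerate(rows)
        if pos not in vectors.get(name, set())
    ]
    return {
        "num_records": len(live_keys),
        "min_key": min(live_keys, default=None),
        "max_key": max(live_keys, default=None),
    }


def stats_are_valid(stats: dict, files: Dict[str, List[dict]],
                    vectors: Dict[str, Set[int]]) -> bool:
    """Stored statistics are valid only if they match the current live rows."""
    return stats == compute_stats(files, vectors)


files = {"f1": [{"id": 1}, {"id": 2}, {"id": 9}]}
log = [
    {"action": "add_file", "file": "f1"},
    {"action": "delete_rows", "file": "f1", "positions": [2]},
]

vectors = deletion_vectors_from_metadata(log)
fresh_stats = compute_stats(files, vectors)
stale_stats = {"num_records": 3, "min_key": 1, "max_key": 9}  # computed before the delete
```

Here `stale_stats` still counts the deleted row, so `stats_are_valid` rejects it, while `fresh_stats` reflects only the rows that remain visible after the deletion vector is applied.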

To learn more about GlobalData’s detailed insights on Databricks, buy the report here.

GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData Patent Analytics tracks bibliographic data, legal events data, point in time patent ownerships, and backward and forward citations from global patenting offices. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.