Date: 2024-04-23

Box has been granted a patent for a method of classifying documents using unsupervised learning and template metadata. The system analyzes content objects to form clusters and automatically populates documents with the corresponding metadata. Policies are then enforced based on the metadata values. GlobalData’s report on Box gives a 360-degree view of the company, including its patenting strategy.

According to GlobalData’s company profile on Box, AI for workflow management was a key innovation area identified from its patents. Box's grant share as of February 2024 was 65%. Grant share is based on the ratio of the number of grants to the total number of patents.

Automated document classification and metadata population system

Source: United States Patent and Trademark Office (USPTO). Credit: Box Inc

A recently granted patent (Publication Number: US11928425B2) outlines a computer-implemented method for classifying documents. The method stores an association between a content object template and template metadata. To build that association, it processes content objects to identify features, analyzes those features to determine feature clusters, and forms content object clusters based on the feature clusters. When a specific content object is received, the system selects the appropriate template, identifies its template metadata, and populates that metadata into the content object. Policies and restrictions are then enforced on the content object based on the metadata and the values in the template metadata.
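
The flow described in the patent summary (cluster, associate a template with metadata, populate, enforce) can be illustrated with a minimal Python sketch. This is not Box's implementation. The bag-of-words features, the greedy similarity clustering, the 0.3 threshold, and every name here (`Template`, `cluster`, `populate`, `POLICIES`, `may_share_externally`) are assumptions chosen for illustration, and the template metadata is assigned by hand.

```python
from collections import Counter
from dataclasses import dataclass
import math


def features(text: str) -> Counter:
    """Bag-of-words features: lowercase token counts (an illustrative choice)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Template:
    name: str
    centroid: Counter  # summed features of the cluster this template represents
    metadata: dict     # template metadata copied onto matching documents


def cluster(docs: list[str], threshold: float = 0.3) -> list[Counter]:
    """Greedy single-pass clustering: join the most similar cluster, or start a new one."""
    centroids: list[Counter] = []
    for text in docs:
        f = features(text)
        best = max(centroids, key=lambda c: cosine(f, c), default=None)
        if best is not None and cosine(f, best) >= threshold:
            best.update(f)  # fold the document into the existing cluster
        else:
            centroids.append(Counter(f))
    return centroids


def populate(text: str, templates: list[Template]) -> dict:
    """Select the closest template and copy its metadata onto the incoming document."""
    f = features(text)
    template = max(templates, key=lambda t: cosine(f, t.centroid))
    return {"text": text, "template": template.name, "metadata": dict(template.metadata)}


# Hypothetical policy table keyed by a metadata value.
POLICIES = {
    "confidential": {"external_sharing": False},
    "public": {"external_sharing": True},
}


def may_share_externally(doc: dict) -> bool:
    """Enforce a policy based on the populated metadata value."""
    return POLICIES[doc["metadata"]["classification"]]["external_sharing"]


corpus = [
    "invoice total amount due payment",
    "invoice amount due net payment terms",
    "press release announces new product launch",
    "press release product launch event",
]
centroids = cluster(corpus)

# Assumption: an administrator attaches metadata to each discovered cluster.
templates = [
    Template("invoice", centroids[0], {"classification": "confidential"}),
    Template("press_release", centroids[1], {"classification": "public"}),
]

doc = populate("invoice payment due friday", templates)
print(doc["template"], may_share_externally(doc))
```

In this sketch the incoming document shares tokens only with the invoice cluster, so it inherits the `confidential` classification and external sharing is denied. A production system would presumably use richer features and a more robust clustering method.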



The patent also describes additional features: deriving vectors from the features to determine feature clusters, forming content object clusters based on feature configurations, and adding subject content objects for collaboration. The content object template is determined from attributes of the features, the content object clusters, and metrics. Unsupervised learning is used to identify document features that correspond to specific content object templates. The patent also covers non-transitory computer-readable media and systems whose processors execute the sequence of instructions to classify documents.
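
To make the idea of deriving vectors from features and grouping them without labels more concrete, here is a small sketch using k-means, a common unsupervised clustering method. The summary does not say which algorithm the patent uses, so k-means is a stand-in. The fixed vocabulary, the term-count vectors, and the function names `vectorize` and `kmeans` are all assumptions made for illustration.

```python
import math


def vectorize(text: str, vocabulary: list[str]) -> list[float]:
    """Derive a vector from a document: one term count per vocabulary entry."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in vocabulary]


def kmeans(vectors: list[list[float]], k: int, iterations: int = 10) -> list[int]:
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the vector farthest from all centroids chosen so far.
        centroids.append(max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical vocabulary and documents.
vocabulary = ["invoice", "payment", "release", "launch"]
documents = [
    "invoice payment due",
    "payment invoice invoice",
    "press release launch",
    "product launch release event",
]
vectors = [vectorize(d, vocabulary) for d in documents]
labels = kmeans(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

The two invoice-like documents land in one cluster and the two press-release-like documents in the other, with no labels supplied in advance. Each resulting cluster could then be associated with a content object template.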

