Fortinet has patented a system for natural language message categorization using a vector space model to identify text from a specific topic. The method involves calculating normal exclusion values for unique words, forming a message vector, and comparing it to category extremes for real-time topic determination.

Natural language message categorization based on vector space model

A recently granted patent (Publication Number: US11971983B2) describes a method for classifying natural language messages. The method receives a message containing text content, calculates a normal exclusion value for each unique word based on that word's frequency in the message and in a dictionary, and assembles these values into a message vector. The vector is then compared to category extremes to determine whether the message belongs to a specific category of interest. The patent sets out the equations and processes involved, emphasizing the use of a large dictionary, such as the Oxford English Corpus™, and the importance of accurate frequency calculations for unique words.
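The steps described above can be illustrated with a minimal Python sketch. The article does not reproduce the patent's equations, so everything specific here is an assumption made for illustration: the exclusion value is taken to be a word's relative frequency in the message divided by its relative frequency in the dictionary, the category extremes are taken to be per-word lower and upper bounds, and the dictionary figures and the names `exclusion_values`, `message_vector`, and `in_category` are hypothetical. It is not the patented formula.

```python
import re
from collections import Counter

# Hypothetical dictionary frequencies (occurrences per million words).
# Illustrative numbers only, not taken from any real corpus.
DICTIONARY_FREQ = {
    "the": 50000.0,
    "your": 3000.0,
    "account": 150.0,
    "password": 20.0,
    "verify": 10.0,
}
DEFAULT_FREQ = 1.0  # assumed fallback for words missing from the dictionary


def exclusion_values(text):
    """Score each unique word by how over-represented it is in the message.

    Assumed formula: (count in message / words in message) divided by
    (dictionary occurrences per million / 1,000,000).
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    values = {}
    for word, count in counts.items():
        message_rate = count / total
        dictionary_rate = DICTIONARY_FREQ.get(word, DEFAULT_FREQ) / 1_000_000
        values[word] = message_rate / dictionary_rate
    return values


def message_vector(values, vector_definition):
    """Order the scores by a fixed word list; words absent from the message score 0."""
    return [values.get(word, 0.0) for word in vector_definition]


def in_category(vector, extremes):
    """Assume the category extremes are (low, high) bounds for each vector component."""
    return all(low <= v <= high for v, (low, high) in zip(vector, extremes))


# Usage with a hypothetical "credential phishing" category.
definition = ["password", "verify", "account"]
extremes = [(1000.0, 1e9), (1000.0, 1e9), (0.0, 1e9)]
vector = message_vector(exclusion_values("Verify your password"), definition)
is_match = in_category(vector, extremes)
```

In this sketch a rare word such as "verify" receives a high score because it appears far more often in the message than the dictionary would predict, while a common word such as "the" scores low. That contrast is the reason the summary stresses accurate frequency calculations.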

The patent also covers a system for characterizing message categories, comprising a processing resource and a computer-readable medium with stored instructions. The system follows the same process as the method: it calculates normal exclusion values, forms a message vector, and compares that vector to category extremes to determine inclusion in a specific category. The patent further highlights that vector definitions and category extremes are accessed from storage media to support accurate classification. Taken together, the claims describe a method and system for classifying natural language messages based on unique word frequencies and dictionary comparisons.
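As a rough illustration of reading a vector definition and category extremes from storage, the Python sketch below parses them from a JSON record. The record layout, the field names `vector_definition` and `category_extremes`, and the helper `load_category` are all hypothetical; the article does not say how the patent stores this data.

```python
import json


def load_category(source):
    """Parse a hypothetical JSON record into a word list and (low, high) bounds."""
    record = json.loads(source)
    words = list(record["vector_definition"])
    extremes = [(float(low), float(high)) for low, high in record["category_extremes"]]
    # Each word in the vector definition needs exactly one pair of bounds.
    if len(words) != len(extremes):
        raise ValueError("vector definition and category extremes differ in length")
    return words, extremes


# Stand-in for a record read from a storage medium (assumed layout).
STORED_RECORD = """
{
  "vector_definition": ["password", "verify"],
  "category_extremes": [[1000, 50000], [1000, 50000]]
}
"""

words, extremes = load_category(STORED_RECORD)
```

Checking that the two lists have the same length when they are loaded catches a malformed record early, before any message is compared against it.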

