Yandex has been granted a patent for a method and server that retrain a machine learning algorithm (MLA) used to classify documents. The method involves accessing a social network resource with user-submitted content items, identifying content items associated with misclassified documents, generating training objects based on these misclassified documents, and retraining the MLA.
According to GlobalData’s company profile on Yandex, social media analytics was a key innovation area identified from its patents. Yandex's grant share as of September 2023 was 46%. Grant share is the ratio of the number of grants to the total number of patents.
Retraining a machine learning algorithm based on misclassified documents

A recently granted patent (Publication Number: US11775573B2) describes a method for retraining a machine learning algorithm (MLA) used in a search engine server. The MLA is trained to classify documents based on their features. The method involves accessing a social network resource that contains content items submitted by users. The search engine server identifies content items associated with documents that have previously been presented as search results. A content item is selected when it meets a predetermined condition, which includes having caused irregular fluctuations in the document's user interaction data during a specific period of time.
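The patent summary does not say how an "irregular fluctuation" is measured. As a minimal sketch only, assuming daily click counts per document and a simple z-score test against an earlier baseline, the selection condition could look like the following. The function name, the `z_threshold` parameter, and the use of a z-score are illustrative assumptions, not details from the patent.

```python
from statistics import mean, stdev


def has_irregular_fluctuation(daily_clicks, window_start, window_end, z_threshold=3.0):
    """Return True if any day in [window_start, window_end) deviates strongly
    from the baseline formed by the days before the window.

    Hypothetical helper: the patent only speaks of "irregular fluctuations"
    in user interaction data; the z-score rule here is an assumption.
    """
    baseline = daily_clicks[:window_start]
    window = daily_clicks[window_start:window_end]
    if len(baseline) < 2:
        return False  # not enough history to estimate a baseline
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Perfectly flat baseline: any change at all counts as a fluctuation.
        return any(c != mu for c in window)
    return any(abs(c - mu) / sigma > z_threshold for c in window)


# Clicks jump sharply in the last two days, e.g. after a viral post.
spiky = [100, 102, 98, 101, 99, 100, 400, 380]
steady = [100, 102, 98, 101, 99, 100, 101, 99]
print(has_irregular_fluctuation(spiky, 6, 8))
print(has_irregular_fluctuation(steady, 6, 8))
```

A content item posted shortly before such a spike would then be a candidate for the analysis step that follows.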
Once the content items are identified, the search engine server acquires a set of document features corresponding to the features the MLA previously used to classify the document. The server then analyzes the content items against these document features to determine whether the MLA misclassified the document when it was presented as a search result. If a misclassification is detected, the server generates a training object by labeling the document with an indication of the misclassification. The MLA is then retrained based on this training object.
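To make the labeling and retraining steps concrete, here is a rough sketch under several assumptions that do not come from the patent: the MLA is a binary linear classifier, a misclassification is inferred when enough user posts contain complaint terms, and retraining is a few perceptron updates. `TrainingObject`, `make_training_object`, `retrain`, and the term list are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrainingObject:
    features: List[float]  # same features the classifier originally used
    label: int             # corrected class (0 or 1)


def predict(weights, features):
    """Binary prediction of a linear classifier (assumed model form)."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def make_training_object(features, predicted_label, posts, complaint_terms,
                         min_hits=2) -> Optional[TrainingObject]:
    """Label the document as misclassified if enough posts complain about it.

    The keyword heuristic is an assumption standing in for the patent's
    unspecified analysis of content items.
    """
    hits = sum(any(term in post.lower() for term in complaint_terms) for post in posts)
    if hits < min_hits:
        return None  # no evidence of misclassification
    return TrainingObject(features=list(features), label=1 - predicted_label)


def retrain(weights, objects, lr=0.1, epochs=20):
    """Perceptron-style updates on the new training objects."""
    w = list(weights)
    for _ in range(epochs):
        for obj in objects:
            err = obj.label - predict(w, obj.features)
            w = [wi + lr * err * xi for wi, xi in zip(w, obj.features)]
    return w


weights = [0.2, 0.2]
doc_features = [1.0, 0.5]
posts = ["This result is fake news", "Totally misleading page", "Nice weather today"]

obj = make_training_object(doc_features, predict(weights, doc_features),
                           posts, complaint_terms=("fake", "misleading"))
new_weights = retrain(weights, [obj]) if obj else weights
print(predict(weights, doc_features), predict(new_weights, doc_features))
```

In practice the new training objects would be mixed with the original training set rather than used alone, so that the retrained model does not forget earlier examples.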
Before accessing the social network resource, the method may also involve acquiring a plurality of search queries previously submitted to the search engine server, along with their traffic information. The traffic information includes the number of submissions for each search query and may also include the traffic source. The search engine server determines which social network resource to access based on the traffic source and the number of submissions, for example when that number exceeds a predetermined threshold during a specific period of time.
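The resource-selection step can be sketched as a simple filter over query traffic records. This is an illustration only: the record fields (`query`, `source`, `submissions`), the host names, and the threshold value are assumptions, and the records are presumed to cover a single period of interest.

```python
def pick_social_resources(traffic_records, known_social_hosts, min_submissions=1000):
    """Return social network hosts that sent heavy traffic for some query.

    traffic_records: iterable of dicts with hypothetical keys
      "query", "source" (referring host) and "submissions" (count in the period).
    """
    selected = set()
    for record in traffic_records:
        is_social = record["source"] in known_social_hosts
        is_heavy = record["submissions"] >= min_submissions
        if is_social and is_heavy:
            selected.add(record["source"])
    return sorted(selected)


records = [
    {"query": "miracle cure", "source": "social.example", "submissions": 5400},
    {"query": "miracle cure", "source": "news.example", "submissions": 7000},
    {"query": "weather", "source": "social.example", "submissions": 120},
    {"query": "celebrity rumor", "source": "forum.example", "submissions": 2300},
]
hosts = {"social.example", "forum.example"}
print(pick_social_resources(records, hosts))
```

The server would then fetch user-submitted content items only from the hosts returned here, rather than crawling every social network.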
The patent also describes a search engine server that implements the method. The server includes a processor and a non-transitory computer-readable medium comprising instructions. The processor executes the instructions to access the social network resource, identify content items associated with documents, acquire document features, analyze the content items, generate a training object, and retrain the MLA.
Overall, this patent presents a method and system for retraining a machine learning algorithm used in a search engine server by leveraging user-generated content from a social network resource. By identifying misclassified documents and retraining the MLA, the search engine can improve the accuracy and relevance of search results for users.