Pegatron has patented a multilingual speech recognition and translation method for conferences. The technology analyzes audio and video data to translate speech in real time and display the results on terminal devices. It also determines the number of attendees based on their proximity to microphones. GlobalData’s report on Pegatron gives a 360-degree view of the company, including its patenting strategy.

According to GlobalData’s company profile on Pegatron, V2V communication antennas were identified as a key innovation area from its patents. Pegatron’s grant share as of January 2024 was 57%. Grant share is the ratio of the number of grants to the total number of patents.
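The grant share definition above is a simple ratio, and a short Python sketch makes the arithmetic explicit. The function name `grant_share` and the counts passed to it are hypothetical: the article reports only the 57% figure, not the underlying grant and patent totals.

```python
def grant_share(grants: int, total_patents: int) -> float:
    """Return the number of grants as a percentage of total patents."""
    if total_patents <= 0:
        raise ValueError("total_patents must be positive")
    return 100.0 * grants / total_patents


# Hypothetical counts, chosen only to reproduce the reported 57% figure.
print(grant_share(57, 100))
```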

Multilingual speech recognition and translation method for conferences

Source: United States Patent and Trademark Office (USPTO). Credit: Pegatron Corp

A recently granted patent (Publication Number: US11881224B2) discloses a multilingual speech recognition and translation method designed for conferences. The method involves receiving, at a server, audio and video data of the attendees, analyzing the data to generate recognition results, splitting the audio data into segments, performing speech recognition on each segment based on language family recognition results, translating the resulting text content, and displaying it on a terminal apparatus. The method also includes determining the number of speakers and their speaking time, verifying attendee identity, accessing personal libraries for enhanced recognition and translation, and modifying those libraries based on user feedback. The system further corrects text content using a reference library of words and phrases.
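The order of the steps described above (segment, recognize by language family, correct against a reference library, translate) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` class, the function names, and the stub recognizer and translator are all hypothetical, and the fuzzy matching from the standard library's `difflib.get_close_matches` is an assumed stand-in for whatever correction technique the patent actually uses.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Callable


@dataclass
class Segment:
    """One speech segment (hypothetical structure, not from the patent)."""
    speaker: str
    language_family: str
    audio: bytes


def correct_text(text: str, reference_library: list[str], cutoff: float = 0.8) -> str:
    """Replace each word with its closest reference-library entry, if one is close enough."""
    corrected = []
    for word in text.split():
        match = get_close_matches(word, reference_library, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


def process_segments(
    segments: list[Segment],
    recognizers: dict[str, Callable[[bytes], str]],
    translate: Callable[[str], str],
    reference_library: list[str],
) -> list[tuple[str, str]]:
    """Recognize, correct, and translate each segment, in that order."""
    results = []
    for seg in segments:
        recognize = recognizers[seg.language_family]  # pick the recognizer by language family
        text = correct_text(recognize(seg.audio), reference_library)
        results.append((seg.speaker, translate(text)))
    return results


# Stub recognizer and translator, for illustration only.
recognizers = {"germanic": lambda audio: audio.decode("utf-8")}
segments = [Segment("A", "germanic", b"pegatron confrence")]
print(process_segments(segments, recognizers, str.upper, ["conference"]))
```

In this sketch the recognizers and the translator are passed in as callables, so real speech and translation engines could be swapped in without changing the flow.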

The patent also describes a server for conferences equipped with audio and video pre-processing modules, a speech recognition module, and a translation module. The server includes sub-modules for distance detection, face recognition, and personal vocabulary capturing, as well as a cloud database service for establishing personal libraries. Additionally, the server features a recognition and correction module for modifying libraries based on user feedback. The system also includes sub-modules for speech feature extraction, people counting, ethnic recognition, activity recognition, and lip recognition to enhance speech recognition and translation accuracy. The server further includes a speaker grouping sub-module that determines speech segments based on factors such as the distance, body movement, and facial movement of attendees. Lastly, a word and phrase recognition and correction sub-module corrects text content using a reference library.
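As a rough illustration of the distance-based people counting mentioned above, the following Python sketch counts attendees who sit within a given radius of any microphone. The article does not state the actual rule the patent uses, so the 2D coordinates, the `max_distance` threshold, and the function name are all assumptions.

```python
from math import dist


def count_attendees_near_microphones(
    attendees: list[tuple[float, float]],
    microphones: list[tuple[float, float]],
    max_distance: float,
) -> int:
    """Count attendees within max_distance of at least one microphone."""
    return sum(
        1
        for position in attendees
        if any(dist(position, mic) <= max_distance for mic in microphones)
    )


# Hypothetical room layout: positions in meters, one microphone at the origin.
attendees = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
microphones = [(0.0, 0.0)]
print(count_attendees_near_microphones(attendees, microphones, max_distance=1.5))
```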

In conclusion, the patented method and server system aim to improve multilingual communication in conferences by combining speech recognition and translation with personalization based on attendee identity and feedback. Features such as speaker grouping, personal libraries, and real-time translation are intended to help attendees who speak different languages follow the conference.




GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData Patent Analytics tracks bibliographic data, legal events data, point in time patent ownerships, and backward and forward citations from global patenting offices. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.