CORPORATE MACHINE TRANSLATION SOLUTION
Machine Translation engines processing massive amounts of data
USA

Overview
Multilingual voice recordings, texts, and documents are processed to generate a single document database that can be searched using AI to extract the relevant information constituting the facts of a legal process.
Veritone required a large-scale machine translation solution capable of translating legal documents, conversations, and documentation stored on hard disks into English.

Task
As part of the proceedings of a legal case, the contents of personal computers, voice recorders and printed texts are stored.
The transcribed language resources were the equivalent of 350 times the complete works of Shakespeare. Finding the needles in the multilingual haystack required an automated solution.
Pangeanic's solution
The Corporate Language Solution is built from several neural networks that process speech and text. The processes involve:
- Transcription: speech-to-text conversion.
- Translation: generating monolingual (English) versions of all language resources.
- Sentiment Analysis: detecting the positive/negative relevance of text excerpts.
- Summarization: abstracting paragraphs into short sentences.
- Indexing: locating and referencing entities (people, organizations, dates, locations, money amounts, keywords) in the batch of documents.
- Categorization: ranking and sorting documents according to class, category and relevance.
Technology
How does it work?
The Corporate Solution runs on 2-3 servers at the customer's premises. No interaction with external third parties is needed, and all information stays in the customer's data center.
- Multiformat resources are loaded into the input area, and the first process converts them into single-format text:
  ◦ Images and raster files are OCRed.
  ◦ Speech is transcribed, and actors are detected and referenced.
  ◦ PDF, Word and PowerPoint files are converted to plain text.
- Language Detection and Translation: The language is detected at paragraph level, and if it is not English, the text is sent through a neural network specifically designed to translate from the source language into English.
- Specific neural networks receive the monolingual input and produce records with the relevant results (sentiment, importance, referenced entities…) and a graph model that can later be used to find references in the raw data.
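The paragraph-level detection and routing step described above can be sketched roughly as follows. This is an illustration only, not Pangeanic's implementation: the stopword-based `detect_language` function is a toy stand-in for a real language-identification model, and `ParagraphRecord`, `to_english`, and the `translators` registry are hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Toy stopword lists: an assumed stand-in for a real language-identification model.
STOPWORDS: Dict[str, Set[str]] = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "es": {"el", "y", "es", "de", "los", "una"},
    "fr": {"le", "et", "est", "les", "un", "des"},
}


def detect_language(paragraph: str) -> str:
    """Return the code of the language whose stopwords occur most often, or 'und'."""
    words = re.findall(r"[^\W\d_]+", paragraph.lower())
    scores = {lang: sum(w in stops for w in words) for lang, stops in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "und"


@dataclass
class ParagraphRecord:
    source_lang: str  # detected language code, 'und' if undetermined
    original: str     # paragraph as found in the converted plain text
    english: str      # English version (same as original when no translation ran)


def to_english(document: str,
               translators: Dict[str, Callable[[str], str]]) -> List[ParagraphRecord]:
    """Detect the language of each paragraph and route non-English text to a translator."""
    records = []
    for paragraph in (p.strip() for p in document.split("\n\n")):
        if not paragraph:
            continue
        lang = detect_language(paragraph)
        if lang != "en" and lang in translators:
            english = translators[lang](paragraph)  # hypothetical source-to-English engine
        else:
            english = paragraph  # already English, or no engine available for this language
        records.append(ParagraphRecord(lang, paragraph, english))
    return records
```

Keeping the detected language on each record means paragraphs that could not be translated remain identifiable later, instead of silently mixing into the English corpus.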
Support services manage the collaborative workflow by pipelining data to and from the neural networks and distributing the load so that hardware and software resources are used efficiently.
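As a rough illustration of this kind of load distribution, the sketch below pushes each resource through an ordered list of stages while a pool of worker threads handles several resources at once. It is a single-machine simplification under stated assumptions: `run_pipeline` and the `Stage` type are hypothetical names, and the stages stand in for calls to the actual neural networks, which the source does not describe in detail.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# A stage takes text and returns text; real stages would call a model (assumed here).
Stage = Callable[[str], str]


def run_pipeline(items: Iterable[str], stages: List[Stage], workers: int = 4) -> List[str]:
    """Send every item through all stages in order, spreading items across workers."""

    def process(item: str) -> str:
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order even though items run concurrently.
        return list(pool.map(process, items))


if __name__ == "__main__":
    # Placeholder stages standing in for conversion and translation.
    demo_stages = [str.strip, lambda text: "[en] " + text]
    print(run_pipeline(["  hola mundo  ", "bonjour"], demo_stages))
```

A production deployment across 2-3 servers would more likely use a message queue between stages, so each network can be scaled independently; the thread pool here only shows the scheduling idea.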
Benefits
Sometimes there’s simply too much data and there’s no way the relevant information can be found using traditional resources.
Cost and delivery time are both strong reasons to consider an automated analysis of language resources.
Other Use Cases
Secure Translation Engine for Law Enforcement (Veritone)
Pangeanic delivered a Neural Machine Translation engine wrapped for deployment in the U.S. Department of Defense’s Iron Bank, the secure software repository where applications are hardened and accredited for military use. The engine was trained to handle criminal slang and coded language, supporting crowd monitoring and interrogations with real-time, defense-grade security. This solution helps law enforcement gather intelligence faster, operate with greater transparency, and ensure compliance with strict cybersecurity standards.