Term Extraction for Simultaneous Interpreters

Simultaneous interpretation training made easier through term extraction.

Key Details

  • Challenge
    Term extraction from various types of documents for interpreters
  • Solution
    Web app for multi-language data extraction for interpreters
  • Technologies and tools
    Python, Amazon Comprehend, NLP, TypeScript, Django, Celery, Postgres, Redis, RabbitMQ, Selenium


When preparing for live events, human interpreters need to get acquainted with the relevant terminology and with the names of people and institutions. Every language has words that occur rarely or are difficult to pronounce. In more technical meetings, domain-specific terms add a further challenge. Term extraction software that surfaces these words ahead of time can improve the interpreter's preparation and reduce mistakes.

Interprefy is a leading cloud-based Remote Simultaneous Interpretation (RSI) platform that brings remote conference interpreters into your meetings and events anywhere.

In brief, the client's platform:

  • enables interpreters to work from anywhere, anytime;
  • supports dozens of languages;
  • works for events and meetings of all shapes and sizes;
  • provides integration with different platforms: their own platform and app, Zoom, Webex and so on.

The client wanted to simplify the work of interpreters during live events with the help of term extraction software. They approached the InData Labs team with an initial request to develop a solution that extracts terminology and names from text documents provided before an event.

Challenge: term extraction from various types of documents for interpreters

Every language has a particular set of words that may become pitfalls without diligent preparation. They may include:

  • rarely used words;
  • words with difficult pronunciation;
  • industry-related terms.

Ahead of an event, interpreters are provided with onboarding documents that contain domain-specific terminology, names of delegates and institutions, and other words relevant to the event or domain.

InData Labs' task was to develop data extraction software that pulls the relevant terminology and the names of people and institutions out of these documents to help interpreters prepare for the event.
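To make the name-extraction part of the task concrete, here is a minimal Python sketch that assumes Amazon Comprehend (listed in the project's technology stack) is used for entity recognition. The case study does not say how the production engine finds names, so the function name `extract_names`, the `min_score` threshold, and the choice of entity types are illustrative assumptions. The client is passed in as a parameter; in practice it might be created with `boto3.client("comprehend")`.

```python
# Illustrative sketch only: not the production engine described in this case study.
# Assumes a boto3-style Amazon Comprehend client is passed in by the caller.

# Entity types of interest for interpreters (an assumption for this example).
WANTED_TYPES = {"PERSON", "ORGANIZATION"}


def extract_names(text, comprehend_client, language_code="en", min_score=0.8):
    """Return unique person and organization names found in `text`.

    `min_score` is a hypothetical confidence cut-off, not a documented default.
    """
    response = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )
    names = []
    for entity in response.get("Entities", []):
        is_wanted = entity["Type"] in WANTED_TYPES
        is_confident = entity["Score"] >= min_score
        # Keep first-seen order and skip repeated mentions of the same name.
        if is_wanted and is_confident and entity["Text"] not in names:
            names.append(entity["Text"])
    return names
```

Passing the client in keeps the function easy to exercise with a stub during development, without calling the real service.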

Solution: web app for multi-language data extraction for interpreters

The client asked InData Labs' engineers to build an engine that can extract terms from text of an arbitrary domain.

PoC stage

In the PoC stage, we completed two main tasks:

  1. Term Extraction Engine for specific terms (words or word forms that rarely occur in the common lexicon).
  2. Simple UI to demonstrate how it works.
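The idea behind the first task, flagging words that rarely occur in the common lexicon, can be sketched in a few lines of Python. This is a simplified illustration, not the engine built for the client: the tiny `COMMON_WORDS` set stands in for a real word-frequency list, and the function name and `min_length` parameter are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for a real common-lexicon frequency list.
COMMON_WORDS = {
    "the", "and", "with", "will", "that", "this", "from", "have", "about",
    "committee", "discuss", "budget", "meeting", "report",
}


def extract_rare_terms(text, common_words=COMMON_WORDS, min_length=4):
    """Return words absent from the common lexicon, most frequent first."""
    # Match runs of Unicode letters, allowing internal hyphens.
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    counts = Counter(
        token
        for token in tokens
        if len(token) >= min_length and token not in common_words
    )
    # most_common() sorts by count; ties keep first-seen order.
    return [term for term, _ in counts.most_common()]


sample = (
    "The committee will discuss pharmacovigilance and the "
    "pharmacovigilance budget with the rapporteur."
)
print(extract_rare_terms(sample))  # ['pharmacovigilance', 'rapporteur']
```

A real engine would also need lemmatization and per-language frequency data, which this sketch leaves out.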

MVP stage

During the app development stage, we completed the following tasks:

Back-end development

The InData Labs team developed the back end for the client as part of its full-stack development services, which saved the client the time of finding another trusted vendor.

We enabled fast and easy multi-document upload and processing. Our engineers also implemented data scraping, which sped up gathering, compiling, and visualizing the extracted insights and improved the user experience.

Engine improvements

We also added support for more than 10 languages, so that app users can extract terms quickly and easily from documents in any of them.

The next task was to enable language detection when a document is uploaded and processed. The app detects the document's language automatically, which saves users time and simplifies the user experience.
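Automatic language detection of this kind could be sketched as follows, assuming Amazon Comprehend from the project's technology stack is used. The case study does not state which component performs the detection, so this is an assumption, as are the function name `detect_language` and the `min_score` and `sample_chars` parameters. The client is passed in; in practice it might be created with `boto3.client("comprehend")`.

```python
# Illustrative sketch only: the case study does not describe the real implementation.
# Assumes a boto3-style Amazon Comprehend client is passed in by the caller.


def detect_language(text, comprehend_client, min_score=0.5, sample_chars=2000):
    """Return the dominant language code of `text`, or None if unsure.

    `min_score` and `sample_chars` are hypothetical tuning parameters.
    """
    # A leading excerpt is usually enough to identify the language.
    sample = text[:sample_chars]
    if not sample.strip():
        return None

    response = comprehend_client.detect_dominant_language(Text=sample)
    languages = response.get("Languages", [])
    if not languages:
        return None

    # Pick the candidate with the highest confidence score.
    best = max(languages, key=lambda lang: lang["Score"])
    return best["LanguageCode"] if best["Score"] >= min_score else None
```

Returning `None` for low-confidence results leaves room for the UI to ask the user to pick the language manually.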

Front-end development

We also implemented the front end of the project ourselves. Our team of engineers set up app authorization and multi-user mode.

Result: fast and error-free interpretation with data extraction

InData Labs, a data extraction service provider, has developed a robust web application for multi-language data extraction. Using the term extraction software, interpreters can automatically extract specific vocabulary and terminology from the agenda papers and convert them into a readable format. This improves the interpreter's preparation for the event and reduces mistakes.

For interpreters, this supports more accurate interpretation at events. For the company, it improves brand positioning on the market and helps win more client contracts.
