Technologies Involved: Kubernetes
Area Of Work: Machine Learning
Project Description

A U.S.-based enterprise focused on large-scale document analysis engaged Oodles to optimize its NLP pipeline, which processed over 100 million documents. The client sought enhanced synonym handling across multiple languages, dynamic search capabilities, and better relevance scoring. Oodles implemented advanced Elasticsearch configurations with synonym expansion and indexing improvements.

Scope Of Work

The client engaged Oodles to build a scalable solution that would support multilingual synonym search, reduce indexing errors caused by special characters, and enable query-time synonym expansion. The project focused on enhancing synonym analyzers, modifying Elasticsearch mappings, recreating indices, and improving search relevance, all without increasing index size or requiring frequent reindexing.

Our Solution

To align with the client’s objectives, Oodles redesigned key components of the NLP pipeline to enable flexible and scalable synonym search. 

Key features included:

  • Multilingual Synonym Analyzer Integration: Designed and tested strategies for processing synonyms in multiple languages to improve result accuracy.
  • Special Character Handling: Identified and resolved exceptions caused by special characters (e.g., commas, “>”) in the synonym file, ensuring smooth index mapping.
  • Enhanced Index Mapping: Modified the rjd-data-mapping.sh script and updated Elasticsearch mappings to support expanded synonym definitions across fields.
  • Index Recreation: Applied changes on the client’s machine, recreating Elasticsearch indices to reflect the optimized configurations.
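  • Query-Time Synonym Expansion: Enabled synonyms to be expanded at query time, in line with the goal of avoiding larger indices and frequent reindexing.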
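
To make the special-character handling and query-time expansion features concrete, the following is a minimal Python sketch, not the client's actual configuration. It assumes the synonym file uses the Solr-style format that Elasticsearch accepts (commas between equivalent terms, "=>" for explicit mappings). The cleanup rules, the helper names clean_synonym_rule and build_index_body, the filter and analyzer names query_synonyms and synonym_search, and the content field are all hypothetical. The contents of rjd-data-mapping.sh are not shown in this case study, so nothing here is taken from it.

```python
import json
import re


def clean_synonym_rule(line):
    """Return a normalized Solr-style synonym rule, or None if the line is unusable."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments carry no rule
    # "=>" marks an explicit mapping; more than one arrow is ambiguous, so reject it.
    sides = line.split("=>")
    if len(sides) > 2:
        return None
    cleaned_sides = []
    for side in sides:
        # Any ">" left after splitting on "=>" is stray; treat it as whitespace.
        side = side.replace(">", " ")
        terms = [re.sub(r"\s+", " ", term).strip() for term in side.split(",")]
        terms = [term for term in terms if term]  # drop empties from ",," or trailing commas
        if not terms:
            return None  # one side of the rule has no usable term
        cleaned_sides.append(", ".join(terms))
    if len(cleaned_sides) == 1 and "," not in cleaned_sides[0]:
        return None  # a lone term with no equivalents is not a rule
    return " => ".join(cleaned_sides)


def build_index_body(rules, field="content"):
    """Build index settings and mappings that apply synonyms only at query time."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # synonym_graph is the Elasticsearch filter intended for search analyzers
                    "query_synonyms": {"type": "synonym_graph", "synonyms": rules}
                },
                "analyzer": {
                    "synonym_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "query_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "standard",  # index-time analysis: no synonyms
                    "search_analyzer": "synonym_search",  # query-time analysis: synonyms applied
                }
            }
        },
    }


if __name__ == "__main__":
    raw_lines = [
        "# vehicle terms",
        "car,, automobile , ",
        "laptop > notebook => portable computer",
        "a => b => c",
    ]
    cleaned = [clean_synonym_rule(line) for line in raw_lines]
    rules = [rule for rule in cleaned if rule is not None]
    print(json.dumps(build_index_body(rules), indent=2))
```

Because the synonym filter appears only in the search analyzer, the indexed tokens are the same as they would be without synonyms, which is why this approach does not increase index size. The printed JSON could be used as the request body when creating an index.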