Government Solutions

 

“Basis Technology’s linguistics software increases the ease, speed and precision of detecting critical information by assisting analysts in mining documents in their native language and identifying, disambiguating and clearly labeling the most important words and phrases. These capabilities are in demand throughout government and industry as the sheer volume of multilingual data collected continues to grow.  Like industry, government faces a dramatically increasing need for automated processes to help users quickly find and act on the most important pieces of information.”


— Greg Pepus
   Director, Federal/IC Strategy
   In-Q-Tel


National security depends on the ability to extract critical information from documents written in foreign languages. Information must be filtered and analyzed in time to act.

The volumes of data are massive, the need is urgent, and the missions are complex.

These missions include:

  • Open Source Intelligence (OSINT)
  • Document and Media Exploitation (DOCEX)
  • E-mail and instant message analysis
  • Chat room and web site monitoring
  • Terrorist and money laundering watch lists

Even with automated assistance, information extraction presents many challenges:

You may not know what is relevant. Until you see a specific word in context, you might not know that you should have looked for it. For example, you may not know ahead of time which names, organizations, locations, and dates will turn out to be relevant.

Written forms of the same name will vary. Will you recognize a name when you see it again? Names originating from Arabic can have several different spellings in the Latin alphabet. In addition, "Fouad" may be the common noun "heart" or "mind" in one context and the name of a person in another.
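To make the spelling problem concrete, here is a minimal sketch of grouping Latin-alphabet variants of a name under a shared key. It is a toy illustration only and does not represent how any Rosette product works: the functions `skeleton_key` and `group_variants` are hypothetical names, and the consonant-skeleton rule (drop vowels, collapse doubled letters) is an assumption chosen for simplicity. It will both over-match and under-match on real data.

```python
import re
from collections import defaultdict

VOWELS = set("aeiou")


def skeleton_key(name: str) -> str:
    """Reduce a Latin-alphabet name to a crude consonant skeleton.

    Assumption for illustration: vowels and punctuation carry most of
    the spelling variation, so we drop them and collapse repeated
    consonants. Real name matching needs far more linguistic knowledge.
    """
    letters = re.sub(r"[^a-z]", "", name.lower())
    consonants = [ch for ch in letters if ch not in VOWELS]
    collapsed = []
    for ch in consonants:
        # Skip a consonant that merely repeats the previous one.
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def group_variants(names):
    """Group names that share the same skeleton key."""
    groups = defaultdict(list)
    for name in names:
        groups[skeleton_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    seen = ["Fouad", "Fuad", "Fou'ad", "Mohammed", "Muhammad", "Mohamed"]
    for key, variants in group_variants(seen).items():
        print(key, variants)
```

A key like this cannot tell whether "Fouad" is a person's name or a common noun in a given sentence; that distinction depends on context, which is why exact or near-exact string matching alone is not enough.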

Search tools only know exact matches. In some languages, word forms and sentence structures vary widely in text. For example, Arabic words frequently include affixes, linguistic "subwords" that change or add meaning. An affixed word no longer matches its base form character for character, so an exact text match fails.
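The effect of affixes on exact matching can be sketched in a few lines. This is a simplified illustration, not the method used by any Basis Technology product: `strip_prefix` and `find_matches` are hypothetical helpers, and the short prefix list (the definite article "al-" alone or combined with a one-letter conjunction or preposition) is an assumption that ignores suffixes, infixes, and the many cases where these letters are part of the word itself.

```python
# Hypothetical, deliberately small prefix list: the definite article
# (al-) alone or preceded by wa- (and), bi- (with), ka- (like), fa- (so).
PREFIXES = ("وال", "بال", "كال", "فال", "ال")


def strip_prefix(token: str, min_stem: int = 3) -> str:
    """Remove one known prefix if at least min_stem letters remain."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= min_stem:
            return token[len(prefix):]
    return token


def find_matches(query: str, tokens: list[str]) -> list[str]:
    """Return tokens whose prefix-stripped form equals the stripped query."""
    target = strip_prefix(query)
    return [t for t in tokens if strip_prefix(t) == target]


if __name__ == "__main__":
    # "the book", "and the book", "with the book", "office"
    tokens = ["الكتاب", "والكتاب", "بالكتاب", "مكتب"]
    query = "كتاب"  # "book"

    exact = [t for t in tokens if t == query]
    print("exact matches:", exact)
    print("after prefix stripping:", find_matches(query, tokens))
```

In this sample, exact comparison finds nothing because every occurrence of the word carries a prefix, while the prefix-stripped comparison finds the three affixed forms and leaves the unrelated word alone.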

Multilingual Information Extraction: A Multi-Tier Approach

Fast, accurate information extraction from high volumes of non-English text is a multi-tier problem. Basis Technology delivers a multi-tier set of advanced linguistic components whose functions can be called by your application as it needs them. Mix and match these interoperable software modules to meet your particular needs. Each is accessible at the program level via Basis-provided C++, .NET, or Java APIs and for the end user via a command line interface.

With 40+ government deployments, 200+ commercial deployments, and over a decade of experience, Basis Technology is a trusted source of multilingual information processing components for Federal government agencies.

Here’s a product overview, with problems solved:

Product: Problem solved

  • Rosette Language Identifier: Identifies the language(s) in a document so that applications can properly categorize, search, process, and store its data
  • Rosette Base Linguistics: Discovers structure in unstructured text so that large-scale document handling systems can identify, classify, analyze, index, and search
  • Rosette Name Translator: Provides the precise, correct English version of a name
  • Rosette Name Indexer: Matches names written in English with the official name of a person or location in a foreign country
  • Rosette Entity Extractor: Locates important concepts (e.g., names, locations, dates)


Desktop Solution: Problem solved

  • Transliteration Assistant: Automates the process of writing Arabic names according to officially mandated spellings
  • Arabic Editor: Powerful desktop environment for composition and analysis of Arabic language documents
  • GeoScope: Smart viewer for digital maps capable of searching multilingual place names

Read about our work with the Intelligence Community

EContent Magazine
The Truth is in There: Sleuthing for Data with Digital Forensics
March 1, 2007

EContent Magazine
Building on Basis for Multilingual Digital Forensics
Volume 29, Number 4, May 2006

Military Information Technology
Military and intelligence communities’ growing need to translate and retrieve pertinent foreign-language intelligence
March 13, 2006

Military Information Technology
Smart Searching: New technology is helping defense intelligence analysts sort through huge volumes of data.
November 28, 2005

Information Week
"Email Analysis Is Key To Catching Terrorists and Corporate Crooks"
July 27, 2005

CNET News.com
Tech's part in preventing attacks
July 7, 2005

Government Computer News
CIA Pumps Capital into Linguistics Software
April 9, 2004

Federal News Radio
Interview with Basis Technology CEO Carl Hoffman
April 12, 2004

The Washington Post
“In Brief” [Registration required]
April 12, 2004


