Accurately match names and variations in many languages
Processing names is a major challenge for any business or government, because the same name can be written in many ways: think of nicknames, initials, and titles. Identifying the same name across languages is harder still, and misspellings compound the problem. Rosette® Name Indexer (RNI) addresses these challenges with a single, universal index that lets it compare and match the names of people, places, and organizations across different languages, despite their many variations.
As linguistics experts working at the intersection of language and technology, Basis Technology continually improves the Rosette product family with language additions, feature updates, and the latest innovations from the academic world. RNI is built specifically to match the names of entities. Find out how your organization can put this technology to work.
- Simple API
- High Scale and Throughput
- Industrial-strength Support
- Easy Installation
- Flexible and Customizable
- Integration: Java, C++, or Web Services
- Platform: Unix, Linux, Mac, PC (64- or 32-bit)
- Component of the Rosette SDK
How it Works
RNI returns a confidence score based on a name’s similarity to existing names in the index. This “fuzzy” search automatically matches names within large collections of documents and unstructured text, or within the records of existing databases.
Unlike expensive and less accurate legacy solutions driven by lists of spelling variants, RNI analyzes the intrinsic structure of each name component and performs an intelligent comparison using advanced linguistic algorithms. This approach is not limited to a particular list of variants and reduces the likelihood of both “false positives” (wrong matches) and “false negatives” (zero hits or missed matches). When only some components of a name match, RNI aligns the components of the input name with those of index entries to recognize partial matches.
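The component-wise comparison described above can be illustrated with a minimal sketch. This is not RNI's algorithm, which this datasheet does not detail. It substitutes plain character-level similarity from Python's standard `difflib` module for the linguistic analysis, and the function names `component_similarity` and `name_score` are hypothetical.

```python
from difflib import SequenceMatcher


def component_similarity(a: str, b: str) -> float:
    """Similarity of two name components in [0, 1], ignoring case.

    Assumption: plain character similarity stands in for linguistic analysis.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def name_score(query: str, indexed: str) -> float:
    """Align each query component with its best unused indexed component.

    Dividing by the component count of the longer name means that
    components missing from either side lower the score rather than
    being ignored, so partial matches still score above zero.
    """
    q_parts, i_parts = query.split(), indexed.split()
    if not q_parts or not i_parts:
        return 0.0
    unused = list(i_parts)
    total = 0.0
    for part in q_parts:
        if not unused:
            break
        # Greedy alignment: take the closest remaining indexed component.
        best = max(unused, key=lambda cand: component_similarity(part, cand))
        total += component_similarity(part, best)
        unused.remove(best)
    return total / max(len(q_parts), len(i_parts))
```

With this sketch, an exact match scores 1.0, a misspelled name scores somewhat lower, and a name with a missing component scores lower still. The exact values depend entirely on the similarity measure chosen and will not reproduce RNI's scores.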
Financial institutions use RNI to manage and update watch lists in order to block terrorists’ access to funds, simultaneously avoiding compliance violations and protecting their reputation. Other applications include fraud detection, anti-money-laundering screening, and document triage.
Customize to Your Needs
- Set the minimum threshold of the confidence score to manage the precision and recall of the returned search results.
- Ignore a given list of words (“stopwords”), such as titles and honorifics, when matching.
- Force two name components to always match with a given score. (e.g., “Elizabeth” and “Lisbeth” always match at 90%)
- Force two names to always match with a given score. (e.g., “John Doe” and “Joe Bloggs” always match at 95%)
- Link multiple names to a single individual. (e.g., queries for “Marilyn Monroe” and “Norma Jeane Mortensen” return the same person)
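To make these options concrete, here is a hedged sketch of how a threshold, stopwords, forced scores, and linked names could interact in a toy matcher. Every name in it (`MatchConfig`, `score`, `search`, the `person-1` style ids) is hypothetical and chosen for illustration. It does not show RNI's API, and scores are on a 0 to 1 scale rather than percentages.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class MatchConfig:
    """Hypothetical settings mirroring the options listed above."""
    threshold: float = 0.80
    stopwords: set = field(default_factory=set)
    # Keys are frozensets of two lowercase strings; values are forced scores.
    component_overrides: dict = field(default_factory=dict)
    name_overrides: dict = field(default_factory=dict)


def _components(name, cfg):
    """Lowercase, strip periods and commas, and drop stopwords."""
    parts = [p.strip(".,").lower() for p in name.split()]
    return [p for p in parts if p and p not in cfg.stopwords]


def _component_score(a, b, cfg):
    forced = cfg.component_overrides.get(frozenset((a, b)))
    if forced is not None:
        return forced
    return SequenceMatcher(None, a, b).ratio()


def score(query, indexed, cfg):
    """Score two names, honoring full-name overrides before anything else."""
    forced = cfg.name_overrides.get(frozenset((query.lower(), indexed.lower())))
    if forced is not None:
        return forced
    q, i = _components(query, cfg), _components(indexed, cfg)
    if not q or not i:
        return 0.0
    unused, total = list(i), 0.0
    for part in q:
        if not unused:
            break
        best = max(unused, key=lambda c: _component_score(part, c, cfg))
        total += _component_score(part, best, cfg)
        unused.remove(best)
    return total / max(len(q), len(i))


def search(query, index, cfg):
    """Return {person_id: best_score} for scores at or above the threshold.

    `index` maps each indexed name to a person id; linked names share an id.
    """
    hits = {}
    for name, person_id in index.items():
        s = score(query, name, cfg)
        if s >= cfg.threshold:
            hits[person_id] = max(s, hits.get(person_id, 0.0))
    return hits


# Example configuration using the cases from the list above.
cfg = MatchConfig(
    threshold=0.85,
    stopwords={"dr", "mr", "mrs"},
    component_overrides={frozenset(("elizabeth", "lisbeth")): 0.90},
    name_overrides={frozenset(("john doe", "joe bloggs")): 0.95},
)
index = {
    "Marilyn Monroe": "person-1",
    "Norma Jeane Mortensen": "person-1",  # linked to the same individual
    "Elizabeth Taylor": "person-2",
}
```

Raising `threshold` trades recall for precision: fewer candidates are returned, but those that remain are closer matches. Because both of Marilyn Monroe's names map to the same id, a query for either one returns `person-1`.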
RNI scores progressively degraded variations with appropriately lower confidence:
INDEXED NAME: Jesus Alfonso Lopez Diaz
| Score | Query name | Variation added |
|---|---|---|
| 92 | Jesus Alfonso Lobez Deaz | + Misspelled family name |
| 84 | Jesus Alfonso Deaz | + Mother’s father’s name removed |
| 80 | Jesus A. Deaz | + Middle name replaced with an initial |
| 78 | Chuy A. Deaz | + Given name replaced with a nickname |
| 58 | Deaz, Chuy A. | + Reordered name components |