Rosette Name Indexer fuzzy matches names written in the native script and in Latin script for each supported language. Additional languages are available via custom development.
Rosette® Name Indexer uses fuzzy matching to find personal, location, and organizational names. For any name queried, the software searches the user-provided database of names and returns a ranked list of possible matches, each with a similarity score. It can be used to link names across different databases and enhance data quality. The name matcher is accurate, scalable, and easy to integrate with applications and databases, and it is used by governments and companies around the world.
Rosette Name Indexer handles the following name matching problems.
|Matching problem|Examples|
|---|---|
|Same name written in multiple languages|“Mao Zedong”, “Мао Цзэдун”, “毛泽东”, “毛澤東”|
|Phonetic spelling differences|“Cairns”, “Kearns”, “Kerns”|
|Transliteration spelling differences|“Abdul Rasheed”, “Abd-al-Rasheed”, “Abd Ar-Rashid”|
|Nicknames|“William”, “Will”, “Bill”, “Billy”|
|Initials|“J. E. Smith”, “James Earl Smith”|
|Titles and honorifics|“Dr.”, “Mr.”, “Esq.”, etc.|
|Out-of-order name components|“Diaz, Carlos Alfonzo”, “Carlos Alfonzo Diaz”|
|Missing name components|“Phillip Charles Carr”, “Phillip Carr”|
|Missing spaces (variable segmentation)|“MaryEllen”, “Mary Ellen”, “Mary-Ellen”|
|Truncated name components|“Royal Bank of Sco”, “Mcdonal”, “Stev”|
The same name can be intentionally written in many different ways depending on the context. Sometimes people use their full middle name, sometimes just the initial, and sometimes they will skip it entirely. Typographical errors are introduced when names are entered into a computer system. Typical mistakes include missing letters, swapped letters, phonetic variations, etc. In all of these cases, Rosette will still find the matching names.
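To make these variations concrete, here is a minimal Python sketch of how a few of them (titles, nicknames, initials, reordered or missing components, and typos) might be scored. It is an illustration only and not Rosette's algorithm: the `TITLES` and `NICKNAMES` tables, the function names, and the 0.9 weights are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for a real similarity model.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny lookup tables (illustration only).
TITLES = {"dr", "mr", "mrs", "ms", "esq"}
NICKNAMES = {"will": "william", "bill": "william", "billy": "william"}


def tokens(name):
    """Lowercase, replace punctuation with spaces, drop titles, map nicknames."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    parts = [p for p in cleaned.split() if p not in TITLES]
    return [NICKNAMES.get(p, p) for p in parts]


def token_score(a, b):
    """Score two tokens in [0, 1]; a single letter is treated as an initial."""
    if len(a) == 1 or len(b) == 1:
        return 0.9 if a[0] == b[0] else 0.0  # assumed weight for an initial match
    return SequenceMatcher(None, a, b).ratio()  # tolerant of small typos


def name_similarity(x, y):
    """Greedily align tokens regardless of order; penalize leftover tokens."""
    tx, ty = tokens(x), tokens(y)
    if not tx or not ty:
        return 0.0
    short, long_ = sorted((tx, ty), key=len)
    remaining = list(long_)
    total = 0.0
    for t in short:
        best = max(remaining, key=lambda r: token_score(t, r))
        total += token_score(t, best)
        remaining.remove(best)
    # Average over the shorter name; assumed 10% penalty per unmatched token.
    return (total / len(short)) * (0.9 ** len(remaining))
```

Under these assumptions, `name_similarity("Diaz, Carlos Alfonzo", "Carlos Alfonzo Diaz")` and `name_similarity("Dr. Bill Smith", "William Smith")` both return 1.0, while a dropped middle name such as “Phillip Charles Carr” versus “Phillip Carr” scores slightly lower. A production matcher would need far richer rules and language-specific models than this sketch.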
Our name matching software handles more than just English names. It can also match a name as it’s written in English or in Arabic, Chinese, Japanese, Korean, Persian (Farsi/Dari), Russian, or Urdu. Users with a database of English names can query it starting with a non-English name. Thus, a search for 김정일 or 金正日 will match “Kim Jong-il.” Conversely, a program which only accepts queried names in English can—with Rosette Name Indexer—search databases of names in foreign languages.
Unlike other name matchers, our algorithm compares English and foreign-language names directly, without first translating the foreign names. This approach improves accuracy: each time a name is translated before comparison, information is lost, which lowers the chance of an accurate match.
Rosette Name Indexer uses an integrated suite of algorithms that combine rules, nickname lists, and statistical models to calculate similarity. The result is greater speed than systems that rely on lists of thousands of name variations, and greater flexibility to match previously unseen variations. For cross-language matching, our software directly compares names without first translating the foreign language name. Read about the pros and cons of different name matching technologies in our name matching whitepaper.
Rosette Name Indexer is a software development kit with a Java API or web service interface. With its small footprint, Rosette can be rapidly integrated into new or existing systems of any size—from desktop-based software to server farms. The name indexer’s high speed adds capability to a system without slowing it down.
The name matcher is suitable for:
Rosette builds a high-performance search engine from your set of names by indexing them. Each time Rosette is queried about a name, it returns a list of matches ranked by a confidence score from 0% to 100%. Users can set a minimum match threshold to ensure the quality of the results returned.
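The index, query, and threshold workflow described above can be pictured with a small Python sketch. This is not the Rosette API: the `ToyNameIndex` class, its `query` method, and the `min_score` and `limit` parameters are hypothetical names chosen for illustration. A linear scan with `difflib.SequenceMatcher` stands in for a real index, and it measures plain string similarity only, so it will not catch phonetic or cross-script matches.

```python
from difflib import SequenceMatcher


class ToyNameIndex:
    """Hypothetical stand-in for a name index; scans every stored name per query."""

    def __init__(self, names):
        self._names = list(names)
        # Precompute a lowercased key for each name once, at "index" time.
        self._keys = [n.lower() for n in self._names]

    def query(self, name, min_score=0.0, limit=10):
        """Return up to `limit` (name, score) pairs with score >= min_score.

        Scores are percentages from 0 to 100, sorted from best to worst.
        """
        q = name.lower()
        scored = []
        for original, key in zip(self._names, self._keys):
            score = round(100 * SequenceMatcher(None, q, key).ratio(), 1)
            if score >= min_score:
                scored.append((original, score))
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


index = ToyNameIndex(["Kearns", "Cairns", "Smith"])
for matched_name, score in index.query("Kerns", min_score=70):
    print(f"{matched_name}: {score}%")
```

Raising `min_score` trades recall for precision: fewer candidates come back, but each is a closer match to the queried name.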