Rosette Entity Resolver

Find out who, where, and what is really being discussed



Resolve entity variety and ambiguity, and discover “ghosts”

Entity extraction only finds the words that represent entities, whereas entity resolution connects those words to real-life people, organizations, and locations by looking at the context of each entity mention. Is “Clinton” in the context of “former U.S. president” or “former U.S. secretary of state”? From that context, our entity resolver determines whether the mention refers to Bill Clinton or Hillary Clinton.

Our entity resolver accounts for:

  • Ambiguity: When one name can refer to two or more entities, use context to decide which entity it is
  • Variety: For entities with more than one name, group them as “synonyms”
  • “New” entities: For entities new to your knowledge base, use context to track these “ghosts” in learning mode
  • Connections: Follow and connect multiple mentions of each entity in one document or across all your documents
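
As a rough illustration of the ambiguity case, the sketch below picks between candidate entities by counting how many context words overlap with each candidate's keywords. It is a toy heuristic, not Rosette's model; the knowledge base contents, the entity IDs, and the `resolve` function are all hypothetical.

```python
import re

# Hypothetical mini knowledge base: entity ID -> name and descriptive keywords.
KNOWLEDGE_BASE = {
    "bill_clinton": {
        "name": "Bill Clinton",
        "keywords": {"president", "arkansas", "governor"},
    },
    "hillary_clinton": {
        "name": "Hillary Clinton",
        "keywords": {"secretary", "state", "senator"},
    },
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve(mention, context):
    """Return (entity_id, score) for the best-matching candidate.

    A candidate is any entry whose name contains the mention. The score is
    the number of keywords shared with the context. Returns (None, 0) when
    no candidate shares any keyword with the context.
    """
    context_words = tokenize(context)
    best_id, best_score = None, 0
    for entity_id, entry in KNOWLEDGE_BASE.items():
        if mention.lower() not in entry["name"].lower():
            continue
        score = len(entry["keywords"] & context_words)
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id, best_score


print(resolve("Clinton", "former U.S. president Clinton spoke"))
print(resolve("Clinton", "former U.S. secretary of state Clinton"))
```

A production resolver would use a trained model and a confidence score rather than a raw keyword count, but the idea of letting context choose among same-named candidates is the same.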

Designed for real-world applications

Entity resolution results become metadata that powers entity-centric search and discovery. Each linking decision carries a confidence measure, so you control how the data is used. Rosette provides the building blocks for notification applications that detect and track new people in text streams, and for creating custom knowledge graphs.

Product highlights

  • Pre-trained to link to 2.5 million Wikipedia entities
  • Customizable to link to custom knowledge bases
  • 20 languages supported
  • Entity type support for person, location, organization
  • Linking mode: rapidly links entities to knowledge base entries
  • Optimized for tweets
  • Industrial-strength support
  • Cloud and enterprise deployments

Examples of ambiguity

Tamerlin Tsarnaev (
Tamerlane Tsarnaevy (

Paris, Texas (33°39 N, 95°32 W)
Paris, France (48°51 N, 2°21 E)

How It Works

Linking mode

In linking mode, Rosette will rapidly link the names of people, organizations and locations in your text to entities in the chosen knowledge base. Anything that can’t be associated with an existing entity will be ignored.

This mode is optimized for high scale and stable throughput.

Learning mode

In learning mode, Rosette not only links names to known entities, but also discovers new entities (called “ghosts”), and remembers the new aliases and contexts it has found for all these entities.

For example, once “J. Doe” has been encountered and linked to the “John Doe” entity, future occurrences of “J. Doe” will be matched with greater confidence.
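
The difference between the two modes can be sketched with a toy resolver. This is only an illustration under simplifying assumptions: `ToyResolver`, its exact-match alias table, and the `GHOST-n` ID scheme are hypothetical, and the sketch does not model context or confidence.

```python
class ToyResolver:
    """Toy resolver contrasting linking mode and learning mode."""

    def __init__(self, aliases, learning=False):
        # aliases maps an exact name string to an entity ID (hypothetical).
        self.aliases = dict(aliases)
        self.learning = learning
        self.ghost_count = 0

    def resolve(self, mention):
        entity_id = self.aliases.get(mention)
        if entity_id is not None:
            return entity_id
        if not self.learning:
            # Linking mode: names with no known entity are ignored.
            return None
        # Learning mode: create a "ghost" entity and remember the name.
        self.ghost_count += 1
        ghost_id = f"GHOST-{self.ghost_count}"
        self.aliases[mention] = ghost_id
        return ghost_id


linker = ToyResolver({"John Doe": "E1"})
learner = ToyResolver({"John Doe": "E1"}, learning=True)

print(linker.resolve("Jane Roe"))   # unknown name is ignored
print(learner.resolve("Jane Roe"))  # unknown name becomes a ghost
print(learner.resolve("Jane Roe"))  # the same ghost is reused
```

Because the learning resolver mutates its own state, it needs somewhere persistent to keep that state, which is why a stateless service can only offer the linking behavior.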

Connecting to your own knowledge base

Rosette comes pre-trained to link to a Wikipedia-derived 2M+ entity database. It may be further trained by adding to this entity database or by providing an entirely new database.

Training currently involves adding information about real-world entities, such as names, aliases, related entities, and example documents, to the system. A simple example is adding a new alias to a Wikipedia-derived entity to improve resolution accuracy.

EXAMPLE: Basketball player Jeremy Lin is often referred to as “Linsanity”.

Training allows developers to add the “Linsanity” alias to the entry for Jeremy Lin. The next time “Linsanity” is encountered, it will be resolved appropriately.
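
The alias example can be sketched with a plain dictionary standing in for the entity database. The data layout, the `jeremy_lin` ID, and the `add_alias` and `lookup` helpers are hypothetical and are not part of Rosette's training interface.

```python
def add_alias(knowledge_base, entity_id, alias):
    """Attach a new alias to an existing entry (KeyError if unknown)."""
    knowledge_base[entity_id]["aliases"].add(alias)


def lookup(knowledge_base, mention):
    """Return IDs of entries whose name or aliases equal the mention,
    ignoring case."""
    wanted = mention.casefold()
    matches = []
    for entity_id, entry in knowledge_base.items():
        names = {entry["name"], *entry["aliases"]}
        if any(wanted == name.casefold() for name in names):
            matches.append(entity_id)
    return matches


# Hypothetical entry standing in for the Wikipedia-derived record.
kb = {"jeremy_lin": {"name": "Jeremy Lin", "aliases": set()}}

before = lookup(kb, "Linsanity")   # alias not yet known
add_alias(kb, "jeremy_lin", "Linsanity")
after = lookup(kb, "Linsanity")    # alias now resolves to the entry

print(before, after)
```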

Under the hood

Rosette Entity Resolver uses a machine-learned model to associate names and their contexts with collections of information about known entities drawn from the entity databases.

In linking mode, both the set of entities and the information stored for each one stay fixed.

Learning mode allows new entities to be created and new information to be added to existing entities. As this system state grows, the entity resolver intelligently prunes the information to maintain performance.

Tech Specs

Availability and platform support

Deployment availability:

Language support

Arabic                 French       Italian    Japanese
Chinese, Simplified    German       Korean     Russian
Chinese, Traditional   Hebrew       Malay      Spanish
Dutch                  Hungarian    Pashto     Urdu
English                Indonesian   Persian    Vietnamese

Try the Demo


Easy to use

Built for the most demanding text analytics applications and engineered to deliver high accuracy without sacrificing speed, Rosette Cloud is instantly accessible and offers a variety of plans to suit both startups and enterprises.

Try entity linking and the rest of Rosette Cloud’s endpoints for free up to 10,000 calls/month!

Get a Rosette Cloud Key

Optimized for speed & tweets

Our cloud implementation of entity resolution has been optimized to be extra fast and accurate when processing tweets or any other short, “noisy” text. Because no state is retained in the cloud, learning mode (finding new entities) is available only through Rosette Enterprise.

Quality documentation and support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various Rosette Cloud endpoints alongside examples in the binding of your choice.

Visit our GitHub for the binding and documentation.

Enterprise ready

Evaluate Rosette’s functional fit with your business and data needs on Rosette Cloud, knowing that scalable, customizable enterprise deployments are available if you need them.

{
  "entities": [
    {
      "type": "PERSON",
      "mention": "Bill Murray",
      "normalized": "Bill Murray",
      "count": 1,
      "entityId": "Q29250",
      "confidence": 0.9990000128746033
    },
    {
      "type": "PRODUCT",
      "mention": "Ghostbusters",
      "normalized": "Ghostbusters",
      "count": 1,
      "entityId": "Q108745"
    },
    {
      "type": "TITLE",
      "mention": "Dr.",
      "normalized": "Dr.",
      "count": 1,
      "entityId": "T2",
      "confidence": 0.9990000128746033
    },
    {
      "type": "PERSON",
      "mention": "Peter Venkman",
      "normalized": "Peter Venkman",
      "count": 1,
      "entityId": "Q2483011",
      "confidence": 0.9990000128746033
    },
    {
      "type": "LOCATION",
      "mention": "Boston",
      "normalized": "Boston",
      "count": 1,
      "entityId": "Q100"
    },
    {
      "type": "IDENTIFIER:URL",
      "mention": "",
      "normalized": "",
      "count": 1,
      "entityId": "T5"
    }
  ]
}

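One way to act on the confidence measure is to keep only the entities whose `confidence` meets a threshold you choose. The sketch below assumes a response shaped like the sample above (an `entities` array whose items may or may not carry `confidence`); the `confident_entities` helper, the shortened `SAMPLE` payload, and the default threshold are illustrative assumptions.

```python
import json

# Shortened payload using field names from the sample response.
SAMPLE = """
{"entities": [
  {"type": "PERSON", "mention": "Bill Murray",
   "entityId": "Q29250", "confidence": 0.9990000128746033},
  {"type": "LOCATION", "mention": "Boston", "entityId": "Q100"}
]}
"""


def confident_entities(response_text, threshold=0.9):
    """Return (mention, entityId, confidence) for entities at or above
    the threshold. Entities without a confidence value are skipped."""
    response = json.loads(response_text)
    kept = []
    for entity in response.get("entities", []):
        confidence = entity.get("confidence")
        if confidence is not None and confidence >= threshold:
            kept.append((entity["mention"], entity["entityId"], confidence))
    return kept


for mention, entity_id, confidence in confident_entities(SAMPLE):
    print(f"{mention} -> {entity_id} ({confidence:.3f})")
```

Skipping entities that lack a confidence value is a conservative choice for this sketch; an application could instead route them to a separate review queue.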

Customize and scale your text analytics on premise

Take advantage of our entity resolver’s learning mode to track new entities, or train Rosette Enterprise to link to your own knowledge base. Organizations with vast data quantities, unique integration needs, or data security restrictions may want to investigate our enterprise deployments, which are hosted on your internal servers.

Request product evaluation

If your organization requires an enterprise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of our enterprise deployments, please complete the form below and our Customer Engineering team will provide you with an evaluation package.

Drop us a line



Select Customers Include

No Coding Required



RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models and operationalize predictive analytics within any business process.

Try RapidMiner