Rosette Entity Resolver


Find out who, where, and what are really being discussed

Overview

Resolve entity variety and ambiguity, and discover “ghosts”

Entity extraction just finds the words representing entities, whereas entity resolution connects those words to real-life people, organizations, and locations by looking at the context of each entity mention. Does “Clinton” appear in the context of “former U.S. president” or “former U.S. secretary of state”? From that context, our entity resolver determines whether the mention refers to Bill Clinton or Hillary Clinton.
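
The idea of using context to choose between candidates can be sketched in a few lines of Python. This is a toy illustration only: the candidate list, the keyword sets, and the `resolve` function are assumptions made up for this example, and they do not represent Rosette's model, data, or API.

```python
from typing import Optional

# Hand-written candidates and context keywords (assumptions for illustration).
CANDIDATES = {
    "Clinton": {
        "Bill Clinton": {"president", "arkansas", "governor"},
        "Hillary Clinton": {"secretary", "state", "senator"},
    }
}


def resolve(mention: str, context: str) -> Optional[str]:
    """Pick the candidate whose keywords overlap most with the context words."""
    cleaned = context.lower().replace(".", " ").replace(",", " ")
    words = set(cleaned.split())
    best, best_score = None, 0
    for entity, keywords in CANDIDATES.get(mention, {}).items():
        score = len(keywords & words)
        if score > best_score:
            best, best_score = entity, score
    return best  # None when no candidate shares any word with the context


print(resolve("Clinton", "former U.S. president Clinton"))
print(resolve("Clinton", "former U.S. secretary of state Clinton"))
```

A production resolver would rely on a trained model rather than keyword overlap; the sketch only shows why the surrounding words matter.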

Our entity resolver accounts for:

  • Ambiguity: When one name can refer to two or more entities, use context to decide which entity it is
  • Variety: For entities with more than one name, group them as “synonyms”
  • “New” entities: For entities new to your knowledge base, use context to track these “ghosts” in learning mode
  • Connections: Follow and connect multiple mentions of each entity in one document or across all your documents

Designed for real-world applications

Entity resolution becomes metadata to power entity-centric search and discovery. With confidence measures for each linking decision, you control how the data is used. Rosette provides the building blocks for notification applications detecting and tracking new people in text streams and for creating custom knowledge graphs.

Product Highlights

  • Pre-trained to link to 2.5 million Wikipedia entities
  • Customizable to link to custom knowledge bases
  • 13 languages supported in the on-premise SDK, 4 in the cloud API
  • Entity type support for person, location, organization
  • Linking mode: rapidly links entities to knowledge base entries
  • Learning mode: finds and tracks entities (“ghosts”) not in your knowledge base (SDK only)
  • Optimized for tweets (cloud API only)
  • Industrial-strength support

Examples of variety and ambiguity

Tamerlin Tsarnaev (TheAtlantic.com)
—or—
Tamerlane Tsarnaevy (Mir24.net)

Paris, Texas (33°39′ N, 95°32′ W)
—or—
Paris, France (48°51′ N, 2°21′ E)

How It Works

Linking mode

In linking mode, Rosette will rapidly link the names of people, organizations and locations in your text to entities in the chosen knowledge base. Anything that can’t be associated with an existing entity will be ignored.

This mode is optimized for high scale and stable throughput.

Learning mode

In learning mode, Rosette not only links names to known entities, but also discovers new entities (called “ghosts”), and remembers the new aliases and contexts it has found for all these entities.

For example, once “J. Doe” has been encountered and linked to the “John Doe” entity, future occurrences of “J. Doe” will be matched with greater confidence.
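
The behavior described above can be mimicked with a small in-memory sketch. Everything here is hypothetical: the `ToyLearner` class, the `hint` argument (a stand-in for the context evidence a real resolver would use), the `GHOST-n` identifiers, and the confidence formula are invented for illustration and are not part of the Rosette SDK.

```python
class ToyLearner:
    """Remembers aliases it has linked and creates "ghosts" for unknown names."""

    def __init__(self, known):
        # alias (lowercased) -> entity id, seeded from the knowledge base
        self.aliases = {
            name.lower(): eid for eid, names in known.items() for name in names
        }
        self.seen = {}   # alias -> number of times it has been linked
        self.ghosts = 0  # number of ghost entities created so far

    def link(self, mention, hint=None):
        key = mention.lower()
        if key not in self.aliases:
            if hint is not None:
                # New alias for an existing entity
                self.aliases[key] = hint
            else:
                # Unknown name with no evidence: track it as a ghost
                self.ghosts += 1
                self.aliases[key] = f"GHOST-{self.ghosts}"
        self.seen[key] = self.seen.get(key, 0) + 1
        # Made-up confidence score that rises with repeated sightings
        confidence = 1 - 0.5 ** self.seen[key]
        return self.aliases[key], confidence


learner = ToyLearner({"john_doe": ["John Doe"]})
print(learner.link("J. Doe", hint="john_doe"))  # first sighting
print(learner.link("J. Doe"))                   # remembered alias, higher score
print(learner.link("Jane Roe"))                 # not in the knowledge base
```

The point of the sketch is the retained state: because the alias table and counters persist between calls, later mentions benefit from earlier ones.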

Connecting to your own knowledge base

Rosette comes pre-trained to link to a Wikipedia-derived 2M+ entity database. It may be further trained by adding to this entity database or by providing an entirely new database.

Training currently involves adding information about real-world entities, such as names, aliases, related entities, and example documents, to the system. A simple example is adding a new alias to a Wikipedia-derived entity to improve resolution accuracy.

EXAMPLE: Basketball player Jeremy Lin is often referred to as “Linsanity”.

Training allows developers to add the “Linsanity” alias to the entry for Jeremy Lin. The next time “Linsanity” is encountered, it will be resolved appropriately.
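
As a rough mental model, adding an alias amounts to extending the set of names attached to an entity. The sketch below assumes a plain dictionary as the knowledge base; the `add_alias` and `lookup` helpers and the `jeremy_lin` identifier are placeholders invented for this example, not the actual training interface.

```python
# entity id -> set of known names (the id is a made-up placeholder)
knowledge_base = {"jeremy_lin": {"Jeremy Lin"}}


def add_alias(kb, entity_id, alias):
    """Attach one more name to an entity, creating the entry if needed."""
    kb.setdefault(entity_id, set()).add(alias)


def lookup(kb, mention):
    """Return ids of entities with a name matching the mention, ignoring case."""
    m = mention.casefold()
    return sorted(
        eid for eid, names in kb.items() if any(n.casefold() == m for n in names)
    )


print(lookup(knowledge_base, "Linsanity"))  # no match before training
add_alias(knowledge_base, "jeremy_lin", "Linsanity")
print(lookup(knowledge_base, "Linsanity"))  # resolves after the alias is added
```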

Under the hood

Rosette Entity Resolver uses a machine-learned model to associate names and their contexts with collections of information drawn from databases of known entities.

In linking mode, both the set of entities and the information stored about each one are fixed.

Learning mode allows new entities to be created and new information to be added to existing entities. As this system state grows, the entity resolver intelligently prunes the information to maintain performance.

Tech Specs

Availability and Platform Support

Deployment availability:
Plugins:
Bindings:

Languages for On-Premise SDK

  • Arabic
  • Chinese, Simplified
  • Chinese, Traditional
  • English
  • Japanese
  • Korean
  • Malay
  • Pashto
  • Persian (Dari)
  • Persian (Farsi)
  • Russian
  • Spanish
  • Urdu

Languages in the Cloud API

  • Chinese
  • English
  • Japanese
  • Spanish

Entity Types

  • Person
  • Location
  • Organization

Try the Demo

Cloud API

Easy to use API

Ideal for product evaluation, academic research, and smaller, cost-conscious businesses, our fast and powerful API is instantly accessible and free to get started.

Try entity linking and the rest of Rosette API’s endpoints for free up to 10,000 calls/month!

Get an API Key

Optimized for speed & tweets

Our cloud API implementation of entity resolution has been optimized to be extra fast and accurate when processing tweets or any short, “noisy” text. Since no state is retained in the cloud API, learning mode (finding new entities) is only available through Rosette on-premise.

Quality documentation and support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various API endpoints alongside examples in the binding of your choice.

Visit our GitHub for the bindings and documentation.

Enterprise ready

Evaluate Rosette’s functional fit with your business and data needs on our cloud API knowing that scalable, customizable, on-premise deployments are available if you need them.

{
  "entities": [
    {
      "type": "PERSON",
      "mention": "Bill Murray",
      "normalized": "Bill Murray",
      "count": 1,
      "entityId": "Q29250",
      "confidence": 0.9990000128746033
    },
    {
      "type": "PRODUCT",
      "mention": "Ghostbusters",
      "normalized": "Ghostbusters",
      "count": 1,
      "entityId": "Q108745"
    },
    {
      "type": "TITLE",
      "mention": "Dr.",
      "normalized": "Dr.",
      "count": 1,
      "entityId": "T2",
      "confidence": 0.9990000128746033
    },
    {
      "type": "PERSON",
      "mention": "Peter Venkman",
      "normalized": "Peter Venkman",
      "count": 1,
      "entityId": "Q2483011",
      "confidence": 0.9990000128746033
    },
    {
      "type": "LOCATION",
      "mention": "Boston",
      "normalized": "Boston",
      "count": 1,
      "entityId": "Q100"
    },
    {
      "type": "IDENTIFIER:URL",
      "mention": "http://dlvr.it/BnsFfS",
      "normalized": "http://dlvr.it/BnsFfS",
      "count": 1,
      "entityId": "T5"
    }
  ]
}
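
A response shaped like the sample above can be filtered by confidence before it is used downstream. The following sketch uses only Python's standard `json` module; the `confident_links` helper, the 0.9 threshold, and the choice to treat a missing `confidence` field as 0.0 are assumptions for illustration, not documented API behavior.

```python
import json

# Abbreviated copy of the sample response shown above
SAMPLE = """
{"entities": [
  {"type": "PERSON", "mention": "Bill Murray", "entityId": "Q29250",
   "confidence": 0.9990000128746033},
  {"type": "PRODUCT", "mention": "Ghostbusters", "entityId": "Q108745"},
  {"type": "LOCATION", "mention": "Boston", "entityId": "Q100"}
]}
"""


def confident_links(response_text, threshold=0.9):
    """Return (mention, entityId) pairs whose confidence meets the threshold."""
    entities = json.loads(response_text).get("entities", [])
    return [
        (e["mention"], e["entityId"])
        for e in entities
        # Assumption: entries without a confidence value are treated as 0.0
        if e.get("confidence", 0.0) >= threshold
    ]


print(confident_links(SAMPLE))  # [('Bill Murray', 'Q29250')]
```

Raising or lowering the threshold is one way to trade precision against coverage when the links feed a search index or knowledge graph.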

On Premise

Customize and scale your text analytics on premise

Take advantage of our entity resolver’s learning mode to track new entities, or train Rosette to link to your own knowledge base, using Rosette on-premise. Organizations with vast data quantities, unique integration needs, or data security restrictions may also want to investigate our on-premise API deployment and SDKs, which can be hosted on their internal servers.

Request product evaluation

If your organization requires an on-premise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of our on-premise deployments, please complete the form below and our Customer Engineering team will provide you with an on-premise evaluation package.

Drop us a line

EMAIL:
info@basistech.com

PHONE:
+1-617-386-2000

No coding required

RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models and operationalize predictive analytics within any business process.

Try RapidMiner