Unique solutions to the challenges of multilingual information processing




Our solutions are a customer-specific blend of off-the-shelf software, professional services and custom application development. The following are examples of our solutions for specific markets.

Search and Text Mining

Most of the world's popular information retrieval solutions incorporate our technology so that they can deliver more accurate, meaningful results to their customers. Whether those customers are searching the Web for information, a web site for a favorite product or service, or their own company's systems for important business information, their search experience is enhanced by Basis Technology's Rosette Linguistics Platform. Search and Text Mining applications use Rosette to produce accurate indexing and search results in a variety of major languages and to power conceptual search and text mining through entity extraction.

For example, Search and Text Mining applications use Rosette to pre-process search queries in Asian languages that do not use spaces to separate words. They also use Rosette to normalize search queries in a variety of languages. As a result, search engines can deliver quality results regardless of the language of the query.
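
The two pre-processing steps described above, normalization and word segmentation, can be illustrated with a minimal Python sketch. This is not Rosette's API: the function names, the tiny lexicon, and the greedy longest-match strategy are all assumptions chosen for illustration, and a production system would rely on a full dictionary or a statistical model.

```python
import unicodedata

# Hypothetical toy lexicon, for illustration only.
LEXICON = {"東京", "大学", "東京大学", "図書館"}


def normalize_query(query: str) -> str:
    """Fold compatibility forms (such as full-width letters) and case."""
    return unicodedata.normalize("NFKC", query).casefold().strip()


def segment(text: str, lexicon=LEXICON, max_len: int = 4) -> list[str]:
    """Greedy longest-match segmentation for text written without spaces."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens
```

With this sketch, a query typed in full-width letters and a query typed in ordinary letters normalize to the same string, and a run of Japanese text is split into lexicon entries that an index can match.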

Analytics and Business Intelligence

ETL, Data Warehousing and Business Intelligence vendors have just about solved the problem of consolidating an organization's disparate data and making it useful to the many stakeholders in the business. But what about all of the rich knowledge that can come from unstructured sources of information such as emails, documents, and collaboration tools? Whether to manage compliance issues such as Sarbanes-Oxley or to gain a competitive edge by leveraging all of their internal data, corporations will demand access to this unstructured information in the same way they have demanded access to their structured data sources.

Basis Technology's Rosette Linguistics Platform is an integral component in the process of turning a mass of unstructured text into structured information ready to be loaded, stored and analyzed by business intelligence applications. Text is fed into the Platform through a single interface (API). As the text moves through the components of the platform it is classified by its language and its encoding, normalized and transcoded into Unicode, and then advanced linguistics are applied to it in order to tag each word and to identify the words or phrases that will become the structured data within this text. The text is delivered back through the API completely tagged and ready to be loaded into the data warehouse for processing by business intelligence applications.
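
A rough Python sketch of that flow, from raw bytes to records ready for loading, might look like the following. It is an illustration under stated assumptions, not the Platform's interface: the candidate encoding list, the entity table, and the function names are hypothetical, encoding detection is reduced to trial decoding, and language identification is omitted.

```python
import re
import unicodedata

# Hypothetical candidate encodings and entity list, for illustration only.
# latin-1 accepts any byte sequence, so it acts as a last resort.
CANDIDATE_ENCODINGS = ("utf-8", "shift_jis", "latin-1")
KNOWN_ENTITIES = {"basis technology": "ORGANIZATION", "boston": "LOCATION"}


def to_unicode(raw: bytes) -> tuple[str, str]:
    """Try each candidate encoding in order; return (text, encoding)."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding matched")


def tag_entities(text: str) -> list[dict]:
    """Return one record per known entity mention, in order of appearance."""
    normalized = unicodedata.normalize("NFC", text)
    records = []
    for name, label in KNOWN_ENTITIES.items():
        for match in re.finditer(re.escape(name), normalized, flags=re.IGNORECASE):
            records.append(
                {"text": match.group(0), "type": label, "start": match.start()}
            )
    return sorted(records, key=lambda record: record["start"])


def process(raw: bytes) -> dict:
    """Single entry point: bytes in, structured records out."""
    text, encoding = to_unicode(raw)
    return {"encoding": encoding, "entities": tag_entities(text)}
```

Each record returned by `process` is a flat dictionary, which maps directly onto a row in a warehouse table.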

Now, a corporation can not only find out what is happening by looking at trends in its results from structured sources, but can also use unstructured data to answer the "why" question. For example, an executive has been alerted by his dashboard that sales of a particular product have reached a critical alert threshold. Drilling down on this alert in his business intelligence application, he can see that sales are in fact critically low for this product. Now he wants to know why. So he drills down further and looks at the data that has been gathered from free text fields in his CRM application and from the numerous emails that have gone back and forth between his field sales people and their customers. The summary of all of this collected data tells him that there is an unusually high incidence of phrases indicating customer unhappiness in these communications, and that the concepts "accuracy" and "scalability" also appear with high frequency in the text. This gives the executive and the people he manages valuable information for quickly getting to the root cause of the decline in sales.

Information Filtering

Whether your application is designed to filter intrusive or non-intrusive information, the Rosette Linguistics Platform delivers the computational linguistics necessary to identify key words and phrases that trigger your filtering application's results. Example applications include:

  • Call Center Software - For applications that manage inbound inquiries from customers, Rosette can identify the language of incoming messages and let you route them to a customer service representative who is a native speaker of that language. In a deeper application of Rosette for message routing, you may want to pick out key words or phrases in the text that route the message to the native language speaker who handles, say, "laptops" or "ovens."
  • Anti-Spam - Critical to the process of filtering spam is being able to identify words and phrases in the incoming text that are germane to the type of spam you are filtering. Add to this challenge the complexities of different languages, and delivering an Anti-Spam product for global customers is difficult. Rosette provides the multilingual identification and tagging functions necessary to accurately identify and filter out unwanted messages. Rosette can process these messages based upon statistical analysis, or it can be trained to identify domain-specific words or phrases to fine-tune your filters.
  • Market Intelligence - Applications that comb the web looking for specific information about a company, its products, its competitors, its coverage, and its reputation must analyze text in many languages to extract the critical pieces of information that help the company make educated business decisions. By applying Rosette's linguistics, such an application can identify the language and encoding of web content, break the text into its components and extract the important concepts to build its database of the company's metrics. From there, it can apply its analytics to deliver concise decision support information.
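
The call center routing example above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the queue names, the keyword sets, and the script-based language guess are hypothetical and much cruder than real language identification, and none of it reflects Rosette's actual interface.

```python
import unicodedata

# Hypothetical queue names and trigger keywords.
TOPIC_KEYWORDS = {
    "laptops": {"laptop", "notebook"},
    "ovens": {"oven", "stove"},
}


def guess_language(text: str) -> str:
    """Rough script-based guess: 'ja' for kana, 'ru' for Cyrillic, else 'en'."""
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith(("HIRAGANA", "KATAKANA")):
            return "ja"
        if name.startswith("CYRILLIC"):
            return "ru"
    return "en"


def route(message: str) -> tuple[str, str]:
    """Return (language, topic queue) for an inbound message."""
    language = guess_language(message)
    words = {word.strip(".,!?").casefold() for word in message.split()}
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return language, topic
    return language, "general"
```

A message is first assigned a language and then matched against the keyword sets, so the same two-step pattern could feed a spam filter or a market intelligence crawler by swapping the keyword table.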