
Rosette for E-Discovery

Expand Your E-Discovery Scope to Languages Beyond English

In the age of globalization, documents created in different countries and different languages can be highly relevant to legal investigations. Litigation often spans national boundaries, and legal teams are flooded with thousands of documents in languages other than English that need to be filtered, evaluated, and analyzed.

The growing importance of multi-language discovery creates new challenges for attorneys and their technology partners. E-Discovery is already complex, and that complexity grows by orders of magnitude when it involves documents in different languages, writing systems, and character sets. Yet the need for meticulous investigation is as crucial as ever for lawyers to provide the best outcomes for their clients.

Advancing Legal Technology

Basis Technology helps the legal community meet its multilingual discovery challenges head-on. We provide comprehensive electronic discovery solutions that uncover evidence buried in terabytes of unstructured multilingual text—accurately, quickly, and cost-effectively. We do it using the most advanced linguistics software in the industry—software found at the core of virtually all leading multilingual search engines and information retrieval applications.

Our multilingual e-discovery solutions are based on Rosette®, a linguistics platform proven in hundreds of commercial and government environments. Interoperable Rosette software components are configured as building blocks for multilingual e-discovery, working seamlessly within discovery workflows and information retrieval applications while handling many different languages, character sets, and data sources.

Make Your Discovery Application Multilingual

Our industry-leading linguistics software is complemented by ease of integration within data mining, reviewing, search, and other discovery applications used by legal teams. By plugging the Rosette API into an application, users get instant access to unique e-discovery tools covering major European, Asian, and Middle Eastern languages. For legal professionals, it means the ability to examine multilingual text with unparalleled accuracy and efficiency.

Benefits

  • Finds evidence buried in terabytes of unstructured multilingual text
  • Discovers without language barriers
  • Produces all relevant multilingual documents
  • Supports over 55 languages, including Asian, European, and Middle Eastern

Three Steps to Multilingual E-Discovery

  • Step 1: Identify the language(s) and encoding in a document, and convert to Unicode
    Component: Rosette Language Identifier (RLI)
    RLI identifies the language(s) and character encodings present in a document so its textual content can be filtered and processed. Extracted text is converted to Unicode so that discovery and information retrieval applications can access a single data representation regardless of language. Using a module called Rosette Language Boundary Locator (RLBL), mixed-language documents may be segmented into regions so that language-specific processing may be performed on each region. RLI identifies 55 languages with high accuracy, even when presented with short strings of text.
  • Step 2: Apply linguistic intelligence to identify word forms, parts of speech, and sentence structure
    Component: Rosette Base Linguistics (RBL)
    RBL examines documents and performs a complete morphological analysis so text can be accurately filtered, analyzed, and searched. RBL identifies parts of speech, sentence boundaries, word breaks, tokens, and other linguistic components within a document, in European, Asian, and Middle Eastern languages. The technology and linguistic data in RBL result from over 10 years of development and use in web and enterprise search engines.
  • Step 3: Extract the items of interest (including those you didn’t know about)
    Component: Rosette Entity Extractor (REX)
    REX sifts through unstructured text and identifies people, places, dates, and other items that establish the true meaning of a document for further analysis. REX locates generic terms as well as custom entities such as specific names, phone numbers, and email addresses. Statistical modeling helps determine whether an entity is present in a document, rather than simply checking against a list of possibilities and risking overlooking a variation. The result is entity extraction technology that lets you find what you know, and also what you didn’t know.
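
To make the shape of the three steps concrete, here is a minimal sketch in plain Python. It does not use the Rosette API, and every name in it (`to_unicode`, `identify_language`, `split_sentences`, `extract_entities`, `process`, the stopword lists, and the candidate encodings) is a hypothetical stand-in chosen for illustration. Trial decoding, stopword counting, and regular expressions are deliberately simple substitutes for the statistical and morphological methods described above, so this shows only how data flows from raw bytes to extracted entities.

```python
import re

# Hypothetical toy data: a few stopwords per language, standing in for a
# real language-identification model.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in"},
    "fr": {"le", "la", "et", "les", "des", "est"},
    "de": {"der", "die", "und", "das", "ist", "nicht"},
}

# Assumed candidate encodings, tried in order.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def to_unicode(raw):
    """Step 1a: decode raw bytes to a Unicode string by trial decoding."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte value, so it always succeeds as a last resort.
    return raw.decode("latin-1"), "latin-1"


def tokenize(text):
    """Step 2a: split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def identify_language(text):
    """Step 1b: guess the language by counting known stopwords."""
    tokens = tokenize(text)
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def split_sentences(text):
    """Step 2b: naive sentence boundaries at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def extract_entities(text):
    """Step 3: pattern-based extraction of email addresses and phone numbers."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": PHONE_RE.findall(text),
    }


def process(raw):
    """Run the three steps in order on one document's raw bytes."""
    text, encoding = to_unicode(raw)
    return {
        "encoding": encoding,
        "language": identify_language(text),
        "sentences": split_sentences(text),
        "entities": extract_entities(text),
    }
```

Calling `process(document_bytes)` returns a dictionary with the detected encoding, the guessed language, the sentence list, and the extracted entities. Pattern matching of this kind only finds entities it has a pattern for, which is the limitation the statistical approach described in Step 3 is meant to address.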

System Specifications

Rosette is a portable and highly scalable software development kit (SDK) that runs on platforms ranging from laptop PCs to multi-CPU servers processing thousands of documents per second.

A fully documented API is provided and may be accessed from applications written in C, C++, Java, and other languages. A command-line interface is also available for testing purposes.

SDKs are available for Apple MacOS, HP‑UX, IBM AIX, Microsoft Windows, Sun Solaris, and multiple Linux distributions.

Learn More

For more information, see the solution brief on E-Discovery or read the whitepaper on Rosette for E-Discovery.