The old adage “You can’t see the forest for the trees” applies to the acres and acres of data that overrun government, legal, and e-discovery organizations. The gardener in this case might be Equivio, whose business is managing data redundancy. Equivio’s software mimics human intuition by organizing sets of documents and emails in meaningful ways: grouping near-duplicate documents, reconstructing email threads, clustering by subject, and supporting search, language detection, data mining, and more.
The first step in sorting documents of any type is determining their language. For this critical step, Equivio relies on the Rosette Language Identifier, the leader in its area for wide language coverage (55 and counting!) and for the performance needed to churn through terabytes of data. Unique to Rosette is its ability to identify multiple languages within a single document. For example, an email might be in French, but its disclaimer footer might be in English. A document might be written in one language but quote from another document in a different language. Whatever the document, Rosette delivers dependable results quickly.
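To make the multi-language idea concrete, here is a minimal Python sketch of labeling each part of a document separately rather than the document as a whole. It is not Rosette's method or API: the stopword lists, the blank-line segmentation, and the function names `guess_language` and `languages_by_segment` are all assumptions chosen for illustration, and a production identifier would rely on far richer statistical models.

```python
import re

# Tiny, hypothetical stopword lists for illustration only.
STOPWORDS = {
    "en": {"the", "and", "is", "this", "of", "to", "for", "it", "may", "be"},
    "fr": {"le", "la", "les", "et", "est", "vous", "de", "un", "une", "je", "nous"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None if none match."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {lang: sum(w in stop for w in words) for lang, stop in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def languages_by_segment(document):
    """Split the document on blank lines and label each segment independently."""
    segments = [s.strip() for s in re.split(r"\n\s*\n", document) if s.strip()]
    return [(guess_language(s), s) for s in segments]


# A French email body followed by an English disclaimer footer.
email = (
    "Bonjour, je vous envoie le rapport et les documents de la semaine.\n"
    "\n"
    "This message is confidential and may be intended for the named recipient."
)

for lang, segment in languages_by_segment(email):
    print(lang, "|", segment[:40])
```

Scoring each segment on its own is what lets the French body and the English footer receive different labels; a single score over the whole email would be forced to pick one language for both.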