Basis Technology Blog
The Importance of Japanese Readings in Search and More
Japanese is unusual in that a word and its pronunciation are both valid keyword searches. Imagine if you could search in English on “Seezer Salad Resipi” and get recipes for “Caesar salad.” In Japanese you can, because it is written with Chinese ideographs, called kanji, and two phonetic alphabets. The two alphabets (hiragana and katakana) […]
What’s New in Highlight 7.2
This latest version of Highlight has significant enhancements for government linguists and translators who use this Microsoft Office plug-in to translate and standardize names between English and non-Latin languages: Arabic, Dari, Farsi, Korean, Mandarin Chinese, Pashto, and Russian.
Python Autopsy Module Tutorial #1: The File Ingest Module
There is still plenty of time to work on an Autopsy module that will get you cash prizes (and bragging rights) from Basis Technology at OSDFCon 2015. The easiest way for most people to write a module is to use Python, and this will be a gentle intro to doing so. The developer docs contain far more details. Why Bother? If cash […]
An Elegant and Efficient Way to Fuzzy Search Names in Elasticsearch
Elasticsearch developers who want to fuzzy search names across multiple fields and cover the spectrum of name variations (sometimes two or more in a single name) know how much of a bear it can be. Until now, the solution has not been completely satisfactory, comprehensive, or clean, but that’s all about to change.
Elasticsearch and Fuzzy Name Matching Meetup, World Tour
Normalization is crucial to high-quality search results: who wants irrelevant variations between queries and documents leading to missed hits (e.g., “celebrity” vs. “celebrities”)? Normalizing dictionary words works, but what if your application focuses on names? Whether you’re tackling log analysis, e-commerce, watch list screening or other applications, names are often the key. Can […]
Language Learning Gets a Boost From Lingua.ly
New browser extension accelerates language learning on the Web. Lingua.ly, the latest innovative business to take advantage of the Basis Technology Startup Program, was making a splash in the Chrome Web Store last week, where the editorial team loved it so much they featured it on their central banner. The extension incorporates language learning into the context […]
Fuzzy Name Search and Name Matching Presentations in San Francisco
Names connect data points and are frequently the most important piece of information in a document. But unlike common nouns and verbs, they defy standardization, making them an elusive search target. But you can, in just two days, go from “neophyte” to “well-informed” in the realm of fuzzy name searching and matching. Basis Technology’s VP […]
Multilingual Search With Solr? No Problem!
Whitepaper – Optimizing Multilingual Search With Solr: Recall, Precision and Accuracy INTRODUCTION Today’s search application users expect search engines to just work seamlessly across multiple languages. They should be able to issue queries in any language against a document corpus that spans multiple languages, and get quality results back. This whitepaper offers the search application engineer […]
What is in your CSIRT First Responder’s Jump Kit?
Like other services, effective Computer Security Incident Response Teams (CSIRTs) are tiered. The First Responder on a CSIRT is much like the EMT who assesses the situation and either deals with it themselves or brings the case to more specialized teams. In this blog post, we cover the role of the First Responder on a […]
Could Better Name Matching Have Prevented the Boston Marathon Bombings?
Whitepaper – Making the Most of Intelligence: The Importance of Name Matching in Identity Resolution in Government In the Federal Government, making the right connections among field intelligence, open source media, and watchlist systems often comes down to matching names. The consequences of a missed match or a false match range from minor embarrassment to […]
Predictive Analytics Case Study
EMBERS Successfully Forecasts Future Events. How Virginia Tech’s EMBERS project “beat the news” by predicting civil unrest in Latin America is a fascinating case study in the power of Big Data, advanced text analytics, and human/computer collaboration. Case Study Summary: Since November 2012, the EMBERS project has been accurately forecasting civil unrest events in Latin […]
Accurate Language Detection for Queries & Tweets
Doubles the Accuracy of Existing Language Identification Software. Basis Technology’s Rosette Language Identifier (RLI) has been improved to solve the problem of language detection for short texts. Existing language detectors require many words to confidently identify the language of a string of text, and are therefore unreliable when trying to detect the language of queries, […]
Can you rely on the Treasury Department’s Sanctions List Search?
When the United States wants to prohibit its citizens and corporations from doing business with a foreign national, that individual is added to the Specially Designated Nationals list maintained by the Office of Foreign Assets Control of the US Department of the Treasury. One person on that list is Chabaane Ben Mohamed al-Trabelsi*, a Tunisian […]
Adapt Rosette Entity Extractor to Your Content for Increased Accuracy
Entity extraction is becoming a mission-critical tool for finding mentions of people, places, organizations, and products in massive quantities of text. In patent searches, law enforcement, voice-of-the-customer analysis, ad targeting, content recommendation, eDiscovery, and anti-fraud, entity extraction enables swift analysis of gigabytes of data. Among named entity recognition systems, those such as Rosette Entity Extractor […]
God is not Divine? The Pope is not Catholic? or how translation introduces loss and can distort meaning
Pope Francis has grabbed many headlines by discussing what appear to be bold changes to Catholic doctrine. However, one recent headline would make it sound like he wants to rethink the concept of religion and God entirely. The alleged quote goes something like “God is not a divine being or a magician” (this phrasing is […]