Basis Technology Blog
Fuzzy Name Search and Name Matching Presentations in San Francisco
Names connect data points and are frequently the most important piece of information in a document. But unlike common nouns and verbs, they defy standardization, making them an elusive search target. But you can, in just two days, go from “neophyte” to “well-informed” in the realm of fuzzy name searching and matching. Basis Technology’s VP […]Read more →
Multilingual Search With Solr? No Problem!
Whitepaper – Optimizing Multilingual Search With Solr: Recall, Precision and Accuracy. Introduction: Today’s search application users expect search engines to just work seamlessly across multiple languages. They should be able to issue queries in any language against a document corpus that spans multiple languages, and get quality results back. This whitepaper offers the search application engineer […]Read more →
What is in your CSIRT First Responder’s Jump Kit?
Like other services, effective Computer Security Incident Response Teams (CSIRTs) are tiered. The First Responder on a CSIRT is much like the EMT who assesses the situation and either deals with it themselves or brings the case to more specialized teams. In this blog post, we cover the role of the First Responder on a […]Read more →
Could Better Name Matching Have Prevented the Boston Marathon Bombings?
Whitepaper – Making the Most of Intelligence: The Importance of Name Matching in Identity Resolution in Government. In the Federal Government, making the right connections among field intelligence, open source media, and watchlist systems often comes down to matching names. The consequences of a missed match or a false match range from minor embarrassment to […]Read more →
Predictive Analytics Case Study
EMBERS Successfully Forecasts Future Events. How Virginia Tech’s EMBERS project “beat the news” by predicting civil unrest in Latin America is a fascinating case study in the power of Big Data, advanced text analytics, and human/computer collaboration. Case Study Summary: Since November 2012, the EMBERS project has been accurately forecasting civil unrest events in Latin […]Read more →
Accurate Language Detection for Queries & Tweets
Doubles the Accuracy of Existing Language Identification Software. Basis Technology’s Rosette Language Identifier (RLI) has been improved to solve the problem of language detection for short texts. Existing language detectors require many words to confidently identify the language of a string of text, and are therefore unreliable when trying to detect the language of queries, […]Read more →
Can you rely on the Treasury Department’s Sanctions List Search?
When the United States wants to prohibit its citizens and corporations from doing business with a foreign national, that individual is added to the Specially Designated Nationals list maintained by the Office of Foreign Assets Control of the US Department of the Treasury. One person on that list is Chabaane Ben Mohamed al-Trabelsi, a Tunisian […]Read more →
Adapt Rosette Entity Extractor to Your Content for Increased Accuracy
Entity extraction is becoming a mission-critical tool for finding mentions of people, places, organizations, and products in massive quantities of text. In patent searches, law enforcement, voice-of-the-customer analysis, ad targeting, content recommendation, eDiscovery, and anti-fraud, entity extraction enables swift analysis of gigabytes of data. Among named entity recognition systems, those such as Rosette Entity Extractor […]Read more →
God is not Divine? The Pope is not Catholic? or how translation introduces loss and can distort meaning
Pope Francis has grabbed many headlines by discussing what appear to be bold changes to Catholic doctrine. However, one recent headline would make it sound like he wants to rethink the concept of religion and God entirely. The alleged quote goes something like “God is not a divine being or a magician” (this phrasing is […]Read more →
A Better Pure Java RegEx Engine
Regular expressions are ubiquitous in NLP, not to mention many miscellaneous text-processing tasks. People use regular expressions as a quick solution to matching and parsing. People build entire complex extraction systems with regular expressions. Some people get themselves into serious trouble by trying to apply them beyond their natural limits. Once upon a time, regular […]Read more →
Interview: The Future of Human Language Technology
Our VP of Engineering, David Murgatroyd, was recently invited to provide input to a US Government effort to chart the future of Human Language Technology (HLT). Input was collected as a question and answer, with questions bolded. He was asked to respond with respect to one task area: Triage, Translation Support or Knowledge Discovery. He […]Read more →
Keeping pace with the ever-changing name of ISIS through the lens of Wikipedia
If you’ve followed recent events in Syria and Iraq, then you’ve surely heard of an organization that at various times is referred to as ISIL (Islamic State of Iraq and the Levant), ISIS (Islamic State of Iraq and Syria), or just IS (Islamic State). While the New York Times recently decided to use “ISIS” in […]Read more →
Optimizing Multilingual Search Using Solr
David Anthony Troiano, Principal Software Engineer. This talk was delivered at the Boston Data Con 2014 on September 13th, 2014. David Troiano explains how to optimize Apache Solr for multilingual search. Using the example of “Serie A” (the top Italian football league), David shows how a major search engine finds documents in both Italian and English. To […]Read more →
Making sense of unstructured data by turning strings into things
Gregor Stewart, Product Manager for Text Analytics. This talk was delivered at the Big Data Analytics, Discovery & Visualization Meetup in Cambridge, MA on August 21st, 2014. Gregor Stewart gives an overview of Basis Technology’s text analytics offerings and how various components can be used to make sense of big text data. In the course of the […]Read more →
Cyber Triage: Act Faster!
If you are responsible for protecting digital information, then you will need to respond to a security incident at some point. However, many challenges arise during a response: unfocused tools that can be complicated to use; high false positive rates from automated tools; and changing attack tactics that grow more advanced every day. Whether you have […]Read more →