Understanding Unstructured Data

 

Our digital lives

We live in an age of information. Our daily lives are digitized, reported and recorded more than ever before. Stop for a minute and think about all the ways in which you have digitally recorded your life already today.

You might have come up with some of these:

  • Text messages
  • Emails
  • Documents
  • Appointments you have added to your calendar
  • Voice commands you have given to Alexa or Siri
  • Searches you have typed into your search engine
  • Comments, likes and posts that you have created on social media or the wider internet.

 

Now, stop and think about all of the information that has been recorded about you, perhaps without you even realizing it.

You might have thought of:

  • CCTV: Closed Circuit Television Cameras
  • ANPR: Automatic Number Plate Recognition or Licence Plate tracking systems
  • Call information from your telephone
  • Bank transactions: The direct debits that have left your account, your internet purchases and the coffee you bought on the way to work
  • Access cards: The record created when you used your pass to access public transport, the gym or your office building
  • Internet browsing history.

 

These lists are just a few examples and you have probably thought of many more.

 
Making our world safer

All this information can be used to make our world a safer and better place to live. Criminal Intelligence Analysts working in Police forces and other Law Enforcement Agencies can use this information to help solve crimes and protect us all. They use investigative software to help them consolidate and understand information drawn from a wide range of sources. In just the same way as you or I might use Google Maps to visualize our route to a new destination, Intelligence Analysts use software to visualize the people, phone numbers, events or places involved in a criminal investigation and draw out the links and connections between them.

It can be reasonably straightforward to ‘map’ or visualize structured data. Police reports, databases, telephone records, bank transactions and custody records all have common pieces of information which can be extracted, organized, visualized and analyzed automatically. This allows Analysts to identify critical data, uncover patterns and solve crime.
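To make this concrete, here is a minimal Python sketch of why structured records are comparatively easy to map. The call records and the column names (`caller`, `callee`, `duration_s`) are invented for illustration and do not come from any real system. Because every row carries the same fields, counting who called whom takes only a few lines:

```python
import csv
import io
from collections import Counter

# Hypothetical call records; the column names are assumptions for this sketch.
CALL_RECORDS = """caller,callee,duration_s
555-0101,555-0102,62
555-0101,555-0103,15
555-0102,555-0101,240
555-0101,555-0102,33
"""

def count_links(csv_text):
    """Count calls between each unordered pair of phone numbers."""
    links = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Sort the pair so A->B and B->A count as the same link.
        pair = tuple(sorted((row["caller"], row["callee"])))
        links[pair] += 1
    return links

links = count_links(CALL_RECORDS)
for (a, b), n in links.most_common():
    print(f"{a} <-> {b}: {n} call(s)")
```

The resulting pairs and counts are exactly what a network chart needs: each pair is a link and each count is its weight.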

But what about the data held in unstructured sources? The free-form text we use to communicate our thoughts in texts, emails or documents is a rich source of information, but much harder for Analysts to process.

Imagine you were an Analyst investigating a criminal case. You might want to access unstructured information held in the data sources that you considered at the start of the article. You might also want to process unstructured data from sources linked to the investigation, such as:

  • Reports: The narratives that police officers write in their reports
  • Witness statements
  • Interview transcripts
  • Documents, presentations or spreadsheets taken from suspects’ hard drives
  • Calendar notes

So how do Intelligence Analysts go about gathering and mapping out this information? There is far too much of it for one person to read through and manually process. Instead, they need to have a fast, accurate and seamless way of capturing, mapping and interpreting unstructured data.

This is where Sintelix comes in. We understand the challenges that Analysts face, and we are passionate about making the world a safer and better place to live. We have combined this passion with advanced technical capabilities to develop software that enables Intelligence Analysts to process unstructured text. Our software can load information from over 1500 different types of file, harvest data from multiple web data sources and integrate with email servers and SharePoint to extract the information they contain. We understand that during active investigations time is of the essence, so Sintelix does all of this at high speed, processing 30 pages of text per core per second*.
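As a rough illustration of what that rate means in practice, the short Python calculation below estimates how long a large corpus would take. It uses the quoted rate and the six cores mentioned in the test footnote. The corpus size is hypothetical, and the estimate assumes throughput scales linearly with the number of cores, which real workloads may not achieve:

```python
PAGES_PER_CORE_PER_SECOND = 30   # rate quoted in the article
CORES = 6                        # cores enabled in the article's test footnote

def processing_seconds(pages, cores=CORES, rate=PAGES_PER_CORE_PER_SECOND):
    """Estimated wall-clock seconds, assuming linear scaling across cores."""
    return pages / (cores * rate)

corpus_pages = 1_000_000  # hypothetical corpus size, chosen for illustration
seconds = processing_seconds(corpus_pages)
print(f"{corpus_pages:,} pages: about {seconds / 60:.0f} minutes")
```

Under these assumptions, a million pages would take roughly an hour and a half on that machine.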

Once data has been collected, Sintelix provides Analysts with a suite of tools to deliver actionable intelligence in a variety of formats. Sintelix will accurately identify entities (telephone numbers, individuals, locations, organizations) and the links between them, extracting meaningful information from unstructured text. It will recognize multiple references to the same entity within the same data source and across different data sources, eliminating duplication and making the picture and connections much clearer. The extracted entities and links can then be organized with visual tools including network charts, timelines or tables ready for Analysts to interpret. Not only that, but Sintelix integrates seamlessly with systems like IBM i2, a commonly used investigative analysis tool.
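The general idea of extracting entities, merging different references to the same entity and linking entities that appear together can be sketched in a few lines of Python. This is a toy illustration only and not how Sintelix works internally: it relies on a hand-written alias table and one simple regular expression, and the reports, names and numbers are all invented.

```python
import re
from collections import defaultdict
from itertools import combinations

# Invented sample reports; the names and numbers are fictional.
REPORTS = {
    "report_1": "John Smith called 555-0101 from Herndon on Monday.",
    "report_2": "Witness saw J. Smith in Herndon; he gave the number (555) 0101.",
}

# Hypothetical alias table mapping surface forms to one canonical entity.
ALIASES = {
    "John Smith": "person:John Smith",
    "J. Smith": "person:John Smith",
    "Herndon": "place:Herndon",
}

# Matches short numbers written as "555-0101" or "(555) 0101".
PHONE = re.compile(r"\(?\d{3}\)?[-\s]\d{4}")

def extract_entities(text):
    """Return the set of canonical entities mentioned in one text."""
    found = {canonical for alias, canonical in ALIASES.items() if alias in text}
    for match in PHONE.findall(text):
        digits = re.sub(r"\D", "", match)  # strip punctuation so variants merge
        found.add(f"phone:{digits}")
    return found

def build_links(reports):
    """Link every pair of entities that appear in the same report."""
    links = defaultdict(set)
    for name, text in reports.items():
        for a, b in combinations(sorted(extract_entities(text)), 2):
            links[(a, b)].add(name)
    return links

links = build_links(REPORTS)
for (a, b), sources in sorted(links.items()):
    print(f"{a} -- {b} (seen in {len(sources)} reports)")
```

Both reports resolve to the same three entities even though the person and the phone number are written differently in each, so every link is supported by two sources rather than appearing as a duplicate.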

The sheer volume of information that exists in our modern world, and the speed at which new information is being recorded, can mean that Analysts are deluged by data and may struggle to interpret it if they don’t have the right tools for the job. If you or your organization needs to make sense of unstructured data, talk to Sintelix. We can help.

*Tested with 100,000 newswire publications on an AMD Ryzen 7 1700 with 6 cores enabled (hyperthreading disabled) and no network creation.

 


 



USA
Phone:
703-481-9831
Address:
Sintelix Incorporated, LLC 2201 Cooperative Way, Suite 600, Herndon, VA 20171
Australia
Phone:
+61 (8) 7221 3200
Office:
Unit 13, 202-208 Glen Osmond Road, Fullarton SA 5063
Post:
PO Box 114, Fullarton SA 5063, Australia
© Sintelix Pty Ltd 2018