Geospatial Big Data Analysis Opens Up New Opportunities for Homeland Security

By Arnon Shimoni

9.29.2015

Abundance of information

It is estimated that over 90% of the world's data was generated over the past two years (IDC, 2012).

As more and more connected sensors come online, new opportunities open up for optimizing everyday life, from improving network coverage to streamlining traffic flow in cities.


This incredible abundance of data is difficult to translate into actionable intelligence. What was easy in the past has suddenly become very difficult.

This is especially noticeable in the intelligence community, where most information is still synthesized manually by analysts.

The birth of SIGINT

Intercepting messages for intelligence purposes has existed for thousands of years. Messages carried by couriers and mail were regularly intercepted, and much effort went into deciphering their contents, which were often written in secret code.
By the middle of the 19th century, most countries had a simple and cheap method of conveying messages – the telegraph. Intercepting these messages was more difficult, because the lines had to be physically tapped.
By World War I, military communication had switched to wireless radio, letting navies communicate with ships in ways not possible before. This led to the birth of SIGINT – signals intelligence. Any person with a correctly tuned radio could eavesdrop on these transmissions, and even messages that could not be decrypted yielded intelligence through what was then called traffic analysis. Broadcasts were also vulnerable to direction finding: by triangulating a signal from multiple receiving stations, the transmitter could be geolocated, placing it at risk of discovery and attack.

 

A British Post Office RDF lorry from 1927, used for finding unlicensed amateur radio transmitters. It was also used to find regenerative receivers, which radiated interfering signals due to feedback – a big problem at the time.

“How Motor Patrol Wars with Bloopers” in Radio News magazine, Experimenter Publications, Inc., New York, Vol. 9, No. 1, July 1927, p. 37

 

In comes geolocation

Today, most communication methods are within reach of more than one receiver – cellular phones connect to the strongest of the towers within range, and computers can see several Wi-Fi hotspots at once. These features are useful for customers and network providers, as well as for law enforcement. A cell phone user can get a faster position fix, which aids navigation without GPS, while law enforcement can locate citizens in distress when they call. Geolocation is also used to track protected wildlife and in search and rescue missions, when hikers activate emergency beacons.

In the intelligence field, an Electronic Intelligence (ELINT) sensor, such as one mounted on an aircraft, can also track stand-alone devices like a walkie-talkie over a relatively large area. By fusing data from many ELINT sensors, a much larger area can be covered, with entities gathered from different sensors combined into a single picture.

A SIGINT analyst will typically inspect tens of thousands of entities every shift. As sensor fusion becomes more commonplace, more and more entities are displayed, and analysts are often bombarded with more data than they can handle. By filtering according to location and other parameters, the number of entities displayed can be narrowed from thousands down to dozens, giving intelligence analysts a clearer picture of the battlefield and resulting in better analysis for decision makers and commanders.



Abundance of data

As more and more sensors and transmitters appear in the wild, the amount of data grows at blazing speed. It is no longer feasible to display every data point on a map, as was acceptable in the past decade.

Rather than keeping only a selected subset, many intelligence organizations now store all of the data they collect. Today's intelligence analyst therefore needs the ability to filter the available data with an easily changed set of rules, to avoid information overload.


Many contemporary solutions use general-purpose RDBMSs like PostgreSQL or Microsoft SQL Server, thanks to their abundance of geospatial functions and plugins (such as PostGIS for PostgreSQL). However, as more and more data feeds into these systems, the clearer it becomes that they weren't designed to handle geospatial analytic workloads at this scale. When you need real-time insights for mission-critical queries, you simply can't afford to wait half an hour for an answer.
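To make the workload concrete, here is a minimal sketch of the kind of query involved, written for PostgreSQL with the PostGIS extension. The table and column names are hypothetical – only ST_GeomFromText and ST_Contains are real PostGIS functions:

    -- Hypothetical table of fused sensor detections: (emitter_id, geom, detected_at).
    -- Count recent detections inside an area of interest.
    SELECT d.emitter_id,
           COUNT(*) AS detections
    FROM   detections AS d
    WHERE  d.detected_at >= NOW() - INTERVAL '1 hour'
      AND  ST_Contains(
             ST_GeomFromText(
               'POLYGON((34.75 32.05, 34.85 32.05, 34.85 32.15, 34.75 32.15, 34.75 32.05))',
               4326),
             d.geom)
    GROUP BY d.emitter_id;

On a few million rows this runs comfortably; on the billions of rows a fused sensor network produces, even a well-indexed query like this can take far too long for real-time use.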

Today, new database engines designed specifically with geospatial processing in mind have started appearing. The most interesting are SQream DB by SQream Technologies and GPUdb by GIS Federal. While regular RDBMSs can return results for geospatial queries, they scale poorly. Because both SQream DB and GPUdb use the massively parallel power of the GPU to calculate complex geospatial functions, they can answer analysts' questions much faster.

However, unlike GPUdb, which was designed almost exclusively for in-memory processing of geospatial queries, SQream DB was designed for all types of structured and semi-structured data. In fact, the geospatial methods were built alongside a fully featured SQL engine – meaning you can plug SQream DB in alongside, or instead of, your current database instance and immediately handle complex queries involving locations. The result is an easy-to-use SQL interface that enables ad-hoc querying and filtering of data in all dimensions (a sketch of such a query follows the list below):

  • Spatial – Find entities inside/outside an arbitrary polygon
  • Temporal – Find entities in a specific time frame
  • Metadata attributes – Find entities that have or don’t have specific features

Spatial filtering by polygon
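As a minimal sketch of how these three dimensions might combine in a single ad-hoc query – the table and column names are hypothetical, and the spatial functions shown are generic SQL rather than any vendor-specific syntax:

    -- Hypothetical fused-entity table: (entity_id, position, last_seen, entity_type, frequency_mhz).
    SELECT entity_id, position, last_seen
    FROM   entities
    WHERE  ST_Within(position,                          -- spatial: inside an area of interest
             ST_GeomFromText('POLYGON((34.70 32.00, 34.90 32.00, 34.90 32.20, 34.70 32.20, 34.70 32.00))'))
      AND  last_seen BETWEEN '2015-09-28 06:00:00'      -- temporal: a specific time frame
                         AND '2015-09-28 14:00:00'
      AND  entity_type = 'emitter'                      -- metadata: required attributes
      AND  frequency_mhz BETWEEN 30 AND 88;

Changing the polygon, the time window, or the attribute filters is just a matter of editing the query, which is what makes the rule set easy to change as the situation develops.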

By enabling rapid, multi-dimensional filtering, the analyst can narrow down the number of objects on screen, focus on the necessary information, simplify decision-making, and ultimately produce higher quality intelligence analysis.

Additionally, SQream DB can easily handle hundreds of thousands of entities entering the database every second, from multiple sensors and data sources. With a small footprint, SQream DB can be run on a standard blade server or even on a laptop.