Open source intelligence provides a rich treasure-trove of freely available data. It’s possible to tap into it with nothing more than a web browser and a search engine.
A data point such as an email address or telephone number is all that’s needed to start following a trail – and there are many possible directions to take. For example, you can Google an email address, run its domain through a free WHOIS checker, search for it in data breach databases, or look it up on multiple social networks.
The trouble is, it’s a slow process.
For example, say you’ve been harvesting email addresses and want to drill down to the best marketing targets, or that you wish to batch check the legitimacy of potential customers. OSINT research is a great way of doing these things, but manual searches can take forever. It just doesn't scale.
Data enrichment can provide the solution, and that’s what we look at in this guide. Read on to find out how data enrichment can help you to perform your OSINT research in a faster and more automated way.
Data enrichment is the process of gathering a wealth of information from a single data point and combining it into a report or profile. The starting point can be the phone number provided by a website user (as SEON explains in a breakdown of reverse phone lookups), their IP address, or any other detail that is easy to access or ask for.
For example, from an email address, you can – among other things – find out:
From a phone number, you can ascertain things like:
As you can imagine, there are many possible ways to make use of this information. On the legitimate side, it’s often used for fraud prevention and customer verification purposes. As Trifacta says, there are several benefits to data enrichment, including how it “offers opportunities for cross-sells and upsells because a business has the right data and knows its customers well.”
It would be remiss, however, not to also mention that OSINT data can be used for less honorable purposes – by everyone from marketers to cybercriminals. Fittingly, Cisco describes it as “a boon and an Achilles’ heel.”
There are three key ways that data enrichment can help with OSINT research: speed, automation, and bulk processing.
Much of the information listed above is freely available. It is, after all, open. But a business looking to onboard a new customer cannot spend hours searching through dozens of social networks and online databases. Data enrichment tools do it in seconds: Simply feed in a data point (such as email address, IP address or phone number) and receive a full breakdown of all the available information.
In some cases, manual lookups are sufficient. A manual search tool or a browser extension considerably speeds up small-scale OSINT research tasks. However, higher-volume tasks benefit from a level of automation.
This can include batch processing tools, APIs that integrate with existing systems, and tools that include risk score functionality. The latter can add or subtract points based on factors that support or question the legitimacy of a phone number, email address, or other data point. Instead of manually reviewing all of the provided OSINT data, users can simply sort by a risk score, calculated using criteria of their choice.
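To make the add-or-subtract idea concrete, here is a minimal sketch of a rule-based risk score in Python. It is not how any particular product calculates its score: the record fields (`disposable_email`, `social_profiles`, `breach_count`, `phone_type`), the rules and the point values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    points: int  # positive raises the risk score, negative lowers it
    applies: Callable[[dict], bool]


# Hypothetical rules and weights; real criteria depend on your own data.
RULES = [
    Rule("disposable email domain", 10, lambda r: r.get("disposable_email", False)),
    Rule("no social profiles found", 5, lambda r: r.get("social_profiles", 0) == 0),
    Rule("virtual or temporary phone number", 8, lambda r: r.get("phone_type") == "virtual"),
    # Assumption: an address that appears in old breaches has probably existed
    # for a while, which is treated here as a sign of legitimacy.
    Rule("address seen in a known breach", -3, lambda r: r.get("breach_count", 0) > 0),
]


def risk_score(record: dict, rules: list = RULES) -> int:
    """Sum the points of every rule that applies to the record."""
    return sum(rule.points for rule in rules if rule.applies(record))


# Hypothetical enrichment results for two email addresses.
records = [
    {"email": "old.customer@example.com", "disposable_email": False,
     "social_profiles": 3, "breach_count": 2, "phone_type": "mobile"},
    {"email": "x91k@example.net", "disposable_email": True,
     "social_profiles": 0, "breach_count": 0, "phone_type": "virtual"},
]

# Sort so that the riskiest records come first.
ranked = sorted(records, key=risk_score, reverse=True)
for record in ranked:
    print(record["email"], risk_score(record))
```

Keeping each rule as a named entry with its own weight is what makes this kind of score easy to audit: anyone reviewing a result can see which rules fired and how many points each one contributed.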
Let’s consider a couple of examples of how data enrichment is used in practice:
An eCommerce store can integrate a fraud prevention tool with its online ordering systems. As soon as a customer provides an email address and/or phone number, the system can perform an automated check – via an API – that generates a risk score.
A low risk score could allow an order to go straight through with no friction. Meanwhile, a high risk score, caused by factors like a suspicious email address or a temporary phone number, will trigger a manual check or cause the order to be rejected automatically.
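The routing described here could look something like the following Python sketch. The two thresholds and the three outcomes are assumptions for illustration; a real store would tune them against its own order history.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REVIEW = "manual review"
    REJECT = "reject"


# Hypothetical thresholds; not taken from any specific tool.
REVIEW_THRESHOLD = 10
REJECT_THRESHOLD = 20


def decide(score: int) -> Decision:
    """Map a risk score to what happens to the order."""
    if score >= REJECT_THRESHOLD:
        return Decision.REJECT
    if score >= REVIEW_THRESHOLD:
        return Decision.REVIEW
    return Decision.APPROVE


for score in (2, 14, 31):
    print(score, decide(score).value)
```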
This goes beyond fraud prevention too, as a similar check can tell the merchant which customers are more likely to spend more or be more receptive to cross-selling.
Marketers can also make use of data enrichment. A firm in possession of a large email list may wish to filter out people based in certain countries or remove suspicious and fake addresses before commencing a campaign.
OSINT research is invaluable here but needs to be done in a fast and automated way. For this latter use case, a CSV file that’s easy to sort and manipulate is ideal.
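As a rough sketch of that kind of list cleaning, the Python below filters an enriched CSV using only the standard library. The column names (`email`, `country`, `disposable`) and the excluded country code are assumptions; an export from a real tool will have its own layout.

```python
import csv
import io

# Hypothetical enriched export. In practice this would be read from a file.
SAMPLE_CSV = """email,country,disposable
alice@example.com,HU,false
bob@example.net,ZZ,false
carol@example.org,HU,true
"""

# Placeholder country code to exclude from the campaign.
EXCLUDED_COUNTRIES = {"ZZ"}


def filter_rows(rows, excluded_countries):
    """Drop rows from excluded countries or flagged as disposable addresses."""
    kept = []
    for row in rows:
        if row["country"].strip().upper() in excluded_countries:
            continue
        if row["disposable"].strip().lower() == "true":
            continue
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean = filter_rows(rows, EXCLUDED_COUNTRIES)

# Write the cleaned list back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["email", "country", "disposable"])
writer.writeheader()
writer.writerows(clean)
cleaned_csv = out.getvalue()
print(cleaned_csv)
```

To work on a real file, swap the `io.StringIO` objects for files opened with `newline=""`, as the `csv` module documentation recommends.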
Here are a few example tools that can assist with OSINT research. While each tends to have a primary purpose, it’s often possible to adapt them to your specific use case.
As part of its end-to-end fraud prevention solution, SEON offers everything from simple (free) online phone number and email lookup tools to a fully-fledged API. The social media lookup alone queries over 50 social networks – something that would be hugely laborious to do manually.
SEON incorporates risk scores that are highly customizable and use a whitebox approach. This means that you can have full visibility of how the risk score is calculated. It also uses machine learning to spot new fraud patterns.
Clearbit is a marketing-focused tool, intended for building prospect lists and finely targeting advertising and email campaigns. It’s particularly strong on company and contact data.
Drawing from over 250 data sources, Clearbit is known for its open-source intelligence on companies rather than individuals. It looks for everything from HQ addresses to estimated annual company revenue.
While Clearbit has extensive functionality, it all comes at a price. Other than a couple of basic free tools, everything is available to paying subscribers only. This is very much an enterprise-grade tool.
BeenVerified is a US-only tool. It’s focused on helping people do due diligence on everything from people to properties and vehicles, primarily using OSINT data. You can also look up criminal records with BeenVerified.
In order to comply with legislation, BeenVerified has to be very specific about how it can be used. It’s intended more as a consumer tool than for business use. For example, it’s not supposed to be used for employment screening or credit checks. That said, there are APIs available.
Here’s an example of how to use a data enrichment tool to speed up your own OSINT research. It uses the free trial of SEON’s product.
Before beginning, you need a list of email addresses, phone numbers, or IP addresses you wish to research.
Using the tool in this way allows you to manipulate and process the results however you wish – sorting based on different criteria, as needed.
SEON’s free trial also includes access to the REST APIs. This is restricted to 120 searches, which is sufficient for a small research project or to get used to how the system works. The API works with Python, Java, cURL and PHP, and is well documented for developers.
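With a capped number of searches, it helps to avoid spending quota on blanks and duplicates. The Python sketch below shows one way to do that. It deliberately does not call SEON's API: `lookup` is a hypothetical callable that you would replace with a real request written against the official documentation, and `fake_lookup` is only a stand-in so that the example runs.

```python
from typing import Callable, Iterable

TRIAL_QUOTA = 120  # search limit mentioned above


def batch_lookup(values: Iterable[str],
                 lookup: Callable[[str], dict],
                 quota: int = TRIAL_QUOTA):
    """Look up each unique value once, stopping when the quota is used up.

    Returns (results, skipped): results maps each value to its lookup
    result, and skipped lists unique values left over once quota ran out.
    """
    results = {}
    skipped = []
    seen = set()
    for value in values:
        key = value.strip().lower()
        if not key or key in seen:
            continue  # blanks and duplicates should not use up quota
        seen.add(key)
        if len(results) >= quota:
            skipped.append(key)
        else:
            results[key] = lookup(key)
    return results, skipped


def fake_lookup(email: str) -> dict:
    """Stand-in for a real enrichment call; returns only the domain."""
    return {"email": email, "domain": email.rpartition("@")[2]}


emails = ["A@example.com", "a@example.com ", "", "b@example.org", "c@example.net"]
results, skipped = batch_lookup(emails, fake_lookup, quota=2)
print(results)
print(skipped)
```

Passing the lookup function in as a parameter keeps the batching logic separate from any one provider, so the same loop can be reused when you move from a trial to a different tool or plan.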
As you will quickly realize, data enrichment is extremely useful in itself, but the true power lies in the ability to do batch lookups. API integration can then greatly increase its potential by building those lookups into your existing systems.
Obviously, the tools listed here are far from an exhaustive selection. OSINT tools run from consumer-focused online lookups to enterprise-grade systems with a price to match. There are also CLI-based tools like MOSINT that perform similar functions.
Ultimately, data enrichment is about compiling all of that freely available OSINT data without the need for lots of manual intervention. An hour spent automating the process will achieve a whole lot more in the long run than an hour flicking from search box to search box.
ABOUT THE AUTHOR
Gergo Varga has been fighting online fraud since 2009 at various companies – even co-founding his own anti-fraud startup. He's the author of the Fraud Prevention Guide for Dummies – SEON Special edition. He currently works as the Senior Content Manager / Evangelist at SEON, using his industry knowledge to keep marketing sharp, communicating between the different departments to understand what's happening on the frontlines of fraud detection. He lives in Budapest, Hungary, and is an avid reader of philosophy and history.