Using OSINT and SOCMINT to protect yourself from hackers


Open-Source Intelligence implies powerful knowledge.

What is OSINT?

OSINT stands for Open-Source Intelligence. It is the discipline of collecting and analyzing publicly available information found on websites, blogs, social media, newspapers, and so on. OSINT is widely used in investigations, for example in journalism, law enforcement, and cybersecurity.

You can perform OSINT with dedicated tools, but a few tricks in a web browser are often enough. For example, you can find all the PDF files on a website with a technique called Google Dorking: adding operators to a Google search that specify exactly what you want to find.

For example, a search that combines the site: operator (followed by the domain of the website) with filetype:pdf returns only the PDF files hosted on that website.
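The query described above can be assembled programmatically. The following is a minimal sketch: the helper names build_dork and search_url are hypothetical, and example.com is a placeholder domain, not the site used in the original search.

```python
from urllib.parse import urlencode


def build_dork(site: str, filetype: str, terms: str = "") -> str:
    """Combine the site: and filetype: operators into one query string."""
    parts = [f"site:{site}", f"filetype:{filetype}"]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL with the query URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# example.com is a placeholder; replace it with the domain you are auditing.
query = build_dork("example.com", "pdf")
print(query)
print(search_url(query))
```

Pasting the printed query into the search box gives the same result as opening the URL in a browser.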

Many other techniques can yield really useful information about anything or anyone. You can also use OSINT to find out how you appear on the internet, as many companies do to keep track of their public image.


Social Media Intelligence, known as SOCMINT, is a subdomain of OSINT. It consists of gathering information about a subject from what is publicly available on social media, such as pictures, location data, or simply the content of posts.
The term was introduced in 2012 by David Omand, in the article Introducing Social Media Intelligence (SOCMINT). It was coined right after an event during which social networks were flooded with people’s reactions against the police, who could not handle such a large volume of messages at the time. A special unit was then created to be prepared for this type of reaction and to investigate potential criminals.

Nowadays, people expose their whole lives on social media: pictures of themselves, what they wear, their hobbies, where they live, their job, and much more. Almost anything about almost anyone can be learned just by checking their social media accounts.

SOCMINT is often described in terms of two types of data: the original content published by the user, such as pictures or text; and the metadata, which is information associated with the post, such as location or date and time.
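To make the distinction between content and metadata concrete, here is a small illustrative data model. The class and field names are assumptions chosen for this sketch and do not correspond to any social network's API; the sample post is invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PostMetadata:
    """Information attached to the post rather than written by the user."""
    posted_at: datetime
    location: Optional[str] = None


@dataclass
class Post:
    """A public post: user-published content plus its metadata."""
    author: str
    text: str
    media_urls: list[str] = field(default_factory=list)
    metadata: Optional[PostMetadata] = None


def describe(post: Post) -> str:
    """List separately what the content and the metadata reveal."""
    lines = [f"content from {post.author}: {post.text!r}"]
    if post.metadata is not None:
        lines.append(f"metadata: posted at {post.metadata.posted_at.isoformat()}")
        if post.metadata.location:
            lines.append(f"metadata: location {post.metadata.location}")
    return "\n".join(lines)


# Invented sample post, for illustration only.
post = Post(
    author="hypothetical_user",
    text="Great coffee this morning!",
    metadata=PostMetadata(
        posted_at=datetime(2021, 5, 1, 8, 30, tzinfo=timezone.utc),
        location="Paris",
    ),
)
print(describe(post))
```

Even when the text itself is harmless, the metadata alone says when and where the author was.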

By combining information from different sources, we can gain powerful knowledge about a company, a group, or a single person. “Through SOCMINT, the intelligence community can determine some behavioural patterns that can apply for certain groups or certain individuals” according to the article Social Media Intelligence: Opportunities And Limitations by Adrian Liviu IVAN, Claudia Anamaria IOV, Raluca Codruta LUTAI and Marius Nicolae GRAD. Thus, if we manage to gather such information about a hacker or a hacking group, we can get clues about how they attack and which flaws and vulnerabilities they exploit, in order to improve the response to an attack:

  • Detect the attack faster
  • React more promptly
  • Improve the defence
  • Reduce the damage

This knowledge also supports prevention, by helping to block attacks before they can be set up.

Here is a quick example of the power of open-source information.

In May 2021, a cybersecurity researcher posted a tweet about a data breach at Domino’s India (see the tweet). The tweet showed a screenshot of a dark web site, presented as a search engine, on which anyone could look up the leaked information. The site received a huge number of visits following the tweet, and anybody could search for data about employees and customers of Domino’s India. Social media helped the hackers gain visibility, but it also helped security researchers improve their monitoring. This shows how large the impact of social media can be, here through a single tweet.

Retrieving information from Twitter

To react quickly to a cyberattack, we need to stay well informed. Many Twitter accounts exist to alert people about cybersecurity issues, such as the CERT accounts, like @CERT_FR. Hackers can also be boastful and show off their exploits on the Internet.
Twitter is currently growing quickly, and a significant percentage of people worldwide use it to get news. It is also the #1 social media platform in Japan, and 90% of Americans are familiar with it.

Twitter is the social network where people share their opinions and express their discontent or happiness about everything that happens. It is also the right place to find cybersecurity news and hackers bragging.
That is why HTTPCS now monitors Twitter, among other social media, to help avoid major incidents.
We analyze tweets to get information faster and to react promptly whenever our product detects a suspected attack.

How do we analyze and qualify those tweets?

Our analysis is based on keywords that are present or absent in the tweet. Those keywords are weighted: each one is associated with a value that indicates how important it is in the text. To compute these weights, we use the Term Frequency – Inverse Document Frequency (TF-IDF) method, which calculates the importance of a word within a group of texts. This statistic tells us whether a word appears frequently in a given text, while discounting words that appear in most texts. Therefore, common words such as “the”, “a” or “and” have a low weight. According to Wikipedia, “a survey conducted in 2015 showed that 83% of text-based recommender systems in digital libraries use tf–idf”. The analysis is therefore reliable thanks to the method used, but also thanks to the number of tweets we have processed to increase the accuracy of the weights.
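As an illustration of the weighting idea, here is a minimal sketch of a common textbook variant of TF-IDF using only the Python standard library. It is not the HTTPCS implementation: the sample tweets, the watched keyword list, and the function names are all invented for this example.

```python
import math
from collections import Counter

PUNCT = ".,!?:;\"'()"


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]


def tf_idf(corpus: list[str]) -> list[dict[str, float]]:
    """Return one {word: weight} dict per document.

    tf  = occurrences of the word in the document / words in the document
    idf = log(number of documents / number of documents containing the word)
    """
    docs = [tokenize(text) for text in corpus]
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


def score(weights: dict[str, float], keywords: set[str]) -> float:
    """Sum the weights of the watched keywords present in one document."""
    return sum(w for word, w in weights.items() if word in keywords)


# Invented sample tweets and keywords, for illustration only.
tweets = [
    "The database of the company was leaked",
    "The weather is nice and the sky is blue",
    "A new exploit for the server was leaked",
]
watched = {"leaked", "exploit", "database"}

weights = tf_idf(tweets)
scores = [score(w, watched) for w in weights]
for tweet, s in zip(tweets, scores):
    print(f"{s:.3f}  {tweet}")
```

In this toy corpus, “the” appears in every tweet, so its inverse document frequency is log(1) = 0 and it contributes nothing, while rarer words such as “database” or “exploit” carry most of the weight. The tweet about the weather contains no watched keyword and scores zero.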

Other social media

Analyzing tweets is great, but what about other social media? Facebook, Instagram, Reddit, LinkedIn, and many others contain a large amount of information.
For example, on LinkedIn and its German equivalent Xing, you can find out where people work and thus the region they live in, see which jobs a company is offering, identify the technologies it uses, and so on.

As we saw in this article, SOCMINT, and OSINT more generally, is very important in cybersecurity. It can help monitor a company’s public image, support an investigation, and of course prevent cyberattacks. Our reliable qualification method currently analyzes millions of tweets every day to protect our clients.


