
Research by: Itay Cohen (@megabeets_)
Over the past few decades, hacktivism has been, in a lot of cases, characterized by minor website defacements and distributed denial-of-service (DDoS) attacks, which, while making headlines, had minimal lasting impact. However, in recent years, we have observed a significant shift in the nature of these activities. Groups that appear to be state-sponsored, yet masquerade as hacktivists, are now conducting large-scale cyber and influence operations. These actors often use various group names and grassroots facades to maintain anonymity and reduce international backlash. These sophisticated attacks serve as powerful tools for political and social influence, providing plausible deniability and the illusion of legitimacy, thereby evading direct government attribution.
This research explores whether such operations’ growing popularity and wide and repeating adoption might also be their Achilles’ heel. We introduce a new approach to attributing hacktivist groups by employing language-based machine-learning models and linguistic analysis. Our study examines thousands of public messages from dozens of hacktivist groups, integrating traditional cyber threat intelligence methods with modern machine learning technologies. This combined approach aims to uncover the key topics discussed by these groups, understand their motivations over time, tie some of these groups together, and improve the way we do attribution when it comes to hacktivism.
Hacktivism, a blend of hacking and activism, has evolved dramatically since its inception. Emerging in the 1980s with groups like the Cult of the Dead Cow, which pioneered the concept of hacktivism, the phenomenon has grown from a fringe subculture into a significant force in global cyber politics. Early hacktivists targeted symbolic institutions and networks, aiming to draw attention to various social and political issues. The 1990s and 2000s saw the rise of more organized groups like Anonymous, whose operations, from the 2008 Scientology attacks to the 2010 PayPal protests, showcased the potential of digital activism. Today, hacktivism is not just about defacing websites or DDoS attacks—it has become a sophisticated tool in the arsenal of state and non-state actors alike, influencing geopolitical landscapes worldwide.
One of the most significant developments in hacktivism in recent years is the involvement of nation-state actors. Governments have recognized the strategic value of hacktivist methods, adopting and adapting them for their own purposes. Nations employ cyber operatives to conduct operations that mimic grassroots hacktivism. These activities often aim to sow discord, influence public opinion, or undermine political adversaries while maintaining plausible deniability. By cloaking their actions in the guise of independent hacktivism, governments can achieve their goals without direct attribution, complicating international responses and accountability.
State-sponsored entities often stage operations to appear as though they are the work of independent hacktivists, thereby muddling the waters of attribution. They often operate multiple, sometimes dozens, of groups that appear to act independently. This strategy not only serves to mislead targets but also to discredit genuine activist movements by associating them with state agendas. Such deceptive tactics complicate the global cyber landscape, making it challenging to distinguish between actual hacktivist actions and state-sponsored subterfuge.
In recent years, ever since this shift in the hacktivism landscape started, Check Point Research has been tracking dozens of hacktivism groups — some are operated by groups of hackers with mutual ideology or interest, and others by nation-state actors masquerade as hacktivists. Our investigations have led to several published articles detailing significant operations we believe were conducted by these state actors. In some cases, we have successfully linked different hacktivist groups to a single operator, piercing their veil of anonymity. Our research has employed traditional threat intelligence and investigative methods, such as binary analysis, code similarity, and the Diamond Model of Intrusion Analysis. For instance, we demonstrated how one hacktivist group used the same unique tools and methods as another group targeting different entities years earlier, suggesting they were the same group. However, since then, nation-state hacktivist groups have refined their techniques, trying even harder to obscure their tracks and act independently.
Tracking these groups has provided us with valuable insights into their motives, attack tactics, and usual targets. Our researchers have developed various theories and hypotheses regarding the operators behind these groups, suspecting that many are run, either directly or indirectly, by the intelligence agencies of different nation-states. The recent major geopolitical events, such as the Russian invasion of Ukraine and the conflict between Israel and Hamas, have highlighted shifts that would have otherwise gone unnoticed. In response to these events, hacking groups that had been inactive for years suddenly resurfaced, often with new focus areas. Independent groups began to echo similar messages, sometimes even referencing one another. These conflicts created a fertile ground for the emergence of new groups and cyber personas at an astonishing pace.
The increased frequency of hacktivist operations and the rise of new groups have added complexity to the cyber threat landscape. The rapid formation and dissolution of these groups, coupled with their ability to adapt quickly to geopolitical changes, make it challenging for threat researchers to manually keep track of them. Recognizing that relying on gut feeling and intuition is not a solid approach to threat intelligence, we decided to explore more reliable, robust, and consistent methods.
In order to address our research questions and tackle the complexities we presented, we adopted a combined approach utilizing Topic Modeling and Stylometric analysis. This decision was driven by the need for better methods to analyze the vast amount of textual data written by hacktivist groups. By integrating language-based machine learning models and linguistic analysis, we aim to systematically uncover the motivations behind hacktivist activities, identify connections between different groups, and enhance the attribution of these operations to specific actors.
Topic Modeling helps us understand the key themes and topics discussed by these groups over time. It allows us to answer questions such as: What are the main goals and targets of these hacktivist groups? How do their focus areas change in response to geopolitical events? By analyzing the content of the groups’ social media messages, we can identify clusters of topics and track their evolution, providing insights into the strategic objectives of these actors.
Stylometric analysis, on the other hand, focuses on the unique writing styles of different groups. By examining features like word choice, sentence structure, and other linguistic patterns, we can answer questions such as: Can we tie different hacktivist groups together based on their writing style? Are there groups with similar textual fingerprints that might indicate common authorship or collaboration? This analysis helps us build a network of connections between different groups, offering a deeper understanding of their relationships and potentially revealing the operators behind these cyber personas.
To conduct this research, we collected data from the social media accounts and messages of hacktivist groups believed to be operated by nation-state actors. The primary sources of data were Twitter and Telegram, as these platforms are commonly used by hacktivist groups to disseminate their messages. The collection process involved several steps:
Having all the data structured and ready for analysis we moved on to the analysis.
Topic modeling is a machine-learning technique used to identify themes (topics) within large collections of textual data. In this research, we employed BERTopic, a modular and flexible framework for topic modeling. The process involved the following steps:
# Docs | Representation | Documents | |
---|---|---|---|
Topic 1 | 2775 | attack – russian – cyber – ukrainian – hack | [“we announce for all russian hackers, we begin the cyber attack on ukrainian websites …”] |
Topic 2 | 1205 | israel – cyber – leak – zionist – gaza | [“#IsraelLeaks – Day Four – Leak One – Israel Innovation Authority – #OpCyberToufan #OpIsrael, …”] |
Topic 3 | 978 | confidential – letter – leak – regime – iran | [“We expose a very confidential letter from the cultural and social vice president, …”] |
Topic 4 | 490 | cyber – operation – russia – slava – ukraine | [“today we will cut the tongue of russian propaganda and carry out an do attack on rt (tv “russia today”) …”] |
… | |||
Topic N | 33 | italy – european – attack – farmers – cyber | [“and we continue our joint attack on italy and with the help of our friends from, …”] |
Topic -1 | 283 | NaN | [“because you are a dwarf or a small child, you certainly do not know him”, …] |
Figure 1: Table of the topics after the Topic Representation stage
Figure 2: The topic-modeling process step by step. BERTopic is modular and flexible.
By employing topic modeling, we can inspect the topics and themes within vast amounts of textual data — the messages written by the different profiles. It allows us to gain insights into the key topics and trends discussed by these hacktivist groups, track changes in their focus areas over time, and understand their strategic objectives. For example, we can identify how the focus of certain groups shifts in response to geopolitical events, such as an uptick in messages related to cyber attacks following a major conflict. Additionally, topic modeling can reveal unexpected connections between groups discussing similar topics, suggesting potential coordination or shared motivations. Since these groups were most likely created by intelligence agencies, understanding the topics discussed by each of them can help us understand what mission or goal these accounts are serving and what purpose they were created for. Visualizing the topics over time allows us to detect shifts in focus or interest by the entities operating these accounts. Combining these insights together, we can have a better understanding of who might be behind these groups, based on the interest and the change of interest of certain groups.
Stylometry is the study of linguistic style, often used for authorship attribution. It leverages the unique way individuals write, akin to a linguistic fingerprint, to identify authors or establish connections between texts. In this research, we utilized stylometric analysis to compare the writing styles of different hacktivist groups. This method provided us with potential relationships and commonalities between these groups. The process involved the following steps:
Stylometry can reveal a wealth of information beyond mere authorship. For instance:
By identifying stylistic consistencies across different messages and accounts, we can:
By using stylometric analysis in our methodology, we were able to gain a deeper understanding of the relationships and behaviors of hacktivist groups. This approach provided a layer of analysis that complements traditional methods, offering a more nuanced view of these groups.
In our analysis, we utilized the BERTopic framework to identify and categorize the main themes discussed by various hacktivist groups. After clustering the messages, we extracted the most representative words for each cluster using the class-based TF-IDF (c-TF-IDF) method. We used GPT to assign meaningful names to these topics, which provided concise and descriptive labels based on the top keywords and representative documents for each cluster. Some initial topics were too specific or too broad, requiring refinement. For instance, smaller topics related to leaking documents from the Ukrainian Army were merged into the broader “Leaking Documents from Ukrainian organizations” topic, while the general topic of “Cyber Attacks in Europe” was split into more specific topics like “Cyber Attacks Against Spain” and “Cyber Attacks Against UK.” This iterative process ensured that the final set of topics was both comprehensive and precise. Eventually, we ended up with a set of topics that represented the main interests of these groups.
Figure 3: Topic modeling visualization of users-per-topic
There were multiple topics discussing cyber attacks against organizations in different countries such as Israel, Ukraine, Russia, Iran, India, Spain, the USA, and more. Others focused on cyber attacks against militant organizations such as Hezbollah in Lebanon and the Huthis in Yemen. Some topics were related to the leaking of sensitive information from countries such as Ukraine, Israel, Russia, and Iran. Many topics were directly linked to major geopolitical events, such as the Russian invasion of Ukraine and the Israel-Hamas conflict. This highlights the role of hacktivist groups in cyber warfare and propaganda. On some occasions, similar topics discussed by different groups suggest possible coordination or shared objectives.
Figure 4: Topic modeling visualization of users-per-topic
Given that these groups are likely orchestrated by intelligence agencies, analyzing the topics they discuss provides insight into the missions or goals these accounts are intended to serve. By visualizing the evolution of these topics over time, we detected changes in focus or interest by the entities managing these accounts. This combined analysis offers a clearer understanding of who might be behind these groups, based on their shifting interests and objectives. We observed how groups react to certain geopolitical events and the response time of others. For example, Russia-affiliated hacktivist groups initiated attacks in tandem with the Russian invasion of Ukraine. It took a few months for Ukrainian-related hacktivist groups to consistently retaliate with their own attacks.
Figure 5: Topic modeling visualization of topics-per-user
Figure 6: Topic modeling timeline visualization of topics and volume related to certain geopolitical events
Stylometric analysis was used to compare the writing styles of different hacktivist groups and uncover potential connections. By extracting nearly 200 features from the collected messages, including lexical, syntactic, and stylistic elements, we created unique stylistic fingerprints for each account. This analysis not only identifies authorship but also provides deeper insights into the organizational structure and operational strategies of these groups.
Figure 7: Stylometric similarity visualization draws connections between different hacktivist groups
Using similarity measures to compare these stylistic fingerprints, we identified several notable connections. Certain groups exhibited highly similar writing styles, suggesting they might be operated by the same individuals or teams. For instance, the Cyber Army of Russia Reborn, Solntsepek, and XakNet showed significant stylistic overlap. This finding supports Google Mandiant’s reports indicating that these groups are cyber personas used by APT44, providing further evidence of their coordinated operations. In addition, two groups not mentioned in the Mandiant report—JustEvil and NoName057—also showed some similarity to this cluster of activity. We also discovered connections between other clusters, including those targeting organizations in Israel and Albania, and another cluster focusing on Iran.
Sudden changes in writing style within a single account indicated possible changes in authorship. For example, the IT Army of Ukraine showed a distinct shift in style around 2022, coinciding with a change in its messaging focus. Initially, the account featured messages in Arabic cheering for the Egyptian football team, but after the Russian invasion of Ukraine, the content shifted dramatically to focus on cyber attacks against Russia. This stark change suggests that the account was purchased, repurposed, or taken over by the group. Other examples demonstrate that more than one person is writing the messages for some groups, while other groups have only a single author who consistently writes in a similar manner.
The identification of stylistic overlaps between different groups suggests coordinated efforts and shared objectives, possibly under a single directive from intelligence agencies. This enhances our ability to attribute attacks and understand the broader strategy behind them.
Moreover, detecting sudden shifts in writing style within an account can signal changes in the group’s control or a strategic redirection. This information is vital for threat intelligence, as it provides early indicators of significant changes in hacktivist activities.
By combining these insights with the results from topic modeling, we gain a better understanding of hacktivist groups’ motivations, strategies, and interconnections. This holistic view is essential for understanding and improving the attribution of operations by hacktivist groups.
The findings from our topic modeling and stylometric analysis provide insights into the activities and evolution of hacktivist groups. By identifying key themes discussed by these groups, we get a better understanding of their primary targets, which include various countries and organizations. The topics frequently centered around major geopolitical events, such as the Russian invasion of Ukraine and the Israel-Hamas conflict, highlighting the role of hacktivist groups in cyber warfare and propaganda. The evolution of topics over time reveals how these groups adapt their messaging in response to global events, reflecting their strategic objectives.
Stylometric analysis showed the interconnectedness of hacktivist groups, with certain groups exhibiting similar writing styles indicative of common authorship or coordination. This was particularly evident in clusters of groups targeting specific regions, such as Ukraine, Israel, and Iran. Additionally, shifts in writing style within individual accounts, suggest changes in ownership of the accounts and might indicate a shift in strategic direction.
Understanding the topics and writing styles of hacktivist groups improves threat intelligence by organizing the knowledge regarding their motivations and strategies. This is essential for creating effective countermeasures and accurately attributing cyber attacks to specific groups or state-sponsored entities. Continuous monitoring and analysis of hacktivist communications can provide early warnings of emerging threats, enabling proactive defense measures.
Despite the valuable insights gained, this research has several limitations. Data collection posed significant challenges, as hacktivist groups frequently change their accounts (the social media platforms often ban their old accounts), limiting the availability of data. Also, it is often hard to understand if an account is legit or some kind of a fan page or a copycat of a hacktivist group. Moreover, the reliance on social media messages means that the analysis may not capture the full scope of hacktivist activities, which might include messages written on their website (often on the Darkweb), forums, and sometimes directly to journalists.
The machine learning models used in this study, while powerful, have inherent limitations. Topic modeling relies on the quality and comprehensiveness of the input data, and small or obscure topics may be overlooked. Stylometric analysis, though effective in identifying writing style similarities, may produce false positives if different authors share similar linguistic habits or use a language that is not properly supported by the models. Also, with enough effort, it is possible to mimic the writing style of certain groups, hence misleading the stylometric model.
To build on the research findings, future research should expand the analysis to include more hacktivist accounts. With many groups emerging rapidly, it is essential to set up automated processes for continuous monitoring and analysis. This automation would allow for real-time insights and a more comprehensive understanding of hacktivist activities.
Additionally, researchers should explore other data types, such as the metadata of videos and PDFs, to gain further insights. Analyzing unique hashtags, recurring typos, and mentions across different accounts can also provide valuable information. Applying visual stylometry concepts to videos and graphic images posted by these groups can reveal additional patterns and connections.
Our research’s significance lies in its innovative approach to analyzing hacktivist communications. By combining topic modeling and stylometric analysis, we provided a comprehensive view of these groups’ activities and their evolution. However, it’s important to note that this approach is not infallible. The cyber threat landscape is continuously evolving, and so must our methods. We need to keep experimenting, innovating, and collaborating to stay ahead of these threats.
The blurred lines between state-sponsored activities and traditional grassroots hacktivism highlight the need for adaptive and innovative threat intelligence methods. Our efforts to understand these complex dynamics are driven by the recognition that we do not have all the answers and this fact is what drove us to seek an understanding in the beginning.
I would like to extend my gratitude to my colleague, Amit Levin, from Check Point’s Data Science team, for their invaluable assistance and guidance throughout this research. Thank you, Amit, for your contributions and collaboration.