Research into misinformation, disinformation, and social media manipulation sits at a dangerous intersection of urgency and ambiguity. The high stakes involved – including elections, public health, national security, and social cohesion – demand timely responses. The pressure to get out in front and report findings as quickly as possible is intense. But speed without rigor does not strengthen this field of research and analysis; it weakens it.
For this work to be trusted, funded, cited, and used to inform strategy and policy, it must meet a higher standard than relying on notions that online activity “feels big” or that “the numbers look alarming.” That means being explicit about what was found, what those findings actually demonstrate, and, just as importantly, what they do not prove.
While observations can drive hunches and hypotheses about what online activity means, they are neither equivalent to findings nor adequate proof of coordinated, inauthentic behavior, malicious or otherwise. These domains of misinformation, disinformation, and social media manipulation are often discussed interchangeably in public discourse. In research and analysis, they should not be.
At Alethea, we strive to add context and provide meaningful insights to our analysis, helping our customers and the public better understand the actual significance of our findings.
Before discussing methods or findings, it is worth clarifying terms that are too often blurred together:
Misinformation refers to false or misleading information shared without clear evidence of intent to deceive. This can include rumors, misunderstandings, outdated facts, or incorrect interpretations shared by ordinary users.
Disinformation refers to false or misleading information that is deliberately created or disseminated to deceive, manipulate, or cause harm. This implies intent—but intent must be demonstrated, not assumed.
Social media manipulation refers to the use of deceptive tactics, such as fake accounts, automation, or engagement gaming, to distort how content spreads or how popular it appears, regardless of whether the content itself is false.
These distinctions matter because they determine:
What claims can be responsibly made about online activity
What evidence is required to support these claims
What policy, mitigations, or platform responses are available and appropriate
Calling all of these anomalous online phenomena “disinformation” may be rhetorically powerful, but it is analytically reductive and often wrong.
One of the most common failures in mis- and disinformation and social media manipulation research is the use of large, uncontextualized numbers:
“Millions of users were exposed to misinformation...”
“Hundreds of thousands of disinformation accounts…”
“Billions of impressions tied to manipulation...”
“A coordinated influence network across platforms…”
These figures are often technically accurate but analytically misleading; they are also often unreproducible. They do not offer practical, actionable insights for communications, security, or policy teams, but rather create a perception of an urgent crisis with massive scale and impact. While social media can cause offline crises, it takes a trained eye to know when and whether to react.
In reality, exposure to content does not automatically lead to persuasion. The impressions garnered by a post do not directly equate to influence. Accounts do not always correspond one-to-one to human actors. A network of accounts does not always have coordinated intent.
When researchers fail to specify which phenomenon they are measuring—such as the spread of misinformation, actor-led disinformation campaigns, or other manipulation tactics—they collapse distinct behaviors into a single alarming headline.
Rigor demands that we treat numbers as evidence, not rhetoric.
In information integrity research, a common challenge involves the conflation of potential reach with actual impact. Analysts often cite high-level follower counts as a proxy for exposure, leading to conclusions that may overstate the scale of a "disinformation" event. Without verifying feed delivery or engagement metrics, it is difficult to distinguish between high-volume misinformation and a deliberate, coordinated disinformation campaign.
Let’s walk through an example of how this common pitfall often plays out:
The Surface-Level Claim: A report cites that a narrative was “exposed to over 10 million users,” labeling it a “large-scale disinformation event” based on follower counts.
The Reality: That “10 million” number represents a theoretical maximum reach based on follower counts. Without further data, this only supports a claim of misinformation, not disinformation.
The Rigorous Standard: Analysis must include verification of feed delivery to “exposed” users, engagement or retention metrics, and evidence of intent behind the original posts. Researchers must differentiate between organic user sharing and deliberate seeding.
Without verifying delivery and engagement, large-scale reach is a measure of potential, not proof of a coordinated campaign.
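The gap between those two claims can be made concrete with a minimal sketch. The figures and field names below (author_followers, impressions, engagements) are hypothetical illustrations, but they show how a follower-count sum, verified delivery, and engagement tell three different stories about the same posts:

```python
# Minimal sketch: separating theoretical reach (follower sums) from
# verified delivery and engagement. All numbers are hypothetical.

def summarize_reach(posts):
    """Summarize potential vs. verified exposure for a set of posts."""
    theoretical_reach = sum(p["author_followers"] for p in posts)
    verified_impressions = sum(p.get("impressions", 0) for p in posts)
    engagements = sum(p.get("engagements", 0) for p in posts)
    return {
        "theoretical_reach": theoretical_reach,        # the "10 million exposed" number
        "verified_impressions": verified_impressions,  # what was actually delivered
        "engagement_rate": (engagements / verified_impressions
                            if verified_impressions else 0.0),
    }

# Two hypothetical posts by large accounts sharing the same narrative.
posts = [
    {"author_followers": 6_000_000, "impressions": 40_000, "engagements": 800},
    {"author_followers": 4_500_000, "impressions": 25_000, "engagements": 300},
]

summary = summarize_reach(posts)
```

Here the follower sum supports a "10.5 million users exposed" headline, while delivery data shows roughly 65,000 verified impressions: a measure of potential, not proof of impact.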
Patterned behavior, such as synchronized posting or content duplication, is often flagged as evidence of a coordinated manipulation network. However, the widespread use of legitimate scheduling tools and public content scrapers can create technical signatures that mimic coordination. Rigor in this space requires moving beyond timing-based heuristics to look for evidence that systems are being gamed rather than simply utilized.
Let’s walk through an example of how this common pitfall often plays out:
The Surface-Level Claim: Thousands of accounts are identified as a “coordinated social media manipulation network” based solely on synchronized posting behavior.
The Reality: Legitimate professional scheduling tools (like Hootsuite) and public content scrapers create technical signatures that mimic coordination. Without additional context, this incorrectly infers manipulative intent from mere automation.
The Rigorous Standard: Analysis must move beyond timing-based heuristics to look for evidence that systems were being deliberately gamed rather than simply utilized. Researchers should provide concrete examples of deceptive behavior and explicitly state that intent cannot be inferred from timing alone.
True manipulation requires evidence of a deliberate attempt to game the system, not just the presence of automated tools.
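To see why timing-based heuristics produce false positives, consider a minimal sketch of a timing-only detector. The data is hypothetical: three unrelated accounts whose scheduling tool fires just after 9:00 a.m. trip the same rule that a genuinely coordinated network would:

```python
from collections import Counter
from datetime import datetime

def timing_only_flag(timestamps, window_seconds=60, min_cluster=3):
    """Naive heuristic: flag activity if enough posts land in the same
    time window. On its own, this rule cannot distinguish coordination
    from ordinary scheduling tools."""
    buckets = Counter(int(ts.timestamp()) // window_seconds for ts in timestamps)
    return any(count >= min_cluster for count in buckets.values())

# Hypothetical data: three unrelated accounts using a scheduler set to 9:00.
scheduled = [datetime(2024, 5, 1, 9, 0, s) for s in (1, 5, 12)]
# Hypothetical data: three organic posts spread across the day.
organic = [datetime(2024, 5, 1, h, 13, 0) for h in (9, 11, 15)]

scheduled_flagged = timing_only_flag(scheduled)  # True: a false positive
organic_flagged = timing_only_flag(organic)      # False
```

The detector flags the scheduler users exactly as it would flag a coordinated network; the timing signature alone says nothing about intent.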
Large datasets of inauthentic accounts can create a sense of scale that may not translate to real-world influence. When inauthentic actors or accounts are identified, it is essential to analyze the "echo chamber" effect: high account volume within an isolated, low-reach network often fails to penetrate mainstream discourse.
Overstating the significance of these networks can lead to skewed risk assessments and misallocated resources.
Let’s walk through an example of how this common pitfall often plays out:
The Surface-Level Claim: A report flags 100,000 accounts as "inauthentic actors amplifying disinformation."
The Reality: If the accounts have negligible follower counts and engagement is confined to an isolated, low-quality network, the real-world influence is likely overstated.
The Rigorous Standard: Precision requires analyzing follower counts (the capacity to meaningfully amplify content) and engagement (the risk of a narrative crossing over from the network into mainstream discourse). Researchers must then contextualize account volume against actual engagement and follower quality to paint an accurate picture of impact.
Proper framing leads to different—and better—policy decisions, and prevents the unnecessary allocation of resources to low-risk situations.
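One way to operationalize that contextualization is sketched below. The field names and follower threshold are hypothetical, but the idea is to report raw account volume alongside amplification capacity and the share of engagement that escapes the network:

```python
def assess_network_impact(accounts, min_followers=1_000):
    """Contextualize raw account volume against reach and crossover.
    Field names and the follower threshold are hypothetical."""
    total = len(accounts)
    external = sum(a["external_engagements"] for a in accounts)
    internal = sum(a["internal_engagements"] for a in accounts)
    all_engagements = external + internal
    return {
        "accounts": total,
        # Share of accounts with any meaningful capacity to amplify.
        "share_with_reach": (sum(1 for a in accounts
                                 if a["followers"] >= min_followers) / total
                             if total else 0.0),
        # Share of engagement coming from outside the network itself.
        "crossover_ratio": external / all_engagements if all_engagements else 0.0,
    }

# Hypothetical network: several accounts, little reach, mostly internal engagement.
network = [
    {"followers": 50, "external_engagements": 0, "internal_engagements": 10},
    {"followers": 80, "external_engagements": 1, "internal_engagements": 12},
    {"followers": 2_000, "external_engagements": 5, "internal_engagements": 45},
]

impact = assess_network_impact(network)
```

A headline account count means little without the second and third numbers: here most accounts lack the followers to amplify anything, and engagement rarely crosses out of the network.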
In a hyper-connected media environment, similar narratives often appear across multiple platforms simultaneously. While this can create the appearance of a unified campaign, it is frequently the result of algorithmic feedback loops or shared reactions to real-world events.
Let’s walk through an example of how this common pitfall often plays out:
The Surface-Level Claim: A report documents similar false narratives appearing across different websites, platforms, and forums, and implies this constitutes a unified disinformation campaign.
The Reality: Narrative convergence is not proof of disinformation; it is often a feature of misinformation dynamics during viral or emerging events, where algorithmic amplification can create the perception of coordinated interference. Without evidence of shared infrastructure, attributing the convergence to a centralized actor is speculative.
The Rigorous Standard: Coordination must be supported by evidence of shared infrastructure (linked IPs, account creation dates, geolocation data) or cross-platform identity linkages (reused profile names, identical bios). Researchers must identify indicators of a centralized command-and-control operation, such as identical posting times, synchronized reposting, or uniform engagement across posts.
Narrative alignment may suggest a shared theme, but it only becomes a campaign when supported by evidence of shared infrastructure or coordinated control.
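The infrastructure checks described above can be sketched as simple set intersections. The keys below (signup_ip, bio) are hypothetical stand-ins; real investigations draw on far richer platform signals:

```python
def infrastructure_overlap(accounts_a, accounts_b):
    """Look for shared-infrastructure indicators between two account sets
    observed on different platforms. Keys are hypothetical illustrations."""
    ips_a = {a["signup_ip"] for a in accounts_a}
    ips_b = {b["signup_ip"] for b in accounts_b}
    bios_a = {a["bio"] for a in accounts_a}
    bios_b = {b["bio"] for b in accounts_b}
    return {
        "shared_ips": sorted(ips_a & ips_b),
        "reused_bios": sorted(bios_a & bios_b),
    }

# Hypothetical accounts pushing the same narrative on two platforms.
platform_a = [{"signup_ip": "203.0.113.7", "bio": "Truth seeker"},
              {"signup_ip": "198.51.100.2", "bio": "News junkie"}]
platform_b = [{"signup_ip": "203.0.113.7", "bio": "Truth seeker"},
              {"signup_ip": "192.0.2.44", "bio": "Patriot"}]

overlap = infrastructure_overlap(platform_a, platform_b)
```

Empty intersections do not disprove coordination, but non-empty ones provide the concrete, inspectable evidence that narrative similarity alone cannot.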
Across research on misinformation, disinformation, and social media manipulation, the same standard applies as in any other field: claims must be proportionate to the evidence.
Credible research about social media activity should always include:
Clear definitions of the phenomena being studied
Transparent methodology, assumptions, and known limitations
Comparison of the observed activity to a baseline activity level or to other analogous online activity
Concrete, inspectable examples supporting findings
Explicit limits on inference, including acknowledgment of data gaps that prevent a definitive conclusion, or qualifiers stating the confidence level of assessments
If a claim cannot be traced to an observable behavior, it is not yet ready to be shared as a finding.
Overstated claims blur important distinctions, erode trust, and invite backlash. In this environment, journalists grow wary, policymakers hesitate, platforms disengage, and adversaries adapt.
Precision does the opposite:
It strengthens the legitimacy of the field
It enables constructive challenge to claims and conclusions
It supports proportional responses to nefarious actors
It differentiates objective research from sometimes biased narrative signaling
In a field already accused of politicization and alarmism, rigor is not optional. It is defensive infrastructure.
Research into misinformation, disinformation, and social media manipulation does not merely describe online behavior—it shapes real-world decisions.
With that influence in mind, researchers are obligated to clearly define:
What they observed
What they can reasonably conclude
Where uncertainty remains in their findings
Rigor in high-pressure situations is not a matter of caution, but rather of honesty. Reporting clear, defensible claims based on the best available information can prevent researchers themselves from spreading distortions similar to those they are trying to characterize and expose. In a domain focused on online distortions, losing the advantages of transparency and honesty to speed and hype is not an option.
The urgency of the information environment often creates a bias toward action, where the pressure to report findings quickly can outpace the time required for deep technical validation. However, as this field matures, the distinction between a "hunch" and a "finding" must remain absolute. High-volume data and alarming headlines may capture attention, but only rigorous, transparent, and verifiable analysis can inform effective policy and security responses.
At Alethea, we believe that showing the work is as important as the claim itself. By maintaining clear definitions, acknowledging the limits of available data, and distinguishing organic online behavior from coordinated manipulation, researchers protect the legitimacy of this vital work. Precision is not a barrier to speed; it is the defensive infrastructure that ensures our findings lead to better decisions and a more resilient information ecosystem.
In a domain dedicated to exposing distortions, our greatest assets are honesty, transparency, and a commitment to evidence that stands up to scrutiny. Rigor is not optional — it is the standard that ensures our field remains a trusted source of truth.