Tracking Phishing Bots Via Locations

Proposal: exploit an information leak in catfishing/thirst-trap bots’ public ‘location’ field, which they set to each target’s location, to monitor their collective activity and infer their attacks.

Pig-butchering botnet analysis via ‘location’ targeting: if you track catfishing or spam bots which try to better target victims by setting fake ‘location’ data to match each target’s, then you can observe the tempo of attack campaigns and infer a great deal.

While talking with a guy who has an odd Twitter hobby of chatting with the catfishing DM bots, he mentioned something interesting about how they try to appeal to their targets: the bot account will set its ‘location’ to the target’s location (if available), to hint at a possible physical liaison for horny targets; and so, it will change it when it targets a new person. (The accounts get reused for multiple attacks until banned.)

This struck me as an interesting metadata leak and side-channel about the bots’ secret activities: the public ‘location’ tells you where they are attacking someone, obviously, but it also tells you when they attack each new person, which would ordinarily require highly privileged access to things like internal DMs, and so could only be observed by the social-media platforms themselves.

A nice feature of the location metadata leak is that it cannot be abandoned by attackers without a cost: it presumably works, which is why they do it; but the leak is necessarily public, because they want the target to see it without realizing that it is manipulative. The location has to look casual or ‘accidental’. So the signature might persist for a long time. But even if it doesn’t, by its nature, this data may already be collected in existing scrapes of public profile metadata, so the analysis can be done retrospectively.

So if you are able to reliably find such sex-themed bots (a much lower bar than privileged access to internal data), you can infer many things. A short list would include: how many people they are targeting, how quickly they move on, whether they are succeeding right then, whether they ever ‘cool off’ targets and try again later on (by observing enrichment in ‘returning’ to a location); and if you know enough about the economics of the crime, you can try to back out their average revenue vs costs based on the temporal patterns (especially after inferring the country of the attackers), and the implied value of different customers (eg. by country of origin—users from America or Japan will be worth much more than from Latin America, etc., and one would expect those attacks to last longer, or by whether a user is involved in cryptocurrency and may be worth a highly tailored attack).
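As a sketch of what the first step could look like, here is a minimal Python version of turning scraped profile snapshots into per-bot attack tempo. All data, field names, and the snapshot format are hypothetical; a real pipeline would ingest periodic profile scrapes.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical scraped profile snapshots: (bot_id, snapshot_time, location).
snapshots = [
    ("bot1", datetime(2024, 1, 1), "Austin, TX"),
    ("bot1", datetime(2024, 1, 4), "Denver, CO"),
    ("bot1", datetime(2024, 1, 5), "Denver, CO"),
    ("bot1", datetime(2024, 1, 9), "Miami, FL"),
]

def switch_events(snapshots):
    """Collapse snapshots into (bot_id, time, old_loc, new_loc) switch events,
    each marking the start of a new attack on a new target."""
    last = {}
    events = []
    for bot, t, loc in sorted(snapshots, key=lambda s: s[1]):
        if bot in last and last[bot][1] != loc:
            events.append((bot, t, last[bot][1], loc))
        last[bot] = (t, loc)
    return events

def dwell_times(events):
    """Per-bot time spent per target: gaps between successive switches,
    a proxy for how quickly the operators give up and move on."""
    per_bot = defaultdict(list)
    for bot, t, old, new in events:
        per_bot[bot].append(t)
    return {bot: [b - a for a, b in zip(ts, ts[1:])]
            for bot, ts in per_bot.items()}
```

Long dwell times would then flag ‘succeeding right then’, and repeated appearances of the same location in a bot’s history would flag the ‘cool off and retry’ pattern.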

The absence of attacks is also relevant: how specialized are bot networks? One would expect that with the rise of LLMs, bot networks will get more versatile and multilingual, as all interactions, even by the final human specialist scammer, can simply be machine-translated. (The specializations might help with qualitative understanding of the crime networks: perhaps there are implied agreements to not poach in each other’s language or region.)

Since ‘giving up’ is probably not fully automated, there will be manual human intervention, and the exact timing of location switches will indicate how many humans are involved, via the granularity of switches (a worker will switch only one account at a time, and each oversight task probably takes at least a minute start-to-finish), as well as their timezone; gaps corresponding to local holidays will imply a country. If one already has the clusters by another method, one can instead estimate the level of automation: clusters which switch rapidly at all hours, and are more consistent with maximizing an implied economic model, are more automated and presumably more sophisticated.
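The two timing inferences above can be sketched as follows: an hour-of-day histogram for shift/timezone estimation, and a lower bound on concurrent workers from switches packed too tightly for one person to handle serially. The timestamps and the one-minute task length are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical switch timestamps for one bot cluster (UTC).
switch_times = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 2),
    datetime(2024, 1, 1, 9, 2, 30),
    datetime(2024, 1, 1, 14, 40),
]

def hourly_activity(times):
    """Hour-of-day histogram: a sharp 8-10 hour active window suggests
    a single human shift, and its offset suggests a timezone."""
    return Counter(t.hour for t in times)

def max_concurrency(times, min_task=timedelta(minutes=1)):
    """Lower bound on simultaneous workers: count switches packed more
    tightly than one worker could plausibly handle them one at a time."""
    times = sorted(times)
    best = 1
    for i, start in enumerate(times):
        # how many switches fall inside one task-length window starting here?
        n = sum(1 for t in times[i:] if t - start < min_task)
        best = max(best, n)
    return best
```

A fully automated cluster would show flat `hourly_activity` and high `max_concurrency`; a small manual operation would show a daily shift window and concurrency near 1.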

You can look for cross-bot patterns as well: bots may switch in clusters, which allows inferring the number and nature of the bot nets.
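A simple way to surface such clusters, under the assumption that co-controlled bots tend to be switched in the same operator session, is to count how often each pair of bots switches within a short window of each other. The events and the 5-minute window are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical switch events: (bot_id, switch_time).
events = [
    ("a", datetime(2024, 1, 1, 10, 0)),
    ("b", datetime(2024, 1, 1, 10, 1)),
    ("c", datetime(2024, 1, 2, 3, 0)),
    ("a", datetime(2024, 1, 3, 10, 0)),
    ("b", datetime(2024, 1, 3, 10, 2)),
]

def coswitch_counts(events, window=timedelta(minutes=5)):
    """Count how often each bot pair switches within `window` of each other;
    repeated co-switching suggests shared control (one botnet)."""
    pairs = Counter()
    ordered = sorted(events, key=lambda e: e[1])
    for (b1, t1), (b2, t2) in combinations(ordered, 2):
        if b1 != b2 and abs(t2 - t1) <= window:
            pairs[frozenset((b1, b2))] += 1
    return pairs
```

Thresholding these counts (or feeding them to standard graph clustering) would partition the observed bots into candidate botnets.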

And it can imply things about more-complicated strategies: are there suspiciously many location overlaps, where bot A sets their location to bot B’s old location, right after bot B switches to a new target? This could indicate that different bot types are used to attack a user—sometimes there is a ‘handoff’, where scammers will scam a victim, and then follow up by scamming them a second time, often by pretending to help them deal with the first scam. (Because they are not yet drained of all money, and by falling for the first scam, they have proven they are vulnerable.) And when bots get banned, particularly if it happens in a wave, do the remaining bots change their behavior (eg. an implied higher discount rate, and so more rapid switching)?
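The handoff pattern could be detected directly from the switch events: look for one bot adopting a location that another bot abandoned shortly before. The event format and the 2-day window are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical switch events: (bot_id, time, old_location, new_location).
events = [
    ("botB", datetime(2024, 1, 5, 12, 0), "Portland, OR", "Tucson, AZ"),
    ("botA", datetime(2024, 1, 5, 18, 0), "Boise, ID", "Portland, OR"),
]

def handoffs(events, window=timedelta(days=2)):
    """Find pairs where one bot adopts a location another bot just abandoned,
    suggesting a second-stage ('recovery scam') attack on the same victim."""
    out = []
    for b1, t1, old1, new1 in events:
        for b2, t2, old2, new2 in events:
            if b1 != b2 and new2 == old1 and timedelta(0) < t2 - t1 <= window:
                out.append((b1, b2, old1))  # (first bot, follow-up bot, location)
    return out
```

An elevated rate of such overlaps, versus a baseline of coincidental same-city targeting, would be the actual evidence of coordinated handoffs.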

Aside from retrospective analysis, the location switches can be followed in realtime to ‘nowcast’ attack patterns. (Although there are enough ways to track attacks that simply knowing a targeted geographic region may not be too useful, and it is unlikely that vulnerable users would pay attention to a generic ‘users in California are being targeted heavily by scam bots right now’ warning; if they were that savvy, they wouldn’t need a warning at all.)
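A nowcast here could be as simple as a trailing-window count of switch events per region; everything below (data and window size) is a hypothetical sketch.

```python
from collections import Counter
from datetime import datetime, timedelta

def nowcast(events, now, window=timedelta(hours=24)):
    """Count location-switch events per region over the trailing window:
    a crude realtime gauge of where campaigns are concentrating."""
    return Counter(loc for t, loc in events if now - window <= t <= now)

# Hypothetical (time, target_region) switch events.
events = [
    (datetime(2024, 1, 10, 8), "California"),
    (datetime(2024, 1, 10, 9), "California"),
    (datetime(2024, 1, 7, 9), "Texas"),
]
print(nowcast(events, datetime(2024, 1, 10, 12)))  # Counter({'California': 2})
```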

And with enough data, some correlations or trends might emerge—economic or political statistics might correlate with trends in attacks.

Finally, the lack of location metadata itself can become a useful fingerprint. If there is a botnet which doesn’t use it, that implies that it’s monetizing in some other form; if it’s not obvious how, it may represent something even more sinister, like a propaganda network or an APT.
