“Voting for Authorship Attribution Applied to Dark Web Data”, 2020-11-10:
Authorship attribution (AA) can be a curse for users who try to protect their anonymity and, at the same time, a blessing for law enforcement groups that try to track them.
In this paper, we explore AA within multiple Dark Web forums [DNM Avengers, The Majestic Garden (TMG), The Hub (TH), Dread] to determine whether AA is possible beyond the boundaries of a single forum.
The analysis revealed that feeding all features into a single classifier gives worse results than classifying each feature set separately and combining the outputs with a voting mechanism. The voting approach achieves an F1-Score that is up to 44% higher than the single-classifier approach. In addition, the analyses show that the true author of a post is among the top 3 most likely candidates in at least 94% of cases.
This shows that AA can threaten the anonymity of Dark Web users across the boundaries of different forums.
[Keywords: authorship attribution, Dark Web, machine learning, natural language processing, voting]
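The abstract does not say how the votes are combined, so the following is only a minimal sketch of one common scheme, soft voting, in which the per-author probabilities from several separately trained classifiers are averaged and the top 3 candidates are read off the result. The function names, the feature-group labels in the comments, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def soft_vote(per_group_probs):
    """Average per-author probabilities over several classifiers.

    per_group_probs: list of dicts mapping author -> probability,
    one dict per feature-group classifier (assumed setup).
    """
    totals = defaultdict(float)
    for probs in per_group_probs:
        for author, p in probs.items():
            totals[author] += p
    n = len(per_group_probs)
    return {author: total / n for author, total in totals.items()}


def top_k(probs, k=3):
    """Return the k most likely authors, ties broken alphabetically."""
    ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [author for author, _ in ranked[:k]]


def top_k_hit_rate(posts, k=3):
    """Fraction of posts whose true author is among the top-k candidates.

    posts: list of (true_author, per_group_probs) pairs.
    """
    hits = sum(
        1 for true_author, groups in posts
        if true_author in top_k(soft_vote(groups), k)
    )
    return hits / len(posts)


# Hypothetical outputs of three feature-group classifiers for one post.
votes = [
    {"alice": 0.5, "bob": 0.3, "carol": 0.2},  # e.g. character n-grams
    {"alice": 0.2, "bob": 0.6, "carol": 0.2},  # e.g. word n-grams
    {"alice": 0.6, "bob": 0.1, "carol": 0.3},  # e.g. stylometric features
]
combined = soft_vote(votes)
print(top_k(combined, k=2))
```

A hit rate computed this way over a labeled test set corresponds to the kind of "author is within the top 3 candidates" figure quoted above, although the paper's exact evaluation procedure may differ.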
…3.2 Dark Web Forums Used: The dark web forums found between October–December 2019 in the course of this research had either a few hundred active users or a thousand and more. Since users who are active in 2 or more forums are more likely to be found in the most popular forums, only forums with more than 1,000 active users were selected. In future work, however, this threshold could be lowered to include smaller forums with only a few hundred users and thereby increase the size of the data set.
At the end of 2019, fewer than 10 Dark Web forums with a large community (around 1,000 active authors or more) were found. Unfortunately, the number of those forums that allow users to publish their PGP keys in their user profiles was even smaller. In the end, only 4 forums fulfilled the requirements for this analysis; they are presented in Table 1.