Analyzing Trends in the Silk Road 2.0

Written by Daryl Lau


Last Friday, one of the top articles on Hacker News was called “Breaking the Silk Road’s Captcha”.

This sounded pretty cool to me, though not necessarily applicable because the current Silk Road 2.0 (I’ll just be calling it SR from now on) isn’t using anything nearly as sophisticated.

I thought it would be really interesting to scrape SR for, let’s say, a month or two. I could do cool stuff like make a stock ticker and display values under symbols like COK, XTC, LSD, etc.


The following information is for educational purposes only. I have no affiliation with the Silk Road 2.0, nor have I ever purchased anything off the site. As far as I know, visiting the site and writing about it with no intention to buy (commit a crime) is perfectly legal.

Some implementation quirks

Before we begin: I only wanted to spend an hour or two doing this. I was late for a dinner and wanted it to run overnight while I was sleeping. If you are looking to build a robust system, you should consider a different solution.


Simply download the captcha, run it through some OpenCV transforms, then feed it to Tesseract. If it doesn’t work, just keep trying until we get a relatively easy one. I think my success rate was >90% with only a few transforms using OpenCV.
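Here is a rough sketch of what that loop could look like. It is not the original crawler code. The blur and Otsu threshold are assumed transforms (the post doesn’t say which ones were used), the alphanumeric check is an assumption about what a plausible answer looks like, and `fetch`, `ocr` and `submit` are hypothetical callables you would wire up to the site yourself.

```python
from typing import Callable, Optional


def ocr_captcha(image_bytes: bytes) -> str:
    """Clean up a captcha image and OCR it. The transform choices are assumptions."""
    # Imported lazily so the retry loop below can be used without these packages.
    import cv2
    import numpy as np
    import pytesseract

    raw = np.frombuffer(image_bytes, dtype=np.uint8)
    gray = cv2.imdecode(raw, cv2.IMREAD_GRAYSCALE)
    if gray is None:  # bytes were not a decodable image
        return ""
    gray = cv2.medianBlur(gray, 3)  # knock out speckle noise
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # --psm 7 tells Tesseract to treat the image as a single line of text.
    return pytesseract.image_to_string(binary, config="--psm 7").strip()


def solve_with_retries(
    fetch: Callable[[], bytes],
    ocr: Callable[[bytes], str],
    submit: Callable[[str], bool],
    max_attempts: int = 20,
) -> Optional[str]:
    """Keep requesting fresh captchas until one is read and accepted."""
    for _ in range(max_attempts):
        guess = ocr(fetch())
        # Skip obviously bad reads instead of wasting a submission on them.
        if guess and guess.isalnum() and submit(guess):
            return guess
    return None
```

Passing the three steps in as functions keeps the retry logic separate from the image handling, so either half can be swapped out.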

Connecting through Tor

The SR site is an anonymous hidden service reachable only through the Tor network. You run the Tor client daemon on your machine, then use it as a SOCKS5 proxy.

This has some complications, because DNS requests also have to go through Tor.

The quick and dirty solution is to just spawn the scraper through torsocks, which wraps all the network requests from my scraper.
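If you would rather not wrap the whole process, an alternative (not what I did) is to point the HTTP client at the SOCKS proxy directly. The sketch below assumes the `requests` library with SOCKS support installed (`requests[socks]`) and a Tor daemon listening on its default SOCKS port, 9050. `tor_proxies` and `fetch_via_tor` are hypothetical helper names.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build a proxies mapping that sends traffic and DNS lookups through Tor."""
    # "socks5h" (rather than "socks5") asks the proxy to resolve hostnames,
    # which is what keeps DNS requests inside Tor and makes .onion names work.
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def fetch_via_tor(url: str, timeout: float = 60.0) -> str:
    """Fetch a page through the local Tor SOCKS proxy."""
    import requests  # imported lazily; needs the requests[socks] extra

    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    response.raise_for_status()
    return response.text
```

The generous timeout is deliberate, since hidden services tend to be slow to respond.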

Automatic logouts/timeouts

The SR site seems to be very eager to automatically log out users. When logged out, I simply create a new user. Once I am back on the site, I make sure to traverse from the root node of the crawl tree back to the last known point. This is to avoid detection.
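A minimal sketch of that replay step, assuming the crawler records each page’s parent as it goes. The `parents` mapping, the example paths and the `visit` callback are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def path_from_root(parents: Dict[str, Optional[str]], target: str) -> List[str]:
    """Return the pages from the crawl root down to `target`, root first."""
    path: List[str] = []
    node: Optional[str] = target
    while node is not None:
        path.append(node)
        node = parents[node]  # the root maps to None, which ends the walk
    path.reverse()
    return path


def resume(
    parents: Dict[str, Optional[str]],
    last_known: str,
    visit: Callable[[str], None],
) -> None:
    """After logging back in, click through from the root to the last page."""
    for url in path_from_root(parents, last_known):
        visit(url)


# Hypothetical crawl tree: each page points at the page it was reached from.
example_parents = {"/": None, "/drugs": "/", "/drugs/ecstasy": "/drugs"}
print(path_from_root(example_parents, "/drugs/ecstasy"))
```

Walking the parent pointers upward and reversing the result avoids having to search the tree from the top.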

The nature of web crawling through Tor

Crawling through Tor already obfuscates your identity to a certain degree, so we don’t really have to do anything other than cycle User-Agent strings to look different from any other client.
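Cycling User-Agent strings can be as simple as the sketch below. The strings in `USER_AGENTS` are only illustrative examples, not the list I used, and `headers_for_next_request` is a hypothetical helper.

```python
import itertools
from typing import Dict, Iterator

# Illustrative values only; use whatever set of browsers you want to imitate.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Firefox/24.0",
]


def headers_for_next_request(agents: Iterator[str]) -> Dict[str, str]:
    """Build request headers using the next User-Agent in the rotation."""
    return {"User-Agent": next(agents)}


# itertools.cycle loops over the list forever, so it never runs out.
rotation = itertools.cycle(USER_AGENTS)
first = headers_for_next_request(rotation)
```

Swapping `itertools.cycle` for `random.choice` would make the order less predictable.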

Data Extract

I’ve made a one day snapshot available at

If anyone is interested, I will release the source code for the crawler when I am done, with the SR-specific portions removed. This will all go to the same repo.


Alright, enough technical details. Let’s see what useful information we can get out of this.

Knowing very little about recreational drug use, I visited the National Institute on Drug Abuse’s website, which conveniently lists what the US considers to be the most widely used drugs.

I thought, if I know them, they must be a big deal right?! I guess so. Here are the drugs I picked out:

Total number of listings

Sorted by number of listings
MDMA        1321
Weed        761
LSD         523
Cocaine     475
Amphetamine 215
Heroin      150
Ketamine    67
Opium       53
Mescaline   20
Total       3585

Weed here means simply marijuana that is smoked, not any other derivative such as hash.

To put things in perspective, at the time of writing SR has approximately 13,000 listings for drugs. Just a guess, but it looks like prescription drugs account for a large portion of SR drug listings.

Nothing much to say here, other than the fact that MDMA seems to have the most listings.

Highest number of ratings


Just like buying off Amazon, users can review the specific product. SR gives a rating from 1-5 stars and the total number of reviews per product listing.

The average number of ratings per product, as shown here, seems to be rather uniform: there are on average 29 reviews per product.

Drug           Total ratings    Avg ratings per listing
MDMA           33822            25
Weed           28213            37
LSD            12122            23
Cocaine        16591            34
Amphetamine    6251             29
Heroin         3132             20
Ketamine       1504             22
Opium          1256             23
Mescaline      62               3

Total          102953
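For anyone who wants to check the arithmetic, the averages can be recomputed from the two tables. This is a quick sketch using only the numbers printed in this post. The per-drug averages in the table appear to be rounded down (cocaine works out to about 34.9), so the sketch uses integer division to match.

```python
# Listing counts and total ratings per drug, copied from the tables in this post.
listings = {
    "MDMA": 1321, "Weed": 761, "LSD": 523, "Cocaine": 475, "Amphetamine": 215,
    "Heroin": 150, "Ketamine": 67, "Opium": 53, "Mescaline": 20,
}
ratings = {
    "MDMA": 33822, "Weed": 28213, "LSD": 12122, "Cocaine": 16591, "Amphetamine": 6251,
    "Heroin": 3132, "Ketamine": 1504, "Opium": 1256, "Mescaline": 62,
}

# Integer division reproduces the rounded-down figures shown in the table.
per_listing = {drug: ratings[drug] // listings[drug] for drug in listings}

# Overall average across every listing, not an average of the per-drug averages.
overall = sum(ratings.values()) / sum(listings.values())

for drug, avg in per_listing.items():
    print(f"{drug:<12}{avg}")
print(f"Overall     {overall:.1f}")
```

The overall figure comes out just under 29, which is where the “29 reviews per product” number comes from.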

Top 100 Most Reviewed Items

MDMA        48
Weed        22
LSD         10
Cocaine     9
Amphetamine 7
Ketamine    1
Opium       1
Mescaline   1
Heroin      1


In case you are wondering, there were some outliers:

  • One had 100g of MDMA for $1510.77. It had 392 ratings.
  • Another was selling 100g of MDMA for $1186 and 50g for $659. They had 293 ratings and 279 ratings respectively.
  • The third was for 1/4lb of bulk medical marijuana for $619.10. It had 378 ratings.

I somehow doubt this guy has sold half a million dollars’ worth of MDMA at $1.5k a pop in such a huge quantity, but the price seems to be in line with other sellers for an equivalent amount. I’m not entirely sure what the rules are regarding who can give feedback, but there seem to be people buying huge quantities if a user must buy a product to be able to review it. I have never purchased anything from the site, and I wasn’t presented with any choices to review an item.
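The “half a million” figure is a back-of-the-envelope estimate. As a sketch, it assumes every rating corresponds to exactly one sale at the currently listed price, which may well not hold. `implied_revenue` is just an illustrative helper.

```python
def implied_revenue(price_usd: float, n_ratings: int) -> float:
    """Revenue estimate assuming one sale at the listed price per rating."""
    return price_usd * n_ratings


# The 100g MDMA listing: $1510.77 with 392 ratings.
estimate = implied_revenue(1510.77, 392)
print(f"${estimate:,.0f}")
```

That works out to a bit under $600k for the first listing alone.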

If only people who purchase the item can review it, then I am a bit less skeptical. I saw one Canadian seller listing 1 kilo of MDMA for USD $8k with 1 review!


The average price of the top 100 items is $129

The average price of the top 500 items is $188

The average price of the top 1000 items is $236

Prices are converted to USD at the time of the crawl using exchange rates from the Coinbase API.
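Roughly, the calculation looks like the sketch below. It assumes listings are priced in bitcoin and that “top” means ranked by number of reviews. The `Listing` fields are hypothetical, and the exchange rate is passed in as a plain number rather than fetched, so no particular API call is implied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Listing:
    title: str
    price_btc: float  # assumed: price as scraped, in bitcoin
    n_reviews: int


def average_price_of_top(listings: List[Listing], n: int, usd_per_btc: float) -> float:
    """Average USD price of the n most-reviewed listings."""
    top = sorted(listings, key=lambda item: item.n_reviews, reverse=True)[:n]
    if not top:
        raise ValueError("no listings to average")
    # Convert each price with the rate captured at crawl time, then average.
    return sum(item.price_btc * usd_per_btc for item in top) / len(top)


# Made-up sample data, only to show the call.
sample = [
    Listing("a", 0.5, 50),
    Listing("b", 0.25, 10),
    Listing("c", 2.0, 5),
]
print(average_price_of_top(sample, 2, usd_per_btc=600.0))
```

Because more expensive listings tend to collect fewer reviews, widening the window from 100 to 1000 items pulls the average up, which matches the numbers above.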


Sellers on SR can specify where they ship from and where they ship to.