“The Availability and Persistence of Web References in D-Lib Magazine”, Frank McCown, Sheffan Chan, Michael L. Nelson, Johan Bollen, 2005-11-21:

We explore the availability and persistence of URLs cited in articles published in D-Lib Magazine.

We extracted 4,387 unique URLs referenced in 453 articles published July 1995–August 2004. The availability was checked 3 times a week for 25 weeks September 2004–February 2005.

We found that ~28% of those URLs failed to resolve initially, and 30% failed to resolve at the last check. A majority of the unresolved URLs were due to 404 (‘page not found’) and 500 (‘internal server error’) errors. The content pointed to by the URLs was relatively stable; only 16% of the content registered more than a 1 KB change during the testing period.

We explore possible factors which may cause a URL to fail by examining its age, path depth, top-level domain and file extension.
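As a rough illustration of how such structural factors could be read off a URL, here is a minimal Python sketch. The helper name `url_features`, the definition of path depth as the number of non-empty path segments, and the default-port fallback are all assumptions made for this example and may differ from what the authors did. Age cannot be derived from the URL itself and is left out.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_features(url):
    """Return structural features of a URL (hypothetical helper, not from the paper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumed definition: depth = number of non-empty path segments,
    # so "/a/b/c.html" has depth 3 and "/" has depth 0.
    segments = [s for s in parts.path.split("/") if s]
    # Extension of the last segment, lowercased; "" if there is none.
    extension = PurePosixPath(segments[-1]).suffix.lower() if segments else ""
    return {
        # Last dot-separated label of the host name (not meaningful for IP addresses).
        "tld": host.rsplit(".", 1)[-1] if "." in host else "",
        "path_depth": len(segments),
        "extension": extension,
        # Explicit port if given, otherwise the scheme's usual default.
        "port": parts.port or (443 if parts.scheme == "https" else 80),
    }
```

Calling `url_features("http://www.dlib.org:8080/dlib/july95/07contents.html")` yields the top-level domain `org`, a path depth of 3, the extension `.html`, and port 8080.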

Based on the data collected, we found the half-life of a URL referenced in a D-Lib Magazine article is ~10 years. We also found that URLs were more likely to be unavailable if they pointed to resources in the .net, .edu, or country-specific top-level domains, used non-standard ports (ie. not port 80), or pointed to resources with uncommon or deprecated extensions (eg. .shtml, .ps, .txt).
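To make the half-life figure concrete, the sketch below converts a ~10-year half-life into a surviving fraction and an annual loss rate. It assumes constant-rate exponential decay, which the abstract does not state; the function names are hypothetical and only the 10-year figure comes from the text.

```python
HALF_LIFE_YEARS = 10.0  # from the abstract: half-life of ~10 years


def surviving_fraction(age_years, half_life=HALF_LIFE_YEARS):
    """Fraction of URLs still resolving after age_years, assuming exponential decay."""
    return 0.5 ** (age_years / half_life)


def annual_loss_rate(half_life=HALF_LIFE_YEARS):
    """Fraction of the remaining URLs lost in each year under the same assumption."""
    return 1.0 - 0.5 ** (1.0 / half_life)
```

Under this assumption, `surviving_fraction(10)` is 0.5, `surviving_fraction(20)` is 0.25, and `annual_loss_rate()` is roughly 6.7% per year.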