The FASEB Journal is now also available on Wiley Online Library.
Life Sciences Forum

Unavailability of online supplementary scientific information from articles published in major journals

    E. Evangelou, T. A. Trikalinos, and J. P. A. Ioannidis
    Published Online: https://doi.org/10.1096/fj.05-4784lsf

    Abstract

    Printed articles increasingly rely on online supplements to store critical scientific information, but such data may eventually become unavailable. We checked the current availability of online supplementary scientific information published in six top-cited scientific journals (Science, Nature, Cell, New England Journal of Medicine, Lancet, Proceedings of the National Academy of Sciences USA). Here we show that in 4.7% and 9.6% of articles with online supplementary material, some of the supplements became unavailable within 2 and 5 years of their publication, respectively.—Evangelou, E., Trikalinos, T. A., Ioannidis, J. P. A. Unavailability of online supplementary scientific information from articles published in major journals.

    Given page and cost limitations, much critical information in published scientific articles increasingly appears only as online supplementary information. This material includes key methodological details, data, tables, images, and video clips and may be a sine qua non for the critical appraisal, documentation, replication, and further exploitation of published research. Print versions of articles are becoming shorter and more elliptical given the convenience of web posting (1). However, might online supplements eventually be lost and become unavailable to interested scientists? In previous studies, “broken links” have already been found to be a problem for online references, educational material, and web pages in general (2, 3, 4, 5).

    We checked the current availability of online supplementary scientific information published 2 and 5 years ago in the six top scientific journals that receive > 120,000 citations annually and have journal impact factors > 10 according to the 2003 edition of the Journal Citation Reports issued in the Thomson-Institute for Scientific Information Web of Knowledge (Science, Nature, Cell, New England Journal of Medicine, Lancet, Proceedings of the National Academy of Sciences USA) (6). We probed all articles published in March 2000 and March 2003 that included links pointing to any type of supplementary online information, except for general reference links.

    We focused our screening on papers that represented original research. This included research articles, brevia, reports, and technical comments in Science; articles, letters, and brief communications in Nature; all regular articles in the Proceedings of the National Academy of Sciences; original articles and special articles in the New England Journal of Medicine; articles, research letters, and mechanisms of disease articles in Lancet; and all regular articles in Cell.

    We recorded the following information from each eligible article: authors, title, scientific domain, journal of publication, and number of active and broken web links directing to supplementary material. We also recorded, judging from the link’s address, whether the supplementary information was stored in the pertinent journal’s web servers (servers belonging to the journal or the publisher) or in other web pages (personal web pages or web spaces belonging to the authors’ affiliated institutions and organizations).
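
One way to picture the host classification described above is a small lookup on the link's hostname. This is only a sketch: the domain list and the function name `classify_host` are hypothetical, since the article does not say which addresses were treated as journal or publisher servers.

```python
from urllib.parse import urlparse

# Hypothetical publisher domains for illustration only; the article does not
# list the domains that were counted as journal/publisher servers.
PUBLISHER_DOMAINS = {"nature.com", "sciencemag.org", "pnas.org"}


def classify_host(url, publisher_domains=PUBLISHER_DOMAINS):
    """Return 'journal' if the link's host is a publisher domain
    (or a subdomain of one), otherwise 'other'."""
    host = (urlparse(url).hostname or "").lower()
    for domain in publisher_domains:
        if host == domain or host.endswith("." + domain):
            return "journal"
    return "other"
```

Matching on the full domain or a dot-prefixed suffix avoids counting look-alike hosts (for example, a host that merely ends in the same letters) as publisher servers.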

    We did not consider general reference links, such as links to National Center for Biotechnology Information databases and sequence databases in the public domain.

    Link verification was done with the specialized freeware Xenu’s Link Sleuth (ver 1.2, Tilman Hausherr, 2005), which can check links and report their status. Links pointing to supplementary scientific material were checked in March and April 2005. Information was considered missing if links could not be accessed in two consecutive attempts 2 weeks apart during this period. Links were checked at varied times of the day and on different days of the week to avoid routine server maintenance. All inactive links were manually verified as well.
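
The decision rule above (missing only if both of two attempts fail) can be sketched as follows. This is not how Xenu's Link Sleuth works internally; it is a minimal stand-in using Python's standard library, and the function names `link_is_reachable` and `is_missing` are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def link_is_reachable(url, timeout=30):
    """Single probe: True if the server answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors (4xx/5xx), DNS and connection failures,
        # timeouts, and malformed URLs.
        return False


def is_missing(first_attempt_ok, second_attempt_ok):
    """Rule from the text: information counts as missing only if the link
    failed on both of two attempts (made 2 weeks apart in the study)."""
    return not first_attempt_ok and not second_attempt_ok
```

In use, the result of `link_is_reachable(url)` would be stored for the first pass, the probe repeated two weeks later, and both results passed to `is_missing`. A link that fails once and succeeds once is treated as available, which guards against transient outages.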

    Of 955 screened articles, 244 articles with links (n=585 links) qualified. There was considerable variability in the extent of use of online links to post supplementary information across these six journals (Table 1). Online supplements were particularly prevalent in Science, Nature, and the Proceedings of the National Academy of Sciences but had become more common in 2003 compared with 2000 for the other three journals as well.

    Online supplementary information could not be retrieved for 21 links in 15 articles (6.2%) (Table 1). Seven of 73 papers (9.6%) contained 12 broken links in 2000 and 8 of 171 articles (4.7%) contained 9 broken links in 2003. The entire online information was missing in 8.2% and 1.8% of the articles with links published in 2000 and 2003, respectively.
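
The per-year counts quoted above can be re-derived by summing the rows of Table 1. The sketch below transcribes the table's broken/total figures for articles and links; the names `TABLE_1` and `totals` are illustrative and not part of the original analysis.

```python
# (journal, year): (broken articles, total articles, broken links, total links),
# transcribed from Table 1.
TABLE_1 = {
    ("Nature", 2000): (2, 15, 5, 23),
    ("Science", 2000): (0, 30, 0, 33),
    ("Cell", 2000): (0, 1, 0, 4),
    ("NEJM", 2000): (0, 0, 0, 0),
    ("Lancet", 2000): (1, 3, 1, 5),
    ("PNAS", 2000): (4, 24, 6, 46),
    ("Nature", 2003): (1, 30, 1, 80),
    ("Science", 2003): (0, 41, 0, 55),
    ("Cell", 2003): (0, 5, 0, 30),
    ("NEJM", 2003): (2, 5, 2, 10),
    ("Lancet", 2003): (2, 15, 3, 49),
    ("PNAS", 2003): (3, 75, 3, 250),
}


def totals(year=None):
    """Column sums over Table 1, optionally restricted to one publication year."""
    rows = [v for (_, y), v in TABLE_1.items() if year is None or y == year]
    return tuple(sum(column) for column in zip(*rows))


for year in (2000, 2003):
    broken_articles, articles, broken_links, links = totals(year)
    share = 100 * broken_articles / articles
    print(f"{year}: {broken_articles}/{articles} articles ({share:.1f}%), "
          f"{broken_links}/{links} links broken")
```

Summed over both years, the same function gives 15 of 244 articles and 21 of 585 links, matching the counts in the text.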

    Broken links were encountered in four of the six journals, and the affected articles covered a wide variety of scientific topics (Table 1). The proportion of articles with broken links was highest in the two high-profile medical journals (33% [1/3] in 2000 and 20% [4/20] in 2003). Of the 21 broken links, 14 pointed to web pages not belonging to the journal or publisher, and these apparently were personal or institutional web pages; seven broken links were hosted on journal/publisher servers.

    These results indicate that even in the most prestigious and visible journals, some scientific information may eventually become unavailable when it is supplied only online; personal and institutional web pages may be particularly vulnerable. Here we examined quite recent publications, and the proportion of lost information may increase as time passes unless proper measures are taken to remedy the situation. It should not be taken for granted that investigators would be able to furnish this unavailable information again (3, 7). Given that supplementary information is an essential part of a published article, journals and publishing groups need to improve their methods to ensure the maintenance and continuous availability of this important scientific material.

    Table 1. Unavailable online supplementary information

    Journal (year)   Broken/total articles   Broken/total links   Specific articles (volume; pages [field]) a
    Nature (2000)    2/15                    5/23                 404;385–387 (Conservation Biology)
                                                                  404;398–402 (Molecular Genetics)
    Science (2000)   0/30                    0/33                 None
    Cell (2000)      0/1                     0/4                  None
    NEJM (2000)      0/0                     0/0                  None (no articles published with links)
    Lancet (2000)    1/3                     1/5                  355;1064–1069 (Medicine/Clinical trials)
    PNAS (2000)      4/24                    6/46                 97;2450–2455 (Applied Biological Sciences)
                                                                  97;2562–2566 (Biophysics)
                                                                  97;2680–2685 (Genetics)
                                                                  97;3509–3514 (Microbiology)
    Nature (2003)    1/30                    1/80                 422;317–322 (Cell Biology)
    Science (2003)   0/41                    0/55                 None
    Cell (2003)      0/5                     0/30                 None
    NEJM (2003)      2/5                     2/10                 348;977–985 (Medicine/Epidemiology)
                                                                  348;1223–1232 (Medicine)
    Lancet (2003)    2/15                    3/49                 361;918–922 (Medicine/Ethics)
                                                                  361;923–929 (Medicine/Microarrays)
    PNAS (2003)      3/75                    3/250                100;2807–2812 (Microbiology)
                                                                  100;3233–3238 (Cell Biology)
                                                                  100;3293–3298 (Developmental Biology)

    a Some (not necessarily all) of the article links with online supplementary information were broken.