“Tagsoup: Parsing and Extracting Information from (possibly Malformed) HTML/XML Documents”