In Defense of Inclusionism (gwern.net)
178 points by Tomte on Dec 12, 2016 | 171 comments



This is a long but excellent article - well worth reading.

By coincidence, I became aware of this bizarre "deletionist" culture at Wikipedia recently when I was searching for information about a particular musician. This is someone who has a handful of popular-ish songs on streaming sites, all from television soundtracks, but hasn't really charted as far as I can tell - in other words, an artist I would peg at about 50/50 odds to have had someone bother to write a Wikipedia entry for them.

Lo and behold, I found that someone had written a Wikipedia entry for them, and that it had been deleted because they weren't deemed famous enough to have an entry! I was dumbfounded... This is someone that millions of people have probably heard in the background of primetime television, and information about them was actively deemed unworthy of Wikipedia!

It was a strange, depressing, and disturbing moment of realization about how Wikipedia had evolved from the early days when I was consistently delighted to find information on all manner of obscure topics, lovingly curated by people who cared deeply enough about them to invest their time in informing the world about them. Thanks to this article I now know that it wasn't just an isolated incident.


Perhaps Wikipedia should approach this from another angle: loosen their baseline criteria, and then treat "notability" like Twitter's "verified" tag. Let the debate be whether or not an article should have the notability checkmark, not whether or not it should exist at all (within reason).


Because of the way Google prefers Wikipedia articles over other sources in its SERPs, this would ultimately have the effect of turning Wikipedia into a UGC version of AOL: it would be overrun with crud (because the Internet incentivizes Wikipedia articles), and the project would have to expend significant effort at retraining the Internet to look for "verified" stickers on the things that actually were encyclopedia articles. For what benefit?

Wikipedia's response to this problem is the right one: there's a whole big wide Internet out there, and if Wikipedia isn't the appropriate place for your article, surely there are many other places that are. UGC sites are falling over themselves to get people to write on them. Why force them on the one Internet project that doesn't want them?


Good point on SERPs, and certainly nobody should force Wikipedia to act against what they believe is their strategic best interest. But isn't "the way Google prefers Wikipedia articles over other sources in its SERPs" an epiphenomenon? In an alternate universe where Wikipedia was more inclusive and used some kind of "notability" flag, Google could just as easily prefer those "notable" articles the way it prefers Wikipedia articles more generally in this universe.

Edit: clarity.


I think it is indeed the case that if Wikipedia was subjected to less entropy, so that its editors could work on building up the encyclopedia in a sort of secluded peace, the project would be far less itchy about notability.

But remember that minimizing error is only half the argument for notability. The other half is, again, definitional: an article about a non-notable topic is almost by definition original research, and "no original research" is one of Wikipedia's oldest rules. The project's charter is to be a tertiary source.


> an article about a non-notable topic is almost by definition original research, and "no original research" is one of Wikipedia's oldest rules

While I understand it as a means to keep random cranks out of the science pages, all it has ever incentivized is to have people 'launder' their research via some 'reliable source'.

But which sources are 'reliable' is quite often purely a matter of editorial bias and there are other wiki projects with a very different take on the matter, for example:

https://infogalactic.com/info/Infogalactic:Reliability


Could you be more specific about this "laundering" of research through reliable sources?

I definitely saw savvy Wikipedia spammers lawyering their way into the encyclopedia, sometimes successfully, for instance by citing marginal trade press cites as evidence of notability ("my client is notable because one time a trade press writer got a quote from them on the importance of FCIP products for disaster recovery programs").

What I don't see is a lot of bogus research hiding in the secondary sources of major articles.

Whatever you might think about WP's policies on what does or doesn't constitute a "reliable" source, I think it's difficult to argue that any community outside of Wikipedia has spent more time thinking about this problem.


It's simple, you put it on your web page with some puffed-up credentials and then have a friend link to it. This works better for things that aren't commonly challenged as it often doesn't stand up to more than casual scrutiny as you do have to pass yourself off as somehow 'reliable'.


My experience editing Wikipedia suggests to me that this is a dubious tactic. Anything I cited on my own web pages, for topics I feel pretty comfortable asserting expertise on (like, for instance, the presence of a lisp interpreter embedded in the Seatbelt ACL system in OSX) was immediately sniped by other editors.


You made the mistake of citing yourself instead of getting a friend to do it :)


Google can easily retrain itself to weight verified Wikipedia topics higher than unverified. There is a lot of benefit to having these articles maintained by a non-profit; Wikia's load time on a phone browser is stupid long, mainly because of ads and tracking scripts. It is several seconds before they are loaded, then you start getting bombarded with full-screen ads. There are dozens of Wikia apps in the Play store, so effectively that means that, unless this is a topic that I have a deep and pre-existing commitment to, I can't have the accelerated app experience. Wikia is just not a replacement for Wikipedia.

Yes, UGC sites are clamoring for content: so they can monetize the shit out of it with no respect for the content, community, creator, or reader. Your comment makes it seem like there's no merit to keeping the content on WP, but if you actually use some of these other sites I don't know how you can even compare the two.


Totally coincidentally Jimmy Wales makes money from wikia, which is a convenient home for all of those articles deleted from Wikipedia.

Plastered in advertising of course.


Wikia is the reason Wikipedia's constant begging for money really annoys me.


Something sort of like it already exists; it's called a 'Featured Article' badge. But since there is an expectation that every article should eventually reach 'featured' status (or at least it should be possible in principle), it doesn't have the result you'd expect.

And why wouldn't people have this expectation? Why keep an article at all if it's just going to lie there and gather dust, unmaintained and exposed to vandalism?


>unmaintained and exposed to vandalism?

Well written articles that contain no speculation shouldn't rot.

>Bill Cosby is a well respected television figure

Would be badly written content that, left unmaintained, would now be inaccurate.

>Bill Cosby is a television figure who was held in high regard during the 70's-90's

Is better written, because it hinges on verifiable information that will not change. It would become incomplete if no one added to it, but I would argue that it is better to have incomplete information than no information.

Vandalism issues are a problem with the wikipedia software, not a problem with the concept of having lesser-trafficked pages. If this is an issue for low-traffic pages, then there should be an anti-vandalism queue where edits to articles that receive less than a certain amount of traffic should be put in a queue to be reviewed for vandalism (and just vandalism, not content accuracy etc). It might take a while for edits to bubble through, but that's basically ok if the page receives that little traffic. Eventual information is better than no information.
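
A minimal sketch of the queue idea above, in Python. The traffic threshold, the `Edit` fields, and the function names are all hypothetical; nothing here reflects how the Wikipedia software actually works, it only shows one way the routing could look.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical cutoff: pages with fewer monthly views than this are held for review.
LOW_TRAFFIC_THRESHOLD = 100


@dataclass
class Edit:
    page: str
    monthly_views: int  # assumed to come from some traffic-stats source
    text: str


def route_edit(edit, review_queue, published):
    """Publish edits to busy pages at once; hold low-traffic edits for a vandalism check."""
    if edit.monthly_views < LOW_TRAFFIC_THRESHOLD:
        review_queue.append(edit)
        return "queued"
    published.append(edit)
    return "published"


def review_next(review_queue, published, looks_like_vandalism):
    """Pop the oldest queued edit and publish it unless the reviewer flags vandalism."""
    edit = review_queue.popleft()
    if looks_like_vandalism(edit):
        return "rejected"
    published.append(edit)
    return "published"
```

The reviewer callback only answers "is this vandalism?", not "is this accurate?", which matches the narrow check proposed above.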


> Well written articles that contain no speculation shouldn't rot.

And who is going to ensure that they are well-written and contain no speculation?

> Vandalism issues are a problem with the wikipedia software, not a problem with the concept of having lesser-trafficked pages.

If you mean to say that any software explicitly written for the purpose of letting random people publish content instantaneously with no supervision is bad software, I might agree. Otherwise, no. Vandalism is a social problem resulting from lack of article oversight. And I'm not just talking about inserting 'PENIS' into pages, because that is basically solved already with edit filters. I am talking about hoaxes, I am talking about presenting speculation as fact, I am talking about fabricated citations, I am also talking about well-meaning people who inadvertently turn carefully written technical articles into Potato Jesus. A while ago, I looked at https://en.wikipedia.org/w/index.php?title=Blue_Screen_of_De... which contains this lovely sentence:

> Windows 98 and early builds of Windows Vista displayed the red screen from a boot loader error raised by ACPI.[19][20][21]

Cited to three sources. Looks great, doesn't it? But if you know anything about boot loaders, about ACPI or just about Windows, you understand this sentence is complete nonsense. You don't even need to look at the citations which obviously don't say any such thing. This is no doubt a result of shoddy copyediting by someone who didn't understand the subject. What is your answer to that?


>What is your answer to that?

That you should follow the links and see if the citation matches as part of routine anti-vandalism checks.

>Cited to three sources. Looks great, doesn't it? But if you know anything about boot loaders, about ACPI or just about Windows, you understand this sentence is complete nonsense.

All that says to me is that articles about computers should be reviewed by people who know about computers, which makes total common sense.

An encyclopedia is defined as much by its breadth as by its depth. Cutting notable but lesser-trafficked chunks out of it to save on effort is a lazy-arse solution that lowers the usefulness of the wiki.

I'm sure you could come up with a thousand reasons why it's too hard, but that's not how you make a good wiki. Hell, it's not how you make anything good.

Ultimately, as the consumer of your product, I don't care about your troubles. Sympathetic customer fallacy applies. If wikipedia doesn't have the information I'm looking for, I won't use it. It really doesn't matter how lovely and maintainable the articles I'm not looking for are. If you wanna be proud of the work you do, you're gonna have to dig deep, maybe make some compromises, and figure it out.


Here's the list of FAs, to get some sense of what the bar is here. The overwhelming majority of WP articles --- including the important ones people care about --- aren't FAs; an FA is essentially a very high quality professional grade encyclopedia article.

https://en.wikipedia.org/wiki/Wikipedia:Featured_articles


Wikipedia is currently -- for the second time -- trying to delete its article about the actress who voices/sings the title character of "Moana", on grounds that she's only notable for that one film and thus doesn't "deserve" to be written about separately from it.


What do you like so much about this article? What makes it excellent? What qualities of its analysis particularly appealed to you?

What I read was a piece that centered on a single --- and I thought dormant --- controversy about the purpose of Wikipedia: that, because the marginal cost of a new article on Wikipedia is zero, its charter should include comprehensive character-by-character breakdowns of fictional works. Not just of Pokemon, but of every fictional work in which its editors have an interest.

Reasonable people can disagree about this. But it's not the sweeping indictment of Wikipedia I'd expect from this comment thread.

The article offers two empirical studies to support its argument that deletionism is harming Wikipedia. In the first, it collects the "External Link" contributions of an anime-focused editor and records the (very small) percentage of proposed links that were ultimately accepted into the bodies of articles.

But the links he's highlighted appear to be of very low quality. I clicked through a random 10 of them, and most of them were 404'd. The ones that were alive all appeared to be UGC online reviews of individual episodes of anime films.

I'm not surprised that editors who watchlist anime titles were reluctant to stuff articles with tens of links to individual "mania.com" review articles (none of which appear to live today).

In his second experiment, Gwern selected a random set of Wikipedia articles and killed existing external links, to measure how long it would take for them to be restored. But by his own admission, the process of selecting random articles meant he was working primarily with articles nobody cares about. He suggests in the text that the median number of users watchlisting the articles he tampered with was 5 --- but links to the Wikipedia editor who informed him of that without noting that 5 is a very low number; by comparison, the Wikipedia article on Paul Graham (once notable on HN for being put up for deletion on WP) has over 125.

But the bigger flaw with the experiment is that it misconstrues Wikipedia's position on external links. The WP:EL policy Gwern cited in his external deletions is direct about this: Wikipedia is not a collection of links. The project does not view sprawling collections of "External Links" as a good thing: what it wants are links to reliable sources that back up points made in the prose article itself.

It's no wonder that article watchers didn't rush to add links back to stories; the links that he removed were marginal, at times totally disconnected from the stories themselves (as with the "nsplacenames.ca" link on Rockingham, Nova Scotia), or dead (as with the video link, unreferenced by the article, on the Shahrnush Parsipur article). This can't be surprising to Gwern: his methodology clearly selects for marginal links, by trawling for them in the disfavored "External Links" sections of articles far out into the long tail of interest on WP.

This article was written in 2009, and makes predictions about the future of the project. Were those predictions accurate? It doesn't look like it: article creation has grown steadily, and editor participation has been remarkably stable for years. Has the Wikipedia community moved away from its "deletionist" stance? This comment thread sure doesn't think so, and I agree. So: what gives?


I bet all those links weren't 404 errors in 2009. Which raises the question: How are external links in Wikipedia articles maintained? Is there some automated process that checks those links periodically to see if they are rotten?

Is that why Wikipedia hates external links?


> How are external links in Wikipedia articles maintained?

Until recently, "not very well". Recently, they have been doing https://blog.wikimedia.org/2016/10/26/internet-archive-broke...
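
For a rough idea of what an automated rot check involves, here is a minimal Python sketch. It is not how Wikipedia or the Internet Archive does it; the function names are made up for illustration, and treating any 4xx/5xx answer as "rotten" is a simplifying assumption (some servers reject HEAD requests even for live pages).

```python
import urllib.error
import urllib.request


def link_status(url, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if the host could not be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a page that no longer exists
    except OSError:
        return None  # DNS failure, refused connection, timeout, and similar


def find_rotten(urls, status_of=link_status):
    """Return the URLs that look dead: unreachable, or answering with a 4xx/5xx code."""
    rotten = []
    for url in urls:
        status = status_of(url)
        if status is None or status >= 400:
            rotten.append(url)
    return rotten
```

A periodic job would run `find_rotten` over an article's external links and flag the hits for a human, or for swapping in an archived copy.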


I'm sure they weren't 404s in 2009! It's not my argument that the links were added in bad faith.


> This is someone who has a handful of popular-ish songs on streaming sites, all from television soundtracks, but hasn't really charted as far as I can tell

That's what's called "fails [[WP:MUSIC]]" on Wikipedia. The basic criteria are two recordings on a major label, or some major award, or charting on some well known chart, or historical importance. These criteria filter out the several million garage and Myspace bands that would like to be on Wikipedia.


Your kind of thinking is the problem.

You state the rule, state one positive from it, and don't even bother to consider why it could possibly be a positive (or not). You also failed to consider any negatives.


Wikipedia could have a rule that if a piece of media is notable enough to have an article on wikipedia, then everyone in the credits for that piece of media is notable enough to have an article on wikipedia. 99% of those people aren't famous, but why not?


Can you reword this? I don't understand the suggestion you're making.

I think you might be making a category error. Notability on Wikipedia isn't a reward earned by acquisition of fame; it's a level of status at which sourcing about a subject is likely to be reliable. The problem Wikipedia has isn't that no-names might get articles. It's that articles will be written about subjects for whom reliable sourcing is impossible, because no reliable source has ever deigned to write about them.

This gets us to a fundamental principle of what an encyclopedia actually is --- and to a project rule that is as old as "NPOV". An encyclopedia --- at least Wikipedia's conception --- is a tertiary source. It's a roadmap that summarizes and points to other, more in-depth sources.

By definition, if you don't have reliable sources to back the subject of an article up, it can't be hosted in an encyclopedia. The project's answer to this challenge would be, first create the reliable secondary sources you'd need to support an encyclopedia article, and then create the article itself.


Thank you for this explanation. You're right. For people who are only mentioned in credits for TV shows and not in newspaper articles, Wikipedia is not the place to list in which other TV shows someone was credited.


So, every runner and key grip that has worked in production of a popular TV series is notable now?

Someone please formulate a 'notability' version of Schneier's law.


Anything that goes on the TV or the radio will be seen by millions of people. Just because it's on TV doesn't make it worthy of an article.


Why not? It's not like having an article costs anything. It's just a few kilobytes of text.


This is a very popular canard.

It's true that an additional article on Wikipedia has zero marginal cost in terms of compute, bandwidth, and storage.

But every article on Wikipedia imposes maintenance costs on the project, because every article is an opportunity for error, and it's the responsibility of every member of the project to eliminate those errors.

You don't have to spend much time on Wikipedia to get a visceral sense of the validity of this argument, whether or not you agree with it. It is kind of a miracle --- not a small one --- that Wikipedia exists at all. It is one of the great achievements of the Internet writ large. And it exists in spite of:

* Enormous numbers of articles written as advertisements designed to piggyback off Google's preference for Wikipedia articles on its first SERPs

* Enormous amounts of casually malicious spam and vandalism, some of which is purposely designed to avoid detection as long as possible

* Enormous amounts of agenda-driven bias working continuously to turn Wikipedia articles into advocacy pieces for one side or another of a given controversy

The point of the notability standard isn't to reward people for fame, or to save hard disk space. It's to put some reasonable boundary on the subjects for which unpaid editors should be expected to mount the often-tedious defense against these forces.


> But every article on Wikipedia imposes maintenance costs on the project, because every article is an opportunity for error, and it's the responsibility of every member of the project to eliminate those errors

Indeed. The best way to answer that is to recruit more members, not to reduce the size of the project.

This article struck a chord with me as a lot of the issues are growing pains that we've been going through with OpenStreetMap. OSM is expressly not deletionist. We do have maintenance concerns - particularly in the case where "out-of-towners" come in and blitz a town, adding all the shops then leaving before a community has formed to maintain them. But our chief response to that is to try to grow our community.

FWIW, here's a page summarising the differences between OSM and Wikipedia editing philosophies: http://wiki.openstreetmap.org/wiki/Welcome_to_Wikipedia_user...


OpenStreetMap is an impressive project, but Wikipedia is in some sense the most impressive, most ambitious project in the history of the Internet (and, because of how important the Internet is in the history of recorded human knowledge... well, &c &c &c).

The active Wikipedia editing community has been remarkably stable for years (the predictions in Gwern's 2009 post do not appear to have been borne out). It's a large and vibrant community. OSM is still a growth community (to wit: it is not the most important mapping project in the world, while Wikipedia is almost definitely the most important encyclopedia), and has a narrower charter. Community growth targets that might be a reasonable lift for OSM might not be for WP.

There are a lot of things Wikipedia could potentially do to grow the community. But curbing deletionism isn't likely to be one of the more important ones. Deletionism is primarily an Internet message board concern.

So my counterargument would be: Wikipedia should first do big things to improve participation (for instance: it can and should be made much easier, in a technical/UX sense, to write or edit an article). Then, once the community has grown, it can start turning the dials on how much vandalism and error its community can sustainably fight.

I think the page you linked to is a pretty wonderful capsule summary of how Wikipedia's community differs from other communities. I hadn't read it before and am glad I did. Thanks!


> it is not the most important mapping project in the world

Sorry for the aside, but what do you think is the most important right now?


Probably Google's, right?


Except as you and I have argued about in the past, "notability" is redundant. Wikipedia already requires verifiable information from reliable sources. That should be the only bar.

"Notability" introduces a way to subjectively say "well, yes, this is verifiable information from reliable sources, but I don't like this subject, so it shouldn't be allowed anyway". No amount of hand-waving can change the fact that this is what the criterion is used to do, and it makes Wikipedia a laughingstock when coupled with its protestations of neutrality.


Having detailed discussion of "frivolous" topics costs the ability to be taken seriously as a source for less trivial information in certain circles.

Whether the tradeoff is worthwhile is left as an exercise for the reader.


The same reason libraries deaccession books even when they have spare shelf space, and, unless you are a hoarder, you get rid of/donate/sell unused items from your dwelling.

Sifting through hoarded garbage has a huge cost - your time. You should not have to waste time even bothering to skip through garbage articles in an encyclopedia. Wikipedia editors should not have to waste their time reviewing updates to garbage articles, making sure the articles are categorized properly, etc.


I don't think you realize the amount of new stuff that pops up in the world every day. That would be a tremendous number of articles if we wrote about every single little thing.


> Wikipedia had evolved from the early days when I was consistently delighted to find information on all manner of obscure topics

IMO, encyclopaedias aren't meant for obscure topics. And any products or services with a focus on curation have to discriminate. Of course the criteria for doing so will probably always be controversial.


"encyclopaedias aren't meant for obscure topics"

I'm a bit confused by this reasoning. I can see space considerations in a paper environment, but not on a wiki. Why wouldn't Wikipedia try to be the home of all knowledge?


Space has little to do with it. Wikipedia is meant to be a reliable compendium of knowledge, and it can succeed only to the extent that it contains easily verified information. Obscure topics will of course not have much in the way of verifiable information available.


How much research is too much for "easily verified" information?


If you have to do original research to justify a topic's inclusion, you're outside the charter of the encyclopedia. The only "research" Wikipedia accepts are direct citations to reliable sources.


I was more thinking of finding the citations as opposed to research in the discovery of new knowledge sense.


That may be true regarding physical encyclopedias, which are, of course, limited by space and weight (ha!). With an online encyclopedia like Wikipedia, though, those limitations are gone.

I'm looking forward to seeing Infogalactic give Wikipedia a run for its money!


>If you would like to edit articles on Infogalactic: the planetary knowledge core, you may complete and submit the following form to request a user account. Please read the Terms of Service before requesting an account. Once the account is approved, you will be emailed a notification message and the account will be usable at login.

Request an account?!? What exclusionary language. How did I find that out? Because I tried to make a tiny edit. Then found out I need an account. Then that I need to 'request' an account. Yeah, no, infogalactic just lost me forever.

A very nice example of barriers to participation in action.

>Infogalactic has the right to block or ban any user for any reason whatsoever.

Even more totalitarian language... infogalactic seems to be less inclusive & open than wikipedia today.


Can folks link to the best defense of deletionism they know? Here is an OK one I've found:

https://en.wikipedia.org/wiki/User:LeilaniLad

To me personally, verifiability seems a vastly better criterion for inclusion in Wikipedia. It's (a) a much less subjective standard than notability, (b) well-justified by the limitations of the medium (i.e., no original research), and (c) still prevents Wikipedia from being a "dumping ground" for everything on the internet.

The only general argument for deletionism I've seen that survives is that the editor resources of Wikipedia will be stretched so thin that the checking of verifiability itself is imperiled. But (1) Wikipedia survived fine back when it had vastly fewer editors and random people were writing on a bajillion then-empty topics and (2) there are much better ways to handle this minor problem, e.g., pages are hidden from non-registered users until they reach a critical mass of contributing editors.

Measured by pure prevalence, the dominant argument seems to be about status: that non-notable items just don't deserve to be in Wikipedia. It's very hard to argue against an emotional appeal like this.


Is this one of those situations where someone uses a proxy metric for something that is itself easily measured? "We must protect verifiability, non-notable articles aren't easily verified, therefore non-notable articles should be banned"?

Is there a name for this? I've been noticing (and being frustrated by) it a lot.


I don't understand how this is a "proxy metric".

I'll give you the shortest defense of deletionism I can write:

There are two problems with non-notable articles in Wikipedia.

The first is definitional. An encyclopedia is a tertiary source: a roadmap and synopsis of existing sources. Non-notable sources are by definition those for which existing reliable sources are absent. All such articles are, in a very simple sense, "original research". One of the most basic founding principles of Wikipedia, as old as NPOV, is "no original research".

The second is pragmatic. Wikipedia is committed to minimizing error. Entropy exerts tremendous force on the project, which occupies extremely valuable Internet real estate. Every article on Wikipedia is in a sense a commitment by the Wikipedia community to mount a defense against bias, advertisement, promotion, vandalism, and basic inaccuracy. The notability requirement, which is at bottom based not on "fame" but on the quality of sources available for a topic, is an extremely reasonable boundary to put on the expectations Wikipedia can have of its unpaid editors to combat that entropy: to wit, that we will do so only when available sources make the job possible to do in the first place.


> Wikipedia is committed to minimizing error.

Try to tell that to the guys who feel like they own the /dev/random article on Wiki-DE:

https://de.wikipedia.org/wiki/Diskussion:/dev/random#.2Fdev....

An "IP" (which is basically a kind of online racism, because ostensibly "IPs" are not addresses but a certain kind of people) tried to correct the usual /dev/urandom misinformation. It was reverted, called vandalism and the user bakunin finally stated that he reverted because "the discussion was getting on his nerves" and "even if the change was factually correct", it was obviously vandalism.

I know why I don't try to contribute to Wikipedia. When the half-life of contributions to the project can be measured in minutes, it's just not worth it.


It will probably not surprise you that this is also the reason I no longer contribute to Wikipedia. When you're an expert in a subject, writing an encyclopedia article about it is very tedious. It's easy for me to write an explanation of how the LRNG works, or what its faults are, or what the urban mythology is about it. It's much more painful for me to do so while backing every point I make up with some secondary source --- none of which can be related to me in any way, lest I be accused of injecting my own bias into the encyclopedia.

But while I'm not going to say that the way Wikipedia deals with expertise is perfect (it is deeply imperfect), there's a sense in which this is the way it's supposed to be.

If I have important things to say about the LRNG, why would I write them in Wikipedia? Should I not write my own articles, or a chapter of a book, or an academic paper, and let people who want to write encyclopedia articles cite them? In my own work, I can obey whatever guidelines I want to, and write in whatever fashion is most gratifying and effective for me.


(EDITED)

> An encyclopedia is a tertiary source: a roadmap and synopsis of existing sources. Non-notable sources are by definition those for which existing reliable sources are absent. All such articles are, in a very simple sense, "original research". One of the most basic founding principles of Wikipedia, as old as NPOV, is "no original research".... we will do so only when available sources make the job possible to do in the first place.

Why not just incorporate that into verifiability then? As I understand you, you're arguing that (1) non-notability generally implies un-verifiable, (2) Wikipedia info should be verifiable, and so (3) Wikipedia info should be notable.

Isn't it easier to demonstrate objectively that an article contains unverifiable info than that it's non-notable?


I don't understand the question. That's what the standard in fact is.


I expanded my comment for clarity. Let me know what you think.


The notability guideline is part of the verifiability requirement.


Yes, I know. Your argument sounds like you're justifying the notability guideline primarily by appealing to its implication for verifiability, and I'm asking why you don't just require verifiability (and possibly make it more stringent).


> I don't understand how this is a "proxy metric".

I didn't say it was, I asked if it was.

> Non-notable sources are by definition those for which existing reliable sources are absent.

This would also make them non-verifiable, no?

> Wikipedia is committed to minimizing error.

That's fair.


It reminds me of the XY problem [1] that plagues both our field and any other field where one can devise a solution to a problem without formalism or thoroughly understanding what's being solved.

> The XY problem is asking about your attempted solution rather than your actual problem. This leads to enormous amounts of wasted time and energy, both on the part of people asking for help, and on the part of those providing help.

> - User wants to do X.

> - User doesn't know how to do X, but thinks they can fumble their way to a solution if they can just manage to do Y.

> - User doesn't know how to do Y either.

> - User asks for help with Y.

> - Others try to help user with Y, but are confused because Y seems like a strange problem to want to solve.

> - After much interaction and wasted time, it finally becomes clear that the user really wants help with X, and that Y wasn't even a suitable solution for X.

I'm sure that's not the name of the phenomenon you are asking about, but it is certainly a closely related one.

[1] http://xyproblem.info/


Yep, that is very closely related, just not exactly that. Mine goes something like:

- We need to improve X, something that is easy to measure improvement in.

- I've noticed that Y somewhat correlates with X.

- Let's make an effort to improve Y!


It's not a "proxy metric", it's the only possible metric for inclusion in an encyclopedia as I understand encyclopedias. I saw the example of a mildly popular band somewhere else on the thread, so I'll run with that.

What can an encyclopedia say about a band that has a website and a Spotify page, but isn't big enough to attract attention from the press or music critics? We can certainly say that they are a band, anyone can see that. We can say they have 10,000 plays on Spotify, and list their current members. Maybe we can dig up a newspaper article saying that they performed someplace. But beyond that, what is there? They've probably got a biography on their website, but it's in no way verifiable, and the only person required to believe it is the guy putting it on the Wikipedia page. Everyone after that will read it on Wikipedia and assume it's 95% likely to be true. And so on. We're left with barebones facts that are of no help to anyone looking for reliable information about the band. All of this applies equally well to little-known authors, random Indian corporations, and your cat. If all we have to go on is a couple of primary sources and maybe a passing mention in the press, what is there to say without giving undeserved credence to the primary sources?

When notability is assessed, don't think "is this topic something that will be googled by someone looking for more information?" Think about the role of Wikipedia and what we can offer by providing an overview and references to secondary sources. The notability guidelines are about ensuring that there is a baseline of quality facts that we can provide to readers.

But really, I'm just rewriting the ideas of the consensus based notability guideline, which is much better written and specific: https://en.wikipedia.org/wiki/Wikipedia:Notability


> We're left with barebones facts that are of no help to anyone looking for reliable information about the band.

The fact that the band exists and has a newspaper article about it is helpful. Otherwise its article will be very short. So what?

> All of this applies equally well to little-known authors, random Indian corporations, and your cat.

Yes to the first two. (My cat has zero verifiable information about it.) They will have very tiny stub articles. So what?

> If all we have to go on is a couple of primary sources and maybe a passing mention in the press, what is there to say without giving undeserved credence to the primary sources?

So don't say anything else.

I really, really don't understand what's motivating your position. I'm trying!

> But really, I'm just rewriting the ideas of the consensus based notability guideline, which is much better written and specific: https://en.wikipedia.org/wiki/Wikipedia:Notability

That page is almost exclusively about defining notability (which, because of the intrinsic vagueness, necessarily takes a long time and is mostly unsuccessful). The only part that's directly about the reasons for the requirement is this subsection:

https://en.wikipedia.org/wiki/Wikipedia:Notability#Why_we_ha...

And the only argument it brings up that I didn't address is this: "We require multiple sources so that we can write a reasonably balanced article that complies with Wikipedia:Neutral point of view, rather than representing only one author's point of view." This, again, is something that can easily be addressed by just requiring verifiability to include coverage by multiple sources. (Even in highly notable articles, there are a large number of facts that are only verified through a single source; this isn't a reason to delete the fact.)


The problem Wikipedia sees for these very short articles that state nothing other than the fact that they were mentioned in passing in a newspaper article is that there is no potential for them to ever be good articles. The band is defunct, no additional reliable sources will ever be created for it, all available sources have been exhausted to produce an article of almost no value.

You're right that the article has some marginal value. I think many deletionists would even agree.

The problem is risk/reward. Every WP article is a commitment to defend a topic against entropy and malice. Every article is part of the attack surface exposed to vandals and spammers.

So in the view of most of Wikipedia, articles that are fated to forever remain "stubs" have negative net value.

Reasonable people can obviously disagree about this, but I find the deletionist argument against zombie stub articles to be very compelling.


> Every WP article is a commitment to defend a topic against entropy and malice.

But at this point you've basically given up any attempt at civil community processes and you're doing things not because they are right, but because not doing them might give the enemy a speculative advantage.


If that were true, we'd refuse to have articles on Middle East topics and tell people to look elsewhere, because it's an absolute nightmare and NPOV routinely loses to people with agendas. But it's worth it in the mission to be a comprehensive encyclopedia.


I don't understand. My paraphrase of Wikipedia's position on (what I'm calling) zombie stub articles is that the risk/reward for them doesn't pan out. They have marginal positive value but pose significant risk of overhead and error.

But that's very much not the case with other stub articles, or even other zombie articles (where sources have no doubt been permanently exhausted, but the sources we have today make room for a solid 100 word article).


I may have read too much into the word "defend", but before my inner eye I saw all-out war against "them".

All over some moderately important issue, at best.


If this is really a problem, why not just lock the article at a stub with a few verifiable facts? How is this more editorial work than deleting, and fighting a subjective battle over notability?


Because it's obviously (well, to me) better to leave the article in the default state of not existing and have the option of allowing anyone to quickly add coverage, rather than needing a committee to unlock a topic if it does become relevant. This is good adherence to the "Wikipedia Is Not Paper" idea.


Except that there already exists a whole special mechanism for patrolling frequently created-then-deleted pages.

You only need to lock stub articles that repeatedly have people add unverifiable info to them.

Finally, if the page doesn't exist, then anyone who wants to exercise the "option...to quickly add coverage" doesn't have access to the info that would have been contained in the stub!


In my opinion, there is absolutely no reason to have a tiny stub article with information of no interest to anyone. Keeping the band example going, people come to Wikipedia to read about a band without diving into everything that has ever been written about them; if there's nothing to say, we can't help. But I guess your question is more "why not?" Editors will spend more time per year reverting vandalism and removing original research than readers will ever spend using it. And that's really, really optimistic: what actually happens with stubs on little-known topics is that they fill to the brim with original research and info from bad sources, and anti-vandalism editors don't stop it because either they don't know if the sources are good, or they're rightly afraid of being yelled at for removing content, since anything looks like a potentially constructive addition to a shell of an article. If you require a decent amount of sourced information, someone adding a wall of text without adding references looks suspect and is (somewhat) more likely to be removed in a timely manner.


If this is really a problem, why not just lock the article at a stub with a few verifiable facts? How is this more editorial work than deleting, and fighting a subjective battle over notability?

I can't understand why you think one can't defend against this stuff using mechanisms that don't lead to effects like the one described by rsync: https://news.ycombinator.com/item?id=13159057


Addressed the locking below. I'm sympathetic to rsync.net, and maybe I spent too long in the Wikipedia bubble, but what exactly is the argument for it having a Wikipedia article?

>rsync.net has been mentioned in the press widely over the course of over 10 years - everything from articles about our warrant canary to articles about our ZFS support. In fact, other wikipedia pages mention and discuss rsync.net.

So it's had some coverage related to first commercial use of a warrant canary and is discussed in that context. It commercially implemented a feature of rsync (the application) that should be talked about on the page about rsync the application. But what can be said about rsync.net? Maybe someone read about their warrant canary and wants to know more about this company...but no independent source has written about who they are and what makes them so savvy with opsec and rsync. Anything outside those narrow topics is original research or primary sources. I think things are working as intended.


You're arguing that because the info on rsync.net is contained in other Wikipedia pages, there's no point in having an rsync.net page. But the page discussing ZFS support doesn't link to the discussion of the warrant canary, i.e., there's nowhere to go to find all the rsync.net info on Wikipedia. And the reason to not have this page is...because it's too much of an exposed surface for bad actors?


This is answered here, in the second place you asked this question:

https://news.ycombinator.com/item?id=13161015

(better if replies go there too)


OK. I explained there why I think resfirestar's answer is unsatisfactory, i.e., why there are easier, cleaner, and less destructive ways of dealing with that problem than using notability. Would still be interested in your take there.


>But beyond that, what is there?

The hypertext. The names of the band members can be links that lead to the articles on individual members. There can be a section describing the music style, with quotations of review sites or just the band homepage and the part that says they are a neo-core-techno-metal or whatever can have links to articles that explain what those things are.


Can you tell me how do you 'easily measure' notability in a robust, non-gameable way? I'm not aware of any such measure existing. 'Substantial coverage in reliable, independent sources' seems the best possible at the moment.


> Is there a name for this? I've been noticing (and being frustrated by) it a lot.

Laziness?


Wikipedia's Verifiability policy:

> In Wikipedia, verifiability means that anyone using the encyclopedia can check that the information comes from a reliable source.

The general notability guideline:

> If a topic has received significant coverage in reliable sources that are independent of the subject, it is presumed to be suitable for a stand-alone article or list.

The latter is little more than a restatement of the former, so I don't get the point about it being more 'subjective'.


It is probably the 'significant coverage' part of the second bit that makes it more subjective. The subjective part of verifiability is whether a source is reliable, while notability has two subjective parts: whether it has verifiable sources AND whether there are enough of those verifiable sources.


The "significant coverage" clause prevents sprawling articles about non-notable subjects based on the technicality that the subject's name once occurred in a regional newspaper article; for instance, by being quoted by a local sports reporter at a softball game.

If you think this is a contrived example, I'd exhort you to spend some quality time on Wikipedia's AfD page watching the debates.


I was not trying to make any judgement calls on whether the 'significant coverage' clause is a good one to have or not... I was merely pointing out to the parent comment why 'notability' is not simply a restatement of 'verifiability', and why someone might think 'notability' is the more subjective of the two.

Obviously, as is the case for any subjective category distinctions, there are examples so extreme that everyone can agree which category they belong in. That does not change the level of subjectivity there is.


> The "significant coverage" clause prevents sprawling articles about non-notable subjects based on the technicality that the subject's name once occurred in a regional newspaper article; for instance, by being quoted by a local sports reporter at a softball game.

Why not just cut any material that's not actually verifiable?


That's basically what they do. When you're done cutting out the non-verifiable stuff in those articles, you're left with an article that says "Bob Flendersonhaver was probably the name of a person who once stood beside a softball field outside Cedar Rapids, Iowa".


"Significant" is the problem. There are more than a few editors who presume that this basically means "anything I haven't personally heard of".


As someone else pointed out, this is a long, detailed critique of Wikipedia by someone whose main objection is that "by 2007 the water had become hot enough to be felt by devotees of modern fiction (that is, anime & manga franchises, video games, novels, etc.)" There are legitimate criticisms of Wikipedia, but that it now discourages "fancruft" isn't much of one. There are many other popular culture forums. If you want to write about anime, get an account on MyAnimeList or Daisuki or Wikia. (Daisuki, incidentally, is a project of the Cool Japan Fund, a VC fund to export Japanese culture, funded by the government of Japan and some big banks.) There's Pottermore and Wookieepedia and the Marvel Universe Wiki. There are places for that stuff, and they're big and active.

"Deletionism" is mostly pushback against the incoming tide of promotional articles. Here's an essay I wrote on dealing with conflict of interest editing.[1] Without considerable pushback, Wikipedia would read like PR Newswire. Too often, it still does. As I note in my essay, on Wikipedia, it's not what you have to say about yourself, it's what other reliable sources say about you. Wikipedia is one of the very few resources on the Web that's not choked with advertising and promotion. That's valuable.

Editor retention is a problem. Editing Wikipedia is hard. It's not like writing on a blog or forum. It's like submitting a pull request to a major open source project. It's not difficult for anyone who has published in a refereed journal, but that's under 1% of the population. It's painful for someone who's never had their writing tightly edited. So is submitting code to a successful project. It's one way to get better at writing.

Despite this, Wikipedia is dealing well with the hard problems. As the Trump administration takes power, many articles need to be updated, and, despite controversies, that process is going reasonably well. The Washington Post comments that, over time, Wikipedia articles on controversial subjects approach a neutral point of view.[2] Few other places on the Web achieve that, or even try.

[1] https://en.wikipedia.org/wiki/Wikipedia:Hints_on_dealing_wit... [2] https://www.washingtonpost.com/news/wonk/wp/2016/10/25/somet...


Good, someone's making this point!

I'd argue that it's specifically "modern fiction" that made Wikipedia hard to take seriously in the early years (was it that they had a biography of Pikachu but not of Charles XII of Sweden?), and that a hard line on fancruft but lenience in other areas would make sense.

On the other hand, I hadn't known about the PR issue, and I think I now understand why Wikipedia is so unforgiving towards Mittelstand businesses. If the only information you can get about the second-largest sprocket-maker in Topeka was written by the second-largest sprocket-maker in Topeka, it might indeed be better to leave them off the site entirely...


> We talked idealistically about how Wikipedia could become an encyclopedia of specialist encyclopedias, the superset of encyclopedias. "would you expect to see a Bulbasaur article in a Pokemon encyclopedia? yes? then let’s have a Bulbasaur article". The potential was that Wikipedia would be the summary of the Internet and books/media. Instead of punching in a keyword to a search engine and getting 100 pages dealing with tiny fragments of the topic (in however much detail), you would get a coherent overview summarizing everything worth knowing about the topic, for almost all topics.

That sounds much cooler than what we've actually ended up with.


The single most ambitious and effective single resource in perhaps the history of recorded human knowledge?

It's not cool enough for you because there isn't a totally separate independent article about whatever the fuck a Bulbasaurus is?


You don't know what a Bulbasaur is? Maybe look it up on Wikipedia because as has been mentioned here already there IS an article on Bulbasaur! It is not a stub either.

https://en.wikipedia.org/wiki/Bulbasaur

Wikipedia has righted the wrong eventually.


Yeah it turns out that Bulbasaurus is a very bad example. Sorry. I literally know Pokemon principally as a thing people use to complain about Wikipedia with. I Googled "obscure Pokemon" and will henceforth be using "Kingler" as my example of how unreasonably demanding the Internet is of Wikipedia.


"Bulbasaur" != "Bulbasaurus"


That is what Wikipedia actually was in the early 2000s.


I think everybody who isn't highly involved in Wikipedia has always hated the whole "notability" thing. Drives me crazy. If there's something more than a dozen people care about and they're not a family, give them the benefit of the doubt.

And I think the author is right on point with the barriers to contribution thing.

My brother, for example, says he created the Wikipedia page for Cheesecake (~2004) because he noticed there wasn't one so he started it by creating a sentence that defined what he thought cheesecake was. And he didn't register, sign up or even make a good article. He just saw something he thought was missing and created it with a single sentence.

And now there's an actual "proper" article.

Not sure if he could do that today. I imagine most people find a missing article, discover there are hoops, and don't bother.


Right? I feel like if I am clicking on a link to read about something on wikipedia, it is 'notable' enough to me. If no one looks at the wikipedia article, then you could say it isn't notable... but then, who cares? A few extra bytes on the server that no one looks at doesn't hurt anything.


He couldn't do it today because there is a Wikipedia article already for pretty much every conceivable pastry, from Stroopwafels to Cronuts to Fudgie the Whale.

To me, this strongly suggests that if you just today realized that Wikipedia was lacking an article for, let's say, Red Velvet Cake, you would have very little trouble creating that article.


Red velvet cake is simply a variation of chocolate cake, and not notable enough to have its own article. Also this article is low-quality and contains several non-encyclopedic/non-verifiable statements. Propose merging into a passing note in [[Chocolate cake]].


Deleting links from Wikipedia to your competitors is a famous black hat tactic. You can get away with almost anything on Wikipedia if you are deleting.

I think the world needs a list of web pages categorized by Wikipedia topics, but the people who will appreciate it most are spammers and there lies the rub.


Yeah, a decade and a bit ago one could share a bit of knowledge on wikipedia fairly informally and be sure it would feed into the article and help the project.

Fast forward to recent years and you can write a decent article with references and have someone come along from one of the 'patrol' teams and nominate it for swift deletion in under 5 minutes, regardless of whether they know anything about the subject area.

New users are unlikely to try twice, when the initial reaction is so hostile.


FWIW wikipedia seems to be aware that those patrol teams are more harmful than useful.

Vandal Patrol got shut down; new page patrol is now a user right and it's getting a lot more scrutiny and discussion - and the discussion is all focused on preventing the newbie-biting harms. Twinkle and rollback got a lot of scrutiny.

I hate wikipedia, but I'm glad to see some movement in the right direction.


Every year something in my brain breaks and I think "I know, I'll finally get rsync.net a wikipedia entry".

This is somewhat important because the little-hitler that owns (yes, owns) the "cloud storage" wikipedia page refuses all additions of cloud storage providers that "aren't notable". So therefore, rsync.net has to have a wikipedia page.

This shouldn't be a problem - rsync.net has been mentioned in the press widely over the course of over 10 years - everything from articles about our warrant canary to articles about our ZFS support. In fact, other wikipedia pages mention and discuss rsync.net.

You know where this is going.

A perfectly well written article, well cited with 15+ citations from "respectable" journalist sources is nearly insta-deleted due to notability.

Every single time. I've tried 4-5 times over the past few years. Every time it's a different little-hitler that swoops in to bravely defend wikipedia.


Add it to https://infogalactic.com/info/Main_Page

They started by forking wikipedia and have expanded the notability guidelines.


The problem for forks is to get enough users to keep up with ongoing events. For example, according to Infogalactic, Leonard Cohen is still alive: https://infogalactic.com/info/Leonard_Cohen

The Chicago Cubs haven't won a World Series since 1908: https://infogalactic.com/info/Chicago_Cubs

Without enough users, a snapshot of Wikipedia will quickly decay into unreliability. Most pages don't become inaccurate that fast, but you can't know if the page you're reading is one of them or not.


Surely a lower volume fork of wikipedia could somehow incorporate upstream changes in a more or less automated way? It seems like a site serious about doing this would work on some kind of moderated conflict resolution system for when it does happen, but the vast majority of pages would presumably not need it.


It's in the works. Forking something the size of wikipedia is a serious undertaking, not to mention the rickety pile of code it's built on. It's still early days.


Yeah this seems like a great use of hg update nightly.


This looks interesting, but I'm a bit iffy on a few things.

First of all is the perspectives filtering by relativity rating, which, while not yet implemented, is on the roadmap and seems to be an important future part of this encyclopedia. This idea seems to be an attempt to be more inclusive of people with various political views. This looks like a pretty big gamble to me, because it's not at all clear that this solution won't cause its own set of problems, or even work at solving the problem it purports to solve. The relativity ratings being on a one-dimensional political Left to political Right scale doesn't help my perspective on this, either.

There also seems to be some stuff about advertising in the info pages I read, which does not sound like a good thing. And the "Corelords" being corporate sponsors who oversee the editing of pages in their industry sounds pretty dangerous for an encyclopedia. But it's not clear how all this will work from what I've read, so it might be innocuous. Does anyone know more about this?

Then there are more cosmetic things, like the name being "Infogalactic", and the non-standard names for editors and admins, who are called "Galaxians" and "Starlords" respectively. These are harder to remember, and it's geeky in a sci-fi sort of way that's a bit off-putting to those of us who aren't big fans of sci-fi.

I do like that it's gamified a bit with a level system, though.

Anyone care to share their own perspective on this?


While their current focus does seem to be, and probably should be, on politics due to the wide reach of that topic, I think the idea of perspectives in general is a great idea for such an encyclopedia. The problem, in my view, that leads to the culture of deletionism at Wikipedia is the inability for one article to coherently explore all aspects and interpretations of a topic, so editors start deleting all but the parts they care about. Facts that are relevant to one perspective are not relevant to another. For example, Jesus-the-historical-figure is a different topic from Jesus-the-religious-figure.

I think the way Infogalactic solves the problem it is trying to solve is that people no longer have to argue if a particular fact is worth including in an article and have an edit war. It opens up a third option of recognizing that including or not depends on the circumstances of what the reader is looking for. I'm not sure what potential new problems you are thinking of.

As for corporate sponsors, they are bound by the same rules as the rest. I don't think explicitly recognizing corporate interest in the topic would be worse than trying to have an objective page which would probably be subverted anyway.

As for terminology, I don't see how "Infogalactic" as a name is more confusing to new users than "Wikipedia". Most people didn't know what "wiki" meant before Wikipedia. And I don't think new contributors would really have a problem figuring out "Galaxians" and "Starlords" with the obvious context.


Wow, a wiki fork is amazing. It's been needed for a while too.


Out of curiosity, have there been any other forks that have focused on editing the content? I mean, I know there's a bajillion wikipedia mirrors that are just trying to get page views for cheap, but have there been any other forks that are actually trying to bootstrap from Wikipedia and make changes to the underlying content?

As I sometimes mention, informally studying the effects of code structure on community creation is a bit of a long-running hobby of mine, and I'd be intrigued to see any other extant examples of a wikipedia fork.


It's also extremely unlikely to ever get enough traction to steal even a tiny fraction of wikipedia's user base.


That fork by Vox Day? I'm not sure I want to associate with that.


What was the name of the article you created? The deletion log shows nothing for "rsync.net". [1]

[1] https://en.wikipedia.org/w/index.php?title=Special:Log&type=...


I didn't find any evidence of Rsync in the Cloud Storage edit history, but there is evidence that Rsync was added to the Comparison of Online Backup Services article on November 11, 2015.

It was reverted less than an hour later with the comment "Thanks for the addition, but as wikipedia is not a directory, please limit the list to services with wikipedia articles." So it's at least plausible that he tried to create an article for Rsync.

https://en.wikipedia.org/w/index.php?title=Comparison_of_onl...

Edit: I initially thought you couldn't search for deletions older than a year. I was mistaken.


"Deletion records go back only a year"? Could you explain what you mean by this? Here, for instance, is an AfD log I cited several years ago on HN; it works fine, as I presume do all the AfD logs for every page deleted through the normal deletion process on WP:

https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletio...


My mistake, I tried using just a year in the search results and got older results, so I don't really know what the restrictions on the search are.

I wasn't able to bring up any pages from earlier than 12 months ago without putting in a specific year. I tried using Wikipedia pages that I have bookmarked in the past but no longer exist. I understood this to mean that I couldn't access older deletions, because of the following line on the page:

> Below is a list of recent deletions and restorations.

edit: Either way, it appears Rsync.net would be notable, as it appears on the Warrant Canary page as having the first commercial use of a Warrant Canary. That's the only current use I can find of Rsync.net. Internet cache pages are blocked where I work, so I can't find any more information.


Hold on. I'm not debating whether Rsync.net is notable. I'm saying that you confidently made a statement about Wikipedia that appears to be false --- and you did so while condemning other commenters here for holding an opinion contrary to your own.


I think you may be ascribing too much confidence and condemnation to my comment. I removed my inaccurate statement about the edit when I changed my comment, as I couldn't determine how to strikeout my text as I normally prefer, but my intention was not to make a definitive claim about the nature of Wikipedia, and I don't believe I worded my comment as such.

As far as I recall, my initial comment about how the deletion search worked was hedged by the phrase "it appears", (or something substantially similar). I provided evidence of what steps I had taken in my work. My joke about pitchforks was merely a joke, meant to point out how quickly users make up their mind one way or another.

I edited my initial comment to admit my error as soon as it was pointed out to me. Had I been acting maliciously, I would have just removed that section completely.

Regardless of what words I used to describe how Wikipedia works, the word I used in my assessment was "plausible", which I think is a very even-handed word. I specifically chose it because it doesn't make a strong claim either way.

You are reading a bit too far into my comments if you think what I said was a definitive claim about how Wikipedia works, or an attack on other users.


Yes, it was "comparison of cloud storage blah blah" that we've tried, several times, to be added to, but again, "not notable" since we're only discussed in wikipedia, but don't have our own page.

But more to the point, as I have described, we have tried to add an rsync.net wikipedia page several times (I'm sorry - I don't know the exact naming that "rsync.net" generates in a wikipedia URL) and, as I described, it was almost instantaneously deleted due to notability - even with 12-15 serious, journalistic sources.


In order to look into this controversy, I searched for rsync.net on Wikipedia, and this is what I found. The user Kozubik submitted a draft with references, but the user Arthur Goes Shopping dismissed each of the references. Then when no one edited the draft for six months, the user JMHamo deleted it.

https://en.wikipedia.org/wiki/User_talk:Arthur_goes_shopping...

https://en.wikipedia.org/wiki/User_talk:Kozubik#Your_submiss...


Yes, that (2015) was the most recent attempt.

I have a locally saved copy of the submission and the references included a long form article at arstechnica, a long form print article in a magazine (Linux Format), The Yale Law Journal, theregister.co.uk, Lifehacker, ComputerWorld, EFF/Canarywatch, and more ... all over a 10+ year period.

Dismissed, flagged as not notable, and nothing to do but let the submission expire.


It's likely because of the conflict of interest rules which can go unmentioned in discussion threads

Get someone else with a mild level of Wikikarma to add the same content


Is that something you can escalate to someone higher-up in the Wikipedia chain of command (if such a thing exists)?


First you need to spend months doing minor edits around wikipedia and participating in various social functions (like talk boards) in order to develop a reputation. Otherwise nothing you do will be taken seriously. Then you need to familiarize yourself with all the nuances of the wikipedia internal policy guidelines both as written and as practiced, because you can be sure Mr. "This is my area, no one touches it without my permission" is an expert on all that.

Then and only then, will you have a shot at getting a change in that the tin pot dictator doesn't want. And you will need to be constantly vigilant that it doesn't disappear two weeks later.

I've never found it worth it to go through this idiotic process.


That's always what puts me off as well. It's turned into a bureaucratic mess – which is funny, considering that wikis were originally invented to be quick ways of collaborating with minimal effort and a low barrier to entry.


A very nice thing about Wikipedia is its obsessive tendency towards logging all this stuff. What's your Wikipedia username? We could look at some of the exchanges you had with more "reputable" contributors and get a sense of how valid this complaint is.


Theoretically he could exhaust the 3 revert rule with the delete-happy user (he creates it, guy deletes it 1, he puts it back 2, guy deletes it again 3), make a note on the talk page, and head to the edit war noticeboard where an admin can see it.

If you don't understand any of this, you see why WP is a bureaucratic pain in the arse.


It's 3 reverts in 24h, for what it's worth. You have to have a very trigger-happy editor.


I can't find Rsync.net anywhere in the deletion logs, but one thing you might be able to remember is the username you used to add the article (or, really, to edit anything on Wikipedia). Could you give us that, and we can use it to track down the deleted article?

I would not be surprised to see that you had sufficient reliable sources to sustain an Rsync.net article, so it'd be interesting to see the logic used to speedy-delete your article.

One nice outcome of this would be that you'd end up with a functioning Wikipedia page. :)
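For anyone who wants to check the deletion log themselves, here is a minimal sketch that builds a MediaWiki API query for delete-type log events on a given title. The helper name `deletion_log_url` and the title "Rsync.net" are assumptions for illustration; the article may have lived under a different title or namespace, and this only constructs the URL rather than fetching it.

```python
from urllib.parse import urlencode

# English Wikipedia's MediaWiki API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def deletion_log_url(title, limit=10):
    """Build a query URL for delete-type log events on one page title.

    Hypothetical helper: the title passed in is a guess at how the
    page was named, not something confirmed in this thread.
    """
    params = {
        "action": "query",
        "list": "logevents",   # public log entries
        "letype": "delete",    # restrict to the deletion log
        "letitle": title,      # page title to look up
        "lelimit": limit,      # maximum number of entries returned
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(deletion_log_url("Rsync.net"))
```

Opening the printed URL in a browser returns the matching log entries as JSON, if there are any under that exact title.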


In the meantime, add references and statements to https://www.wikidata.org/wiki/Q27998978 (item about rsync.net)


You could always submit it to Infogalactic and see what happens.


They don't include your offering because its success is not notable enough. Strangely, they do include a list of failures in another segment:

https://en.wikipedia.org/wiki/List_of_commercial_failures_in...

Maybe you need to go belly up then rebound with all your customers' data. You'd be a "commercial failure" plus a "significant event." That might qualify for Wikipedia. Or just keep doing business as usual. ;)


I think that the notability clause is doing a ton of damage, especially because it is enforced in a way that discourages any new users. My one and only try to create an article was shot down over it and I simply stopped trying after that.


> "by (...) using software that makes undoing most vandalism far easier than doing it, the participation goes through the roof"

The bitter irony is that this cause of wikipedia's early success is the seed of its current rot: one person's contribution is another person's vandalism; if undoing a contribution is always less effort than making it, participation must decrease.

Ah, well; everyone who's ever sneered at the list of fictional starfish back in the day has what they wanted now, I guess.


I wonder if there's a name for this phenomenon, where a small group emerges from a userbase, gains control of the platform and starts applying its preferences to the detriment of the average user. It strikes me as very similar to the problems stack overflow faces.


Plutocracy? You have to have a decent amount of free time and resources initially to value spending time working on Wikipedia articles enough to gain the working knowledge and social connections in the system needed to advance broader agendas, or enforce narrower biases.


It's not just wikipedia that does this; it's not even specifically a problem of open communities. Organisations all over do this. People underestimate complexity and think that they can assign permissions and control access beforehand, and they invariably end up making things hugely complicated in some vain endeavor to solve largely imaginary problems.

Wikipedia, stackoverflow - but also companies with their crazy policies, and governments (which kind of inspired the word kafkaesque). Clearly, this is human nature.

Oh, it doesn't actually work, so any successful organization also develops an unofficial way to get stuff done, and those facing real risks need to develop an audit culture too. But the lure of just that slightly more detailed access control system stays, and rules proliferate.

Heck, our entire system of human self-organisation is one big cruft of rapidly growing rules, contracts, laws and treaties. It's everywhere. They only ever really shrink at any meaningful rate when the parties involved cease to exist.


It hurts to be reminded of this. I quit wikipedia because of that.

Having your page deleted is an absolute slap to the face. It's like bringing a home-baked cake to a charity cake sale and having the other bakers take your cake, throw it to the ground and stomp on it while cackling maniacally and shouting "not notable" and other stock phrases.

I refuse to accept their excuses.


Seems to me it's more like bringing your home-baked cake to Wal-mart and being horribly offended that they don't just put it on the shelves. I think Animats' analogy in a sibling comment of making a pull request against a major open-source project is very apt. Wikipedia is a huge and hard to manage project with extremely high standards.


I did an inclusionist fork of Wikipedia, Includipedia, a few years ago, which failed due to inadequate execution on my part.

I still think the underlying idea of an inclusionist fork is sound, and I'm surprised one hasn't been successful.


Wikipedia is "good enough"; they have the funding and it's too much effort for most people to switch (particularly now that e.g. Google integrates wikipedia directly into search results). It's a shame.


As a deletionist organization they certainly won't get my money; meanwhile, as more articles are deleted, there is less for google to link to. They're narrowcasting to an ever smaller group, and that can't end well. Eventually they have to close or change course. They won't change course, so ...


Do you know if they still have the deleted pages? It would be nice to be able to recover stuff from before the deletions started.


Not as far as I'm aware.


Even aside from a biased editor wanting to control a subject, reputation, edit history, and voting on administrative decisions are all important parts of a Wikipedia editor's account and experience of Wikipedia. Enough 'good' participation can get you an administrator account and further privileges.

This turns Wikipedia into a game for these people which, inevitably, leads to arcane rules-lawyering. Creating content is hard, and when you want to increase your metrics and contribute to the community because it is the community and not because you have a passion for Subject X, creating content is not a sustainable strategy in the long-term. You can't know everything, you can't be an expert on everything. It's much easier to revert, to rules-lawyer, to delete. So that's what happens.


I used to contribute a lot to Wikipedia back around 2004-2005 (I did a rather large rewrite of the article on photon mapping). But in the following years, almost every edit I made or article I wrote was reverted or deleted, so I just gave up. Maybe a better version of Wikipedia will come along at some point, but until then I am no longer making any contributions.


The key insight here is that you cannot drive quality in a project like Wikipedia at the expense of driving contributors away. When people leave, who will watch and maintain the articles? You can try to lock down even harder, but then nobody new will join, and the existing contributors will eventually churn and leave.


> who will watch and maintain the articles?

Exactly the people you don't want - power-hungry, fiefdom-building diehards.

The whole point of Wikipedia was to be a repository of our knowledge, even if there are flaws, but it just seems so pointless to contribute. I know I stopped bothering years ago.


I don't know how true it is that niche topics are unwelcome. I wrote this article about a year ago in hopes of getting this implemented in Python's statsmodels, and so far nobody has told me that this very niche topic shouldn't be in Wikipedia:

https://en.wikipedia.org/wiki/Medcouple

I've gotten a number of helpful edits for it too.

I also referenced it from the much more mainstream Boxplot article:

https://en.wikipedia.org/wiki/Box_plot#Variations

So, my experience has been positive so far.

(Btw, I'm still hoping someone will implement this in statsmodels.)
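For anyone curious what the statistic involves, here is a minimal sketch of the naive O(n^2) medcouple, following the kernel definition in the Wikipedia article linked above. It is not the fast algorithm and not statsmodels code; the function name `naive_medcouple` is an assumption for illustration only.

```python
from statistics import median


def naive_medcouple(data):
    """Naive O(n^2) medcouple: the median of a kernel over all pairs
    (a, b) with a >= sample median >= b. Illustrative sketch only."""
    xs = sorted(data, reverse=True)
    m = median(xs)

    # Center on the median; split into the upper and lower halves.
    # Values equal to the median land in both lists.
    z_plus = [x - m for x in xs if x >= m]
    z_minus = [x - m for x in xs if x <= m]
    p = len(z_plus)

    def sign(v):
        return (v > 0) - (v < 0)

    def h(i, j):
        a, b = z_plus[i], z_minus[j]
        if a == b:
            # Only possible when both are zero (ties at the median):
            # use the special sign kernel instead of dividing by zero.
            return sign(p - 1 - i - j)
        return (a + b) / (a - b)

    values = [h(i, j)
              for i in range(len(z_plus))
              for j in range(len(z_minus))]
    return median(values)


if __name__ == "__main__":
    print(naive_medcouple([1, 2, 3, 4, 5]))    # symmetric sample
    print(naive_medcouple([1, 2, 3, 10, 20]))  # long right tail
```

A symmetric sample gives zero, a long right tail gives a positive value, and a long left tail gives a negative one. The quadratic cost is what makes the faster algorithm worth implementing for large samples.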


I think it depends a bit on the topic and area how easy and/or hard it is to add anything. I also think that it is a bit easier to add something than to start a new article.


Sometimes I wonder if the solution to this isn't just to add a second tier of pages to Wikipedia, alpha (canonical) and beta (non-canonical): with pages starting in the alpha Wikipedia, then being moved to the beta Wikipedia if they're not considered sufficiently notable, where they can be worked on and eventually promoted back to the alpha Wikipedia once they've reached a certain level and combination of quality and notability. Deletions could happen for straight-up junk pages, but at least 'notability' would be removed as a qualification for inclusion.


Could Wikipedia ever be forked by a group that wanted to compete by trying to win with a better culture and/or technology?

It's such a hostile culture it'd be nice to get rid of persnickety behavior, and indulgence of unproductive obsessive personality traits (like deletion binging, worrying more about rules than the spirit of the rules).


Everything on Wikipedia is CC-BY-SA or free-er, and the software that runs it is GPL. If you think you can do better, feel free to try. I'm fairly certain several people have already tried. I personally find the "hostile culture" to be not too different from a culture with high standards, and high standards are exactly why Wikipedia is popular - you see exactly the same criticisms leveled against StackOverflow and other essentially "canonical" resources on the web. I would not be surprised if anyone who cares at all about quality will get a lot of people complaining that their contributions were ignored or that it's impossible to add anything to these resources.


There is a real difference between pursuing quality and being unproductively persnickety. One doesn't require the other.

Take your example of SO: the issue became so bad in 2013 they had to revamp the way moderation worked. They wouldn't have done that if it were just a bunch of people wanting to lower quality.


It's becoming a fairly common occurrence where a person whose personal contribution isn't accepted as-is by a community writes an emotional, novel-length post about why the community is failing and will die soon. Someone did this recently with Stack Overflow.


Maybe because I grew up reading HHGTTG I've always thought that of course a freely editable internet encyclopedia should be as inclusive as possible. No, I won't buy the deletionist argument. Personally, I wouldn't mind if people started to cite other sources than WP again. The WWW as a whole will be the ultimate Guide.


But HHGTTG was far from inclusive. It was only edited by staff writers, and they had a team of central editors that would cut down long, detailed articles to their bare necessities ("Mostly Harmless").


Oh, true, true. The core idea stuck, still.


It is fun to see how AfD debates that are closed with "merge" just never get merged; deletionists will not bother. Too much work.

For example: https://en.wikipedia.org/wiki/Crucible_(software)


Great article. Sad to hear there is a cultural shift...

Worth noting that there is definitely a Bulbasaur article on Wikipedia though. https://en.wikipedia.org/wiki/Bulbasaur


[flagged]


We detached this subthread from https://news.ycombinator.com/item?id=13159057 and marked it off-topic.


Then should there not be any commercial products mentioned on Wikipedia ever? There's going to be a number of entries to delete now.


That is correct. Many articles with comparison and product lists are under a permanent edit war of [interested] people adding and removing products.


This is more or less an argument that Wikipedia is impossible and shouldn't exist, since almost every topic is a potential battlefield. And yet the project not only manages to handle this, but counts among its highest-quality FA contentious subjects like renewable energy, Everglades restoration, BAE Systems, and Microsoft's Security Essentials package.


In theory any topic could be a battlefield. In practice, there are only a few which have lots of edits and counter-edits.

Comparison_of_<something>_tools is one of them. The products will come and go, YMMV depending on when you read the list.


Personal attacks are unwelcome on Hacker News.


How is that a personal attack? (Even if you consider it an attack, it's fairly impersonal.)


You accused 'rsync of adding articles to Wikipedia solely to promote their company. Casting aspersions on the motives of fellow HN users is corrosive to civility. Please don't do that. If it feels unnatural to bend over backwards to avoid giving that kind of offense, that's all the more reason HN needs to be careful about this kind of thing.


The message used the word "You" twice. I have since edited it.



