Wikipedia:Frequently misinterpreted sourcing policy

Source: Wikipedia, the free encyclopedia.

This is a list of key points of frequently misinterpreted sourcing policy, guidelines, and community norms at Wikipedia. It also covers some related issues surrounding enforcement. It is not complete, and may mutate over time. It is primarily focused on WP:Verifiability, WP:Identifying reliable sources, and WP:No original research § Primary, secondary and tertiary sources.

As of 2020, the points below badly need to be reinforced in our policies and guidelines, perhaps in multiple places (though in much more compact wording – this page explains the issues, and is not wordsmithing them into rule rewrites):

What a writer/organization is an expert in matters – a lot

Editors misuse authors who are expert in one field, for their claims regarding another field.

For example, it doesn't matter if an author is a world-class expert on quantum mechanics, if what they're writing about is a professional digression into human psychology or the history of Cyprus. They are not necessarily more reliable for those topics than any other random writer.

  • The central principles behind WP:Identifying reliable sources (medicine) (WP:MEDRS) – intended to be our strictest sourcing guideline, aside from perhaps WP:Biographies of living people policy – really apply to all topics and need to be integrated into our sourcing model more generally. [Aside from one unresolved error in that guideline, treated below.]
  • Another example: A lawyer is not a linguist, and punditry from one attorney about English-language dialects is not reliably sourced just because it got published (especially if perpetuation of incorrect and overtly nationalistic beliefs about dialectal matters is what drives his book sales!).
  • Journalism and news-reporting about or touching on technical topics cannot be used as a reliable source for technical claims. Journalism (in either sense) simplifies both concepts and wording in such topics. Nor is it reliable for statistics; journalists get statistics from somewhere else (hopefully the actually reliable sources for them) and then often misinterpret and misrepresent this data.
  • Our policies do not yet codify this, but current community best practice on technical topics (scientific, medical, IT, legal, etc.) is to cite, when available, both a high-quality secondary source and the original primary-research source, for our claim. Non-experts will understand the secondary source, and Wikipedia itself wants it as the main source. But technical readers in that field or a related one aren't interested in that; they want to read the gory details in the original journal paper. Similarly, in a computer science topic, we definitely want secondary sources for, say, the history of an Internet protocol; but IT people want to read the actual specs, too.
  • However, it is never enough to cite only primary sourcing for any kind of analytic, evaluative, interpretive, or synthesizing claim. Just the peer-reviewed original journal articles aren't sufficient sourcing for such a claim, no matter how many technical/professional editors from that field tell you otherwise. They are trying to project onto Wikipedia the typical professional researcher's preference for primary sources (the hot new research) and wariness of secondary ones (abstracts they skim as "homework", and popularized materials that elide the details they want to see).

Wikipedia over-focuses on publisher instead of author reputability

Editors use the existence of a publisher as evidence of an effective field review system that would ensure the quality of an author's claims.

Most of our assessments of publisher reliability are based on pre-Internet reputation, and reputable publishers often print material by people who turn out to be quacks or frauds, anyway.

  • This approach does not work in an era when publishers are buying each other, selling off divisions (without changing the names) to whoever has the most money, getting entirely bought out by international infotainment conglomerates, and also going more and more toward where the money is, into nonsense populist works full of outlandish claims (ancient aliens, angels among us, extremist political views, pop psychology, pseudo-medical twaddle, etc.).
  • Many editors confuse party and person, and this confusion can run deep. Secondary is not synonymous with independent sourcing; they are separate concerns. Third party authorship doesn't equate to secondary source nor to tertiary, despite the terminological similarity. See WP:No original research § Primary, secondary and tertiary sources.
  • Being from a "major" (says who?) publisher is not proof that a source is reliable; it's just an indication that it is more likely to be reliable than self-published blogging or e-books – because at least one professional editor acted as a filter, and because other reliable sources cite material from this publisher on a regular basis.
  • This relates to a failure to understand the importance of the body responsible for the quality of the work. Some scholarly publishers are known to take responsibility for monographs, while others will shift it onto the editors of edited collections. Some take more responsibility by implementing systematic book reviews which amount to field reviews of literature. Simply publishing a book is not taking such a responsibility. Nor is outsourcing editorial review and fact-checking to scholars who lack field-specific expertise. Wikipedia is mostly interested in the quality of the body, system, or network which ensures that a work meets the standards of field-specific expertise, in a field whose standards of expertise we esteem. Did a body of scholarly historians check that a scholarly history really is scholarly history?
  • Conversely, it is illegitimate for activists to cast doubt on well-accepted information sources as a class, just because they supposedly don't have the "correct" slant. Examples: claiming all the newspapers are controlled by the far-left, saying nothing from Oxford University Press is reliable because of British classism, dismissing everything published by subsidiaries of 21st Century Fox because Fox News has a conservative bias, spinning the conspiracy theory that medical journals are all in bed with big pharma, etc., etc. No one here takes that crap seriously, and it's grounds for a topic ban or worse.

Editorials, columns, and blogs are categorically primary sources

Editors misuse opinion as fact. The opinions of geniuses and respected organizations are still just opinions.

Most editorials, op-eds, reviews, blogs, and advice and essay columns are not high-quality primary sources. When they have not been written by notable individuals (in their areas of expertise) or those acting as official spokespersons of notable organizations, either with their own sources cited or at least a very clear indication where the information is ultimately coming from, they're just noise.[a]

  • Even when editorializing material has been written by such ostensibly reputable parties, it is viewpoint (and often outright advocacy), not fact.
  • Many columnists and editorialists are wrong about many things, and the entire point of an op-ed (opinion editorial) is to press an opinion or view, just like a press release.
  • Even WP:MEDRS has a blatant error in this regard, which some editors have been trying to fix for several years in the face of stonewalling: "Ideal sources for biomedical information include ... position statements from national or international expert bodies". Position statements are press releases (though often citing their own sources for background facts, as do high-end op-eds), and the guideline even used to say "press releases" there, but the term was removed to sweep under the rug the fact that MEDRS is saying, e.g., the British Medical Association or FDA organizational stance on e-cigarettes is "ideal" secondary sourcing when it is actually highly politicized primary material.
  • When usable at all (with due weight), opinion material must always be directly attributed inline and often should be directly quoted. We don't repeat opinions as facts in Wikipedia's own voice.
  • Reviews (in the book, film, etc. sense; this doesn't mean academic literature reviews) are by nature subjective; a work cannot be said by WP to be "derivative", "thrilling", etc., based on them. Reviewer speculation about inspirations for, influences on, and meaning of a work is wholly subjective and unreliable, absent statements from the creators of the work, or numerous notable reviewers all concurring. For opinions on the tone, style, and characteristics of a work, we can quote/paraphrase reviewers with attribution in a due and balanced manner.
  • However, opinion pieces can sometimes be secondary reliable sources for particular facts, when they are based on research and are stating a simple objective claim. If a reviewer did journalism-style work to get the facts, we can treat them as reliable for the claim that, e.g., such-and-such film scene was shot in Botswana. Equally, an FDA statement can be a secondary source for factual (not stance/advocacy) material when its own sources for the claim are clear, as they often are in well-drafted medical organization or regulatory body statements. Such a primary source can also be reliable for uncontroversial statements about itself or its writer (e.g. what the FDA's stance actually is, or whether the reviewer really did like the movie and gave it four stars).

Journalism and news are not guaranteed reliable or secondary sources

Editors use journalism and news reporting for claims which newspapers, magazines, documentaries, and news sites cannot substantiate.

Journalism "proper" and news reporting are not the same (though the profession of journalism covers both). Each has its reliability problems.

  • Learn this, know it, live it: Publication in a major newspaper, news site, magazine, non-fiction TV show, etc. – even an academic journal – doesn't automatically make it secondary. "Secondary" is a quality of the writing and the editorial process that led to publication, not of the publishing company or publication itself. Various publications that focus on secondary material also include lots of primary material. "It was in a newspaper so it must be secondary" is nonsense, a misunderstanding of the concept. (Do you think the advertisements in the newspaper are secondary sources? What about the "situation wanted" classifieds?)
  • News reporting is generally too close to the events it is reporting on to have a clear idea of their significance, or even what is really happening. It also over-relies on eyewitness testimony, and is credulous of press releases and other biased statements. Worse, many news organizations "cannibalize" from each other without fact-checking and introduce new errors in the process. Virtually all of it is low-quality primary sourcing.
  • Plenty of investigative journalism is primary, especially where it hides sources or comes to a conclusion reached by the writer as if that individual had the fact-finding and deductive powers of a huge agency. Some of it is tertiary, e.g. tables of statistics and sidebars of factoids.
  • Headlines and similar news blurbs ("kickers" and "deks" – see News style) are not sources; they're metadata and advertising: summaries and attention-getting teasers that not only are not the actual substance of the piece but often misrepresent it, either as to material facts or as to balance.
  • Some careful news reporting contains secondary material (based on interviewing multiple experts, agencies, etc., not on repeating what eyewitnesses said), when it's not regurgitating press releases or leaping to early conclusions. Even then, it has to be treated more and more like primary sourcing the closer it is to the events it's reporting on, and the further those events recede in time. This is not a Wikipedia idea, but standard treatment of news sources in the wider world.

Not all tertiary sources are created equal, and none are ideal

Editors use poor-quality tertiary sources where appropriate higher-quality secondary sources should be cited.

Reputable encyclopedias and dictionaries, both general and field-specific, are reasonable (at least temporarily) for basic and uncontroversial information, as long as we understand that more in-depth and current secondary sourcing trumps them. Being a compilation of previously-published claims doesn't "automagically" make a work reliable.

  • Dictionaries are generally not reliable except for what a term means in everyday casual speech and writing – which is usually not what we're writing about, except in an article about a slang expression. They cannot be used to trump more in-depth sources. If we have an article about a term, the most notable[b] and encyclopedic information is how the term is used in one or more professional fields; we should note the broadened everyday-banter definition in passing only, and otherwise focus entirely on what reliable sources in the field(s) say about the term and the concept(s) it describes. A dictionary's definition that doesn't include that meaning cannot be used to suppress it.
  • Similarly, if a dictionary (a highly tertiary source) gives a concise definition of how the term is used in a specific field, this cannot be used to constrain the scope or content relating to that field either; we should use the same sorts of secondary sources to provide encyclopedic coverage that the dictionary writers used [we hope] to arrive at their over-simplified topical dicdef (which may also be decades out of date); we should do a better job of it.
  • Coffee table books, school textbooks (below the graduate-school level), and children's, new-reader, or abridged works are not in the same class, and verge on categorically unreliable.
  • See WP:Use of tertiary sources for more detail on subtypes of tertiary sources and how/why/when to use (and not use) them.

Scholarly coverage does not equate to scholarly consensus

Editors mistake claims about reality for reality itself, and equate both frequent coverage and newness to veracity.

  • Scholarly fields produce lots of terms and descriptions regarding reality. The mere existence of such material in reliable sources doesn't mean these are the scholarly consensus about reality, nor that Wikipedia must cover them all. Mistaking subjective description of reality for reality itself is confusing the menu with the meal.
  • Undue weight can easily be given to ideas appearing only in single publications, or advanced by small in-groups of scholars, making claims which are not widely agreed upon within the applicable research community.
  • Notable, weighty, or even simply jargony terms and descriptions from technical and academic fields are too often reported in Wikipedia's voice as actual fact, or implied to be true with weasel-wording ("According to researchers ..." – like who, exactly?).
  • Editors read published but questionable sources and produce these sources' claims as fact. Editors read tentative or novel positions and produce these sources' hypotheses as conclusions. Editors often read the results of a smaller body of academic work (for instance a research project, research program, or sub-field) as if it were the consensus of a larger body of academic work (for instance a field, sub-discipline, or discipline). By searching for scholarly consensus at the incorrect level of abstraction, editors misrepresent the scholarly consensus: this puts demonstrably unverifiable claims in the voice of the encyclopedia.
  • This all negatively impacts the encyclopedia. As just one example, the "myth of a clean Wehrmacht" is commonly (sometimes quite calmly) pushed on the English Wikipedia, despite it being a fringe view rejected by the consensus of reputable historians.
  • Various claims without professional consensus may prove true some day, but newly published research (a primary source) hasn't been subjected to systematic review and is very likely to be upended. Wikipedia isn't a journal, or a news source. In an encyclopedic work, being accurate matters more than being up-to-the-minute; we have no deadline. Wikipedia wants to be "scooped" by other publishers, because we rely mostly on secondary sources digesting and sanity-checking primary sources for us. Yet there is intensifying pressure in our click-bait society to rush to release questionable factoids. The impulse must be resisted individually and collectively at this site.
  • A related fallacy is that uncritically covering novel ideas, just because they're "interesting", is harmless, entertaining, even important. Not so. The better Wikipedia is written and sourced, the more people rely on it, and the more they believe it. Despite disclaimers, people do use Wikipedia to make medical, financial, political, and other decisions with serious potential consequences, even though they should not. We have a duty to get it right, as best we can, with the best sources we can find.
  • Wikipedia does have a responsibility, under the due-weight policy and our goal of presenting complete information, to not completely ignore a minority viewpoint, as long as it is treated substantially as a subject itself in multiple independent reliable sources. But a fringe view cannot be implied to have equal validity to real-world consensus, much less presented as truth. The fact that one exists and has some prominence is a strong reason to include it, if only to debunk it – not all scholarly coverage is positive about an idea. This is why we have articles like Vaccine controversies. Correspondingly, reasonably novel scholarly ideas may be notable in themselves as ideas, but not weighty for their claims about the nature of reality.

Editor understanding of original research is at an all-time low

Editors both create original research by inferring correlations, and fail to summarize reliable-source consensuses by claiming that it would be original research.

The WP:No original research policy needs to be rewritten with greatly enhanced clarity, both as to what various classes of sources are permissible for what kind of info in what contexts, and as to what does and doesn't constitute original research at all.

  • We might consider reviving the narrowly failed proposal, back in the 2000s, to merge all of the sourcing-related core content policy material – WP:Verifiability (WP:V), WP:No original research (WP:NOR), and WP:Identifying reliable sources (WP:RS).
  • Every day, we see farcical antics at both extremes of NOR misinterpretation.
    • This time, it's someone taking multiple sources about event X that suggest correlations with entity Y and perhaps outcome Z, none of them in agreement about exactly what happened, and turning this into a Wikipedia article statement that Z is a direct result of Y definitely doing X.
    • Next time, it's someone denying that we're able (instructed, in fact) to use multiple sources that are in agreement to summarize the RS consensus in our own words, just because they didn't all use exactly the same phrasing.
  • And a day later it'll be someone arguing that, because they're only applying their personal, novel interpretation to a single source, and not synthesizing a novel conclusion from multiple sources, it's not OR. They're wrong: "all material added to articles must be attributable to a reliable, published source", as stated in the very lead section of WP:No original research, and repeated at WP:Verifiability § Original research. The policy is really clear about this twice.
  • This kind of stuff is intensely disruptive, in a far worse way than chest-beating contests on talk pages, since it results in skewed edits in the articles, and thus direct misinformation or misleadingly cagey and incomplete information being sent to our readers.
  • If in doubt, look at featured articles and their treatment of sources, then skim the discussions that led to their featured status. The reviewers of these pages are hard-core about sources being used properly, everything in an article being reliably sourced, and nothing encyclopedically important about the subject being excluded.

We must enforce against disruptive editing more swiftly and broadly, with less drama

Our editorial community and admin corps are not taking sufficient steps to protect the integrity of the content and of the project.

Wikipedia can and should more quickly shut down disruptive editing of all kinds. This is especially the case when discretionary sanctions (WP:AC/DS, or just "DS") have already been authorized for a topic; what are we waiting for? This includes enforcement of WP:Wikipedia is not, and the WP:Core content policies, not just WP:Civility-related matters.

  • It is not okay to filibuster against addition of reliably sourced information by pretending journals or major newspapers are trash sources (or pull the vice-versa act, revertwarring fringe material in play because it's in a book from a no-name publisher).
  • Lack of enforcement of WP:Wikipedia is not a forum and WP:Tendentious editing has seen several topic areas' article talk pages turn into the equivalent of 4chan or Reddit webboards, and this has to stop.
  • A better way for the community to approach this than current usual enforcement would be to issue short-term (but escalating) topic bans and blocks with less hesitation and debate, either on the part of DS-using admins or by the community at WP:ANI. If someone's being an asshat, remove them from the topic area and let the rest of us get back to work. If the sanctions are short-term, they will either a) have the desired effect and shift the editor's behavior, or b) demonstrate the editor has some kind of fundamental competency problem or isn't really here to build the encyclopedia if they keep doing it again and again despite escalating sanctions each time.
  • Our current process usually involves a too-lengthy litigation, and too-high standards of proof, because the typical sanction imposed is too long and dramatic. Stop making it about a one-year sanction, and instead make it about a two-day sanction with the next one being a week, then a month, then three, then indefinite (or some other particular numbers). Analogy: Our criminal justice system would be unworkable if every traffic ticket could lead to a life sentence; it would be hard to secure a conviction, and no one would be willing to do their time, but would instead desperately exhaust every avenue of appeal, and thereby drown the system. The wiki-equivalent is already happening to us, and has been for a long time.
  • Current practice is to over-focus on rudeness, with a lack of attention paid to source abuse, original research, and stonewalling against reliably sourced edits. These are less visible than flamewars, but actually more disruptive. Some topic areas, like modern American politics, are hellholes, because of a toxic combination of battlegrounding and willful sourcing-policy misinterpretation. It's the latter that is the root cause of the incivility and poisonous atmosphere, so it must be addressed directly. Yes, do also consider personal attacks and such, but look into whether the angry party has exploded because the "victim" has been gaming the system relentlessly, and perhaps even goading to eliminate a rival.
  • We have to stop our hand-wringing about editorial intent, previous good edits, and personal feelings. It doesn't matter if someone is usually a productive and nice editor; if they're being an intolerable zealot on a particular topic, they have to be removed from it quickly, for longer and longer, until they either get it or they're indefinitely topic-banned.
  • A good side effect of this would be that having a short-term block or T-ban on your log would no longer be so much of a scarlet letter; lots more people would have them! Consequently, we would also be much more willing to implement them to shut down disruption.

In conclusion

If a lot of the above were resolved through better-written policies (and better enforcement thereof), then it wouldn't matter so much if screaming obsessives on either side showed up to rant about Trump or e-cigarettes or a fringe topic. If they tried to use sources incorrectly we'd just revert them, and if they unreverted, someone else would revert them again because we'd all be on the same page about sourcing. If they didn't stop, they'd be swiftly removed from the topic area, but given a chance to learn from the experience.

Notes

  a. ^ Occasionally, the noise is actually what we want, as a primary source; e.g. when someone famous wrote something inflammatory, and our article is about the controversy.
  b. ^ Technically, "non-indiscriminate" a.k.a. "non-trivial"; WP:Notability only determines whether a topic can have a stand-alone article here, while the standard for inclusion inside another article is WP:What Wikipedia is not § Wikipedia is not an indiscriminate collection of information.
