I'm watching, with some air of detachment, the unfolding
clash between the editors on
Wikipedia trying to delete the Shen article on
grounds of non-notability and the members
of the Shen group trying to keep
it. I can't guess for certain what the outcome
will be, but since the editors have the power
they will probably get their way. Actually, I did
not create the Shen entry and, left to me, there
would be no entry. But somebody else did and I
cleaned it up a little and left it as a stub. So
you'll understand why I'm detached. I haven't put
much effort into it.
Yet, at another level, I'm not detached, because
there is a misuse of power going on here. Certain
criteria of notability, principally the
idea that a subject is notable if enough people
are interested in it, are being ignored by the
editors. At least one correspondent has pointed
this out in a quote from Wikipedia:
'The common theme in the notability guidelines is
that there must be verifiable, objective evidence
that the subject has received significant
attention from independent sources to support a
claim of notability.'
Only by wilfully downplaying that passage could
the editors succeed in deleting an entry on a
language with 467 members in the news group. But
there is a general confusion with the word
'notable', which I will come to. Behind this
unedifying backroom scuffle, there are issues
bigger than the meanings of words. Meanings are
being adjusted to reflect and favour decisions
about really big issues in scientific research;
meanings are the chips in a high-stakes poker
game involving prestige and a lot of money. This
is really why I'm writing this essay: so that people
will understand the game that is being played
out.

So What's Notable?

First
let's start small by discussing the word
'notable'. 'Notable' is really an adjective; its
natural place in the English language is waving a
flag in front of a noun. 'John is a notable
boxer, but not a notable chess player' makes
perfect sense. If faced with the question 'Is
John notable, yes or no?' we would only answer
with a qualifier 'Yes, as a boxer'. That's the
right answer. The hijacking of the English
language begins when we detach 'notable' from the
adjective position and insist on trying to make
it stand on its own. What then happens is that it
becomes open to covert manipulation whereby the
missing noun is implicitly defined by the user to
be the right and natural sense of 'notable'.
Something like that is going on in Wikipedia,
with the even more depressing observation that
even a covert sense supplied in their own web
pages is being downplayed. The question is 'Is
Shen notable?'. If you've followed the previous
paragraph, you'll realise that question is a dud.
The proper response is 'In what respect?'.
If we play the notability game with computer
languages we get all sorts of conflicting answers
until we realise the proper adjectival position
of 'notable'. Let's show how by beginning with
Clojure. 'Is Clojure notable?'. Well, surely yes,
because it has a large user group and a number of
commercial applications. But in another sense, it
is not. As Rich Hickey said in a thread, 'Clojure
is mostly unoriginal'. We restore our sense of
balance when we realise that as a commercial
development, in helping to introduce Lisp into
the market place, Clojure is very notable, but as
a development in language design much less so.
If we ask the notability question of Prolog, the
situation is precisely reversed. Prolog is not
much used as a commercial language, so it is not
as notable as Clojure in this respect, but it is
extremely notable as a step in the development of
programming languages. I hope by now everybody
reading this will want to walk away from the
question 'Is Prolog more notable than Clojure?'.
Yet the Wikipedia editors are still struggling
with the attempt to define 'notable' and the
result is just to make a decent English word the
hostage of a political game. 'Is Shen notable?';
with a news group of 467, one could say 'yes'. It
is certainly more notable in that respect than
Brainfuck and Malbolge, whose articles sit
unassaulted; not as notable as Clojure,
certainly, but then Clojure has a very large user
group. On that basis, Shen is notable. That
cannot be allowed. So the notability game is
changed so that 'notability' becomes defined in a
way that allows the Shen article to be deleted.
Now the editors talk about notability with
respect to academic citations. This is where the
game gets serious and it is worth looking at the
power issues.

How Shen Came About

To
understand these issues, and why Shen is mired in
this controversy, you have to understand how Shen
came into being. The genesis of Shen, and its
predecessor Qi, is inextricably mixed with the
development of the Internet and the rise of
information sharing. Shen in particular could not
exist without the Internet. The goal of Shen
was to
develop a next-generation functional language
that was implemented in a very small instruction
set and that could be ported to almost any
platform. But that goal would always remain
unrealised if Shen sources and Shen technology
were not freely shared with other programmers. So
that was done and, on the understanding that
implementations of my work would be coded
correctly, the sources were placed at the
disposal of the community. A lot of very good
programmers pitched in and Shen was ported to
Clojure, CL, Scheme, Python etc. Later people
wanted even more freedom, so the work was placed
under BSD.
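To give a flavour of what was being shared and
tested across these ports, here is a small
illustrative sketch of my own (it is not taken
from the book or the sources): a typed Shen
definition of the kind that should behave the
same way on any conforming port of the kernel.

   \\ illustrative only: a typed factorial in Shen
   (tc +)                 \\ switch the type checker on
   (define factorial
     {number --> number}  \\ declared type
     0 -> 1
     N -> (* N (factorial (- N 1))))
   (factorial 6)          \\ evaluates to 720
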
The technology, meaning the ideas, was published
in book form and, for Qi, made freely
readable as HTML. For financial reasons, in order
to fund the work, the Shen book was sold as a
hardcopy, but many people decided to buy it and so
the work circulated and a decent trade was
done. Many more probably decided to read the
HTML. But more importantly, the code itself was
downloaded many times. Qi had over 1500 downloads
before I stopped counting and Shen probably no
less. Over those 10 years, many people read the
texts and many people played with the code, and
many who did were fascinated and some chose to
stay.
Now the point of this is not to sell Shen, but to
point out that Shen was very thoroughly tested
and that many able people read the texts and the
mathematical proofs of correctness in the book
(including my good friend Dr Riha) and that
during that whole time - 10 years - no serious
error was found either in the proofs or in the
realisation of those ideas in code. Bugs were
found, certainly, but none of a foundational
nature. Shen emerged from a process of testing
and scrutiny more demanding than anything I could
have devised. Does this mean it must therefore be
free from serious error? No, errors are always
possible - even in proofs. But it does mean that
the chances of such an error are less.
This process of correction and validation is
completely C21; it is born out of the Internet
and social networking on a large scale. For this
reason it is orthogonal to the usual scientific
approach, which I will discuss next. This
traditional approach has some serious weaknesses
and is already under assault, but there are
powerful vested interests in keeping it going. The
clash of the old and the new is at play here, and
this is why the Wikipedia drama is being played
out.

The Traditional Scientific Model

The
traditional scientific model runs something like
this. A scientist has an original idea (so he
thinks) and so he writes it up and submits it to
the review process. His paper, which may or may
not be anonymous, is passed to a number of
anonymous referees, who possess expert knowledge,
are completely impartial and have no vested
interests or are capable of putting them aside.
They go through the paper in great detail, giving
it thorough scrutiny. The paper is finally either
accepted or rejected, and if it is accepted then
it joins that body of learning called scientific
knowledge.
This wonderful picture is at odds with the
classical economic model of human beings which
portrays them as essentially selfish beings who
try to maximise their utility. So are scientists
somehow exempt? Does the process of being a
scientist make one that noble creature, exempt
from the temptations of lesser beings?
The answer of course is 'no'. Scientists are
human beings and share the weaknesses of our
species. This is not to say that they are all
corrupt, but that the picture painted in the
opening paragraph is absurdly idealistic. Let's
see what really happens.
The first hurdle any editor has to overcome is
'Oh God, where will I find a reviewer?'.
Reviewing is a thankless task. It does not
contribute much to your CV and it takes a lot of
work. So the editor may find that the best person
for the job will not do it. In practice it often
means that the person chosen will be the person
closest to the topic who is prepared to take on
the job. This, of course, is consistent with being
less expert than the person writing the paper. In
fact I would say this might well be the rule.
Second, there is the question of time devoted to
the review. Papers are hard to read, and a
reviewer may want to get through the process as
quickly as he can. Hence if he encounters a bump,
something that is not clear, the instinct might
be to reject the paper and go on to the next task
on his job list. Very often papers are not clear;
this may be a function of ineptitude with the
English language, but it is frequently a function
of length restrictions whereby papers are
compressed to the very verge of intelligibility
that their language allows.
Frequently as a referee, when faced with some
convoluted piece of presentation, one is given
the choice of rejecting the paper (thereby
relieving one of the odious task of wading
through it all), being a nice guy (giving the
author the benefit of the doubt) or writing back
asking for clarification (thereby doubling one's
work load). No option is really attractive, but
the first leads to brilliant work often being
rejected (we'll come to that) and the second
leads to the existence of errors in published
work - and we'll come to that too.
Then there is the question of impartiality.
Refereeing is anonymous and there is no right of
reply. This gives enormous scope for abuse.
Though papers themselves may be anonymous, it is
often easy to guess the identity of the author
from the content of the paper. This gives ample
scope for the settling of scores. Does the author
fail to cite or to recognise work that the
referee is involved with? Does he approach the
problems with methods different from those
used by the
referee? Would publishing the paper allow an
approach to arise that might put in question the
ideas and approach of the referee? Would allowing
these ideas to take hold threaten the lifeblood
of grant money that the referee and his
co-workers depend on? All these questions may
have positive answers, and it would be naive to
believe that referees are immune to these
considerations when anonymity protects their
decisions.

The Traditional Model and the Paradigm Shift

The
traditional model often comes unstuck when papers
are submitted that represent paradigm shifts or
conceptual leaps in their field. These papers may
put into question the approaches and ideas of
the scientific community. Hence the tension
between objectivity and self-interest is often
played out to the detriment of scientific
progress. Sometimes the problem is not
professional self-interest; a brilliant paper may
reference material quite unfamiliar to the normal
scientist working in the field. The lines are
dramatically redrawn. Scanning the paper for
familiar landmarks, the reviewer does not find
them and so the paper is rejected. The result is
often professional and emotional damage to the
brilliant scientist.
History is so full of these examples that the
tormented genius struggling for recognition has
become a cliche, which is sad, because it is
really a tragedy. And it is a tragedy that is
replayed in every generation. Jenner's initial
paper on smallpox vaccination was rejected,
Semmelweis was crucified for suggesting that
doctors were responsible for spreading puerperal
fever, Cantor was hounded by Kronecker into a
mental asylum for his work on infinity. And it
has got worse, not better. Since citation
circles and publish-or-perish have grown up in
the wake of research assessment exercises and
'centres of excellence', the pressure to conform
is even stronger. Many potentially brilliant and
innovative young academics are aware of the
perils of non-conformity and subsequent
rejection, so the result is self-censorship and
the proliferation of dull conformist work.

The Traditional Model and Getting It Wrong

It is
sometimes argued that the traditional model,
although making it hard for brilliant and
innovative minds, does at least filter out
mistakes. Better to throw out a few really good
ideas, if that is the price for ensuring errors
are not published. But actually the traditional
model is not even good at that.
The problem, as has been pointed out, is that
referees are under time pressure and struggling
with dense material. Proofs are often shortened
to sketches and much may be assumed. In such a
situation, referees may err on the side of
charity and assume that the author has got it
right. Since there are only perhaps three
referees, this is not unusual. The paper is
published and read, again possibly by only a very
few people, perhaps fewer than a dozen. The
mistake goes uncorrected and becomes part of
science.
Mistakes like this are, I believe, more common
than one might think. They are most likely to
occur in papers published in obscure conferences
read by a few cognoscenti. I have come across
them myself twice. The first time arose within
the Qualitative Spatial Reasoning Group at Leeds.
A somewhat abashed conversation with one of the
research assistants involved in a SERC grant
revealed that the group had unearthed a
contradiction in their published work to do with
the pointwise connection of regions of space. It
was a beautiful paradox, of the kind Zeno would
have loved, but they were not enamoured to own
it. Whether the contradiction was published I do
not know. I never heard what the resolution was.
The second experience was when I was researching
my own work and wanted an answer to a question in
the type theory of a specific extended lambda calculus. I
had formalised an account, but realised it was
not quite right because there was a deep
counterinstance to one of the rules of my system.
Though I had corrected the rule, I wanted more
assurance, so I wrote to Henk Barendregt in
Holland for insight. He in turn referred me to an
authority at Imperial College.
I wrote to the authority and offered my solution,
and he swiftly (as he believed) corrected me and
sent me the right version of the rule. It was, he
told me, the only substantially correct analysis
of the question I had asked him and he had
published it 10 years previously.
Except it wasn't right. It was identical to the
incorrect rule I had started with. He was
confronted with the counterinstance and his
weekend was probably ruined. He wrote back once
and admitted the error but had nothing to offer.
This sparked a search for a correctness proof of
my solution which I eventually found and the
result is in The Book of Shen. The system has
worked like clockwork for 10 years, but I still
worry that somewhere there is a mistake.

The Eyeballs Principle and Why Science Works

I think
science and scientific papers contain many more
mistakes than one might commonly suppose. The
traditional scientific model for vetting work is
really quite inefficient. Yet oddly, despite
these weaknesses, science manages to work quite
well. The reason why is to do with Eric S.
Raymond's famous dictum: 'Given enough eyeballs,
all bugs are shallow.' Mistakes generally go undetected
when papers are not widely read or the ideas are
not implemented and put into use. Such work is
not notable (in the citation sense used by
Wikipedia). Notable work is more likely to be
correct and since science depends more on notable
work, it proceeds forward unimpeded by the hidden
mistakes in published non-notable science.
When one realises this, one begins to see that
what holds science together is not the
anachronistic and rather creaking C19 peer review
process - which frequently excludes brilliant
work - but the social network of reading minds
that can follow the publication. In other words,
ironically, it is the same hidden review process
that Shen has followed, the gauntlet of users and
readers, that gives science much of its
integrity. I say 'ironically' because it is this
sort of review process, which depends on large
community interest and Internet exchange, that
is right now being discounted as irrelevant by
the Wikipedia editors.

Priests of Science and Rogue Scholars

Given
the arbitrary, inefficient and conformist model
of scientific publication that we have inherited
from the last century and before, one could ask
with justice, why do young scientists submit to
it? The short answer is that scientists have to
eat like everybody else. Hence they submit to the
whims of referees and bite the bullet in order to
climb the ranks of their profession. Doing this
requires considerable sacrifice and work, but if
successful they are rewarded by sizable chunks of
public money for research (suitably approved
research) and also power. They in turn acquire
the power to determine what does and does not get
published and hence what counts as serious
science. They become priests of science.
Priesthoods with respect to knowledge are nothing
new. The early Christian church quickly
interposed itself between God, Jesus and the
laity and reserved the interpretation of the
Bible for itself. The doctors of the medieval
School of Medicine in Paris persecuted the herbal
healers of the day in order to set up a monopoly
in medicine and slaughtered their patients.
Brahmins control the liturgy of Hinduism and
psychiatrists define sanity. In other words,
knowledge is power and so is the power to label
something as knowledge.
It was this model and lifestyle I rejected 15
years ago and wrote about in Why I am Not a
Professor. The priesthood I observed seemed to me
to be corrupt and complacent and wasteful of
public money. Above all it was not fun. Hence I
left and ceased to be an academic and became a
scholar. I simply published my results for fun
and let people decide whether they wanted to read
my work and use it. The response surprised me;
people were fascinated and tried the work and
read the text. The main complaint from 2005-2008
was that they wanted a different license - not
GPL. We evolved eventually to BSD.
The path I was following was essentially one of
Open Science and I gradually became aware that
more traditional members of the academic
community were not happy with Qi or Shen. I began
to realise that the following Qi and Shen
had accumulated (without submitting to
traditional peer review), coming from a rogue
scholar who had rejected academia, was deeply
upsetting. Qi and Shen had eclipsed some publicly funded
work.
Moreover if such a process did catch on, then the
role of the priests of science in determining
knowledge might be overturned. Even worse, if the
Open Science community overturned the position of
the priests of science, the priests of science
would not only lose the power to determine what
counted as knowledge, they might lose the
absolute power to control how science money was
spent. The cherry on top, the pièce de
résistance, was that Shen technology and ideas
borrowed from work in high-performance reasoning,
a rather different idea pool than the one that
many priests in the field were working with, and
it delivered to the working programmer the sort
of power to shape his ideas that was generally
reserved for the priesthood. All this resentment
more or less simmered in the background until the
Wikipedia debate brought it all out.

So Where Do We Go From Here?

This
question has a local sense, in relation to the
drama being played out with Shen on Wikipedia, and
a more global sense to do with how science is
funded and researched. The local sense is fairly
unimportant. As I said, I'm detached from the Shen
stub on Wikipedia and I've got more fun things to
do than hammer the rudiments of Shen into
neolithic referees hooked on Haskell. Being a
scholar is about doing things you like. Right now
I'm editing a course on hermetic philosophy.
The much more important question is: where do
science research and computing research go from
here? Because like it or not, Open Science of the
kind that Shen represents is not going to go
away. Open Science will only get bigger as time
rolls on and, since it is not subject to
censorship, we may well see the appearance of
brilliant as well as batty ideas from this
source. In turn, if Open Science does become
established, it will start to demand some say in
the way that science money is doled out. The sad
thing about Wikipedia is that it has set itself
against the very forces that made Wikipedia
possible and chained itself to an older model of
scientific review that is under attack.