When web democracy fails – Wikipedia and Musipedia


[addendum – I’m delighted to say that Dr Typke has replied to the blog post personally, with a very interesting point about the development of Musipedia – click on ‘comments’ at the bottom of the post to see it]

A lot of the posts on this blog have been somewhat one-sided, perhaps even evangelical. This is because I believe that there are serious strategic benefits for universities and other large organisations in adopting ‘free’ web-based interactive services, rather than trying to source all their IT needs in-house.

But today I’d like to take a different angle on a much-rehearsed debate – the idea of democratically-collated knowledge, most famously exemplified by Wikipedia.

The Wikipedia arguments

I meet academics all the time who are not regular Wikipedia users, and many of them are critical of it because the very concept sounds absurd. To publish an article on which people rely for research, and to make it editable by anyone in the world seems anathema to HE’s methodology of peer review and serious scholarship. But this point of view misses two important characteristics of Wikipedia – firstly, it is not a primary source. Its policy states that entries should cite only verifiable and reliable primary and secondary sources. Secondly, to criticise it based on the potential for malicious damage is to misunderstand the basically altruistic nature of humans; the majority of people seem to enjoy sharing knowledge. Wilful sabotage takes place, of course, on Wikipedia as in physical textbooks (remember those rude pencil drawings in the margin of your teenage classroom copy of Hamlet?), and, this being the web, the online version is instantly published worldwide. But there is a critical mass of opinion that will prevent inaccuracy; try sabotaging an important Wikipedia page and you’ll see what I mean – it will revert to the accurate version within minutes, as a member of the community swoops in to heal the wound.

There have been attempts to compare online and print encyclopedias, notably the Nature survey earlier this year, and Wikipedia comes out fighting in these cases. But, just like a regular encyclopedia, it is not a one-stop-shop for research – it’s a starting point to get an overview of a subject, leading hopefully to investigation of the reliable sources it cites. Like all academics, I tell my students that Wikipedia is not a source in itself and should not be cited in research (indeed, Wikipedia’s own policy makes this clear). But unlike some colleagues, I do encourage students to use it in order to identify the reliable sources on which the article is based. Wikipedia works. It’s not the fount of all human knowledge, but it does link to it.

These arguments have been well rehearsed in the blogosphere, in the mainstream press, and even in scholarly research. But today, in the interests of balance, I want to discuss a site that falls down precisely because of its democratic, participatory online approach – Musipedia.

What is Musipedia?

It’s a website, founded by Dr Rainer Typke, that attempts to document and make searchable melodic themes from copyright and non-copyright musical works, mainly from Western/tonal music, covering the classical repertoire, popular song and jazz. Here’s the ‘About’ page from the site. Its philosophy is inspired by Wikipedia (although it is a separate organisation) in that it asks the worldwide community of musicians, musicologists and music-lovers to contribute melodies through various web-based interfaces, and then provides mechanisms for visitors to search its database for melodies. The site went ‘democratic’ in 2004 by adding any-user contributions and edits.

And, speaking as a music specialist, it’s very difficult to use. Entries are unreliable, the database is patchy (it includes some really obscure folksongs and omits some massive international pop hits), and it is musicologically underpowered in several ways, making no reference to harmonic context or bar placement, and suffering from an under-developed rhythmic engine (made worse by some contributor entries that contain no rhythmic information). This is not to criticise Typke – he is an eminent published academic with extensive knowledge of music information retrieval systems and some outstanding primary research. But I suggest that it is Musipedia’s Wikipedia-like contributor system that is its downfall.

The idea of a ‘melody dictionary’ is not new. Barlow and Morgenstern published their ‘Dictionary of Musical Themes’ in the late 1940s, and their database (of 10,000 Western classical themes) is now available online. It is much more reliable than Musipedia, perhaps because of its non-collaborative nature; it was researched by individuals who had a clear overview of a particular musical canon and, more importantly, a particular level of musical literacy. It’s not flawless – like Musipedia, it omits harmonic context and rhythmic placement – but as a source of monophonic musical lines it’s perfectly usable. Personally I use it in songwriting dispute cases when I’m acting as a consultant to copyright lawyers – it’s a great way of calculating the statistical likelihood of particular pitch choices. And the updated/online version improves hugely on the original print publication because there is a playable MIDI file of each entry.

Musipedia, I suggest, is hampered because there is no measure of the musical knowledge of its contributors, and no quality assurance mechanism to ensure that entries are accurate (plus inevitable legal hindrances related to online music publishing and copyright). But surely one could say that Wikipedia suffers from the same lack of contributor-screening? Certainly, but in the latter case, there are enough suitably-informed people who can spot an error in an instant; the majority of those with an interest in a particular subject can (and do) error-trap Wikipedia articles. Musipedia is different; making contributions requires a certain level of subject-specific skill (aural pitch analysis, music reading etc) beyond the generic research skills of cross-referencing needed to contribute to Wikipedia. Musipedia’s input interface cannot differentiate between an experienced musician and a tone-deaf music fan, and the same problem applies to members of the online community who might error-trap entries by the latter.

For Wikipedians, a democratic approach has achieved a stable welfare state; but I suggest Barlow and Morgenstern’s benign autocracy is more successful than Musipedia’s hippy commune, despite Typke’s excellent architectural drawings for the squat. Hmmm – might have tortured that metaphor far enough now.

Dave's Big Society - surgeons or pilots need not apply

There’s a political parallel here in the UK; whatever one thinks of David Cameron’s Big Society arguments, some roles require specialist expertise and can’t be democratised. There are arguments in favour of self-appointed/untrained community religious leaders or even educators, but I’m sure none of us would want to be operated on by a community surgeon, or be a passenger with a community airline pilot. But I digress.

So we’re back to the gatekeepers debate. Wikipedia shows us that democratisation of factual knowledge seems to work – there are enough people in the know (who care enough) to outnumber the saboteurs, the ‘haters’ and the mis-informed. And the ignorant (I use the term in its non-pejorative sense) will mostly stay away from editing Wikipedia articles about which they have no knowledge. There is little incentive for anyone to make malicious edits to, say, an article about a DNA polymerase, and thus it is more likely that such an entry will be accurate because it will, by its nature, attract interested experts as editors.

Music is different. Everybody loves it, and everybody has an opinion about it. But to perform, compose, notate or analyse music requires a set of learned skills that are diluted, not multiplied, by mass democratic knowledge. And if we have no democratically effective mechanism of differentiating between accurate and inaccurate entries, the database’s integrity will suffer.

So I conclude, tentatively, that applying democratic principles to factual knowledge seems to be a recipe for accuracy. Applying them to technically challenging skills such as melody transcription doesn’t seem to bring the same benefits. It’s early days for Musipedia, and I really hope it succeeds, but its Wikipedia-like strategy may just be its downfall.

Which is maybe why 99% of the songs on MySpace aren’t so great. Sometimes you need gatekeepers.

Comments

  1. Joe,

    thanks for sharing thoughts on Musipedia!

    You mention the numerous obscure folk music entries in Musipedia as an example for how web democracy leads to questionable content choices. Almost all of the folk music entries were entered by me shortly after putting Musipedia online, just to have some content. I felt the need to jump-start the collection (to have a meaningful number of items to search, which was important for search algorithm development). So, the obscurity of a lot of entries should not be attributed to failing web democracy, but to my choices, which were influenced by what I know about copyright law.

    I decided not to add the Barlow and Morgenstern collection because the book is still under copyright, and I am not sure how risky it would be for me to do what the multimedialibrary.com folks did, that is, to publish the collection from the book in a different format. Instead, I went for content which was already online, so I would only cache it and make it searchable pretty much like any content-based web search engine does with web content.

    Instead of blaming the poor taste of the masses, I would therefore, if I had to make any conclusion, rather say that copyright law in its current form hinders innovation. It was copyright law which prevented me from making an arguably better collection searchable in new ways.

    – Rainer

  2. Thanks Rainer for replying personally. My point was not a qualitative one about the poor tastes of contributors – more a technical one about the musical skills necessary for users to enter accurate melodic data. In fact, the folk material is among the best in the database (in terms of rhythmic accuracy particularly). By contrast, many of the pop examples (presumably entered by contributors) often have no rhythmic values and some pitch inaccuracies. I like the option to search online MIDI files (of which there are many accurate examples web-wide) and hope that the search algorithm develops further ways to interrogate these (presumably fully polyphonic) files to extra melodic themes – no mean feat on a technical level, I imagine.
    Thanks again for giving the world Musipedia – I hope it continues to grow and develop.

  3. [typo]
    extra = extract

  4. The rhythmless pop songs are without rhythm for technical reasons. They were imported from a collection whose owner had decided against representing rhythm. I imported them anyways – my reasoning was that pop songs are woefully underrepresented, and the pitches are already good enough for identifying tunes when using Parsons Code.

    Anyways, I got some inspiration from your blog post for the following improvements, which are now incorporated in Musipedia:

    1. Search by Parsons Code now also works properly for very short search queries (that is, between 3 and 5 characters long). Such short queries used to deliver meaningless results – the assumption was that nobody cares because such queries have distance zero to way too many items anyways, so any desired matches would be crowded out by countless other matches. However, this assumption becomes wrong once one narrows down the search by using keywords.

    2. Folk music and rhythmless entries are now ranked lower when searching based on Parsons Code, if the distance is zero, and if the entered Parsons Code query is very short. This makes it less likely for them to “crowd out” search results which are possibly more useful.
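The two changes above can be illustrated with a small sketch. It is not Musipedia's code: the use of plain Levenshtein distance, the comparison against only the opening of each stored contour, the `Entry` class and its `low_priority` flag are all assumptions made for illustration. Only the demotion rule itself (distance zero, very short query, folk or rhythmless entry) comes from the description.

```python
from dataclasses import dataclass

# The description treats queries of 3 to 5 characters as "very short".
SHORT_QUERY_MAX = 5


@dataclass
class Entry:
    title: str
    contour: str        # contour string of the stored melody (letters U, D, R)
    low_priority: bool  # hypothetical flag: folk or rhythmless entry


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings (an assumed metric)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def rank(query, entries):
    """Sort entries by distance; demote low-priority exact hits on short queries."""
    query = query.upper()
    is_short = len(query) <= SHORT_QUERY_MAX

    def sort_key(entry):
        # Assumption: compare the query with the opening of the stored contour.
        opening = entry.contour.upper()[:len(query)]
        distance = edit_distance(query, opening)
        demoted = is_short and distance == 0 and entry.low_priority
        return (distance, demoted)  # False sorts before True

    return sorted(entries, key=sort_key)
```

With a short query such as `RRD`, a folk entry and a symphony that both match exactly would tie on distance, and the flag then places the symphony first.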

    And since I was already fiddling with Musipedia, I added a new feature which is not inspired by your blog entry but which I find very cool anyways: Every result item is now automatically linked to ten videos on Youtube. Most of the time, at least one of these ten videos actually contains the piece of music in question (at least for the classical pieces, this seems to work pretty well).
    So, now one has a much better chance to check whether a piece was identified correctly by listening. And there is also some nice scope for serendipity.

    Yet another new feature is that one now can jump right to search results by entering the URL in a special syntax:

    musipedia.org/[/keywords]

    So, for instance, to find Beethoven’s symphony 5, one could enter:

    musipedia.org/RRDURRD

    or, to find fewer undesired matches:

    musipedia.org/RRDURRD/beethoven

    Or, to find Mozart’s piano quartet KV 478:

    musipedia.org/drudr

    or, if one remembers not just the melody but also that it is by Mozart:

    musipedia.org/drudr/mozart

  5. forgot to mention the meaning of the letters U, D, and R.

    U-Up

    D-Down

    R-Repeat

    Details: http://www.musipedia.org/pc.0.html
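As a rough illustration of the U/D/R letters, here is a minimal sketch that derives the contour from a list of pitches. The function name and the use of MIDI note numbers are assumptions for the example, and the pitches are my own transcription of the opening of Beethoven's fifth symphony. The sketch emits only the letters, as in the URL examples above; some descriptions of Parsons Code also prefix a marker for the first note, which is omitted here.

```python
def parsons_code(pitches):
    """Return the U/D/R contour for a sequence of pitches (e.g. MIDI note numbers)."""
    steps = []
    # Compare each note with the one before it.
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            steps.append("U")   # Up
        elif current < previous:
            steps.append("D")   # Down
        else:
            steps.append("R")   # Repeat
    return "".join(steps)


# Opening motif: G G G E-flat, F F F D (as MIDI note numbers)
beethoven_5 = [67, 67, 67, 63, 65, 65, 65, 62]
print(parsons_code(beethoven_5))  # prints RRDURRD
```

Because only the direction of each step is kept, the exact interval sizes and all rhythm are discarded, which is why a contour can be entered without any music-reading skill.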

  6. now I am also noticing that your blog gobbles up stuff that is entered between pointy brackets.

    The syntax is supposed to be:

    musipedia.org/Contour_in_UDR_notation[/keywords]

    (with the /keywords an optional addition)
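The URL syntax described above can be assembled with a few lines of code. This is only a sketch: the function name is hypothetical, the validation of the contour letters and the percent-encoding of keywords are my own assumptions, and no scheme is added because the examples above are written without one.

```python
from urllib.parse import quote


def musipedia_search_url(contour, keywords=None):
    """Build a search address of the form musipedia.org/CONTOUR[/keywords]."""
    contour = contour.strip()
    # Assumption: accept only the letters U, D and R, in either case,
    # since the examples above use both RRDURRD and drudr.
    if not contour or set(contour.upper()) - set("UDR"):
        raise ValueError("contour must consist of the letters U, D and R")
    url = "musipedia.org/" + contour
    if keywords:
        # Assumption: percent-encode keywords so spaces survive in a URL.
        url += "/" + quote(keywords)
    return url


print(musipedia_search_url("RRDURRD", "beethoven"))  # musipedia.org/RRDURRD/beethoven
print(musipedia_search_url("drudr"))                 # musipedia.org/drudr
```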
