A mission: providing free knowledge. Free as in?
I have given myself the mission of updating the Wikimedia Foundation bylaws.
And a couple of days ago, I had a thought. The sort of thought that sends a chill down your back, creeping up to your neck and twisting your mind with fear.
I was looking at the mission statement.
"Wikimedia Foundation is dedicated to the development and maintenance of online free, open content encyclopedias (...) and other collections of documents, information, and other informational databases in all the languages of the world that will be distributed free of charge to the public under a free documentation license such as the Free Documentation License (...). The goals of the foundation are to encourage the further growth and development of open content, social software WikiWiki-based projects and to provide the full contents of those projects to the public free of charge."
Free content is a matter of liberty, not price
This statement insists *very much* on providing content free of charge. It insists much less on free as in freedom, which is what unites all Wikipedians.
See the Free Software Definition. Transposed to Wikimedia projects, we get something like:
Free content is a matter of liberty, not price. To understand the concept, you should think of free as in free speech, not as in free beer.
Free content is a matter of the users' freedom to run, copy, distribute, study, change and improve the content. More precisely, it refers to four kinds of freedom, for the users of the content:
The freedom to read the content, for any purpose (freedom 0).
The freedom to get the knowledge from the content, and adapt it to your needs (freedom 1).
The freedom to redistribute copies so you can help your neighbor (freedom 2).
The freedom to improve the content, and release your improvements to the public, so that the whole community benefits (freedom 3).
The full content should always stay fully and easily accessible.
Access to the source is a precondition for this. For Wikipedia and other Wikimedia projects, access to the source is essentially access to the webpage itself. The content may be accessible at various scales, from the single webpage (select the text with your mouse and copy/paste the content of the article) to the full database.
Access to the webpage itself is cool. It serves the journalist completing a news article, or the student copying material for homework.
But to really fulfill our mission, the full content should always stay fully and easily accessible. Making a DVD or a book is only really feasible with quick and easy access to the entire database. If the person interested in making a DVD has to retrieve the pages one by one, reuse of the content, whilst still possible, will in reality become a real chore.
If at some point access to the full content is restricted, either by the addition of a technical barrier or by the addition of a financial barrier, then the Foundation will be failing the shared spirit that unites the thousands of contributors to the website.
Could that happen?
Datafeed for a fee. Dumps for free until...?
The whole content can be retrieved by only two means: spidering the site itself (and thus consuming bandwidth, which has a cost for the Foundation), or using the dumps.
As of today, dumps are available for free. They are done roughly once a month (though when a language dump fails, it is usually necessary to wait for the next month). Using a dump is not so easy...
In December 2005, Tim Starling made an HTML dump, which is much easier to use. No update has been provided since, though, and I have not heard of any further HTML dump being planned.
Spidering the site is discouraged. Sites are sometimes blocked for doing so. It is recommended to ask for a datafeed agreement. Live feeds are available that provide more up-to-date content and eliminate the requirement to install new dumps. This service involves a financial arrangement, as it requires developer time and the use of the servers.
What I fear could happen in the future is that sites mirroring our content get blocked more and more frequently, creating a strong incentive to pay for the datafeed, and hence setting up a financial barrier to reuse.
Of course, the dumps are free... but if the argument was made that the datafeed carries a fee to offset the bandwidth use involved and the salary of the developers setting it up... why would the argument not also be made that the dumps require a payment? After all, making the dumps also takes several hours of developer time. The developers are now employees of the Foundation (so arguably, making the dumps requires Foundation money through the payroll).
As soon as a payment is asked, there is a financial barrier to reuse. Whatever the amount. And the barrier will be higher for those with little money... who may be precisely those who need the content most.
I already hear the argument... "but we'll ask for money only from those sites with money. The commercial ones. Non-profit websites will get it for free if they ask". This is already what we are doing with the spidering (we do not block non-profit websites, we block commercial mirrors).
The problem I have with this argument is simply... that it does not fit with the licence we chose. Our licence allows reuse without restriction, including for commercial purposes.
Low update frequency of the dumps
Naturally, it may be that no money is ever asked for the dumps, in which case my whole argument falls. But there are other means to limit reuse. For example, instead of doing the dumps once a month... they might get done once every 6 months, or once a year. The outdated content would be free. The updated content might be available through a datafeed... against a payment.
Another possible direction to consider...
A dynamic website with no dumps for the stable version?
There are frequent discussions about stable versions. One step which might be perceived as a "good idea" is to set up an independent website, with the "qualified" version on it, whilst Wikipedia stays the live website with open editing. Thousands of editors work on this "stable" version with the license in mind. Technically speaking, it may make no sense whatsoever to set up a wiki to host the stable version. Instead, the website might be a dynamic one, fetching and distributing the pages automatically. In that case, the source code is no longer visible. The dumps are still available, but only for the live version of Wikipedia, not for the "stable" version. Ultimately, the "stable" version is a fork with no visible source code, no dumps, and no spidering possible. Pages may only be copied one by one... But editors are still working on what they believe is a free project.
The new website is free of charge, but not free as in freedom. A liberty was lost in the process. The text is still under the GFDL, but to make a DVD of this, one had better get up early. Or negotiate with the Foundation. Which implies... the content is actually under Foundation control. Not free.
What about an update of the Foundation mission?
Let us go back to the mission statement.
"Wikimedia Foundation is dedicated to the development and maintenance of online free, open content encyclopedias (...) and other collections of documents, information, and other informational databases in all the languages of the world that will be distributed free of charge to the public under a free documentation license such as (...). The goals of the foundation are to encourage the further growth and development of open content, social software WikiWiki-based projects and to provide the full contents of those projects to the public free of charge."
That mission statement guarantees that the content will stay accessible free of charge to the public. Nothing else.
Today is today. Tomorrow is tomorrow. Do you know who will be running the Foundation next year? In 2 years? In 10 years?