New subject: Dream a little...

15 Oct 2006

My own thoughts on this, which I also expressed on the meta page: 

1. There is plenty of material out there that is already in the public domain.
Part of the problem is that it can take forever and a day to digitize it all.
In the case of books and magazines, digitization often involves destroying the
hard copies in the process. There are, however, specialized scanners that can
do the work without ruining the books themselves. These are expensive (about
US $30,000 a machine). Ten machines, strategically located around the world,
along with student staff to operate them around the clock, could help to
preserve these texts and store them for posterity. Additional people (paid and
volunteer) will be needed to OCR, proof, and hyperlink the material to ensure
that it doesn't get lost in a glut of material (I have visions of the final
scene of Raiders of the Lost Ark, when the Ark was finally stored in some
crate in an army warehouse).

2. While OCR capacities exist for some languages, they do not exist for
others, where the material is much more likely to get lost. Manuscripts in
Tibetan monasteries, for example, can be scanned but not easily OCRed. To make
this information available, developers should be paid to create adequate OCR
tools for these languages. Rough cost: $5 million.

3. Music has been recorded around the world for well over a century, yet many
of the early recordings are being lost, especially those on wax cylinders and
porcelain records. Preservation includes locating, identifying, and
remastering. People must be trained to do this. Rough cost: $35 million over
two years.

4. This is true of old films as well. Celluloid copies are extremely rare and
extremely flammable. Restoration is exceedingly costly. For example,
[[Theda Bara]] is a well-known vamp of early Hollywood (the word "vamp" was
first used to describe her), yet none of her films survive, and they were made
less than a hundred years ago. Films are international and include important
historic documents such as newsreels, and they are being lost every day.
Today, most preservation work is being done by major studios, since it is so
costly. In other words, they are taking important works now in the public
domain, restoring them, and contending that the restoration is an original
work, which means another hundred years at least until some Vigo or Charlie
Chaplin films enter the public domain ... and little attention is being paid
to newsreels of events like the Russian revolution, World War I, etc. As with
music, people should be offered scholarships to learn the art of film
restoration and work on these projects. Until this happens, the work can be
outsourced. Rough cost: $50 million.

5. To ensure all of this remains accessible, we will need a LOT of servers
and bandwidth. Initial outlay: $10 million.

Total: $100 million, spent over 5 years. Costs include staffing, identifying
prospective targets, transportation, overhead, etc. Just coordinating a
project of this scope will take a lot of effort.
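
For anyone who wants to check the arithmetic, here is a rough tally of the
figures above. It is only a sketch: the category labels and function names are
mine, and point 1 has no rough cost attached, so the only number computed for
it is the scanner hardware (ten machines at about $30,000 each). I am assuming
that amount would come out of the overhead already included in the other
figures.

```python
# Rough costs in millions of US dollars, as quoted in points 2 to 5.
# The category labels are illustrative, not official budget lines.
budget_musd = {
    "OCR tools for under-served languages": 5,
    "Early sound recordings": 35,
    "Film restoration": 50,
    "Servers and bandwidth": 10,
}

# Point 1 quotes only a unit price and a machine count.
SCANNER_UNIT_COST_USD = 30_000
SCANNER_COUNT = 10


def total_musd(items):
    """Sum the itemized estimates, in millions of US dollars."""
    return sum(items.values())


def scanner_hardware_usd(unit_cost=SCANNER_UNIT_COST_USD, count=SCANNER_COUNT):
    """Hardware cost of the book scanners in US dollars (staffing excluded)."""
    return unit_cost * count


if __name__ == "__main__":
    print(f"Itemized total: ${total_musd(budget_musd)} million")
    print(f"Scanner hardware: ${scanner_hardware_usd():,}")
```

The four itemized estimates account for the full $100 million on their own,
so the scanners and their operators would have to be absorbed elsewhere.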

And there is competition too. As an example,
http://historical.library.cornell.edu/IWP/
is a collection of International Women's Journals, some of which are very
important historically. They are already scanned, but they are inaccessible
because a private company has (rightfully or wrongfully) copyrighted the
scans.

Lots to be done. You will see how quickly $100 million can be spent. 

Danny

In a message dated 10/15/2006 11:27:57 AM Eastern Daylight Time,  
jwales(a)wikia.com writes:

I would like to gather from the community some examples of works you
would like to see made free, works that we are not doing a good job of
generating free replacements for, works that could in theory be
purchased and freed.

Dream big. Imagine there existed a budget of $100 million to purchase
copyrights to be made available under a free license. What would you
like to see purchased and released under a free license?

Photo libraries? textbooks? newspaper archives? Be bold, be specific,
be general, brainstorm, have fun with it.

I was recently asked this question by someone who is potentially in a
position to make this happen, and he wanted to know what we need, what
we dream of, that we can't accomplish on our own, or that we would
expect to take a long time to accomplish on our own.

--Jimbo
_______________________________________________
Commons-l  mailing  list
Commons-l(a)wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/commons-l
