Uploading the original PDFs to a publicly accessible website would most likely be a copyright violation, so we wouldn't want to do that anyway. However, the originals need to be available somehow (perhaps in a restricted sense) so that people can verify the OCR against them when marking up (I'm assuming this is going onto Wikisource), as an error in a formula would be very hard for a layman to spot.

Another question is what to do about diagrams (assuming that there are some). I would imagine that if the RS claims copyright of the scans, we can't just extract them and use them. Simple ones I imagine we can (and probably should) convert to SVG, but more detailed ones could be tricky.

James

On 25/09/06, geni <geniice@gmail.com> wrote:
On 9/25/06, Rich <rich.rr@gmail.com> wrote:
> Great, I don't mind helping, when I know where and what. An obvious
> suggestion, but do we want to have a wiki to 1) coordinate
> downloads/scans/error checking, 2) upload the PDFs to, and 3) store the
> OCRs as a base for error checking?


If you are going to go to the trouble of running it through OCR, you
might as well upload it in text form rather than messing around with
PDFs.
--
geni
_______________________________________________
Wikimedia UK mailing list
wikimediauk-l@wikimedia.org
http://meta.wikimedia.org/wiki/Wikimedia_UK
http://mail.wikimedia.org/mailman/listinfo/wikimediauk-l