---------- Forwarded message ----------
From: jamesmikedupont(a)googlemail.com <jamesmikedupont(a)googlemail.com>
Date: Sun, Oct 18, 2009 at 3:39 AM
Subject: Re: [Foundation-l] Wikipedia meets git
To: Wikimedia Foundation Mailing List <foundation-l(a)lists.wikimedia.org>
See my new blog post on word-level blaming for Wikipedia via git and Perl:
http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.h…
The next step is ready:
1. I have a single script that pulls a given article and checks its
revisions into git.
It is not perfect, but it works.
http://bazaar.launchpad.net/~jamesmikedupont/+junk/wikiatransfer/revision/8
You run it like this, from inside a git repo:
perl GetRevisions.pl "Article_Name"
git blame Article_Name/Article.xml
git push origin master
The code that splits up the lines is in Process File; it turns every
space into a line break, so each word ends up on its own line.
Since git blame works per line, that gives us a word-level blame.
if ($insidetext)
{
    ## replace each space with a backslash and a newline: one word per line
    s/(\ )/\\\n/g;
    print OUT $_;
}
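To show why one word per line is enough for a word-level blame, here is a rough Python sketch (not part of GetRevisions.pl). The names split_words and changed_lines and the two sample sentences are made up for illustration, and difflib only stands in for the line-based diff that git does internally:

```python
import difflib


def split_words(text: str) -> str:
    # Mirror the Perl substitution s/(\ )/\\\n/g: every space becomes
    # a backslash followed by a newline, so each word sits on its own line.
    return text.replace(" ", "\\\n")


def changed_lines(old: str, new: str) -> list[str]:
    # Line-based diff of two split revisions (hypothetical helper).
    diff = difflib.unified_diff(
        split_words(old).split("\n"),
        split_words(new).split("\n"),
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    )
    # Drop the two file-header lines, then keep only added/removed lines.
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    old_rev = "Kosovo declared independence in 2008"
    new_rev = "Kosovo declared its independence in 2008"
    for line in changed_lines(old_rev, new_rev):
        print(line)
```

Because every word is its own line, adding one word shows up as a single added line, so a per-line tool can attribute that word to the revision that introduced it.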
The article is here:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_d…
Here are the blame results:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_d…
The problem is that GitHub does not like this much processor power
being used and kills the process; you can run git blame locally instead.
Now we have a tool to easily create a repository from Wikipedia, or
from any other export-enabled MediaWiki.
mike
_______________________________________________
foundation-l mailing list
foundation-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/foundation-l