[Foundation-l] Wikipedia meets git

Gregory Maxwell gmaxwell at gmail.com
Thu Oct 15 20:16:58 UTC 2009


On Thu, Oct 15, 2009 at 2:55 PM, jamesmikedupont at googlemail.com
<jamesmikedupont at googlemail.com> wrote:
> Hello,
> I have gotten the Wikipedia article for Kosovo into git.
> It is fast, distributed, highly compressed, redundant, branchable and usable.
>
> The blame function will show you who edited what version.
>
> Here is the blame view of the up-to-date Kosovo article:
> http://github.com/h4ck3rm1k3/KosovoWikipedia/blame/master/Wiki/Kosovo/article.xml
> git
>
> I have checked in all the code to produce this here :
> https://code.launchpad.net/~jamesmikedupont/+junk/wikiatransfer

It is cool that you get the complete history.

But it's a bit uncool that it's about 14 MB when the article is
100 KB; understandable, given that the expanded, uncompressed history
is about 337 MB...

I repacked the repository using:

    git-pack-objects --progress --window=40000 --depth=40000
    --compression=9 --all --delta-base-offset

(git-repack doesn't really repack; unless forced, it reuses the
existing deltas)

And now I have a 4168915-byte pack (about 4 MB, timestamped
2009-10-15 16:12),
KosovoWikipedia-ae859bbf9446ddcde4b17e09c99c28dcf594da89.pack, which
is more reasonable.
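
For anyone who wants to try the effect on a throwaway repository, here
is a minimal sketch. It does not reproduce the git-pack-objects
invocation above. Instead it assumes that git repack with -f (which
discards existing deltas and recomputes them) is an acceptable
stand-in, and the window/depth values of 250 are illustrative, not the
40000 used above. The demo repository, file contents and identity are
all made up.

```shell
#!/bin/sh
# Sketch only: build a throwaway repo with many revisions of one file,
# then compare pack size after a default repack and a forced, wider repack.
set -eu

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"            # made-up identity for the demo
git config user.email "demo@example.invalid"
git config commit.gpgsign false             # avoid signing prompts in the demo

# 50 small edits to a single file, standing in for article revisions.
i=1
while [ "$i" -le 50 ]; do
    echo "revision $i of a hypothetical article" >> article.xml
    git add article.xml
    git commit -q -m "revision $i"
    i=$((i + 1))
done

# Pack size in KiB, as reported by git itself.
pack_kib() { git count-objects -v | awk '/^size-pack:/ {print $2}'; }

git repack -a -d -q                               # default window and depth
default_kib=$(pack_kib)

git repack -a -d -f -q --window=250 --depth=250   # recompute all deltas
forced_kib=$(pack_kib)

revs=$(git rev-list --count HEAD)
packs=$(find .git/objects/pack -name '*.pack' | wc -l | tr -d ' ')
echo "revisions=$revs packs=$packs default=${default_kib}KiB forced=${forced_kib}KiB"
```

On a toy repository like this the two sizes will probably be close; the
difference should only become visible with thousands of large, similar
revisions, as in the article history.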

The number of revisions to a single article is a little bit outside of
the normal usage of git. ;)


