Domas Mituzas wrote:
Anyway, we have to ensure that most wikis (at least the top 20 ones)
have gotten rid of curly braces and any other expensive parser stuff
in these messages, as that costs them up to 10 milliseconds per
pageview (if anyone writes a bot to do this automatically, I'd gladly
run it with my global super duper privileges :)) :
1) Copy that list
2) Prepend MediaWiki: namespace
3) Post to Special:Export
4) Automate it:
sed s/wiki$/wikipedia/ all.dblist > all.domains
sed -i s/metawikipedia/metawikimedia/ all.domains
sed -i s/commonswikipedia/commonswikimedia/ all.domains
sed -i s/wik/.wik/ all.domains
sed -i 's/.wikimania\([0-9]\+\)wikipedia/wikimania\1.wikimedia/' all.domains
sed -i s/.wikimaniateamwikipedia/wikimaniateam.wikimedia/ all.domains
sed -i s/foundation.wikipedia/wikimediafoundation/ all.domains
sed -i "s/\(strategy\|usability\|collab\|advisory\|grants\|board\|incubator\|internal\|chair\|quality\|exec\|wikimaniateam\|office\|.*com\).wikipedia/\1.wikimedia/" all.domains
sed -i s/_/-/g all.domains
sed -i s/arbcom-/arbcom./ all.domains
sed -i s/-labs/.labs/ all.domains
sed -i s/wg-en.wikipedia/wg.en.wikipedia/ all.domains
sed -i s/media.wikiwikipedia/www.mediawiki/ all.domains
while read domain; do
wget "http://$domain.org/wiki/Special:Export" --post-file=postdata.txt -O "$domain.txt"
done < all.domains
6) Profit!!
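
Steps 1 to 3 do not show how postdata.txt gets built. A minimal sketch, assuming the message names sit one per line in a file and that Special:Export accepts the titles as a newline-separated "pages" form field, with "curonly" set to skip page history. The function name and the input file name are made up for illustration, and the field names are worth checking against the export form before relying on them:

```shell
#!/bin/sh
# Sketch only: build the POST body that wget sends via --post-file.
# Assumptions (not from the original mail): the input file has one message
# name per line, and Special:Export reads the fields "pages" and "curonly".
build_postdata() {
    # Prefix each name with "MediaWiki:" (colon percent-encoded as %3A)
    # and join the titles with %0A, the percent-encoded newline.
    titles=$(awk 'NR > 1 { printf "%%0A" }
                  { printf "MediaWiki%%3A%s", $0 }' "$1")
    printf 'pages=%s&curonly=1\n' "$titles"
}
```

Usage would be something like "build_postdata messages.txt > postdata.txt". Names containing characters other than letters, digits, "-", "_" and "." would need real percent-encoding, which this sketch skips.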
Wikis using some kind of templating:
grep -l "{{" *|wc -l
255
Total usage:
grep "{{" *|wc -l
732
Using parser functions:
grep "{{#" *|wc -l
28 (across 22 wikis:
als.wikipedia.org bar.wikipedia.org
ca.wikipedia.org commons.wikimedia.org en.labs.wikimedia.org
en.wikibooks.org fa.wikipedia.org fa.wikiquote.org gl.wikipedia.org
it.wikinews.org it.wikiquote.org meta.wikimedia.org ru.wikipedia.org
simple.wikipedia.org sv.wikibooks.org tr.wikibooks.org tr.wikipedia.org
tr.wikisource.org zh.wikibooks.org zh.wikipedia.org zh.wikiquote.org
zh.wikisource.org)
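
A list like the one above can be produced from the per-wiki dumps with something along these lines. It assumes the $domain.txt file naming from the wget loop; the function name is hypothetical:

```shell
#!/bin/sh
# Print the wikis (one per line) whose dump in directory $1 contains the
# fixed string $2, for example '{{#'.
# Assumption: dumps are named <domain>.txt, as written by the wget loop.
wikis_using() {
    # -l: print file names only; -F: treat the pattern as a literal string.
    ( cd "$1" && grep -l -F -- "$2" *.txt ) | sed 's/\.txt$/.org/' | sort
}
```

For example, "wikis_using . '{{#'" run in the dump directory lists the wikis with parser functions in their messages.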
grep "{{PAGENAME}}" *|wc -l
18
Used for namespace name:
grep "{{ns:" *|wc -l
226
grep "{{localurl:" *|wc -l
5
grep "{{grammar:" *|wc -l
8
grep "{{plural:" *|wc -l
0
grep "<nowiki" *|wc -l
0
Wikis using all default messages:
grep -L "<revision>" * | wc -l
273
Private wikis not read:
grep "<html" *|wc -l
23
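
One caveat on the last two numbers: a private wiki answers with an HTML page, which has no <revision> either, so those 23 files are presumably counted among the 273 as well. A small sketch that sorts each dump into exactly one bucket, checking for the HTML page first. The function name and the three labels are made up, and the rules are assumptions about what the files look like:

```shell
#!/bin/sh
# Classify one dump file as private, customized or default.
# Assumptions: an HTML page means the export was refused (private wiki),
# and an export with no <revision> element means no local overrides.
classify_dump() {
    if grep -q '<html' "$1"; then
        echo private
    elif grep -q '<revision>' "$1"; then
        echo customized
    else
        echo default
    fi
}
```

Running "for f in *.txt; do classify_dump "$f"; done | sort | uniq -c" would then give the three totals without double counting.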