Happy Monday,
There are strange people who make links like this (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban .28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
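A minimal sketch of what such a fix could look like, assuming the anchors use the dot-encoded form shown above (each UTF-8 byte written as a dot plus two uppercase hex digits). The names decode_anchor and fix_section_links are hypothetical, not existing framework functions, and a run that does not decode to printable UTF-8 text is left untouched:

```python
import re

# One or more ".XX" groups, where XX is an uppercase hex byte.
_ENCODED_RUN = re.compile(r"(?:\.[0-9A-F]{2})+")
# [[Title#anchor]] or [[Title#anchor|label]]
_SECTION_LINK = re.compile(r"\[\[([^\[\]|#]*)#([^\[\]|]*)(\|[^\[\]]*)?\]\]")


def decode_anchor(anchor):
    """Turn dot-encoded byte runs in a section anchor back into text."""
    def _decode(match):
        raw = bytes.fromhex(match.group(0).replace(".", ""))
        try:
            decoded = raw.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8: keep the original
        # Assumption: control characters are never wanted in an anchor.
        return decoded if decoded.isprintable() else match.group(0)
    return _ENCODED_RUN.sub(_decode, anchor)


def fix_section_links(text):
    """Decode the anchor part of every section link found in text."""
    def _fix(match):
        title, anchor, label = match.group(1), match.group(2), match.group(3) or ""
        return "[[%s#%s%s]]" % (title, decode_anchor(anchor), label)
    return _SECTION_LINK.sub(_fix, text)
```

A section title that legitimately contains something like ".28" would be changed as well, so the output should be reviewed before saving.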
--
Bináris
Hello all
From one of my assignments as a bot operator I have some code which
does template parsing and general text parsing (e.g. Image/File tags).
It does not use regexes and is thus able to correctly parse nested
templates and other such nasty things. I have written those as library
classes and written tests for them which cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
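To illustrate why a scanner copes with nesting where a single regex does not, here is a minimal sketch. It is not Hannes's code, and find_templates is a hypothetical name. It tracks brace depth to find the top-level {{...}} spans, and it assumes there are no triple-brace parameters, comments or nowiki sections in the text:

```python
def find_templates(text):
    """Return (start, end) index pairs of the top-level {{...}} templates."""
    spans = []
    depth = 0
    start = None
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            if depth == 0:
                start = i  # remember where the outermost template opens
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:
                spans.append((start, i))  # outermost template closed
        else:
            i += 1
    return spans
```

Each span can then be parsed recursively by running the same function on the text between its outer braces.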
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
Hi all,
I had mentioned this in the rewrite roadmap, and noticed it came up on IRC
as well, so I'd like to run this by the mailing list:
User:The Earwig has written a pure-Python (with optional C speedups)
MediaWiki text parser named mwparserfromhell[1]. Currently we have the
textlib library and various regexes that implement this
imperfectly. From my experience using mwparserfromhell (over 400k successful
edits with no issues) I believe it is ready to be bundled with the
framework. I think it would still be a good idea to keep textlib as a
fallback and for users who are currently using it and don't need to migrate.
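A rough sketch of how the fallback idea could look in practice: use mwparserfromhell when it is installed, and a crude regex otherwise. The function name template_names is hypothetical, and the regex is only a stand-in for the existing textlib code, not a copy of it:

```python
import re

try:
    import mwparserfromhell
except ImportError:  # parser not installed: use the regex fallback
    mwparserfromhell = None

# Crude fallback: the text between "{{" and the first "|" or "}}".
_TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def template_names(text):
    """Return the names of the templates used in text."""
    if mwparserfromhell is not None:
        wikicode = mwparserfromhell.parse(text)
        return [str(t.name).strip() for t in wikicode.filter_templates()]
    return [m.group(1) for m in _TEMPLATE_NAME.finditer(text)]
```

Callers see the same interface either way, so methods could be converted one at a time.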
As for actually adding it, in the rewrite branch we can just add it as a
dependency in setup.py, and then convert various methods over.
In trunk, I'm guessing we would need to add it as an external. (I'm not
sure how that's actually done.)
[1] https://github.com/earwig/mwparserfromhell
-- Legoktm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello all!
I have the following issue: I am reorganizing the whole "externals" part in
trunk, as you might have noticed already. In fact this is done now,
with the single exception of a generic patching system, needed e.g.
for BeautifulSoup.py. As usual, this is no problem on Linux but
becomes a major issue on Windows.
The mechanism I want to use is the well-known diff/patch duo.
Therefore a "patch" executable/binary (or Python script) is needed
for every OS. While this is more or less built in on Linux, Windows needs
extra attention. This is what I have found so far:
* http://sourceforge.net/projects/unxutils/
The executables depend only on the Microsoft C runtime
(msvcrt.dll) and not on an emulation layer like the one provided by
the Cygwin tools. Windows only, not multi-OS.
* http://python-patch.googlecode.com/svn/trunk/patch.py
A Python script and therefore multi-OS, but it does not support the full
diff/patch "command set", e.g. it cannot create new files.
* https://code.google.com/p/google-diff-match-patch/
Not a command-line tool like "patch" but a Python library/module.
Multi-OS.
So I am stuck here and need some further knowledge, experience and
personal preferences from your side in order to make a good decision.
In my opinion we should also keep other OSes (beyond Linux, Mac and
Windows) in mind, because they are very close to what we already have.
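One possible shape for this, as a sketch only: prefer a native "patch" executable when one is on the PATH and otherwise hand over to a Python implementation. The names apply_patch and python_fallback are hypothetical, and the fallback is whatever callable ends up being chosen from the options above:

```python
import shutil
import subprocess


def apply_patch(patch_file, target_dir, which=shutil.which, python_fallback=None):
    """Apply patch_file inside target_dir and report which backend was used."""
    tool = which("patch")
    if tool is not None:
        # -p0: use paths as written, -d: work in target_dir, -i: read this file
        subprocess.check_call([tool, "-p0", "-d", target_dir, "-i", patch_file])
        return "native"
    if python_fallback is None:
        raise RuntimeError("no 'patch' executable found and no Python fallback given")
    # Assumption: the fallback takes the same two arguments.
    python_fallback(patch_file, target_dir)
    return "python"
```

The which parameter exists only so the lookup can be replaced in tests.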
Thanks for any help and Greetings
DrTrigon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iEYEARECAAYFAlGghyQACgkQAXWvBxzBrDBgwQCgpIQi+omcb2mZHRtqvGs48EO1
6OIAmgNAQqz/00mdJuX6uiJtEoSdOzNC
=5mCh
-----END PGP SIGNATURE-----
Hi all,
The time has finally come upon us--I'm moving forward with shutting
down SVN and making it a read-only service. As Pywikipedia is the only
remaining consumer of SVN, I wanted to reach out to the community to find
out what everyone wants to do. As I see it, there are three courses of action
that Pywikipedia can take:
1) Move to Gerrit
2) Move to Git elsewhere (GitHub, Google Code, etc)
3) Move to some other SVN service
I'm more than willing to help with any of these choices--the first two would
involve a conversion of the history to Git, along with importing it to the
destination of choice. Staying with SVN is also potentially possible; I'm
more than happy to provide full SVN dumps if someone wants to set up
that service elsewhere.
What are people's thoughts? I've not come up with a firm date yet, but
coming to consensus sooner rather than later would be nice.
Thanks!
-Chad H.
Hello,
An idea came to my mind and I think you should know about it. Bugzilla is
the WMF bug tracker, and I think we should migrate from SourceForge to
Bugzilla for these reasons:
1-On SourceForge, people can report bugs via OpenID, so they don't
need to register, and every week we receive spam/nonsense reports [1].
That won't happen if we use Bugzilla.
2-It becomes a centralized bug tracker. Some PWB devs are also
MediaWiki devs, so they have to work on Bugzilla and SourceForge
separately to track bugs; this way all of the bugs relevant to them
will show up on one page.
3-People don't know SourceForge but they know Bugzilla.
[1] https://sourceforge.net/tracker/?limit=25&func=&group_id=93107&atid=603138&…
Best
--
Amir
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello all!
This is just an idea, but what about writing a paper about the
framework (maybe rewrite, with some additions from trunk) and what it is
able to do, how it interacts with the MediaWiki API, and so on and so forth.
There is still a long way to go, but I wanted to raise this
idea at least once. In fact I am not an expert in writing papers and no
IT guy either, but I would be interested in doing this and I think it
would be worth doing.
It should not be a manual, but an overview (maybe a teaser) for people
not using it and trying to get a clue what this is about. I think we
should cover the most interesting techniques we are using/have
implemented. Everybody interested could e.g. do their own part, maybe
1-5 points to cover or mention.
Maybe we could post it in the "Computer Science" part of
http://arxiv.org/. What do you think about this?
Greetings and thanks for (any) feedback!
DrTrigon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iEYEARECAAYFAlGgi9QACgkQAXWvBxzBrDBGyACfaSCjo7t8iTMPnAGhOkkqKZ+c
aFkAn1fxt76oEi/HJNrtiO2fF9VOBRvB
=7wEa
-----END PGP SIGNATURE-----
Hi everyone,
Today I published and documented two scripts to mass add claims using
the rewrite branch:
* claimit.py: A script to mass add Wikidata claims to a lot of items
based on pages on Wikipedia [1]
* harvest_template.py: A script to mass add Wikidata claims based on
information harvested from Wikipedia templates. [2]
Both are still a bit rough, but now it's possible to add claims on
Wikidata without writing a single line of code. Just run these bots from
the command line.
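For readers wondering what "harvesting" a template involves, here is a simplified, offline sketch. It is not the code of harvest_template.py. The function name harvest and the field-to-property mapping are illustrative assumptions, and only values that link to a page are kept:

```python
import re

# "| field = value" pairs; the value stops at the next pipe or brace.
_PARAM = re.compile(r"\|\s*([^=|{}]+?)\s*=\s*([^|{}]*)")
# Target of the first [[wikilink]] inside a value.
_LINK = re.compile(r"\[\[([^\[\]|#]+)")


def harvest(wikitext, field_to_property):
    """Return (property id, linked page title) pairs found in wikitext."""
    claims = []
    for field, value in _PARAM.findall(wikitext):
        prop = field_to_property.get(field)
        link = _LINK.search(value)
        if prop and link:
            claims.append((prop, link.group(1).strip()))
    return claims
```

With a mapping such as {"birth_place": "P19", "nationality": "P27"}, each pair says which property to set and which linked page to resolve to an item. Writing the claim to Wikidata is left out here.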
I implemented it in rewrite because I like the implementation of
Wikidata there more than the one in trunk. It would be nice if trunk could just have
the same objects and interface as rewrite. Any opinions on this?
Maarten
[1] https://www.mediawiki.org/wiki/Manual:Pywikipediabot/claimit.py
[2] https://www.mediawiki.org/wiki/Manual:Pywikipediabot/harvest_template.py
Hello,
It seems we might have to keep a copy of the opencv and pycolorname externals in our repo (at least for trunk; I don't use rewrite), because the TS servers aren't up, so errors occur when trying to update/check out the inaccessible externals.
Hazard-SJ
For any users interested in migrating Pywikipedia-based bots from the
Toolserver to the new Labs platform, or just interested in using Labs to
run bots, I've written a page[1] describing the steps I took to get my
bots running under the rewrite branch.
If someone who uses trunk has gotten their bot(s) running on Labs, feel
free to add instructions for that process, too.
[1]
https://wikitech.wikimedia.org/wiki/User:Russell_Blau/Using_pywikibot_on_La…
--
Russell Blau
russblau(a)imapmail.org