Bugs item #2531935, was opened at 2009-01-23 23:29
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=253193…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Purodha B Blissenbach (purodha)
Summary: -hintfile: option
Initial Comment:
The newly introduced -hintfile: option is either not well documented or not working as
expected.
It asks for a page to be checked (see below), while according to feature request [2284955]
("interwiki hints from file") it is supposed to read both a local page and a hint page
from the file. Please fix it. Thanks!
python interwiki.py -hintfile:
Please enter the hint filename: hints.txt
Which page to check:
Pywikipedia [http] trunk/pywikipedia (r6291, Jan 23 2009, 16:08:14)
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2010-04-27 20:42
Message:
Is anyone out there able to take care of this?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2009-06-26 13:25
Message:
This simple code (Python 2 syntax) should work for this purpose:
    import codecs, re
    # Each line holds two links: [[page title]] [[hint title]]; a leading colon is optional.
    f = codecs.open(hintfilename, 'r', config.textfile_encoding)
    R = re.compile(ur'\[\[:?(.*?)\]\]\s+\[\[:?(.*)\]\]')
    for line in R.findall(f.read()):
        pageTitle = line[0]
        hintTitle = line[1]
Then just make a proper call to
    yield wikipedia.Page(site, pageTitle)
and
    hints.append(hintTitle)
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2009-06-25 16:07
Message:
I guess we need to combine "TextfilePageGenerator" from pagegenerators.py
and "hintfile" from interwiki.py, so that both the page title and the hint
are read, line by line, from the same hintfilename - page title from the
first pair of brackets [[]], and the hint - from the second pair of
brackets in the same line within hintfile. Is it possible to implement
this, please?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2009-03-07 09:52
Message:
No, it's not exactly what I asked for. In the original feature request
#2284955
[http://sourceforge.net/tracker/index.php?func=detail&aid=2284955&gr…atid=603141],
as far as I can see, the idea was to read both the starting pages and the hints
from the same file, line by line, and to build an array of pages to be
processed together with their relevant hints.
# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]
Working on a single page with the -hintfile: option doesn't seem to be that
useful.
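The line format shown above could be read along these lines. This is only a sketch under assumptions: the function name parse_hint_pairs is made up for illustration, it uses only the standard library rather than any pywikipedia API, and it treats the leading colon inside each link as optional.

```python
import re

# Assumed line format, taken from the example line in this comment:
#   # [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]
# The first link is the page to process, the second is its hint.
PAIR_RE = re.compile(r'\[\[:?(.*?)\]\]\s+\[\[:?(.*?)\]\]')


def parse_hint_pairs(text):
    """Return a list of (page_title, hint_title) tuples, one per matching line."""
    pairs = []
    for line in text.splitlines():
        match = PAIR_RE.search(line)
        if match:  # lines without two links are skipped
            pairs.append((match.group(1), match.group(2)))
    return pairs


sample = "# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]\n"
for page_title, hint_title in parse_hint_pairs(sample):
    print(page_title, '<-', hint_title)
```

Each tuple keeps a page together with its own hint, which is the pairing the feature request seems to describe.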
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2009-03-03 09:44
Message:
What you want to have, in the above example, can be had with:
python interwiki.py -v -hintfile: -file:
Pywikipediabot (r6439 (wikipedia.py), Feb 24 2009, 21:48:26)
Python 2.5.2 (r252:60911, Jan 4 2009, 21:59:32)
[GCC 4.3.2]
Please enter the hint filename: hints.txt
Please enter the local file name: local-page-title.txt
There is no documentation saying that -hintfile: overrides or alters the
processing of any other parameter (and in fact, it does not).
Be aware that it is hardly useful to have a file with several page titles
given via -file: when -hintfile: is being used, since the hints would apply
to each of those pages, provoking interwiki conflicts.
Thus -hintfile: is likely more often used with a single page title on the
command line. That does not preclude, however, a single page title being
read from a file using -file:
If, and only if, the file given via -hintfile: has only unspecific hints,
such as [[10:]] or [[en:]] or [[latin:]] (or none of the specifically hinted
pages exist), then supplying a list of pages via -file: would likely be free
of conflicts.
There is a difference between hints and the page being processed. For the
outcome, in properly preset cases, it is often irrelevant where the bot
starts processing and which pages are then added because they were hinted.
For the paths the bot follows while collecting links, however, it sometimes
makes a huge difference.
We can have hintless processing, but we cannot have a bot run on hints
alone, without a starting page.
Maybe we should add some of these to the documentation? Is that what you
are asking for?
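To illustrate the conflict described above, here is a small sketch contrasting hints shared by every page with hints tied to a single page. It does not reproduce the actual interwiki.py logic; the function names and the sample titles are hypothetical.

```python
def global_hints(pages, hints):
    """Attach every hint to every page, as happens when -file: and
    -hintfile: are combined according to the explanation above."""
    return {page: list(hints) for page in pages}


def paired_hints(pairs):
    """Attach each hint only to the page it was listed with."""
    result = {}
    for page, hint in pairs:
        result.setdefault(page, []).append(hint)
    return result


pages = ['xx:A', 'xx:B']      # hypothetical starting pages
hints = ['en:Apple', 'en:Banana']  # hypothetical specific hints

shared = global_hints(pages, hints)
paired = paired_hints(zip(pages, hints))

# With shared hints, both pages are pointed at both English titles,
# which is the situation that provokes interwiki conflicts.
print(shared)
# With paired hints, each page gets only its own English title.
print(paired)
```

With unspecific hints such as [[en:]] the shared mapping is harmless, which matches the "if, and only if" case above.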
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-01-27 08:54
Message:
Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-01-27 08:50
Message:
Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-01-27 08:45
Message:
Assigned to committer.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-01-27 08:36
Message:
Assigned to committer.
----------------------------------------------------------------------