Patches item #1879579, was opened at 2008-01-25 10:58
Message generated for change (Comment added) made by filnik
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1879579&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Alex S.H. Lin (lin4h)
Assigned to: Nobody/Anonymous (nobody)
Summary: Welcome.py - signature with botname
Initial Comment:
Add a list of languages; when the language is in the list, the script will automatically sign with the bot account name (not only in zh).
----------------------------------------------------------------------
>Comment By: Filnik (filnik)
Date: 2008-01-25 16:04
Message:
Logged In: YES
user_id=1834469
Originator: NO
Added a feature to append whatever text you want after the signature (see the
'final_new_text_additions' variable). An example of how it works:
http://it.wikipedia.org/w/index.php?title=Discussioni_utente%3ASmb&diff=136…
"Fixed" in revision: 4938
Bug closed (if you need to add some settings, just write to me on
it.wikipedia or on Commons ;-))
----------------------------------------------------------------------
Revision: 4937
Author: filnik
Date: 2008-01-25 14:13:06 +0000 (Fri, 25 Jan 2008)
Log Message:
-----------
It seems that the pid file is sometimes corrupt, though usually not. By the way, I'm tired of getting 5-7 emails from crontab every day; it's time to 'fix it' with a try/except block
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-01-24 22:36:12 UTC (rev 4936)
+++ trunk/pywikipedia/wikipedia.py 2008-01-25 14:13:06 UTC (rev 4937)
@@ -2767,9 +2767,16 @@
 else:
     now = time.time()
     for line in f.readlines():
-        line = line.split(' ')
-        pid = int(line[0])
-        ptime = int(line[1].split('.')[0])
+        try:
+            line = line.split(' ')
+            pid = int(line[0])
+            ptime = int(line[1].split('.')[0])
+        except ValueError:
+            # I go a lot of crontab errors because line is not a number.
+            # Better to prevent that. If you find out the error, feel free
+            # to fix it better.
+            pid = 1
+            ptime = time.time()
         if now - ptime <= self.releasepid and pid != self.pid:
             processes[pid] = ptime
 f = open(self.logfn(), 'w')
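
The commit above treats a malformed line as a live process with pid 1 and the current time. One possible alternative, sketched here in modern Python, is to skip such lines instead. This is only an illustration, not the code in wikipedia.py: it assumes each line of the pid file has the form "<pid> <timestamp>" (as the parsing in the diff implies), and the names parse_pid_lines, own_pid and release_after are hypothetical.

```python
import time


def parse_pid_lines(lines, own_pid, release_after, now=None):
    """Return {pid: timestamp} for other recently active processes.

    Hypothetical helper: lines that cannot be parsed as
    "<pid> <timestamp>" are skipped instead of raising an error.
    """
    if now is None:
        now = time.time()
    processes = {}
    for line in lines:
        parts = line.split()
        try:
            pid = int(parts[0])
            # The timestamp may have a fractional part, e.g. "1201270386.52".
            ptime = int(float(parts[1]))
        except (IndexError, ValueError, OverflowError):
            continue  # corrupt or truncated line: ignore it
        if now - ptime <= release_after and pid != own_pid:
            processes[pid] = ptime
    return processes
```

Skipping the line avoids registering a process that does not exist, at the cost of silently hiding whatever is corrupting the file, so logging the skipped line may be worthwhile.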
Bugs item #1878986, was opened at 2008-01-24 15:59
Message generated for change (Comment added) made by filnik
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1878986&group_…
Category: General
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Filnik (filnik)
Assigned to: Nobody/Anonymous (nobody)
Summary: getUrl() has a problem. No timeout?
Initial Comment:
Hello, I've noticed that among my processes there are some scripts that were started about 1-2 weeks ago and are still running.
The problem is that the function getUrl() of wikipedia.py doesn't raise any error after x time (or at least I suppose that's the reason; otherwise we have a bot that has been trying to get a page for a week for no specific reason...).
I haven't fixed the bug only because I have no idea how to fix it (I have never dealt with HTTP connections directly in Python), but Bryan said:
<Bryan> yes, but that would require you to modify the socket settings
<Bryan> sock.settimeout(1500)
<Bryan> or you do select.select on the socket
<Bryan> which is very hard in pywiki
Any ideas? :-) The 1500, by the way, is only an example number; we should/can set it in config.py. I've given this bug high priority because infinite loops on the toolserver are a really big problem.
Thanks, Filnik
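
Bryan's second suggestion (select.select on the socket) can be illustrated on a bare socket. This is only a sketch in modern Python, not a patch for getUrl(): the helper name recv_with_timeout is hypothetical, and how to reach the underlying socket from pywikipedia's HTTP code is left open.

```python
import select
import socket


def recv_with_timeout(sock, nbytes, timeout):
    """Read up to nbytes from sock, or raise socket.timeout.

    Hypothetical helper: waits with select.select() until the socket is
    readable instead of blocking in recv() indefinitely.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise socket.timeout("no data within %s seconds" % timeout)
    return sock.recv(nbytes)
```

Calling sock.settimeout(timeout) before a plain sock.recv() gives similar behaviour with less code; select.select() is mainly useful when several sockets have to be watched at once.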
----------------------------------------------------------------------
>Comment By: Filnik (filnik)
Date: 2008-01-25 13:54
Message:
Logged In: YES
user_id=1834469
Originator: YES
OK, thanks russblau. Should I close the topic, or are you not 100% sure
that it has been fixed? :-) Bye, Filnik
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2008-01-24 22:41
Message:
Logged In: YES
user_id=855050
Originator: NO
Sorry, that last comment was me, and the revision was r4936
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-24 22:37
Message:
Logged In: NO
Added a 120-second timeout in r4796; seems to work in initial testing.
The problem with the libcurl suggestion is that it would require every user of
every bot to download and install one or more third-party packages.
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2008-01-24 17:21
Message:
Logged In: YES
user_id=181280
Originator: NO
I am not sure PyWikipediaBot causes intensive CPU usage on the Toolserver
because of this problem; anyway, as a temporary fix for the missing timeout
there seems to be this easy solution:
import socket
import urllib
import urllib2
socket.setdefaulttimeout(0.1)  # seconds; applies to all sockets created afterwards
urllib2.urlopen("http://cosoleto.free.fr").read()
[...]
urllib2.URLError: <urlopen error timed out>
urllib.urlopen("http://cosoleto.free.fr").read()
[...]
IOError: [Errno socket error] timed out
But I suggest libcurl (http://curl.haxx.se/libcurl/) as an easy way to
improve and simplify the network side of the PyWikipedia code. libcurl is a
feature-rich (persistent connections, transparent compression support,
etc.) and portable URL transfer library written in C. Why not?
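
The process-wide default timeout shown above can also be tried without any network access. The following is a minimal sketch in modern Python (the session above is Python 2); the helper name read_with_default_timeout is hypothetical, and it restores the previous default so other code in the same process is not affected.

```python
import socket


def read_with_default_timeout(address, timeout):
    """Connect to address and try to read one byte.

    Returns the bytes read, or None if the read timed out. The global
    default timeout is restored afterwards.
    """
    previous = socket.getdefaulttimeout()
    # Applies only to sockets created after this call.
    socket.setdefaulttimeout(timeout)
    try:
        conn = socket.create_connection(address)
        try:
            return conn.recv(1)
        except socket.timeout:
            return None
        finally:
            conn.close()
    finally:
        socket.setdefaulttimeout(previous)
```

Because the setting is global, it affects every library in the process that opens sockets, which is convenient for a quick fix but less precise than a per-connection timeout.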
----------------------------------------------------------------------
Comment By: Bryan (btongminh)
Date: 2008-01-24 16:06
Message:
Logged In: YES
user_id=1806226
Originator: NO
Note that it would be much easier to use settimeout if persistent_http were
working. Unfortunately, it is not. I disabled it some time ago
(http://fisheye.ts.wikimedia.org/browse/pywikipedia/trunk/pywikipedia/wikipe…)
saying it needs investigation. Is anybody here able to do this
investigation? It would not only solve Filnik's bug
(site.conn.sock.settimeout), but it would also greatly improve performance
for single-threaded bots.
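
As a rough illustration of the idea (one connection reused for several requests, with a timeout on its socket), here is a sketch using the standard library of modern Python. It is not the persistent_http code from wikipedia.py; the function name fetch_many is hypothetical, and the 120-second default only mirrors the value mentioned elsewhere in this thread.

```python
import http.client


def fetch_many(host, port, paths, timeout=120):
    """Fetch several paths over one HTTP connection with a socket timeout.

    Sketch only: retries and error handling are omitted.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the connection is reused.
            bodies.append(response.read())
    finally:
        conn.close()
    return bodies
```

Whether the connection actually stays open between requests depends on the server; if the server closes it, http.client opens a new one on the next request.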
----------------------------------------------------------------------
Bugs item #1879270, was opened at 2008-01-24 23:04
Message generated for change (Comment added) made by btongminh
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1879270&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Private: No
Submitted By: Andre Engels (a_engels)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bot losing text
Initial Comment:
Over the last few months I have had several cases in which the bot, when doing disambiguations, saved only the beginning of the text. Compared to the total number of edits it's not very often (maybe one in 20,000 or so), but it's a serious bug that has made people justifiably worried about my bot.
----------------------------------------------------------------------
>Comment By: Bryan (btongminh)
Date: 2008-01-24 23:11
Message:
Logged In: YES
user_id=1806226
Originator: NO
Have you had this problem since you updated wikipedia.py to r4877
(13 January 2008)? It should be fixed as of that revision, and I have not
had the problem since then.
http://fisheye.ts.wikimedia.org/browse/pywikipedia/trunk/pywikipedia/wikipe…
----------------------------------------------------------------------