Revision: 6020
Author: filnik
Date: 2008-10-25 20:27:59 +0000 (Sat, 25 Oct 2008)
Log Message:
-----------
Small bugfix in getTemplates(): it returned only the first 10 templates; it now returns up to 5000 (see the tllimit parameter below).
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-10-25 16:16:12 UTC (rev 6019)
+++ trunk/pywikipedia/wikipedia.py 2008-10-25 20:27:59 UTC (rev 6020)
@@ -921,11 +921,15 @@
         It works through the APIs.
         If no templates found, returns None.
+
+        Note: It returns "only" the first 5000 templates, if there
+        are more, they won't be returned, sorry.
         """
         params = {
             'action' :'query',
             'prop' :'templates',
             'titles' :self.title(),
+            'tllimit' :5000
             }
         data = query.GetData(params,
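
The note in the docstring above says that anything past the first 5000 templates is dropped. A minimal sketch of how the remaining results could be collected by following the API's continuation value is given here. It is not the committed code. The `fetch` callable is hypothetical (it takes a parameter dict and returns the decoded response as a dict), and the `query-continue` / `tlcontinue` layout is an assumption based on the older MediaWiki continuation format.

```python
def collect_template_titles(fetch, title, limit=5000):
    """Collect all template titles used on `title`, following continuation.

    `fetch` is a hypothetical callable: it takes a dict of API parameters
    and returns the decoded response as a dict.
    """
    params = {
        'action': 'query',
        'prop': 'templates',
        'titles': title,
        'tllimit': limit,
    }
    titles = []
    while True:
        # Pass a copy so the caller's view of each request stays unchanged.
        data = fetch(dict(params))
        for page in data.get('query', {}).get('pages', {}).values():
            for template in page.get('templates', []):
                titles.append(template['title'])
        # Assumed legacy continuation layout; stop when it is absent.
        cont = data.get('query-continue', {}).get('templates', {})
        if 'tlcontinue' not in cont:
            return titles
        params['tlcontinue'] = cont['tlcontinue']
```

Passing the request function in as an argument keeps the loop independent of how the HTTP call is made, so it can be exercised with canned responses.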
Support Requests item #2194817, was opened at 2008-10-25 17:39
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=2194817&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Dutch values for noreferences.py
Initial Comment:
I created a diff file with Dutch values for noreferences.py, so that it can be used on the Dutch Wikipedia. Please add this to the standard version in the SVN repository.
----------------------------------------------------------------------
Bugs item #2193543, was opened at 2008-10-25 09:35
Message generated for change (Settings changed) made by silvonen
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2193543&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Mikko Silvonen (silvonen)
Assigned to: Nobody/Anonymous (nobody)
Summary: Page generation with -start crashes with TypeError
Initial Comment:
What is causing this crash?
>interwiki.py -start:A
Checked for running processes. 1 processes currently running, including the current process.
NOTE: Number of pages queued is 0, trying to add 60 more.
Dump fi (wikipedia) saved
Traceback (most recent call last):
  File "C:\svn\pywikipedia\interwiki.py", line 1771, in <module>
    bot.run()
  File "C:\svn\pywikipedia\interwiki.py", line 1520, in run
    self.queryStep()
  File "C:\svn\pywikipedia\interwiki.py", line 1494, in queryStep
    self.oneQuery()
  File "C:\svn\pywikipedia\interwiki.py", line 1462, in oneQuery
    site = self.selectQuerySite()
  File "C:\svn\pywikipedia\interwiki.py", line 1436, in selectQuerySite
    self.generateMore(globalvar.maxquerysize - mycount)
  File "C:\svn\pywikipedia\interwiki.py", line 1370, in generateMore
    page = self.pageGenerator.next()
  File "c:\svn\pywikipedia\pagegenerators.py", line 684, in DuplicateFilterPageGenerator
    for page in generator:
  File "c:\svn\pywikipedia\pagegenerators.py", line 235, in AllpagesPageGenerator
    for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
  File "c:\svn\pywikipedia\wikipedia.py", line 5247, in allpages
    for p in soup.api.query.allpages:
TypeError: 'NoneType' object is not iterable
>python version.py
Pywikipedia [http] trunk/pywikipedia (r6015, Oct 24 2008, 18:29:39)
Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
Comment By: Mikko Silvonen (silvonen)
Date: 2008-10-25 16:22
Message:
I don't have this problem anymore. Was something changed?
----------------------------------------------------------------------
Comment By: Stig Meireles Johansen (stigmj)
Date: 2008-10-25 10:02
Message:
The API lists allpages, backlinks, categorymembers and logevents have been
disabled due to a performance issue. We'll have to bug the developers to fix
the bug they introduced recently.
----------------------------------------------------------------------
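
The TypeError in the traceback arises because the loop iterates over a listing that is None when the API does not return it. A minimal sketch of a guard that fails with a clearer message is given here. The names `ListingUnavailable` and `iter_listing` are hypothetical and are not part of pywikipedia.

```python
class ListingUnavailable(Exception):
    """Hypothetical error for a response that lacks the requested list."""


def iter_listing(listing, name):
    """Yield items from `listing`, failing clearly if it is None."""
    if listing is None:
        raise ListingUnavailable(
            "The API response contains no '%s' list; "
            "the query module may be disabled." % name)
    for item in listing:
        yield item
```

Because this is a generator, the check runs when iteration starts, which is the same point at which the original loop crashed.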
Revision: 6018
Author: filnik
Date: 2008-10-25 13:03:52 +0000 (Sat, 25 Oct 2008)
Log Message:
-----------
New function that I will use in checkimages.py
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-10-25 12:13:08 UTC (rev 6017)
+++ trunk/pywikipedia/wikipedia.py 2008-10-25 13:03:52 UTC (rev 6018)
@@ -913,6 +913,34 @@
             x = self.get()
         return True # if we reach this point, we had no problems.
+    def getTemplates(self):
+        #action=query&prop=templates&titles=Main Page
+        """
+        Returns the templates that are used in the page given.
+
+        It works through the APIs.
+
+        If no templates found, returns None.
+        """
+        params = {
+            'action' :'query',
+            'prop' :'templates',
+            'titles' :self.title(),
+            }
+
+        data = query.GetData(params,
+                        useAPI = True, encodeTitle = False)
+        pageid = data['query']['pages'].keys()[0]
+        try:
+            templates = data['query']['pages'][pageid]['templates']
+        except KeyError:
+            return None
+        templatesFound = list()
+        for template in templates:
+            templateName = template['title']
+            templatesFound.append(Page(self.site(), templateName))
+        return templatesFound
+
     def isRedirectPage(self):
         """Return True if this is a redirect, False if not or not existing."""
         try:
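
To make the response handling in getTemplates() easier to follow, here is a standalone sketch of the same parsing step applied to a plain dict. The function name `template_titles` and the sample response are illustrative assumptions. The sample follows the usual shape of a prop=templates result, with pages keyed by page id. Unlike the committed code, this sketch returns titles rather than Page objects and uses next(iter(...)) instead of .keys()[0], which only works on Python 2.

```python
def template_titles(data):
    """Return the template titles in an API response, or None if none."""
    pages = data['query']['pages']
    # A single-title query yields one entry, keyed by page id.
    page = next(iter(pages.values()))
    templates = page.get('templates')
    if not templates:
        return None
    return [template['title'] for template in templates]


# Illustrative response; the page id and titles are made up.
sample = {
    'query': {
        'pages': {
            '12345': {
                'pageid': 12345,
                'ns': 0,
                'title': 'Main Page',
                'templates': [
                    {'ns': 10, 'title': 'Template:Foo'},
                    {'ns': 10, 'title': 'Template:Bar'},
                ],
            }
        }
    }
}

titles = template_titles(sample)
```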
Revision: 6017
Author: misza13
Date: 2008-10-25 12:13:08 +0000 (Sat, 25 Oct 2008)
Log Message:
-----------
Fix POST parameters: when saving a new page, edittime and starttime must also be non-empty (changed in r42037 of MediaWiki).
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-10-25 11:32:44 UTC (rev 6016)
+++ trunk/pywikipedia/wikipedia.py 2008-10-25 12:13:08 UTC (rev 6017)
@@ -1375,14 +1375,11 @@
         # Add server lag parameter (see config.py for details)
         if config.maxlag:
             predata['maxlag'] = str(config.maxlag)
-        # Except if the page is new, we need to supply the time of the
-        # previous version to the wiki to prevent edit collisions
-        if newPage:
-            predata['wpEdittime'] = ''
-            predata['wpStarttime'] = ''
-        else:
-            predata['wpEdittime'] = self._editTime
-            predata['wpStarttime'] = self._startTime
+        # <s>Except if the page is new, we need to supply the time of the
+        # previous version to the wiki to prevent edit collisions</s>
+        # As of Oct 2008, these must be filled also for new pages
+        predata['wpEdittime'] = self._editTime
+        predata['wpStarttime'] = self._startTime
         if self._revisionId:
             predata['baseRevId'] = self._revisionId
         # Pass the minorEdit and watchArticle arguments to the Wiki.
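
The change above means both time fields must always carry a value. A small sketch of a helper that enforces this before the form data is posted is given here. The helper names are hypothetical, and the 14-digit UTC format (YYYYMMDDHHMMSS) is an assumption about what the edit form expects. The diff does not show how _editTime and _startTime are obtained for new pages.

```python
import time


def mediawiki_timestamp(seconds=None):
    """Format a Unix time (default: now) as a 14-digit UTC timestamp."""
    return time.strftime('%Y%m%d%H%M%S', time.gmtime(seconds))


def add_edit_times(predata, edit_time, start_time):
    """Set both time fields on `predata`, refusing empty values."""
    if not edit_time or not start_time:
        raise ValueError('wpEdittime and wpStarttime must be non-empty')
    predata['wpEdittime'] = edit_time
    predata['wpStarttime'] = start_time
    return predata
```

Raising early on an empty value turns a save that the server would reject into a local error that is easier to trace.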
Revision: 6016
Author: filnik
Date: 2008-10-25 11:32:44 +0000 (Sat, 25 Oct 2008)
Log Message:
-----------
Small fix in the newimages() function: when the server is down, it now raises ServerError instead of KeyError.
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-10-24 18:29:39 UTC (rev 6015)
+++ trunk/pywikipedia/wikipedia.py 2008-10-25 11:32:44 UTC (rev 6016)
@@ -5080,7 +5080,10 @@
         data = query.GetData(params,
                         useAPI = True, encodeTitle = False)
-        imagesData = data['query']['logevents']
+        try:
+            imagesData = data['query']['logevents']
+        except KeyError:
+            raise ServerError("The APIs don't return the data, the site may be down")
         while True:
             for imageData in imagesData:
                 try:
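
The pattern in this change, translating a low-level KeyError into a domain error, can be sketched on its own as follows. The `ServerError` class here is a stand-in defined only so the example runs by itself, and `log_events` is a hypothetical name. Including the API's own error text in the message is an addition that the commit does not make. It assumes the response may carry an 'error' entry with an 'info' field.

```python
class ServerError(Exception):
    """Stand-in for the framework's ServerError, so the sketch runs alone."""


def log_events(data):
    """Return the logevents list, or raise ServerError if it is missing."""
    try:
        return data['query']['logevents']
    except KeyError:
        # Surface the API's own explanation when one is present.
        error = data.get('error', {})
        detail = error.get('info', 'no error details in the response')
        raise ServerError('The API returned no logevents: %s' % detail)
```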
Bugs item #2193942, was opened at 2008-10-25 13:10
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2193942&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: category
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Simone Malacarne (smalacarne)
Assigned to: Nobody/Anonymous (nobody)
Summary: reading category: memory leak and slow down
Initial Comment:
I need to read a very big category (80,000+ articles).
So I just do:
import wikipedia, catlib, pagegenerators
site = wikipedia.getSite()
cat = catlib.Category(site, 'category name')
gen = pagegenerators.PreloadingGenerator(cat.articles(), pageNumber=100)
for page in gen:
    do_something
The problem is that the program uses more and more memory (nearly 2 GB of RAM by the end). CPU time also increases over time: if the first 10,000 articles are processed in 10 minutes, the second 10,000 take twice as long, and so on. It takes about 20 hours to read all the articles.
If I use:
gen = pagegenerators.CategorizedPageGenerator(cat, recurse=False, start=u'')
instead of PreloadingGenerator, I don't get the memory or CPU growth, but it is very slow because it reads one article at a time (more than 24 hours to finish).
Pywikipedia [http] trunk/pywikipedia (r6015, Oct 24 2008, 18:29:39)
Python 2.5.2 (r252:60911, Oct 5 2008, 19:29:17)
[GCC 4.3.2]
----------------------------------------------------------------------
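
For reference, one general way to keep memory bounded while consuming a large lazy generator is to process it in fixed-size batches and drop each batch before fetching the next. The sketch below shows only that technique. It does not diagnose the cause of the growth reported above, and `in_batches` is a hypothetical helper, not a pywikipedia function.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of up to `size` items, one batch at a time."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Only one batch is referenced at any time; earlier ones can be freed.
processed = 0
for batch in in_batches(range(250), 100):
    processed += len(batch)
```

Whether this helps in practice depends on whether the upstream generator itself keeps references to the pages it has already produced.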