jenkins-bot has submitted this change and it was merged.
Change subject: mwlib used by patrol is not available for py3
......................................................................
mwlib used by patrol is not available for py3
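The fix moves the patrol dependency out of the unconditional script dependency table and registers it only on Python 2. A minimal sketch of the pattern (simplified; the dictionary and names here are illustrative stand-ins for the real table in setup.py):

```python
# Sketch of version-conditional script dependencies (illustrative,
# not the actual setup.py contents).
import sys

script_deps = {
    'flickrripper.py': ['Pillow'],
    'states_redirect.py': ['pycountry'],
}

if sys.version_info[0] == 2:
    # mwlib is not available for py3, so only Python 2 installs get it
    script_deps['patrol'] = ['mwlib']
```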
Change-Id: Ic59df33f7e7ddaccbe0a008f3a870c1f94dd5ae8
---
M setup.py
1 file changed, 3 insertions(+), 1 deletion(-)
Approvals:
John Vandenberg: Looks good to me, but someone else must approve
XZise: Looks good to me, approved
jenkins-bot: Verified
diff --git a/setup.py b/setup.py
index fe8bf18..2886d88 100644
--- a/setup.py
+++ b/setup.py
@@ -40,7 +40,6 @@
# Note: None of the 'lunatic-python' repos on github support MS Windows.
'flickrripper.py': ['Pillow'],
'states_redirect.py': ['pycountry'],
- 'patrol': ['mwlib'],
}
# flickrapi 1.4.4 installs a root logger in verbose mode; 1.4.5 fixes this.
# The problem doesnt exist in flickrapi 2.x.
@@ -78,6 +77,9 @@
# Other backports are likely broken.
dependencies.append('ipaddress')
+ # mwlib is not available for py3
+ script_deps['patrol'] = ['mwlib']
+
if sys.version_info[0] == 3:
if sys.version_info[1] < 3:
print("ERROR: Python 3.3 or higher is required!")
--
To view, visit https://gerrit.wikimedia.org/r/201960
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic59df33f7e7ddaccbe0a008f3a870c1f94dd5ae8
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged.
Change subject: [FIX] Request: Log entire error response
......................................................................
[FIX] Request: Log entire error response
Previously the error code and info were removed from the response so that
they did not appear as keyword arguments in the APIError. But that also
meant the response was logged without both values. While they exist in the
APIError instance, it is confusing that the logged response looks nearly
empty:
{u'servedby': u'mw1122',
u'error': {'help': u'See https://ru.wikipedia.org/w/api.php for '
'API usage'}}
This patch does not remove both keys from the dictionary but skips both
when the keyword arguments are generated.
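The behavioural difference between dict.pop() and dict.setdefault() that this patch relies on can be sketched with a toy error dict (illustrative only, not the real api.py code; the maxlag info text is invented):

```python
# pop() removes the key; setdefault() reads it while leaving it in place.
error = {'code': 'maxlag',
         'info': 'Waiting for a database server: 5 seconds lagged',
         'help': 'See https://ru.wikipedia.org/w/api.php for API usage'}

# Old behaviour: after pop() the dict that gets logged (and expanded as
# **kwargs) no longer carries code/info, hence the nearly empty response.
old = dict(error)
code = old.pop('code', 'Unknown')
info = old.pop('info', None)
assert 'code' not in old and 'info' not in old

# New behaviour: setdefault() returns the existing value (inserting the
# default only when the key is missing), so the full dict survives.
new = dict(error)
code = new.setdefault('code', 'Unknown')
info = new.setdefault('info', None)
assert new['code'] == 'maxlag' and new['info'] == info
```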
Change-Id: I9ab89bda4b90e1c14fe25b9fa728f69d2da5b190
---
M pywikibot/data/api.py
1 file changed, 3 insertions(+), 3 deletions(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/data/api.py b/pywikibot/data/api.py
index b57a61d..4cf1e0c 100644
--- a/pywikibot/data/api.py
+++ b/pywikibot/data/api.py
@@ -1638,8 +1638,8 @@
if "*" in result["error"]:
# help text returned
result['error']['help'] = result['error'].pop("*")
- code = result["error"].pop("code", "Unknown")
- info = result["error"].pop("info", None)
+ code = result['error'].setdefault('code', 'Unknown')
+ info = result['error'].setdefault('info', None)
if code == "maxlag":
lag = lagpattern.search(info)
if lag:
@@ -1695,7 +1695,7 @@
pywikibot.log(u" response=\n%s"
% result)
- raise APIError(code, info, **result["error"])
+ raise APIError(**result['error'])
except TypeError:
raise RuntimeError(result)
--
To view, visit https://gerrit.wikimedia.org/r/201954
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I9ab89bda4b90e1c14fe25b9fa728f69d2da5b190
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Rubin <rubin(a)wikimedia.ru>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged.
Change subject: http.request signature
......................................................................
http.request signature
Add method, body and headers as explicit arguments of http.request.
These are the standard arguments of httplib2's request method, and
other http packages have similar arguments.
Including them as explicit arguments makes it clearer what the function
is capable of, allows static checking of invocations, and forces these
parameters into the *args of threadedhttp.HttpRequest() instead of kwargs.
It also allows *args to be removed from the http.request function signature,
as it is no longer necessary.
Also deprecate using http.request for non-site requests.
Change-Id: Id4008cd470b224ffcd3c0b894bba90a25e7611bd
---
M pywikibot/comms/http.py
M pywikibot/data/api.py
M pywikibot/data/wikidataquery.py
M pywikibot/data/wikistats.py
M pywikibot/page.py
M pywikibot/pagegenerators.py
M pywikibot/version.py
7 files changed, 35 insertions(+), 37 deletions(-)
Approvals:
John Vandenberg: Looks good to me, but someone else must approve
XZise: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/comms/http.py b/pywikibot/comms/http.py
index 4c2b2ec..011eb59 100644
--- a/pywikibot/comms/http.py
+++ b/pywikibot/comms/http.py
@@ -21,9 +21,12 @@
__version__ = '$Id$'
__docformat__ = 'epytext'
-import sys
import atexit
+import sys
import time
+
+from distutils.version import StrictVersion
+from warnings import warn
# Verify that a working httplib2 is present.
try:
@@ -32,7 +35,6 @@
print("Error: Python module httplib2 >= 0.6.0 is required.")
sys.exit(1)
-from distutils.version import StrictVersion
# httplib2 0.6.0 was released with __version__ as '$Rev$'
# and no module variable CA_CERTS.
if httplib2.__version__ == '$Rev$' and 'CA_CERTS' not in httplib2.__dict__:
@@ -220,7 +222,8 @@
@deprecate_arg('ssl', None)
-def request(site=None, uri=None, charset=None, *args, **kwargs):
+def request(site=None, uri=None, method='GET', body=None, headers=None,
+ **kwargs):
"""
Request to Site with default error handling and response decoding.
@@ -244,9 +247,9 @@
"""
assert(site or uri)
if not site:
- # TODO: deprecate this usage, once the library code has been
- # migrated to using the other request methods.
- r = fetch(uri, *args, **kwargs)
+ warn('Invoking http.request without argument site is deprecated. '
+ 'Use http.fetch.', DeprecationWarning, 2)
+ r = fetch(uri, method, body, headers, **kwargs)
return r.content
baseuri = site.base_url(uri)
@@ -254,11 +257,15 @@
kwargs.setdefault("disable_ssl_certificate_validation",
site.ignore_certificate_error())
- format_string = kwargs.setdefault("headers", {}).get("user-agent")
- kwargs["headers"]["user-agent"] = user_agent(site, format_string)
- kwargs['charset'] = charset
+ if not headers:
+ headers = {}
+ format_string = None
+ else:
+ format_string = headers.get('user-agent', None)
- r = fetch(baseuri, *args, **kwargs)
+ headers['user-agent'] = user_agent(site, format_string)
+
+ r = fetch(baseuri, method, body, headers, **kwargs)
return r.content
diff --git a/pywikibot/data/api.py b/pywikibot/data/api.py
index b57a61d..a71c17c 100644
--- a/pywikibot/data/api.py
+++ b/pywikibot/data/api.py
@@ -1550,8 +1550,8 @@
body = paramstring
rawdata = http.request(
- self.site, uri, method='GET' if use_get else 'POST',
- headers=headers, body=body)
+ site=self.site, uri=uri, method='GET' if use_get else 'POST',
+ body=body, headers=headers)
except Server504Error:
pywikibot.log(u"Caught HTTP 504 error; retrying")
self.wait()
diff --git a/pywikibot/data/wikidataquery.py b/pywikibot/data/wikidataquery.py
index b78b3e6..fd08839 100644
--- a/pywikibot/data/wikidataquery.py
+++ b/pywikibot/data/wikidataquery.py
@@ -558,13 +558,13 @@
url = self.getUrl(queryStr)
try:
- resp = http.request(None, url)
+ resp = http.fetch(url)
except:
pywikibot.warning(u"Failed to retrieve %s" % url)
raise
try:
- data = json.loads(resp)
+ data = json.loads(resp.content)
except ValueError:
pywikibot.warning(u"Data received from host but no JSON could be decoded")
raise pywikibot.ServerError("Data received from host but no JSON could be decoded")
diff --git a/pywikibot/data/wikistats.py b/pywikibot/data/wikistats.py
index b8f64fe..dee09fd 100644
--- a/pywikibot/data/wikistats.py
+++ b/pywikibot/data/wikistats.py
@@ -22,7 +22,7 @@
' falling back to using the larger XML datasets.')
csv = None
-from pywikibot.comms import threadedhttp
+from pywikibot.comms import http
class WikiStats(object):
@@ -110,11 +110,8 @@
if table in self.FAMILY_MAPPING:
table = self.FAMILY_MAPPING[table]
- o = threadedhttp.Http()
- r = o.request(uri=URL % (table, format))
- if isinstance(r, Exception):
- raise r
- return r[1]
+ r = http.fetch(URL % (table, format))
+ return r.raw
def raw_cached(self, table, format):
"""
diff --git a/pywikibot/page.py b/pywikibot/page.py
index 0cdcbc8..25b0892 100644
--- a/pywikibot/page.py
+++ b/pywikibot/page.py
@@ -31,12 +31,10 @@
long = int
from html import entities as htmlentitydefs
from urllib.parse import quote_from_bytes, unquote_to_bytes
- from urllib.request import urlopen
else:
chr = unichr # noqa
import htmlentitydefs
from urllib import quote as quote_from_bytes, unquote as unquote_to_bytes
- from urllib import urlopen
import pywikibot
@@ -2104,14 +2102,11 @@
@deprecated("FilePage.latest_file_info.sha1")
def getFileMd5Sum(self):
"""Return image file's MD5 checksum."""
- # FIXME: MD5 might be performed on incomplete file due to server disconnection
- # (see bug #1795683).
- f = urlopen(self.fileUrl())
# TODO: check whether this needs a User-Agent header added
+ req = http.fetch(self.fileUrl())
h = hashlib.md5()
- h.update(f.read())
+ h.update(req.raw)
md5Checksum = h.hexdigest()
- f.close()
return md5Checksum
@deprecated("FilePage.latest_file_info.sha1")
diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py
index b01ae3f..f797874 100644
--- a/pywikibot/pagegenerators.py
+++ b/pywikibot/pagegenerators.py
@@ -2013,7 +2013,7 @@
else:
wiki = 'wikilang=%s&wikifam=.%s' % (lang, project)
link = '%s&%s&max=%d&order=img_timestamp' % (URL, wiki, limit)
- results = re.findall(REGEXP, http.request(site=None, uri=link))
+ results = re.findall(REGEXP, http.fetch(link))
if not results:
raise pywikibot.Error(
u'Nothing found at %s! Try to use the tool by yourself to be sure '
diff --git a/pywikibot/version.py b/pywikibot/version.py
index 527f210..5063908 100644
--- a/pywikibot/version.py
+++ b/pywikibot/version.py
@@ -144,11 +144,12 @@
from pywikibot.comms import http
uri = 'https://github.com/wikimedia/%s/!svn/vcc/default' % tag
- data = http.request(site=None, uri=uri, method='PROPFIND',
- body="<?xml version='1.0' encoding='utf-8'?>"
- "<propfind xmlns=\"DAV:\"><allprop/></propfind>",
- headers={'label': str(rev), 'user-agent': 'SVN/1.7.5 {pwb}'})
-
+ request = http.fetch(uri=uri, method='PROPFIND',
+ body="<?xml version='1.0' encoding='utf-8'?>"
+ "<propfind xmlns=\"DAV:\"><allprop/></propfind>",
+ headers={'label': str(rev),
+ 'user-agent': 'SVN/1.7.5 {pwb}'})
+ data = request.content
dom = xml.dom.minidom.parse(StringIO(data))
hsh = dom.getElementsByTagName("C:git-commit")[0].firstChild.nodeValue
return hsh
@@ -240,14 +241,12 @@
from pywikibot.comms import http
url = repo or 'https://git.wikimedia.org/feed/pywikibot/core'
- hsh = None
- buf = http.request(site=None, uri=url)
- buf = buf.split('\r\n')
+ buf = http.fetch(url).content.splitlines()
try:
hsh = buf[13].split('/')[5][:-1]
+ return hsh
except Exception as e:
raise ParseError(repr(e) + ' while parsing ' + repr(buf))
- return hsh
@deprecated('get_module_version, get_module_filename and get_module_mtime')
--
To view, visit https://gerrit.wikimedia.org/r/170054
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Id4008cd470b224ffcd3c0b894bba90a25e7611bd
Gerrit-PatchSet: 8
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged.
Change subject: Port patrol.py to core
......................................................................
Port patrol.py to core
Made the minor changes necessary to make it compatible with core
Depends on mwlib
Bug: T74206
Change-Id: I8612ce905d149d0e440d819f62f923385a583920
---
A scripts/patrol.py
M setup.py
M tests/script_tests.py
3 files changed, 529 insertions(+), 1 deletion(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/patrol.py b/scripts/patrol.py
new file mode 100644
index 0000000..b347c1f
--- /dev/null
+++ b/scripts/patrol.py
@@ -0,0 +1,525 @@
+#!/usr/bin/python
+# -*- coding: utf-8 -*-
+"""
+The bot is meant to mark the edits based on info obtained by whitelist.
+
+This bot obtains a list of recent changes and newpages and marks the
+edits as patrolled based on a whitelist.
+See http://en.wikisource.org/wiki/User:JVbot/patrol_whitelist
+
+Commandline parameters that are supported:
+
+-namespace Filter the page generator to only yield pages in
+ specified namespaces
+-ask If True, confirm each patrol action
+-whitelist page title for whitelist (optional)
+-autopatroluserns Takes user consent to automatically patrol
+-versionchecktime Check versionchecktime lapse in sec
+
+"""
+#
+# (C) Pywikibot team, 2011-2015
+#
+# Distributed under the terms of the MIT license.
+#
+__version__ = '$Id$'
+import pywikibot
+from pywikibot import pagegenerators, Bot
+import mwlib.uparser # used to parse the whitelist
+import mwlib.parser # used to parse the whitelist
+import time
+
+_logger = 'patrol'
+
+# This is required for the text that is shown when you run this script
+# with the parameter -help.
+docuReplacements = {
+ '¶ms;': pagegenerators.parameterHelp
+}
+
+
+class PatrolBot(Bot):
+
+ """Bot marks the edits as patrolled based on info obtained by whitelist."""
+
+ # Localised name of the whitelist page
+ whitelist_subpage_name = {
+ 'en': u'patrol_whitelist',
+ }
+
+ def __init__(self, **kwargs):
+ """
+ Constructor.
+
+ @kwarg feed - The changes feed to work on (Newpages
+ or Recentchanges)
+ @kwarg ask - If True, confirm each patrol action
+ @kwarg whitelist - page title for whitelist (optional)
+ @kwarg autopatroluserns - Takes user consent to automatically patrol
+ @kwarg versionchecktime - Check versionchecktime lapse in sec
+ """
+ self.availableOptions.update({
+ 'ask': False,
+ 'feed': None,
+ 'whitelist': None,
+ 'versionchecktime': 300,
+ 'autopatroluserns': False
+ })
+ super(PatrolBot, self).__init__(**kwargs)
+ self.recent_gen = True
+ self.user = None
+ self.site = pywikibot.Site()
+ if self.getOption('whitelist'):
+ self.whitelist_pagename = self.getOption('whitelist')
+ else:
+ local_whitelist_subpage_name = pywikibot.translate(
+ self.site, self.whitelist_subpage_name, fallback=True)
+ self.whitelist_pagename = u'%s:%s/%s' % (
+ self.site.namespace(2),
+ self.site.username(),
+ local_whitelist_subpage_name)
+ self.whitelist = self.getOption('whitelist')
+ self.whitelist_ts = 0
+ self.whitelist_load_ts = 0
+
+ self.highest_rcid = 0 # used to track loops
+ self.last_rcid = 0
+ self.repeat_start_ts = 0
+
+ self.rc_item_counter = 0 # counts how many items have been reviewed
+ self.patrol_counter = 0 # and how many times an action was taken
+
+ def load_whitelist(self):
+ """Load most recent watchlist_page for further processing."""
+ # Check for a more recent version after versionchecktime in sec.
+ if (self.whitelist_load_ts and (time.time() - self.whitelist_load_ts <
+ self.getOption('versionchecktime'))):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist not stale yet')
+ return
+
+ whitelist_page = pywikibot.Page(self.site,
+ self.whitelist_pagename)
+
+ if not self.whitelist:
+ pywikibot.output(u'Loading %s' % self.whitelist_pagename)
+
+ try:
+ if self.whitelist_ts:
+ # check for a more recent version
+ h = whitelist_page.revisions()
+ last_edit_ts = next(h).timestamp
+ if last_edit_ts == self.whitelist_ts:
+ # As there hasn't been any change to the whitelist
+ # it has been effectively reloaded 'now'
+ self.whitelist_load_ts = time.time()
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist not modified')
+ return
+
+ if self.whitelist:
+ pywikibot.output(u'Reloading whitelist')
+
+ # Fetch whitelist
+ wikitext = whitelist_page.get()
+ # Parse whitelist
+ self.whitelist = self.parse_page_tuples(wikitext, self.user)
+ # Record timestamp
+ self.whitelist_ts = whitelist_page.editTime()
+ self.whitelist_load_ts = time.time()
+ except Exception as e:
+ # cascade if there isnt a whitelist to fallback on
+ if not self.whitelist:
+ raise
+ pywikibot.error(u'%s' % e)
+
+ @staticmethod
+ def add_to_tuples(tuples, user, page):
+ """Update tuples 'user' key by adding page."""
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u"Adding %s:%s" % (user, page.title()))
+
+ if user in tuples:
+ tuples[user].append(page)
+ else:
+ tuples[user] = [page]
+
+ def in_list(self, pagelist, title):
+ """Check if title present in pagelist."""
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Checking whitelist for: %s' % title)
+
+ # quick check for exact match
+ if title in pagelist:
+ return title
+
+ # quick check for wildcard
+ if '' in pagelist:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'wildcarded')
+ return '.*'
+
+ for item in pagelist:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'checking against whitelist item = %s' % item)
+
+ if isinstance(item, PatrolRule):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'invoking programmed rule')
+ if item.match(title):
+ return item
+
+ elif title_match(item, title):
+ return item
+
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'not found')
+
+ def parse_page_tuples(self, wikitext, user=None):
+ """Parse page details apart from 'user:' for use."""
+ tuples = {}
+
+ # for any structure, the only first 'user:' page
+ # is registered as the user the rest of the structure
+ # refers to.
+ def process_children(obj, current_user):
+ pywikibot.debug(u'Parsing node: %s' % obj, _logger)
+ for c in obj.children:
+ temp = process_node(c, current_user)
+ if temp and not current_user:
+ current_user = temp
+
+ def process_node(obj, current_user):
+ # links are analysed; interwiki links are included because mwlib
+ # incorrectly calls 'Wikisource:' namespace links an interwiki
+ if isinstance(obj, mwlib.parser.NamespaceLink) or \
+ isinstance(obj, mwlib.parser.InterwikiLink) or \
+ isinstance(obj, mwlib.parser.ArticleLink):
+ if obj.namespace == -1:
+ # the parser accepts 'special:prefixindex/' as a wildcard
+ # this allows a prefix that doesnt match an existing page
+ # to be a blue link, and can be clicked to see what pages
+ # will be included in the whitelist
+ if obj.target[:20].lower() == 'special:prefixindex/':
+ if len(obj.target) == 20:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist everything')
+ page = ''
+ else:
+ page = obj.target[20:]
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist prefixindex hack '
+ u'for: %s' % page)
+ # p = pywikibot.Page(self.site, obj.target[20:])
+ # obj.namespace = p.namespace
+ # obj.target = p.title()
+
+ elif obj.namespace == 2 and not current_user:
+ # if a target user hasn't been found yet, and the link is
+ # 'user:'
+ # the user will be the target of subsequent rules
+ page_prefix_len = len(self.site.namespace(2))
+ current_user = obj.target[(page_prefix_len + 1):]
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist user: %s' % current_user)
+ return current_user
+ else:
+ page = obj.target
+
+ if current_user:
+ if not user or current_user == user:
+ if self.is_wikisource_author_page(page):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist author: %s' % page)
+ author = LinkedPagesRule(page)
+ self.add_to_tuples(tuples, current_user, author)
+ else:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Whitelist page: %s' % page)
+ self.add_to_tuples(tuples, current_user, page)
+ elif pywikibot.config.verbose_output:
+ pywikibot.output(u'Discarding whitelist page for '
+ u'another user: %s' % page)
+ else:
+ raise Exception(u'No user set for page %s' % page)
+ else:
+ process_children(obj, current_user)
+
+ root = mwlib.uparser.parseString(title='Not used', raw=wikitext)
+ process_children(root, None)
+
+ return tuples
+
+ def is_wikisource_author_page(self, title):
+ """Initialise author_ns if site family is 'wikisource' else pass."""
+ if self.site.family.name != 'wikisource':
+ return
+
+ author_ns = 0
+ try:
+ author_ns = self.site.family.authornamespaces[self.site.lang][0]
+ except:
+ pass
+ if author_ns:
+ author_ns_prefix = self.site.namespace(author_ns)
+ pywikibot.debug(u'Author ns: %d; name: %s'
+ % (author_ns, author_ns_prefix), _logger)
+ if title.find(author_ns_prefix + ':') == 0:
+ if pywikibot.config.verbose_output:
+ author_page_name = title[len(author_ns_prefix) + 1:]
+ pywikibot.output(u'Found author %s' % author_page_name)
+ return True
+
+ def run(self, feed=None):
+ """Process 'whitelist' page absent in generator."""
+ if self.whitelist is None:
+ self.load_whitelist()
+ if not feed:
+ feed = self.getOption('feed')
+ for page in feed:
+ self.treat(page)
+
+ def treat(self, page):
+ """It loads the given page, does some changes, and saves it."""
+ choice = False
+ try:
+ # page: title, date, username, comment, loginfo, rcid, token
+ username = page['user']
+ # when the feed isnt from the API, it used to contain
+ # '(not yet written)' or '(page does not exist)' when it was
+ # a redlink
+ rcid = page['rcid']
+ title = page['title']
+ if not rcid:
+ raise Exception('rcid not present')
+
+ # check whether we have wrapped around to higher rcids
+ # which indicates a new RC feed is being processed
+ if rcid > self.last_rcid:
+ # refresh the whitelist
+ self.load_whitelist()
+ self.repeat_start_ts = time.time()
+
+ if pywikibot.config.verbose_output or self.getOption('ask'):
+ pywikibot.output(u'User %s has created or modified page %s'
+ % (username, title))
+
+ if self.getOption('autopatroluserns') and (page['ns'] == 2 or
+ page['ns'] == 3):
+ # simple rule to whitelist any user editing their own userspace
+ if title.partition(':')[2].split('/')[0].startswith(username):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'%s is whitelisted to modify %s'
+ % (username, title))
+ choice = True
+
+ if not choice and username in self.whitelist:
+ if self.in_list(self.whitelist[username], title):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'%s is whitelisted to modify %s'
+ % (username, title))
+ choice = True
+
+ if self.getOption('ask'):
+ choice = pywikibot.input_yn(
+ u'Do you want to mark page as patrolled?', automatic_quit=False)
+
+ # Patrol the page
+ if choice:
+ # list() iterates over patrol() which returns a generator
+ list(self.site.patrol(rcid))
+ self.patrol_counter = self.patrol_counter + 1
+ pywikibot.output(u'Patrolled %s (rcid %d) by user %s'
+ % (title, rcid, username))
+ else:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Skipped')
+
+ if rcid > self.highest_rcid:
+ self.highest_rcid = rcid
+ self.last_rcid = rcid
+ self.rc_item_counter = self.rc_item_counter + 1
+
+ except pywikibot.NoPage:
+ pywikibot.output(u'Page %s does not exist; skipping.'
+ % title(asLink=True))
+ except pywikibot.IsRedirectPage:
+ pywikibot.output(u'Page %s is a redirect; skipping.'
+ % title(asLink=True))
+
+
+def title_match(prefix, title):
+ """Match title substring with given prefix."""
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Matching %s to prefix %s' % (title, prefix))
+ if title.startswith(prefix):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'substr match')
+ return True
+ return
+
+
+class PatrolRule(object):
+
+ """Bot marks the edit.startswith("-s as patrolled based on info obtained by whitelist."""
+
+ def __init__(self, page_title):
+ """
+ Constructor.
+
+ @param page_title: The page title for this rule
+ @type page_title: pywikibot.Page
+ """
+ self.page_title = page_title
+
+ def title(self):
+ """Obtain page title."""
+ return self.page_title
+
+ def match(self, page):
+ """Added for future use."""
+ pass
+
+
+class LinkedPagesRule(PatrolRule):
+
+ """Matches of page site title and linked pages title."""
+
+ def __init__(self, page_title):
+ """Constructor.
+
+ @param page_title: The page title for this rule
+ @type page_title: pywikibot.Page
+ """
+ self.site = pywikibot.Site()
+ self.page_title = page_title
+ self.linkedpages = None
+
+ def match(self, page_title):
+ """Match page_title to linkedpages elements."""
+ if page_title == self.page_title:
+ return True
+
+ if not self.site.family.name == 'wikisource':
+ raise Exception('This is a wikisource rule')
+
+ if not self.linkedpages:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'loading page links on %s' % self.page_title)
+ p = pywikibot.Page(self.site, self.page_title)
+ linkedpages = list()
+ for linkedpage in p.linkedPages():
+ linkedpages.append(linkedpage.title())
+
+ self.linkedpages = linkedpages
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Loaded %d page links' % len(linkedpages))
+
+ for p in self.linkedpages:
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u"Checking against '%s'" % p)
+ if title_match(p, page_title):
+ if pywikibot.config.verbose_output:
+ pywikibot.output(u'Matched.')
+ return p
+
+
+def api_feed_repeater(gen, delay=0, repeat=False, number=1000, namespaces=None,
+ user=None, recent_new_gen=True):
+ """Generator which loads pages details to be processed."""
+ while True:
+ if recent_new_gen:
+ generator = gen(step=number, namespaces=namespaces, user=user,
+ showPatrolled=False)
+ else:
+ generator = gen(step=number, namespaces=namespaces, user=user,
+ returndict=True, showPatrolled=False)
+ for page in generator:
+ if recent_new_gen:
+ yield page
+ else:
+ yield page[1]
+ if repeat:
+ pywikibot.output(u'Sleeping for %d seconds' % delay)
+ time.sleep(delay)
+ else:
+ break
+
+
+def main(*args):
+ """Process command line arguments and invoke PatrolBot."""
+ # This factory is responsible for processing command line arguments
+ # that are also used by other scripts and that determine on which pages
+ # to work on.
+ usercontribs = None
+ gen = None
+ recentchanges = False
+ newpages = False
+ repeat = False
+ genFactory = pagegenerators.GeneratorFactory()
+ options = {}
+
+ # Parse command line arguments
+ for arg in pywikibot.handle_args(args):
+ if arg.startswith('-ask'):
+ options['ask'] = True
+ elif arg.startswith('-autopatroluserns'):
+ options['autopatroluserns'] = True
+ elif arg.startswith('-repeat'):
+ repeat = True
+ elif arg.startswith('-newpages'):
+ newpages = True
+ elif arg.startswith('-recentchanges'):
+ recentchanges = True
+ elif arg.startswith('-usercontribs:'):
+ usercontribs = arg[14:]
+ elif arg.startswith('-versionchecktime:'):
+ versionchecktime = arg[len('-versionchecktime:'):]
+ options['versionchecktime'] = int(versionchecktime)
+ elif arg.startswith("-whitelist:"):
+ options['whitelist'] = arg[len('-whitelist:'):]
+ else:
+ generator = genFactory.handleArg(arg)
+ if not generator:
+ if ':' in arg:
+ m = arg.split(':')
+ options[m[0]] = m[1]
+
+ site = pywikibot.Site()
+ site.login()
+
+ if usercontribs:
+ pywikibot.output(u'Processing user: %s' % usercontribs)
+
+ newpage_count = 300
+ if not newpages and not recentchanges and not usercontribs:
+ if site.family.name == 'wikipedia':
+ newpages = True
+ newpage_count = 5000
+ else:
+ recentchanges = True
+
+ bot = PatrolBot(**options)
+
+ if newpages or usercontribs:
+ pywikibot.output(u'Newpages:')
+ gen = site.newpages
+ feed = api_feed_repeater(gen, delay=60, repeat=repeat,
+ number=newpage_count, user=usercontribs,
+ namespaces=genFactory.namespaces,
+ recent_new_gen=False)
+ bot.run(feed)
+
+ if recentchanges or usercontribs:
+ pywikibot.output(u'Recentchanges:')
+ gen = site.recentchanges
+ feed = api_feed_repeater(gen, delay=60, repeat=repeat, number=1000,
+ namespaces=genFactory.namespaces,
+ user=usercontribs)
+ bot.run(feed)
+
+ pywikibot.output(u'%d/%d patrolled'
+ % (bot.patrol_counter, bot.rc_item_counter))
+
+if __name__ == '__main__':
+ main()
diff --git a/setup.py b/setup.py
index 274cdc2..fe8bf18 100644
--- a/setup.py
+++ b/setup.py
@@ -39,7 +39,8 @@
'script_wui.py': [irc_dep, 'lunatic-python', 'crontab'],
# Note: None of the 'lunatic-python' repos on github support MS Windows.
'flickrripper.py': ['Pillow'],
- 'states_redirect.py': ['pycountry']
+ 'states_redirect.py': ['pycountry'],
+ 'patrol': ['mwlib'],
}
# flickrapi 1.4.4 installs a root logger in verbose mode; 1.4.5 fixes this.
# The problem doesnt exist in flickrapi 2.x.
diff --git a/tests/script_tests.py b/tests/script_tests.py
index a85088b..fb47dea 100644
--- a/tests/script_tests.py
+++ b/tests/script_tests.py
@@ -31,6 +31,7 @@
'flickrripper': ['flickrapi'],
'match_images': ['PIL.ImageTk'],
'states_redirect': ['pycountry'],
+ 'patrol': ['mwlib'],
}
if sys.version_info < (2, 7):
@@ -108,6 +109,7 @@
'revertbot',
'noreferences',
'nowcommons',
+ 'patrol',
'script_wui',
'shell',
'states_redirect',
--
To view, visit https://gerrit.wikimedia.org/r/184118
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I8612ce905d149d0e440d819f62f923385a583920
Gerrit-PatchSet: 28
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Prianka <priyankajayaswal025(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Prianka <priyankajayaswal025(a)gmail.com>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged.
Change subject: Adding commands.log generating function in core.
......................................................................
Adding commands.log generating function in core.
The compat version contains a history of all commands which were used for
running bots. It is useful when we want to run some more complicated
command again. There is no such log in the core branch. As such I have
added this enhancement as part of the Pywikibot Compat-to-Core migration.
Bug: T69724
Change-Id: I8e18c07f1b06569ff29cd8cab3900d515c187f1b
---
M pywikibot/bot.py
1 file changed, 27 insertions(+), 0 deletions(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/bot.py b/pywikibot/bot.py
index be83074..3a5ea91 100644
--- a/pywikibot/bot.py
+++ b/pywikibot/bot.py
@@ -13,6 +13,7 @@
# Note: all output goes thru python std library "logging" module
+import codecs
import datetime
import json
import logging
@@ -20,6 +21,7 @@
import os
import re
import sys
+import time
import warnings
import webbrowser
@@ -774,6 +776,7 @@
config.usernames[config.family][config.mylang] = username
init_handlers()
+ writeToCommandLogFile()
if config.verbose_output:
# Please don't change the regular expression here unless you really
@@ -889,6 +892,30 @@
pywikibot.stdout(globalHelp)
+def writeToCommandLogFile():
+ """
+ Save name of the called module along with all parameters to logs/commands.log.
+
+ This can be used by user later to track errors or report bugs.
+ """
+ modname = calledModuleName()
+ # put quotation marks around all parameters
+ args = [modname] + [u'"%s"' % s for s in pywikibot.argvu[1:]]
+ command_log_filename = config.datafilepath('logs', 'commands.log')
+ try:
+ command_log_file = codecs.open(command_log_filename, 'a', 'utf-8')
+ except IOError:
+ command_log_file = codecs.open(command_log_filename, 'w', 'utf-8')
+ # add a timestamp in ISO 8601 formulation
+ isoDate = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
+ command_log_file.write('%s r%s Python %s '
+ % (isoDate, version.getversiondict()['rev'],
+ sys.version.split()[0]))
+ s = u' '.join(args)
+ command_log_file.write(s + os.linesep)
+ command_log_file.close()
+
+
def open_webbrowser(page):
"""Open the web browser displaying the page and wait for input."""
from pywikibot import i18n
--
To view, visit https://gerrit.wikimedia.org/r/198547
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I8e18c07f1b06569ff29cd8cab3900d515c187f1b
Gerrit-PatchSet: 8
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Prianka <priyankajayaswal025(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>