Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/772029 )
Change subject: [doc] convert phab task references into a link
......................................................................
[doc] convert phab task references into a link
- use sphinx.ext.extlinks to convert the :phab: role into a link to a
Phabricator task (a conf.py sketch follows below)
- connect pywikibot_fix_phab_tasks to Sphinx to convert Phabricator
task references T\d{5,6} inside code documentation to that :phab: role
(a hook sketch follows below)
- add the pywikibot_fix_phab_tasks app
- change some documentation to use the :phab: role
- also use sphinx.ext.autosectionlabel so that every label becomes a reference
- remove sphinx.ext.coverage, which is not used as far as I can see
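
For reference, a minimal docs/conf.py sketch of the extlinks setup
described above; the base URL and caption values here are assumptions,
not copied from this change:

    # docs/conf.py (sketch) -- wire up the two extensions named above
    extensions = [
        'sphinx.ext.extlinks',          # provides the :phab: shorthand role
        'sphinx.ext.autosectionlabel',  # every section label becomes a reference
    ]

    # With this mapping, :phab:`T286364` renders as a link to the task;
    # the base URL and '%s' caption are assumed values.
    extlinks = {
        'phab': ('https://phabricator.wikimedia.org/%s', '%s'),
    }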
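
And a hedged sketch of how such a pywikibot_fix_phab_tasks app could be
connected to Sphinx, assuming it hooks the autodoc-process-docstring
event; the handler name and body are illustrative, not the actual
implementation:

    # pywikibot_fix_phab_tasks (sketch): rewrite bare task ids matching
    # T\d{5,6} inside docstrings into the :phab: role handled by extlinks.
    import re

    TASK_RE = re.compile(r'\bT\d{5,6}\b')

    def fix_phab_tasks(app, what, name, obj, options, lines):
        """Replace T12345/T123456 references with the :phab: role."""
        for i, line in enumerate(lines):
            lines[i] = TASK_RE.sub(r':phab:`\g<0>`', line)

    def setup(app):
        # 'autodoc-process-docstring' fires for every docstring autodoc reads
        app.connect('autodoc-process-docstring', fix_phab_tasks)
        return {'parallel_read_safe': True}
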
Bug: T286364
Change-Id: Idfbabc1694592d93d8b344915623ad2dc6a5417e
---
M HISTORY.rst
M ROADMAP.rst
M docs/api_ref/pywikibot.config.rst
M docs/conf.py
M docs/faq.rst
M docs/global_options.rst
M docs/glossary.rst
M docs/installation.rst
M docs/licenses.rst
M scripts/CHANGELOG.md
M tox.ini
11 files changed, 576 insertions(+), 569 deletions(-)
Approvals:
Xqt: Verified; Looks good to me, approved
diff --git a/HISTORY.rst b/HISTORY.rst
index aee8aaf..b1c3e5f 100644
--- a/HISTORY.rst
+++ b/HISTORY.rst
@@ -10,11 +10,11 @@
* i18n updates for date.py
* Add number transliteration of 'lo', 'ml', 'pa', 'te' to NON_LATIN_DIGITS
-* Detect range blocks with Page.is_blocked() method (T301282)
+* Detect range blocks with Page.is_blocked() method (:phab:`T301282`)
* to_latin_digits() function was added to textlib as counterpart of to_local_digits() function
* api.Request.submit now handles search-title-disabled and search-text-disabled API Errors
* A show_diff parameter was added to Page.put() and Page.change_category()
-* Allow categories when saving IndexPage (T299806)
+* Allow categories when saving IndexPage (:phab:`T299806`)
* Add a new function case_escape to textlib
* Support inheritance of the __STATICREDIRECT__
* Avoid non-deterministic behavior in removeDisabledParts
@@ -22,57 +22,57 @@
* Synchronize Page.linkedPages() parameters with Site.pagelinks() parameters
* Scripts hash bang was changed from python to python3
* i18n.bundles(), i18n.known_languages and i18n._get_bundle() functions were added
-* Raise ConnectionError immediately if urllib3.NewConnectionError occurs (T297994, 298859)
-* Make pywikibot messages available with site package (T57109, T275981)
+* Raise ConnectionError immediately if urllib3.NewConnectionError occurs (:phab:`T297994`, :phab:`T298859`)
+* Make pywikibot messages available with site package (:phab:`T57109`, :phab:`T275981`)
* Add support for API:Redirects
* Enable shell script with Pywikibot site package
-* Enable generate_user_files.py and generate_family_file with site-package (T107629)
+* Enable generate_user_files.py and generate_family_file with site-package (:phab:`T107629`)
* Add support for Python 3.11
-* Pywikibot supports PyPy 3 (T101592)
-* A new method User.is_locked() was added to determine whether the user is currently locked globally (T249392)
-* A new method APISite.is_locked() was added to determine whether a given user or user id is locked globally (T249392)
-* APISite.get_globaluserinfo() method was added to retrieve globaluserinfo for any user or user id (T163629)
+* Pywikibot supports PyPy 3 (:phab:`T101592`)
+* A new method User.is_locked() was added to determine whether the user is currently locked globally (:phab:`T249392`)
+* A new method APISite.is_locked() was added to determine whether a given user or user id is locked globally (:phab:`T249392`)
+* APISite.get_globaluserinfo() method was added to retrieve globaluserinfo for any user or user id (:phab:`T163629`)
* APISite.globaluserinfo attribute may be deleted to force reload
* APISite.is_blocked() method has a force parameter to reload that info
* Allow family files in base_dir by default
-* Make pwb wrapper script a pywikibot entry point for scripts (T139143, T270480)
-* Enable -version and --version with pwb wrapper or code entry point (T101828)
-* Add `title_delimiter_and_aliases` attribute to family files to support WikiHow family (T294761)
+* Make pwb wrapper script a pywikibot entry point for scripts (:phab:`T139143`, :phab:`T270480`)
+* Enable -version and --version with pwb wrapper or code entry point (:phab:`T101828`)
+* Add `title_delimiter_and_aliases` attribute to family files to support WikiHow family (:phab:`T294761`)
* BaseBot has a public collections.Counter for reading, writing and skipping a page
-* Upload: Retry upload if 'copyuploadbaddomain' API error occurs (T294825)
+* Upload: Retry upload if 'copyuploadbaddomain' API error occurs (:phab:`T294825`)
* Update invisible characters from unicodedata 14.0.0
* Add support for Wikimedia OCR engine with proofreadpage
-* Rewrite tools.intersect_generators, making it run up to 10,000 times faster (T85623, T293276)
-* The cached output functionality from compat release was re-implemented (T151727, T73646, T74942, T132135, T144698, T196039, T280466)
+* Rewrite tools.intersect_generators, making it run up to 10,000 times faster (:phab:`T85623`, :phab:`T293276`)
+* The cached output functionality from compat release was re-implemented (:phab:`T151727`, :phab:`T73646`, :phab:`T74942`, :phab:`T132135`, :phab:`T144698`, :phab:`T196039`, :phab:`T280466`)
* L10N updates
-* Adjust groupsize within pagegenerators.PreloadingGenerator (T291770)
-* New "maxlimit" property was added to APISite (T291770)
+* Adjust groupsize within pagegenerators.PreloadingGenerator (:phab:`T291770`)
+* New "maxlimit" property was added to APISite (:phab:`T291770`)
Bugfixes
~~~~~~~~
-* Don't raise an exception if BlockEntry initializer found a hidden title (T78152)
-* Fix KeyError in create_warnings_list (T301610)
-* Enable similar script call of pwb.py on toolforge (T298846)
-* Remove question mark character from forbidden file name characters (T93482)
-* Enable -interwiki option with pagegenerators (T57099)
-* Don't assert login result (T298761)
-* Allow title placeholder $1 in the middle of a URL (T111513, T298078)
-* Don't create a Site object if pywikibot is not fully imported (T298384)
-* Use page.site.data_repository when creating a _WbDataPage (T296985)
-* Fix mysql AttributeError for sock.close() on toolforge (T216741)
-* Only search user_script_paths inside config.base_dir (T296204)
-* pywikibot.argv has been fixed for pwb.py wrapper if called with global args (T254435)
-* Only ignore FileExistsError when creating the api cache (T295924)
-* Only handle query limit if query module is limited (T294836)
-* Upload: Only set filekey/offset for files with names (T294916)
-* Make site parameter of textlib.replace_links() mandatory (T294649)
-* Raise a generic ServerError if the http status code is unofficial (T293208)
+* Don't raise an exception if BlockEntry initializer found a hidden title (:phab:`T78152`)
+* Fix KeyError in create_warnings_list (:phab:`T301610`)
+* Enable similar script call of pwb.py on toolforge (:phab:`T298846`)
+* Remove question mark character from forbidden file name characters (:phab:`T93482`)
+* Enable -interwiki option with pagegenerators (:phab:`T57099`)
+* Don't assert login result (:phab:`T298761`)
+* Allow title placeholder $1 in the middle of a URL (:phab:`T111513`, :phab:`T298078`)
+* Don't create a Site object if pywikibot is not fully imported (:phab:`T298384`)
+* Use page.site.data_repository when creating a _WbDataPage (:phab:`T296985`)
+* Fix mysql AttributeError for sock.close() on toolforge (:phab:`T216741`)
+* Only search user_script_paths inside config.base_dir (:phab:`T296204`)
+* pywikibot.argv has been fixed for pwb.py wrapper if called with global args (:phab:`T254435`)
+* Only ignore FileExistsError when creating the api cache (:phab:`T295924`)
+* Only handle query limit if query module is limited (:phab:`T294836`)
+* Upload: Only set filekey/offset for files with names (:phab:`T294916`)
+* Make site parameter of textlib.replace_links() mandatory (:phab:`T294649`)
+* Raise a generic ServerError if the http status code is unofficial (:phab:`T293208`)
Breaking changes
~~~~~~~~~~~~~~~~
-* Support of Python 3.5.0 - 3.5.2 has been dropped (T286867)
+* Support of Python 3.5.0 - 3.5.2 has been dropped (:phab:`T286867`)
* generate_user_files.py, generate_family_file.py, shell.py and version.py were moved to pywikibot/scripts and must be used with pwb wrapper script
* *See also Code cleanups below*
@@ -86,7 +86,7 @@
* showHelp() function was removed in favour of show_help
* CombinedPageGenerator pagegenerator was removed in favour of itertools.chain
* Remove deprecated echo.Notification.id
-* Remove APISite.newfiles() method (T168339)
+* Remove APISite.newfiles() method (:phab:`T168339`)
* Remove APISite.page_exists() method
* Raise a TypeError if BaseBot.init_page returns None
* Remove private upload parameters in UploadRobot.upload_file(), FilePage.upload() and APISite.upload() methods
@@ -98,10 +98,10 @@
* Deprecated namespace and pageTitle parameter of CosmeticChangesToolkit were removed
* Remove deprecated BaseSite namespace shortcuts
* Remove deprecated Family.get_cr_templates method in favour of Site.category_redirects()
-* Remove deprecated Page.put_async() method (T193494)
+* Remove deprecated Page.put_async() method (:phab:`T193494`)
* Ignore baserevid parameter for several DataSite methods
* Remove deprecated preloaditempages method
-* Remove disable_ssl_certificate_validation kwargs in http functions in favour of verify parameter (T265206)
+* Remove disable_ssl_certificate_validation kwargs in http functions in favour of verify parameter (:phab:`T265206`)
* Deprecated PYWIKIBOT2 environment variables were removed
* version.ParseError was removed in favour of exceptions.VersionParseError
* specialbots.EditReplacement and specialbots.EditReplacementError were removed in favour of exceptions.EditReplacementError
@@ -118,8 +118,8 @@
* DuplicateFilterPageGenerator was replaced by tools.filter_unique
* ItemPage.concept_url method was replaced by ItemPage.concept_uri
* Outdated parameter names have been dropped
-* Deprecated pywikibot.Error exception were removed in favour of pywikibot.exceptions.Error classes (T280227)
-* Deprecated exception identifiers were removed (T280227)
+* Deprecated pywikibot.Error exception were removed in favour of pywikibot.exceptions.Error classes (:phab:`T280227`)
+* Deprecated exception identifiers were removed (:phab:`T280227`)
* Deprecated date.FormatDate class was removed in favour of date.format_date function
* language_by_size property of wowwiki Family was removed in favour of codes attribute
* availableOptions was removed in favour of available_options
@@ -160,15 +160,15 @@
-----
*28 October 2021*
-* L10N updates (T292423, T294526, T294527)
+* L10N updates (:phab:`T292423`, :phab:`T294526`, :phab:`T294527`)
6.6.1
-----
*21 September 2021*
-* Fix for removed action API token parameters of MediaWiki 1.37 (T291202)
-* APISite.validate_tokens() no longer replaces outdated tokens (T291202)
+* Fix for removed action API token parameters of MediaWiki 1.37 (:phab:`T291202`)
+* APISite.validate_tokens() no longer replaces outdated tokens (:phab:`T291202`)
* L10N updates
@@ -176,8 +176,8 @@
-----
*15 September 2021*
-* Drop piprop from meta=proofreadinfo API call (T290585)
-* Remove use_2to3 with setup.py (T290451)
+* Drop piprop from meta=proofreadinfo API call (:phab:`T290585`)
+* Remove use_2to3 with setup.py (:phab:`T290451`)
* Unify WbRepresentation's abstract method signature
* L10N updates
@@ -186,18 +186,18 @@
-----
*05 August 2021*
-* Add support for jvwikisource (T286247)
+* Add support for jvwikisource (:phab:`T286247`)
* Handle missingtitle error code when deleting
-* Check for outdated setuptools in pwb.py wrapper (T286980)
+* Check for outdated setuptools in pwb.py wrapper (:phab:`T286980`)
* Remove traceback for original exception for known API error codes
* Unused strm parameter of init_handlers was removed
-* Ignore throttle.pid if a Site object cannot be created (T286848)
-* Explicitly return an empty string with OutputProxyOption.out property (T286403)
-* Explicitly return None from ContextOption.result() (T286403)
-* Add support for Lingua Libre family (T286303)
+* Ignore throttle.pid if a Site object cannot be created (:phab:`T286848`)
+* Explicitly return an empty string with OutputProxyOption.out property (:phab:`T286403`)
+* Explicitly return None from ContextOption.result() (:phab:`T286403`)
+* Add support for Lingua Libre family (:phab:`T286303`)
* Catch invalid titles in Category.isCategoryRedirect()
* L10N updates
-* Provide structured data on Commons (T213904, T223820)
+* Provide structured data on Commons (:phab:`T213904`, :phab:`T223820`)
6.4.0
@@ -206,25 +206,25 @@
* Add support for dagwiki, shiwiki and banwikisource
* Fix and clean up DataSite.get_property_by_name
-* Update handling of abusefilter-{disallow,warning} codes (T285317)
-* Fix terminal_interface_base.input_list_choice (T285597)
+* Update handling of abusefilter-{disallow,warning} codes (:phab:`T285317`)
+* Fix terminal_interface_base.input_list_choice (:phab:`T285597`)
* Fix ItemPage.fromPage call
* Use \*iterables instead of genlist in intersect_generators
* Use a sentinel variable to determine the end of an iterable in roundrobin_generators
-* Require setuptools 20.8.1 (T284297)
+* Require setuptools 20.8.1 (:phab:`T284297`)
* Add setter and deleter for summary_parameters of AutomaticTWSummaryBot
* L10N updates
* Add update_options attribute to BaseBot class to update available_options
-* Clear put_queue when canceling page save (T284396)
-* Add -url option to pagegenerators (T239436)
-* Add add_text function to textlib (T284388)
-* Require setuptools >= 49.4.0 (T284297)
+* Clear put_queue when canceling page save (:phab:`T284396`)
+* Add -url option to pagegenerators (:phab:`T239436`)
+* Add add_text function to textlib (:phab:`T284388`)
+* Require setuptools >= 49.4.0 (:phab:`T284297`)
* Require wikitextparser>=0.47.5
-* Allow images to upload locally even if they exist in the shared repository (T267535)
-* Show a warning if pywikibot.__version__ is behind scripts.__version__ (T282766)
-* Handle <ce>/<chem> tags as <math> aliases within textlib.replaceExcept() (T283990)
-* Expand simulate query response for wikibase support (T76694)
-* Double the wait time if ratelimit exceeded (T270912)
+* Allow images to upload locally even if they exist in the shared repository (:phab:`T267535`)
+* Show a warning if pywikibot.__version__ is behind scripts.__version__ (:phab:`T282766`)
+* Handle <ce>/<chem> tags as <math> aliases within textlib.replaceExcept() (:phab:`T283990`)
+* Expand simulate query response for wikibase support (:phab:`T76694`)
+* Double the wait time if ratelimit exceeded (:phab:`T270912`)
* Deprecated extract_templates_and_params_mwpfh and extract_templates_and_params_regex functions were removed
@@ -232,12 +232,12 @@
-----
*31 May 2021*
-* Check bot/nobots templates for cosmetic_changes hook (T283989)
-* Remove outdated opt._option which is already dropped (T284005)
+* Check bot/nobots templates for cosmetic_changes hook (:phab:`T283989`)
+* Remove outdated opt._option which is already dropped (:phab:`T284005`)
* Use IntEnum with cosmetic_changes CANCEL
-* Remove lru_cache from botMayEdit method and fix its logic (T283957)
-* DataSite.createNewItemFromPage() method was removed in favour of ImagePage.fromPage() (T98663)
-* mwparserfromhell or wikitextparser MediaWiki markup parser is mandatory (T106763)
+* Remove lru_cache from botMayEdit method and fix its logic (:phab:`T283957`)
+* DataSite.createNewItemFromPage() method was removed in favour of ImagePage.fromPage() (:phab:`T98663`)
+* mwparserfromhell or wikitextparser MediaWiki markup parser is mandatory (:phab:`T106763`)
6.2.0
@@ -247,23 +247,23 @@
Improvements and Bugfixes
~~~~~~~~~~~~~~~~~~~~~~~~~
-* Use different logfiles for multiple processes of the same script (T56685)
+* Use different logfiles for multiple processes of the same script (:phab:`T56685`)
* throttle.pid will be reused as soon as possible
* terminal_interface_base.TerminalHandler is subclassed from logging.StreamHandler
-* Fix iterating of SizedKeyCollection (T282865)
+* Fix iterating of SizedKeyCollection (:phab:`T282865`)
* An abstract base user interface module was added
-* APISite method pagelanglinks() may skip links with empty titles (T223157)
+* APISite method pagelanglinks() may skip links with empty titles (:phab:`T223157`)
* Fix Page.getDeletedRevision() method which always returned an empty list
-* Async chunked uploads are supported (T129216, T133443)
-* A new InvalidPageError will be raised if a Page has no version history (T280043)
+* Async chunked uploads are supported (:phab:`T129216`, :phab:`T133443`)
+* A new InvalidPageError will be raised if a Page has no version history (:phab:`T280043`)
* L10N updates
-* Fix __getattr__ for WikibaseEntity (T281389)
-* Handle abusefilter-{disallow,warning} codes (T85656)
+* Fix __getattr__ for WikibaseEntity (:phab:`T281389`)
+* Handle abusefilter-{disallow,warning} codes (:phab:`T85656`)
Code cleanups
~~~~~~~~~~~~~
-* MultipleSitesBot.site attribute was removed (T283209)
+* MultipleSitesBot.site attribute was removed (:phab:`T283209`)
* Deprecated BaseSite.category_namespaces() method was removed
* i18n.twntranslate() function was removed in favour of twtranslate()
* siteinfo must be used as a dictionary and cannot be called anymore
@@ -277,12 +277,12 @@
* Rename _MultiTemplateMatchBuilder to MultiTemplateMatchBuilder
* User.name() method was removed in favour of User.username property
* BasePage.getLatestEditors() method was removed in favour of contributors() or revisions()
-* pagegenerators.handleArg() method was renamed to handle_arg() (T271437)
+* pagegenerators.handleArg() method was renamed to handle_arg() (:phab:`T271437`)
* CategoryGenerator, FileGenerator, ImageGenerator and ReferringPageGenerator pagegenerator functions were removed
-* Family.ignore_certificate_error() method was removed in favour of verify_SSL_certificate (T265205)
+* Family.ignore_certificate_error() method was removed in favour of verify_SSL_certificate (:phab:`T265205`)
* tools.is_IP was renamed to is_ip_address due to PEP8
* config2.py was renamed to config.py
-* Exceptions were renamed to have an "Error" suffix due to PEP8 (T280227)
+* Exceptions were renamed to have an "Error" suffix due to PEP8 (:phab:`T280227`)
6.1.0
@@ -292,23 +292,23 @@
Improvements and Bugfixes
~~~~~~~~~~~~~~~~~~~~~~~~~
-* proofreadpage: search for "new" class after purge (T280357)
+* proofreadpage: search for "new" class after purge (:phab:`T280357`)
* Enable different types with BaseBot.treat()
-* Context manager depends on pymysql version, not Python release (T279753)
-* Bugfix for Site.interwiki_prefix() (T188179)
-* Exclude expressions from parsed template in mwparserfromhell (T71384)
+* Context manager depends on pymysql version, not Python release (:phab:`T279753`)
+* Bugfix for Site.interwiki_prefix() (:phab:`T188179`)
+* Exclude expressions from parsed template in mwparserfromhell (:phab:`T71384`)
* Provide an object representation for DequeGenerator
-* Allow deleting any subclass of BasePage by title (T278659)
-* Add support for API:Revisiondelete with Site.deleterevs() method (T276726)
+* Allow deleting any subclass of BasePage by title (:phab:`T278659`)
+* Add support for API:Revisiondelete with Site.deleterevs() method (:phab:`T276726`)
* L10N updates
-* Family files can be collected from a zip folder (T278076)
+* Family files can be collected from a zip folder (:phab:`T278076`)
Dependencies
~~~~~~~~~~~~
-* **mwparserfromhell** or **wikitextparser** are strictly recommended (T106763)
-* Require **Pillow**>=8.1.1 due to vulnerability found (T278743)
-* TkDialog of GUI userinterface requires **Python 3.6+** (T278743)
+* **mwparserfromhell** or **wikitextparser** are strictly recommended (:phab:`T106763`)
+* Require **Pillow**>=8.1.1 due to vulnerability found (:phab:`T278743`)
+* TkDialog of GUI userinterface requires **Python 3.6+** (:phab:`T278743`)
* Enable textlib.extract_templates_and_params with **wikitextparser** package
* Add support for **PyMySQL** 1.0.0+
@@ -318,7 +318,7 @@
* APISite.resolvemagicwords(), BaseSite.ns_index() and BaseSite.getNamespaceIndex() were removed
* Deprecated MoveEntry.new_ns() and new_title() methods were removed
* Unused NoSuchSite and PageNotSaved exceptions were removed
-* Unused BadTitle exception was removed (T267768)
+* Unused BadTitle exception was removed (:phab:`T267768`)
* getSite() function was removed in favour of Site() constructor
* Page.fileUrl() was removed in favour of Page.get_file_url()
* Deprecated getuserinfo and getglobaluserinfo Site methods were removed
@@ -328,7 +328,7 @@
-----
*20 March 2021*
-* Add support for taywiki, trvwiki and mnwwiktionary (T275838, T276128, T276250)
+* Add support for taywiki, trvwiki and mnwwiktionary (:phab:`T275838`, :phab:`T276128`, :phab:`T276250`)
6.0.0
@@ -338,74 +338,74 @@
Breaking changes
~~~~~~~~~~~~~~~~
-* interwiki_graph module was removed (T223826)
+* interwiki_graph module was removed (:phab:`T223826`)
* Require setuptools >= 20.2 due to PEP 440
-* Support of MediaWiki < 1.23 has been dropped (T268979)
+* Support of MediaWiki < 1.23 has been dropped (:phab:`T268979`)
* APISite.loadimageinfo will no longer return any content
-* Return requests.Response with http.request() instead of plain text (T265206)
+* Return requests.Response with http.request() instead of plain text (:phab:`T265206`)
* config.db_hostname has been renamed to db_hostname_format
Code cleanups
~~~~~~~~~~~~~
-* tools.PY2 was removed (T213287)
+* tools.PY2 was removed (:phab:`T213287`)
* Site.language() method was removed in favour of Site.lang property
* Deprecated Page.getMovedTarget() method was removed in favour of moved_target()
* Remove deprecated Wikibase.lastrevid attribute
-* config settings of archived scripts were removed (T223826)
-* Drop startsort/endsort parameter for site.categorymembers method (T74101)
-* Deprecated data attribute of http.fetch() result has been dropped (T265206)
+* config settings of archived scripts were removed (:phab:`T223826`)
+* Drop startsort/endsort parameter for site.categorymembers method (:phab:`T74101`)
+* Deprecated data attribute of http.fetch() result has been dropped (:phab:`T265206`)
* toStdout parameter of pywikibot.output() has been dropped
* Deprecated Site.getToken() and Site.case was removed
-* Deprecated Family.known_families dict was removed (T89451)
+* Deprecated Family.known_families dict was removed (:phab:`T89451`)
* Deprecated DataSite.get_* methods were removed
* Deprecated LogEntryFactory.logtypes classproperty was removed
-* Unused comms.threadedhttp module was removed; threadedhttp.HttpRequest was already replaced with requests.Response (T265206)
+* Unused comms.threadedhttp module was removed; threadedhttp.HttpRequest was already replaced with requests.Response (:phab:`T265206`)
Other changes
~~~~~~~~~~~~~
-* Raise a SiteDefinitionError if api request response is Non-JSON and site is AutoFamily (T272911)
-* Support deleting and undeleting specific file versions (T276725)
+* Raise a SiteDefinitionError if api request response is Non-JSON and site is AutoFamily (:phab:`T272911`)
+* Support deleting and undeleting specific file versions (:phab:`T276725`)
* Only add the generator bot option if the bot class already has it
-* Raise a RuntimeError if pagegenerators -namespace option is provided too late (T276916)
-* Check for LookupError exception in http._decide_encoding (T276715)
-* Re-enable setting private family files (T270949)
+* Raise a RuntimeError if pagegenerators -namespace option is provided too late (:phab:`T276916`)
+* Check for LookupError exception in http._decide_encoding (:phab:`T276715`)
+* Re-enable setting private family files (:phab:`T270949`)
* Move the hardcoded namespace identifiers to an IntEnum
* Buffer 'pageprops' in api.QueryGenerator
* Ensure that BaseBot.generator is a Generator
-* Add additional info into log if 'messagecode' is missing during login (T261061, T269503)
-* Use hardcoded messages if i18n system is not available (T275981)
+* Add additional info into log if 'messagecode' is missing during login (:phab:`T261061`, :phab:`T269503`)
+* Use hardcoded messages if i18n system is not available (:phab:`T275981`)
* Move wikibase data structures to page/_collections.py
* L10N updates
-* Add support for altwiki (T271984)
-* Add support for mniwiki and mniwiktionary (T273467, T273462)
-* Don't use mime parameter as boolean in api.Request (T274723)
-* textlib.removeDisabledPart is able to remove templates (T274138)
-* Create a SiteLink with __getitem__ method and implement lazy load (T273386, T245809, T238471, T226157)
-* Fix date.formats['MonthName'] behaviour (T273573)
+* Add support for altwiki (:phab:`T271984`)
+* Add support for mniwiki and mniwiktionary (:phab:`T273467`, :phab:`T273462`)
+* Don't use mime parameter as boolean in api.Request (:phab:`T274723`)
+* textlib.removeDisabledPart is able to remove templates (:phab:`T274138`)
+* Create a SiteLink with __getitem__ method and implement lazy load (:phab:`T273386`, :phab:`T245809`, :phab:`T238471`, :phab:`T226157`)
+* Fix date.formats['MonthName'] behaviour (:phab:`T273573`)
* Implement pagegenerators.handle_args() to process all options at once
-* Add enabled_options, disabled_options to GeneratorFactory (T271320)
+* Add enabled_options, disabled_options to GeneratorFactory (:phab:`T271320`)
* Move interwiki(), interwiki_prefix() and local_interwiki() methods from BaseSite to APISite
-* Add requests.Response.headers to log when an API error occurs (T272325)
+* Add requests.Response.headers to log when an API error occurs (:phab:`T272325`)
5.6.0
-----
*24 January 2021*
-* Use string instead of Path-like object with "open" function in UploadRobot for Python 3.5 compatibility (T272345)
-* Add support for trwikivoyage (T271263)
-* UI.input_list_choice() has been improved (T272237)
+* Use string instead of Path-like object with "open" function in UploadRobot for Python 3.5 compatibility (:phab:`T272345`)
+* Add support for trwikivoyage (:phab:`T271263`)
+* UI.input_list_choice() has been improved (:phab:`T272237`)
* Global handleArgs() function was removed in favour of handle_args
* Deprecated originPage and foundIn properties have been removed in interwiki_graph.py
* ParamInfo modules, prefixes, query_modules_with_limits properties and module_attribute_map() method were removed
* Allow querying alldeletedrevisions with APISite.alldeletedrevisions() and User.deleted_contributions()
-* data attribute of http.fetch() response is deprecated (T265206)
-* Positional arguments of page.Revision aren't supported any longer (T259428)
-* pagegenerators.handleArg() method was renamed to handle_arg() (T271437)
+* data attribute of http.fetch() response is deprecated (:phab:`T265206`)
+* Positional arguments of page.Revision aren't supported any longer (:phab:`T259428`)
+* pagegenerators.handleArg() method was renamed to handle_arg() (:phab:`T271437`)
* Page methods deprecated for 6 years were removed
-* Create a Site with AutoFamily if a family isn't predefined (T249087)
+* Create a Site with AutoFamily if a family isn't predefined (:phab:`T249087`)
* L10N updates
@@ -413,11 +413,11 @@
-----
*12 January 2021*
-* Add support for niawiki, bclwikt, diqwikt, niawikt (T270416, T270282, T270278, T270412)
-* Delete page using pageid instead of title (T57072)
-* version.getversion_svn_setuptools function was removed (T270393)
+* Add support for niawiki, bclwikt, diqwikt, niawikt (:phab:`T270416`, :phab:`T270282`, :phab:`T270278`, :phab:`T270412`)
+* Delete page using pageid instead of title (:phab:`T57072`)
+* version.getversion_svn_setuptools function was removed (:phab:`T270393`)
* Add support for "musical notation" data type to wikibase
-* -grepnot filter option was added to pagegenerators module (T219281)
+* -grepnot filter option was added to pagegenerators module (:phab:`T219281`)
* L10N updates
@@ -425,13 +425,13 @@
-----
*2 January 2021*
-* Re-enable reading user-config.py from site package (T270941)
+* Re-enable reading user-config.py from site package (:phab:`T270941`)
* LoginManager.getCookie() was renamed to login_to_site()
-* Deprecation warning for MediaWiki < 1.23 (T268979)
+* Deprecation warning for MediaWiki < 1.23 (:phab:`T268979`)
* Add backports to support some Python 3.9 changes
-* Desupported shared_image_repository() and nocapitalize() methods were removed (T89451)
+* Desupported shared_image_repository() and nocapitalize() methods were removed (:phab:`T89451`)
* pywikibot.cookie_jar was removed in favour of pywikibot.comms.http.cookie_jar
-* Align http.fetch() params with requests and rename 'disable_ssl_certificate_validation' to 'verify' (T265206)
+* Align http.fetch() params with requests and rename 'disable_ssl_certificate_validation' to 'verify' (:phab:`T265206`)
* Deprecated compat BasePage.getRestrictions() method was removed
* Outdated Site.recentchanges() parameters have been dropped
* site.LoginStatus has been removed in favour of login.LoginStatus
@@ -442,20 +442,20 @@
-----
*19 December 2020*
-* Allow using pywikibot as site-package without user-config.py (T270474)
+* Allow using pywikibot as site-package without user-config.py (:phab:`T270474`)
* Python 3.10 is supported
-* Fix AutoFamily scriptpath() call (T270370)
-* Add support for skrwiki, skrwiktionary, eowikivoyage, wawikisource, madwiki (T268414, T268460, T269429, T269434, T269442)
+* Fix AutoFamily scriptpath() call (:phab:`T270370`)
+* Add support for skrwiki, skrwiktionary, eowikivoyage, wawikisource, madwiki (:phab:`T268414`, :phab:`T268460`, :phab:`T269429`, :phab:`T269434`, :phab:`T269442`)
* wikistats methods fetch, raw_cached, csv, xml have been removed
* PageRelatedError.getPage() has been removed in favour of PageRelatedError.page
* DataSite.get_item() method has been removed
-* global put_throttle option may be given as float (T269741)
+* global put_throttle option may be given as float (:phab:`T269741`)
* Property.getType() method has been removed
-* Family.server_time() method was removed; it is still available from Site object (T89451)
-* All HttpRequest parameters except charset have been dropped (T265206)
-* A lot of methods and properties of HttpRequest are deprecated in favour of requests.Response attributes (T265206)
-* Methods and properties of HttpRequest are delegated to the requests.Response object (T265206)
-* comms.threadedhttp.HttpRequest.raw was replaced by HttpRequest.content property (T265206)
+* Family.server_time() method was removed; it is still available from Site object (:phab:`T89451`)
+* All HttpRequest parameters except charset have been dropped (:phab:`T265206`)
+* A lot of methods and properties of HttpRequest are deprecated in favour of requests.Response attributes (:phab:`T265206`)
+* Methods and properties of HttpRequest are delegated to the requests.Response object (:phab:`T265206`)
+* comms.threadedhttp.HttpRequest.raw was replaced by HttpRequest.content property (:phab:`T265206`)
* Desupported version.getfileversion() has been removed
* site parameter of comms.http.requests() function is mandatory and cannot be omitted
* date.MakeParameter() function has been removed
@@ -467,40 +467,40 @@
-----
*10 December 2020*
-* Remove deprecated args for Page.protect() (T227610)
+* Remove deprecated args for Page.protect() (:phab:`T227610`)
* Move BaseSite its own site/_basesite.py file
* Improve toJSON() methods in page.__init__.py
-* _is_wikibase_error_retryable rewritten (T48535, 268645)
+* _is_wikibase_error_retryable rewritten (:phab:`T48535`, :phab:`T268645`)
* Replace FrozenDict with frozenmap
* WikiStats table may be sorted by any key
* Retrieve month names from mediawiki_messages when required
* Move Namespace and NamespacesDict to site/_namespace.py file
-* Fix TypeError in api.LoginManager (T268445)
+* Fix TypeError in api.LoginManager (:phab:`T268445`)
* Add repr() method to BaseDataDict and ClaimCollection
* Define availableOptions as deprecated property
-* Do not strip all whitespaces from Link.title (T197642)
+* Do not strip all whitespaces from Link.title (:phab:`T197642`)
* Introduce a common BaseDataDict as parent for LanguageDict and AliasesDict
-* Replaced PageNotSaved by PageSaveRelatedError (T267821)
+* Replaced PageNotSaved by PageSaveRelatedError (:phab:`T267821`)
* Add -site option as -family -lang shortcut
-* Enable APISite.exturlusage() with default parameters (T266989)
+* Enable APISite.exturlusage() with default parameters (:phab:`T266989`)
* Update tools._unidata._category_cf from Unicode version 13.0.0
* Move TokenWallet to site/_tokenwallet.py file
-* Fix import of httplib after release of requests 2.25 (T267762)
-* user keyword parameter can be passed to Site.rollbackpage() (T106646)
-* Check for {{bots}}/{{nobots}} templates in Page.text setter (T262136, T267770)
+* Fix import of httplib after release of requests 2.25 (:phab:`T267762`)
+* user keyword parameter can be passed to Site.rollbackpage() (:phab:`T106646`)
+* Check for {{bots}}/{{nobots}} templates in Page.text setter (:phab:`T262136`, :phab:`T267770`)
* Remove deprecated UserBlocked exception and Page.contributingUsers()
* Add support for some 'wbset' actions in DataSite
-* Fix UploadRobot site attribute (T267573)
-* Ignore UnicodeDecodeError on input (T258143)
-* Replace 'source' exception regex with 'syntaxhighlight' (T257899)
-* Fix get_known_families() for wikipedia_family (T267196)
+* Fix UploadRobot site attribute (:phab:`T267573`)
+* Ignore UnicodeDecodeError on input (:phab:`T258143`)
+* Replace 'source' exception regex with 'syntaxhighlight' (:phab:`T257899`)
+* Fix get_known_families() for wikipedia_family (:phab:`T267196`)
* Move _InterwikiMap class to site/_interwikimap.py
* instantiate a CosmeticChangesToolkit by passing a page
* Create a Site from sitename
* pywikibot.Site() parameters "interface" and "url" must be keyworded
-* Lookup the code parameter in xdict first (T255917)
-* Remove interwiki_forwarded_from list from family files (T104125)
-* Rewrite Revision class; each data can be accessed either by key or as an attribute (T102735, T259428)
+* Lookup the code parameter in xdict first (:phab:`T255917`)
+* Remove interwiki_forwarded_from list from family files (:phab:`T104125`)
+* Rewrite Revision class; each data can be accessed either by key or as an attribute (:phab:`T102735`, :phab:`T259428`)
* L10N-Updates
@@ -509,23 +509,23 @@
*1 November 2020*
-* Avoid conflicts between site and possible site keyword in api.Request.create_simple() (T262926)
+* Avoid conflicts between site and possible site keyword in api.Request.create_simple() (:phab:`T262926`)
* Remove wrong param of revisions() call in Page.latest_revision_id
-* Do not raise Exception in Page.get_best_claim() but follow redirect (T265839)
+* Do not raise Exception in Page.get_best_claim() but follow redirect (:phab:`T265839`)
* xml-support of wikistats will be dropped
* Remove deprecated mime_params in api.Request()
* cleanup interwiki_graph.py and replace deprecated originPage by origin in Subjects
-* Upload a file that ends with the '\r' byte (T132676)
-* Fix incorrect server time (T266084)
+* Upload a file that ends with the '\r' byte (:phab:`T132676`)
+* Fix incorrect server time (:phab:`T266084`)
* L10N-Updates
-* Support Namespace packages in version.py (T265946)
-* Server414Error was added to pywikibot (T266000)
+* Support Namespace packages in version.py (:phab:`T265946`)
+* Server414Error was added to pywikibot (:phab:`T266000`)
* Deprecated editor.command() method was removed
* comms.PywikibotCookieJar and comms.mode_check_decorator were deleted
* Remove deprecated tools classes Stringtypes and UnicodeType
* Remove deprecated tools functions open_compressed and signature, and the UnicodeType class
-* Fix http_tests.LiveFakeUserAgentTestCase (T265842)
-* HttpRequest properties were renamed to request.Response identifiers (T265206)
+* Fix http_tests.LiveFakeUserAgentTestCase (:phab:`T265842`)
+* HttpRequest properties were renamed to request.Response identifiers (:phab:`T265206`)
5.0.0
@@ -533,47 +533,47 @@
*19 October 2020*
-* Add support for smn-wiki (T264962)
+* Add support for smn-wiki (:phab:`T264962`)
* callback parameter of comms.http.fetch() is desupported
* Fix api.APIError() calls for Flow and Thanks extension
-* edit, move, create, upload, unprotect and prompt parameters of Page.protect() are deprecated (T227610)
-* Accept only valid names in generate_family_file.py (T265328, T265353)
+* edit, move, create, upload, unprotect and prompt parameters of Page.protect() are deprecated (:phab:`T227610`)
+* Accept only valid names in generate_family_file.py (:phab:`T265328`, :phab:`T265353`)
* New plural.plural_rule() function returns a rule for a given language
-* Replace deprecated urllib.request.URLopener with http.fetch (T255575)
-* OptionHandler/BaseBot options are accessible as OptionHandler.opt attributes or keyword items (see also T264721)
+* Replace deprecated urllib.request.URLopener with http.fetch (:phab:`T255575`)
+* OptionHandler/BaseBot options are accessible as OptionHandler.opt attributes or keyword items (see also :phab:`T264721`)
* pywikibot.setAction() function was removed
* A namedtuple is the result of textlib.extract_sections()
-* Prevent circular imports in config2.py and http.py (T264500)
+* Prevent circular imports in config2.py and http.py (:phab:`T264500`)
* version.get_module_version() is deprecated and gives no meaningful result
-* Fix version.get_module_filename() and update log lines (T264235)
-* Re-enable printing log header (T264235)
-* Fix result of tools.intersect_generators() (T263947)
+* Fix version.get_module_filename() and update log lines (:phab:`T264235`)
+* Re-enable printing log header (:phab:`T264235`)
+* Fix result of tools.intersect_generators() (:phab:`T263947`)
* Only show _GLOBAL_HELP options if explicitly wanted
* Deprecated Family.version() methods were removed
* Unused parameters of page methods like forceReload, insite, throttle, step were removed
-* Raise RuntimeError instead of AttributeError for old wikis (T263951)
+* Raise RuntimeError instead of AttributeError for old wikis (:phab:`T263951`)
* Deprecated script options were removed
-* lyricwiki_family was removed (T245439)
+* lyricwiki_family was removed (:phab:`T245439`)
* RecentChangesPageGenerator parameters have been synced with APISite.recentchanges
* APISite.recentchanges accepts keyword parameters only
* LoginStatus enum class was moved from site to login.py
* WbRepresentation derives from abstract base class abc.ABC
* Update characters in the Cf category to Unicode version 12.1.0
-* Update __all__ variable in pywikibot (T122879)
-* Use api.APIGenerator through site._generator (T129013)
-* Support of MediaWiki releases below 1.19 has been dropped (T245350)
-* Page.get_best_claim() retrieves the preferred Claim of a property referring to the given page (T175207)
-* Check whether _putthead is current_thread() to join() (T263331)
+* Update __all__ variable in pywikibot (:phab:`T122879`)
+* Use api.APIGenerator through site._generator (:phab:`T129013`)
+* Support of MediaWiki releases below 1.19 has been dropped (:phab:`T245350`)
+* Page.get_best_claim() retrieves the preferred Claim of a property referring to the given page (:phab:`T175207`)
+* Check whether _putthead is current_thread() to join() (:phab:`T263331`)
* Add BasePage.has_deleted_revisions() method
* Allow querying deleted revs without the deletedhistory right
-* Use ignore_discard for login cookie container (T261066)
-* Siteinfo.get() loads data via API instead from cache if expiry parameter is True (T260490)
-* Move latest revision id handling to WikibaseEntity (T233406)
-* Load wikibase entities when necessary (T245809)
-* Fix path for stable release in version.getversion() (T262558)
+* Use ignore_discard for login cookie container (:phab:`T261066`)
+* Siteinfo.get() loads data via API instead from cache if expiry parameter is True (:phab:`T260490`)
+* Move latest revision id handling to WikibaseEntity (:phab:`T233406`)
+* Load wikibase entities when necessary (:phab:`T245809`)
+* Fix path for stable release in version.getversion() (:phab:`T262558`)
* "since" parameter in EventStreams given as Timestamp or MediaWiki timestamp string has been fixed
* Methods deprecated for 6 years or longer were removed
-* Page.getVersionHistory and Page.fullVersionHistory() methods were removed (T136513, T151110)
+* Page.getVersionHistory and Page.fullVersionHistory() methods were removed (:phab:`T136513`, :phab:`T151110`)
* Allow multiple types for the contributors parameter of Page.revision_count()
* Deprecated tools.UnicodeMixin and tools.IteratorNextMixin have been removed
* Localisation updates
@@ -584,9 +584,9 @@
*2 September 2020*
-* Don't check for valid Family/Site if running generate_user_files.py (T261771)
-* Remove socket_timeout fix in config2.py introduced with T103069
-* Prevent huge traceback from underlying python libraries (T253236)
+* Don't check for valid Family/Site if running generate_user_files.py (:phab:`T261771`)
+* Remove socket_timeout fix in config2.py introduced with :phab:`T103069`
+* Prevent huge traceback from underlying python libraries (:phab:`T253236`)
* Localisation updates
@@ -595,14 +595,14 @@
*28 August 2020*
-* Add support for ja.wikivoyage (T261450)
-* Only run cosmetic changes on wikitext pages (T260489)
-* Leave a script gracefully for wrong -lang and -family option (T259756)
-* Change meaning of BasePage.text (T260472)
+* Add support for ja.wikivoyage (:phab:`T261450`)
+* Only run cosmetic changes on wikitext pages (:phab:`T260489`)
+* Leave a script gracefully for wrong -lang and -family option (:phab:`T259756`)
+* Change meaning of BasePage.text (:phab:`T260472`)
* site/family methods code2encodings() and code2encoding() have been removed in favour of encoding()/encodings() methods
* Site.getExpandedString() method was removed in favour of expand_text
* Site.Family() function was removed in favour of Family.load() method
-* Add wikispore family (T260049)
+* Add wikispore family (:phab:`T260049`)
4.1.1
@@ -620,10 +620,10 @@
*16 August 2020*
* Enable Pywikibot for Python 3.9
-* APISite.loadpageinfo does not discard changes to page content when information was not loaded (T260472)
+* APISite.loadpageinfo does not discard changes to page content when information was not loaded (:phab:`T260472`)
* tools.UnicodeType and tools.signature are deprecated
* BaseBot.stop() method is deprecated in favour of BaseBot.generator.close()
-* Escape bot password correctly (T259488)
+* Escape bot password correctly (:phab:`T259488`)
* Bugfixes and improvements
* Localisation updates
@@ -633,14 +633,14 @@
*4 August 2020*
-* Read correct object in SiteLinkCollection.normalizeData (T259426)
+* Read correct object in SiteLinkCollection.normalizeData (:phab:`T259426`)
* tools.count and tools classes Counter, OrderedDict and ContextManagerWrapper were removed
* Deprecate UnicodeMixin and IteratorNextMixin
* Restrict site module interface
* EventStreams "since" parameter settings have been fixed
* Unsupported debug and uploadByUrl parameters of UploadRobot were removed
* Unported compat decode parameter of Page.title() has been removed
-* Wikihow family file was added (T249814)
+* Wikihow family file was added (:phab:`T249814`)
* Improve performance of CosmeticChangesToolkit.translateMagicWords
* Prohibit positional arguments with Page.title()
* Functions dealing with stars list were removed
@@ -648,27 +648,26 @@
* LogEntry became a UserDict; all content can be accessed by its key
* URLs for new toolforge.org domain were updated
* pywikibot.__release__ was deprecated
-* Use one central point for framework version (T106121, T171886, T197936, T253719)
-* rvtoken parameter of Site.loadrevisions() and Page.revisions() has been dropped (T74763)
+* Use one central point for framework version (:phab:`T106121`, :phab:`T171886`, :phab:`T197936`, :phab:`T253719`)
+* rvtoken parameter of Site.loadrevisions() and Page.revisions() has been dropped (:phab:`T74763`)
* getFilesFromAnHash and getImagesFromAnHash Site methods have been removed
* Site and Page methods deprecated for 10 years or longer have been removed
-* Support for Python 2 and 3.4 has been dropped (T213287, T239542)
+* Support for Python 2 and 3.4 has been dropped (:phab:`T213287`, :phab:`T239542`)
* Bugfixes and improvements
* Localisation updates
-.. _python2:
3.0.20200703
------------
-* Page.botMayEdit() method was improved (T253709)
-* PageNotFound, SpamfilterError, UserActionRefuse exceptions were removed (T253681)
-* tools.ip submodule has been removed (T243171)
+* Page.botMayEdit() method was improved (:phab:`T253709`)
+* PageNotFound, SpamfilterError, UserActionRefuse exceptions were removed (:phab:`T253681`)
+* tools.ip submodule has been removed (:phab:`T243171`)
* Wait in BaseBot.exit() until asynchronous page saving is completed
-* Solve IndexError when showing an empty diff with a non-zero context (T252724)
+* Solve IndexError when showing an empty diff with a non-zero context (:phab:`T252724`)
* linktrails were added or updated for a lot of sites
-* Resolve namespaces with underlines (T252940)
-* Fix getversion_svn for Python 3.6+ (T253617, T132292)
+* Resolve namespaces with underlines (:phab:`T252940`)
+* Fix getversion_svn for Python 3.6+ (:phab:`T253617`, :phab:`T132292`)
* Bugfixes and improvements
* Localisation updates
@@ -676,14 +675,14 @@
3.0.20200609
------------
-* Fix page_can_be_edited for MediaWiki < 1.23 (T254623)
+* Fix page_can_be_edited for MediaWiki < 1.23 (:phab:`T254623`)
* Show global options with pwb.py -help
* Usage of SkipPageError with BaseBot has been removed
-* Throttle requests after ratelimits exceeded (T253180)
-* Make Pywikibot daemon logs unexecutable (T253472)
+* Throttle requests after ratelimits exceeded (:phab:`T253180`)
+* Make Pywikibot daemon logs unexecutable (:phab:`T253472`)
* Check for missing generator after BaseBot.setup() call
-* Do not change usernames when creating a Site (T253127)
-* pagegenerators: handle protocols in -weblink (T251308, T251310)
+* Do not change usernames when creating a Site (:phab:`T253127`)
+* pagegenerators: handle protocols in -weblink (:phab:`T251308`, :phab:`T251310`)
* Bugfixes and improvements
* Localisation updates
@@ -691,183 +690,183 @@
3.0.20200508
------------
-* Unify and extend formats for setting sitelinks (T225863, T251512)
-* Do not return a random i18n.translation() result (T220099)
-* tools.ip_regexp has been removed (T174482)
-* Page.getVersionHistory and Page.fullVersionHistory() methods have been desupported (T136513, T151110)
-* Update wikimediachapter_family (T250802)
-* Raise SpamblacklistError with spamblacklist APIError (T249436)
-* SpamfilterError was renamed to SpamblacklistError (T249436)
-* Do not removeUselessSpaces inside source/syntaxhighlight tags (T250469)
-* Restrict Pillow to 6.2.2+ (T249911)
-* Fix PetScan generator language and project (T249704)
-* test_family has been removed (T228375, T228300)
+* Unify and extend formats for setting sitelinks (:phab:`T225863`, :phab:`T251512`)
+* Do not return a random i18n.translation() result (:phab:`T220099`)
+* tools.ip_regexp has been removed (:phab:`T174482`)
+* Page.getVersionHistory and Page.fullVersionHistory() methods have been desupported (:phab:`T136513`, :phab:`T151110`)
+* Update wikimediachapter_family (:phab:`T250802`)
+* Raise SpamblacklistError with spamblacklist APIError (:phab:`T249436`)
+* SpamfilterError was renamed to SpamblacklistError (:phab:`T249436`)
+* Do not removeUselessSpaces inside source/syntaxhighlight tags (:phab:`T250469`)
+* Restrict Pillow to 6.2.2+ (:phab:`T249911`)
+* Fix PetScan generator language and project (:phab:`T249704`)
+* test_family has been removed (:phab:`T228375`, :phab:`T228300`)
* Bugfixes and improvements
* Localisation updates
3.0.20200405
------------
-* Fix regression of combining sys.path in pwb.py wrapper (T249427)
-* Site and Page methods deprecated for 10 years or longer are desupported and may be removed (T106121)
+* Fix regression of combining sys.path in pwb.py wrapper (:phab:`T249427`)
+* Site and Page methods deprecated for 10 years or longer are desupported and may be removed (:phab:`T106121`)
* Usage of SkipPageError with BaseBot is desupported and may be removed
-* Ignore InvalidTitle in textlib.replace_links() (T122091)
+* Ignore InvalidTitle in textlib.replace_links() (:phab:`T122091`)
* Raise ServerError also if the connection to PetScan times out
-* pagegenerators.py no longer supports 'oursql' or 'MySQLdb'. It now solely supports PyMySQL (T243154, T89976)
+* pagegenerators.py no longer supports 'oursql' or 'MySQLdb'. It now solely supports PyMySQL (:phab:`T243154`, :phab:`T89976`)
* Dysfunctional Family.versionnumber() method was removed
-* Refactor login functionality (T137805, T224712, T248767, T248768, T248945)
+* Refactor login functionality (:phab:`T137805`, :phab:`T224712`, :phab:`T248767`, :phab:`T248768`, :phab:`T248945`)
* Bugfixes and improvements
* Localisation updates
3.0.20200326
------------
* site.py and page.py files were moved to their own folders and will be split in the future
-* Refactor data attributes of Wikibase entities (T233406)
+* Refactor data attributes of Wikibase entities (:phab:`T233406`)
* Functions dealing with stars list are desupported and may be removed
-* Use path's stem of script filename within pwb.py wrapper (T248372)
-* Dysfunctional cgi_interface.py was removed (T248292, T248250, T193978)
-* Fix logout on MW < 1.24 (T214009)
-* Fixed TypeError in getFileVersionHistoryTable method (T248266)
-* Outdated secure connection overrides were removed (T247668)
+* Use path's stem of script filename within pwb.py wrapper (:phab:`T248372`)
+* Dysfunctional cgi_interface.py was removed (:phab:`T248292`, :phab:`T248250`, :phab:`T193978`)
+* Fix logout on MW < 1.24 (:phab:`T214009`)
+* Fixed TypeError in getFileVersionHistoryTable method (:phab:`T248266`)
+* Outdated secure connection overrides were removed (:phab:`T247668`)
* Check for all modules which are needed by a script within pwb.py wrapper
* Check for all modules which are mandatory within pwb.py wrapper script
-* Enable -help option with similar search of pwb.py (T241217)
-* compat module has been removed (T183085)
+* Enable -help option with similar search of pwb.py (:phab:`T241217`)
+* compat module has been removed (:phab:`T183085`)
* Category.copyTo and Category.copyAndKeep methods have been removed
-* Site.page_restrictions() no longer raises NoPage (T214286)
-* Use site.userinfo getter instead of site._userinfo within api (T243794)
-* Fix endprefix parameter in Category.articles() (T247201)
-* Fix search for changed claims when saving entity (T246359)
-* backports.py has been removed (T244664)
-* Site.has_api method has been removed (T106121)
+* Site.page_restrictions() no longer raises NoPage (:phab:`T214286`)
+* Use site.userinfo getter instead of site._userinfo within api (:phab:`T243794`)
+* Fix endprefix parameter in Category.articles() (:phab:`T247201`)
+* Fix search for changed claims when saving entity (:phab:`T246359`)
+* backports.py has been removed (:phab:`T244664`)
+* Site.has_api method has been removed (:phab:`T106121`)
* Bugfixes and improvements
* Localisation updates
3.0.20200306
------------
-* Fix mul Wikisource aliases (T242537, T241413)
-* Let Site('test', 'test') be equal to Site('test', 'wikipedia') (T228839)
-* Support of MediaWiki releases below 1.19 will be dropped (T245350)
+* Fix mul Wikisource aliases (:phab:`T242537`, :phab:`T241413`)
+* Let Site('test', 'test') be equal to Site('test', 'wikipedia') (:phab:`T228839`)
+* Support of MediaWiki releases below 1.19 will be dropped (:phab:`T245350`)
* Provide mediawiki_messages for foreign language codes
-* Use mw API IP/anon user detection (T245318)
-* Correctly choose primary coordinates in BasePage.coordinates() (T244963)
-* Rewrite APISite.page_can_be_edited (T244604)
-* compat module is deprecated for 5 years and will be removed in next release (T183085)
-* ipaddress module is required for Python 2 (T243171)
-* tools.ip will be dropped in favour of tools.is_IP (T243171)
+* Use mw API IP/anon user detection (:phab:`T245318`)
+* Correctly choose primary coordinates in BasePage.coordinates() (:phab:`T244963`)
+* Rewrite APISite.page_can_be_edited (:phab:`T244604`)
+* compat module is deprecated for 5 years and will be removed in next release (:phab:`T183085`)
+* ipaddress module is required for Python 2 (:phab:`T243171`)
+* tools.ip will be dropped in favour of tools.is_IP (:phab:`T243171`)
* tools.ip_regexp is deprecated for 5 years and will be removed in next release
-* backports.py will be removed in next release (T244664)
-* stdnum package is required for ISBN scripts and cosmetic_changes (T132919, T144288, T241141)
-* preload urllib.quote() with Python 2 (T243710, T222623)
-* Drop isbn_hyphenate package due to outdated data (T243157)
-* Fix UnboundLocalError in ProofreadPage._ocr_callback (T243644)
+* backports.py will be removed in next release (:phab:`T244664`)
+* stdnum package is required for ISBN scripts and cosmetic_changes (:phab:`T132919`, :phab:`T144288`, :phab:`T241141`)
+* preload urllib.quote() with Python 2 (:phab:`T243710`, :phab:`T222623`)
+* Drop isbn_hyphenate package due to outdated data (:phab:`T243157`)
+* Fix UnboundLocalError in ProofreadPage._ocr_callback (:phab:`T243644`)
* Deprecate/remove sysop parameter in several methods and functions
-* Refactor Wikibase entity namespace handling (T160395)
+* Refactor Wikibase entity namespace handling (:phab:`T160395`)
* Site.has_api method will be removed in next release
* Category.copyTo and Category.copyAndKeep will be removed in next release
-* weblib module has been removed (T85001)
-* botirc module has been removed (T212632)
+* weblib module has been removed (:phab:`T85001`)
+* botirc module has been removed (:phab:`T212632`)
* Bugfixes and improvements
* Localisation updates
3.0.20200111
------------
-* Fix broken get_version() in setup.py (T198374)
+* Fix broken get_version() in setup.py (:phab:`T198374`)
* Rewrite site.log_page/site.unlock_page implementation
-* Require requests 2.20.1 (T241934)
+* Require requests 2.20.1 (:phab:`T241934`)
* Make bot.suggest_help a function
-* Fix gui settings for Python 3.7.4+ (T241216)
-* Better api error message handling (T235500)
-* Ensure that required props exists as Page attribute (T237497)
-* Refactor data loading for WikibaseEntities (T233406)
-* replaceCategoryInPlace: Allow LRM and RLM at the end of the old_cat title (T240084)
-* Support for Python 3.4 will be dropped (T239542)
-* Derive LoginStatus from IntEnum (T213287, T239533)
-* enum34 package is mandatory for Python 2.7 (T213287)
-* call LoginManager with keyword arguments (T237501)
-* Enable Pywikibot for Python 3.8 (T238637)
-* Derive BaseLink from tools.UnicodeMixin (T223894)
-* Make _flush aware of _putthread ongoing tasks (T147178)
-* Add family file for foundation wiki (T237888)
-* Fix generate_family_file.py for private wikis (T235768)
+* Fix gui settings for Python 3.7.4+ (:phab:`T241216`)
+* Better api error message handling (:phab:`T235500`)
+* Ensure that required props exists as Page attribute (:phab:`T237497`)
+* Refactor data loading for WikibaseEntities (:phab:`T233406`)
+* replaceCategoryInPlace: Allow LRM and RLM at the end of the old_cat title (:phab:`T240084`)
+* Support for Python 3.4 will be dropped (:phab:`T239542`)
+* Derive LoginStatus from IntEnum (:phab:`T213287`, :phab:`T239533`)
+* enum34 package is mandatory for Python 2.7 (:phab:`T213287`)
+* call LoginManager with keyword arguments (:phab:`T237501`)
+* Enable Pywikibot for Python 3.8 (:phab:`T238637`)
+* Derive BaseLink from tools.UnicodeMixin (:phab:`T223894`)
+* Make _flush aware of _putthread ongoing tasks (:phab:`T147178`)
+* Add family file for foundation wiki (:phab:`T237888`)
+* Fix generate_family_file.py for private wikis (:phab:`T235768`)
* Add rank parameter to Claim initializer
-* Add current directory for similar script search (T217195)
+* Add current directory for similar script search (:phab:`T217195`)
* Release BaseSite.lock_page mutex during sleep
-* Implement deletedrevisions api call (T75370)
-* assert_valid_iter_params may raise AssertionError instead of pywikibot.Error (T233582)
-* Upcast getRedirectTarget result and return the appropriate page subclass (T233392)
-* Add ListGenerator for API:filearchive to site module (T230196)
-* Deprecate the ability to login with a secondary sysop account (T71283)
-* Enable global args with pwb.py wrapper script (T216825)
-* Add a new ConfigParserBot class to set options from the scripts.ini file (T223778)
-* Check a user's rights rather than group memberships; 'sysopnames' will be deprecated (T229293, T189126, T122705, T119335, T75545)
-* proofreadpage.py: fix footer detection (T230301)
-* Add allowusertalk to the User.block() options (T229288)
-* botirc module will be removed in next release (T212632)
-* weblib module will be removed in next release (T85001)
+* Implement deletedrevisions api call (:phab:`T75370`)
+* assert_valid_iter_params may raise AssertionError instead of pywikibot.Error (:phab:`T233582`)
+* Upcast getRedirectTarget result and return the appropriate page subclass (:phab:`T233392`)
+* Add ListGenerator for API:filearchive to site module (:phab:`T230196`)
+* Deprecate the ability to login with a secondary sysop account (:phab:`T71283`)
+* Enable global args with pwb.py wrapper script (:phab:`T216825`)
+* Add a new ConfigParserBot class to set options from the scripts.ini file (:phab:`T223778`)
+* Check a user's rights rather than group memberships; 'sysopnames' will be deprecated (:phab:`T229293`, :phab:`T189126`, :phab:`T122705`, :phab:`T119335`, :phab:`T75545`)
+* proofreadpage.py: fix footer detection (:phab:`T230301`)
+* Add allowusertalk to the User.block() options (:phab:`T229288`)
+* botirc module will be removed in next release (:phab:`T212632`)
+* weblib module will be removed in next release (:phab:`T85001`)
* Bugfixes and improvements
* Localisation updates
3.0.20190722
------------
-* Increase the throttling delay if maxlag >> retry-after (T210606)
-* deprecate test_family: Site('test', 'test'), use wikipedia_family: Site('test', 'wikipedia') instead (T228375, T228300)
+* Increase the throttling delay if maxlag >> retry-after (:phab:`T210606`)
+* deprecate test_family: Site('test', 'test'), use wikipedia_family: Site('test', 'wikipedia') instead (:phab:`T228375`, :phab:`T228300`)
* Add "user_agent_description" option in config.py
-* APISite.fromDBName works for all known dbnames (T225590, 225723, 226960)
+* APISite.fromDBName works for all known dbnames (:phab:`T225590`, :phab:`T225723`, :phab:`T226960`)
* remove the unimplemented "proxy" variable in config.py
-* Make Family.langs property more robust (T226934)
+* Make Family.langs property more robust (:phab:`T226934`)
* Remove strategy family
-* Handle closed_wikis as read-only (T74674)
+* Handle closed_wikis as read-only (:phab:`T74674`)
* TokenWallet: login automatically
-* Add closed_wikis to Family.langs property (T225413)
-* Redirect 'mo' site code to 'ro' and remove interwiki_replacement_overrides (T225417, T89451)
-* Add support for badges on Wikibase item sitelinks through a SiteLink object instead plain str (T128202)
+* Add closed_wikis to Family.langs property (:phab:`T225413`)
+* Redirect 'mo' site code to 'ro' and remove interwiki_replacement_overrides (:phab:`T225417`, :phab:`T89451`)
+* Add support for badges on Wikibase item sitelinks through a SiteLink object instead plain str (:phab:`T128202`)
* Remove login.showCaptchaWindow() method
* New parameter supplied in suggest_help function for missing dependencies
* Remove NonMWAPISite class
-* Introduce Claim.copy and prevent adding already saved claims (T220131)
-* Fix create_short_link method after MediaWiki changes (T223865)
+* Introduce Claim.copy and prevent adding already saved claims (:phab:`T220131`)
+* Fix create_short_link method after MediaWiki changes (:phab:`T223865`)
* Validate proofreadpage.IndexPage contents before saving it
-* Refactor Link and introduce BaseLink (T66457)
+* Refactor Link and introduce BaseLink (:phab:`T66457`)
* Count skipped pages in BaseBot class
-* 'actionthrottledtext' is a retryable wikibase error (T192912)
-* Clear tokens on logout(T222508)
-* Deprecation warning: support for Python 2 will be dropped (T213287)
+* 'actionthrottledtext' is a retryable wikibase error (:phab:`T192912`)
+* Clear tokens on logout (:phab:`T222508`)
+* Deprecation warning: support for Python 2 will be dropped (:phab:`T213287`)
* botirc.IRCBot has been dropped
-* Avoid using outdated browseragents (T222959)
-* textlib: avoid infinite execution of regex (T222671)
-* Add CSRF token in sitelogout() api call (T222508)
+* Avoid using outdated browseragents (:phab:`T222959`)
+* textlib: avoid infinite execution of regex (:phab:`T222671`)
+* Add CSRF token in sitelogout() api call (:phab:`T222508`)
* Refactor WikibasePage.get and overriding methods and improve documentation
* Improve title patterns of WikibasePage extensions
-* Add support for property creation (T160402)
+* Add support for property creation (:phab:`T160402`)
* Bugfixes and improvements
* Localisation updates
3.0.20190430
------------
-* Unicode literals are required for all scripts; the usage of ASCII bytes may fail (T219095)
-* Don't fail if the number of forms of a plural string is less than required (T99057, T219097)
-* Implement create_short_link Page method to use Extension:UrlShortener (T220876)
-* Remove wikia family file (T220921)
+* Unicode literals are required for all scripts; the usage of ASCII bytes may fail (:phab:`T219095`)
+* Don't fail if the number of forms of a plural string is less than required (:phab:`T99057`, :phab:`T219097`)
+* Implement create_short_link Page method to use Extension:UrlShortener (:phab:`T220876`)
+* Remove wikia family file (:phab:`T220921`)
* Remove deprecated ez_setup.py
-* Changed requirements for sseclient (T219024)
-* Set optional parameter namespace to None in site.logpages (T217664)
-* Add ability to display similar scripts when misspelled (T217195)
-* Check if QueryGenerator supports namespaces (T198452)
+* Changed requirements for sseclient (:phab:`T219024`)
+* Set optional parameter namespace to None in site.logpages (:phab:`T217664`)
+* Add ability to display similar scripts when misspelled (:phab:`T217195`)
+* Check if QueryGenerator supports namespaces (:phab:`T198452`)
* Bugfixes and improvements
* Localisation updates
3.0.20190301
------------
-* Fix version comparison (T164163)
+* Fix version comparison (:phab:`T164163`)
* Remove pre MediaWiki 1.14 code
-* Dropped support for Python 2.7.2 and 2.7.3 (T191192)
-* Fix header regex beginning with a comment (T209712)
-* Implement Claim.__eq__ (T76615)
+* Dropped support for Python 2.7.2 and 2.7.3 (:phab:`T191192`)
+* Fix header regex beginning with a comment (:phab:`T209712`)
+* Implement Claim.__eq__ (:phab:`T76615`)
* cleanup config2.py
* Add missing Wikibase API write actions
* Bugfixes and improvements
@@ -877,7 +876,7 @@
------------
* Support python version 3.7
-* pagegenerators.py: add -querypage parameter to yield pages provided by any special page (T214234)
+* pagegenerators.py: add -querypage parameter to yield pages provided by any special page (:phab:`T214234`)
* Fix comparison of str, bytes and int literal
* site.py: add generic self.querypage() to query SpecialPages
* echo.Notification has a new event_id property as integer
@@ -887,11 +886,11 @@
3.0.20190106
------------
-* Ensure "modules" parameter of ParamInfo._fetch is a set (T122763)
-* Support adding new claims with qualifiers and/or references (T112577, T170432)
+* Ensure "modules" parameter of ParamInfo._fetch is a set (:phab:`T122763`)
+* Support adding new claims with qualifiers and/or references (:phab:`T112577`, :phab:`T170432`)
* Support LZMA and XZ compression formats
-* Update correct-ar Typo corrections in fixes.py (T211492)
-* Enable MediaWiki timestamp with EventStreams (T212133)
+* Update correct-ar Typo corrections in fixes.py (:phab:`T211492`)
+* Enable MediaWiki timestamp with EventStreams (:phab:`T212133`)
* Convert Timestamp.fromtimestampformat() if year, month and day are given only
* tools.concat_options is deprecated
* Additional ListOption subclasses ShowingListOption, MultipleChoiceList, ShowingMultipleChoiceList
@@ -901,43 +900,43 @@
3.0.20181203
------------
-* Remove compat module references from autogenerated docs (T183085)
-* site.preloadpages: split pagelist in most max_ids elements (T209111)
+* Remove compat module references from autogenerated docs (:phab:`T183085`)
+* site.preloadpages: split pagelist in most max_ids elements (:phab:`T209111`)
* Disable empty sections in cosmetic_changes for user namespace
-* Prevent touch from re-creating pages (T193833)
-* New Page.title() parameter without_brackets; also used by titletranslate (T200399)
-* Security: require requests version 2.20.0 or later (T208296)
-* Check appropriate key in Site.messages (T163661)
-* Make sure the cookie file is created with the right permissions (T206387)
+* Prevent touch from re-creating pages (:phab:`T193833`)
+* New Page.title() parameter without_brackets; also used by titletranslate (:phab:`T200399`)
+* Security: require requests version 2.20.0 or later (:phab:`T208296`)
+* Check appropriate key in Site.messages (:phab:`T163661`)
+* Make sure the cookie file is created with the right permissions (:phab:`T206387`)
* pydot >= 1.2 is required for interwiki_graph
-* Move methods for simple claim adding/removing to WikibasePage (T113131)
-* Enable start timestamp for EventStreams (T205121)
-* Re-enable notifications (T205184)
-* Use FutureWarning for warnings intended for end users (T191192)
-* Provide new -wanted... page generators (T56557, T150222)
-* api.QueryGenerator: Handle slots during initialization (T200955, T205210)
+* Move methods for simple claim adding/removing to WikibasePage (:phab:`T113131`)
+* Enable start timestamp for EventStreams (:phab:`T205121`)
+* Re-enable notifications (:phab:`T205184`)
+* Use FutureWarning for warnings intended for end users (:phab:`T191192`)
+* Provide new -wanted... page generators (:phab:`T56557`, :phab:`T150222`)
+* api.QueryGenerator: Handle slots during initialization (:phab:`T200955`, :phab:`T205210`)
* Bugfixes and improvements
* Localisation updates
3.0.20180922
------------
-* Enable multiple streams for EventStreams (T205114)
-* Fix Wikibase aliases handling (T194512)
-* Remove cryptography support from python<=2.7.6 requirements (T203435)
-* textlib._tag_pattern: Do not mistake self-closing tags with start tag (T203568)
-* page.Link.langlinkUnsafe: Always set _namespace to a Namespace object (T203491)
+* Enable multiple streams for EventStreams (:phab:`T205114`)
+* Fix Wikibase aliases handling (:phab:`T194512`)
+* Remove cryptography support from python<=2.7.6 requirements (:phab:`T203435`)
+* textlib._tag_pattern: Do not mistake self-closing tags with start tag (:phab:`T203568`)
+* page.Link.langlinkUnsafe: Always set _namespace to a Namespace object (:phab:`T203491`)
* Enable Namespace.content for mw < 1.16
-* Allow terminating the bot generator by BaseBot.stop() method (T198801)
+* Allow terminating the bot generator by BaseBot.stop() method (:phab:`T198801`)
* Allow bot parameter in set_redirect_target
-* Do not show empty error messages (T203462)
-* Show the exception message in async mode (T203448)
-* Fix the extended user-config extraction regex (T145371)
-* Solve UnicodeDecodeError in site.getredirtarget (T126192)
+* Do not show empty error messages (:phab:`T203462`)
+* Show the exception message in async mode (:phab:`T203448`)
+* Fix the extended user-config extraction regex (:phab:`T145371`)
+* Solve UnicodeDecodeError in site.getredirtarget (:phab:`T126192`)
* Introduce a new APISite property: mw_version
* Improve hash method for BasePage and Link
-* Avoid applying two uniquifying filters (T199615)
-* Fix skipping of language links in CosmeticChangesToolkit.removeEmptySections (T202629)
+* Avoid applying two uniquifying filters (:phab:`T199615`)
+* Fix skipping of language links in CosmeticChangesToolkit.removeEmptySections (:phab:`T202629`)
* New mediawiki projects were provided
* Bugfixes and improvements
* Localisation updates
@@ -945,23 +944,23 @@
3.0.20180823
------------
-* Don't reset Bot._site to None if we have already a site object (T125046)
-* pywikibot.site.Siteinfo: Fix the bug in cache_time when loading a CachedRequest (T202227)
-* pagegenerators._handle_recentchanges: Do not request for reversed results (T199199)
-* Use a key for filter_unique where appropriate (T199615)
-* pywikibot.tools: Add exceptions for first_upper (T200357)
-* Fix usages of site.namespaces.NAMESPACE_NAME (T201969)
+* Don't reset Bot._site to None if we have already a site object (:phab:`T125046`)
+* pywikibot.site.Siteinfo: Fix the bug in cache_time when loading a CachedRequest (:phab:`T202227`)
+* pagegenerators._handle_recentchanges: Do not request for reversed results (:phab:`T199199`)
+* Use a key for filter_unique where appropriate (:phab:`T199615`)
+* pywikibot.tools: Add exceptions for first_upper (:phab:`T200357`)
+* Fix usages of site.namespaces.NAMESPACE_NAME (:phab:`T201969`)
* pywikibot/textlib.py: Fix header regex to allow comments
-* Use 'rvslots' when fetching revisions on MW 1.32+ (T200955)
-* Drop the '2' from PYWIKIBOT2_DIR, PYWIKIBOT2_DIR_PWB, and PYWIKIBOT2_NO_USER_CONFIG environment variables. The old names are now deprecated. The other PYWIKIBOT2_* variables which were used only for testing purposes have been renamed without deprecation. (T184674)
-* Introduce a timestamp in deprecated decorator (T106121)
-* textlib.extract_sections: Remove footer from the last section (T199751)
-* Don't let WikidataBot crash on save related errors (T199642)
-* Allow different projects to have different L10N entries (T198889)
-* remove color highlights before fill function (T196874)
-* Fix Portuguese file namespace translation in cc (T57242)
-* textlib._create_default_regexes: Avoid using inline flags (T195538)
-* Not everything after a language link is footer (T199539)
+* Use 'rvslots' when fetching revisions on MW 1.32+ (:phab:`T200955`)
+* Drop the '2' from PYWIKIBOT2_DIR, PYWIKIBOT2_DIR_PWB, and PYWIKIBOT2_NO_USER_CONFIG environment variables. The old names are now deprecated. The other PYWIKIBOT2_* variables which were used only for testing purposes have been renamed without deprecation. (:phab:`T184674`)
+* Introduce a timestamp in deprecated decorator (:phab:`T106121`)
+* textlib.extract_sections: Remove footer from the last section (:phab:`T199751`)
+* Don't let WikidataBot crash on save related errors (:phab:`T199642`)
+* Allow different projects to have different L10N entries (:phab:`T198889`)
+* remove color highlights before fill function (:phab:`T196874`)
+* Fix Portuguese file namespace translation in cc (:phab:`T57242`)
+* textlib._create_default_regexes: Avoid using inline flags (:phab:`T195538`)
+* Not everything after a language link is footer (:phab:`T199539`)
* code cleanups
* New mediawiki projects were provided
* Bugfixes and improvements
@@ -970,15 +969,15 @@
3.0.20180710
------------
-* Enable any LogEntry subclass for each logevent type (T199013)
-* Deprecated pagegenerators options -<logtype>log aren't supported any longer (T199013)
-* Open RotatingFileHandler with utf-8 encoding (T188231)
-* Fix occasional failure of TestLogentries due to hidden namespace (T197506)
-* Remove multiple empty sections at once in cosmetic_changes (T196324)
-* Fix stub template position by putting it above interwiki comment (T57034)
-* Fix handling of API continuation in PropertyGenerator (T196876)
-* Use PyMySql as pure-Python MySQL client library instead of oursql, deprecate MySQLdb (T89976, T142021)
-* Ensure that BaseBot.treat is always processing a Page object (T196562, T196813)
+* Enable any LogEntry subclass for each logevent type (:phab:`T199013`)
+* Deprecated pagegenerators options -<logtype>log aren't supported any longer (:phab:`T199013`)
+* Open RotatingFileHandler with utf-8 encoding (:phab:`T188231`)
+* Fix occasional failure of TestLogentries due to hidden namespace (:phab:`T197506`)
+* Remove multiple empty sections at once in cosmetic_changes (:phab:`T196324`)
+* Fix stub template position by putting it above interwiki comment (:phab:`T57034`)
+* Fix handling of API continuation in PropertyGenerator (:phab:`T196876`)
+* Use PyMySql as pure-Python MySQL client library instead of oursql, deprecate MySQLdb (:phab:`T89976`, :phab:`T142021`)
+* Ensure that BaseBot.treat is always processing a Page object (:phab:`T196562`, :phab:`T196813`)
* Update global bot settings
* New mediawiki projects were provided
* Bugfixes and improvements
@@ -994,26 +993,26 @@
* Family class is made a singleton class
* New rule 'startcolon' was introduced in textlib
* BaseBot has new methods setup and teardown
-* UploadBot got a filename prefix parameter (T170123)
-* cosmetic_changes is able to remove empty sections (T140570)
+* UploadBot got a filename prefix parameter (:phab:`T170123`)
+* cosmetic_changes is able to remove empty sections (:phab:`T140570`)
* Pywikibot is following PEP 396 versioning
* pagegenerators AllpagesPageGenerator, CombinedPageGenerator, UnconnectedPageGenerator are deprecated
* Some DayPageGenerator parameters have been renamed
-* unicodedata2, httpbin and Flask dependency was removed (T102461, T108068, T178864, T193383)
+* unicodedata2, httpbin and Flask dependencies were removed (:phab:`T102461`, :phab:`T108068`, :phab:`T178864`, :phab:`T193383`)
* New projects were provided
* Bugfixes and improvements
* Documentation updates
-* Localisation updates (T194893)
+* Localisation updates (:phab:`T194893`)
* Translation updates
3.0.20180505
------------
* Enable makepath and datafilepath not to create the directory
-* Use API's retry-after value (T144023)
-* Provide startprefix parameter for Category.articles() (T74101, T143120)
-* Page.put_async() is marked as deprecated (T193494)
-* Deprecate requests-requirements.txt (T193476)
+* Use API's retry-after value (:phab:`T144023`)
+* Provide startprefix parameter for Category.articles() (:phab:`T74101`, :phab:`T143120`)
+* Page.put_async() is marked as deprecated (:phab:`T193494`)
+* Deprecate requests-requirements.txt (:phab:`T193476`)
* Bugfixes and improvements
* New mediawiki projects were provided
* Localisation updates
@@ -1021,9 +1020,9 @@
3.0.20180403
------------
-* Deprecation warning: support for Python 2.7.2 and 2.7.3 will be dropped (T191192)
-* Dropped support for Python 2.6 (T154771)
-* Dropped support for Python 3.3 (T184508)
+* Deprecation warning: support for Python 2.7.2 and 2.7.3 will be dropped (:phab:`T191192`)
+* Dropped support for Python 2.6 (:phab:`T154771`)
+* Dropped support for Python 3.3 (:phab:`T184508`)
* Bugfixes and improvements
* Localisation updates
@@ -1087,52 +1086,52 @@
Bugfixes
~~~~~~~~
-* Manage temporary readonly error (T154011)
-* Unbreak wbGeoShape and WbTabularData (T166362)
-* Clean up issue with _WbDataPage (T166362)
-* Re-enable xml for WikiStats with py2 (T165830)
-* Solve httplib.IncompleteRead exception in eventstreams (T168535)
-* Only force input_choise if self.always is given (T161483)
-* Add colon when replacing category and file weblink (T127745)
-* API Request: set uiprop only when ensuring 'userinfo' in meta (T169202)
-* Fix TestLazyLoginNotExistUsername test for Stewardwiki (T169458)
+* Manage temporary readonly error (:phab:`T154011`)
+* Unbreak wbGeoShape and WbTabularData (:phab:`T166362`)
+* Clean up issue with _WbDataPage (:phab:`T166362`)
+* Re-enable xml for WikiStats with py2 (:phab:`T165830`)
+* Solve httplib.IncompleteRead exception in eventstreams (:phab:`T168535`)
+* Only force input_choise if self.always is given (:phab:`T161483`)
+* Add colon when replacing category and file weblink (:phab:`T127745`)
+* API Request: set uiprop only when ensuring 'userinfo' in meta (:phab:`T169202`)
+* Fix TestLazyLoginNotExistUsername test for Stewardwiki (:phab:`T169458`)
Improvements
~~~~~~~~~~~~
-* Introduce the new WbUnknown data type for Wikibase (T165961)
+* Introduce the new WbUnknown data type for Wikibase (:phab:`T165961`)
* djvu.py: add replace_page() and delete_page()
* Build GeoShape and TabularData from shared base class
-* Remove non-breaking spaces when tidying up a link (T130818)
+* Remove non-breaking spaces when tidying up a link (:phab:`T130818`)
* Replace private mylang variables with mycode in generate_user_files.py
* FilePage: remove deprecated use of fileUrl
-* Make socket_timeout recalculation reusable (T166539)
-* FilePage.download(): add revision parameter to download arbitrary revision (T166939)
-* Make pywikibot.Error more precise (T166982)
-* Implement pywikibot support for adding thanks to normal revisions (T135409)
-* Implement server side event client EventStreams (T158943)
+* Make socket_timeout recalculation reusable (:phab:`T166539`)
+* FilePage.download(): add revision parameter to download arbitrary revision (:phab:`T166939`)
+* Make pywikibot.Error more precise (:phab:`T166982`)
+* Implement pywikibot support for adding thanks to normal revisions (:phab:`T135409`)
+* Implement server side event client EventStreams (:phab:`T158943`)
* new pagegenerators filter option -titleregexnot
-* Add exception for -namepace option (T167580)
+* Add exception for -namepace option (:phab:`T167580`)
* InteractiveReplace: Allow no replacements by default
* Encode default globe in family file
-* Add on to pywikibot support for thanking normal revisions (T135409)
-* Add log entry code for thanks log (T135413)
+* Add on to pywikibot support for thanking normal revisions (:phab:`T135409`)
+* Add log entry code for thanks log (:phab:`T135413`)
* Create superclass for log entries with user targets
* Use relative reference to class attribute
-* Allow pywikibot to authenticate against a private wiki (T153903)
-* Make WbRepresentations hashable (T167827)
+* Allow pywikibot to authenticate against a private wiki (:phab:`T153903`)
+* Make WbRepresentations hashable (:phab:`T167827`)
Updates
~~~~~~~
* Update linktails
* Update languages_by_size
* Update cross_allowed (global bot wikis group)
-* Add atjwiki to wikipedia family file (T168049)
+* Add atjwiki to wikipedia family file (:phab:`T168049`)
* remove closed sites from languages_by_size list
* Update category_redirect_templates for wikipedia and commons Family
* Update logevent type parameter list
-* Disable cleanUpSectionHeaders on jbo.wiktionary (T168399)
-* Add kbpwiki to wikipedia family file (T169216)
-* Remove anarchopedia family out of the framework (T167534)
+* Disable cleanUpSectionHeaders on jbo.wiktionary (:phab:`T168399`)
+* Add kbpwiki to wikipedia family file (:phab:`T169216`)
+* Remove anarchopedia family out of the framework (:phab:`T167534`)
3.0.20170521
------------
@@ -1143,49 +1142,49 @@
Bugfixes
~~~~~~~~
-* Increase the default socket_timeout to 75 seconds (T163635)
-* use repr() of exceptions to prevent UnicodeDecodeErrors (T120222)
-* Handle offset mismatches during chunked upload (T156402)
-* Correct _wbtypes equality comparison (T160282)
-* Re-enable getFileVersionHistoryTable() method (T162528)
-* Replaced the word 'async' with 'asynchronous' due to py3.7 (T106230)
-* Raise ImportError if no editor is available (T163632)
-* templatesWithParams: cache and standardise params (T113892)
-* getInternetArchiveURL: Retry http.fetch if there is a ConnectionError (T164208)
-* Remove wikidataquery from pywikibot (T162585)
+* Increase the default socket_timeout to 75 seconds (:phab:`T163635`)
+* use repr() of exceptions to prevent UnicodeDecodeErrors (:phab:`T120222`)
+* Handle offset mismatches during chunked upload (:phab:`T156402`)
+* Correct _wbtypes equality comparison (:phab:`T160282`)
+* Re-enable getFileVersionHistoryTable() method (:phab:`T162528`)
+* Replaced the word 'async' with 'asynchronous' due to py3.7 (:phab:`T106230`)
+* Raise ImportError if no editor is available (:phab:`T163632`)
+* templatesWithParams: cache and standardise params (:phab:`T113892`)
+* getInternetArchiveURL: Retry http.fetch if there is a ConnectionError (:phab:`T164208`)
+* Remove wikidataquery from pywikibot (:phab:`T162585`)
Improvements
~~~~~~~~~~~~
-* Introduce user_add_claim and allow asynchronous ItemPage.addClaim (T87493)
-* Enable private edit summary in specialbots (T162527)
+* Introduce user_add_claim and allow asynchronous ItemPage.addClaim (:phab:`T87493`)
+* Enable private edit summary in specialbots (:phab:`T162527`)
* Make a decorator for asynchronous methods
* Provide options by a separate handler class
-* Show a warning when a LogEntry type is not known (T135505)
+* Show a warning when a LogEntry type is not known (:phab:`T135505`)
* Add Wikibase Client extension requirement to APISite.unconnectedpages()
* Update content after editing entity
-* Make WbTime from Timestamp and vice versa (T131624)
-* Add support for geo-shape Wikibase data type (T161726)
-* Add async parameter to ItemPage.editEntity (T86074)
-* Make sparql use Site to access sparql endpoint and entity_url (T159956)
+* Make WbTime from Timestamp and vice versa (:phab:`T131624`)
+* Add support for geo-shape Wikibase data type (:phab:`T161726`)
+* Add async parameter to ItemPage.editEntity (:phab:`T86074`)
+* Make sparql use Site to access sparql endpoint and entity_url (:phab:`T159956`)
* timestripper: search wikilinks to reduce false matches
* Set Coordinate globe via item
* use extract_templates_and_params_regex_simple for template validation
* Add _items for WbMonolingualText
-* Allow date-versioned pypi releases from setup.py (T152907)
+* Allow date-versioned pypi releases from setup.py (:phab:`T152907`)
* Provide site to WbTime via WbTime.fromWikibase
-* Provide preloading via GeneratorFactory.getCombinedGenerator() (T135331)
-* Accept QuitKeyboardInterrupt in specialbots.Uploadbot (T163970)
-* Remove unnecessary description change message when uploading a file (T163108)
+* Provide preloading via GeneratorFactory.getCombinedGenerator() (:phab:`T135331`)
+* Accept QuitKeyboardInterrupt in specialbots.Uploadbot (:phab:`T163970`)
+* Remove unnecessary description change message when uploading a file (:phab:`T163108`)
* Add 'OptionHandler' to bot.__all__ tuple
* Use FilePage.upload inside UploadRobot
-* Add support for tabular-data Wikibase data type (T163981)
-* Get thumburl information in FilePage() (T137011)
+* Add support for tabular-data Wikibase data type (:phab:`T163981`)
+* Get thumburl information in FilePage() (:phab:`T137011`)
Updates
~~~~~~~
* Update languages_by_size in family files
* wikisource_family.py: Add "pa" to languages_by_size
-* Config2: limit the number of retries to 15 (T165898)
+* Config2: limit the number of retries to 15 (:phab:`T165898`)
3.0.20170403
------------
@@ -1196,25 +1195,25 @@
Bugfixes
~~~~~~~~
-* Use default summary when summary value does not contain a string (T160823)
-* Enable specialbots.py for PY3 (T161457)
-* Change tw(n)translate from Site.code to Site.lang dependency (T140624)
-* Do not use the "imp" module in Python 3 (T158640)
-* Make sure the order of parameters does not change (T161291)
-* Use pywikibot.tools.Counter instead of collections.Counter (T160620)
+* Use default summary when summary value does not contain a string (:phab:`T160823`)
+* Enable specialbots.py for PY3 (:phab:`T161457`)
+* Change tw(n)translate from Site.code to Site.lang dependency (:phab:`T140624`)
+* Do not use the "imp" module in Python 3 (:phab:`T158640`)
+* Make sure the order of parameters does not change (:phab:`T161291`)
+* Use pywikibot.tools.Counter instead of collections.Counter (:phab:`T160620`)
* Introduce a new site method page_from_repository()
-* Add pagelist tag for replaceExcept (T151940)
-* logging in python3 when deprecated_args decorator is used (T159077)
-* Avoid ResourceWarning using subprocess in python 3.6 (T159646)
-* load_pages_from_pageids: do not fail on empty string (T153592)
-* Add missing not-equal comparison for wbtypes (T158848)
-* textlib.getCategoryLinks catch invalid category title exceptions (T154309)
-* Fix html2unicode (T130925)
-* Ignore first letter case on 'first-letter' sites, obey it otherwise (T130917)
-* textlib.py: Limit catastrophic backtracking in FILE_LINK_REGEX (T148959)
-* FilePage.get_file_history(): Check for len(self._file_revisions) (T155740)
-* Fix for positional_arg behavior of GeneratorFactory (T155227)
-* Fix broken LDAP based login (T90149)
+* Add pagelist tag for replaceExcept (:phab:`T151940`)
+* logging in python3 when deprecated_args decorator is used (:phab:`T159077`)
+* Avoid ResourceWarning using subprocess in python 3.6 (:phab:`T159646`)
+* load_pages_from_pageids: do not fail on empty string (:phab:`T153592`)
+* Add missing not-equal comparison for wbtypes (:phab:`T158848`)
+* textlib.getCategoryLinks catch invalid category title exceptions (:phab:`T154309`)
+* Fix html2unicode (:phab:`T130925`)
+* Ignore first letter case on 'first-letter' sites, obey it otherwise (:phab:`T130917`)
+* textlib.py: Limit catastrophic backtracking in FILE_LINK_REGEX (:phab:`T148959`)
+* FilePage.get_file_history(): Check for len(self._file_revisions) (:phab:`T155740`)
+* Fix for positional_arg behavior of GeneratorFactory (:phab:`T155227`)
+* Fix broken LDAP based login (:phab:`T90149`)
Improvements
~~~~~~~~~~~~
@@ -1222,17 +1221,17 @@
* Renamed isImage and isCategory
* Add -property option to pagegenerators.py
* Add a new site method pages_with_property
-* Allow retrieval of unit as ItemPage for WbQuantity (T143594)
+* Allow retrieval of unit as ItemPage for WbQuantity (:phab:`T143594`)
* return result of userPut with put_current method
* Provide a new generator which yields a subclass of Page
* Implement FilePage.download()
* make general function to compute file sha
-* Support adding units to WbQuantity through ItemPage or entity url (T143594)
+* Support adding units to WbQuantity through ItemPage or entity url (:phab:`T143594`)
* Make PropertyPage.get() return a dictionary
* Add Wikibase Client extension requirement to APISite.unconnectedpages()
* Make Wikibase Property provide labels data
-* APISite.data_repository(): handle warning with re.match() (T156596)
-* GeneratorFactory: make getCategory respect self.site (T155687)
+* APISite.data_repository(): handle warning with re.match() (:phab:`T156596`)
+* GeneratorFactory: make getCategory respect self.site (:phab:`T155687`)
* Fix and improve default regexes
Updates
@@ -1248,7 +1247,7 @@
* self.doc_subpages for Meta-Wiki
* Updating Wikibooks projects which allows global bots
* Updated list of closed projects
-* Add 'Bilde' as a namespace alias for file namespace of nn Wikipedia (T154947)
+* Add 'Bilde' as a namespace alias for file namespace of nn Wikipedia (:phab:`T154947`)
2.0rc5
------
diff --git a/ROADMAP.rst b/ROADMAP.rst
index f9e86ae..d6f87bf 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -1,21 +1,21 @@
Current release changes
^^^^^^^^^^^^^^^^^^^^^^^
-* TextExtracts support was aded (T72682)
+* TextExtracts support was added (:phab:`T72682`)
* Unused `get_redirect` parameter of Page.getOldVersion() has been dropped
* Provide BasePage.get_parsed_page() as a public method
* Provide BuiltinNamespace.canonical_namespaces() with BuiltinNamespace IntEnum
* BuiltinNamespace got a canonical() method
-* Enable nested templates with MultiTemplateMatchBuilder (T110529)
+* Enable nested templates with MultiTemplateMatchBuilder (:phab:`T110529`)
* Introduce APISite.simple_request as a public method
* Provide an Uploader class to upload files
* Enable use of deletetalk parameter of the delete API
-* Fix contextlib redirection for terminal interfaces (T283808)
-* No longer use win32_unicode for Python 3.6+ (T281042, T283808, T303373)
+* Fix contextlib redirection for terminal interfaces (:phab:`T283808`)
+* No longer use win32_unicode for Python 3.6+ (:phab:`T281042`, :phab:`T283808`, :phab:`T303373`)
* L10N updates
* -cosmetic_changes (-cc) option allows assigning the value directly instead of toggling it
* distutils.util.strtobool() was implemented as tools.strtobool() due to :pep:`632`
-* The "in" operator always return whether the siteinfo contains the key even it is not cached (T302859)
+* The "in" operator always return whether the siteinfo contains the key even it is not cached (:phab:`T302859`)
* Siteinfo.clear() and Siteinfo.is_cached() methods were added
Deprecations
@@ -25,7 +25,7 @@
* 7.1.0: APISite._simple_request() will be removed in favour of APISite.simple_request()
* 7.0.0: The i18n identifier 'cosmetic_changes-append' will be removed in favour of 'pywikibot-cosmetic-changes'
* 7.0.0: User.isBlocked() method is renamed to is_blocked for consistency
-* 7.0.0: Require mysql >= 0.7.11 (T216741)
+* 7.0.0: Require mysql >= 0.7.11 (:phab:`T216741`)
* 7.0.0: Private BaseBot counters _treat_counter, _save_counter, _skip_counter will be removed in favour of collections.Counter counter attribute
* 7.0.0: A boolean watch parameter in Page.save() is deprecated and will be desupported
* 7.0.0: baserevid parameter of editSource(), editQualifier(), removeClaims(), removeSources(), remove_qualifiers() DataSite methods will be removed
diff --git a/docs/api_ref/pywikibot.config.rst b/docs/api_ref/pywikibot.config.rst
index 354229c..a4fe0fb 100644
--- a/docs/api_ref/pywikibot.config.rst
+++ b/docs/api_ref/pywikibot.config.rst
@@ -18,8 +18,6 @@
:end-before: # #############
-.. _user-interface-settings:
-
User Interface Settings
+++++++++++++++++++++++
@@ -41,9 +39,6 @@
:start-at: # ############# LOGFILE SETTINGS
:end-before: # #############
-
-.. _external-script-path-settings:
-
External Script Path Settings
+++++++++++++++++++++++++++++
diff --git a/docs/conf.py b/docs/conf.py
index 2a89d4f..6997f15 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -1,6 +1,6 @@
"""Configuration file for Sphinx."""
#
-# (C) Pywikibot team, 2014-2021
+# (C) Pywikibot team, 2014-2022
#
# Distributed under the terms of the MIT license.
#
@@ -50,12 +50,15 @@
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
-extensions = ['sphinx.ext.autodoc',
- 'sphinx.ext.todo',
- 'sphinx.ext.coverage',
- 'sphinx.ext.viewcode',
- 'sphinx.ext.autosummary',
- 'sphinx.ext.napoleon']
+extensions = [
+ 'sphinx.ext.autodoc',
+ 'sphinx.ext.autosectionlabel',
+ 'sphinx.ext.extlinks',
+ 'sphinx.ext.todo',
+ 'sphinx.ext.viewcode',
+ 'sphinx.ext.autosummary',
+ 'sphinx.ext.napoleon',
+]
# Allow lines like "Example:" to be followed by a code block
napoleon_use_admonition_for_examples = True
@@ -353,6 +356,10 @@
# Other settings
autodoc_typehints = 'description'
+extlinks = {
+ 'phab': ('https://phabricator.wikimedia.org/%s', '%s')
+}
+
TOKENS_WITH_PARAM = [
# sphinx
@@ -407,6 +414,15 @@
lines[:] = result[:] # assignment required in this way
+def pywikibot_fix_phab_tasks(app, what, name, obj, options, lines):
+ """Convert Phabricator tasks id to a link using sphinx.ext.extlinks."""
+ result = []
+ for line in lines:
+ line = re.sub(r'(T\d{5,6})', r':phab:`\1`', line)
+ result.append(line)
+ lines[:] = result[:]
+
+
def pywikibot_docstring_fixups(app, what, name, obj, options, lines):
"""Fixup docstrings."""
if what not in ('class', 'exception'):
@@ -510,6 +526,7 @@
def setup(app):
"""Implicit Sphinx extension hook."""
app.connect('autodoc-process-docstring', pywikibot_epytext_to_sphinx)
+ app.connect('autodoc-process-docstring', pywikibot_fix_phab_tasks)
app.connect('autodoc-process-docstring', pywikibot_docstring_fixups)
app.connect('autodoc-process-docstring', pywikibot_script_docstring_fixups)
app.connect('autodoc-skip-member', pywikibot_skip_members)
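
For illustration, a minimal standalone sketch of what the hook and the
extlinks mapping above do together (a demo run outside Sphinx, using only
the standard library; the URL template simply mirrors the extlinks tuple
defined in this patch):

    import re

    # The autodoc-process-docstring hook rewrites bare task ids
    # (T followed by 5-6 digits) into the :phab: role.
    line = 'Fix gui settings for Python 3.7.4+ (T241216)'
    line = re.sub(r'(T\d{5,6})', r':phab:`\1`', line)
    print(line)  # Fix gui settings for Python 3.7.4+ (:phab:`T241216`)

    # sphinx.ext.extlinks then expands the role: the first tuple element
    # is the link target template, the second one the caption format.
    target = 'T241216'
    print('https://phabricator.wikimedia.org/%s' % target)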
diff --git a/docs/faq.rst b/docs/faq.rst
index 0e6432e..a0f9555 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -23,7 +23,7 @@
**The bot cannot delete pages**
Your account needs delete rights on your wiki. If you have set up another
account in your user_config use ``-user``
- :ref:`global option <global options>` to change it.
+ :ref:`global options` to change it.
Maybe you have to log in first.
**ERROR: Unable to execute script because no *generator* was defined.**
diff --git a/docs/global_options.rst b/docs/global_options.rst
index 9d3d921..57af62a 100644
--- a/docs/global_options.rst
+++ b/docs/global_options.rst
@@ -1,5 +1,3 @@
-.. _global options:
-
Global Options
--------------
diff --git a/docs/glossary.rst b/docs/glossary.rst
index 7361e7b..57dcc97 100644
--- a/docs/glossary.rst
+++ b/docs/glossary.rst
@@ -36,10 +36,9 @@
- the :mod:`pwb` wrapper script
python2
- A :term:`tag` for the last Pywikibot release :ref:`3.0.20200703<python2>`
- supporting Python 2 (Python 2.7.3 - 2.7.18). Ask for `Python 2 to
- 3 support <https://phabricator.wikimedia.org/T242120>`_ to
- convert your old scripts.
+ A :term:`tag` for the last Pywikibot release :ref:`3.0.20200703`
+ supporting Python 2 (Python 2.7.3 - 2.7.18). Ask for :phab:`Python 2 to
+ 3 support <T242120>` to convert your old scripts.
pywikibot
**Py**\ thon Media\ **Wiki Bot** Framework, a Python library and
diff --git a/docs/installation.rst b/docs/installation.rst
index 7cd30f2..1c7f6c3 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -38,6 +38,6 @@
.. seealso::
* :manpage:`i18n` Manual
* `MediaWiki Language Codes <https://www.mediawiki.org/wiki/Manual:Language#Language_code>`_
- * :ref:`User Interface Settings<user-interface-settings>`
+ * :ref:`User Interface Settings`
* :py:mod:`pywikibot.i18n`
diff --git a/docs/licenses.rst b/docs/licenses.rst
index e773820..1c56566 100644
--- a/docs/licenses.rst
+++ b/docs/licenses.rst
@@ -4,11 +4,10 @@
========
**The framework itself and this documentation** are available under the
-:ref:`MIT license <licenses-MIT>`; manual pages on mediawiki.org are
+:ref:`MIT license`; manual pages on mediawiki.org are
available under the `CC-BY-SA 3.0`_ license. The Pywikibot logo is
available under the `CC-BY-SA 4.0`_ license.
-.. _licenses-MIT:
MIT License
-----------
diff --git a/scripts/CHANGELOG.md b/scripts/CHANGELOG.md
index 43413ed..5c7082d 100644
--- a/scripts/CHANGELOG.md
+++ b/scripts/CHANGELOG.md
@@ -8,7 +8,7 @@
* -always option was enabled
### reflinks
-* Decode pdfinfo if it is bytes content (T303731)
+* Decode pdfinfo if it is bytes content (:phab:`T303731`)
## 7.0.0
@@ -16,61 +16,61 @@
### general
* L10N updates
-* Provide ConfigParserBot for several scripts (T223778)
+* Provide ConfigParserBot for several scripts (:phab:`T223778`)
### add_text
-* Provide -create and -createonly options (T291354)
+* Provide -create and -createonly options (:phab:`T291354`)
* Deprecated function get_text() was removed in favour of Page.text and BaseBot.skip_page()
* Deprecated function put_text() was removed in favour of BaseBot.userPut() method
* Deprecated function add_text() was removed in favour of textlib.add_text()
### blockpageschecker
+* Use different edit comments when adding, changing or removing templates (:phab:`T291345`)
-* Derive CheckerBot from ConfigParserBot (T57106)
-* Derive CheckerBot from CurrentPageBot (T196851, T171713)
+* Use different edit comments when adding, changeing or removing templates (:phab:`T291345`)
+* Derive CheckerBot from ConfigParserBot (:phab:`T57106`)
+* Derive CheckerBot from CurrentPageBot (:phab:`T196851`, :phab:`T171713`)
### category
* CleanBot was added which can be invoked by clean action option
* Recurse CategoryListifyRobot with depth
-* Show a warning if a pagegenerator option is not enabled (T298522)
+* Show a warning if a pagegenerator option is not enabled (:phab:`T298522`)
* Deprecated code parts were removed
### checkimages
-* Skip PageSaveRelatedError and ServerError when putting talk page (T302174)
+* Skip PageSaveRelatedError and ServerError when putting talk page (:phab:`T302174`)
### commonscat
-* Ignore InvalidTitleError in CommonscatBot.findCommonscatLink (T291783)
+* Ignore InvalidTitleError in CommonscatBot.findCommonscatLink (:phab:`T291783`)
### cosmetic_changes
-* Ignore InvalidTitleError in CosmeticChangesBot.treat_page (T293612)
+* Ignore InvalidTitleError in CosmeticChangesBot.treat_page (:phab:`T293612`)
### djvutext
-* pass site arg only once (T292367)
+* pass site arg only once (:phab:`T292367`)
### fixing_redirects
* Let only put_current show the message "No changes were needed"
-* Use concurrent.futures to retrieve redirect or moved targets (T298789)
-* Add an option to ignore solving moved targets (T298789)
+* Use concurrent.futures to retrieve redirect or moved targets (:phab:`T298789`)
+* Add an option to ignore solving moved targets (:phab:`T298789`)
### imagetransfer
-* Add support for chunked uploading (T300531)
+* Add support for chunked uploading (:phab:`T300531`)
### newitem
* Do not pass OtherPageSaveRelatedError silently
### pagefromfile
* Preload pages instead of reading them one by one before putting changes
-* Don't ask for confirmation by default (T291757)
+* Don't ask for confirmation by default (:phab:`T291757`)
### redirect
-* Use site.maxlimit to determine the highest limit to load (T299859)
+* Use site.maxlimit to determine the highest limit to load (:phab:`T299859`)
### replace
-* Enable default behaviour with -mysqlquery (T299306)
+* Enable default behaviour with -mysqlquery (:phab:`T299306`)
* Deprecated "acceptall" and "addedCat" parameters were replaced by "always" and "addcat"
### revertbot
-* Add support for translated dates/times (T102174)
+* Add support for translated dates/times (:phab:`T102174`)
* Deprecated "max" parameter was replaced by "total"
### solve_disambiguation
@@ -80,7 +80,7 @@
* Do not pass OtherPageSaveRelatedError silently
### unusedfiles
-* Use oldest_file_info.user as uploader (T301768)
+* Use oldest_file_info.user as uploader (:phab:`T301768`)
## 6.6.1
@@ -101,7 +101,7 @@
*05 August 2021*
### reflinks
-* Don't ignore identical references with newline in ref content (T286369)
+* Don't ignore identical references with newline in ref content (:phab:`T286369`)
* L10N updates
@@ -112,29 +112,29 @@
* show a warning if pywikibot.__version__ is behind scripts.__version__
### addtext
-* Deprecate get_text, put_text and add_text functions (T284388)
-* Use AutomaticTWSummaryBot and NoRedirectPageBot bot class instead of functions (T196851)
+* Deprecate get_text, put_text and add_text functions (:phab:`T284388`)
+* Use AutomaticTWSummaryBot and NoRedirectPageBot bot class instead of functions (:phab:`T196851`)
### blockpageschecker
* Script was unarchived
### commonscat
-* Enable multiple sites (T57083)
+* Enable multiple sites (:phab:`T57083`)
* Use new textlib.add_text function
### cosmetic_changes
-* set -ignore option to CANCEL.MATCH by default (T108446)
+* set -ignore option to CANCEL.MATCH by default (:phab:`T108446`)
### fixing_redirects
-* Add -overwrite option (T235219)
+* Add -overwrite option (:phab:`T235219`)
### imagetransfer
-* Skip pages which does not exist on source site (T284414)
+* Skip pages which do not exist on source site (:phab:`T284414`)
* Use roundrobin_generators to combine multiple template inclusions
-* Allow images existing in the shared repo (T267535)
+* Allow images existing in the shared repo (:phab:`T267535`)
### template
-* Do not try to initialze generator twice in TemplateRobot (T284534)
+* Do not try to initialize generator twice in TemplateRobot (:phab:`T284534`)
### update_script
* compat2core script was restored and renamed to update_script
@@ -166,22 +166,22 @@
* dry parameter of CategoryAddBot will be removed
### commonscat
-* Ignore InvalidTitleError (T267742)
-* exit checkCommonscatLink method if target name is empty (T282693)
+* Ignore InvalidTitleError (:phab:`T267742`)
+* exit checkCommonscatLink method if target name is empty (:phab:`T282693`)
### fixing_redirects
-* ValueError will be ignored (T283403, T111513)
-* InterwikiRedirectPageError will be ignored (T137754)
-* InvalidPageError will be ignored (T280043)
+* ValueError will be ignored (:phab:`T283403`, :phab:`T111513`)
+* InterwikiRedirectPageError will be ignored (:phab:`T137754`)
+* InvalidPageError will be ignored (:phab:`T280043`)
### reflinks
* Use consecutive reference numbers for autogenerated links
### replace
-* InvalidPageError will be ignored (T280043)
+* InvalidPageError will be ignored (:phab:`T280043`)
### upload
-* Support async chunked uploads (T129216)
+* Support async chunked uploads (:phab:`T129216`)
## 6.1.0
@@ -196,19 +196,19 @@
* watchlist.py was restored
### archivebot
-* PageArchiver.maxsize must be defined before load_config() (T277547)
+* PageArchiver.maxsize must be defined before load_config() (:phab:`T277547`)
* Time period must have a qualifier
### imagetransfer
-* Fix usage of -tofamily -tolang options (T279232)
+* Fix usage of -tofamily -tolang options (:phab:`T279232`)
### misspelling
* Use the new DisambiguationRobot interface and options
### reflinks
-* Catch urllib3.LocationParseError and skip link (T280356)
+* Catch urllib3.LocationParseError and skip link (:phab:`T280356`)
* L10N updates
-* Avoid dupliate reference names (T278040)
+* Avoid duplicate reference names (:phab:`T278040`)
### solve_disambiguation
* Keyword arguments are recommended if deriving the bot; opt option handler is used.
@@ -221,12 +221,12 @@
*15 March 2021*
### general
-* interwikidumps.py, cfd.py and featured.py scripts were deleted (T223826)
-* Long time unused scripts were archived (T223826). Ask to recover if needed.
+* interwikidumps.py, cfd.py and featured.py scripts were deleted (:phab:`T223826`)
+* Long-time unused scripts were archived (:phab:`T223826`); ask for recovery if needed.
* pagegenerators.handle_args() is used in several scripts
### archivebot
-* Always take 'maxarticlesize' into account when saving (T276937)
+* Always take 'maxarticlesize' into account when saving (:phab:`T276937`)
* Remove deprecated parts
### category
@@ -236,16 +236,16 @@
* New script to wrap Commons file descriptions in language templates
### generate_family_file
-* Ignore ssl certificate validation (T265210)
+* Ignore ssl certificate validation (:phab:`T265210`)
### login
* update help string
### maintenance
+* Add a preload_sites.py script to preload site information (:phab:`T226157`)
+* Add a preload_sites.py script to preload site informations (:phab:`T226157`)
### reflinks
-* Force pdf file to be closed (T276747)
+* Force pdf file to be closed (:phab:`T276747`)
* Fix http.fetch response data attribute
* Fix treat process flow
@@ -253,7 +253,7 @@
* Add replacement description to -summary message
### replicate_wiki
-* replace pages in all sites (T275291)
+* replace pages in all sites (:phab:`T275291`)
### solve_disambiguation
* Deprecated methods were removed
@@ -267,15 +267,15 @@
*24 January 2021*
### general
-* pagegenerators handleArg was renamed to handle_arg (T271437)
+* pagegenerators handleArg was renamed to handle_arg (:phab:`T271437`)
* i18n updates
### add_text
-* bugfix: str.join() expects an iterable not multiple args (T272223)
+* bugfix: str.join() expects an iterable not multiple args (:phab:`T272223`)
### redirect
-* pagegenerators -page option was implemented (T100643)
-* pagegenerators namespace filter was implemented (T234133, T271116)
+* pagegenerators -page option was implemented (:phab:`T100643`)
+* pagegenerators namespace filter was implemented (:phab:`T234133`, :phab:`T271116`)
### weblinkchecker
* Deprecated LinkChecker class was removed
@@ -292,7 +292,7 @@
* -except option was renamed to -grepnot from pagegenerators
### solve_disambiguation
-* ignore ValueError when parsing a Link object (T111513)
+* ignore ValueError when parsing a Link object (:phab:`T111513`)
## 5.4.0
@@ -322,8 +322,8 @@
*10 December 2020*
### general
-* Removed unsupported BadTitle Exception (T267768)
-* Replaced PageNotSaved by PageSaveRelatedError (T267821)
+* Removed unsupported BadTitle Exception (:phab:`T267768`)
+* Replaced PageNotSaved by PageSaveRelatedError (:phab:`T267821`)
* Update scripts to support Python 3.5+ only
* i18n updates
* L10N updates
@@ -332,17 +332,17 @@
* Make BasicBot example a ConfigParserBot to explain the usage
### clean_sandbox
-* Fix TypeError (T267717)
+* Fix TypeError (:phab:`T267717`)
### fixing_redirects
-* Ignore RuntimeError for missing 'redirects' in api response (T267567)
+* Ignore RuntimeError for missing 'redirects' in api response (:phab:`T267567`)
### imagetransfer
* Implement -tosite command and other improvements
-* Do not use UploadRobot.run() with imagetransfer (T267579)
+* Do not use UploadRobot.run() with imagetransfer (:phab:`T267579`)
### interwiki
-* Use textfile for interwiki dumps and enable -restore:all option (T74943, T213624)
+* Use textfile for interwiki dumps and enable -restore:all option (:phab:`T74943`, :phab:`T213624`)
### makecat
* Use input_choice for options
@@ -350,16 +350,16 @@
* Other improvements
### revertbot
-* Take rollbacktoken to revert (T250509)
+* Take rollbacktoken to revert (:phab:`T250509`)
### solve_disambiguation
* Write ignoring pages as a whole
### touch
-* Fix available_options and purge options (T268394)
+* Fix available_options and purge options (:phab:`T268394`)
### weblinkchecker
-* Fix AttributeError of HttpRequest (T269821)
+* Fix AttributeError of HttpRequest (:phab:`T269821`)
## 5.1.0
@@ -367,35 +367,35 @@
### general
* i18n updates
-* switch to new OptionHandler interface (T264721)
+* switch to new OptionHandler interface (:phab:`T264721`)
### change_pagelang
* New script was added
### download_dump
-* Make `dumpdate` param work when using the script in Toolforge (T266630)
+* Make `dumpdate` param work when using the script in Toolforge (:phab:`T266630`)
### imagetransfer
-* Remove outdated "followRedirects" parameter from imagelinks(); treat instead of run method (T266867, T196851, T171713)
+* Remove outdated "followRedirects" parameter from imagelinks(); treat instead of run method (:phab:`T266867`, :phab:`T196851`, :phab:`T171713`)
### interwiki
* Replace deprecated originPage by origin in Subjects
### misspelling
-* Enable misspelling.py for several sites using wikidata (T258859, T94681)
+* Enable misspelling.py for several sites using wikidata (:phab:`T258859`, :phab:`T94681`)
### noreferences
-* Rename NoReferencesBot.run to treat (T196851, T171713)
-* Use wikidata item instead of dropped MediaWiki message for default category (T266413)
+* Rename NoReferencesBot.run to treat (:phab:`T196851`, :phab:`T171713`)
+* Use wikidata item instead of dropped MediaWiki message for default category (:phab:`T266413`)
### reflinks
* Derive ReferencesRobot from ExistingPageBot and NoRedirectPageBot
* Use chardet to find a valid encoding (:phab:`T266862`)
-* Rename ReferencesRobot.run to treat (T196851, T171713)
-* Ignore duplication replacements inside templates (T266411)
-* Fix edit summary (T265968)
-* Add Server414Error in and close file after reading (T266000)
-* Call ReferencesRobot.setup() (T265928)
+* Rename ReferencesRobot.run to treat (:phab:`T196851`, :phab:`T171713`)
+* Ignore duplication replacements inside templates (:phab:`T266411`)
+* Fix edit summary (:phab:`T265968`)
+* Add Server414Error in and close file after reading (:phab:`T266000`)
+* Call ReferencesRobot.setup() (:phab:`T265928`)
### welcome
* Replace _COLORS and _MSGS dicts by Enum
@@ -414,11 +414,11 @@
* Split initializer and put getting whitelist to its own method
### checkimages
-* Re-enable -sleep parameter (T264521)
+* Re-enable -sleep parameter (:phab:`T264521`)
### commonscat
-* get commons category from wikibase (T175207)
-* Adjust save counter (T262772)
+* get commons category from wikibase (:phab:`T175207`)
+* Adjust save counter (:phab:`T262772`)
### flickrripper
* Improve option handling
@@ -427,19 +427,19 @@
* Improvements were made
### imagetransfer
-* Do not encode str to bytes (T265257)
+* Do not encode str to bytes (:phab:`T265257`)
### match_images
* Improvements
### parser_function_count
-Porting parser_function_count.py from compat to core/scripts (T66878)
+* Porting parser_function_count.py from compat to core/scripts (:phab:`T66878`)
### reflinks
-decode byte-like object meta_content.group() (T264575)
+* decode byte-like object meta_content.group() (:phab:`T264575`)
### speedy_delete
-* port speedy_delete.py to core (T66880)
+* port speedy_delete.py to core (:phab:`T66880`)
### weblinkchecker
* Use ThreadList with weblinkchecker
@@ -480,7 +480,7 @@
* i18n updates
### download_dump
-* Move this script to script folder (T123885, T184033)
+* Move this script to script folder (:phab:`T123885`, :phab:`T184033`)
### replace
* Show a FutureWarning for deprecated doReplacements method
@@ -496,13 +496,13 @@
*4 August 2020*
### general
-* Remove Python 2 related code (T257399)
+* Remove Python 2 related code (:phab:`T257399`)
* i18n updates
* L10N updates
### archivebot
* Only mention archives where something was really archived
-* Reset counter when "era" changes (T215247)
+* Reset counter when "era" changes (:phab:`T215247`)
* Code improvements and cleanups
* Fix ShouldArchive type
* Refactor PageArchiver's main loop
@@ -510,7 +510,7 @@
* Fix str2size to allow space separators
### cfd
-* Script was archived and is no longer supported (T223826)
+* Script was archived and is no longer supported (:phab:`T223826`)
### delete
-* Use Dict in place of DefaultDict (T257770)
+* Use Dict in place of DefaultDict (:phab:`T257770`)
diff --git a/tox.ini b/tox.ini
index 4e33252..feaffcb 100644
--- a/tox.ini
+++ b/tox.ini
@@ -77,7 +77,7 @@
[testenv:doc]
commands =
sphinx-build -M html ./docs ./docs/_build
- rstcheck --recursive --report warning .
+ rstcheck --recursive --report warning --ignore-roles phab .
basepython = python3
deps =
-rrequirements.txt
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/772029
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Idfbabc1694592d93d8b344915623ad2dc6a5417e
Gerrit-Change-Number: 772029
Gerrit-PatchSet: 6
Gerrit-Owner: Xqt <info@gno.de>
Gerrit-Reviewer: Xqt <info@gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771987 )
Change subject: [IMPR] Outsource wikibase objects to page._wikibase.py keeping its history
......................................................................
[IMPR] Outsource wikibase objects to page._wikibase.py keeping its history
Change-Id: If7a3d04f61f7b9afbc8e9cf5209c779805b34d23
---
M pywikibot/CONTENT.rst
M pywikibot/page/__init__.py
M pywikibot/page/_links.py
M pywikibot/page/_pages.py
M pywikibot/page/_wikibase.py
M tox.ini
6 files changed, 38 insertions(+), 5157 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Verified; Looks good to me, approved
diff --git a/pywikibot/CONTENT.rst b/pywikibot/CONTENT.rst
index 4712112..76b64f7 100644
--- a/pywikibot/CONTENT.rst
+++ b/pywikibot/CONTENT.rst
@@ -104,14 +104,16 @@
+============================+======================================================+
| __init__.py | Interface representing MediaWiki pages |
+----------------------------+------------------------------------------------------+
- | _basepage.py | Objects representing MediaWiki pages |
- +----------------------------+------------------------------------------------------+
| _collections.py | Structures holding data for Wikibase entities |
+----------------------------+------------------------------------------------------+
| _decorators.py | Decorators used by page objects |
+----------------------------+------------------------------------------------------+
+ | _pages.py | Objects representing MediaWiki pages |
+ +----------------------------+------------------------------------------------------+
| _revision.py | Object representing page revision |
+----------------------------+------------------------------------------------------+
+ | _wikibase.py | Objects representing wikibase structures |
+ +----------------------------+------------------------------------------------------+
+----------------------------+------------------------------------------------------+
diff --git a/pywikibot/page/__init__.py b/pywikibot/page/__init__.py
index 73c542a..2caa740 100644
--- a/pywikibot/page/__init__.py
+++ b/pywikibot/page/__init__.py
@@ -6,23 +6,25 @@
#
from typing import Union
-from pywikibot.page._basepage import (
+from pywikibot.page._links import BaseLink, Link, SiteLink, html2unicode
+from pywikibot.page._pages import (
BasePage,
Category,
- Claim,
FileInfo,
FilePage,
+ Page,
+ User,
+)
+from pywikibot.page._revision import Revision
+from pywikibot.page._wikibase import (
+ Claim,
ItemPage,
MediaInfo,
- Page,
Property,
PropertyPage,
- User,
WikibaseEntity,
WikibasePage,
)
-from pywikibot.page._links import BaseLink, Link, SiteLink, html2unicode
-from pywikibot.page._revision import Revision
from pywikibot.site import BaseSite as _BaseSite
from pywikibot.tools import deprecated, issue_deprecation_warning
from pywikibot.tools.chars import url2string as _url2string
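
A quick sanity check, as a sketch (assuming a checkout with this change
applied): the re-exports above keep the public API stable even though the
classes now live in the private _wikibase module.

    from pywikibot.page import Claim, ItemPage, WikibasePage

    # Defined in pywikibot/page/_wikibase.py since this change, but
    # still importable from pywikibot.page as before.
    assert ItemPage.__module__ == 'pywikibot.page._wikibase'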
diff --git a/pywikibot/page/_links.py b/pywikibot/page/_links.py
index 9400085..5e6dd09 100644
--- a/pywikibot/page/_links.py
+++ b/pywikibot/page/_links.py
@@ -1,7 +1,7 @@
"""Objects representing internal or interwiki link in wikitext.
.. note::
- `Link` objects represent a wiki-page's title, while
+ `Link` objects defined here represent a wiki-page's title, while
:class:`pywikibot.Page` objects represent the page itself, including
its contents.
"""
diff --git a/pywikibot/page/_pages.py b/pywikibot/page/_pages.py
index 6dfe675..3d48204 100644
--- a/pywikibot/page/_pages.py
+++ b/pywikibot/page/_pages.py
@@ -1,69 +1,57 @@
"""
-Objects representing various types of MediaWiki, including Wikibase, pages.
+Objects representing various types of MediaWiki pages, excluding Wikibase pages.
This module also includes objects:
-* Property: a type of semantic data.
-* Claim: an instance of a semantic assertion.
-* Revision: a single change to a wiki page.
* FileInfo: a structure holding imageinfo of latest rev. of FilePage
+
.. note::
+ `Link` objects represent a wiki-page's title, while
+ :class:`pywikibot.Page` objects (defined here) represent the page
+ itself, including its contents.
"""
#
# (C) Pywikibot team, 2008-2022
#
# Distributed under the terms of the MIT license.
#
-import json as jsonlib
-import logging
import os.path
import re
-from collections import Counter, OrderedDict, defaultdict
+from collections import Counter, defaultdict
from contextlib import suppress
from http import HTTPStatus
-from itertools import chain
from textwrap import shorten, wrap
-from typing import Any, Optional, Union
+from typing import Optional, Union
from urllib.parse import quote_from_bytes
from warnings import warn
import pywikibot
from pywikibot import config, date, i18n, textlib
-from pywikibot.backports import Dict, Generator, Iterable, List, Tuple
+from pywikibot.backports import Generator, Iterable, List, Tuple
from pywikibot.comms import http
from pywikibot.cosmetic_changes import CANCEL, CosmeticChangesToolkit
from pywikibot.exceptions import (
APIError,
AutoblockUserError,
- EntityTypeUnknownError,
Error,
InterwikiRedirectPageError,
InvalidPageError,
- InvalidTitleError,
IsNotRedirectPageError,
IsRedirectPageError,
NoMoveTargetError,
NoPageError,
NotEmailableError,
NoUsernameError,
- NoWikibaseEntityError,
OtherPageSaveError,
PageSaveRelatedError,
SectionError,
UnknownExtensionError,
UserRightsError,
- WikiBaseError,
-)
-from pywikibot.family import Family
-from pywikibot.page._collections import (
- AliasesDict,
- ClaimCollection,
- LanguageDict,
- SiteLinkCollection,
)
from pywikibot.page._decorators import allow_asynchronous
from pywikibot.page._links import BaseLink, Link
from pywikibot.page._revision import Revision
-from pywikibot.site import DataSite, Namespace, NamespaceArgType
+from pywikibot.site import Namespace, NamespaceArgType
from pywikibot.tools import (
ComparableMixin,
compute_file_hash,
@@ -80,25 +68,12 @@
__all__ = (
'BasePage',
'Category',
- 'Claim',
'FileInfo',
'FilePage',
- 'ItemPage',
- 'MediaInfo',
'Page',
- 'Property',
- 'PropertyPage',
'User',
- 'WikibaseEntity',
- 'WikibasePage',
)
-logger = logging.getLogger('pywiki.wiki.page')
-
-
-# Note: Link objects (defined later on) represent a wiki-page's title, while
-# Page objects (defined here) represent the page itself, including its
-# contents.
class BasePage(ComparableMixin):
@@ -1567,7 +1542,7 @@
:rtype: pywikibot.page.ItemPage
"""
- return ItemPage.fromPage(self)
+ return pywikibot.ItemPage.fromPage(self)
def templates(self, content: bool = False):
"""
@@ -2616,7 +2591,7 @@
"""
if self.site.has_extension('WikibaseMediaInfo'):
if not hasattr(self, '_item'):
- self._item = MediaInfo(self.site)
+ self._item = pywikibot.MediaInfo(self.site)
self._item._file = self
return self._item
@@ -3320,1807 +3295,6 @@
return self.isRegistered() and 'bot' not in self.groups()
-class WikibaseEntity:
-
- """
- The base interface for Wikibase entities.
-
- Each entity is identified by a data repository it belongs to
- and an identifier.
-
- :cvar DATA_ATTRIBUTES: dictionary which maps data attributes (e.g. 'labels',
- 'claims') to appropriate collection classes (e.g. LanguageDict,
- ClaimCollection)
-
- :cvar entity_type: entity type identifier
- :type entity_type: str
-
- :cvar title_pattern: regular expression which matches all possible
- entity ids for this entity type
- :type title_pattern: str
- """
-
- DATA_ATTRIBUTES = {} # type: Dict[str, Any]
-
- def __init__(self, repo, id_=None) -> None:
- """
- Initializer.
-
- :param repo: Entity repository.
- :type repo: DataSite
- :param id_: Entity identifier.
- :type id_: str or None; -1 and None mean non-existing
- """
- self.repo = repo
- self.id = id_ if id_ is not None else '-1'
- if self.id != '-1' and not self.is_valid_id(self.id):
- raise InvalidTitleError(
- "'{}' is not a valid {} page title"
- .format(self.id, self.entity_type))
-
- def __repr__(self) -> str:
- if self.id != '-1':
- return 'pywikibot.page.{}({!r}, {!r})'.format(
- self.__class__.__name__, self.repo, self.id)
- return 'pywikibot.page.{}({!r})'.format(
- self.__class__.__name__, self.repo)
-
- @classmethod
- def is_valid_id(cls, entity_id: str) -> bool:
- """
- Whether the string can be a valid id of the entity type.
-
- :param entity_id: The ID to test.
- """
- if not hasattr(cls, 'title_pattern'):
- return True
-
- return bool(re.fullmatch(cls.title_pattern, entity_id))
-
- def __getattr__(self, name):
- if name in self.DATA_ATTRIBUTES:
- if self.getID() == '-1':
- for key, cls in self.DATA_ATTRIBUTES.items():
- setattr(self, key, cls.new_empty(self.repo))
- return getattr(self, name)
- return self.get()[name]
-
- raise AttributeError("'{}' object has no attribute '{}'"
- .format(self.__class__.__name__, name))
-
- def _defined_by(self, singular: bool = False) -> dict:
- """
- Internal function to provide the API parameters to identify the entity.
-
- An empty dict is returned if the entity has not been created yet.
-
- :param singular: Whether the parameter names should use the singular
- form
- :return: API parameters
- """
- params = {}
- if self.id != '-1':
- if singular:
- params['id'] = self.id
- else:
- params['ids'] = self.id
- return params
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this entity.
-
- :param numeric: Strip the first letter and return an int
- """
- if numeric:
- return int(self.id[1:]) if self.id != '-1' else -1
- return self.id
-
- def get_data_for_new_entity(self) -> dict:
- """
- Return data required for creation of a new entity.
-
- Override it if you need to.
- """
- return {}
-
- def toJSON(self, diffto: Optional[dict] = None) -> dict:
- """
- Create JSON suitable for Wikibase API.
-
- When diffto is provided, JSON representing differences
- to the provided data is created.
-
- :param diffto: JSON containing entity data
- """
- data = {}
- for key in self.DATA_ATTRIBUTES:
- attr = getattr(self, key, None)
- if attr is None:
- continue
- if diffto:
- value = attr.toJSON(diffto=diffto.get(key))
- else:
- value = attr.toJSON()
- if value:
- data[key] = value
- return data
-
- @classmethod
- def _normalizeData(cls, data: dict) -> dict:
- """
- Helper function to expand data into the Wikibase API structure.
-
- :param data: The dict to normalize
- :return: The dict with normalized data
- """
- norm_data = {}
- for key, attr in cls.DATA_ATTRIBUTES.items():
- if key in data:
- norm_data[key] = attr.normalizeData(data[key])
- return norm_data
-
- @property
- def latest_revision_id(self) -> Optional[int]:
- """
- Get the revision identifier for the most recent revision of the entity.
-
- :rtype: int or None if it cannot be determined
- :raise NoWikibaseEntityError: if the entity doesn't exist
- """
- if not hasattr(self, '_revid'):
- # fixme: unlike BasePage.latest_revision_id, this raises an
- # exception when the entity is a redirect; get_redirect cannot be used
- self.get()
- return self._revid
-
- @latest_revision_id.setter
- def latest_revision_id(self, value: Optional[int]) -> None:
- self._revid = value
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- if hasattr(self, '_revid'):
- del self._revid
-
- def exists(self) -> bool:
- """Determine if an entity exists in the data repository."""
- if not hasattr(self, '_content'):
- try:
- self.get()
- return True
- except NoWikibaseEntityError:
- return False
- return 'missing' not in self._content
-
- def get(self, force: bool = False) -> dict:
- """
- Fetch all entity data and cache it.
-
- :param force: override caching
- :raise NoWikibaseEntityError: if this entity doesn't exist
- :return: actual data which entity holds
- """
- if force or not hasattr(self, '_content'):
- identification = self._defined_by()
- if not identification:
- raise NoWikibaseEntityError(self)
-
- try:
- data = self.repo.loadcontent(identification)
- except APIError as err:
- if err.code == 'no-such-entity':
- raise NoWikibaseEntityError(self)
- raise
- item_index, content = data.popitem()
- self.id = item_index
- self._content = content
- if 'missing' in self._content:
- raise NoWikibaseEntityError(self)
-
- self.latest_revision_id = self._content.get('lastrevid')
-
- data = {}
-
- # This initializes all data attributes.
- for key, cls in self.DATA_ATTRIBUTES.items():
- value = cls.fromJSON(self._content.get(key, {}), self.repo)
- setattr(self, key, value)
- data[key] = value
- return data
-
- def editEntity(self, data=None, **kwargs) -> None:
- """
- Edit an entity using Wikibase wbeditentity API.
-
- :param data: Data to be saved
- :type data: dict, or None to save the current content of the entity.
- """
- if data is None:
- data = self.toJSON(diffto=getattr(self, '_content', None))
- else:
- data = self._normalizeData(data)
-
- baserevid = getattr(self, '_revid', None)
-
- updates = self.repo.editEntity(
- self, data, baserevid=baserevid, **kwargs)
-
- # the attribute may have been unset in ItemPage
- if getattr(self, 'id', '-1') == '-1':
- self.__init__(self.repo, updates['entity']['id'])
-
- # the response also contains some data under the 'entity' key
- # but it is NOT the actual content
- # see also [[d:Special:Diff/1356933963]]
- # TODO: there might be some circumstances under which
- # the content can be safely reused
- if hasattr(self, '_content'):
- del self._content
- self.latest_revision_id = updates['entity'].get('lastrevid')
-
- def concept_uri(self) -> str:
- """
- Return the full concept URI.
-
- :raise NoWikibaseEntityError: if this entity doesn't exist
- """
- entity_id = self.getID()
- if entity_id == '-1':
- raise NoWikibaseEntityError(self)
- return '{}{}'.format(self.repo.concept_base_uri, entity_id)
-
-
-class MediaInfo(WikibaseEntity):
-
- """Interface for MediaInfo entities on Commons.
-
- .. versionadded:: 6.5
- """
-
- title_pattern = r'M[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- # TODO: 'statements': ClaimCollection,
- }
-
- @property
- def file(self) -> FilePage:
- """Get the file associated with the mediainfo."""
- if not hasattr(self, '_file'):
- if self.id == '-1':
- # an entity without an id is in an invalid state, which
- # needs to be raised as an exception, but also logged in
- # case an exception handler is catching the generic Error
- pywikibot.error('{} is in invalid state'
- .format(self.__class__.__name__))
- raise Error('{} is in invalid state'
- .format(self.__class__.__name__))
-
- page_id = self.getID(numeric=True)
- result = list(self.repo.load_pages_from_pageids([page_id]))
- if not result:
- raise Error('There is no existing page with id "{}"'
- .format(page_id))
-
- page = result.pop()
- if page.namespace() != page.site.namespaces.FILE:
- raise Error('Page with id "{}" is not a file'.format(page_id))
-
- self._file = FilePage(page)
-
- return self._file
-
- def get(self, force: bool = False) -> dict:
- """Fetch all MediaInfo entity data and cache it.
-
- :param force: override caching
- :raise NoWikibaseEntityError: if this entity doesn't exist
- :return: actual data which entity holds
- """
- if self.id == '-1':
- if force:
- if not self.file.exists():
- exc = NoPageError(self.file)
- raise NoWikibaseEntityError(self) from exc
- # get just the id for Wikibase API call
- self.id = 'M' + str(self.file.pageid)
- else:
- try:
- data = self.file.latest_revision.slots['mediainfo']['*']
- except NoPageError as exc:
- raise NoWikibaseEntityError(self) from exc
-
- self._content = jsonlib.loads(data)
- self.id = self._content['id']
-
- return super().get(force=force)
-
- def getID(self, numeric: bool = False):
- """
- Get the entity identifier.
-
- :param numeric: Strip the first letter and return an int
- """
- if self.id == '-1':
- self.get()
- return super().getID(numeric=numeric)
-
-
-class WikibasePage(BasePage, WikibaseEntity):
-
- """
- Mixin base class for Wikibase entities which are also pages (e.g. items).
-
- There should be no need to instantiate this directly.
- """
-
- _cache_attrs = BasePage._cache_attrs + ('_content', )
-
- def __init__(self, site, title: str = '', **kwargs) -> None:
- """
- Initializer.
-
- If title is provided, either ns or entity_type must also be provided,
- and will be checked against the title parsed using the Page
- initialisation logic.
-
- :param site: Wikibase data site
- :type site: pywikibot.site.DataSite
- :param title: normalized title of the page
- :type title: str
- :keyword ns: namespace
- :type ns: Namespace instance, or int
- :keyword entity_type: Wikibase entity type
- :type entity_type: str ('item' or 'property')
-
- :raises TypeError: incorrect use of parameters
- :raises ValueError: incorrect namespace
- :raises pywikibot.exceptions.Error: title parsing problems
- :raises NotImplementedError: the entity type is not supported
- """
- if not isinstance(site, pywikibot.site.DataSite):
- raise TypeError('site must be a pywikibot.site.DataSite object')
- if title and ('ns' not in kwargs and 'entity_type' not in kwargs):
- pywikibot.debug('{}.__init__: {} title {!r} specified without '
- 'ns or entity_type'
- .format(self.__class__.__name__, site,
- title),
- layer='wikibase')
-
- self._namespace = None
-
- if 'ns' in kwargs:
- if isinstance(kwargs['ns'], Namespace):
- self._namespace = kwargs.pop('ns')
- kwargs['ns'] = self._namespace.id
- else:
- # numerical namespace given
- ns = int(kwargs['ns'])
- if site.item_namespace.id == ns:
- self._namespace = site.item_namespace
- elif site.property_namespace.id == ns:
- self._namespace = site.property_namespace
- else:
- raise ValueError('{!r}: Namespace "{}" is not valid'
- .format(site, int(ns)))
-
- if 'entity_type' in kwargs:
- entity_type = kwargs.pop('entity_type')
- try:
- entity_type_ns = site.get_namespace_for_entity_type(
- entity_type)
- except EntityTypeUnknownError:
- raise ValueError('Wikibase entity type "{}" unknown'
- .format(entity_type))
-
- if self._namespace:
- if self._namespace != entity_type_ns:
- raise ValueError('Namespace "{}" is not valid for Wikibase'
- ' entity type "{}"'
- .format(int(kwargs['ns']), entity_type))
- else:
- self._namespace = entity_type_ns
- kwargs['ns'] = self._namespace.id
-
- BasePage.__init__(self, site, title, **kwargs)
-
- # If a title was not provided,
- # avoid checks which may cause an exception.
- if not title:
- WikibaseEntity.__init__(self, site)
- return
-
- if self._namespace:
- if self._link.namespace != self._namespace.id:
- raise ValueError("'{}' is not in the namespace {}"
- .format(title, self._namespace.id))
- else:
- # Neither ns nor entity_type was provided.
- # Use the _link to determine entity type.
- ns = self._link.namespace
- if self.site.item_namespace.id == ns:
- self._namespace = self.site.item_namespace
- elif self.site.property_namespace.id == ns:
- self._namespace = self.site.property_namespace
- else:
- raise ValueError('{!r}: Namespace "{!r}" is not valid'
- .format(self.site, ns))
-
- WikibaseEntity.__init__(
- self,
- # .site forces a parse of the Link title to determine site
- self.site,
- # Link.__init__, called from Page.__init__, has cleaned the title
- # stripping whitespace and uppercasing the first letter according
- # to the namespace case=first-letter.
- self._link.title)
-
- def namespace(self) -> int:
- """
- Return the number of the namespace of the entity.
-
- :return: Namespace id
- """
- return self._namespace.id
-
- def exists(self) -> bool:
- """Determine if an entity exists in the data repository."""
- if not hasattr(self, '_content'):
- try:
- self.get(get_redirect=True)
- return True
- except NoPageError:
- return False
- return 'missing' not in self._content
-
- def botMayEdit(self) -> bool:
- """
- Return whether bots may edit this page.
-
- Because there is currently no way to mark a Wikibase page as not
- editable by bots, this method always returns True. The content of
- the page is not text but a dict, so the original approach (searching
- for a template) doesn't apply.
-
- :return: True
- """
- return True
-
- def get(self, force: bool = False, *args, **kwargs) -> dict:
- """
- Fetch all page data, and cache it.
-
- :param force: override caching
- :raise NotImplementedError: a value in args or kwargs
- :return: actual data which entity holds
- :note: dicts returned by this method are references to content
- of this entity; modifying them may indirectly cause unwanted
- changes to the live content
- """
- if args or kwargs:
- raise NotImplementedError(
- '{}.get does not implement var args: {!r} and {!r}'.format(
- self.__class__.__name__, args, kwargs))
-
- # todo: this variable is specific to ItemPage
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
- try:
- data = WikibaseEntity.get(self, force=force)
- except NoWikibaseEntityError:
- if lazy_loading_id:
- p = Page(self._site, self._title)
- if not p.exists():
- raise NoPageError(p)
- # todo: raise a nicer exception here (T87345)
- raise NoPageError(self)
-
- if 'pageid' in self._content:
- self._pageid = self._content['pageid']
-
- # xxx: this is ugly
- if 'claims' in data:
- self.claims.set_on_item(self)
-
- return data
-
- @property
- def latest_revision_id(self) -> int:
- """
- Get the revision identifier for the most recent revision of the entity.
-
- :rtype: int
- :raise pywikibot.exceptions.NoPageError: if the entity doesn't exist
- """
- if not hasattr(self, '_revid'):
- self.get()
- return self._revid
-
- @latest_revision_id.setter
- def latest_revision_id(self, value) -> None:
- self._revid = value
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- # fixme: this seems too destructive in comparison to the parent
- self.clear_cache()
-
- @allow_asynchronous
- def editEntity(self, data=None, **kwargs) -> None:
- """
- Edit an entity using Wikibase wbeditentity API.
-
- This function is wrapped by:
- - editLabels
- - editDescriptions
- - editAliases
- - ItemPage.setSitelinks
-
- :param data: Data to be saved
- :type data: dict, or None to save the current content of the entity.
- :keyword asynchronous: if True, launch a separate thread to edit
- asynchronously
- :type asynchronous: bool
- :keyword callback: a callable object that will be called after the
- entity has been updated. It must take two arguments: (1) a
- WikibasePage object, and (2) an exception instance, which will be
- None if the page was saved successfully. This is intended for use
- by bots that need to keep track of which saves were successful.
- :type callback: callable
- """
- # kept for the decorator
- super().editEntity(data, **kwargs)
-
- def editLabels(self, labels, **kwargs) -> None:
- """
- Edit entity labels.
-
- Labels should be a dict, with the key
- as a language or a site object. The
- value should be the string to set it to.
- You can set it to '' to remove the label.
- """
- data = {'labels': labels}
- self.editEntity(data, **kwargs)
-
- def editDescriptions(self, descriptions, **kwargs) -> None:
- """
- Edit entity descriptions.
-
- Descriptions should be a dict, with the key
- as a language or a site object. The
- value should be the string to set it to.
- You can set it to '' to remove the description.
- """
- data = {'descriptions': descriptions}
- self.editEntity(data, **kwargs)
-
- def editAliases(self, aliases, **kwargs) -> None:
- """
- Edit entity aliases.
-
- Aliases should be a dict, with the key
- as a language or a site object. The
- value should be a list of strings.
- """
- data = {'aliases': aliases}
- self.editEntity(data, **kwargs)
-
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Set target of a redirect for a Wikibase page.
-
- Has not been implemented in the Wikibase API yet, except for ItemPage.
- """
- raise NotImplementedError
-
- @allow_asynchronous
- def addClaim(self, claim, bot: bool = True, **kwargs):
- """
- Add a claim to the entity.
-
- :param claim: The claim to add
- :type claim: pywikibot.page.Claim
- :param bot: Whether to flag as bot (if possible)
- :keyword asynchronous: if True, launch a separate thread to add claim
- asynchronously
- :type asynchronous: bool
- :keyword callback: a callable object that will be called after the
- claim has been added. It must take two arguments:
- (1) a WikibasePage object, and (2) an exception instance,
- which will be None if the entity was saved successfully. This is
- intended for use by bots that need to keep track of which saves
- were successful.
- :type callback: callable
- """
- if claim.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- self.repo.addClaim(self, claim, bot=bot, **kwargs)
- claim.on_item = self
-
- def removeClaims(self, claims, **kwargs) -> None:
- """
- Remove the claims from the entity.
-
- :param claims: list of claims to be removed
- :type claims: list or pywikibot.Claim
- """
- # this check allows single claims to be removed by pushing them into a
- # list of length one.
- if isinstance(claims, pywikibot.Claim):
- claims = [claims]
- data = self.repo.removeClaims(claims, **kwargs)
- for claim in claims:
- claim.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- claim.on_item = None
- claim.snak = None
-
-
-class ItemPage(WikibasePage):
-
- """
- Wikibase entity of type 'item'.
-
- A Wikibase item may be defined by either a 'Q' id (qid),
- or by a site & title.
-
- If an item is defined by site & title, once an item's qid has
- been looked up, the item is then defined by the qid.
- """
-
- _cache_attrs = WikibasePage._cache_attrs + (
- 'labels', 'descriptions', 'aliases', 'claims', 'sitelinks')
- entity_type = 'item'
- title_pattern = r'Q[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- 'descriptions': LanguageDict,
- 'aliases': AliasesDict,
- 'claims': ClaimCollection,
- 'sitelinks': SiteLinkCollection,
- }
-
- def __init__(self, site, title=None, ns=None) -> None:
- """
- Initializer.
-
- :param site: data repository
- :type site: pywikibot.site.DataSite
- :param title: identifier of item, "Q###",
- -1 or None for an empty item.
- :type title: str
- :param ns: namespace
- :type ns: Namespace instance, or int, or None
- for default item_namespace
- """
- if ns is None:
- ns = site.item_namespace
- # Special case for empty item.
- if title is None or title == '-1':
- super().__init__(site, '-1', ns=ns)
- assert self.id == '-1'
- return
-
- # we don't want empty titles
- if not title:
- raise InvalidTitleError("Item's title cannot be empty")
-
- super().__init__(site, title, ns=ns)
-
- assert self.id == self._link.title
-
- def _defined_by(self, singular: bool = False) -> dict:
- """
- Internal function to provide the API parameters to identify the item.
-
- The API parameters may be 'id' if the ItemPage has one,
- or 'site'&'title' if instantiated via ItemPage.fromPage with
- lazy_load enabled.
-
- Once an item's Q## is looked up, that will be used for all future
- requests.
-
- An empty dict is returned if the ItemPage is instantiated without
- either ID (internally it has id = '-1') or site&title.
-
- :param singular: Whether the parameter names should use the
- singular form
- :return: API parameters
- """
- params = {}
- if singular:
- id = 'id'
- site = 'site'
- title = 'title'
- else:
- id = 'ids'
- site = 'sites'
- title = 'titles'
-
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
-
- # id overrides all
- if hasattr(self, 'id'):
- if self.id != '-1':
- params[id] = self.id
- elif lazy_loading_id:
- params[site] = self._site.dbName()
- params[title] = self._title
- else:
- # if none of the above applies, this item is in an invalid state
- # which needs to be raised as an exception, but also logged in case
- # an exception handler is catching the generic Error.
- pywikibot.error('{} is in invalid state'
- .format(self.__class__.__name__))
- raise Error('{} is in invalid state'
- .format(self.__class__.__name__))
-
- return params
-
- def title(self, **kwargs):
- """
- Return ID as title of the ItemPage.
-
- If the ItemPage was lazy-loaded via ItemPage.fromPage, this method
- will fetch the Wikibase item ID for the page, potentially raising
- NoPageError with the page on the linked wiki if it does not exist, or
- does not have a corresponding Wikibase item ID.
-
- This method also refreshes the title if the id property was set,
- e.g. item.id = 'Q60'.
-
- All optional keyword parameters are passed to the superclass.
- """
- # If instantiated via ItemPage.fromPage using site and title,
- # _site and _title exist, and id does not exist.
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
-
- if lazy_loading_id or self._link._text != self.id:
- # If the item is lazy loaded or has been modified,
- # _link._text is stale. Removing _link._title
- # forces Link to re-parse ._text into ._title.
- if hasattr(self._link, '_title'):
- del self._link._title
- self._link._text = self.getID()
- self._link.parse()
- # Remove the temporary values that are no longer needed after
- # the .getID() above has called .get(), which populated .id
- if hasattr(self, '_site'):
- del self._title
- del self._site
-
- return super().title(**kwargs)
-
- def getID(self, numeric: bool = False, force: bool = False):
- """
- Get the entity identifier.
-
- :param numeric: Strip the first letter and return an int
- :param force: Force an update of new data
- """
- if not hasattr(self, 'id') or force:
- self.get(force=force)
- return super().getID(numeric=numeric)
-
- @classmethod
- def fromPage(cls, page, lazy_load: bool = False):
- """
- Get the ItemPage for a Page that links to it.
-
- :param page: Page to look for corresponding data item
- :type page: pywikibot.page.Page
- :param lazy_load: Do not raise NoPageError if either page or
- corresponding ItemPage does not exist.
- :rtype: pywikibot.page.ItemPage
-
- :raise pywikibot.exceptions.NoPageError: There is no corresponding
- ItemPage for the page
- :raise pywikibot.exceptions.WikiBaseError: The site of the page
- has no data repository.
- """
- if hasattr(page, '_item'):
- return page._item
- if not page.site.has_data_repository:
- raise WikiBaseError('{} has no data repository'
- .format(page.site))
- if not lazy_load and not page.exists():
- raise NoPageError(page)
-
- repo = page.site.data_repository()
- if hasattr(page,
- '_pageprops') and page.properties().get('wikibase_item'):
- # If we have already fetched the pageprops for something else,
- # we already have the id, so use it
- page._item = cls(repo, page.properties().get('wikibase_item'))
- return page._item
- i = cls(repo)
- # clear id, and temporarily store data needed to lazy load the item
- del i.id
- i._site = page.site
- i._title = page.title(with_section=False)
- if not lazy_load and not i.exists():
- raise NoPageError(i)
- page._item = i
- return page._item
-
- @classmethod
- def from_entity_uri(cls, site, uri: str, lazy_load: bool = False):
- """
- Get the ItemPage from its entity uri.
-
- :param site: The Wikibase site for the item.
- :type site: pywikibot.site.DataSite
- :param uri: Entity uri for the Wikibase item.
- :param lazy_load: Do not raise NoPageError if ItemPage does not exist.
- :rtype: pywikibot.page.ItemPage
-
- :raise TypeError: Site is not a valid DataSite.
- :raise ValueError: Site does not match the base of the provided uri.
- :raise pywikibot.exceptions.NoPageError: Uri points to non-existent
- item.
- """
- if not isinstance(site, DataSite):
- raise TypeError('{} is not a data repository.'.format(site))
-
- base_uri, _, qid = uri.rpartition('/')
- if base_uri != site.concept_base_uri.rstrip('/'):
- raise ValueError(
- 'The supplied data repository ({repo}) does not correspond to '
- 'that of the item ({item})'.format(
- repo=site.concept_base_uri.rstrip('/'),
- item=base_uri))
-
- item = cls(site, qid)
- if not lazy_load and not item.exists():
- raise NoPageError(item)
-
- return item
-
- def get(
- self,
- force: bool = False,
- get_redirect: bool = False,
- *args,
- **kwargs
- ) -> Dict[str, Any]:
- """
- Fetch all item data, and cache it.
-
- :param force: override caching
- :param get_redirect: return the item content, do not follow the
- redirect, do not raise an exception.
- :raise NotImplementedError: a value in args or kwargs
- :return: actual data which entity holds
- :note: dicts returned by this method are references to content of this
- entity; modifying them may indirectly cause unwanted changes to
- the live content
- """
- data = super().get(force, *args, **kwargs)
-
- if self.isRedirectPage() and not get_redirect:
- raise IsRedirectPageError(self)
-
- return data
-
- def getRedirectTarget(self):
- """Return the redirect target for this page."""
- target = super().getRedirectTarget()
- cmodel = target.content_model
- if cmodel != 'wikibase-item':
- raise Error('{} has redirect target {} with content model {} '
- 'instead of wikibase-item'
- .format(self, target, cmodel))
- return self.__class__(target.site, target.title(), target.namespace())
-
- def iterlinks(self, family=None):
- """
- Iterate through all the sitelinks.
-
- :param family: string/Family object which represents what family of
- links to iterate
- :type family: str|pywikibot.family.Family
- :return: iterator of pywikibot.Page objects
- :rtype: iterator
- """
- if not hasattr(self, 'sitelinks'):
- self.get()
- if family is not None and not isinstance(family, Family):
- family = Family.load(family)
- for sl in self.sitelinks.values():
- if family is None or family == sl.site.family:
- pg = pywikibot.Page(sl)
- pg._item = self
- yield pg
-
- def getSitelink(self, site, force: bool = False) -> str:
- """
- Return the title for the specific site.
-
- If the item doesn't have a sitelink to that site, NoPageError is raised.
-
- :param site: Site to find the linked page of.
- :type site: pywikibot.Site or database name
- :param force: override caching
- """
- if force or not hasattr(self, '_content'):
- self.get(force=force)
-
- if site not in self.sitelinks:
- raise NoPageError(self)
-
- return self.sitelinks[site].canonical_title()
-
- def setSitelink(self, sitelink, **kwargs) -> None:
- """
- Set a sitelink. Calls setSitelinks().
-
- A sitelink can be a Page object, a BaseLink object
- or a {'site':dbname,'title':title} dictionary.
- """
- self.setSitelinks([sitelink], **kwargs)
-
- def removeSitelink(self, site, **kwargs) -> None:
- """
- Remove a sitelink.
-
- A site can either be a Site object, or it can be a dbName.
- """
- self.removeSitelinks([site], **kwargs)
-
- def removeSitelinks(self, sites, **kwargs) -> None:
- """
- Remove sitelinks.
-
- Sites should be a list, with values either
- being Site objects, or dbNames.
- """
- data = []
- for site in sites:
- site = SiteLinkCollection.getdbName(site)
- data.append({'site': site, 'title': ''})
- self.setSitelinks(data, **kwargs)
-
- def setSitelinks(self, sitelinks, **kwargs) -> None:
- """
- Set sitelinks.
-
- Sitelinks should be a list. Each item in the
- list can either be a Page object, a BaseLink object, or a dict
- with a value for 'site' and 'title'.
- """
- data = {'sitelinks': sitelinks}
- self.editEntity(data, **kwargs)
-
- def mergeInto(self, item, **kwargs) -> None:
- """
- Merge the item into another item.
-
- :param item: The item to merge into
- :type item: pywikibot.page.ItemPage
- """
- data = self.repo.mergeItems(from_item=self, to_item=item, **kwargs)
- if not data.get('success', 0):
- return
- self.latest_revision_id = data['from']['lastrevid']
- item.latest_revision_id = data['to']['lastrevid']
- if data.get('redirected', 0):
- self._isredir = True
- self._redirtarget = item
-
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Make the item redirect to another item.
-
- You need to supply an extra argument such as save=True to make this work.
-
- :param target_page: target of the redirect, this argument is required.
- :type target_page: pywikibot.page.ItemPage or string
- :param force: if True, set the redirect target even if the page
- is not a redirect.
- """
- if isinstance(target_page, str):
- target_page = pywikibot.ItemPage(self.repo, target_page)
- elif self.repo != target_page.repo:
- raise InterwikiRedirectPageError(self, target_page)
- if self.exists() and not self.isRedirectPage() and not force:
- raise IsNotRedirectPageError(self)
- if not save or keep_section or create:
- raise NotImplementedError
- data = self.repo.set_redirect_target(
- from_item=self, to_item=target_page,
- bot=kwargs.get('botflag', True))
- if data.get('success', 0):
- del self.latest_revision_id
- self._isredir = True
- self._redirtarget = target_page
-
- def isRedirectPage(self):
- """Return True if item is a redirect, False if not or not existing."""
- if hasattr(self, '_content') and not hasattr(self, '_isredir'):
- self._isredir = self.id != self._content.get('id', self.id)
- return self._isredir
- return super().isRedirectPage()
-
-
-class Property:
-
- """
- A Wikibase property.
-
- While every Wikibase property has a Page on the data repository,
- this object is for when the property is used as part of another concept
- where the property is not _the_ Page of the property.
-
- For example, a claim on an ItemPage has many property attributes, and so
- it subclasses this Property class, but a claim does not have Page like
- behaviour and semantics.
- """
-
- types = {'wikibase-item': ItemPage,
- # 'wikibase-property': PropertyPage, must be declared first
- 'string': str,
- 'commonsMedia': FilePage,
- 'globe-coordinate': pywikibot.Coordinate,
- 'url': str,
- 'time': pywikibot.WbTime,
- 'quantity': pywikibot.WbQuantity,
- 'monolingualtext': pywikibot.WbMonolingualText,
- 'math': str,
- 'external-id': str,
- 'geo-shape': pywikibot.WbGeoShape,
- 'tabular-data': pywikibot.WbTabularData,
- 'musical-notation': str,
- }
-
- # the value type where different from the type
- value_types = {'wikibase-item': 'wikibase-entityid',
- 'wikibase-property': 'wikibase-entityid',
- 'commonsMedia': 'string',
- 'url': 'string',
- 'globe-coordinate': 'globecoordinate',
- 'math': 'string',
- 'external-id': 'string',
- 'geo-shape': 'string',
- 'tabular-data': 'string',
- 'musical-notation': 'string',
- }
-
- def __init__(self, site, id: str, datatype: Optional[str] = None) -> None:
- """
- Initializer.
-
- :param site: data repository
- :type site: pywikibot.site.DataSite
- :param id: id of the property
- :param datatype: datatype of the property;
- if not given, it will be queried via the API
- """
- self.repo = site
- self.id = id.upper()
- if datatype:
- self._type = datatype
-
- @property
- def type(self) -> str:
- """Return the type of this property."""
- if not hasattr(self, '_type'):
- self._type = self.repo.getPropertyType(self)
- return self._type
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this property.
-
- :param numeric: Strip the first letter and return an int
- """
- if numeric:
- return int(self.id[1:])
- return self.id
-
-
-class PropertyPage(WikibasePage, Property):
-
- """
- A Wikibase entity in the property namespace.
-
- Should be created as::
-
- PropertyPage(DataSite, 'P21')
-
- or::
-
- PropertyPage(DataSite, datatype='url')
- """
-
- _cache_attrs = WikibasePage._cache_attrs + (
- '_type', 'labels', 'descriptions', 'aliases', 'claims')
- entity_type = 'property'
- title_pattern = r'P[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- 'descriptions': LanguageDict,
- 'aliases': AliasesDict,
- 'claims': ClaimCollection,
- }
-
- def __init__(self, source, title=None, datatype=None) -> None:
- """
- Initializer.
-
- :param source: data repository property is on
- :type source: pywikibot.site.DataSite
- :param title: identifier of property, like "P##",
- "-1" or None for an empty property.
- :type title: str
- :param datatype: Datatype for a new property.
- :type datatype: str
- """
- # Special case for new property.
- if title is None or title == '-1':
- if not datatype:
- raise TypeError('"datatype" is required for new property.')
- WikibasePage.__init__(self, source, '-1',
- ns=source.property_namespace)
- Property.__init__(self, source, '-1', datatype=datatype)
- assert self.id == '-1'
- else:
- if not title:
- raise InvalidTitleError(
- "Property's title cannot be empty")
-
- WikibasePage.__init__(self, source, title,
- ns=source.property_namespace)
- Property.__init__(self, source, self.id)
-
- def get(self, force: bool = False, *args, **kwargs) -> dict:
- """
- Fetch the property entity, and cache it.
-
- :param force: override caching
- :raise NotImplementedError: a value in args or kwargs
- :return: actual data which entity holds
- :note: dicts returned by this method are references to content of this
- entity; modifying them may indirectly cause unwanted changes to
- the live content
- """
- if args or kwargs:
- raise NotImplementedError(
- 'PropertyPage.get only implements "force".')
-
- data = WikibasePage.get(self, force)
- if 'datatype' in self._content:
- self._type = self._content['datatype']
- data['datatype'] = self._type
- return data
-
- def newClaim(self, *args, **kwargs):
- """
- Helper function to create a new claim object for this property.
-
- :rtype: pywikibot.page.Claim
- """
- # todo: raise when self.id is -1
- return Claim(self.site, self.getID(), datatype=self.type,
- *args, **kwargs)
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this property.
-
- :param numeric: Strip the first letter and return an int
- """
- # enforce this parent's implementation
- return WikibasePage.getID(self, numeric=numeric)
-
- def get_data_for_new_entity(self):
- """Return data required for creation of new property."""
- return {'datatype': self.type}
-
-
-# Add PropertyPage to the class attribute "types" after its declaration.
-Property.types['wikibase-property'] = PropertyPage
-
-
-class Claim(Property):
-
- """
- A Claim on a Wikibase entity.
-
- Claims are standard claims as well as references and qualifiers.
- """
-
- TARGET_CONVERTER = {
- 'wikibase-item': lambda value, site:
- ItemPage(site, 'Q' + str(value['numeric-id'])),
- 'wikibase-property': lambda value, site:
- PropertyPage(site, 'P' + str(value['numeric-id'])),
- 'commonsMedia': lambda value, site:
- FilePage(pywikibot.Site('commons', 'commons'), value), # T90492
- 'globe-coordinate': pywikibot.Coordinate.fromWikibase,
- 'geo-shape': pywikibot.WbGeoShape.fromWikibase,
- 'tabular-data': pywikibot.WbTabularData.fromWikibase,
- 'time': pywikibot.WbTime.fromWikibase,
- 'quantity': pywikibot.WbQuantity.fromWikibase,
- 'monolingualtext': lambda value, site:
- pywikibot.WbMonolingualText.fromWikibase(value)
- }
-
- SNAK_TYPES = ('value', 'somevalue', 'novalue')
-
- def __init__(
- self,
- site,
- pid,
- snak=None,
- hash=None,
- is_reference: bool = False,
- is_qualifier: bool = False,
- rank: str = 'normal',
- **kwargs
- ) -> None:
- """
- Initializer.
-
- Defined by the "snak" value, supplemented by site + pid
-
- :param site: repository the claim is on
- :type site: pywikibot.site.DataSite
- :param pid: property id, with "P" prefix
- :param snak: snak identifier for claim
- :param hash: hash identifier for references
- :param is_reference: whether specified claim is a reference
- :param is_qualifier: whether specified claim is a qualifier
- :param rank: rank for claim
- """
- Property.__init__(self, site, pid, **kwargs)
- self.snak = snak
- self.hash = hash
- self.rank = rank
- self.isReference = is_reference
- self.isQualifier = is_qualifier
- if self.isQualifier and self.isReference:
- raise ValueError('Claim cannot be both a qualifier and reference.')
- self.sources = []
- self.qualifiers = OrderedDict()
- self.target = None
- self.snaktype = 'value'
- self._on_item = None # The item it's on
-
- @property
- def on_item(self):
- """Return item this claim is attached to."""
- return self._on_item
-
- @on_item.setter
- def on_item(self, item) -> None:
- self._on_item = item
- for values in self.qualifiers.values():
- for qualifier in values:
- qualifier.on_item = item
- for source in self.sources:
- for values in source.values():
- for source in values:
- source.on_item = item
-
- def __repr__(self) -> str:
- """Return the representation string."""
- return '{cls_name}.fromJSON({}, {})'.format(
- repr(self.repo), self.toJSON(), cls_name=type(self).__name__)
-
- def __eq__(self, other):
- if not isinstance(other, self.__class__):
- return False
-
- return self.same_as(other)
-
- def __ne__(self, other):
- return not self.__eq__(other)
-
- @staticmethod
- def _claim_mapping_same(this, other) -> bool:
- if len(this) != len(other):
- return False
- my_values = list(chain.from_iterable(this.values()))
- other_values = list(chain.from_iterable(other.values()))
- if len(my_values) != len(other_values):
- return False
- for val in my_values:
- if val not in other_values:
- return False
- for val in other_values:
- if val not in my_values:
- return False
- return True
-
- def same_as(
- self,
- other,
- ignore_rank: bool = True,
- ignore_quals: bool = False,
- ignore_refs: bool = True
- ) -> bool:
- """Check if two claims are same."""
- if ignore_rank:
- attributes = ['id', 'snaktype', 'target']
- else:
- attributes = ['id', 'snaktype', 'rank', 'target']
- for attr in attributes:
- if getattr(self, attr) != getattr(other, attr):
- return False
-
- if not ignore_quals:
- if not self._claim_mapping_same(self.qualifiers, other.qualifiers):
- return False
-
- if not ignore_refs:
- if len(self.sources) != len(other.sources):
- return False
- for source in self.sources:
- same = False
- for other_source in other.sources:
- if self._claim_mapping_same(source, other_source):
- same = True
- break
- if not same:
- return False
-
- return True
-
- def copy(self):
- """
- Create an independent copy of this object.
-
- :rtype: pywikibot.page.Claim
- """
- is_qualifier = self.isQualifier
- is_reference = self.isReference
- self.isQualifier = False
- self.isReference = False
- copy = self.fromJSON(self.repo, self.toJSON())
- for cl in (self, copy):
- cl.isQualifier = is_qualifier
- cl.isReference = is_reference
- copy.hash = None
- copy.snak = None
- return copy
-
- @classmethod
- def fromJSON(cls, site, data):
- """
- Create a claim object from JSON returned in the API call.
-
- :param data: JSON containing claim data
- :type data: dict
-
- :rtype: pywikibot.page.Claim
- """
- claim = cls(site, data['mainsnak']['property'],
- datatype=data['mainsnak'].get('datatype', None))
- if 'id' in data:
- claim.snak = data['id']
- elif 'hash' in data:
- claim.hash = data['hash']
- claim.snaktype = data['mainsnak']['snaktype']
- if claim.getSnakType() == 'value':
- value = data['mainsnak']['datavalue']['value']
- # The default covers string, url types
- if claim.type in cls.types or claim.type == 'wikibase-property':
- claim.target = cls.TARGET_CONVERTER.get(
- claim.type, lambda value, site: value)(value, site)
- else:
- pywikibot.warning(
- '{} datatype is not supported yet.'.format(claim.type))
- claim.target = pywikibot.WbUnknown.fromWikibase(value)
- if 'rank' in data: # References/Qualifiers don't have ranks
- claim.rank = data['rank']
- if 'references' in data:
- for source in data['references']:
- claim.sources.append(cls.referenceFromJSON(site, source))
- if 'qualifiers' in data:
- for prop in data['qualifiers-order']:
- claim.qualifiers[prop] = [
- cls.qualifierFromJSON(site, qualifier)
- for qualifier in data['qualifiers'][prop]]
- return claim
-
- @classmethod
- def referenceFromJSON(cls, site, data) -> dict:
- """
- Create a dict of claims from reference JSON returned in the API call.
-
- Reference objects are represented a bit differently, and require
- some more handling.
- """
- source = OrderedDict()
-
- # Before #84516 Wikibase did not implement snaks-order.
- # https://gerrit.wikimedia.org/r/c/84516/
- if 'snaks-order' in data:
- prop_list = data['snaks-order']
- else:
- prop_list = data['snaks'].keys()
-
- for prop in prop_list:
- for claimsnak in data['snaks'][prop]:
- claim = cls.fromJSON(site, {'mainsnak': claimsnak,
- 'hash': data.get('hash')})
- claim.isReference = True
- if claim.getID() not in source:
- source[claim.getID()] = []
- source[claim.getID()].append(claim)
- return source
-
- @classmethod
- def qualifierFromJSON(cls, site, data):
- """
- Create a Claim for a qualifier from JSON.
-
- Qualifier objects are represented a bit
- differently, like references, but I'm not
- sure whether this even requires its own function.
-
- :rtype: pywikibot.page.Claim
- """
- claim = cls.fromJSON(site, {'mainsnak': data,
- 'hash': data.get('hash')})
- claim.isQualifier = True
- return claim
-
- def toJSON(self) -> dict:
- """Create dict suitable for the MediaWiki API."""
- data = {
- 'mainsnak': {
- 'snaktype': self.snaktype,
- 'property': self.getID()
- },
- 'type': 'statement'
- }
- if hasattr(self, 'snak') and self.snak is not None:
- data['id'] = self.snak
- if hasattr(self, 'rank') and self.rank is not None:
- data['rank'] = self.rank
- if self.getSnakType() == 'value':
- data['mainsnak']['datatype'] = self.type
- data['mainsnak']['datavalue'] = self._formatDataValue()
- if self.isQualifier or self.isReference:
- data = data['mainsnak']
- if hasattr(self, 'hash') and self.hash is not None:
- data['hash'] = self.hash
- else:
- if self.qualifiers:
- data['qualifiers'] = {}
- data['qualifiers-order'] = list(self.qualifiers.keys())
- for prop, qualifiers in self.qualifiers.items():
- for qualifier in qualifiers:
- assert qualifier.isQualifier is True
- data['qualifiers'][prop] = [
- qualifier.toJSON() for qualifier in qualifiers]
-
- if self.sources:
- data['references'] = []
- for collection in self.sources:
- reference = {
- 'snaks': {}, 'snaks-order': list(collection.keys())}
- for prop, val in collection.items():
- reference['snaks'][prop] = []
- for source in val:
- assert source.isReference is True
- src_data = source.toJSON()
- if 'hash' in src_data:
- reference.setdefault('hash', src_data['hash'])
- del src_data['hash']
- reference['snaks'][prop].append(src_data)
- data['references'].append(reference)
- return data
-
- def setTarget(self, value):
- """
- Set the target value in the local object.
-
- :param value: The new target value.
- :type value: object
-
- :exception ValueError: if value is not of the type
- required for the Claim type.
- """
- value_class = self.types[self.type]
- if not isinstance(value, value_class):
- raise ValueError('{} is not type {}.'
- .format(value, value_class))
- self.target = value
-
- def changeTarget(
- self,
- value=None,
- snaktype: str = 'value',
- **kwargs
- ) -> None:
- """
- Set the target value in the data repository.
-
- :param value: The new target value.
- :type value: object
- :param snaktype: The new snak type ('value', 'somevalue', or
- 'novalue').
- """
- if value:
- self.setTarget(value)
-
- data = self.repo.changeClaimTarget(self, snaktype=snaktype,
- **kwargs)
- # TODO: Re-create the entire item from JSON, not just id
- self.snak = data['claim']['id']
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
-
- def getTarget(self):
- """
- Return the target value of this Claim.
-
- None is returned if no target is set
-
- :return: object
- """
- return self.target
-
- def getSnakType(self) -> str:
- """
- Return the type of snak.
-
- :return: str ('value', 'somevalue' or 'novalue')
- """
- return self.snaktype
-
- def setSnakType(self, value):
- """
- Set the type of snak.
-
- :param value: Type of snak
- :type value: str ('value', 'somevalue', or 'novalue')
- """
- if value in self.SNAK_TYPES:
- self.snaktype = value
- else:
- raise ValueError(
- "snaktype must be 'value', 'somevalue', or 'novalue'.")
-
- def getRank(self):
- """Return the rank of the Claim."""
- return self.rank
-
- def setRank(self, rank) -> None:
- """Set the rank of the Claim."""
- self.rank = rank
-
- def changeRank(self, rank, **kwargs):
- """Change the rank of the Claim and save."""
- self.rank = rank
- return self.repo.save_claim(self, **kwargs)
-
- def changeSnakType(self, value=None, **kwargs) -> None:
- """
- Save the new snak value.
-
- TODO: Is this function really needed?
- """
- if value:
- self.setSnakType(value)
- self.changeTarget(snaktype=self.getSnakType(), **kwargs)
-
- def getSources(self) -> list:
- """Return a list of sources, each being a list of Claims."""
- return self.sources
-
- def addSource(self, claim, **kwargs) -> None:
- """
- Add the claim as a source.
-
- :param claim: the claim to add
- :type claim: pywikibot.Claim
- """
- self.addSources([claim], **kwargs)
-
- def addSources(self, claims, **kwargs):
- """
- Add the claims as one source.
-
- :param claims: the claims to add
- :type claims: list of pywikibot.Claim
- """
- for claim in claims:
- if claim.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- if self.on_item is not None:
- data = self.repo.editSource(self, claims, new=True, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for claim in claims:
- claim.hash = data['reference']['hash']
- claim.on_item = self.on_item
- source = defaultdict(list)
- for claim in claims:
- claim.isReference = True
- source[claim.getID()].append(claim)
- self.sources.append(source)
-
- def removeSource(self, source, **kwargs) -> None:
- """
- Remove the source. Call removeSources().
-
- :param source: the source to remove
- :type source: pywikibot.Claim
- """
- self.removeSources([source], **kwargs)
-
- def removeSources(self, sources, **kwargs) -> None:
- """
- Remove the sources.
-
- :param sources: the sources to remove
- :type sources: list of pywikibot.Claim
- """
- data = self.repo.removeSources(self, sources, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for source in sources:
- source_dict = defaultdict(list)
- source_dict[source.getID()].append(source)
- self.sources.remove(source_dict)
-
- def addQualifier(self, qualifier, **kwargs):
- """Add the given qualifier.
-
- :param qualifier: the qualifier to add
- :type qualifier: pywikibot.page.Claim
- """
- if qualifier.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- if self.on_item is not None:
- data = self.repo.editQualifier(self, qualifier, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- qualifier.on_item = self.on_item
- qualifier.isQualifier = True
- if qualifier.getID() in self.qualifiers:
- self.qualifiers[qualifier.getID()].append(qualifier)
- else:
- self.qualifiers[qualifier.getID()] = [qualifier]
-
- def removeQualifier(self, qualifier, **kwargs) -> None:
- """
- Remove the qualifier. Call removeQualifiers().
-
- :param qualifier: the qualifier to remove
- :type qualifier: pywikibot.page.Claim
- """
- self.removeQualifiers([qualifier], **kwargs)
-
- def removeQualifiers(self, qualifiers, **kwargs) -> None:
- """
- Remove the qualifiers.
-
- :param qualifiers: the qualifiers to remove
- :type qualifiers: list Claim
- """
- data = self.repo.remove_qualifiers(self, qualifiers, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for qualifier in qualifiers:
- self.qualifiers[qualifier.getID()].remove(qualifier)
- qualifier.on_item = None
-
- def target_equals(self, value) -> bool:
- """
- Check whether the Claim's target is equal to specified value.
-
- The function checks for:
-
- - WikibasePage ID equality
- - WbTime year equality
- - Coordinate equality, regarding precision
- - WbMonolingualText text equality
- - direct equality
-
- :param value: the value to compare with
- :return: true if the Claim's target is equal to the value provided,
- false otherwise
- """
- if (isinstance(self.target, WikibasePage)
- and isinstance(value, str)):
- return self.target.id == value
-
- if (isinstance(self.target, pywikibot.WbTime)
- and not isinstance(value, pywikibot.WbTime)):
- return self.target.year == int(value)
-
- if (isinstance(self.target, pywikibot.Coordinate)
- and isinstance(value, str)):
- coord_args = [float(x) for x in value.split(',')]
- if len(coord_args) >= 3:
- precision = coord_args[2]
- else:
- precision = 0.0001 # Default value (~10 m at equator)
- with suppress(TypeError):
- if self.target.precision is not None:
- precision = max(precision, self.target.precision)
-
- return (abs(self.target.lat - coord_args[0]) <= precision
- and abs(self.target.lon - coord_args[1]) <= precision)
-
- if (isinstance(self.target, pywikibot.WbMonolingualText)
- and isinstance(value, str)):
- return self.target.text == value
-
- return self.target == value
-
- def has_qualifier(self, qualifier_id: str, target) -> bool:
- """
- Check whether Claim contains specified qualifier.
-
- :param qualifier_id: id of the qualifier
- :param target: qualifier target to check presence of
- :return: true if the qualifier was found, false otherwise
- """
- if self.isQualifier or self.isReference:
- raise ValueError('Qualifiers and references cannot have '
- 'qualifiers.')
-
- for qualifier in self.qualifiers.get(qualifier_id, []):
- if qualifier.target_equals(target):
- return True
- return False
-
- def _formatValue(self) -> dict:
- """
- Format the target into the proper JSON value that Wikibase wants.
-
- :return: JSON value
- """
- if self.type in ('wikibase-item', 'wikibase-property'):
- value = {'entity-type': self.getTarget().entity_type,
- 'numeric-id': self.getTarget().getID(numeric=True)}
- elif self.type in ('string', 'url', 'math', 'external-id',
- 'musical-notation'):
- value = self.getTarget()
- elif self.type == 'commonsMedia':
- value = self.getTarget().title(with_ns=False)
- elif self.type in ('globe-coordinate', 'time',
- 'quantity', 'monolingualtext',
- 'geo-shape', 'tabular-data'):
- value = self.getTarget().toWikibase()
- else: # WbUnknown
- pywikibot.warning(
- '{} datatype is not supported yet.'.format(self.type))
- value = self.getTarget().toWikibase()
- return value
-
- def _formatDataValue(self) -> dict:
- """
- Format the target into the proper JSON datavalue that Wikibase wants.
-
- :return: Wikibase API representation with type and value.
- """
- return {
- 'value': self._formatValue(),
- 'type': self.value_types.get(self.type, self.type)
- }
-
-
class FileInfo:
"""
diff --git a/pywikibot/page/_wikibase.py b/pywikibot/page/_wikibase.py
index 6dfe675..503bee3 100644
--- a/pywikibot/page/_wikibase.py
+++ b/pywikibot/page/_wikibase.py
@@ -1,56 +1,37 @@
"""
-Objects representing various types of MediaWiki, including Wikibase, pages.
+Objects representing various types of Wikibase pages and structures.
This module also includes objects:
-* Property: a type of semantic data.
* Claim: an instance of a semantic assertion.
-* Revision: a single change to a wiki page.
-* FileInfo: a structure holding imageinfo of latest rev. of FilePage
+* MediaInfo: interface for MediaInfo entities on an image repository
+* Property: a type of semantic data.
+* WikibaseEntity: base interface for Wikibase entities.
"""
#
-# (C) Pywikibot team, 2008-2022
+# (C) Pywikibot team, 2013-2022
#
# Distributed under the terms of the MIT license.
#
import json as jsonlib
-import logging
-import os.path
import re
-from collections import Counter, OrderedDict, defaultdict
+from collections import OrderedDict, defaultdict
from contextlib import suppress
-from http import HTTPStatus
from itertools import chain
-from textwrap import shorten, wrap
-from typing import Any, Optional, Union
-from urllib.parse import quote_from_bytes
-from warnings import warn
+from typing import Any, Optional
import pywikibot
-from pywikibot import config, date, i18n, textlib
-from pywikibot.backports import Dict, Generator, Iterable, List, Tuple
-from pywikibot.comms import http
-from pywikibot.cosmetic_changes import CANCEL, CosmeticChangesToolkit
+from pywikibot.backports import Dict
from pywikibot.exceptions import (
APIError,
- AutoblockUserError,
EntityTypeUnknownError,
Error,
InterwikiRedirectPageError,
- InvalidPageError,
InvalidTitleError,
IsNotRedirectPageError,
IsRedirectPageError,
- NoMoveTargetError,
NoPageError,
- NotEmailableError,
- NoUsernameError,
NoWikibaseEntityError,
- OtherPageSaveError,
- PageSaveRelatedError,
- SectionError,
- UnknownExtensionError,
- UserRightsError,
WikiBaseError,
)
from pywikibot.family import Family
@@ -61,3264 +42,20 @@
SiteLinkCollection,
)
from pywikibot.page._decorators import allow_asynchronous
-from pywikibot.page._links import BaseLink, Link
-from pywikibot.page._revision import Revision
-from pywikibot.site import DataSite, Namespace, NamespaceArgType
-from pywikibot.tools import (
- ComparableMixin,
- compute_file_hash,
- deprecated,
- first_upper,
- is_ip_address,
- issue_deprecation_warning,
- remove_last_args,
-)
+from pywikibot.page._pages import BasePage, FilePage
+from pywikibot.site import DataSite, Namespace
-PROTOCOL_REGEX = r'\Ahttps?://'
-
__all__ = (
- 'BasePage',
- 'Category',
'Claim',
- 'FileInfo',
- 'FilePage',
'ItemPage',
'MediaInfo',
- 'Page',
'Property',
'PropertyPage',
- 'User',
'WikibaseEntity',
'WikibasePage',
)
-logger = logging.getLogger('pywiki.wiki.page')
-
-
-# Note: Link objects (defined later on) represent a wiki-page's title, while
-# Page objects (defined here) represent the page itself, including its
-# contents.
-
-class BasePage(ComparableMixin):
-
- """
- BasePage: Base object for a MediaWiki page.
-
- This object implements only those methods that do not require
- reading from or writing to the wiki. All other methods are delegated
- to the Site object.
-
- Will be subclassed by Page, WikibasePage, and FlowPage.
- """
-
- _cache_attrs = (
- '_text', '_pageid', '_catinfo', '_templates', '_protection',
- '_contentmodel', '_langlinks', '_isredir', '_coords',
- '_preloadedtext', '_timestamp', '_applicable_protections',
- '_flowinfo', '_quality', '_pageprops', '_revid', '_quality_text',
- '_pageimage', '_item', '_lintinfo',
- )
-
- def __init__(self, source, title: str = '', ns=0) -> None:
- """
- Instantiate a Page object.
-
- Three calling formats are supported:
-
- - If the first argument is a Page, create a copy of that object.
- This can be used to convert an existing Page into a subclass
- object, such as Category or FilePage. (If the title is also
- given as the second argument, creates a copy with that title;
- this is used when pages are moved.)
- - If the first argument is a Site, create a Page on that Site
- using the second argument as the title (may include a section),
- and the third as the namespace number. The namespace number is
- mandatory, even if the title includes the namespace prefix. This
- is the preferred syntax when using an already-normalized title
- obtained from api.php or a database dump. WARNING: may produce
- invalid objects if page title isn't in normal form!
- - If the first argument is a BaseLink, create a Page from that link.
- This is the preferred syntax when using a title scraped from
- wikitext, URLs, or another non-normalized source.
-
- :param source: the source of the page
- :type source: pywikibot.page.BaseLink (or subclass),
- pywikibot.page.Page (or subclass), or pywikibot.page.Site
- :param title: normalized title of the page; required if source is a
- Site, ignored otherwise
- :type title: str
- :param ns: namespace number; required if source is a Site, ignored
- otherwise
- :type ns: int
- """
- if title is None:
- raise ValueError('Title cannot be None.')
-
- if isinstance(source, pywikibot.site.BaseSite):
- self._link = Link(title, source=source, default_namespace=ns)
- self._revisions = {}
- elif isinstance(source, Page):
- # copy all of source's attributes to this object
- # without overwriting non-None values
- self.__dict__.update((k, v) for k, v in source.__dict__.items()
- if k not in self.__dict__
- or self.__dict__[k] is None)
- if title:
- # overwrite title
- self._link = Link(title, source=source.site,
- default_namespace=ns)
- elif isinstance(source, BaseLink):
- self._link = source
- self._revisions = {}
- else:
- raise Error(
- "Invalid argument type '{}' in Page initializer: {}"
- .format(type(source), source))
-
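As a quick illustration of the three calling formats above (a minimal sketch; the site and titles are only examples):

    import pywikibot
    from pywikibot.page import Link

    site = pywikibot.Site('en', 'wikipedia')
    page = pywikibot.Page(site, 'Example', 0)   # form 2: Site + title + namespace
    copy = pywikibot.Page(page)                 # form 1: copy of an existing Page
    link = Link('Example', source=site)
    from_link = pywikibot.Page(link)            # form 3: from a (Base)Link
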
- @property
- def site(self):
- """Return the Site object for the wiki on which this Page resides.
-
- :rtype: pywikibot.Site
- """
- return self._link.site
-
- def version(self):
- """
- Return MediaWiki version number of the page site.
-
- This is needed to use the @need_version() decorator for methods of
- Page objects.
- """
- return self.site.version()
-
- @property
- def image_repository(self):
- """Return the Site object for the image repository."""
- return self.site.image_repository()
-
- @property
- def data_repository(self):
- """Return the Site object for the data repository."""
- return self.site.data_repository()
-
- def namespace(self):
- """
- Return the namespace of the page.
-
- :return: namespace of the page
- :rtype: pywikibot.Namespace
- """
- return self._link.namespace
-
- @property
- def content_model(self):
- """
- Return the content model for this page.
-
- If it cannot be reliably determined via the API,
- None is returned.
- """
- if not hasattr(self, '_contentmodel'):
- self.site.loadpageinfo(self)
- return self._contentmodel
-
- @property
- def depth(self):
- """Return the depth/subpage level of the page."""
- if not hasattr(self, '_depth'):
- # Check if the namespace allows subpages
- if self.namespace().subpages:
- self._depth = self.title().count('/')
- else:
- # Does not allow subpages, which means depth is always 0
- self._depth = 0
-
- return self._depth
-
- @property
- def pageid(self) -> int:
- """
- Return pageid of the page.
-
- :return: pageid or 0 if page does not exist
- """
- if not hasattr(self, '_pageid'):
- self.site.loadpageinfo(self)
- return self._pageid
-
- def title(
- self,
- *,
- underscore: bool = False,
- with_ns: bool = True,
- with_section: bool = True,
- as_url: bool = False,
- as_link: bool = False,
- allow_interwiki: bool = True,
- force_interwiki: bool = False,
- textlink: bool = False,
- as_filename: bool = False,
- insite=None,
- without_brackets: bool = False
- ) -> str:
- """
- Return the title of this Page, as a string.
-
- :param underscore: (not used with as_link) if true, replace all ' '
- characters with '_'
- :param with_ns: if false, omit the namespace prefix. If this
- option is false and used together with as_link return a labeled
- link like [[link|label]]
- :param with_section: if false, omit the section
- :param as_url: (not used with as_link) if true, quote title as if in
- a URL
- :param as_link: if true, return the title in the form of a wikilink
- :param allow_interwiki: (only used if as_link is true) if true, format
- the link as an interwiki link if necessary
- :param force_interwiki: (only used if as_link is true) if true, always
- format the link as an interwiki link
- :param textlink: (only used if as_link is true) if true, place a ':'
- before Category: and Image: links
- :param as_filename: (not used with as_link) if true, replace any
- characters that are unsafe in filenames
- :param insite: (only used if as_link is true) a site object where the
- title is to be shown. Default is the current family/lang given by
- -family and -lang or -site option i.e. config.family and
- config.mylang
- :param without_brackets: (cannot be used with as_link) if true, remove
- the last pair of brackets (usually disambiguation brackets).
- """
- title = self._link.canonical_title()
- label = self._link.title
- if with_section and self.section():
- section = '#' + self.section()
- else:
- section = ''
- if as_link:
- if insite:
- target_code = insite.code
- target_family = insite.family.name
- else:
- target_code = config.mylang
- target_family = config.family
- if force_interwiki \
- or (allow_interwiki
- and (self.site.family.name != target_family
- or self.site.code != target_code)):
- if self.site.family.name not in (
- target_family, self.site.code):
- title = '{site.family.name}:{site.code}:{title}'.format(
- site=self.site, title=title)
- else:
- # use this form for sites like commons, where the
- # code is the same as the family name
- title = '{}:{}'.format(self.site.code, title)
- elif textlink and (self.is_filepage() or self.is_categorypage()):
- title = ':{}'.format(title)
- elif self.namespace() == 0 and not section:
- with_ns = True
- if with_ns:
- return '[[{}{}]]'.format(title, section)
- return '[[{}{}|{}]]'.format(title, section, label)
- if not with_ns and self.namespace() != 0:
- title = label + section
- else:
- title += section
- if without_brackets:
- brackets_re = r'\s+\([^()]+?\)$'
- title = re.sub(brackets_re, '', title)
- if underscore or as_url:
- title = title.replace(' ', '_')
- if as_url:
- encoded_title = title.encode(self.site.encoding())
- title = quote_from_bytes(encoded_title, safe='')
- if as_filename:
- # Replace characters that are not possible in file names on some
- # systems, but still are valid in MediaWiki titles:
- # Unix: /
- # MediaWiki: /:\
- # Windows: /:\"?*
- # Spaces are possible on most systems, but are bad for URLs.
- for forbidden in ':*?/\\" ':
- title = title.replace(forbidden, '_')
- return title
-
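Continuing the sketch above, the flags combine as follows (outputs shown as comments are illustrative):

    page = pywikibot.Page(site, 'Mercury (planet)')
    page.title()                        # 'Mercury (planet)'
    page.title(without_brackets=True)   # 'Mercury'
    page.title(underscore=True)         # 'Mercury_(planet)'
    page.title(as_link=True)            # '[[Mercury (planet)]]'
    page.title(as_url=True)             # 'Mercury_%28planet%29'
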
- def section(self) -> Optional[str]:
- """
- Return the name of the section this Page refers to.
-
- The section is the part of the title following a '#' character, if
- any. If no section is present, return None.
- """
- try:
- section = self._link.section
- except AttributeError:
- section = None
- return section
-
- def __str__(self) -> str:
- """Return a string representation."""
- return self.title(as_link=True, force_interwiki=True)
-
- def __repr__(self) -> str:
- """Return a more complete string representation."""
- return '{}({!r})'.format(self.__class__.__name__, self.title())
-
- def _cmpkey(self):
- """
- Key for comparison of Page objects.
-
- Page objects are "equal" if and only if they are on the same site
- and have the same normalized title, including section if any.
-
- Page objects are sortable by site, namespace then title.
- """
- return (self.site, self.namespace(), self.title())
-
- def __hash__(self):
- """
- A stable identifier to be used as a key in hash-tables.
-
- This relies on the fact that the string
- representation of an instance cannot change after the construction.
- """
- return hash(self._cmpkey())
-
- def full_url(self):
- """Return the full URL."""
- return self.site.base_url(
- self.site.articlepath.format(self.title(as_url=True)))
-
- def autoFormat(self):
- """
- Return :py:obj:`date.getAutoFormat` dictName and value, if any.
-
- Value can be a year, date, etc., and dictName is 'YearBC',
- 'Year_December', or another dictionary name. Please note that two
- entries may have exactly the same autoFormat, but be in two
- different namespaces, as some sites have categories with the
- same names. Regular titles return (None, None).
- """
- if not hasattr(self, '_autoFormat'):
- self._autoFormat = date.getAutoFormat(
- self.site.lang,
- self.title(with_ns=False)
- )
- return self._autoFormat
-
- def isAutoTitle(self):
- """Return True if title of this Page is in the autoFormat dict."""
- return self.autoFormat()[0] is not None
-
- def get(self, force: bool = False, get_redirect: bool = False) -> str:
- """Return the wiki-text of the page.
-
- This will retrieve the page from the server if it has not been
- retrieved yet, or if force is True. This can raise the following
- exceptions that should be caught by the calling code:
-
- :exception pywikibot.exceptions.NoPageError: The page does not exist
- :exception pywikibot.exceptions.IsRedirectPageError: The page is a
- redirect. The argument of the exception is the title of the page
- it redirects to.
- :exception pywikibot.exceptions.SectionError: The section does not
- exist on a page with a # link
-
- :param force: reload all page attributes, including errors.
- :param get_redirect: return the redirect text, do not follow the
- redirect, do not raise an exception.
- """
- if force:
- del self.latest_revision_id
- if hasattr(self, '_bot_may_edit'):
- del self._bot_may_edit
- try:
- self._getInternals()
- except IsRedirectPageError:
- if not get_redirect:
- raise
-
- return self.latest_revision.text
-
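Callers typically guard against the documented exceptions; a minimal sketch:

    from pywikibot.exceptions import IsRedirectPageError, NoPageError

    try:
        text = page.get()
    except NoPageError:
        text = ''                           # page does not exist
    except IsRedirectPageError:
        # fetch the redirect's own wikitext instead of following it
        text = page.get(get_redirect=True)
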
- def _latest_cached_revision(self):
- """Get the latest revision if cached and has text, otherwise None."""
- if (hasattr(self, '_revid') and self._revid in self._revisions
- and self._revisions[self._revid].text is not None):
- return self._revisions[self._revid]
- return None
-
- def _getInternals(self):
- """
- Helper function for get().
-
- Fetches and stores the latest revision in self if it is not cached yet.
- * Raises exceptions from previous runs.
- * Stores new exceptions in _getexception and raises them.
- """
- # Raise exceptions from previous runs
- if hasattr(self, '_getexception'):
- raise self._getexception
-
- # If not already stored, fetch revision
- if self._latest_cached_revision() is None:
- try:
- self.site.loadrevisions(self, content=True)
- except (NoPageError, SectionError) as e:
- self._getexception = e
- raise
-
- # self._isredir is set by loadrevisions
- if self._isredir:
- self._getexception = IsRedirectPageError(self)
- raise self._getexception
-
- @remove_last_args(['get_redirect'])
- def getOldVersion(self, oldid, force: bool = False) -> str:
- """Return text of an old revision of this page.
-
- :param oldid: The revid of the revision desired.
- """
- if force or oldid not in self._revisions \
- or self._revisions[oldid].text is None:
- self.site.loadrevisions(self, content=True, revids=oldid)
- return self._revisions[oldid].text
-
- def permalink(self, oldid=None, percent_encoded: bool = True,
- with_protocol: bool = False) -> str:
- """Return the permalink URL of an old revision of this page.
-
- :param oldid: The revid of the revision desired.
- :param percent_encoded: if false, the title is not percent-encoded;
- spaces are replaced with underscores instead.
- :param with_protocol: if true, http or https prefixes will be
- included before the double slash.
- """
- if percent_encoded:
- title = self.title(as_url=True)
- else:
- title = self.title(as_url=False).replace(' ', '_')
- return '{}//{}{}/index.php?title={}&oldid={}'.format(
- self.site.protocol() + ':' if with_protocol else '',
- self.site.hostname(),
- self.site.scriptpath(),
- title,
- oldid if oldid is not None else self.latest_revision_id)
-
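For instance (hostname and oldid are illustrative):

    url = page.permalink(oldid=12345, with_protocol=True)
    # e.g. 'https://en.wikipedia.org/w/index.php?title=Mercury_%28planet%29&oldid=12345'
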
- @property
- def latest_revision_id(self):
- """Return the current revision id for this page."""
- if not hasattr(self, '_revid'):
- self.revisions()
- return self._revid
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- """
- Remove the latest revision id set for this Page.
-
- All internal cached values specifically for the latest revision
- of this page are cleared.
-
- The following cached values are not cleared:
- - text property
- - page properties, and page coordinates
- - lastNonBotUser
- - isDisambig and isCategoryRedirect status
- - langlinks, templates and deleted revisions
- """
- # When forcing, we retry the page no matter what:
- # * Old exceptions do not apply any more
- # * Deleting _revid to force reload
- # * Deleting _redirtarget, that info is now obsolete.
- for attr in ['_redirtarget', '_getexception', '_revid']:
- if hasattr(self, attr):
- delattr(self, attr)
-
- @latest_revision_id.setter
- def latest_revision_id(self, value) -> None:
- """Set the latest revision for this Page."""
- del self.latest_revision_id
- self._revid = value
-
- @property
- def latest_revision(self):
- """Return the current revision for this page."""
- rev = self._latest_cached_revision()
- if rev is not None:
- return rev
-
- with suppress(StopIteration):
- return next(self.revisions(content=True, total=1))
- raise InvalidPageError(self)
-
- @property
- def text(self) -> str:
- """
- Return the current (edited) wikitext, loading it if necessary.
-
- :return: text of the page
- """
- if getattr(self, '_text', None) is not None:
- return self._text
-
- try:
- return self.get(get_redirect=True)
- except NoPageError:
- # TODO: what other exceptions might be returned?
- return ''
-
- @text.setter
- def text(self, value: Optional[str]):
- """Update the current (edited) wikitext.
-
- :param value: New value or None
- """
- try:
- self.botMayEdit() # T262136, T267770
- except Exception as e:
- # Dry tests cannot make an API call; the request is
- # rejected with an Exception instead, which is ignored here.
- if not str(e).startswith('DryRequest rejecting request:'):
- raise
-
- del self.text
- self._text = None if value is None else str(value)
-
- @text.deleter
- def text(self) -> None:
- """Delete the current (edited) wikitext."""
- if hasattr(self, '_text'):
- del self._text
- if hasattr(self, '_expanded_text'):
- del self._expanded_text
- if hasattr(self, '_raw_extracted_templates'):
- del self._raw_extracted_templates
-
- def preloadText(self) -> str:
- """
- The text returned by EditFormPreloadText.
-
- See API module "info".
-
- Application: on Wikisource wikis, text can be preloaded even if
- a page does not exist, if an Index page is present.
- """
- self.site.loadpageinfo(self, preload=True)
- return self._preloadedtext
-
- def get_parsed_page(self, force: bool = False) -> str:
- """Retrieve parsed text (via action=parse) and cache it.
-
- .. versionchanged:: 7.1
- `force` parameter was added;
- `_get_parsed_page` becomes a public method
-
- :param force: force updating from the live site
-
- .. seealso::
- :meth:`APISite.get_parsed_page()
- <pywikibot.site._apisite.APISite.get_parsed_page>`
- """
- if not hasattr(self, '_parsed_text') or force:
- self._parsed_text = self.site.get_parsed_page(self)
- return self._parsed_text
-
- def extract(self, variant: str = 'plain', *,
- lines: Optional[int] = None,
- chars: Optional[int] = None,
- sentences: Optional[int] = None,
- intro: bool = True) -> str:
- """Retrieve an extract of this page.
-
- .. versionadded:: 7.1
-
- :param variant: The variant of extract, either 'plain' for plain
- text, 'html' for limited HTML (both excludes templates and
- any text formatting) or 'wiki' for bare wikitext which also
- includes any templates for example.
- :param lines: if not None, wrap the extract into lines with
- width of 79 chars and return a string with that given number
- of lines.
- :param chars: How many characters to return. Actual text
- returned might be slightly longer.
- :param sentences: How many sentences to return
- :param intro: Return only content before the first section
- :raises NoPageError: given page does not exist
- :raises NotImplementedError: "wiki" variant does not support
- `sentences` parameter.
- :raises ValueError: `variant` parameter must be "plain", "html" or
- "wiki"
-
- .. seealso:: :meth:`APISite.extract()
- <pywikibot.site._extensions.TextExtractsMixin.extract>`.
- """
- if variant in ('plain', 'html'):
- extract = self.site.extract(self, chars=chars, sentences=sentences,
- intro=intro,
- plaintext=variant == 'plain')
- elif variant == 'wiki':
- if not self.exists():
- raise NoPageError(self)
- if sentences:
- raise NotImplementedError(
- "'wiki' variant of extract method does not support "
- "'sencence' parameter")
-
- extract = self.text[:]
- if intro:
- pos = extract.find('\n=')
- if pos:
- extract = extract[:pos]
- if chars:
- extract = shorten(extract, chars, break_long_words=False,
- placeholder='…')
- else:
- raise ValueError(
- 'variant parameter must be "plain", "html" or "wiki", not "{}"'
- .format(variant))
-
- if not lines:
- return extract
-
- text_lines = []
- for i, text in enumerate(extract.splitlines(), start=1):
- text_lines += wrap(text, width=79) or ['']
- if i >= lines:
- break
-
- return '\n'.join(text_lines[:min(lines, len(text_lines))])
-
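Assuming the page exists, the variants can be combined like this (a sketch; return values depend on the wiki):

    intro_text = page.extract(sentences=2)            # plain-text intro, two sentences
    html_teaser = page.extract('html', chars=200)     # limited HTML, about 200 chars
    lead_wikitext = page.extract('wiki', intro=True)  # bare wikitext before first section
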
- def properties(self, force: bool = False) -> dict:
- """
- Return the properties of the page.
-
- :param force: force updating from the live site
- """
- if not hasattr(self, '_pageprops') or force:
- self._pageprops = {} # page may not have pageprops (see T56868)
- self.site.loadpageprops(self)
- return self._pageprops
-
- def defaultsort(self, force: bool = False) -> Optional[str]:
- """
- Extract value of the {{DEFAULTSORT:}} magic word from the page.
-
- :param force: force updating from the live site
- """
- return self.properties(force=force).get('defaultsort')
-
- def expand_text(
- self,
- force: bool = False,
- includecomments: bool = False
- ) -> str:
- """Return the page text with all templates and parser words expanded.
-
- :param force: force updating from the live site
- :param includecomments: if not True, comments are stripped from
- the expanded text.
- """
- if not hasattr(self, '_expanded_text') or (
- self._expanded_text is None) or force:
- if not self.text:
- self._expanded_text = ''
- return ''
-
- self._expanded_text = self.site.expand_text(
- self.text,
- title=self.title(with_section=False),
- includecomments=includecomments)
- return self._expanded_text
-
- def userName(self) -> str:
- """Return name or IP address of last user to edit page."""
- return self.latest_revision.user
-
- def isIpEdit(self) -> bool:
- """Return True if last editor was unregistered."""
- return self.latest_revision.anon
-
- def lastNonBotUser(self) -> str:
- """
- Return name or IP address of last human/non-bot user to edit page.
-
- Determine the most recent human editor out of the last revisions.
- If no human user can be found, returns None.
-
- If the edit was done by a bot which is no longer flagged as 'bot',
- i.e. which is not returned by Site.botusers(), it will be returned
- as a non-bot edit.
- """
- if hasattr(self, '_lastNonBotUser'):
- return self._lastNonBotUser
-
- self._lastNonBotUser = None
- for entry in self.revisions():
- if entry.user and (not self.site.isBot(entry.user)):
- self._lastNonBotUser = entry.user
- break
-
- return self._lastNonBotUser
-
- def editTime(self):
- """Return timestamp of last revision to page.
-
- :rtype: pywikibot.Timestamp
- """
- return self.latest_revision.timestamp
-
- def exists(self) -> bool:
- """Return True if page exists on the wiki, even if it's a redirect.
-
- If the title includes a section, return False if this section isn't
- found.
- """
- with suppress(AttributeError):
- return self.pageid > 0
- raise InvalidPageError(self)
-
- @property
- def oldest_revision(self):
- """
- Return the first revision of this page.
-
- :rtype: :py:obj:`Revision`
- """
- return next(self.revisions(reverse=True, total=1))
-
- def isRedirectPage(self):
- """Return True if this is a redirect, False if not or not existing."""
- return self.site.page_isredirect(self)
-
- def isStaticRedirect(self, force: bool = False) -> bool:
- """Determine whether the page is a static redirect.
-
- A static redirect must be a valid redirect, and contain the magic
- word __STATICREDIRECT__.
-
- .. versionchanged:: 7.0
- __STATICREDIRECT__ can be transcluded
-
- :param force: Bypass local caching
- """
- return self.isRedirectPage() \
- and 'staticredirect' in self.properties(force=force)
-
- def isCategoryRedirect(self) -> bool:
- """Return True if this is a category redirect page, False otherwise."""
- if not self.is_categorypage():
- return False
-
- if not hasattr(self, '_catredirect'):
- self._catredirect = False
- catredirs = self.site.category_redirects()
- for template, args in self.templatesWithParams():
- if template.title(with_ns=False) not in catredirs:
- continue
-
- if args:
- # Get target (first template argument)
- target_title = args[0].strip()
- p = pywikibot.Page(
- self.site, target_title, Namespace.CATEGORY)
- try:
- p.title()
- except pywikibot.exceptions.InvalidTitleError:
- target_title = self.site.expand_text(
- text=target_title, title=self.title())
- p = pywikibot.Page(self.site, target_title,
- Namespace.CATEGORY)
- if p.namespace() == Namespace.CATEGORY:
- self._catredirect = p.title()
- else:
- pywikibot.warning(
- 'Category redirect target {} on {} is not a '
- 'category'.format(p.title(as_link=True),
- self.title(as_link=True)))
- else:
- pywikibot.warning(
- 'No target found for category redirect on '
- + self.title(as_link=True))
- break
-
- return bool(self._catredirect)
-
- def getCategoryRedirectTarget(self):
- """
- If this is a category redirect, return the target category title.
-
- :rtype: pywikibot.page.Category
- """
- if self.isCategoryRedirect():
- return Category(Link(self._catredirect, self.site))
- raise IsNotRedirectPageError(self)
-
- def isTalkPage(self):
- """Return True if this page is in any talk namespace."""
- ns = self.namespace()
- return ns >= 0 and ns % 2 == 1
-
- def toggleTalkPage(self):
- """
- Return other member of the article-talk page pair for this Page.
-
- If self is a talk page, returns the associated content page;
- otherwise, returns the associated talk page. The returned page need
- not actually exist on the wiki.
-
- :return: Page or None if self is a special page.
- :rtype: typing.Optional[pywikibot.Page]
- """
- ns = self.namespace()
- if ns < 0: # Special page
- return None
-
- title = self.title(with_ns=False)
- new_ns = ns + (1, -1)[self.isTalkPage()]
- return Page(self.site,
- '{}:{}'.format(self.site.namespace(new_ns), title))
-
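A short usage sketch:

    talk = page.toggleTalkPage()        # e.g. Page('Talk:Mercury (planet)')
    if talk is not None:                # None only for special pages
        assert talk.toggleTalkPage() == page
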
- def is_categorypage(self):
- """Return True if the page is a Category, False otherwise."""
- return self.namespace() == 14
-
- def is_filepage(self):
- """Return True if this is a file description page, False otherwise."""
- return self.namespace() == 6
-
- def isDisambig(self) -> bool:
- """
- Return True if this is a disambiguation page, False otherwise.
-
- By default, it uses the Disambiguator extension's result. The
- identification relies on the presence of the __DISAMBIG__ magic word
- which may also be transcluded.
-
- If the Disambiguator extension isn't activated for the given site,
- the identification relies on the presence of specific templates.
- First load a list of template names from the Family file;
- if the value in the Family file is None or no entry was made, look for
- the list on [[MediaWiki:Disambiguationspage]]. If this page does not
- exist, take the MediaWiki message. 'Template:Disambig' is always
- assumed to be default, and will be appended regardless of its
- existence.
- """
- if self.site.has_extension('Disambiguator'):
- # If the Disambiguator extension is loaded, use it
- return 'disambiguation' in self.properties()
-
- if not hasattr(self.site, '_disambigtemplates'):
- try:
- default = set(self.site.family.disambig('_default'))
- except KeyError:
- default = {'Disambig'}
- try:
- distl = self.site.family.disambig(self.site.code,
- fallback=False)
- except KeyError:
- distl = None
- if distl is None:
- disambigpages = Page(self.site,
- 'MediaWiki:Disambiguationspage')
- if disambigpages.exists():
- disambigs = {link.title(with_ns=False)
- for link in disambigpages.linkedPages()
- if link.namespace() == 10}
- elif self.site.has_mediawiki_message('disambiguationspage'):
- message = self.site.mediawiki_message(
- 'disambiguationspage').split(':', 1)[1]
- # add the default template(s) for default mw message
- # only
- disambigs = {first_upper(message)} | default
- else:
- disambigs = default
- self.site._disambigtemplates = disambigs
- else:
- # Normalize template capitalization
- self.site._disambigtemplates = {first_upper(t) for t in distl}
- templates = {tl.title(with_ns=False) for tl in self.templates()}
- disambigs = set()
- # always use cached disambig templates
- disambigs.update(self.site._disambigtemplates)
- # see if any template on this page is in the set of disambigs
- disambig_in_page = disambigs.intersection(templates)
- return self.namespace() != 10 and bool(disambig_in_page)
-
- def getReferences(self,
- follow_redirects: bool = True,
- with_template_inclusion: bool = True,
- only_template_inclusion: bool = False,
- filter_redirects: bool = False,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator of all pages that refer to or embed the page.
-
- If you need a full list of referring pages, use
- ``pages = list(page.getReferences())``
-
- :param follow_redirects: if True, also iterate pages that link to a
- redirect pointing to the page.
- :param with_template_inclusion: if True, also iterate pages where self
- is used as a template.
- :param only_template_inclusion: if True, only iterate pages where self
- is used as a template.
- :param filter_redirects: if True, only iterate redirects to self.
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each referring page (default False)
- :rtype: typing.Iterable[pywikibot.Page]
- """
- # N.B.: this method intentionally overlaps with backlinks() and
- # embeddedin(). Depending on the interface, it may be more efficient
- # to implement those methods in the site interface and then combine
- # the results for this method, or to implement this method and then
- # split up the results for the others.
- return self.site.pagereferences(
- self,
- follow_redirects=follow_redirects,
- filter_redirects=filter_redirects,
- with_template_inclusion=with_template_inclusion,
- only_template_inclusion=only_template_inclusion,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def backlinks(self,
- follow_redirects: bool = True,
- filter_redirects: Optional[bool] = None,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator for pages that link to this page.
-
- :param follow_redirects: if True, also iterate pages that link to a
- redirect pointing to the page.
- :param filter_redirects: if True, only iterate redirects; if False,
- omit redirects; if None, do not filter
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each referring page (default False)
- """
- return self.site.pagebacklinks(
- self,
- follow_redirects=follow_redirects,
- filter_redirects=filter_redirects,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def embeddedin(self,
- filter_redirects: Optional[bool] = None,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator for pages that embed this page as a template.
-
- :param filter_redirects: if True, only iterate redirects; if False,
- omit redirects; if None, do not filter
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each embedding page (default False)
- """
- return self.site.page_embeddedin(
- self,
- filter_redirects=filter_redirects,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def redirects(
- self,
- *,
- filter_fragments: Optional[bool] = None,
- namespaces: NamespaceArgType = None,
- total: Optional[int] = None,
- content: bool = False
- ) -> 'Iterable[pywikibot.Page]':
- """
- Return an iterable of redirects to this page.
-
- :param filter_fragments: If True, only return redirects with fragments.
- If False, only return redirects without fragments. If None, return
- both (no filtering).
- :param namespaces: only return redirects from these namespaces
- :param total: maximum number of redirects to retrieve in total
- :param content: load the current content of each redirect
-
- .. versionadded:: 7.0
- """
- return self.site.page_redirects(
- self,
- filter_fragments=filter_fragments,
- namespaces=namespaces,
- total=total,
- content=content,
- )
-
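For example, to list a few redirects without a section fragment (a sketch):

    for redirect in page.redirects(filter_fragments=False, total=5):
        print(redirect.title())
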
- def protection(self) -> dict:
- """Return a dictionary reflecting page protections."""
- return self.site.page_restrictions(self)
-
- def applicable_protections(self) -> set:
- """
- Return the protection types allowed for that page.
-
- If the page doesn't exist it only returns "create". Otherwise it
- returns all protection types provided by the site, except "create".
- It also removes "upload" if that page is not in the File namespace.
-
- It may return an empty set, but only if the original
- protection types were removed.
-
- :return: set of str
- """
- # New API since commit 32083235eb332c419df2063cf966b3400be7ee8a
- if self.site.mw_version >= '1.25wmf14':
- self.site.loadpageinfo(self)
- return self._applicable_protections
-
- p_types = set(self.site.protection_types())
- if not self.exists():
- return {'create'} if 'create' in p_types else set()
- p_types.remove('create') # no existing page allows that
- if not self.is_filepage(): # only file pages allow upload
- p_types.remove('upload')
- return p_types
-
- def has_permission(self, action: str = 'edit') -> bool:
- """Determine whether the page can be modified.
-
- Return True if the bot has the permission of needed restriction level
- for the given action type.
-
- :param action: a valid restriction type like 'edit', 'move'
- :raises ValueError: invalid action parameter
- """
- return self.site.page_can_be_edited(self, action)
-
- def botMayEdit(self) -> bool:
- """
- Determine whether the active bot is allowed to edit the page.
-
- This will be True if the page doesn't contain {{bots}} or {{nobots}}
- or any other template from the edit_restricted_templates list in the
- family file (x_family.py), or if it contains them and the active bot
- is allowed to edit this page. (This method is only useful on those sites that
- recognize the bot-exclusion protocol; on other sites, it will always
- return True.)
-
- The framework enforces this restriction by default. It is possible
- to override this by setting ignore_bot_templates=True in
- user-config.py, or using page.put(force=True).
- """
- if not hasattr(self, '_bot_may_edit'):
- self._bot_may_edit = self._check_bot_may_edit()
- return self._bot_may_edit
-
- def _check_bot_may_edit(self, module: Optional[str] = None) -> bool:
- """A botMayEdit helper method.
-
- :param module: The module name to be restricted. Defaults to
- pywikibot.calledModuleName().
- """
- if not hasattr(self, 'templatesWithParams'):
- return True
-
- if config.ignore_bot_templates: # Check the "master ignore switch"
- return True
-
- username = self.site.username()
- try:
- templates = self.templatesWithParams()
- except (NoPageError, IsRedirectPageError, SectionError):
- return True
-
- # go through all templates and look for any restriction
- restrictions = set(self.site.get_edit_restricted_templates())
-
- if module is None:
- module = pywikibot.calledModuleName()
-
- # also add archive templates for non-archive bots
- if module != 'archivebot':
- restrictions.update(self.site.get_archived_page_templates())
-
- # multiple bots/nobots templates are allowed
- for template, params in templates:
- title = template.title(with_ns=False)
-
- if title in restrictions:
- return False
-
- if title not in ('Bots', 'Nobots'):
- continue
-
- try:
- key, sep, value = params[0].partition('=')
- except IndexError:
- key, sep, value = '', '', ''
- names = set()
- else:
- if not sep:
- key, value = value, key
- key = key.strip()
- names = {name.strip() for name in value.split(',')}
-
- if len(params) > 1:
- pywikibot.warning(
- '{{%s|%s}} has more than 1 parameter; taking the first.'
- % (title.lower(), '|'.join(params)))
-
- if title == 'Nobots':
- if not params:
- return False
-
- if key:
- pywikibot.error(
- '%s parameter for {{nobots}} is not allowed. '
- 'Edit declined' % key)
- return False
-
- if 'all' in names or module in names or username in names:
- return False
-
- if title == 'Bots':
- if value and not key:
- pywikibot.warning(
- '{{bots|%s}} is not valid. Ignoring.' % value)
- continue
-
- if key and not value:
- pywikibot.warning(
- '{{bots|%s=}} is not valid. Ignoring.' % key)
- continue
-
- if key == 'allow':
- if not ('all' in names or username in names):
- return False
-
- elif key == 'deny':
- if 'all' in names or username in names:
- return False
-
- elif key == 'allowscript':
- if not ('all' in names or module in names):
- return False
-
- elif key == 'denyscript':
- if 'all' in names or module in names:
- return False
-
- elif key: # ignore unrecognized keys with a warning
- pywikibot.warning(
- '{{bots|%s}} is not valid. Ignoring.' % params[0])
-
- # no restricting template found
- return True
-
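To make the exclusion protocol concrete: assuming a page whose wikitext contains {{bots|deny=ExampleBot}} and a bot account named ExampleBot, the check above refuses the edit:

    if page.botMayEdit():
        page.save(summary='routine update')
    else:
        pywikibot.warning('Skipping {}: restricted by {{bots}}/{{nobots}}'
                          .format(page.title()))
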
- def save(self,
- summary: Optional[str] = None,
- watch: Optional[str] = None,
- minor: bool = True,
- botflag: Optional[bool] = None,
- force: bool = False,
- asynchronous: bool = False,
- callback=None,
- apply_cosmetic_changes: Optional[bool] = None,
- quiet: bool = False,
- **kwargs):
- """
- Save the current contents of page's text to the wiki.
-
- .. versionchanged:: 7.0
- boolean watch parameter is deprecated
-
- :param summary: The edit summary for the modification (optional, but
- most wikis strongly encourage its use)
- :param watch: Specify how the watchlist is affected by this edit, set
- to one of "watch", "unwatch", "preferences", "nochange":
- * watch: add the page to the watchlist
- * unwatch: remove the page from the watchlist
- * preferences: use the preference settings (Default)
- * nochange: don't change the watchlist
- If None (default), follow bot account's default settings
- :param minor: if True, mark this edit as minor
- :param botflag: if True, mark this edit as made by a bot (default:
- True if user has bot status, False if not)
- :param force: if True, ignore botMayEdit() setting
- :param asynchronous: if True, launch a separate thread to save
- asynchronously
- :param callback: a callable object that will be called after the
- page put operation. This object must take two arguments: (1) a
- Page object, and (2) an exception instance, which will be None
- if the page was saved successfully. The callback is intended for
- use by bots that need to keep track of which saves were
- successful.
- :param apply_cosmetic_changes: Overwrites the cosmetic_changes
- configuration value to this value unless it's None.
- :param quiet: enable/disable successful save operation message;
- defaults to False.
- In asynchronous mode, if True, it is up to the calling bot to
- manage the output e.g. via callback.
- """
- if not summary:
- summary = config.default_edit_summary
-
- if isinstance(watch, bool):
- issue_deprecation_warning(
- 'boolean watch parameter',
- '"watch", "unwatch", "preferences" or "nochange" value',
- since='7.0.0')
- watch = ('unwatch', 'watch')[watch]
-
- if not force and not self.botMayEdit():
- raise OtherPageSaveError(
- self, 'Editing restricted by {{bots}}, {{nobots}} '
- "or site's equivalent of {{in use}} template")
- self._save(summary=summary, watch=watch, minor=minor, botflag=botflag,
- asynchronous=asynchronous, callback=callback,
- cc=apply_cosmetic_changes, quiet=quiet, **kwargs)
-
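A sketch of an asynchronous save with a completion callback (the callback name is illustrative):

    def on_saved(saved_page, err):
        # err is None on success, otherwise the exception instance
        if err is not None:
            pywikibot.error('Save of {} failed: {}'
                            .format(saved_page.title(), err))

    page.text += '\n[[Category:Example]]'
    page.save(summary='add category', asynchronous=True, callback=on_saved)
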
- @allow_asynchronous
- def _save(self, summary=None, watch=None, minor: bool = True, botflag=None,
- cc=None, quiet: bool = False, **kwargs):
- """Helper function for save()."""
- link = self.title(as_link=True)
- if cc or (cc is None and config.cosmetic_changes):
- summary = self._cosmetic_changes_hook(summary)
-
- done = self.site.editpage(self, summary=summary, minor=minor,
- watch=watch, bot=botflag, **kwargs)
- if not done:
- if not quiet:
- pywikibot.warning('Page {} not saved'.format(link))
- raise PageSaveRelatedError(self)
- if not quiet:
- pywikibot.output('Page {} saved'.format(link))
-
- def _cosmetic_changes_hook(self, summary: str) -> str:
- """The cosmetic changes hook.
-
- :param summary: The current edit summary.
- :return: Modified edit summary if cosmetic changes has been done,
- else the old edit summary.
- """
- if self.isTalkPage() or self.content_model != 'wikitext' or \
- pywikibot.calledModuleName() in config.cosmetic_changes_deny_script:
- return summary
-
- # check if cosmetic_changes is enabled for this page
- family = self.site.family.name
- if config.cosmetic_changes_mylang_only:
- cc = ((family == config.family and self.site.lang == config.mylang)
- or self.site.lang in config.cosmetic_changes_enable.get(
- family, []))
- else:
- cc = True
- cc = cc and self.site.lang not in config.cosmetic_changes_disable.get(
- family, [])
- cc = cc and self._check_bot_may_edit('cosmetic_changes')
- if not cc:
- return summary
-
- old = self.text
- pywikibot.log('Cosmetic changes for {}-{} enabled.'
- .format(family, self.site.lang))
- # cc depends on page directly and via several other imports
- cc_toolkit = CosmeticChangesToolkit(self, ignore=CANCEL.MATCH)
- self.text = cc_toolkit.change(old)
-
- # i18n package changed in Pywikibot 7.0.0
- old_i18n = i18n.twtranslate(self.site, 'cosmetic_changes-append',
- fallback_prompt='; cosmetic changes')
- if summary and old.strip().replace(
- '\r\n', '\n') != self.text.strip().replace('\r\n', '\n'):
- summary += i18n.twtranslate(self.site,
- 'pywikibot-cosmetic-changes',
- fallback_prompt=old_i18n)
- return summary
-
- def put(self, newtext: str,
- summary: Optional[str] = None,
- watch: Optional[str] = None,
- minor: bool = True,
- botflag: Optional[bool] = None,
- force: bool = False,
- asynchronous: bool = False,
- callback=None,
- show_diff: bool = False,
- **kwargs) -> None:
- """
- Save the page with the contents of the first argument as the text.
-
- This method is maintained primarily for backwards-compatibility.
- For new code, using Page.save() is preferred. See save() method
- docs for all parameters not listed here.
-
- .. versionadded:: 7.0
- The `show_diff` parameter
-
- :param newtext: The complete text of the revised page.
- :param show_diff: show changes between oldtext and newtext
- (default: False)
- """
- if show_diff:
- pywikibot.showDiff(self.text, newtext)
- self.text = newtext
- self.save(summary=summary, watch=watch, minor=minor, botflag=botflag,
- force=force, asynchronous=asynchronous, callback=callback,
- **kwargs)
-
- def watch(self, unwatch: bool = False) -> bool:
- """
- Add or remove this page to/from bot account's watchlist.
-
- :param unwatch: True to unwatch, False (default) to watch.
- :return: True if successful, False otherwise.
- """
- return self.site.watch(self, unwatch)
-
- def clear_cache(self) -> None:
- """Clear the cached attributes of the page."""
- self._revisions = {}
- for attr in self._cache_attrs:
- with suppress(AttributeError):
- delattr(self, attr)
-
- def purge(self, **kwargs) -> bool:
- """
- Purge the server's cache for this page.
-
- :keyword redirects: Automatically resolve redirects.
- :type redirects: bool
- :keyword converttitles: Convert titles to other variants if necessary.
- Only works if the wiki's content language supports variant
- conversion.
- :type converttitles: bool
- :keyword forcelinkupdate: Update the links tables.
- :type forcelinkupdate: bool
- :keyword forcerecursivelinkupdate: Update the links table, and update
- the links tables for any page that uses this page as a template.
- :type forcerecursivelinkupdate: bool
- """
- self.clear_cache()
- return self.site.purgepages([self], **kwargs)
-
- def touch(self, callback=None, botflag: bool = False, **kwargs):
- """
- Make a touch edit for this page.
-
- See save() method docs for all parameters.
- The following parameters will be overridden by this method:
- - summary, watch, minor, force, asynchronous
-
- Parameter botflag is False by default.
-
- The minor and botflag parameters are set to False, which prevents the
- edit from being hidden when, due to a bug, it becomes a real edit.
-
- :note: This discards content saved to self.text.
- """
- if self.exists():
- # Ensure we always fetch the page text without changing it.
- del self.text
- summary = i18n.twtranslate(self.site, 'pywikibot-touch')
- self.save(summary=summary, watch='nochange',
- minor=False, botflag=botflag, force=True,
- asynchronous=False, callback=callback,
- apply_cosmetic_changes=False, nocreate=True, **kwargs)
- else:
- raise NoPageError(self)
-
- def linkedPages(
- self, *args, **kwargs
- ) -> Generator['pywikibot.Page', None, None]:
- """Iterate Pages that this Page links to.
-
- Only returns pages from "normal" internal links. Embedded
- templates are omitted but links within them are returned. All
- interwiki and external links are omitted.
-
- For the parameters refer to
- :py:meth:`APISite.pagelinks<pywikibot.site.APISite.pagelinks>`
-
- .. versionadded:: 7.0
- the `follow_redirects` keyword argument
- .. deprecated:: 7.0
- the positional arguments
-
- .. seealso:: https://www.mediawiki.org/wiki/API:Links
-
- :keyword namespaces: Only iterate pages in these namespaces
- (default: all)
- :type namespaces: iterable of str or Namespace key,
- or a single instance of those types. May be a '|' separated
- list of namespace identifiers.
- :keyword follow_redirects: if True, yields the target of any redirects,
- rather than the redirect page
- :keyword total: iterate no more than this number of pages in total
- :keyword content: if True, load the current content of each page
- """
- # Deprecate positional arguments and synchronize with Site.pagelinks
- keys = ('namespaces', 'total', 'content')
- for i, arg in enumerate(args):
- key = keys[i]
- issue_deprecation_warning(
- 'Positional argument {} ({})'.format(i + 1, arg),
- 'keyword argument "{}={}"'.format(key, arg),
- since='7.0.0')
- if key in kwargs:
- pywikibot.warning('{!r} is given as keyword argument {!r} '
- 'already; ignoring {!r}'
- .format(key, arg, kwargs[key]))
- else:
- kwargs[key] = arg
-
- return self.site.pagelinks(self, **kwargs)
-
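With the positional arguments deprecated, the keyword form reads (parameters illustrative):

    for linked in page.linkedPages(namespaces=[0], total=10):
        print(linked.title())
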
- def interwiki(self, expand: bool = True):
- """
- Iterate interwiki links in the page text, excluding language links.
-
- :param expand: if True (default), include interwiki links found in
- templates transcluded onto this page; if False, only iterate
- interwiki links found in this page's own wikitext
- :return: a generator that yields Link objects
- :rtype: generator
- """
- # This function does not exist in the API, so it has to be
- # implemented by screen-scraping
- if expand:
- text = self.expand_text()
- else:
- text = self.text
- for linkmatch in pywikibot.link_regex.finditer(
- textlib.removeDisabledParts(text)):
- linktitle = linkmatch.group('title')
- link = Link(linktitle, self.site)
- # only yield links that are to a different site and that
- # are not language links
- try:
- if link.site != self.site:
- if linktitle.lstrip().startswith(':'):
- # initial ":" indicates not a language link
- yield link
- elif link.site.family != self.site.family:
- # link to a different family is not a language link
- yield link
- except Error:
- # ignore any links with invalid contents
- continue
-
- def langlinks(self, include_obsolete: bool = False) -> list:
- """
- Return a list of all inter-language Links on this page.
-
- :param include_obsolete: if true, return even Link objects whose site
- is obsolete
- :return: list of Link objects.
- """
- # Note: We preload a list of *all* langlinks, including links to
- # obsolete sites, and store that in self._langlinks. We then filter
- # this list if the method was called with include_obsolete=False
- # (which is the default)
- if not hasattr(self, '_langlinks'):
- self._langlinks = list(self.iterlanglinks(include_obsolete=True))
-
- if include_obsolete:
- return self._langlinks
- return [i for i in self._langlinks if not i.site.obsolete]
-
- def iterlanglinks(self,
- total: Optional[int] = None,
- include_obsolete: bool = False):
- """Iterate all inter-language links on this page.
-
- :param total: iterate no more than this number of pages in total
- :param include_obsolete: if true, yield even Link object whose site
- is obsolete
- :return: a generator that yields Link objects.
- :rtype: generator
- """
- if hasattr(self, '_langlinks'):
- return iter(self.langlinks(include_obsolete=include_obsolete))
- # XXX We might want to fill _langlinks when the Site
- # method is called. If we do this, we'll have to think
- # about what will happen if the generator is not completely
- # iterated upon.
- return self.site.pagelanglinks(self, total=total,
- include_obsolete=include_obsolete)
-
- def data_item(self):
- """
- Convenience function to get the Wikibase item of a page.
-
- :rtype: pywikibot.page.ItemPage
- """
- return ItemPage.fromPage(self)
-
- def templates(self, content: bool = False):
- """
- Return a list of Page objects for templates used on this Page.
-
- Template parameters are ignored. This method only returns embedded
- templates, not template pages that happen to be referenced through
- a normal link.
-
- :param content: if True, retrieve the content of the current version
- of each template (default False)
- :type content: bool
- """
- # Data might have been preloaded
- if not hasattr(self, '_templates'):
- self._templates = list(self.itertemplates(content=content))
-
- return self._templates
-
- def itertemplates(self,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate Page objects for templates used on this Page.
-
- Template parameters are ignored. This method only returns embedded
- templates, not template pages that happen to be referenced through
- a normal link.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each template (default False)
- :type content: bool
- """
- if hasattr(self, '_templates'):
- return iter(self._templates)
- return self.site.pagetemplates(self, total=total, content=content)
-
- def imagelinks(self, total: Optional[int] = None, content: bool = False):
- """
- Iterate FilePage objects for images displayed on this Page.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each image description page (default False)
- :return: a generator that yields FilePage objects.
- """
- return self.site.pageimages(self, total=total, content=content)
-
- def categories(self,
- with_sort_key: bool = False,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate categories that the article is in.
-
- :param with_sort_key: if True, include the sort key in each Category.
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each category description page (default False)
- :return: a generator that yields Category objects.
- :rtype: generator
- """
- # FIXME: bug T75561: with_sort_key is ignored by Site.pagecategories
- if with_sort_key:
- raise NotImplementedError('with_sort_key is not implemented')
-
- return self.site.pagecategories(self, total=total, content=content)
-
- def extlinks(self, total: Optional[int] = None):
- """
- Iterate all external URLs (not interwiki links) from this page.
-
- :param total: iterate no more than this number of pages in total
- :return: a generator that yields str objects containing URLs.
- :rtype: generator
- """
- return self.site.page_extlinks(self, total=total)
-
- def coordinates(self, primary_only: bool = False):
- """
- Return a list of Coordinate objects for points on the page.
-
- Uses the MediaWiki extension GeoData.
-
- :param primary_only: Only return the coordinate indicated to be primary
- :return: A list of Coordinate objects or a single Coordinate if
- primary_only is True
- :rtype: list of Coordinate or Coordinate or None
- """
- if not hasattr(self, '_coords'):
- self._coords = []
- self.site.loadcoordinfo(self)
- if primary_only:
- for coord in self._coords:
- if coord.primary:
- return coord
- return None
- return list(self._coords)
-
- def page_image(self):
- """
- Return the most appropriate image on the page.
-
- Uses the MediaWiki extension PageImages.
-
- :return: A FilePage object
- :rtype: pywikibot.page.FilePage
- """
- if not hasattr(self, '_pageimage'):
- self._pageimage = None
- self.site.loadpageimage(self)
-
- return self._pageimage
-
- def getRedirectTarget(self):
- """
- Return a Page object for the target this Page redirects to.
-
- If this page is not a redirect page, will raise an
- IsNotRedirectPageError. This method also can raise a NoPageError.
-
- :rtype: pywikibot.Page
- """
- return self.site.getredirtarget(self)
-
- def moved_target(self):
- """
- Return a Page object for the target this Page was moved to.
-
- If this page was not moved, it will raise a NoMoveTargetError.
- This method also works if the source was already deleted.
-
- :rtype: pywikibot.page.Page
- :raises pywikibot.exceptions.NoMoveTargetError: page was not moved
- """
- gen = iter(self.site.logevents(logtype='move', page=self, total=1))
- try:
- lastmove = next(gen)
- except StopIteration:
- raise NoMoveTargetError(self)
- else:
- return lastmove.target_page
-
- def revisions(self,
- reverse: bool = False,
- total: Optional[int] = None,
- content: bool = False,
- starttime=None, endtime=None):
- """Generator which loads the version history as Revision instances."""
- # TODO: Only request uncached revisions
- self.site.loadrevisions(self, content=content, rvdir=reverse,
- starttime=starttime, endtime=endtime,
- total=total)
- return (self._revisions[rev] for rev in
- sorted(self._revisions, reverse=not reverse)[:total])
-
- def getVersionHistoryTable(self,
- reverse: bool = False,
- total: Optional[int] = None):
- """Return the version history as a wiki table."""
- result = '{| class="wikitable"\n'
- result += '! oldid || date/time || username || edit summary\n'
- for entry in self.revisions(reverse=reverse, total=total):
- result += '|----\n'
- result += ('| {r.revid} || {r.timestamp} || {r.user} || '
- '<nowiki>{r.comment}</nowiki>\n'.format(r=entry))
- result += '|}\n'
- return result
-
- def contributors(self,
- total: Optional[int] = None,
- starttime=None, endtime=None):
- """
- Compile contributors of this page with edit counts.
-
- :param total: iterate no more than this number of revisions in total
- :param starttime: retrieve revisions starting at this Timestamp
- :param endtime: retrieve revisions ending at this Timestamp
-
- :return: number of edits for each username
- :rtype: :py:obj:`collections.Counter`
- """
- return Counter(rev.user for rev in
- self.revisions(total=total,
- starttime=starttime, endtime=endtime))
-
- def revision_count(self, contributors=None) -> int:
- """Determine number of edits from contributors.
-
- :param contributors: contributor usernames
- :type contributors: iterable of str or pywikibot.User,
- a single pywikibot.User, a str or None
- :return: number of edits for all provided usernames
- """
- cnt = self.contributors()
-
- if not contributors:
- return sum(cnt.values())
-
- if isinstance(contributors, User):
- contributors = contributors.username
-
- if isinstance(contributors, str):
- return cnt[contributors]
-
- return sum(cnt[user.username] if isinstance(user, User) else cnt[user]
- for user in contributors)
-
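Together, contributors() and revision_count() allow quick authorship checks (usernames illustrative):

    edits = page.contributors()                  # Counter: username -> edit count
    top_user, top_count = edits.most_common(1)[0]
    bot_edits = page.revision_count(['ExampleBot'])
    human_edits = sum(edits.values()) - bot_edits
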
- def merge_history(self, dest, timestamp=None, reason=None) -> None:
- """
- Merge revisions from this page into another page.
-
- See :py:obj:`APISite.merge_history` for details.
-
- :param dest: Destination page to which revisions will be merged
- :type dest: pywikibot.Page
- :param timestamp: Revisions from this page dating up to this timestamp
- will be merged into the destination page (if not given or False,
- all revisions will be merged)
- :type timestamp: pywikibot.Timestamp
- :param reason: Optional reason for the history merge
- :type reason: str
- """
- self.site.merge_history(self, dest, timestamp, reason)
-
- def move(self,
- newtitle: str,
- reason: Optional[str] = None,
- movetalk: bool = True,
- noredirect: bool = False):
- """
- Move this page to a new title.
-
- :param newtitle: The new page title.
- :param reason: The edit summary for the move.
- :param movetalk: If true, move this page's talk page (if it exists)
- :param noredirect: if move succeeds, delete the old page
- (usually requires sysop privileges, depending on wiki settings)
- """
- if reason is None:
- pywikibot.output('Moving {} to [[{}]].'
- .format(self.title(as_link=True), newtitle))
- reason = pywikibot.input('Please enter a reason for the move:')
- return self.site.movepage(self, newtitle, reason,
- movetalk=movetalk,
- noredirect=noredirect)
-
- def delete(
- self,
- reason: Optional[str] = None,
- prompt: bool = True,
- mark: bool = False,
- automatic_quit: bool = False,
- *,
- deletetalk: bool = False
- ) -> None:
- """
- Delete the page from the wiki. Requires administrator status.
-
- .. versionchanged:: 7.1
- keyword only parameter *deletetalk* was added.
-
- :param reason: The edit summary for the deletion, or rationale
- for deletion if requesting. If None, ask for it.
- :param deletetalk: Also delete the talk page, if it exists.
- :param prompt: If true, prompt user for confirmation before deleting.
- :param mark: If true, and user does not have sysop rights, place a
- speedy-deletion request on the page instead. If false, non-sysops
- will be asked before marking pages for deletion.
- :param automatic_quit: show also the quit option, when asking
- for confirmation.
- """
- if reason is None:
- pywikibot.output('Deleting {}.'.format(self.title(as_link=True)))
- reason = pywikibot.input('Please enter a reason for the deletion:')
-
- # If user has 'delete' right, delete the page
- if self.site.has_right('delete'):
- answer = 'y'
- if prompt and not hasattr(self.site, '_noDeletePrompt'):
- answer = pywikibot.input_choice(
- 'Do you want to delete {}?'.format(self.title(
- as_link=True, force_interwiki=True)),
- [('Yes', 'y'), ('No', 'n'), ('All', 'a')],
- 'n', automatic_quit=automatic_quit)
- if answer == 'a':
- answer = 'y'
- self.site._noDeletePrompt = True
- if answer == 'y':
- self.site.delete(self, reason, deletetalk=deletetalk)
- return
-
- # Otherwise mark it for deletion
- if mark or hasattr(self.site, '_noMarkDeletePrompt'):
- answer = 'y'
- else:
- answer = pywikibot.input_choice(
- "Can't delete {}; do you want to mark it for deletion instead?"
- .format(self),
- [('Yes', 'y'), ('No', 'n'), ('All', 'a')],
- 'n', automatic_quit=False)
- if answer == 'a':
- answer = 'y'
- self.site._noMarkDeletePrompt = True
- if answer == 'y':
- template = '{{delete|1=%s}}\n' % reason
- # We can't add templates in a wikidata item, so let's use its
- # talk page
- if isinstance(self, pywikibot.ItemPage):
- target = self.toggleTalkPage()
- else:
- target = self
- target.text = template + target.text
- target.save(summary=reason)
-
- def has_deleted_revisions(self) -> bool:
- """Return True if the page has deleted revisions.
-
- .. versionadded:: 4.2
- """
- if not hasattr(self, '_has_deleted_revisions'):
- gen = self.site.deletedrevs(self, total=1, prop=['ids'])
- self._has_deleted_revisions = bool(list(gen))
- return self._has_deleted_revisions
-
- def loadDeletedRevisions(self, total: Optional[int] = None, **kwargs):
- """
- Retrieve deleted revisions for this Page.
-
- Stores all revisions' timestamps, dates, editors and comments in
- self._deletedRevs attribute.
-
- :return: iterator of timestamps (which can be used to retrieve
- revisions later on).
- :rtype: generator
- """
- if not hasattr(self, '_deletedRevs'):
- self._deletedRevs = {}
- for item in self.site.deletedrevs(self, total=total, **kwargs):
- for rev in item.get('revisions', []):
- self._deletedRevs[rev['timestamp']] = rev
- yield rev['timestamp']
-
- def getDeletedRevision(
- self,
- timestamp,
- content: bool = False,
- **kwargs
- ) -> List:
- """
- Return a particular deleted revision by timestamp.
-
-        :return: a list of [date, editor, comment, text, restoration
-            marker]. text will be None, unless content is True (or has
-            been retrieved earlier). If timestamp is not found, an empty
-            list is returned.
- """
- if hasattr(self, '_deletedRevs'):
- if timestamp in self._deletedRevs and (
- not content
- or 'content' in self._deletedRevs[timestamp]):
- return self._deletedRevs[timestamp]
-
- for item in self.site.deletedrevs(self, start=timestamp,
- content=content, total=1, **kwargs):
- # should only be one item with one revision
- if item['title'] == self.title():
- if 'revisions' in item:
- return item['revisions'][0]
- return []
-
- def markDeletedRevision(self, timestamp, undelete: bool = True):
- """
- Mark the revision identified by timestamp for undeletion.
-
- :param undelete: if False, mark the revision to remain deleted.
- """
- if not hasattr(self, '_deletedRevs'):
- self.loadDeletedRevisions()
- if timestamp not in self._deletedRevs:
- raise ValueError(
- 'Timestamp {} is not a deleted revision'
- .format(timestamp))
- self._deletedRevs[timestamp]['marked'] = undelete
-
- def undelete(self, reason: Optional[str] = None) -> None:
- """
- Undelete revisions based on the markers set by previous calls.
-
- If no calls have been made since loadDeletedRevisions(), everything
- will be restored.
-
- Simplest case::
-
- Page(...).undelete('This will restore all revisions')
-
- More complex::
-
- pg = Page(...)
- revs = pg.loadDeletedRevisions()
- for rev in revs:
- if ... #decide whether to undelete a revision
- pg.markDeletedRevision(rev) #mark for undeletion
- pg.undelete('This will restore only selected revisions.')
-
- :param reason: Reason for the action.
- """
- if hasattr(self, '_deletedRevs'):
- undelete_revs = [ts for ts, rev in self._deletedRevs.items()
- if 'marked' in rev and rev['marked']]
- else:
- undelete_revs = []
- if reason is None:
- warn('Not passing a reason for undelete() is deprecated.',
- DeprecationWarning)
- pywikibot.output('Undeleting {}.'.format(self.title(as_link=True)))
- reason = pywikibot.input(
- 'Please enter a reason for the undeletion:')
- self.site.undelete(self, reason, revision=undelete_revs)
-
- def protect(self,
- reason: Optional[str] = None,
- protections: Optional[dict] = None,
- **kwargs) -> None:
- """
- Protect or unprotect a wiki page. Requires administrator status.
-
- Valid protection levels are '' (equivalent to 'none'),
- 'autoconfirmed', 'sysop' and 'all'. 'all' means 'everyone is allowed',
- i.e. that protection type will be unprotected.
-
-        In order to unprotect a type of permission, its protection level
-        must be set to 'all' or '', or the type must be omitted from the
-        protections dictionary.
-
- Expiry of protections can be set via kwargs, see Site.protect() for
- details. By default there is no expiry for the protection types.
-
- :param protections: A dict mapping type of protection to protection
- level of that type. Allowed protection types for a page can be
- retrieved by Page.self.applicable_protections()
-            Defaults to None, which means all protection types are
-            unprotected.
- Example: {'move': 'sysop', 'edit': 'autoconfirmed'}
-
- :param reason: Reason for the action, default is None and will set an
- empty string.
- """
- protections = protections or {} # protections is converted to {}
- reason = reason or '' # None is converted to ''
-
- self.site.protect(self, protections, reason, **kwargs)
-
- def change_category(self, old_cat, new_cat,
- summary: Optional[str] = None,
- sort_key=None,
- in_place: bool = True,
- include: Optional[List[str]] = None,
- show_diff: bool = False) -> bool:
- """
-        Remove page from old_cat and add it to new_cat.
-
- .. versionadded:: 7.0
- The `show_diff` parameter
-
- :param old_cat: category to be removed
- :type old_cat: pywikibot.page.Category
- :param new_cat: category to be added, if any
- :type new_cat: pywikibot.page.Category or None
-
- :param summary: string to use as an edit summary
-
-        :param sort_key: sort key to use for the added category.
-            Unused if new_cat is None or if in_place is True.
-            If sort_key is True, the sort key used for old_cat is kept.
-
- :param in_place: if True, change categories in place rather than
- rearranging them.
-
- :param include: list of tags not to be disabled by default in relevant
- textlib functions, where CategoryLinks can be searched.
- :param show_diff: show changes between oldtext and newtext
- (default: False)
-
-        :return: True if the page was changed and saved, otherwise False.
- """
- # get list of Category objects the article is in and remove possible
- # duplicates
- cats = []
- for cat in textlib.getCategoryLinks(self.text, site=self.site,
- include=include or []):
- if cat not in cats:
- cats.append(cat)
-
- if not self.has_permission():
- pywikibot.output("Can't edit {}, skipping it..."
- .format(self.title(as_link=True)))
- return False
-
- if old_cat not in cats:
- if self.namespace() != 10:
- pywikibot.error('{} is not in category {}!'
- .format(self.title(as_link=True),
- old_cat.title()))
- else:
- pywikibot.output('{} is not in category {}, skipping...'
- .format(self.title(as_link=True),
- old_cat.title()))
- return False
-
- # This prevents the bot from adding new_cat if it is already present.
- if new_cat in cats:
- new_cat = None
-
- oldtext = self.text
- if in_place or self.namespace() == 10:
- newtext = textlib.replaceCategoryInPlace(oldtext, old_cat, new_cat,
- site=self.site)
- else:
- old_cat_pos = cats.index(old_cat)
- if new_cat:
- if sort_key is True:
- # Fetch sort_key from old_cat in current page.
- sort_key = cats[old_cat_pos].sortKey
- cats[old_cat_pos] = Category(self.site, new_cat.title(),
- sort_key=sort_key)
- else:
- cats.pop(old_cat_pos)
-
- try:
- newtext = textlib.replaceCategoryLinks(oldtext, cats)
- except ValueError:
-                # Make sure that the only way replaceCategoryLinks() can
-                # raise a ValueError is in the case of interwiki links to
-                # self.
- pywikibot.output('Skipping {} because of interwiki link to '
- 'self'.format(self.title()))
- return False
-
- if oldtext != newtext:
- try:
- self.put(newtext, summary, show_diff=show_diff)
- except PageSaveRelatedError as error:
- pywikibot.output('Page {} not saved: {}'
- .format(self.title(as_link=True), error))
- except NoUsernameError:
- pywikibot.output('Page {} not saved; sysop privileges '
- 'required.'.format(self.title(as_link=True)))
- else:
- return True
-
- return False
-
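A hedged sketch of the category-change flow documented above; the category
titles and edit summary are hypothetical::

    old_cat = pywikibot.Category(site, 'Category:Old name')  # hypothetical
    new_cat = pywikibot.Category(site, 'Category:New name')  # hypothetical
    saved = page.change_category(old_cat, new_cat,
                                 summary='Move to the new category',
                                 show_diff=True)
    print('saved' if saved else 'skipped')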
- def is_flow_page(self) -> bool:
- """Whether a page is a Flow page."""
- return self.content_model == 'flow-board'
-
- def create_short_link(self,
- permalink: bool = False,
- with_protocol: bool = True) -> str:
- """
- Return a shortened link that points to that page.
-
- If shared_urlshortner_wiki is defined in family config, it'll use
- that site to create the link instead of the current wiki.
-
- :param permalink: If true, the link will point to the actual revision
- of the page.
- :param with_protocol: If true, and if it's not already included,
- the link will have http(s) protocol prepended. On Wikimedia wikis
- the protocol is already present.
- :return: The reduced link.
- """
- wiki = self.site
- if self.site.family.shared_urlshortner_wiki:
- wiki = pywikibot.Site(*self.site.family.shared_urlshortner_wiki)
-
- url = self.permalink() if permalink else self.full_url()
-
- link = wiki.create_short_link(url)
- if re.match(PROTOCOL_REGEX, link):
- if not with_protocol:
- return re.sub(PROTOCOL_REGEX, '', link)
- elif with_protocol:
- return '{}://{}'.format(wiki.protocol(), link)
- return link
-
-
-class Page(BasePage):
-
- """Page: A MediaWiki page."""
-
- def __init__(self, source, title: str = '', ns=0) -> None:
- """Instantiate a Page object."""
- if isinstance(source, pywikibot.site.BaseSite):
- if not title:
- raise ValueError('Title must be specified and not empty '
- 'if source is a Site.')
- super().__init__(source, title, ns)
-
- @property
- def raw_extracted_templates(self):
- """
- Extract templates using :py:obj:`textlib.extract_templates_and_params`.
-
- Disabled parts and whitespace are stripped, except for
- whitespace in anonymous positional arguments.
-
- This value is cached.
-
- :rtype: list of (str, OrderedDict)
- """
- if not hasattr(self, '_raw_extracted_templates'):
- templates = textlib.extract_templates_and_params(
- self.text, True, True)
- self._raw_extracted_templates = templates
-
- return self._raw_extracted_templates
-
- def templatesWithParams(self):
- """
- Return templates used on this Page.
-
- The templates are extracted by
- :py:obj:`textlib.extract_templates_and_params`, with positional
- arguments placed first in order, and each named argument
- appearing as 'name=value'.
-
- All parameter keys and values for each template are stripped of
- whitespace.
-
- :return: a list of tuples with one tuple for each template invocation
- in the page, with the template Page as the first entry and a list
- of parameters as the second entry.
- :rtype: list of (pywikibot.page.Page, list)
- """
- # WARNING: may not return all templates used in particularly
- # intricate cases such as template substitution
- titles = {t.title() for t in self.templates()}
- templates = self.raw_extracted_templates
- # backwards-compatibility: convert the dict returned as the second
- # element into a list in the format used by old scripts
- result = []
- for template in templates:
- try:
- link = pywikibot.Link(template[0], self.site,
- default_namespace=10)
- if link.canonical_title() not in titles:
- continue
- except Error:
- # this is a parser function or magic word, not template name
- # the template name might also contain invalid parts
- continue
- args = template[1]
- intkeys = {}
- named = {}
- positional = []
- for key in sorted(args):
- try:
- intkeys[int(key)] = args[key]
- except ValueError:
- named[key] = args[key]
- for i in range(1, len(intkeys) + 1):
- # only those args with consecutive integer keys can be
- # treated as positional; an integer could also be used
- # (out of order) as the key for a named argument
- # example: {{tmp|one|two|5=five|three}}
- if i in intkeys:
- positional.append(intkeys[i])
- else:
- for k in intkeys:
- if k < 1 or k >= i:
- named[str(k)] = intkeys[k]
- break
- for item in named.items():
- positional.append('{}={}'.format(*item))
- result.append((pywikibot.Page(link, self.site), positional))
- return result
-
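A short sketch of consuming the (template, parameters) pairs yielded by
templatesWithParams()::

    for tmpl, params in page.templatesWithParams():
        # positional arguments come first, named ones as 'name=value'
        print(tmpl.title(with_ns=False), params)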
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Change the page's text to point to the redirect page.
-
- :param target_page: target of the redirect, this argument is required.
- :type target_page: pywikibot.Page or string
-        :param create: if true, create the redirect even if the page
-            doesn't exist.
-        :param force: if true, set the redirect target even if the page
-            doesn't exist or is not a redirect.
-        :param keep_section: if the old redirect links to a section
-            and the new one doesn't, keep the old redirect's section.
-        :param save: if true, save the page immediately.
- :param kwargs: Arguments which are used for saving the page directly
- afterwards, like 'summary' for edit summary.
- """
- if isinstance(target_page, str):
- target_page = pywikibot.Page(self.site, target_page)
- elif self.site != target_page.site:
- raise InterwikiRedirectPageError(self, target_page)
- if not self.exists() and not (create or force):
- raise NoPageError(self)
- if self.exists() and not self.isRedirectPage() and not force:
- raise IsNotRedirectPageError(self)
- redirect_regex = self.site.redirect_regex
- if self.exists():
- old_text = self.get(get_redirect=True)
- else:
- old_text = ''
- result = redirect_regex.search(old_text)
- if result:
- oldlink = result.group(1)
- if (keep_section and '#' in oldlink
- and target_page.section() is None):
- sectionlink = oldlink[oldlink.index('#'):]
- target_page = pywikibot.Page(
- self.site,
- target_page.title() + sectionlink
- )
- prefix = self.text[:result.start()]
- suffix = self.text[result.end():]
- else:
- prefix = ''
- suffix = ''
-
- target_link = target_page.title(as_link=True, textlink=True,
- allow_interwiki=False)
- target_link = '#{} {}'.format(self.site.redirect(), target_link)
- self.text = prefix + target_link + suffix
- if save:
- self.save(**kwargs)
-
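A hedged sketch of retargeting an existing redirect; the titles and edit
summary are hypothetical::

    redirect = pywikibot.Page(site, 'Old redirect')  # hypothetical title
    redirect.set_redirect_target('New target',       # hypothetical target
                                 force=True,
                                 keep_section=True,
                                 summary='Retarget redirect')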
- def get_best_claim(self, prop: str):
- """
- Return the first best Claim for this page.
-
- Return the first 'preferred' ranked Claim specified by Wikibase
- property or the first 'normal' one otherwise.
-
- .. versionadded:: 3.0
-
- :param prop: property id, "P###"
- :return: Claim object given by Wikibase property number
- for this page object.
- :rtype: pywikibot.Claim or None
-
- :raises UnknownExtensionError: site has no Wikibase extension
- """
- def find_best_claim(claims):
- """Find the first best ranked claim."""
- index = None
- for i, claim in enumerate(claims):
- if claim.rank == 'preferred':
- return claim
- if index is None and claim.rank == 'normal':
- index = i
- if index is None:
- index = 0
- return claims[index]
-
- if not self.site.has_data_repository:
- raise UnknownExtensionError(
- 'Wikibase is not implemented for {}.'.format(self.site))
-
- def get_item_page(func, *args):
- try:
- item_p = func(*args)
- item_p.get()
- return item_p
- except NoPageError:
- return None
- except IsRedirectPageError:
- return get_item_page(item_p.getRedirectTarget)
-
- item_page = get_item_page(pywikibot.ItemPage.fromPage, self)
- if item_page and prop in item_page.claims:
- return find_best_claim(item_page.claims[prop])
- return None
-
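A minimal sketch, assuming the page's wiki is connected to a Wikibase
repository; P31 ('instance of' on Wikidata) is used purely for
illustration::

    claim = page.get_best_claim('P31')
    if claim is not None:
        print(claim.getTarget())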
-
-class FilePage(Page):
-
- """
- A subclass of Page representing a file description page.
-
- Supports the same interface as Page, with some added methods.
- """
-
- def __init__(self, source, title: str = '') -> None:
- """Initializer."""
- self._file_revisions = {} # dictionary to cache File history.
- super().__init__(source, title, 6)
- if self.namespace() != 6:
- raise ValueError("'{}' is not in the file namespace!"
- .format(self.title()))
-
- def _load_file_revisions(self, imageinfo) -> None:
- for file_rev in imageinfo:
- # filemissing in API response indicates most fields are missing
- # see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/533482/
- if 'filemissing' in file_rev:
- pywikibot.warning("File '{}' contains missing revisions"
- .format(self.title()))
- continue
- file_revision = FileInfo(file_rev)
- self._file_revisions[file_revision.timestamp] = file_revision
-
- @property
- def latest_file_info(self):
- """
- Retrieve and store information of latest Image rev. of FilePage.
-
- At the same time, the whole history of Image is fetched and cached in
- self._file_revisions
-
- :return: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- latest_ts = max(self._file_revisions)
- return self._file_revisions[latest_ts]
-
- @property
- def oldest_file_info(self):
- """
- Retrieve and store information of oldest Image rev. of FilePage.
-
- At the same time, the whole history of Image is fetched and cached in
- self._file_revisions
-
- :return: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- oldest_ts = min(self._file_revisions)
- return self._file_revisions[oldest_ts]
-
- def get_file_history(self) -> dict:
- """
- Return the file's version history.
-
- :return: dictionary with:
- key: timestamp of the entry
- value: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- return self._file_revisions
-
- def getImagePageHtml(self) -> str:
- """Download the file page, and return the HTML, as a string.
-
- Caches the HTML code, so that if you run this method twice on the
- same FilePage object, the page will only be downloaded once.
- """
- if not hasattr(self, '_imagePageHtml'):
- path = '{}/index.php?title={}'.format(self.site.scriptpath(),
- self.title(as_url=True))
- self._imagePageHtml = http.request(self.site, path).text
- return self._imagePageHtml
-
- def get_file_url(self, url_width=None, url_height=None,
- url_param=None) -> str:
- """
- Return the url or the thumburl of the file described on this page.
-
- Fetch the information if not available.
-
- Once retrieved, thumburl information will also be accessible as
- latest_file_info attributes, named as in [1]:
- - url, thumburl, thumbwidth and thumbheight
-
- Parameters correspond to iiprops in:
- [1] https://www.mediawiki.org/wiki/API:Imageinfo
-
-        Parameter validation and error handling are left to the API call.
-
- :param url_width: see iiurlwidth in [1]
-        :param url_height: see iiurlheight in [1]
- :param url_param: see iiurlparam in [1]
- :return: latest file url or thumburl
- """
- # Plain url is requested.
- if url_width is None and url_height is None and url_param is None:
- return self.latest_file_info.url
-
- # Thumburl is requested.
- self.site.loadimageinfo(self, history=not self._file_revisions,
- url_width=url_width, url_height=url_height,
- url_param=url_param)
- return self.latest_file_info.thumburl
-
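A sketch of the plain-url and thumburl cases described above; the file
title is hypothetical::

    file_page = pywikibot.FilePage(site, 'File:Example.jpg')  # hypothetical
    print(file_page.get_file_url())               # full-size url
    print(file_page.get_file_url(url_width=320))  # 320px-wide thumburl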
- def file_is_shared(self) -> bool:
- """Check if the file is stored on any known shared repository."""
- # as of now, the only known repositories are commons and wikitravel
- # TODO: put the URLs to family file
- if not self.site.has_image_repository:
- return False
-
- if 'wikitravel_shared' in self.site.shared_image_repository():
- return self.latest_file_info.url.startswith(
- 'https://wikitravel.org/upload/shared/')
- # default to commons
- return self.latest_file_info.url.startswith(
- 'https://upload.wikimedia.org/wikipedia/commons/')
-
- def getFileVersionHistoryTable(self):
- """Return the version history in the form of a wiki table."""
- lines = []
- for info in self.get_file_history().values():
- dimension = '{width}×{height} px ({size} bytes)'.format(
- **info.__dict__)
- lines.append('| {timestamp} || {user} || {dimension} |'
- '| <nowiki>{comment}</nowiki>'
- .format(dimension=dimension, **info.__dict__))
- return ('{| class="wikitable"\n'
- '! {{int:filehist-datetime}} || {{int:filehist-user}} |'
- '| {{int:filehist-dimensions}} || {{int:filehist-comment}}\n'
- '|-\n%s\n|}\n' % '\n|-\n'.join(lines))
-
- def usingPages(self, total: Optional[int] = None, content: bool = False):
- """Yield Pages on which the file is displayed.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, load the current content of each iterated page
- (default False)
- """
- return self.site.imageusage(self, total=total, content=content)
-
- def upload(self, source: str, **kwargs) -> bool:
- """
- Upload this file to the wiki.
-
- keyword arguments are from site.upload() method.
-
- :param source: Path or URL to the file to be uploaded.
-
- :keyword comment: Edit summary; if this is not provided, then
- filepage.text will be used. An empty summary is not permitted.
- This may also serve as the initial page text (see below).
-        :keyword text: Initial page text; if this is not set, then
-            filepage.text will be used, falling back to the comment.
- :keyword watch: If true, add filepage to the bot user's watchlist
-        :keyword ignore_warnings: It may be a static boolean, a callable
-            returning a boolean, or an iterable. The callable gets a list of
-            UploadError instances; the iterable should contain the warning
-            codes for which an equivalent callable would return True if all
-            UploadError codes are in that list. If the result is False the
-            upload is aborted; otherwise any warning is disabled and the
-            upload is reattempted. NOTE: If report_success is True or None,
-            an UploadError exception is raised if the static boolean is
-            False.
- :type ignore_warnings: bool or callable or iterable of str
-        :keyword chunk_size: The chunk size in bytes for chunked uploading
-            (see https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading).
-            Chunks are only used if the chunk size is positive but lower
-            than the file size.
- :type chunk_size: int
-        :keyword report_success: If the upload was successful it'll print a
-            success message, and if ignore_warnings is set to False it'll
-            raise an UploadError if a warning occurred. If it's None
-            (default) it'll be True if ignore_warnings is a bool and False
-            otherwise. If it's True or None, ignore_warnings must be a bool.
- :return: It returns True if the upload was successful and False
- otherwise.
- """
- filename = url = None
- if '://' in source:
- url = source
- else:
- filename = source
- return self.site.upload(self, source_filename=filename, source_url=url,
- **kwargs)
-
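A hedged upload sketch; the target title and the local path are
hypothetical, and real uploads need suitable rights on the wiki::

    target = pywikibot.FilePage(site, 'File:New image.jpg')  # hypothetical
    ok = target.upload('/tmp/new-image.jpg',                 # hypothetical
                       comment='Initial upload',
                       ignore_warnings=False)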
- def download(self, filename=None, chunk_size=100 * 1024, revision=None):
- """
- Download to filename file of FilePage.
-
- :param filename: filename where to save file:
- None: self.title(as_filename=True, with_ns=False)
- will be used
- str: provided filename will be used.
- :type filename: None or str
- :param chunk_size: the size of each chunk to be received and
- written to file.
- :type chunk_size: int
- :param revision: file revision to download:
- None: self.latest_file_info will be used
- FileInfo: provided revision will be used.
- :type revision: None or FileInfo
- :return: True if download is successful, False otherwise.
- :raise IOError: if filename cannot be written for any reason.
- """
- if filename is None:
- filename = self.title(as_filename=True, with_ns=False)
-
- filename = os.path.expanduser(filename)
-
- if revision is None:
- revision = self.latest_file_info
-
- req = http.fetch(revision.url, stream=True)
- if req.status_code == HTTPStatus.OK:
- try:
- with open(filename, 'wb') as f:
- for chunk in req.iter_content(chunk_size):
- f.write(chunk)
- except OSError as e:
- raise e
-
- sha1 = compute_file_hash(filename)
- return sha1 == revision.sha1
- pywikibot.warning(
-            'Unsuccessful request ({}): {}'
- .format(req.status_code, req.url))
- return False
-
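A short sketch of downloading the latest file revision and relying on the
SHA1 check described above::

    ok = file_page.download()  # saved under the title without namespace
    if not ok:
        print('download failed or checksum mismatch')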
- def globalusage(self, total=None):
- """
- Iterate all global usage for this page.
-
- :param total: iterate no more than this number of pages in total
- :return: a generator that yields Pages also on sites different from
- self.site.
- :rtype: generator
- """
- return self.site.globalusage(self, total=total)
-
- def data_item(self):
- """
- Convenience function to get the associated Wikibase item of the file.
-
- If WikibaseMediaInfo extension is available (e.g. on Commons),
- the method returns the associated mediainfo entity. Otherwise,
- it falls back to behavior of BasePage.data_item.
-
- .. versionadded:: 6.5
-
- :rtype: pywikibot.page.WikibaseEntity
- """
- if self.site.has_extension('WikibaseMediaInfo'):
- if not hasattr(self, '_item'):
- self._item = MediaInfo(self.site)
- self._item._file = self
- return self._item
-
- return super().data_item()
-
-
-class Category(Page):
-
- """A page in the Category: namespace."""
-
- def __init__(self, source, title: str = '', sort_key=None) -> None:
- """
- Initializer.
-
- All parameters are the same as for Page() Initializer.
- """
- self.sortKey = sort_key
- super().__init__(source, title, ns=14)
- if self.namespace() != 14:
- raise ValueError("'{}' is not in the category namespace!"
- .format(self.title()))
-
- def aslink(self, sort_key: Optional[str] = None) -> str:
- """
- Return a link to place a page in this Category.
-
- Use this only to generate a "true" category link, not for interwikis
- or text links to category pages.
-
- :param sort_key: The sort key for the article to be placed in this
- Category; if omitted, default sort key is used.
- """
- key = sort_key or self.sortKey
- if key is not None:
- title_with_sort_key = self.title(with_section=False) + '|' + key
- else:
- title_with_sort_key = self.title(with_section=False)
- return '[[{}]]'.format(title_with_sort_key)
-
- def subcategories(self,
- recurse: Union[int, bool] = False,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate all subcategories of the current category.
-
- :param recurse: if not False or 0, also iterate subcategories of
- subcategories. If an int, limit recursion to this number of
- levels. (Example: recurse=1 will iterate direct subcats and
- first-level sub-sub-cats, but no deeper.)
- :param total: iterate no more than this number of
- subcategories in total (at all levels)
- :param content: if True, retrieve the content of the current version
- of each category description page (default False)
- """
- if not isinstance(recurse, bool) and recurse:
- recurse = recurse - 1
- if not hasattr(self, '_subcats'):
- self._subcats = []
- for member in self.site.categorymembers(
- self, member_type='subcat', total=total, content=content):
- subcat = Category(member)
- self._subcats.append(subcat)
- yield subcat
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- for item in subcat.subcategories(
- recurse, total=total, content=content):
- yield item
- if total is not None:
- total -= 1
- if total == 0:
- return
- else:
- for subcat in self._subcats:
- yield subcat
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- for item in subcat.subcategories(
- recurse, total=total, content=content):
- yield item
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- def articles(self,
- recurse: Union[int, bool] = False,
- total: Optional[int] = None,
- content: bool = False,
- namespaces: Union[int, List[int]] = None,
- sortby: Optional[str] = None,
- reverse: bool = False,
- starttime=None, endtime=None,
- startprefix: Optional[str] = None,
- endprefix: Optional[str] = None):
- """
- Yield all articles in the current category.
-
- By default, yields all *pages* in the category that are not
- subcategories!
-
- :param recurse: if not False or 0, also iterate articles in
- subcategories. If an int, limit recursion to this number of
- levels. (Example: recurse=1 will iterate articles in first-level
- subcats, but no deeper.)
- :param total: iterate no more than this number of pages in
- total (at all levels)
- :param namespaces: only yield pages in the specified namespaces
- :param content: if True, retrieve the content of the current version
- of each page (default False)
- :param sortby: determines the order in which results are generated,
- valid values are "sortkey" (default, results ordered by category
- sort key) or "timestamp" (results ordered by time page was
- added to the category). This applies recursively.
- :param reverse: if True, generate results in reverse order
- (default False)
- :param starttime: if provided, only generate pages added after this
- time; not valid unless sortby="timestamp"
- :type starttime: pywikibot.Timestamp
- :param endtime: if provided, only generate pages added before this
- time; not valid unless sortby="timestamp"
- :type endtime: pywikibot.Timestamp
- :param startprefix: if provided, only generate pages >= this title
- lexically; not valid if sortby="timestamp"
- :param endprefix: if provided, only generate pages < this title
- lexically; not valid if sortby="timestamp"
- :rtype: typing.Iterable[pywikibot.Page]
- """
- seen = set()
- for member in self.site.categorymembers(self,
- namespaces=namespaces,
- total=total,
- content=content,
- sortby=sortby,
- reverse=reverse,
- starttime=starttime,
- endtime=endtime,
- startprefix=startprefix,
- endprefix=endprefix,
- member_type=['page', 'file']):
- if recurse:
- seen.add(hash(member))
- yield member
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- if recurse:
- if not isinstance(recurse, bool) and recurse:
- recurse -= 1
- for subcat in self.subcategories():
- for article in subcat.articles(recurse=recurse,
- total=total,
- content=content,
- namespaces=namespaces,
- sortby=sortby,
- reverse=reverse,
- starttime=starttime,
- endtime=endtime,
- startprefix=startprefix,
- endprefix=endprefix):
- hash_value = hash(article)
- if hash_value in seen:
- continue
- seen.add(hash_value)
- yield article
- if total is not None:
- total -= 1
- if total == 0:
- return
-
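A sketch of the recursive iteration described above; the category title
is hypothetical::

    cat = pywikibot.Category(site, 'Category:Example')  # hypothetical
    for article in cat.articles(recurse=1, namespaces=0, total=20):
        print(article.title())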
- def members(self, recurse: bool = False,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """Yield all category contents (subcats, pages, and files).
-
- :rtype: typing.Iterable[pywikibot.Page]
- """
- for member in self.site.categorymembers(
- self, namespaces=namespaces, total=total, content=content):
- yield member
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- if not isinstance(recurse, bool) and recurse:
- recurse = recurse - 1
- for subcat in self.subcategories():
- for article in subcat.members(
- recurse, namespaces, total=total, content=content):
- yield article
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- def isEmptyCategory(self) -> bool:
- """Return True if category has no members (including subcategories)."""
- ci = self.categoryinfo
- return sum(ci[k] for k in ['files', 'pages', 'subcats']) == 0
-
- def isHiddenCategory(self) -> bool:
- """Return True if the category is hidden."""
- return 'hiddencat' in self.properties()
-
- @property
- def categoryinfo(self) -> dict:
- """
- Return a dict containing information about the category.
-
- The dict contains values for:
-
- Numbers of pages, subcategories, files, and total contents.
- """
- return self.site.categoryinfo(self)
-
- def newest_pages(self, total=None):
- """
- Return pages in a category ordered by the creation date.
-
- If two or more pages are created at the same time, the pages are
- returned in the order they were added to the category. The most
- recently added page is returned first.
-
-        Pages can only be returned ordered from newest to oldest, as it
-        is impossible to determine the oldest page in a category without
-        checking all pages. It is, however, possible to check the category
-        in order with the newly added pages first, yielding all pages
-        created after the currently checked page was added (and thus no
-        page created after any of the cached pages was added before the
-        currently checked one).
-
- :param total: The total number of pages queried.
- :type total: int
- :return: A page generator of all pages in a category ordered by the
- creation date. From newest to oldest. Note: It currently only
- returns Page instances and not a subclass of it if possible. This
- might change so don't expect to only get Page instances.
- :rtype: generator
- """
- def check_cache(latest):
- """Return the cached pages in order and not more than total."""
- cached = []
- for timestamp in sorted((ts for ts in cache if ts > latest),
- reverse=True):
-                # The complete list can be removed; it'll either yield all
-                # of them or only a portion, but skips the rest anyway
- cached += cache.pop(timestamp)[:None if total is None else
- total - len(cached)]
- if total and len(cached) >= total:
- break # already got enough
- assert total is None or len(cached) <= total, \
- 'Number of caches is more than total number requested'
- return cached
-
-        # Cache of all pages which have been checked but were created
-        # before the current page was added; at some point they will have
-        # been created after the current page was added. Pages are saved
-        # by their creation timestamp; be prepared for multiple pages.
- cache = defaultdict(list)
-        # TODO: Make site.categorymembers usable as it returns pages
- # There is no total defined, as it's not known how many pages need to
- # be checked before the total amount of new pages was found. In worst
- # case all pages of a category need to be checked.
- for member in pywikibot.data.api.QueryGenerator(
- site=self.site, parameters={
- 'list': 'categorymembers', 'cmsort': 'timestamp',
- 'cmdir': 'older', 'cmprop': 'timestamp|title',
- 'cmtitle': self.title()}):
- # TODO: Upcast to suitable class
- page = pywikibot.Page(self.site, member['title'])
- assert page.namespace() == member['ns'], \
- 'Namespace of the page is not consistent'
- cached = check_cache(pywikibot.Timestamp.fromISOformat(
- member['timestamp']))
- yield from cached
- if total is not None:
- total -= len(cached)
- if total <= 0:
- break
- cache[page.oldest_revision.timestamp] += [page]
- else:
- # clear cache
- assert total is None or total > 0, \
- 'As many items as given in total already returned'
- yield from check_cache(pywikibot.Timestamp.min)
-
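Usage sketch for the newest-first iteration described above, reusing the
hypothetical category::

    for recent in cat.newest_pages(total=5):
        print(recent.title())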
-
-class User(Page):
-
- """
- A class that represents a Wiki user.
-
- This class also represents the Wiki page User:<username>
- """
-
- def __init__(self, source, title: str = '') -> None:
- """
- Initializer for a User object.
-
- All parameters are the same as for Page() Initializer.
- """
- self._isAutoblock = True
- if title.startswith('#'):
- title = title[1:]
- elif ':#' in title:
- title = title.replace(':#', ':')
- else:
- self._isAutoblock = False
- super().__init__(source, title, ns=2)
- if self.namespace() != 2:
- raise ValueError("'{}' is not in the user namespace!"
- .format(self.title()))
- if self._isAutoblock:
-            # This user is probably being queried for the purpose of
-            # lifting an autoblock.
-            pywikibot.output(
-                'This is an autoblock ID; it can only be used to unblock.')
-
- @property
- def username(self) -> str:
- """
- The username.
-
- Convenience method that returns the title of the page with
- namespace prefix omitted, which is the username.
- """
- if self._isAutoblock:
- return '#' + self.title(with_ns=False)
- return self.title(with_ns=False)
-
- def isRegistered(self, force: bool = False) -> bool:
- """
- Determine if the user is registered on the site.
-
- It is possible to have a page named User:xyz and not have
- a corresponding user with username xyz.
-
- The page does not need to exist for this method to return
- True.
-
- :param force: if True, forces reloading the data from API
- """
- # T135828: the registration timestamp may be None but the key exists
- return (not self.isAnonymous()
- and 'registration' in self.getprops(force))
-
- def isAnonymous(self) -> bool:
- """Determine if the user is editing as an IP address."""
- return is_ip_address(self.username)
-
- def getprops(self, force: bool = False) -> dict:
- """
-        Return a dict of properties about the user.
-
- :param force: if True, forces reloading the data from API
- """
- if force and hasattr(self, '_userprops'):
- del self._userprops
- if not hasattr(self, '_userprops'):
- self._userprops = list(self.site.users([self.username, ]))[0]
- if self.isAnonymous():
- r = list(self.site.blocks(iprange=self.username, total=1))
- if r:
- self._userprops['blockedby'] = r[0]['by']
- self._userprops['blockreason'] = r[0]['reason']
- return self._userprops
-
- def registration(self, force: bool = False):
- """
- Fetch registration date for this user.
-
- :param force: if True, forces reloading the data from API
- :rtype: pywikibot.Timestamp or None
- """
- if not self.isAnonymous():
- reg = self.getprops(force).get('registration')
- if reg:
- return pywikibot.Timestamp.fromISOformat(reg)
- return None
-
- def editCount(self, force: bool = False) -> int:
- """
- Return edit count for a registered user.
-
- Always returns 0 for 'anonymous' users.
-
- :param force: if True, forces reloading the data from API
- """
- return self.getprops(force).get('editcount', 0)
-
- def is_blocked(self, force: bool = False) -> bool:
- """Determine whether the user is currently blocked.
-
- .. versionchanged:: 7.0
- renamed from :meth:`isBlocked` method,
- can also detect range blocks.
-
- :param force: if True, forces reloading the data from API
- """
- return 'blockedby' in self.getprops(force)
-
- @deprecated('is_blocked', since='7.0.0')
- def isBlocked(self, force: bool = False) -> bool:
- """Determine whether the user is currently blocked.
-
- .. deprecated:: 7.0
- use :meth:`is_blocked` instead
-
- :param force: if True, forces reloading the data from API
- """
- return self.is_blocked(force)
-
- def is_locked(self, force: bool = False) -> bool:
- """Determine whether the user is currently locked globally.
-
- .. versionadded:: 7.0
-
- :param force: if True, forces reloading the data from API
- """
- return self.site.is_locked(self.username, force)
-
- def isEmailable(self, force: bool = False) -> bool:
- """
-        Determine whether emails may be sent to this user through MediaWiki.
-
- :param force: if True, forces reloading the data from API
- """
- return not self.isAnonymous() and 'emailable' in self.getprops(force)
-
- def groups(self, force: bool = False) -> list:
- """
- Return a list of groups to which this user belongs.
-
- The list of groups may be empty.
-
- :param force: if True, forces reloading the data from API
- :return: groups property
- """
- return self.getprops(force).get('groups', [])
-
- def gender(self, force: bool = False) -> str:
- """Return the gender of the user.
-
- :param force: if True, forces reloading the data from API
- :return: return 'male', 'female', or 'unknown'
- """
- if self.isAnonymous():
- return 'unknown'
- return self.getprops(force).get('gender', 'unknown')
-
- def rights(self, force: bool = False) -> list:
- """Return user rights.
-
- :param force: if True, forces reloading the data from API
- :return: return user rights
- """
- return self.getprops(force).get('rights', [])
-
- def getUserPage(self, subpage: str = ''):
- """
- Return a Page object relative to this user's main page.
-
- :param subpage: subpage part to be appended to the main
- page title (optional)
- :type subpage: str
- :return: Page object of user page or user subpage
- :rtype: pywikibot.Page
- """
- if self._isAutoblock:
-            # This user is probably being queried for the purpose of
-            # lifting an autoblock, so has no user pages per se.
-            raise AutoblockUserError(
-                'This is an autoblock ID; it can only be used to unblock.')
- if subpage:
- subpage = '/' + subpage
- return Page(Link(self.title() + subpage, self.site))
-
- def getUserTalkPage(self, subpage: str = ''):
- """
- Return a Page object relative to this user's main talk page.
-
- :param subpage: subpage part to be appended to the main
- talk page title (optional)
- :type subpage: str
- :return: Page object of user talk page or user talk subpage
- :rtype: pywikibot.Page
- """
- if self._isAutoblock:
-            # This user is probably being queried for the purpose of
-            # lifting an autoblock, so has no user talk pages per se.
-            raise AutoblockUserError(
-                'This is an autoblock ID; it can only be used to unblock.')
- if subpage:
- subpage = '/' + subpage
- return Page(Link(self.username + subpage,
- self.site, default_namespace=3))
-
- def send_email(self, subject: str, text: str, ccme: bool = False) -> bool:
- """
- Send an email to this user via MediaWiki's email interface.
-
- :param subject: the subject header of the mail
- :param text: mail body
- :param ccme: if True, sends a copy of this email to the bot
- :raises NotEmailableError: the user of this User is not emailable
- :raises UserRightsError: logged in user does not have 'sendemail' right
- :return: operation successful indicator
- """
- if not self.isEmailable():
- raise NotEmailableError(self)
-
- if not self.site.has_right('sendemail'):
- raise UserRightsError("You don't have permission to send mail")
-
- params = {
- 'action': 'emailuser',
- 'target': self.username,
- 'token': self.site.tokens['email'],
- 'subject': subject,
- 'text': text,
- }
- if ccme:
- params['ccme'] = 1
- mailrequest = self.site.simple_request(**params)
- maildata = mailrequest.submit()
-
- if 'emailuser' in maildata:
- if maildata['emailuser']['result'] == 'Success':
- return True
- return False
-
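A hedged sketch; the username, subject and body are hypothetical, and
NotEmailableError is imported from pywikibot.exceptions::

    from pywikibot.exceptions import NotEmailableError

    user = pywikibot.User(site, 'Example user')  # hypothetical username
    try:
        sent = user.send_email('Hello', 'Message body', ccme=True)
    except NotEmailableError:
        sent = False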
- def block(self, *args, **kwargs):
- """
- Block user.
-
-        Refer to the :py:obj:`APISite.blockuser` method for parameters.
-
- :return: None
- """
- try:
- self.site.blockuser(self, *args, **kwargs)
- except APIError as err:
- if err.code == 'invalidrange':
- raise ValueError('{} is not a valid IP range.'
- .format(self.username))
-
- raise err
-
- def unblock(self, reason: Optional[str] = None) -> None:
- """
- Remove the block for the user.
-
- :param reason: Reason for the unblock.
- """
- self.site.unblockuser(self, reason)
-
- def logevents(self, **kwargs):
- """Yield user activities.
-
- :keyword logtype: only iterate entries of this type
- (see mediawiki api documentation for available types)
- :type logtype: str
- :keyword page: only iterate entries affecting this page
- :type page: Page or str
- :keyword namespace: namespace to retrieve logevents from
- :type namespace: int or Namespace
- :keyword start: only iterate entries from and after this Timestamp
- :type start: Timestamp or ISO date string
- :keyword end: only iterate entries up to and through this Timestamp
- :type end: Timestamp or ISO date string
- :keyword reverse: if True, iterate oldest entries first
- (default: newest)
- :type reverse: bool
- :keyword tag: only iterate entries tagged with this tag
- :type tag: str
- :keyword total: maximum number of events to iterate
- :type total: int
- :rtype: iterable
- """
- return self.site.logevents(user=self.username, **kwargs)
-
- @property
- def last_event(self):
- """Return last user activity.
-
- :return: last user log entry
- :rtype: LogEntry or None
- """
- return next(iter(self.logevents(total=1)), None)
-
- def contributions(self, total: int = 500, **kwargs) -> tuple:
- """
-        Yield tuples describing this user's edits.
-
- Each tuple is composed of a pywikibot.Page object,
- the revision id (int), the edit timestamp (as a pywikibot.Timestamp
- object), and the comment (str).
- Pages returned are not guaranteed to be unique.
-
- :param total: limit result to this number of pages
- :keyword start: Iterate contributions starting at this Timestamp
- :keyword end: Iterate contributions ending at this Timestamp
- :keyword reverse: Iterate oldest contributions first (default: newest)
- :keyword namespaces: only iterate pages in these namespaces
- :type namespaces: iterable of str or Namespace key,
- or a single instance of those types. May be a '|' separated
- list of namespace identifiers.
- :keyword showMinor: if True, iterate only minor edits; if False and
- not None, iterate only non-minor edits (default: iterate both)
- :keyword top_only: if True, iterate only edits which are the latest
- revision (default: False)
- :return: tuple of pywikibot.Page, revid, pywikibot.Timestamp, comment
- """
- for contrib in self.site.usercontribs(
- user=self.username, total=total, **kwargs):
- ts = pywikibot.Timestamp.fromISOformat(contrib['timestamp'])
- yield (Page(self.site, contrib['title'], contrib['ns']),
- contrib['revid'],
- ts,
- contrib.get('comment'))
-
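A sketch of unpacking the 4-tuples yielded by contributions(), reusing
the hypothetical user from above::

    for contrib_page, revid, timestamp, comment in user.contributions(
            total=5):
        print(timestamp, contrib_page.title(), comment)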
- @property
- def first_edit(self):
- """Return first user contribution.
-
-        :return: first user contribution entry as a tuple of
-            pywikibot.Page, revid, pywikibot.Timestamp and comment
-        :rtype: tuple or None
- """
- return next(self.contributions(reverse=True, total=1), None)
-
- @property
- def last_edit(self):
- """Return last user contribution.
-
-        :return: last user contribution entry as a tuple of
-            pywikibot.Page, revid, pywikibot.Timestamp and comment
-        :rtype: tuple or None
- """
- return next(self.contributions(total=1), None)
-
- def deleted_contributions(
- self, *, total: int = 500, **kwargs
- ) -> Iterable[Tuple[Page, Revision]]:
- """Yield tuples describing this user's deleted edits.
-
- .. versionadded:: 5.5
-
- :param total: Limit results to this number of pages
- :keyword start: Iterate contributions starting at this Timestamp
- :keyword end: Iterate contributions ending at this Timestamp
- :keyword reverse: Iterate oldest contributions first (default: newest)
- :keyword namespaces: Only iterate pages in these namespaces
- """
- for data in self.site.alldeletedrevisions(user=self.username,
- total=total, **kwargs):
- page = Page(self.site, data['title'], data['ns'])
- for contrib in data['revisions']:
- yield page, Revision(**contrib)
-
- def uploadedImages(self, total=10):
- """
- Yield tuples describing files uploaded by this user.
-
- Each tuple is composed of a pywikibot.Page, the timestamp (str in
- ISO8601 format), comment (str) and a bool for pageid > 0.
- Pages returned are not guaranteed to be unique.
-
- :param total: limit result to this number of pages
- :type total: int
- """
- if not self.isRegistered():
- return
- for item in self.logevents(logtype='upload', total=total):
- yield (item.page(),
- str(item.timestamp()),
- item.comment(),
- item.pageid() > 0)
-
- @property
- def is_thankable(self) -> bool:
- """
- Determine if the user has thanks notifications enabled.
-
-        NOTE: This doesn't accurately determine if thanks is enabled for
-        the user. Privacy of thanks preferences is under discussion; see
-        https://phabricator.wikimedia.org/T57401#2216861 and
-        https://phabricator.wikimedia.org/T120753#1863894
- """
- return self.isRegistered() and 'bot' not in self.groups()
-
class WikibaseEntity:
@@ -3810,7 +547,7 @@
data = WikibaseEntity.get(self, force=force)
except NoWikibaseEntityError:
if lazy_loading_id:
- p = Page(self._site, self._title)
+ p = pywikibot.Page(self._site, self._title)
if not p.exists():
raise NoPageError(p)
# todo: raise a nicer exception here (T87345)
@@ -5119,38 +1856,3 @@
'value': self._formatValue(),
'type': self.value_types.get(self.type, self.type)
}
-
-
-class FileInfo:
-
- """
- A structure holding imageinfo of latest rev. of FilePage.
-
- All keys of API imageinfo dictionary are mapped to FileInfo attributes.
-    Attributes can be retrieved either as self['key'] or as self.key.
-
- Following attributes will be returned:
- - timestamp, user, comment, url, size, sha1, mime, metadata
- - archivename (not for latest revision)
-
- See Site.loadimageinfo() for details.
-
-    Note: timestamp will be cast to pywikibot.Timestamp.
- """
-
- def __init__(self, file_revision) -> None:
- """Initiate the class using the dict from L{APISite.loadimageinfo}."""
- self.__dict__.update(file_revision)
- self.timestamp = pywikibot.Timestamp.fromISOformat(self.timestamp)
-
- def __getitem__(self, key):
- """Give access to class values by key."""
- return getattr(self, key)
-
- def __repr__(self) -> str:
- """Return a more complete string representation."""
- return repr(self.__dict__)
-
- def __eq__(self, other):
- """Test if two File_info objects are equal."""
- return self.__dict__ == other.__dict__
diff --git a/tox.ini b/tox.ini
index 12254a4..4e33252 100644
--- a/tox.ini
+++ b/tox.ini
@@ -148,8 +148,9 @@
pywikibot/fixes.py: E241
pywikibot/interwiki_graph.py: N802, N803, N806
pywikibot/login.py: N802, N816
- pywikibot/page/_basepage.py: N802
pywikibot/page/_collections.py: N802
+ pywikibot/page/_pages.py: N802
+ pywikibot/page/_wikibase.py: N802
pywikibot/pagegenerators.py: N802, N803, N806, N816
pywikibot/scripts/generate_family_file.py: T001
pywikibot/site/_datasite.py: N802
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771987
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: If7a3d04f61f7b9afbc8e9cf5209c779805b34d23
Gerrit-Change-Number: 771987
Gerrit-PatchSet: 7
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/772012 )
Change subject: [IMPR] Rename _basepage and _category to split page into parts
......................................................................
[IMPR] Rename _basepage and _category to split page into parts
Change-Id: I09959ee747a39aafa3b7cb5084e69df42619c674
---
D pywikibot/page/_category.py
C pywikibot/page/_pages.py
R pywikibot/page/_wikibase.py
3 files changed, 0 insertions(+), 5,156 deletions(-)
Approvals:
Xqt: Verified; Looks good to me, approved
diff --git a/pywikibot/page/_category.py b/pywikibot/page/_category.py
deleted file mode 100644
index 6dfe675..0000000
--- a/pywikibot/page/_category.py
+++ /dev/null
@@ -1,5156 +0,0 @@
-"""
-Objects representing various types of MediaWiki pages, including Wikibase pages.
-
-This module also includes objects:
-
-* Property: a type of semantic data.
-* Claim: an instance of a semantic assertion.
-* Revision: a single change to a wiki page.
-* FileInfo: a structure holding imageinfo of latest rev. of FilePage
-"""
-#
-# (C) Pywikibot team, 2008-2022
-#
-# Distributed under the terms of the MIT license.
-#
-import json as jsonlib
-import logging
-import os.path
-import re
-from collections import Counter, OrderedDict, defaultdict
-from contextlib import suppress
-from http import HTTPStatus
-from itertools import chain
-from textwrap import shorten, wrap
-from typing import Any, Optional, Union
-from urllib.parse import quote_from_bytes
-from warnings import warn
-
-import pywikibot
-from pywikibot import config, date, i18n, textlib
-from pywikibot.backports import Dict, Generator, Iterable, List, Tuple
-from pywikibot.comms import http
-from pywikibot.cosmetic_changes import CANCEL, CosmeticChangesToolkit
-from pywikibot.exceptions import (
- APIError,
- AutoblockUserError,
- EntityTypeUnknownError,
- Error,
- InterwikiRedirectPageError,
- InvalidPageError,
- InvalidTitleError,
- IsNotRedirectPageError,
- IsRedirectPageError,
- NoMoveTargetError,
- NoPageError,
- NotEmailableError,
- NoUsernameError,
- NoWikibaseEntityError,
- OtherPageSaveError,
- PageSaveRelatedError,
- SectionError,
- UnknownExtensionError,
- UserRightsError,
- WikiBaseError,
-)
-from pywikibot.family import Family
-from pywikibot.page._collections import (
- AliasesDict,
- ClaimCollection,
- LanguageDict,
- SiteLinkCollection,
-)
-from pywikibot.page._decorators import allow_asynchronous
-from pywikibot.page._links import BaseLink, Link
-from pywikibot.page._revision import Revision
-from pywikibot.site import DataSite, Namespace, NamespaceArgType
-from pywikibot.tools import (
- ComparableMixin,
- compute_file_hash,
- deprecated,
- first_upper,
- is_ip_address,
- issue_deprecation_warning,
- remove_last_args,
-)
-
-
-PROTOCOL_REGEX = r'\Ahttps?://'
-
-__all__ = (
- 'BasePage',
- 'Category',
- 'Claim',
- 'FileInfo',
- 'FilePage',
- 'ItemPage',
- 'MediaInfo',
- 'Page',
- 'Property',
- 'PropertyPage',
- 'User',
- 'WikibaseEntity',
- 'WikibasePage',
-)
-
-logger = logging.getLogger('pywiki.wiki.page')
-
-
-# Note: Link objects (defined later on) represent a wiki-page's title, while
-# Page objects (defined here) represent the page itself, including its
-# contents.
-
-class BasePage(ComparableMixin):
-
- """
- BasePage: Base object for a MediaWiki page.
-
- This object only implements internally methods that do not require
- reading from or writing to the wiki. All other methods are delegated
- to the Site object.
-
- Will be subclassed by Page, WikibasePage, and FlowPage.
- """
-
- _cache_attrs = (
- '_text', '_pageid', '_catinfo', '_templates', '_protection',
- '_contentmodel', '_langlinks', '_isredir', '_coords',
- '_preloadedtext', '_timestamp', '_applicable_protections',
- '_flowinfo', '_quality', '_pageprops', '_revid', '_quality_text',
- '_pageimage', '_item', '_lintinfo',
- )
-
- def __init__(self, source, title: str = '', ns=0) -> None:
- """
- Instantiate a Page object.
-
- Three calling formats are supported:
-
- - If the first argument is a Page, create a copy of that object.
- This can be used to convert an existing Page into a subclass
- object, such as Category or FilePage. (If the title is also
- given as the second argument, creates a copy with that title;
- this is used when pages are moved.)
- - If the first argument is a Site, create a Page on that Site
- using the second argument as the title (may include a section),
- and the third as the namespace number. The namespace number is
- mandatory, even if the title includes the namespace prefix. This
- is the preferred syntax when using an already-normalized title
- obtained from api.php or a database dump. WARNING: may produce
- invalid objects if page title isn't in normal form!
- - If the first argument is a BaseLink, create a Page from that link.
- This is the preferred syntax when using a title scraped from
- wikitext, URLs, or another non-normalized source.
-
- :param source: the source of the page
- :type source: pywikibot.page.BaseLink (or subclass),
- pywikibot.page.Page (or subclass), or pywikibot.page.Site
- :param title: normalized title of the page; required if source is a
- Site, ignored otherwise
- :type title: str
- :param ns: namespace number; required if source is a Site, ignored
- otherwise
- :type ns: int
- """
- if title is None:
- raise ValueError('Title cannot be None.')
-
- if isinstance(source, pywikibot.site.BaseSite):
- self._link = Link(title, source=source, default_namespace=ns)
- self._revisions = {}
- elif isinstance(source, Page):
- # copy all of source's attributes to this object
- # without overwriting non-None values
- self.__dict__.update((k, v) for k, v in source.__dict__.items()
- if k not in self.__dict__
- or self.__dict__[k] is None)
- if title:
- # overwrite title
- self._link = Link(title, source=source.site,
- default_namespace=ns)
- elif isinstance(source, BaseLink):
- self._link = source
- self._revisions = {}
- else:
- raise Error(
- "Invalid argument type '{}' in Page initializer: {}"
- .format(type(source), source))
-
- @property
- def site(self):
- """Return the Site object for the wiki on which this Page resides.
-
- :rtype: pywikibot.Site
- """
- return self._link.site
-
- def version(self):
- """
- Return MediaWiki version number of the page site.
-
- This is needed to use @need_version() decorator for methods of
- Page objects.
- """
- return self.site.version()
-
- @property
- def image_repository(self):
- """Return the Site object for the image repository."""
- return self.site.image_repository()
-
- @property
- def data_repository(self):
- """Return the Site object for the data repository."""
- return self.site.data_repository()
-
- def namespace(self):
- """
- Return the namespace of the page.
-
- :return: namespace of the page
- :rtype: pywikibot.Namespace
- """
- return self._link.namespace
-
- @property
- def content_model(self):
- """
- Return the content model for this page.
-
- If it cannot be reliably determined via the API,
- None is returned.
- """
- if not hasattr(self, '_contentmodel'):
- self.site.loadpageinfo(self)
- return self._contentmodel
-
- @property
- def depth(self):
- """Return the depth/subpage level of the page."""
- if not hasattr(self, '_depth'):
- # Check if the namespace allows subpages
- if self.namespace().subpages:
- self._depth = self.title().count('/')
- else:
- # Does not allow subpages, which means depth is always 0
- self._depth = 0
-
- return self._depth
-
- @property
- def pageid(self) -> int:
- """
- Return pageid of the page.
-
- :return: pageid or 0 if page does not exist
- """
- if not hasattr(self, '_pageid'):
- self.site.loadpageinfo(self)
- return self._pageid
-
- def title(
- self,
- *,
- underscore: bool = False,
- with_ns: bool = True,
- with_section: bool = True,
- as_url: bool = False,
- as_link: bool = False,
- allow_interwiki: bool = True,
- force_interwiki: bool = False,
- textlink: bool = False,
- as_filename: bool = False,
- insite=None,
- without_brackets: bool = False
- ) -> str:
- """
- Return the title of this Page, as a string.
-
- :param underscore: (not used with as_link) if true, replace all ' '
- characters with '_'
- :param with_ns: if false, omit the namespace prefix. If this
- option is false and used together with as_link, return a labeled
- link like [[link|label]]
- :param with_section: if false, omit the section
- :param as_url: (not used with as_link) if true, quote title as if in
- a URL
- :param as_link: if true, return the title in the form of a wikilink
- :param allow_interwiki: (only used if as_link is true) if true, format
- the link as an interwiki link if necessary
- :param force_interwiki: (only used if as_link is true) if true, always
- format the link as an interwiki link
- :param textlink: (only used if as_link is true) if true, place a ':'
- before Category: and Image: links
- :param as_filename: (not used with as_link) if true, replace any
- characters that are unsafe in filenames
- :param insite: (only used if as_link is true) a site object where the
- title is to be shown. Default is the current family/lang given by
- -family and -lang or -site option i.e. config.family and
- config.mylang
- :param without_brackets: (cannot be used with as_link) if true, remove
- the last pair of brackets (usually disambiguation brackets).
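-
- Usage sketch (hypothetical page)::
-
- page = pywikibot.Page(site, 'Talk:Foo bar')
- page.title() # 'Talk:Foo bar'
- page.title(with_ns=False) # 'Foo bar'
- page.title(as_link=True) # '[[Talk:Foo bar]]'
- page.title(underscore=True) # 'Talk:Foo_bar'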
- """
- title = self._link.canonical_title()
- label = self._link.title
- if with_section and self.section():
- section = '#' + self.section()
- else:
- section = ''
- if as_link:
- if insite:
- target_code = insite.code
- target_family = insite.family.name
- else:
- target_code = config.mylang
- target_family = config.family
- if force_interwiki \
- or (allow_interwiki
- and (self.site.family.name != target_family
- or self.site.code != target_code)):
- if self.site.family.name not in (
- target_family, self.site.code):
- title = '{site.family.name}:{site.code}:{title}'.format(
- site=self.site, title=title)
- else:
- # use this form for sites like commons, where the
- # code is the same as the family name
- title = '{}:{}'.format(self.site.code, title)
- elif textlink and (self.is_filepage() or self.is_categorypage()):
- title = ':{}'.format(title)
- elif self.namespace() == 0 and not section:
- with_ns = True
- if with_ns:
- return '[[{}{}]]'.format(title, section)
- return '[[{}{}|{}]]'.format(title, section, label)
- if not with_ns and self.namespace() != 0:
- title = label + section
- else:
- title += section
- if without_brackets:
- brackets_re = r'\s+\([^()]+?\)$'
- title = re.sub(brackets_re, '', title)
- if underscore or as_url:
- title = title.replace(' ', '_')
- if as_url:
- encoded_title = title.encode(self.site.encoding())
- title = quote_from_bytes(encoded_title, safe='')
- if as_filename:
- # Replace characters that are not possible in file names on some
- # systems, but still are valid in MediaWiki titles:
- # Unix: /
- # MediaWiki: /:\
- # Windows: /:\"?*
- # Spaces are possible on most systems, but are bad for URLs.
- for forbidden in ':*?/\\" ':
- title = title.replace(forbidden, '_')
- return title
-
- def section(self) -> Optional[str]:
- """
- Return the name of the section this Page refers to.
-
- The section is the part of the title following a '#' character, if
- any. If no section is present, return None.
- """
- try:
- section = self._link.section
- except AttributeError:
- section = None
- return section
-
- def __str__(self) -> str:
- """Return a string representation."""
- return self.title(as_link=True, force_interwiki=True)
-
- def __repr__(self) -> str:
- """Return a more complete string representation."""
- return '{}({!r})'.format(self.__class__.__name__, self.title())
-
- def _cmpkey(self):
- """
- Key for comparison of Page objects.
-
- Page objects are "equal" if and only if they are on the same site
- and have the same normalized title, including section if any.
-
- Page objects are sortable by site, namespace then title.
- """
- return (self.site, self.namespace(), self.title())
-
- def __hash__(self):
- """
- A stable identifier to be used as a key in hash-tables.
-
- This relies on the fact that the string representation
- of an instance cannot change after construction.
- """
- return hash(self._cmpkey())
-
- def full_url(self):
- """Return the full URL."""
- return self.site.base_url(
- self.site.articlepath.format(self.title(as_url=True)))
-
- def autoFormat(self):
- """
- Return :py:obj:`date.getAutoFormat` dictName and value, if any.
-
- Value can be a year, date, etc., and dictName is 'YearBC',
- 'Year_December', or another dictionary name. Please note that two
- entries may have exactly the same autoFormat, but be in two
- different namespaces, as some sites have categories with the
- same names. Regular titles return (None, None).
- """
- if not hasattr(self, '_autoFormat'):
- self._autoFormat = date.getAutoFormat(
- self.site.lang,
- self.title(with_ns=False)
- )
- return self._autoFormat
-
- def isAutoTitle(self):
- """Return True if title of this Page is in the autoFormat dict."""
- return self.autoFormat()[0] is not None
-
- def get(self, force: bool = False, get_redirect: bool = False) -> str:
- """Return the wiki-text of the page.
-
- This will retrieve the page from the server if it has not been
- retrieved yet, or if force is True. This can raise the following
- exceptions that should be caught by the calling code:
-
- :exception pywikibot.exceptions.NoPageError: The page does not exist
- :exception pywikibot.exceptions.IsRedirectPageError: The page is a
- redirect. The argument of the exception is the title of the page
- it redirects to.
- :exception pywikibot.exceptions.SectionError: The section does not
- exist on a page with a # link
-
- :param force: reload all page attributes, including errors.
- :param get_redirect: return the redirect text, do not follow the
- redirect, do not raise an exception.
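-
- A typical call guards against both error cases (sketch)::
-
- try:
- text = page.get()
- except NoPageError:
- text = ''
- except IsRedirectPageError:
- text = page.getRedirectTarget().get()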
- """
- if force:
- del self.latest_revision_id
- if hasattr(self, '_bot_may_edit'):
- del self._bot_may_edit
- try:
- self._getInternals()
- except IsRedirectPageError:
- if not get_redirect:
- raise
-
- return self.latest_revision.text
-
- def _latest_cached_revision(self):
- """Get the latest revision if cached and has text, otherwise None."""
- if (hasattr(self, '_revid') and self._revid in self._revisions
- and self._revisions[self._revid].text is not None):
- return self._revisions[self._revid]
- return None
-
- def _getInternals(self):
- """
- Helper function for get().
-
- Stores the latest revision in self if it is not already cached.
- * Raises exceptions from previous runs.
- * Stores new exceptions in _getexception and raises them.
- """
- # Raise exceptions from previous runs
- if hasattr(self, '_getexception'):
- raise self._getexception
-
- # If not already stored, fetch revision
- if self._latest_cached_revision() is None:
- try:
- self.site.loadrevisions(self, content=True)
- except (NoPageError, SectionError) as e:
- self._getexception = e
- raise
-
- # self._isredir is set by loadrevisions
- if self._isredir:
- self._getexception = IsRedirectPageError(self)
- raise self._getexception
-
- @remove_last_args(['get_redirect'])
- def getOldVersion(self, oldid, force: bool = False) -> str:
- """Return text of an old revision of this page.
-
- :param oldid: The revid of the revision desired.
- """
- if force or oldid not in self._revisions \
- or self._revisions[oldid].text is None:
- self.site.loadrevisions(self, content=True, revids=oldid)
- return self._revisions[oldid].text
-
- def permalink(self, oldid=None, percent_encoded: bool = True,
- with_protocol: bool = False) -> str:
- """Return the permalink URL of an old revision of this page.
-
- :param oldid: The revid of the revision desired.
- :param percent_encoded: if false, provide the link with the
- title unencoded and spaces replaced by underscores.
- :param with_protocol: if true, http or https prefixes will be
- included before the double slash.
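-
- Example result (hypothetical values)::
-
- page.permalink(with_protocol=True)
- # 'https://en.wikipedia.org/w/index.php?title=Foo_bar&oldid=12345'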
- """
- if percent_encoded:
- title = self.title(as_url=True)
- else:
- title = self.title(as_url=False).replace(' ', '_')
- return '{}//{}{}/index.php?title={}&oldid={}'.format(
- self.site.protocol() + ':' if with_protocol else '',
- self.site.hostname(),
- self.site.scriptpath(),
- title,
- oldid if oldid is not None else self.latest_revision_id)
-
- @property
- def latest_revision_id(self):
- """Return the current revision id for this page."""
- if not hasattr(self, '_revid'):
- self.revisions()
- return self._revid
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- """
- Remove the latest revision id set for this Page.
-
- All internal cached values specifically for the latest revision
- of this page are cleared.
-
- The following cached values are not cleared:
- - text property
- - page properties, and page coordinates
- - lastNonBotUser
- - isDisambig and isCategoryRedirect status
- - langlinks, templates and deleted revisions
- """
- # When forcing, we retry the page no matter what:
- # * Old exceptions do not apply any more
- # * Deleting _revid to force reload
- # * Deleting _redirtarget, that info is now obsolete.
- for attr in ['_redirtarget', '_getexception', '_revid']:
- if hasattr(self, attr):
- delattr(self, attr)
-
- @latest_revision_id.setter
- def latest_revision_id(self, value) -> None:
- """Set the latest revision for this Page."""
- del self.latest_revision_id
- self._revid = value
-
- @property
- def latest_revision(self):
- """Return the current revision for this page."""
- rev = self._latest_cached_revision()
- if rev is not None:
- return rev
-
- with suppress(StopIteration):
- return next(self.revisions(content=True, total=1))
- raise InvalidPageError(self)
-
- @property
- def text(self) -> str:
- """
- Return the current (edited) wikitext, loading it if necessary.
-
- :return: text of the page
- """
- if getattr(self, '_text', None) is not None:
- return self._text
-
- try:
- return self.get(get_redirect=True)
- except NoPageError:
- # TODO: what other exceptions might be returned?
- return ''
-
- @text.setter
- def text(self, value: Optional[str]):
- """Update the current (edited) wikitext.
-
- :param value: New value or None
- """
- try:
- self.botMayEdit() # T262136, T267770
- except Exception as e:
- # dry tests can't make an API call and are rejected
- # with an Exception instead; ignore it in that case.
- if not str(e).startswith('DryRequest rejecting request:'):
- raise
-
- del self.text
- self._text = None if value is None else str(value)
-
- @text.deleter
- def text(self) -> None:
- """Delete the current (edited) wikitext."""
- if hasattr(self, '_text'):
- del self._text
- if hasattr(self, '_expanded_text'):
- del self._expanded_text
- if hasattr(self, '_raw_extracted_templates'):
- del self._raw_extracted_templates
-
- def preloadText(self) -> str:
- """
- The text returned by EditFormPreloadText.
-
- See API module "info".
-
- Application: on Wikisource wikis, text can be preloaded even if
- a page does not exist, if an Index page is present.
- """
- self.site.loadpageinfo(self, preload=True)
- return self._preloadedtext
-
- def get_parsed_page(self, force: bool = False) -> str:
- """Retrieve parsed text (via action=parse) and cache it.
-
- .. versionchanged:: 7.1
- `force` parameter was added;
- `_get_parsed_page` becomes a public method
-
- :param force: force updating from the live site
-
- .. seealso::
- :meth:`APISite.get_parsed_page()
- <pywikibot.site._apisite.APISite.get_parsed_page>`
- """
- if not hasattr(self, '_parsed_text') or force:
- self._parsed_text = self.site.get_parsed_page(self)
- return self._parsed_text
-
- def extract(self, variant: str = 'plain', *,
- lines: Optional[int] = None,
- chars: Optional[int] = None,
- sentences: Optional[int] = None,
- intro: bool = True) -> str:
- """Retrieve an extract of this page.
-
- .. versionadded:: 7.1
-
- :param variant: The variant of extract, either 'plain' for plain
- text, 'html' for limited HTML (both exclude templates and any
- text formatting) or 'wiki' for bare wikitext, which for example
- also includes any templates.
- :param lines: if not None, wrap the extract into lines with
- width of 79 chars and return a string with that given number
- of lines.
- :param chars: How many characters to return. Actual text
- returned might be slightly longer.
- :param sentences: How many sentences to return
- :param intro: Return only content before the first section
- :raises NoPageError: given page does not exist
- :raises NotImplementedError: "wiki" variant does not support
- `sentences` parameter.
- :raises ValueError: `variant` parameter must be "plain", "html" or
- "wiki"
-
- .. seealso:: :meth:`APISite.extract()
- <pywikibot.site._extensions.TextExtractsMixin.extract>`.
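-
- Example: a plain-text introduction of at most five lines (sketch)::
-
- intro = page.extract('plain', lines=5)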
- """
- if variant in ('plain', 'html'):
- extract = self.site.extract(self, chars=chars, sentences=sentences,
- intro=intro,
- plaintext=variant == 'plain')
- elif variant == 'wiki':
- if not self.exists():
- raise NoPageError(self)
- if sentences:
- raise NotImplementedError(
- "'wiki' variant of extract method does not support "
- "'sencence' parameter")
-
- extract = self.text[:]
- if intro:
- pos = extract.find('\n=')
- if pos != -1: # find() returns -1 if there is no section heading
- extract = extract[:pos]
- if chars:
- extract = shorten(extract, chars, break_long_words=False,
- placeholder='…')
- else:
- raise ValueError(
- 'variant parameter must be "plain", "html" or "wiki", not "{}"'
- .format(variant))
-
- if not lines:
- return extract
-
- text_lines = []
- for i, text in enumerate(extract.splitlines(), start=1):
- text_lines += wrap(text, width=79) or ['']
- if i >= lines:
- break
-
- return '\n'.join(text_lines[:min(lines, len(text_lines))])
-
- def properties(self, force: bool = False) -> dict:
- """
- Return the properties of the page.
-
- :param force: force updating from the live site
- """
- if not hasattr(self, '_pageprops') or force:
- self._pageprops = {} # page may not have pageprops (see T56868)
- self.site.loadpageprops(self)
- return self._pageprops
-
- def defaultsort(self, force: bool = False) -> Optional[str]:
- """
- Extract value of the {{DEFAULTSORT:}} magic word from the page.
-
- :param force: force updating from the live site
- """
- return self.properties(force=force).get('defaultsort')
-
- def expand_text(
- self,
- force: bool = False,
- includecomments: bool = False
- ) -> str:
- """Return the page text with all templates and parser words expanded.
-
- :param force: force updating from the live site
- :param includecomments: if True, keep HTML comments in the
- expanded text; otherwise they are stripped.
- """
- if not hasattr(self, '_expanded_text') or (
- self._expanded_text is None) or force:
- if not self.text:
- self._expanded_text = ''
- return ''
-
- self._expanded_text = self.site.expand_text(
- self.text,
- title=self.title(with_section=False),
- includecomments=includecomments)
- return self._expanded_text
-
- def userName(self) -> str:
- """Return name or IP address of last user to edit page."""
- return self.latest_revision.user
-
- def isIpEdit(self) -> bool:
- """Return True if last editor was unregistered."""
- return self.latest_revision.anon
-
- def lastNonBotUser(self) -> str:
- """
- Return name or IP address of last human/non-bot user to edit page.
-
- Determine the most recent human editor out of the last revisions.
- If no human user can be found, None is returned.
-
- If the edit was done by a bot which is no longer flagged as 'bot',
- i.e. which is not returned by Site.botusers(), it will be returned
- as a non-bot edit.
- """
- if hasattr(self, '_lastNonBotUser'):
- return self._lastNonBotUser
-
- self._lastNonBotUser = None
- for entry in self.revisions():
- if entry.user and (not self.site.isBot(entry.user)):
- self._lastNonBotUser = entry.user
- break
-
- return self._lastNonBotUser
-
- def editTime(self):
- """Return timestamp of last revision to page.
-
- :rtype: pywikibot.Timestamp
- """
- return self.latest_revision.timestamp
-
- def exists(self) -> bool:
- """Return True if page exists on the wiki, even if it's a redirect.
-
- If the title includes a section, return False if this section isn't
- found.
- """
- with suppress(AttributeError):
- return self.pageid > 0
- raise InvalidPageError(self)
-
- @property
- def oldest_revision(self):
- """
- Return the first revision of this page.
-
- :rtype: :py:obj:`Revision`
- """
- return next(self.revisions(reverse=True, total=1))
-
- def isRedirectPage(self):
- """Return True if this is a redirect, False if not or not existing."""
- return self.site.page_isredirect(self)
-
- def isStaticRedirect(self, force: bool = False) -> bool:
- """Determine whether the page is a static redirect.
-
- A static redirect must be a valid redirect, and contain the magic
- word __STATICREDIRECT__.
-
- .. versionchanged:: 7.0
- __STATICREDIRECT__ can be transcluded
-
- :param force: Bypass local caching
- """
- return self.isRedirectPage() \
- and 'staticredirect' in self.properties(force=force)
-
- def isCategoryRedirect(self) -> bool:
- """Return True if this is a category redirect page, False otherwise."""
- if not self.is_categorypage():
- return False
-
- if not hasattr(self, '_catredirect'):
- self._catredirect = False
- catredirs = self.site.category_redirects()
- for template, args in self.templatesWithParams():
- if template.title(with_ns=False) not in catredirs:
- continue
-
- if args:
- # Get target (first template argument)
- target_title = args[0].strip()
- p = pywikibot.Page(
- self.site, target_title, Namespace.CATEGORY)
- try:
- p.title()
- except pywikibot.exceptions.InvalidTitleError:
- target_title = self.site.expand_text(
- text=target_title, title=self.title())
- p = pywikibot.Page(self.site, target_title,
- Namespace.CATEGORY)
- if p.namespace() == Namespace.CATEGORY:
- self._catredirect = p.title()
- else:
- pywikibot.warning(
- 'Category redirect target {} on {} is not a '
- 'category'.format(p.title(as_link=True),
- self.title(as_link=True)))
- else:
- pywikibot.warning(
- 'No target found for category redirect on '
- + self.title(as_link=True))
- break
-
- return bool(self._catredirect)
-
- def getCategoryRedirectTarget(self):
- """
- If this is a category redirect, return the target category title.
-
- :rtype: pywikibot.page.Category
- """
- if self.isCategoryRedirect():
- return Category(Link(self._catredirect, self.site))
- raise IsNotRedirectPageError(self)
-
- def isTalkPage(self):
- """Return True if this page is in any talk namespace."""
- ns = self.namespace()
- return ns >= 0 and ns % 2 == 1
-
- def toggleTalkPage(self):
- """
- Return other member of the article-talk page pair for this Page.
-
- If self is a talk page, returns the associated content page;
- otherwise, returns the associated talk page. The returned page need
- not actually exist on the wiki.
-
- :return: Page or None if self is a special page.
- :rtype: typing.Optional[pywikibot.Page]
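-
- Sketch (hypothetical title)::
-
- pywikibot.Page(site, 'Foo').toggleTalkPage() # Page('Talk:Foo')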
- """
- ns = self.namespace()
- if ns < 0: # Special page
- return None
-
- title = self.title(with_ns=False)
- new_ns = ns + (1, -1)[self.isTalkPage()]
- return Page(self.site,
- '{}:{}'.format(self.site.namespace(new_ns), title))
-
- def is_categorypage(self):
- """Return True if the page is a Category, False otherwise."""
- return self.namespace() == 14
-
- def is_filepage(self):
- """Return True if this is a file description page, False otherwise."""
- return self.namespace() == 6
-
- def isDisambig(self) -> bool:
- """
- Return True if this is a disambiguation page, False otherwise.
-
- By default, it uses the Disambiguator extension's result. The
- identification relies on the presence of the __DISAMBIG__ magic word
- which may also be transcluded.
-
- If the Disambiguator extension isn't activated for the given site,
- the identification relies on the presence of specific templates.
- First load a list of template names from the Family file;
- if the value in the Family file is None or no entry was made, look for
- the list on [[MediaWiki:Disambiguationspage]]. If this page does not
- exist, take the MediaWiki message. 'Template:Disambig' is always
- assumed to be default, and will be appended regardless of its
- existence.
- """
- if self.site.has_extension('Disambiguator'):
- # If the Disambiguator extension is loaded, use it
- return 'disambiguation' in self.properties()
-
- if not hasattr(self.site, '_disambigtemplates'):
- try:
- default = set(self.site.family.disambig('_default'))
- except KeyError:
- default = {'Disambig'}
- try:
- distl = self.site.family.disambig(self.site.code,
- fallback=False)
- except KeyError:
- distl = None
- if distl is None:
- disambigpages = Page(self.site,
- 'MediaWiki:Disambiguationspage')
- if disambigpages.exists():
- disambigs = {link.title(with_ns=False)
- for link in disambigpages.linkedPages()
- if link.namespace() == 10}
- elif self.site.has_mediawiki_message('disambiguationspage'):
- message = self.site.mediawiki_message(
- 'disambiguationspage').split(':', 1)[1]
- # add the default template(s) for default mw message
- # only
- disambigs = {first_upper(message)} | default
- else:
- disambigs = default
- self.site._disambigtemplates = disambigs
- else:
- # Normalize template capitalization
- self.site._disambigtemplates = {first_upper(t) for t in distl}
- templates = {tl.title(with_ns=False) for tl in self.templates()}
- disambigs = set()
- # always use cached disambig templates
- disambigs.update(self.site._disambigtemplates)
- # see if any template on this page is in the set of disambigs
- disambig_in_page = disambigs.intersection(templates)
- return self.namespace() != 10 and bool(disambig_in_page)
-
- def getReferences(self,
- follow_redirects: bool = True,
- with_template_inclusion: bool = True,
- only_template_inclusion: bool = False,
- filter_redirects: bool = False,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator of all pages that refer to or embed the page.
-
- If you need a full list of referring pages, use
- ``pages = list(s.getReferences())``
-
- :param follow_redirects: if True, also iterate pages that link to a
- redirect pointing to the page.
- :param with_template_inclusion: if True, also iterate pages where self
- is used as a template.
- :param only_template_inclusion: if True, only iterate pages where self
- is used as a template.
- :param filter_redirects: if True, only iterate redirects to self.
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each referring page (default False)
- :rtype: typing.Iterable[pywikibot.Page]
- """
- # N.B.: this method intentionally overlaps with backlinks() and
- # embeddedin(). Depending on the interface, it may be more efficient
- # to implement those methods in the site interface and then combine
- # the results for this method, or to implement this method and then
- # split up the results for the others.
- return self.site.pagereferences(
- self,
- follow_redirects=follow_redirects,
- filter_redirects=filter_redirects,
- with_template_inclusion=with_template_inclusion,
- only_template_inclusion=only_template_inclusion,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def backlinks(self,
- follow_redirects: bool = True,
- filter_redirects: Optional[bool] = None,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator for pages that link to this page.
-
- :param follow_redirects: if True, also iterate pages that link to a
- redirect pointing to the page.
- :param filter_redirects: if True, only iterate redirects; if False,
- omit redirects; if None, do not filter
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each referring page (default False)
- """
- return self.site.pagebacklinks(
- self,
- follow_redirects=follow_redirects,
- filter_redirects=filter_redirects,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def embeddedin(self,
- filter_redirects: Optional[bool] = None,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """
- Return an iterator for pages that embed this page as a template.
-
- :param filter_redirects: if True, only iterate redirects; if False,
- omit redirects; if None, do not filter
- :param namespaces: only iterate pages in these namespaces
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each embedding page (default False)
- """
- return self.site.page_embeddedin(
- self,
- filter_redirects=filter_redirects,
- namespaces=namespaces,
- total=total,
- content=content
- )
-
- def redirects(
- self,
- *,
- filter_fragments: Optional[bool] = None,
- namespaces: NamespaceArgType = None,
- total: Optional[int] = None,
- content: bool = False
- ) -> 'Iterable[pywikibot.Page]':
- """
- Return an iterable of redirects to this page.
-
- :param filter_fragments: If True, only return redirects with fragments.
- If False, only return redirects without fragments. If None, return
- both (no filtering).
- :param namespaces: only return redirects from these namespaces
- :param total: maximum number of redirects to retrieve in total
- :param content: load the current content of each redirect
-
- .. versionadded:: 7.0
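-
- For example, only redirects pointing to a section of this page
- could be listed with (sketch)::
-
- for redirect in page.redirects(filter_fragments=True):
- print(redirect.title())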
- """
- return self.site.page_redirects(
- self,
- filter_fragments=filter_fragments,
- namespaces=namespaces,
- total=total,
- content=content,
- )
-
- def protection(self) -> dict:
- """Return a dictionary reflecting page protections."""
- return self.site.page_restrictions(self)
-
- def applicable_protections(self) -> set:
- """
- Return the protection types allowed for that page.
-
- If the page doesn't exist it only returns "create". Otherwise it
- returns all protection types provided by the site, except "create".
- It also removes "upload" if that page is not in the File namespace.
-
- It is possible that it returns an empty set, but only if the
- original protection types were removed.
-
- :return: set of str
- """
- # New API since commit 32083235eb332c419df2063cf966b3400be7ee8a
- if self.site.mw_version >= '1.25wmf14':
- self.site.loadpageinfo(self)
- return self._applicable_protections
-
- p_types = set(self.site.protection_types())
- if not self.exists():
- return {'create'} if 'create' in p_types else set()
- p_types.remove('create') # no existing page allows that
- if not self.is_filepage(): # only file pages allow upload
- p_types.remove('upload')
- return p_types
-
- def has_permission(self, action: str = 'edit') -> bool:
- """Determine whether the page can be modified.
-
- Return True if the bot has the required permission level
- for the given action type.
-
- :param action: a valid restriction type like 'edit', 'move'
- :raises ValueError: invalid action parameter
- """
- return self.site.page_can_be_edited(self, action)
-
- def botMayEdit(self) -> bool:
- """
- Determine whether the active bot is allowed to edit the page.
-
- This will be True if the page doesn't contain {{bots}} or {{nobots}}
- or any other template from edit_restricted_templates list
- in x_family.py file, or it contains them and the active bot is allowed
- to edit this page. (This method is only useful on those sites that
- recognize the bot-exclusion protocol; on other sites, it will always
- return True.)
-
- The framework enforces this restriction by default. It is possible
- to override this by setting ignore_bot_templates=True in
- user-config.py, or using page.put(force=True).
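-
- For instance, a page containing the wikitext (hypothetical bot
- account name)::
-
- {{bots|deny=ExampleBot}}
-
- declines edits by the account ExampleBot, while
- ``{{bots|allow=ExampleBot}}`` declines edits by every other bot.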
- """
- if not hasattr(self, '_bot_may_edit'):
- self._bot_may_edit = self._check_bot_may_edit()
- return self._bot_may_edit
-
- def _check_bot_may_edit(self, module: Optional[str] = None) -> bool:
- """A botMayEdit helper method.
-
- :param module: The module name to be restricted. Defaults to
- pywikibot.calledModuleName().
- """
- if not hasattr(self, 'templatesWithParams'):
- return True
-
- if config.ignore_bot_templates: # Check the "master ignore switch"
- return True
-
- username = self.site.username()
- try:
- templates = self.templatesWithParams()
- except (NoPageError, IsRedirectPageError, SectionError):
- return True
-
- # go through all templates and look for any restriction
- restrictions = set(self.site.get_edit_restricted_templates())
-
- if module is None:
- module = pywikibot.calledModuleName()
-
- # also add archive templates for non-archive bots
- if module != 'archivebot':
- restrictions.update(self.site.get_archived_page_templates())
-
- # multiple bots/nobots templates are allowed
- for template, params in templates:
- title = template.title(with_ns=False)
-
- if title in restrictions:
- return False
-
- if title not in ('Bots', 'Nobots'):
- continue
-
- try:
- key, sep, value = params[0].partition('=')
- except IndexError:
- key, sep, value = '', '', ''
- names = set()
- else:
- if not sep:
- key, value = value, key
- key = key.strip()
- names = {name.strip() for name in value.split(',')}
-
- if len(params) > 1:
- pywikibot.warning(
- '{{%s|%s}} has more than 1 parameter; taking the first.'
- % (title.lower(), '|'.join(params)))
-
- if title == 'Nobots':
- if not params:
- return False
-
- if key:
- pywikibot.error(
- '%s parameter for {{nobots}} is not allowed. '
- 'Edit declined' % key)
- return False
-
- if 'all' in names or module in names or username in names:
- return False
-
- if title == 'Bots':
- if value and not key:
- pywikibot.warning(
- '{{bots|%s}} is not valid. Ignoring.' % value)
- continue
-
- if key and not value:
- pywikibot.warning(
- '{{bots|%s=}} is not valid. Ignoring.' % key)
- continue
-
- if key == 'allow':
- if not ('all' in names or username in names):
- return False
-
- elif key == 'deny':
- if 'all' in names or username in names:
- return False
-
- elif key == 'allowscript':
- if not ('all' in names or module in names):
- return False
-
- elif key == 'denyscript':
- if 'all' in names or module in names:
- return False
-
- elif key: # ignore unrecognized keys with a warning
- pywikibot.warning(
- '{{bots|%s}} is not valid. Ignoring.' % params[0])
-
- # no restricting template found
- return True
-
- def save(self,
- summary: Optional[str] = None,
- watch: Optional[str] = None,
- minor: bool = True,
- botflag: Optional[bool] = None,
- force: bool = False,
- asynchronous: bool = False,
- callback=None,
- apply_cosmetic_changes: Optional[bool] = None,
- quiet: bool = False,
- **kwargs):
- """
- Save the current contents of page's text to the wiki.
-
- .. versionchanged:: 7.0
- boolean watch parameter is deprecated
-
- :param summary: The edit summary for the modification (optional, but
- most wikis strongly encourage its use)
- :param watch: Specify how the watchlist is affected by this edit, set
- to one of "watch", "unwatch", "preferences", "nochange":
- * watch: add the page to the watchlist
- * unwatch: remove the page from the watchlist
- * preferences: use the preference settings (Default)
- * nochange: don't change the watchlist
- If None (default), follow bot account's default settings
- :param minor: if True, mark this edit as minor
- :param botflag: if True, mark this edit as made by a bot (default:
- True if user has bot status, False if not)
- :param force: if True, ignore botMayEdit() setting
- :param asynchronous: if True, launch a separate thread to save
- asynchronously
- :param callback: a callable object that will be called after the
- page put operation. This object must take two arguments: (1) a
- Page object, and (2) an exception instance, which will be None
- if the page was saved successfully. The callback is intended for
- use by bots that need to keep track of which saves were
- successful.
- :param apply_cosmetic_changes: Overrides the cosmetic_changes
- configuration value for this save unless it is None.
- :param quiet: enable/disable successful save operation message;
- defaults to False.
- In asynchronous mode, if True, it is up to the calling bot to
- manage the output e.g. via callback.
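-
- Typical usage (sketch)::
-
- page.text = page.text.replace('foo', 'bar')
- page.save(summary='Bot: replace foo with bar')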
- """
- if not summary:
- summary = config.default_edit_summary
-
- if isinstance(watch, bool):
- issue_deprecation_warning(
- 'boolean watch parameter',
- '"watch", "unwatch", "preferences" or "nochange" value',
- since='7.0.0')
- watch = ('unwatch', 'watch')[watch]
-
- if not force and not self.botMayEdit():
- raise OtherPageSaveError(
- self, 'Editing restricted by {{bots}}, {{nobots}} '
- "or site's equivalent of {{in use}} template")
- self._save(summary=summary, watch=watch, minor=minor, botflag=botflag,
- asynchronous=asynchronous, callback=callback,
- cc=apply_cosmetic_changes, quiet=quiet, **kwargs)
-
- @allow_asynchronous
- def _save(self, summary=None, watch=None, minor: bool = True, botflag=None,
- cc=None, quiet: bool = False, **kwargs):
- """Helper function for save()."""
- link = self.title(as_link=True)
- if cc or (cc is None and config.cosmetic_changes):
- summary = self._cosmetic_changes_hook(summary)
-
- done = self.site.editpage(self, summary=summary, minor=minor,
- watch=watch, bot=botflag, **kwargs)
- if not done:
- if not quiet:
- pywikibot.warning('Page {} not saved'.format(link))
- raise PageSaveRelatedError(self)
- if not quiet:
- pywikibot.output('Page {} saved'.format(link))
-
- def _cosmetic_changes_hook(self, summary: str) -> str:
- """The cosmetic changes hook.
-
- :param summary: The current edit summary.
- :return: Modified edit summary if cosmetic changes have been made,
- else the old edit summary.
- """
- if self.isTalkPage() or self.content_model != 'wikitext' or \
- pywikibot.calledModuleName() in config.cosmetic_changes_deny_script:
- return summary
-
- # check if cosmetic_changes is enabled for this page
- family = self.site.family.name
- if config.cosmetic_changes_mylang_only:
- cc = ((family == config.family and self.site.lang == config.mylang)
- or self.site.lang in config.cosmetic_changes_enable.get(
- family, []))
- else:
- cc = True
- cc = cc and self.site.lang not in config.cosmetic_changes_disable.get(
- family, [])
- cc = cc and self._check_bot_may_edit('cosmetic_changes')
- if not cc:
- return summary
-
- old = self.text
- pywikibot.log('Cosmetic changes for {}-{} enabled.'
- .format(family, self.site.lang))
- # cc depends on page directly and via several other imports
- cc_toolkit = CosmeticChangesToolkit(self, ignore=CANCEL.MATCH)
- self.text = cc_toolkit.change(old)
-
- # i18n package changed in Pywikibot 7.0.0
- old_i18n = i18n.twtranslate(self.site, 'cosmetic_changes-append',
- fallback_prompt='; cosmetic changes')
- if summary and old.strip().replace(
- '\r\n', '\n') != self.text.strip().replace('\r\n', '\n'):
- summary += i18n.twtranslate(self.site,
- 'pywikibot-cosmetic-changes',
- fallback_prompt=old_i18n)
- return summary
-
- def put(self, newtext: str,
- summary: Optional[str] = None,
- watch: Optional[str] = None,
- minor: bool = True,
- botflag: Optional[bool] = None,
- force: bool = False,
- asynchronous: bool = False,
- callback=None,
- show_diff: bool = False,
- **kwargs) -> None:
- """
- Save the page with the contents of the first argument as the text.
-
- This method is maintained primarily for backwards-compatibility.
- For new code, using Page.save() is preferred. See save() method
- docs for all parameters not listed here.
-
- .. versionadded:: 7.0
- The `show_diff` parameter
-
- :param newtext: The complete text of the revised page.
- :param show_diff: show changes between oldtext and newtext
- (default: False)
- """
- if show_diff:
- pywikibot.showDiff(self.text, newtext)
- self.text = newtext
- self.save(summary=summary, watch=watch, minor=minor, botflag=botflag,
- force=force, asynchronous=asynchronous, callback=callback,
- **kwargs)
-
- def watch(self, unwatch: bool = False) -> bool:
- """
- Add or remove this page to/from bot account's watchlist.
-
- :param unwatch: True to unwatch, False (default) to watch.
- :return: True if successful, False otherwise.
- """
- return self.site.watch(self, unwatch)
-
- def clear_cache(self) -> None:
- """Clear the cached attributes of the page."""
- self._revisions = {}
- for attr in self._cache_attrs:
- with suppress(AttributeError):
- delattr(self, attr)
-
- def purge(self, **kwargs) -> bool:
- """
- Purge the server's cache for this page.
-
- :keyword redirects: Automatically resolve redirects.
- :type redirects: bool
- :keyword converttitles: Convert titles to other variants if necessary.
- Only works if the wiki's content language supports variant
- conversion.
- :type converttitles: bool
- :keyword forcelinkupdate: Update the links tables.
- :type forcelinkupdate: bool
- :keyword forcerecursivelinkupdate: Update the links table, and update
- the links tables for any page that uses this page as a template.
- :type forcerecursivelinkupdate: bool
- """
- self.clear_cache()
- return self.site.purgepages([self], **kwargs)
-
- def touch(self, callback=None, botflag: bool = False, **kwargs):
- """
- Make a touch edit for this page.
-
- See save() method docs for all parameters.
- The following parameters will be overridden by this method:
- - summary, watch, minor, force, asynchronous
-
- Parameter botflag is False by default.
-
- The minor and botflag parameters are set to False to prevent the
- edit from being hidden when it becomes a real edit due to a bug.
-
- :note: This discards content saved to self.text.
- """
- if self.exists():
- # ensure we always fetch the page text and do not change it.
- del self.text
- summary = i18n.twtranslate(self.site, 'pywikibot-touch')
- self.save(summary=summary, watch='nochange',
- minor=False, botflag=botflag, force=True,
- asynchronous=False, callback=callback,
- apply_cosmetic_changes=False, nocreate=True, **kwargs)
- else:
- raise NoPageError(self)
-
- def linkedPages(
- self, *args, **kwargs
- ) -> Generator['pywikibot.Page', None, None]:
- """Iterate Pages that this Page links to.
-
- Only returns pages from "normal" internal links. Embedded
- templates are omitted but links within them are returned. All
- interwiki and external links are omitted.
-
- For the parameters refer
- :py:mod:`APISite.pagelinks<pywikibot.site.APISite.pagelinks>`
-
- .. versionadded:: 7.0
- the `follow_redirects` keyword argument
- .. deprecated:: 7.0
- the positional arguments
-
- .. seealso:: https://www.mediawiki.org/wiki/API:Links
-
- :keyword namespaces: Only iterate pages in these namespaces
- (default: all)
- :type namespaces: iterable of str or Namespace key,
- or a single instance of those types. May be a '|' separated
- list of namespace identifiers.
- :keyword follow_redirects: if True, yields the target of any redirects,
- rather than the redirect page
- :keyword total: iterate no more than this number of pages in total
- :keyword content: if True, load the current content of each page
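-
- Example: iterate the first ten main-namespace targets (sketch)::
-
- for target in page.linkedPages(namespaces=0, total=10):
- print(target.title())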
- """
- # Deprecate positional arguments and synchronize with Site.pagelinks
- keys = ('namespaces', 'total', 'content')
- for i, arg in enumerate(args):
- key = keys[i]
- issue_deprecation_warning(
- 'Positional argument {} ({})'.format(i + 1, arg),
- 'keyword argument "{}={}"'.format(key, arg),
- since='7.0.0')
- if key in kwargs:
- pywikibot.warning('{!r} is given as keyword argument {!r} '
- 'already; ignoring {!r}'
- .format(key, arg, kwargs[key]))
- else:
- kwargs[key] = arg
-
- return self.site.pagelinks(self, **kwargs)
-
- def interwiki(self, expand: bool = True):
- """
- Iterate interwiki links in the page text, excluding language links.
-
- :param expand: if True (default), include interwiki links found in
- templates transcluded onto this page; if False, only iterate
- interwiki links found in this page's own wikitext
- :return: a generator that yields Link objects
- :rtype: generator
- """
- # This function does not exist in the API, so it has to be
- # implemented by screen-scraping
- if expand:
- text = self.expand_text()
- else:
- text = self.text
- for linkmatch in pywikibot.link_regex.finditer(
- textlib.removeDisabledParts(text)):
- linktitle = linkmatch.group('title')
- link = Link(linktitle, self.site)
- # only yield links that are to a different site and that
- # are not language links
- try:
- if link.site != self.site:
- if linktitle.lstrip().startswith(':'):
- # initial ":" indicates not a language link
- yield link
- elif link.site.family != self.site.family:
- # link to a different family is not a language link
- yield link
- except Error:
- # ignore any links with invalid contents
- continue
-
- def langlinks(self, include_obsolete: bool = False) -> list:
- """
- Return a list of all inter-language Links on this page.
-
- :param include_obsolete: if true, return even Link objects whose site
- is obsolete
- :return: list of Link objects.
- """
- # Note: We preload a list of *all* langlinks, including links to
- # obsolete sites, and store that in self._langlinks. We then filter
- # this list if the method was called with include_obsolete=False
- # (which is the default)
- if not hasattr(self, '_langlinks'):
- self._langlinks = list(self.iterlanglinks(include_obsolete=True))
-
- if include_obsolete:
- return self._langlinks
- return [i for i in self._langlinks if not i.site.obsolete]
-
- def iterlanglinks(self,
- total: Optional[int] = None,
- include_obsolete: bool = False):
- """Iterate all inter-language links on this page.
-
- :param total: iterate no more than this number of pages in total
- :param include_obsolete: if true, yield even Link objects whose
- site is obsolete
- :return: a generator that yields Link objects.
- :rtype: generator
- """
- if hasattr(self, '_langlinks'):
- return iter(self.langlinks(include_obsolete=include_obsolete))
- # XXX We might want to fill _langlinks when the Site
- # method is called. If we do this, we'll have to think
- # about what will happen if the generator is not completely
- # iterated upon.
- return self.site.pagelanglinks(self, total=total,
- include_obsolete=include_obsolete)
-
- def data_item(self):
- """
- Convenience function to get the Wikibase item of a page.
-
- :rtype: pywikibot.page.ItemPage
- """
- return ItemPage.fromPage(self)
-
- def templates(self, content: bool = False):
- """
- Return a list of Page objects for templates used on this Page.
-
- Template parameters are ignored. This method only returns embedded
- templates, not template pages that happen to be referenced through
- a normal link.
-
- :param content: if True, retrieve the content of the current version
- of each template (default False)
- :type content: bool
- """
- # Data might have been preloaded
- if not hasattr(self, '_templates'):
- self._templates = list(self.itertemplates(content=content))
-
- return self._templates
-
- def itertemplates(self,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate Page objects for templates used on this Page.
-
- Template parameters are ignored. This method only returns embedded
- templates, not template pages that happen to be referenced through
- a normal link.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each template (default False)
- :type content: bool
- """
- if hasattr(self, '_templates'):
- return iter(self._templates)
- return self.site.pagetemplates(self, total=total, content=content)
-
- def imagelinks(self, total: Optional[int] = None, content: bool = False):
- """
- Iterate FilePage objects for images displayed on this Page.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each image description page (default False)
- :return: a generator that yields FilePage objects.
- """
- return self.site.pageimages(self, total=total, content=content)
-
- def categories(self,
- with_sort_key: bool = False,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate categories that the article is in.
-
- :param with_sort_key: if True, include the sort key in each Category.
- :param total: iterate no more than this number of pages in total
- :param content: if True, retrieve the content of the current version
- of each category description page (default False)
- :return: a generator that yields Category objects.
- :rtype: generator
- """
- # FIXME: bug T75561: with_sort_key is ignored by Site.pagecategories
- if with_sort_key:
- raise NotImplementedError('with_sort_key is not implemented')
-
- return self.site.pagecategories(self, total=total, content=content)
-
- def extlinks(self, total: Optional[int] = None):
- """
- Iterate all external URLs (not interwiki links) from this page.
-
- :param total: iterate no more than this number of pages in total
- :return: a generator that yields str objects containing URLs.
- :rtype: generator
- """
- return self.site.page_extlinks(self, total=total)
-
- def coordinates(self, primary_only: bool = False):
- """
- Return a list of Coordinate objects for points on the page.
-
- Uses the MediaWiki extension GeoData.
-
- :param primary_only: Only return the coordinate indicated to be primary
- :return: A list of Coordinate objects or a single Coordinate if
- primary_only is True
- :rtype: list of Coordinate or Coordinate or None
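-
- Sketch: print the primary coordinate, if any::
-
- coord = page.coordinates(primary_only=True)
- if coord:
- print(coord.lat, coord.lon)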
- """
- if not hasattr(self, '_coords'):
- self._coords = []
- self.site.loadcoordinfo(self)
- if primary_only:
- for coord in self._coords:
- if coord.primary:
- return coord
- return None
- return list(self._coords)
-
- def page_image(self):
- """
- Return the most appropriate image on the page.
-
- Uses the MediaWiki extension PageImages.
-
- :return: A FilePage object
- :rtype: pywikibot.page.FilePage
- """
- if not hasattr(self, '_pageimage'):
- self._pageimage = None
- self.site.loadpageimage(self)
-
- return self._pageimage
-
- def getRedirectTarget(self):
- """
- Return a Page object for the target this Page redirects to.
-
- If this page is not a redirect page, will raise an
- IsNotRedirectPageError. This method also can raise a NoPageError.
-
- :rtype: pywikibot.Page
- """
- return self.site.getredirtarget(self)
-
- def moved_target(self):
- """
- Return a Page object for the target this Page was moved to.
-
- If this page was not moved, it will raise a NoMoveTargetError.
- This method also works if the source was already deleted.
-
- :rtype: pywikibot.page.Page
- :raises pywikibot.exceptions.NoMoveTargetError: page was not moved
- """
- gen = iter(self.site.logevents(logtype='move', page=self, total=1))
- try:
- lastmove = next(gen)
- except StopIteration:
- raise NoMoveTargetError(self)
- else:
- return lastmove.target_page
-
- def revisions(self,
- reverse: bool = False,
- total: Optional[int] = None,
- content: bool = False,
- starttime=None, endtime=None):
- """Generator which loads the version history as Revision instances."""
- # TODO: Only request uncached revisions
- self.site.loadrevisions(self, content=content, rvdir=reverse,
- starttime=starttime, endtime=endtime,
- total=total)
- return (self._revisions[rev] for rev in
- sorted(self._revisions, reverse=not reverse)[:total])
-
- def getVersionHistoryTable(self,
- reverse: bool = False,
- total: Optional[int] = None):
- """Return the version history as a wiki table."""
- result = '{| class="wikitable"\n'
- result += '! oldid || date/time || username || edit summary\n'
- for entry in self.revisions(reverse=reverse, total=total):
- result += '|----\n'
- result += ('| {r.revid} || {r.timestamp} || {r.user} || '
- '<nowiki>{r.comment}</nowiki>\n'.format(r=entry))
- result += '|}\n'
- return result
-
- def contributors(self,
- total: Optional[int] = None,
- starttime=None, endtime=None):
- """
- Compile contributors of this page with edit counts.
-
- :param total: iterate no more than this number of revisions in total
- :param starttime: retrieve revisions starting at this Timestamp
- :param endtime: retrieve revisions ending at this Timestamp
-
- :return: number of edits for each username
- :rtype: :py:obj:`collections.Counter`
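-
- For example, the three most active editors could be obtained
- with (sketch)::
-
- top = page.contributors().most_common(3)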
- """
- return Counter(rev.user for rev in
- self.revisions(total=total,
- starttime=starttime, endtime=endtime))
-
- def revision_count(self, contributors=None) -> int:
- """Determine number of edits from contributors.
-
- :param contributors: contributor usernames
- :type contributors: iterable of str or pywikibot.User,
- a single pywikibot.User, a str or None
- :return: number of edits for all provided usernames
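-
- Usage sketch (hypothetical usernames)::
-
- page.revision_count() # all edits
- page.revision_count(['Alice', 'Bob']) # edits by Alice and Bob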
- """
- cnt = self.contributors()
-
- if not contributors:
- return sum(cnt.values())
-
- if isinstance(contributors, User):
- contributors = contributors.username
-
- if isinstance(contributors, str):
- return cnt[contributors]
-
- return sum(cnt[user.username] if isinstance(user, User) else cnt[user]
- for user in contributors)
-
- def merge_history(self, dest, timestamp=None, reason=None) -> None:
- """
- Merge revisions from this page into another page.
-
- See :py:obj:`APISite.merge_history` for details.
-
- :param dest: Destination page to which revisions will be merged
- :type dest: pywikibot.Page
- :param timestamp: Revisions from this page dating up to this timestamp
- will be merged into the destination page (if not given or False,
- all revisions will be merged)
- :type timestamp: pywikibot.Timestamp
- :param reason: Optional reason for the history merge
- :type reason: str
- """
- self.site.merge_history(self, dest, timestamp, reason)
-
- def move(self,
- newtitle: str,
- reason: Optional[str] = None,
- movetalk: bool = True,
- noredirect: bool = False):
- """
- Move this page to a new title.
-
- :param newtitle: The new page title.
- :param reason: The edit summary for the move.
- :param movetalk: If true, move this page's talk page (if it exists)
- :param noredirect: if move succeeds, delete the old page
- (usually requires sysop privileges, depending on wiki settings)
- """
- if reason is None:
- pywikibot.output('Moving {} to [[{}]].'
- .format(self.title(as_link=True), newtitle))
- reason = pywikibot.input('Please enter a reason for the move:')
- return self.site.movepage(self, newtitle, reason,
- movetalk=movetalk,
- noredirect=noredirect)
-
- def delete(
- self,
- reason: Optional[str] = None,
- prompt: bool = True,
- mark: bool = False,
- automatic_quit: bool = False,
- *,
- deletetalk: bool = False
- ) -> None:
- """
- Delete the page from the wiki. Requires administrator status.
-
- .. versionchanged:: 7.1
- keyword only parameter *deletetalk* was added.
-
- :param reason: The edit summary for the deletion, or rationale
- for deletion if requesting. If None, ask for it.
- :param deletetalk: Also delete the talk page, if it exists.
- :param prompt: If true, prompt user for confirmation before deleting.
- :param mark: If true, and user does not have sysop rights, place a
- speedy-deletion request on the page instead. If false, non-sysops
- will be asked before marking pages for deletion.
- :param automatic_quit: show also the quit option, when asking
- for confirmation.
- """
- if reason is None:
- pywikibot.output('Deleting {}.'.format(self.title(as_link=True)))
- reason = pywikibot.input('Please enter a reason for the deletion:')
-
- # If user has 'delete' right, delete the page
- if self.site.has_right('delete'):
- answer = 'y'
- if prompt and not hasattr(self.site, '_noDeletePrompt'):
- answer = pywikibot.input_choice(
- 'Do you want to delete {}?'.format(self.title(
- as_link=True, force_interwiki=True)),
- [('Yes', 'y'), ('No', 'n'), ('All', 'a')],
- 'n', automatic_quit=automatic_quit)
- if answer == 'a':
- answer = 'y'
- self.site._noDeletePrompt = True
- if answer == 'y':
- self.site.delete(self, reason, deletetalk=deletetalk)
- return
-
- # Otherwise mark it for deletion
- if mark or hasattr(self.site, '_noMarkDeletePrompt'):
- answer = 'y'
- else:
- answer = pywikibot.input_choice(
- "Can't delete {}; do you want to mark it for deletion instead?"
- .format(self),
- [('Yes', 'y'), ('No', 'n'), ('All', 'a')],
- 'n', automatic_quit=False)
- if answer == 'a':
- answer = 'y'
- self.site._noMarkDeletePrompt = True
- if answer == 'y':
- template = '{{delete|1=%s}}\n' % reason
- # We can't add templates in a wikidata item, so let's use its
- # talk page
- if isinstance(self, pywikibot.ItemPage):
- target = self.toggleTalkPage()
- else:
- target = self
- target.text = template + target.text
- target.save(summary=reason)
-
- def has_deleted_revisions(self) -> bool:
- """Return True if the page has deleted revisions.
-
- .. versionadded:: 4.2
- """
- if not hasattr(self, '_has_deleted_revisions'):
- gen = self.site.deletedrevs(self, total=1, prop=['ids'])
- self._has_deleted_revisions = bool(list(gen))
- return self._has_deleted_revisions
-
- def loadDeletedRevisions(self, total: Optional[int] = None, **kwargs):
- """
- Retrieve deleted revisions for this Page.
-
- Stores all revisions' timestamps, dates, editors and comments in
- self._deletedRevs attribute.
-
- :return: iterator of timestamps (which can be used to retrieve
- revisions later on).
- :rtype: generator
- """
- if not hasattr(self, '_deletedRevs'):
- self._deletedRevs = {}
- for item in self.site.deletedrevs(self, total=total, **kwargs):
- for rev in item.get('revisions', []):
- self._deletedRevs[rev['timestamp']] = rev
- yield rev['timestamp']
-
- def getDeletedRevision(
- self,
- timestamp,
- content: bool = False,
- **kwargs
- ) -> List:
- """
- Return a particular deleted revision by timestamp.
-
- :return: a list of [date, editor, comment, text, restoration
- marker]. text will be None, unless content is True (or has
- been retrieved earlier). If timestamp is not found, returns
- empty list.
- """
- if hasattr(self, '_deletedRevs'):
- if timestamp in self._deletedRevs and (
- not content
- or 'content' in self._deletedRevs[timestamp]):
- return self._deletedRevs[timestamp]
-
- for item in self.site.deletedrevs(self, start=timestamp,
- content=content, total=1, **kwargs):
- # should only be one item with one revision
- if item['title'] == self.title():
- if 'revisions' in item:
- return item['revisions'][0]
- return []
-
- def markDeletedRevision(self, timestamp, undelete: bool = True):
- """
- Mark the revision identified by timestamp for undeletion.
-
- :param undelete: if False, mark the revision to remain deleted.
- """
- if not hasattr(self, '_deletedRevs'):
- self.loadDeletedRevisions()
- if timestamp not in self._deletedRevs:
- raise ValueError(
- 'Timestamp {} is not a deleted revision'
- .format(timestamp))
- self._deletedRevs[timestamp]['marked'] = undelete
-
- def undelete(self, reason: Optional[str] = None) -> None:
- """
- Undelete revisions based on the markers set by previous calls.
-
- If no calls have been made since loadDeletedRevisions(), everything
- will be restored.
-
- Simplest case::
-
- Page(...).undelete('This will restore all revisions')
-
- More complex::
-
- pg = Page(...)
- revs = pg.loadDeletedRevisions()
- for rev in revs:
- if ...: # decide whether to undelete a revision
- pg.markDeletedRevision(rev) # mark for undeletion
- pg.undelete('This will restore only selected revisions.')
-
- :param reason: Reason for the action.
- """
- if hasattr(self, '_deletedRevs'):
- undelete_revs = [ts for ts, rev in self._deletedRevs.items()
- if 'marked' in rev and rev['marked']]
- else:
- undelete_revs = []
- if reason is None:
- warn('Not passing a reason for undelete() is deprecated.',
- DeprecationWarning)
- pywikibot.output('Undeleting {}.'.format(self.title(as_link=True)))
- reason = pywikibot.input(
- 'Please enter a reason for the undeletion:')
- self.site.undelete(self, reason, revision=undelete_revs)
-
- def protect(self,
- reason: Optional[str] = None,
- protections: Optional[dict] = None,
- **kwargs) -> None:
- """
- Protect or unprotect a wiki page. Requires administrator status.
-
- Valid protection levels are '' (equivalent to 'none'),
- 'autoconfirmed', 'sysop' and 'all'. 'all' means 'everyone is allowed',
- i.e. that protection type will be unprotected.
-
- In order to unprotect a type of permission, the protection level shall
- be either set to 'all' or '' or skipped in the protections dictionary.
-
- Expiry of protections can be set via kwargs, see Site.protect() for
- details. By default there is no expiry for the protection types.
-
- :param protections: A dict mapping type of protection to protection
- level of that type. Allowed protection types for a page can be
- retrieved by Page.applicable_protections().
- Defaults to None, which means unprotect all protection types.
- Example: {'move': 'sysop', 'edit': 'autoconfirmed'}
-
- :param reason: Reason for the action, default is None and will set an
- empty string.
- """
- protections = protections or {} # protections is converted to {}
- reason = reason or '' # None is converted to ''
-
- self.site.protect(self, protections, reason, **kwargs)
-
- def change_category(self, old_cat, new_cat,
- summary: Optional[str] = None,
- sort_key=None,
- in_place: bool = True,
- include: Optional[List[str]] = None,
- show_diff: bool = False) -> bool:
- """
-        Remove page from old_cat and add it to new_cat.
-
- .. versionadded:: 7.0
- The `show_diff` parameter
-
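-        Usage sketch (the site, page and category titles below are
-        illustrative only)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            page = pywikibot.Page(site, 'Example')
-            page.change_category(
-                pywikibot.Category(site, 'Category:Foo'),
-                pywikibot.Category(site, 'Category:Bar'),
-                summary='Recategorizing')
-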
- :param old_cat: category to be removed
- :type old_cat: pywikibot.page.Category
- :param new_cat: category to be added, if any
- :type new_cat: pywikibot.page.Category or None
-
- :param summary: string to use as an edit summary
-
-        :param sort_key: sort key to use for the added category.
-            Unused if new_cat is None or if in_place is True.
-            If sort_key is True, the sort key used for old_cat
-            will be used.
-
- :param in_place: if True, change categories in place rather than
- rearranging them.
-
- :param include: list of tags not to be disabled by default in relevant
- textlib functions, where CategoryLinks can be searched.
- :param show_diff: show changes between oldtext and newtext
- (default: False)
-
-        :return: True if page was changed and saved, otherwise False.
- """
- # get list of Category objects the article is in and remove possible
- # duplicates
- cats = []
- for cat in textlib.getCategoryLinks(self.text, site=self.site,
- include=include or []):
- if cat not in cats:
- cats.append(cat)
-
- if not self.has_permission():
- pywikibot.output("Can't edit {}, skipping it..."
- .format(self.title(as_link=True)))
- return False
-
- if old_cat not in cats:
- if self.namespace() != 10:
- pywikibot.error('{} is not in category {}!'
- .format(self.title(as_link=True),
- old_cat.title()))
- else:
- pywikibot.output('{} is not in category {}, skipping...'
- .format(self.title(as_link=True),
- old_cat.title()))
- return False
-
- # This prevents the bot from adding new_cat if it is already present.
- if new_cat in cats:
- new_cat = None
-
- oldtext = self.text
- if in_place or self.namespace() == 10:
- newtext = textlib.replaceCategoryInPlace(oldtext, old_cat, new_cat,
- site=self.site)
- else:
- old_cat_pos = cats.index(old_cat)
- if new_cat:
- if sort_key is True:
- # Fetch sort_key from old_cat in current page.
- sort_key = cats[old_cat_pos].sortKey
- cats[old_cat_pos] = Category(self.site, new_cat.title(),
- sort_key=sort_key)
- else:
- cats.pop(old_cat_pos)
-
- try:
- newtext = textlib.replaceCategoryLinks(oldtext, cats)
- except ValueError:
-                # The only way replaceCategoryLinks() can raise a
-                # ValueError is in the case of interwiki links to self.
- pywikibot.output('Skipping {} because of interwiki link to '
- 'self'.format(self.title()))
- return False
-
- if oldtext != newtext:
- try:
- self.put(newtext, summary, show_diff=show_diff)
- except PageSaveRelatedError as error:
- pywikibot.output('Page {} not saved: {}'
- .format(self.title(as_link=True), error))
- except NoUsernameError:
- pywikibot.output('Page {} not saved; sysop privileges '
- 'required.'.format(self.title(as_link=True)))
- else:
- return True
-
- return False
-
- def is_flow_page(self) -> bool:
- """Whether a page is a Flow page."""
- return self.content_model == 'flow-board'
-
- def create_short_link(self,
- permalink: bool = False,
- with_protocol: bool = True) -> str:
- """
- Return a shortened link that points to that page.
-
- If shared_urlshortner_wiki is defined in family config, it'll use
- that site to create the link instead of the current wiki.
-
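-        Example (a minimal sketch; the page title and the resulting
-        short URL are illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            link = pywikibot.Page(site, 'Example').create_short_link()
-            # 'https://w.wiki/...' on Wikimedia wikis
-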
- :param permalink: If true, the link will point to the actual revision
- of the page.
- :param with_protocol: If true, and if it's not already included,
- the link will have http(s) protocol prepended. On Wikimedia wikis
- the protocol is already present.
-        :return: The shortened link.
- """
- wiki = self.site
- if self.site.family.shared_urlshortner_wiki:
- wiki = pywikibot.Site(*self.site.family.shared_urlshortner_wiki)
-
- url = self.permalink() if permalink else self.full_url()
-
- link = wiki.create_short_link(url)
- if re.match(PROTOCOL_REGEX, link):
- if not with_protocol:
- return re.sub(PROTOCOL_REGEX, '', link)
- elif with_protocol:
- return '{}://{}'.format(wiki.protocol(), link)
- return link
-
-
-class Page(BasePage):
-
- """Page: A MediaWiki page."""
-
- def __init__(self, source, title: str = '', ns=0) -> None:
- """Instantiate a Page object."""
- if isinstance(source, pywikibot.site.BaseSite):
- if not title:
- raise ValueError('Title must be specified and not empty '
- 'if source is a Site.')
- super().__init__(source, title, ns)
-
- @property
- def raw_extracted_templates(self):
- """
- Extract templates using :py:obj:`textlib.extract_templates_and_params`.
-
- Disabled parts and whitespace are stripped, except for
- whitespace in anonymous positional arguments.
-
- This value is cached.
-
- :rtype: list of (str, OrderedDict)
- """
- if not hasattr(self, '_raw_extracted_templates'):
- templates = textlib.extract_templates_and_params(
- self.text, True, True)
- self._raw_extracted_templates = templates
-
- return self._raw_extracted_templates
-
- def templatesWithParams(self):
- """
- Return templates used on this Page.
-
- The templates are extracted by
- :py:obj:`textlib.extract_templates_and_params`, with positional
- arguments placed first in order, and each named argument
- appearing as 'name=value'.
-
- All parameter keys and values for each template are stripped of
- whitespace.
-
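-        Example (based on the inline comment in the implementation): a
-        page containing ``{{tmp|one|two|5=five|three}}`` yields the
-        parameter list ``['one', 'two', 'three', '5=five']`` for that
-        template.
-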
- :return: a list of tuples with one tuple for each template invocation
- in the page, with the template Page as the first entry and a list
- of parameters as the second entry.
- :rtype: list of (pywikibot.page.Page, list)
- """
- # WARNING: may not return all templates used in particularly
- # intricate cases such as template substitution
- titles = {t.title() for t in self.templates()}
- templates = self.raw_extracted_templates
- # backwards-compatibility: convert the dict returned as the second
- # element into a list in the format used by old scripts
- result = []
- for template in templates:
- try:
- link = pywikibot.Link(template[0], self.site,
- default_namespace=10)
- if link.canonical_title() not in titles:
- continue
- except Error:
-                # this is a parser function or magic word, not a template
-                # name; the template name might also contain invalid parts
- continue
- args = template[1]
- intkeys = {}
- named = {}
- positional = []
- for key in sorted(args):
- try:
- intkeys[int(key)] = args[key]
- except ValueError:
- named[key] = args[key]
- for i in range(1, len(intkeys) + 1):
- # only those args with consecutive integer keys can be
- # treated as positional; an integer could also be used
- # (out of order) as the key for a named argument
- # example: {{tmp|one|two|5=five|three}}
- if i in intkeys:
- positional.append(intkeys[i])
- else:
- for k in intkeys:
- if k < 1 or k >= i:
- named[str(k)] = intkeys[k]
- break
- for item in named.items():
- positional.append('{}={}'.format(*item))
- result.append((pywikibot.Page(link, self.site), positional))
- return result
-
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Change the page's text to point to the redirect page.
-
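-        Usage sketch (site, titles and summary are illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            page = pywikibot.Page(site, 'Old title')
-            page.set_redirect_target('New title', force=True,
-                                     summary='Retarget redirect')
-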
- :param target_page: target of the redirect, this argument is required.
- :type target_page: pywikibot.Page or string
-        :param create: if true, it creates the redirect even if the page
-            doesn't exist.
-        :param force: if true, set the redirect target even if the page
-            doesn't exist or is not a redirect.
-        :param keep_section: if the old redirect links to a section and
-            the new one doesn't, the old redirect's section is kept.
-        :param save: if true, save the page immediately.
- :param kwargs: Arguments which are used for saving the page directly
- afterwards, like 'summary' for edit summary.
- """
- if isinstance(target_page, str):
- target_page = pywikibot.Page(self.site, target_page)
- elif self.site != target_page.site:
- raise InterwikiRedirectPageError(self, target_page)
- if not self.exists() and not (create or force):
- raise NoPageError(self)
- if self.exists() and not self.isRedirectPage() and not force:
- raise IsNotRedirectPageError(self)
- redirect_regex = self.site.redirect_regex
- if self.exists():
- old_text = self.get(get_redirect=True)
- else:
- old_text = ''
- result = redirect_regex.search(old_text)
- if result:
- oldlink = result.group(1)
- if (keep_section and '#' in oldlink
- and target_page.section() is None):
- sectionlink = oldlink[oldlink.index('#'):]
- target_page = pywikibot.Page(
- self.site,
- target_page.title() + sectionlink
- )
- prefix = self.text[:result.start()]
- suffix = self.text[result.end():]
- else:
- prefix = ''
- suffix = ''
-
- target_link = target_page.title(as_link=True, textlink=True,
- allow_interwiki=False)
- target_link = '#{} {}'.format(self.site.redirect(), target_link)
- self.text = prefix + target_link + suffix
- if save:
- self.save(**kwargs)
-
- def get_best_claim(self, prop: str):
- """
- Return the first best Claim for this page.
-
- Return the first 'preferred' ranked Claim specified by Wikibase
- property or the first 'normal' one otherwise.
-
- .. versionadded:: 3.0
-
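-        Example (a sketch; assumes the page's Wikibase item carries P31
-        'instance of' claims)::
-
-            claim = page.get_best_claim('P31')
-            if claim:
-                target = claim.getTarget()
-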
- :param prop: property id, "P###"
- :return: Claim object given by Wikibase property number
- for this page object.
- :rtype: pywikibot.Claim or None
-
- :raises UnknownExtensionError: site has no Wikibase extension
- """
- def find_best_claim(claims):
- """Find the first best ranked claim."""
- index = None
- for i, claim in enumerate(claims):
- if claim.rank == 'preferred':
- return claim
- if index is None and claim.rank == 'normal':
- index = i
- if index is None:
- index = 0
- return claims[index]
-
- if not self.site.has_data_repository:
- raise UnknownExtensionError(
- 'Wikibase is not implemented for {}.'.format(self.site))
-
- def get_item_page(func, *args):
- try:
- item_p = func(*args)
- item_p.get()
- return item_p
- except NoPageError:
- return None
- except IsRedirectPageError:
- return get_item_page(item_p.getRedirectTarget)
-
- item_page = get_item_page(pywikibot.ItemPage.fromPage, self)
- if item_page and prop in item_page.claims:
- return find_best_claim(item_page.claims[prop])
- return None
-
-
-class FilePage(Page):
-
- """
- A subclass of Page representing a file description page.
-
- Supports the same interface as Page, with some added methods.
- """
-
- def __init__(self, source, title: str = '') -> None:
- """Initializer."""
- self._file_revisions = {} # dictionary to cache File history.
- super().__init__(source, title, 6)
- if self.namespace() != 6:
- raise ValueError("'{}' is not in the file namespace!"
- .format(self.title()))
-
- def _load_file_revisions(self, imageinfo) -> None:
- for file_rev in imageinfo:
- # filemissing in API response indicates most fields are missing
- # see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/533482/
- if 'filemissing' in file_rev:
- pywikibot.warning("File '{}' contains missing revisions"
- .format(self.title()))
- continue
- file_revision = FileInfo(file_rev)
- self._file_revisions[file_revision.timestamp] = file_revision
-
- @property
- def latest_file_info(self):
- """
-        Retrieve and store information about the latest file revision.
-
- At the same time, the whole history of Image is fetched and cached in
- self._file_revisions
-
- :return: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- latest_ts = max(self._file_revisions)
- return self._file_revisions[latest_ts]
-
- @property
- def oldest_file_info(self):
- """
-        Retrieve and store information about the oldest file revision.
-
- At the same time, the whole history of Image is fetched and cached in
- self._file_revisions
-
- :return: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- oldest_ts = min(self._file_revisions)
- return self._file_revisions[oldest_ts]
-
- def get_file_history(self) -> dict:
- """
- Return the file's version history.
-
- :return: dictionary with:
- key: timestamp of the entry
- value: instance of FileInfo()
- """
- if not self._file_revisions:
- self.site.loadimageinfo(self, history=True)
- return self._file_revisions
-
- def getImagePageHtml(self) -> str:
- """Download the file page, and return the HTML, as a string.
-
- Caches the HTML code, so that if you run this method twice on the
- same FilePage object, the page will only be downloaded once.
- """
- if not hasattr(self, '_imagePageHtml'):
- path = '{}/index.php?title={}'.format(self.site.scriptpath(),
- self.title(as_url=True))
- self._imagePageHtml = http.request(self.site, path).text
- return self._imagePageHtml
-
- def get_file_url(self, url_width=None, url_height=None,
- url_param=None) -> str:
- """
- Return the url or the thumburl of the file described on this page.
-
- Fetch the information if not available.
-
- Once retrieved, thumburl information will also be accessible as
- latest_file_info attributes, named as in [1]:
- - url, thumburl, thumbwidth and thumbheight
-
- Parameters correspond to iiprops in:
- [1] https://www.mediawiki.org/wiki/API:Imageinfo
-
-        Parameter validation and error handling are left to the API call.
-
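-        Example (a sketch; site and file title are illustrative)::
-
-            site = pywikibot.Site('commons', 'commons')
-            fp = pywikibot.FilePage(site, 'File:Example.jpg')
-            thumb_url = fp.get_file_url(url_width=320)
-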
- :param url_width: see iiurlwidth in [1]
-        :param url_height: see iiurlheight in [1]
- :param url_param: see iiurlparam in [1]
- :return: latest file url or thumburl
- """
- # Plain url is requested.
- if url_width is None and url_height is None and url_param is None:
- return self.latest_file_info.url
-
- # Thumburl is requested.
- self.site.loadimageinfo(self, history=not self._file_revisions,
- url_width=url_width, url_height=url_height,
- url_param=url_param)
- return self.latest_file_info.thumburl
-
- def file_is_shared(self) -> bool:
- """Check if the file is stored on any known shared repository."""
- # as of now, the only known repositories are commons and wikitravel
- # TODO: put the URLs to family file
- if not self.site.has_image_repository:
- return False
-
- if 'wikitravel_shared' in self.site.shared_image_repository():
- return self.latest_file_info.url.startswith(
- 'https://wikitravel.org/upload/shared/')
- # default to commons
- return self.latest_file_info.url.startswith(
- 'https://upload.wikimedia.org/wikipedia/commons/')
-
- def getFileVersionHistoryTable(self):
- """Return the version history in the form of a wiki table."""
- lines = []
- for info in self.get_file_history().values():
- dimension = '{width}×{height} px ({size} bytes)'.format(
- **info.__dict__)
- lines.append('| {timestamp} || {user} || {dimension} |'
- '| <nowiki>{comment}</nowiki>'
- .format(dimension=dimension, **info.__dict__))
- return ('{| class="wikitable"\n'
- '! {{int:filehist-datetime}} || {{int:filehist-user}} |'
- '| {{int:filehist-dimensions}} || {{int:filehist-comment}}\n'
- '|-\n%s\n|}\n' % '\n|-\n'.join(lines))
-
- def usingPages(self, total: Optional[int] = None, content: bool = False):
- """Yield Pages on which the file is displayed.
-
- :param total: iterate no more than this number of pages in total
- :param content: if True, load the current content of each iterated page
- (default False)
- """
- return self.site.imageusage(self, total=total, content=content)
-
- def upload(self, source: str, **kwargs) -> bool:
- """
- Upload this file to the wiki.
-
- keyword arguments are from site.upload() method.
-
- :param source: Path or URL to the file to be uploaded.
-
- :keyword comment: Edit summary; if this is not provided, then
- filepage.text will be used. An empty summary is not permitted.
- This may also serve as the initial page text (see below).
- :keyword text: Initial page text; if this is not set, then
- filepage.text will be used, or comment.
- :keyword watch: If true, add filepage to the bot user's watchlist
-        :keyword ignore_warnings: It may be a static boolean, a callable
-            returning a boolean or an iterable. The callable gets a list
-            of UploadError instances; the iterable should contain the
-            warning codes for which an equivalent callable would return
-            True if all UploadError codes are in that list. If the
-            result is False the upload is not continued; otherwise any
-            warning is disabled and the upload is reattempted. NOTE: If
-            report_success is True or None, an UploadError exception is
-            raised if the static boolean is False.
- :type ignore_warnings: bool or callable or iterable of str
-        :keyword chunk_size: The chunk size in bytes for chunked
-            uploading (see
-            https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading).
-            It will only upload in chunks if the chunk size is positive
-            but lower than the file size.
- :type chunk_size: int
- :keyword report_success: If the upload was successful it'll print a
- success message and if ignore_warnings is set to False it'll
- raise an UploadError if a warning occurred. If it's
- None (default) it'll be True if ignore_warnings is a bool and False
- otherwise. If it's True or None ignore_warnings must be a bool.
-        :return: True if the upload was successful, False otherwise.
- """
- filename = url = None
- if '://' in source:
- url = source
- else:
- filename = source
- return self.site.upload(self, source_filename=filename, source_url=url,
- **kwargs)
-
- def download(self, filename=None, chunk_size=100 * 1024, revision=None):
- """
-        Download the file of this FilePage to the given filename.
-
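-        Example (a sketch; site and file title are illustrative)::
-
-            site = pywikibot.Site('commons', 'commons')
-            fp = pywikibot.FilePage(site, 'File:Example.jpg')
-            ok = fp.download()  # saves 'Example.jpg' and verifies sha1
-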
- :param filename: filename where to save file:
- None: self.title(as_filename=True, with_ns=False)
- will be used
- str: provided filename will be used.
- :type filename: None or str
- :param chunk_size: the size of each chunk to be received and
- written to file.
- :type chunk_size: int
- :param revision: file revision to download:
- None: self.latest_file_info will be used
- FileInfo: provided revision will be used.
- :type revision: None or FileInfo
- :return: True if download is successful, False otherwise.
- :raise IOError: if filename cannot be written for any reason.
- """
- if filename is None:
- filename = self.title(as_filename=True, with_ns=False)
-
- filename = os.path.expanduser(filename)
-
- if revision is None:
- revision = self.latest_file_info
-
- req = http.fetch(revision.url, stream=True)
- if req.status_code == HTTPStatus.OK:
- try:
- with open(filename, 'wb') as f:
- for chunk in req.iter_content(chunk_size):
- f.write(chunk)
-            except OSError:
-                raise
-
- sha1 = compute_file_hash(filename)
- return sha1 == revision.sha1
- pywikibot.warning(
-            'Unsuccessful request ({}): {}'
- .format(req.status_code, req.url))
- return False
-
- def globalusage(self, total=None):
- """
- Iterate all global usage for this page.
-
- :param total: iterate no more than this number of pages in total
- :return: a generator that yields Pages also on sites different from
- self.site.
- :rtype: generator
- """
- return self.site.globalusage(self, total=total)
-
- def data_item(self):
- """
- Convenience function to get the associated Wikibase item of the file.
-
- If WikibaseMediaInfo extension is available (e.g. on Commons),
- the method returns the associated mediainfo entity. Otherwise,
- it falls back to behavior of BasePage.data_item.
-
- .. versionadded:: 6.5
-
- :rtype: pywikibot.page.WikibaseEntity
- """
- if self.site.has_extension('WikibaseMediaInfo'):
- if not hasattr(self, '_item'):
- self._item = MediaInfo(self.site)
- self._item._file = self
- return self._item
-
- return super().data_item()
-
-
-class Category(Page):
-
- """A page in the Category: namespace."""
-
- def __init__(self, source, title: str = '', sort_key=None) -> None:
- """
- Initializer.
-
-        All parameters are the same as for Page() initializer.
- """
- self.sortKey = sort_key
- super().__init__(source, title, ns=14)
- if self.namespace() != 14:
- raise ValueError("'{}' is not in the category namespace!"
- .format(self.title()))
-
- def aslink(self, sort_key: Optional[str] = None) -> str:
- """
- Return a link to place a page in this Category.
-
- Use this only to generate a "true" category link, not for interwikis
- or text links to category pages.
-
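-        Example (site, category title and sort key are illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            Category(site, 'Category:Foo').aslink('Bar')
-            # returns '[[Category:Foo|Bar]]'
-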
- :param sort_key: The sort key for the article to be placed in this
- Category; if omitted, default sort key is used.
- """
- key = sort_key or self.sortKey
- if key is not None:
- title_with_sort_key = self.title(with_section=False) + '|' + key
- else:
- title_with_sort_key = self.title(with_section=False)
- return '[[{}]]'.format(title_with_sort_key)
-
- def subcategories(self,
- recurse: Union[int, bool] = False,
- total: Optional[int] = None,
- content: bool = False):
- """
- Iterate all subcategories of the current category.
-
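-        Usage sketch (the category title is illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            cat = pywikibot.Category(site, 'Category:Countries')
-            for subcat in cat.subcategories(recurse=1):
-                print(subcat.title())
-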
- :param recurse: if not False or 0, also iterate subcategories of
- subcategories. If an int, limit recursion to this number of
- levels. (Example: recurse=1 will iterate direct subcats and
- first-level sub-sub-cats, but no deeper.)
- :param total: iterate no more than this number of
- subcategories in total (at all levels)
- :param content: if True, retrieve the content of the current version
- of each category description page (default False)
- """
- if not isinstance(recurse, bool) and recurse:
- recurse = recurse - 1
- if not hasattr(self, '_subcats'):
- self._subcats = []
- for member in self.site.categorymembers(
- self, member_type='subcat', total=total, content=content):
- subcat = Category(member)
- self._subcats.append(subcat)
- yield subcat
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- for item in subcat.subcategories(
- recurse, total=total, content=content):
- yield item
- if total is not None:
- total -= 1
- if total == 0:
- return
- else:
- for subcat in self._subcats:
- yield subcat
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- for item in subcat.subcategories(
- recurse, total=total, content=content):
- yield item
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- def articles(self,
- recurse: Union[int, bool] = False,
- total: Optional[int] = None,
- content: bool = False,
-                 namespaces: Optional[Union[int, List[int]]] = None,
- sortby: Optional[str] = None,
- reverse: bool = False,
- starttime=None, endtime=None,
- startprefix: Optional[str] = None,
- endprefix: Optional[str] = None):
- """
- Yield all articles in the current category.
-
- By default, yields all *pages* in the category that are not
- subcategories!
-
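-        Usage sketch (the category title is illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            cat = pywikibot.Category(site, 'Category:Physics')
-            for article in cat.articles(recurse=1, total=10):
-                print(article.title())
-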
- :param recurse: if not False or 0, also iterate articles in
- subcategories. If an int, limit recursion to this number of
- levels. (Example: recurse=1 will iterate articles in first-level
- subcats, but no deeper.)
- :param total: iterate no more than this number of pages in
- total (at all levels)
- :param namespaces: only yield pages in the specified namespaces
- :param content: if True, retrieve the content of the current version
- of each page (default False)
- :param sortby: determines the order in which results are generated,
- valid values are "sortkey" (default, results ordered by category
- sort key) or "timestamp" (results ordered by time page was
- added to the category). This applies recursively.
- :param reverse: if True, generate results in reverse order
- (default False)
- :param starttime: if provided, only generate pages added after this
- time; not valid unless sortby="timestamp"
- :type starttime: pywikibot.Timestamp
- :param endtime: if provided, only generate pages added before this
- time; not valid unless sortby="timestamp"
- :type endtime: pywikibot.Timestamp
- :param startprefix: if provided, only generate pages >= this title
- lexically; not valid if sortby="timestamp"
- :param endprefix: if provided, only generate pages < this title
- lexically; not valid if sortby="timestamp"
- :rtype: typing.Iterable[pywikibot.Page]
- """
- seen = set()
- for member in self.site.categorymembers(self,
- namespaces=namespaces,
- total=total,
- content=content,
- sortby=sortby,
- reverse=reverse,
- starttime=starttime,
- endtime=endtime,
- startprefix=startprefix,
- endprefix=endprefix,
- member_type=['page', 'file']):
- if recurse:
- seen.add(hash(member))
- yield member
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- if recurse:
- if not isinstance(recurse, bool) and recurse:
- recurse -= 1
- for subcat in self.subcategories():
- for article in subcat.articles(recurse=recurse,
- total=total,
- content=content,
- namespaces=namespaces,
- sortby=sortby,
- reverse=reverse,
- starttime=starttime,
- endtime=endtime,
- startprefix=startprefix,
- endprefix=endprefix):
- hash_value = hash(article)
- if hash_value in seen:
- continue
- seen.add(hash_value)
- yield article
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- def members(self, recurse: bool = False,
- namespaces=None,
- total: Optional[int] = None,
- content: bool = False):
- """Yield all category contents (subcats, pages, and files).
-
- :rtype: typing.Iterable[pywikibot.Page]
- """
- for member in self.site.categorymembers(
- self, namespaces=namespaces, total=total, content=content):
- yield member
- if total is not None:
- total -= 1
- if total == 0:
- return
- if recurse:
- if not isinstance(recurse, bool) and recurse:
- recurse = recurse - 1
- for subcat in self.subcategories():
- for article in subcat.members(
- recurse, namespaces, total=total, content=content):
- yield article
- if total is not None:
- total -= 1
- if total == 0:
- return
-
- def isEmptyCategory(self) -> bool:
- """Return True if category has no members (including subcategories)."""
- ci = self.categoryinfo
- return sum(ci[k] for k in ['files', 'pages', 'subcats']) == 0
-
- def isHiddenCategory(self) -> bool:
- """Return True if the category is hidden."""
- return 'hiddencat' in self.properties()
-
- @property
- def categoryinfo(self) -> dict:
- """
- Return a dict containing information about the category.
-
-        The dict contains the numbers of pages, subcategories and files,
-        and the total contents count.
- """
- return self.site.categoryinfo(self)
-
- def newest_pages(self, total=None):
- """
- Return pages in a category ordered by the creation date.
-
- If two or more pages are created at the same time, the pages are
- returned in the order they were added to the category. The most
- recently added page is returned first.
-
-        It only allows returning the pages ordered from newest to oldest,
-        as it is impossible to determine the oldest page in a category
-        without checking all pages. The category can, however, be checked
-        in order with the newly added pages first; this yields all pages
-        which were created after the currently checked page was added
-        (and thus there is no page created after any of the cached pages
-        but added before the currently checked one).
-
- :param total: The total number of pages queried.
- :type total: int
- :return: A page generator of all pages in a category ordered by the
-            creation date, from newest to oldest. Note: it currently only
-            returns Page instances and not subclasses, even if a subclass
-            would be possible. This might change, so don't expect to only
-            get Page instances.
- :rtype: generator
- """
- def check_cache(latest):
- """Return the cached pages in order and not more than total."""
- cached = []
- for timestamp in sorted((ts for ts in cache if ts > latest),
- reverse=True):
- # The complete list can be removed, it'll either yield all of
- # them, or only a portion but will skip the rest anyway
- cached += cache.pop(timestamp)[:None if total is None else
- total - len(cached)]
- if total and len(cached) >= total:
- break # already got enough
-            assert total is None or len(cached) <= total, \
-                'More pages cached than the total number requested'
- return cached
-
-        # Cache of all pages which have been checked but were created
-        # before the currently checked page was added; as iteration
-        # proceeds, they will at some point have been created after the
-        # checked page was added and can then be yielded. Pages are
-        # stored by creation timestamp; be prepared for multiple pages.
- cache = defaultdict(list)
-        # TODO: Make site.categorymembers usable as it returns pages
-        # There is no total defined, as it's not known how many pages need
-        # to be checked before the total amount of new pages is found. In
-        # the worst case all pages of a category need to be checked.
- for member in pywikibot.data.api.QueryGenerator(
- site=self.site, parameters={
- 'list': 'categorymembers', 'cmsort': 'timestamp',
- 'cmdir': 'older', 'cmprop': 'timestamp|title',
- 'cmtitle': self.title()}):
- # TODO: Upcast to suitable class
- page = pywikibot.Page(self.site, member['title'])
- assert page.namespace() == member['ns'], \
- 'Namespace of the page is not consistent'
- cached = check_cache(pywikibot.Timestamp.fromISOformat(
- member['timestamp']))
- yield from cached
- if total is not None:
- total -= len(cached)
- if total <= 0:
- break
- cache[page.oldest_revision.timestamp] += [page]
- else:
- # clear cache
- assert total is None or total > 0, \
- 'As many items as given in total already returned'
- yield from check_cache(pywikibot.Timestamp.min)
-
-
-class User(Page):
-
- """
- A class that represents a Wiki user.
-
-    This class also represents the Wiki page User:<username>.
- """
-
- def __init__(self, source, title: str = '') -> None:
- """
- Initializer for a User object.
-
-        All parameters are the same as for Page() initializer.
- """
- self._isAutoblock = True
- if title.startswith('#'):
- title = title[1:]
- elif ':#' in title:
- title = title.replace(':#', ':')
- else:
- self._isAutoblock = False
- super().__init__(source, title, ns=2)
- if self.namespace() != 2:
- raise ValueError("'{}' is not in the user namespace!"
- .format(self.title()))
- if self._isAutoblock:
- # This user is probably being queried for purpose of lifting
- # an autoblock.
-            pywikibot.output(
-                'This is an autoblock ID; you can only use it to unblock.')
-
- @property
- def username(self) -> str:
- """
- The username.
-
- Convenience method that returns the title of the page with
- namespace prefix omitted, which is the username.
- """
- if self._isAutoblock:
- return '#' + self.title(with_ns=False)
- return self.title(with_ns=False)
-
- def isRegistered(self, force: bool = False) -> bool:
- """
- Determine if the user is registered on the site.
-
- It is possible to have a page named User:xyz and not have
- a corresponding user with username xyz.
-
- The page does not need to exist for this method to return
- True.
-
- :param force: if True, forces reloading the data from API
- """
- # T135828: the registration timestamp may be None but the key exists
- return (not self.isAnonymous()
- and 'registration' in self.getprops(force))
-
- def isAnonymous(self) -> bool:
- """Determine if the user is editing as an IP address."""
- return is_ip_address(self.username)
-
- def getprops(self, force: bool = False) -> dict:
- """
-        Return a dict of properties about the user.
-
- :param force: if True, forces reloading the data from API
- """
- if force and hasattr(self, '_userprops'):
- del self._userprops
- if not hasattr(self, '_userprops'):
- self._userprops = list(self.site.users([self.username, ]))[0]
- if self.isAnonymous():
- r = list(self.site.blocks(iprange=self.username, total=1))
- if r:
- self._userprops['blockedby'] = r[0]['by']
- self._userprops['blockreason'] = r[0]['reason']
- return self._userprops
-
- def registration(self, force: bool = False):
- """
- Fetch registration date for this user.
-
- :param force: if True, forces reloading the data from API
- :rtype: pywikibot.Timestamp or None
- """
- if not self.isAnonymous():
- reg = self.getprops(force).get('registration')
- if reg:
- return pywikibot.Timestamp.fromISOformat(reg)
- return None
-
- def editCount(self, force: bool = False) -> int:
- """
- Return edit count for a registered user.
-
- Always returns 0 for 'anonymous' users.
-
- :param force: if True, forces reloading the data from API
- """
- return self.getprops(force).get('editcount', 0)
-
- def is_blocked(self, force: bool = False) -> bool:
- """Determine whether the user is currently blocked.
-
- .. versionchanged:: 7.0
- renamed from :meth:`isBlocked` method,
- can also detect range blocks.
-
- :param force: if True, forces reloading the data from API
- """
- return 'blockedby' in self.getprops(force)
-
- @deprecated('is_blocked', since='7.0.0')
- def isBlocked(self, force: bool = False) -> bool:
- """Determine whether the user is currently blocked.
-
- .. deprecated:: 7.0
- use :meth:`is_blocked` instead
-
- :param force: if True, forces reloading the data from API
- """
- return self.is_blocked(force)
-
- def is_locked(self, force: bool = False) -> bool:
- """Determine whether the user is currently locked globally.
-
- .. versionadded:: 7.0
-
- :param force: if True, forces reloading the data from API
- """
- return self.site.is_locked(self.username, force)
-
- def isEmailable(self, force: bool = False) -> bool:
- """
-        Determine whether emails may be sent to this user through MediaWiki.
-
- :param force: if True, forces reloading the data from API
- """
- return not self.isAnonymous() and 'emailable' in self.getprops(force)
-
- def groups(self, force: bool = False) -> list:
- """
- Return a list of groups to which this user belongs.
-
- The list of groups may be empty.
-
- :param force: if True, forces reloading the data from API
- :return: groups property
- """
- return self.getprops(force).get('groups', [])
-
- def gender(self, force: bool = False) -> str:
- """Return the gender of the user.
-
- :param force: if True, forces reloading the data from API
-        :return: 'male', 'female', or 'unknown'
- """
- if self.isAnonymous():
- return 'unknown'
- return self.getprops(force).get('gender', 'unknown')
-
- def rights(self, force: bool = False) -> list:
- """Return user rights.
-
- :param force: if True, forces reloading the data from API
-        :return: user rights
- """
- return self.getprops(force).get('rights', [])
-
- def getUserPage(self, subpage: str = ''):
- """
- Return a Page object relative to this user's main page.
-
- :param subpage: subpage part to be appended to the main
- page title (optional)
- :type subpage: str
- :return: Page object of user page or user subpage
- :rtype: pywikibot.Page
- """
- if self._isAutoblock:
- # This user is probably being queried for purpose of lifting
- # an autoblock, so has no user pages per se.
-            raise AutoblockUserError(
-                'This is an autoblock ID; you can only use it to unblock.')
- if subpage:
- subpage = '/' + subpage
- return Page(Link(self.title() + subpage, self.site))
-
- def getUserTalkPage(self, subpage: str = ''):
- """
- Return a Page object relative to this user's main talk page.
-
- :param subpage: subpage part to be appended to the main
- talk page title (optional)
- :type subpage: str
- :return: Page object of user talk page or user talk subpage
- :rtype: pywikibot.Page
- """
- if self._isAutoblock:
- # This user is probably being queried for purpose of lifting
- # an autoblock, so has no user talk pages per se.
-            raise AutoblockUserError(
-                'This is an autoblock ID; you can only use it to unblock.')
- if subpage:
- subpage = '/' + subpage
- return Page(Link(self.username + subpage,
- self.site, default_namespace=3))
-
- def send_email(self, subject: str, text: str, ccme: bool = False) -> bool:
- """
- Send an email to this user via MediaWiki's email interface.
-
- :param subject: the subject header of the mail
- :param text: mail body
- :param ccme: if True, sends a copy of this email to the bot
- :raises NotEmailableError: the user of this User is not emailable
- :raises UserRightsError: logged in user does not have 'sendemail' right
- :return: operation successful indicator
- """
- if not self.isEmailable():
- raise NotEmailableError(self)
-
- if not self.site.has_right('sendemail'):
- raise UserRightsError("You don't have permission to send mail")
-
- params = {
- 'action': 'emailuser',
- 'target': self.username,
- 'token': self.site.tokens['email'],
- 'subject': subject,
- 'text': text,
- }
- if ccme:
- params['ccme'] = 1
- mailrequest = self.site.simple_request(**params)
- maildata = mailrequest.submit()
-
- if 'emailuser' in maildata:
- if maildata['emailuser']['result'] == 'Success':
- return True
- return False
-
- def block(self, *args, **kwargs):
- """
- Block user.
-
-        Refer to :py:obj:`APISite.blockuser` method for parameters.
-
- :return: None
- """
- try:
- self.site.blockuser(self, *args, **kwargs)
- except APIError as err:
- if err.code == 'invalidrange':
- raise ValueError('{} is not a valid IP range.'
- .format(self.username))
-
- raise err
-
- def unblock(self, reason: Optional[str] = None) -> None:
- """
- Remove the block for the user.
-
- :param reason: Reason for the unblock.
- """
- self.site.unblockuser(self, reason)
-
- def logevents(self, **kwargs):
- """Yield user activities.
-
- :keyword logtype: only iterate entries of this type
- (see mediawiki api documentation for available types)
- :type logtype: str
- :keyword page: only iterate entries affecting this page
- :type page: Page or str
- :keyword namespace: namespace to retrieve logevents from
- :type namespace: int or Namespace
- :keyword start: only iterate entries from and after this Timestamp
- :type start: Timestamp or ISO date string
- :keyword end: only iterate entries up to and through this Timestamp
- :type end: Timestamp or ISO date string
- :keyword reverse: if True, iterate oldest entries first
- (default: newest)
- :type reverse: bool
- :keyword tag: only iterate entries tagged with this tag
- :type tag: str
- :keyword total: maximum number of events to iterate
- :type total: int
- :rtype: iterable
- """
- return self.site.logevents(user=self.username, **kwargs)
-
- @property
- def last_event(self):
- """Return last user activity.
-
- :return: last user log entry
- :rtype: LogEntry or None
- """
- return next(iter(self.logevents(total=1)), None)
-
- def contributions(self, total: int = 500, **kwargs) -> tuple:
- """
-        Yield tuples describing this user's edits.
-
- Each tuple is composed of a pywikibot.Page object,
- the revision id (int), the edit timestamp (as a pywikibot.Timestamp
- object), and the comment (str).
- Pages returned are not guaranteed to be unique.
-
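-        Usage sketch (the username is illustrative)::
-
-            site = pywikibot.Site('en', 'wikipedia')
-            user = pywikibot.User(site, 'Example user')
-            for page, revid, ts, comment in user.contributions(total=5):
-                print(revid, page.title())
-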
- :param total: limit result to this number of pages
- :keyword start: Iterate contributions starting at this Timestamp
- :keyword end: Iterate contributions ending at this Timestamp
- :keyword reverse: Iterate oldest contributions first (default: newest)
- :keyword namespaces: only iterate pages in these namespaces
- :type namespaces: iterable of str or Namespace key,
- or a single instance of those types. May be a '|' separated
- list of namespace identifiers.
- :keyword showMinor: if True, iterate only minor edits; if False and
- not None, iterate only non-minor edits (default: iterate both)
- :keyword top_only: if True, iterate only edits which are the latest
- revision (default: False)
- :return: tuple of pywikibot.Page, revid, pywikibot.Timestamp, comment
- """
- for contrib in self.site.usercontribs(
- user=self.username, total=total, **kwargs):
- ts = pywikibot.Timestamp.fromISOformat(contrib['timestamp'])
- yield (Page(self.site, contrib['title'], contrib['ns']),
- contrib['revid'],
- ts,
- contrib.get('comment'))
-
- @property
- def first_edit(self):
- """Return first user contribution.
-
-        :return: first user contribution entry as a tuple of
-            pywikibot.Page, revid, pywikibot.Timestamp and comment
-        :rtype: tuple or None
- """
- return next(self.contributions(reverse=True, total=1), None)
-
- @property
- def last_edit(self):
- """Return last user contribution.
-
-        :return: last user contribution entry as a tuple of
-            pywikibot.Page, revid, pywikibot.Timestamp and comment
-        :rtype: tuple or None
- """
- return next(self.contributions(total=1), None)
-
- def deleted_contributions(
- self, *, total: int = 500, **kwargs
- ) -> Iterable[Tuple[Page, Revision]]:
- """Yield tuples describing this user's deleted edits.
-
- .. versionadded:: 5.5
-
- :param total: Limit results to this number of pages
- :keyword start: Iterate contributions starting at this Timestamp
- :keyword end: Iterate contributions ending at this Timestamp
- :keyword reverse: Iterate oldest contributions first (default: newest)
- :keyword namespaces: Only iterate pages in these namespaces
- """
- for data in self.site.alldeletedrevisions(user=self.username,
- total=total, **kwargs):
- page = Page(self.site, data['title'], data['ns'])
- for contrib in data['revisions']:
- yield page, Revision(**contrib)
-
- def uploadedImages(self, total=10):
- """
- Yield tuples describing files uploaded by this user.
-
- Each tuple is composed of a pywikibot.Page, the timestamp (str in
- ISO8601 format), comment (str) and a bool for pageid > 0.
- Pages returned are not guaranteed to be unique.
-
- :param total: limit result to this number of pages
- :type total: int
- """
- if not self.isRegistered():
- return
- for item in self.logevents(logtype='upload', total=total):
- yield (item.page(),
- str(item.timestamp()),
- item.comment(),
- item.pageid() > 0)
-
- @property
- def is_thankable(self) -> bool:
- """
- Determine if the user has thanks notifications enabled.
-
-        NOTE: This doesn't accurately determine if thanks is enabled for
-        the user. Privacy of thanks preferences is under discussion;
-        please see https://phabricator.wikimedia.org/T57401#2216861 and
-        https://phabricator.wikimedia.org/T120753#1863894
- """
- return self.isRegistered() and 'bot' not in self.groups()
-
-
-class WikibaseEntity:
-
- """
- The base interface for Wikibase entities.
-
- Each entity is identified by a data repository it belongs to
- and an identifier.
-
-    :cvar DATA_ATTRIBUTES: dictionary which maps data attributes (e.g.
-        'labels', 'claims') to appropriate collection classes (e.g.
-        LanguageDict, ClaimCollection)
-
- :cvar entity_type: entity type identifier
- :type entity_type: str
-
- :cvar title_pattern: regular expression which matches all possible
- entity ids for this entity type
- :type title_pattern: str
- """
-
- DATA_ATTRIBUTES = {} # type: Dict[str, Any]
-
- def __init__(self, repo, id_=None) -> None:
- """
- Initializer.
-
- :param repo: Entity repository.
- :type repo: DataSite
- :param id_: Entity identifier.
-        :type id_: str or None; '-1' and None mean non-existing
- """
- self.repo = repo
- self.id = id_ if id_ is not None else '-1'
- if self.id != '-1' and not self.is_valid_id(self.id):
- raise InvalidTitleError(
- "'{}' is not a valid {} page title"
- .format(self.id, self.entity_type))
-
- def __repr__(self) -> str:
- if self.id != '-1':
- return 'pywikibot.page.{}({!r}, {!r})'.format(
- self.__class__.__name__, self.repo, self.id)
- return 'pywikibot.page.{}({!r})'.format(
- self.__class__.__name__, self.repo)
-
- @classmethod
- def is_valid_id(cls, entity_id: str) -> bool:
- """
- Whether the string can be a valid id of the entity type.
-
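-        Example, using ``ItemPage`` whose ``title_pattern`` is
-        ``Q[1-9]\d*``::
-
-            ItemPage.is_valid_id('Q42')  # True
-            ItemPage.is_valid_id('P31')  # False
-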
- :param entity_id: The ID to test.
- """
- if not hasattr(cls, 'title_pattern'):
- return True
-
- return bool(re.fullmatch(cls.title_pattern, entity_id))
-
- def __getattr__(self, name):
- if name in self.DATA_ATTRIBUTES:
- if self.getID() == '-1':
- for key, cls in self.DATA_ATTRIBUTES.items():
- setattr(self, key, cls.new_empty(self.repo))
- return getattr(self, name)
- return self.get()[name]
-
- raise AttributeError("'{}' object has no attribute '{}'"
- .format(self.__class__.__name__, name))
-
- def _defined_by(self, singular: bool = False) -> dict:
- """
- Internal function to provide the API parameters to identify the entity.
-
- An empty dict is returned if the entity has not been created yet.
-
- :param singular: Whether the parameter names should use the singular
- form
- :return: API parameters
- """
- params = {}
- if self.id != '-1':
- if singular:
- params['id'] = self.id
- else:
- params['ids'] = self.id
- return params
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this entity.
-
- :param numeric: Strip the first letter and return an int
- """
- if numeric:
- return int(self.id[1:]) if self.id != '-1' else -1
- return self.id
-
- def get_data_for_new_entity(self) -> dict:
- """
- Return data required for creation of a new entity.
-
-        Override it if needed.
- """
- return {}
-
- def toJSON(self, diffto: Optional[dict] = None) -> dict:
- """
- Create JSON suitable for Wikibase API.
-
- When diffto is provided, JSON representing differences
- to the provided data is created.
-
- :param diffto: JSON containing entity data
- """
- data = {}
- for key in self.DATA_ATTRIBUTES:
- attr = getattr(self, key, None)
- if attr is None:
- continue
- if diffto:
- value = attr.toJSON(diffto=diffto.get(key))
- else:
- value = attr.toJSON()
- if value:
- data[key] = value
- return data
-
- @classmethod
- def _normalizeData(cls, data: dict) -> dict:
- """
- Helper function to expand data into the Wikibase API structure.
-
- :param data: The dict to normalize
- :return: The dict with normalized data
- """
- norm_data = {}
- for key, attr in cls.DATA_ATTRIBUTES.items():
- if key in data:
- norm_data[key] = attr.normalizeData(data[key])
- return norm_data
-
- @property
- def latest_revision_id(self) -> Optional[int]:
- """
- Get the revision identifier for the most recent revision of the entity.
-
- :rtype: int or None if it cannot be determined
- :raise NoWikibaseEntityError: if the entity doesn't exist
- """
- if not hasattr(self, '_revid'):
-            # fixme: unlike BasePage.latest_revision_id, this raises an
-            # exception when the entity is a redirect; get_redirect
-            # cannot be used here
- self.get()
- return self._revid
-
- @latest_revision_id.setter
- def latest_revision_id(self, value: Optional[int]) -> None:
- self._revid = value
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- if hasattr(self, '_revid'):
- del self._revid
-
- def exists(self) -> bool:
- """Determine if an entity exists in the data repository."""
- if not hasattr(self, '_content'):
- try:
- self.get()
- return True
- except NoWikibaseEntityError:
- return False
- return 'missing' not in self._content
-
- def get(self, force: bool = False) -> dict:
- """
- Fetch all entity data and cache it.
-
- :param force: override caching
- :raise NoWikibaseEntityError: if this entity doesn't exist
- :return: actual data which entity holds
- """
- if force or not hasattr(self, '_content'):
- identification = self._defined_by()
- if not identification:
- raise NoWikibaseEntityError(self)
-
- try:
- data = self.repo.loadcontent(identification)
- except APIError as err:
- if err.code == 'no-such-entity':
- raise NoWikibaseEntityError(self)
- raise
- item_index, content = data.popitem()
- self.id = item_index
- self._content = content
- if 'missing' in self._content:
- raise NoWikibaseEntityError(self)
-
- self.latest_revision_id = self._content.get('lastrevid')
-
- data = {}
-
-        # This initializes all data attributes.
- for key, cls in self.DATA_ATTRIBUTES.items():
- value = cls.fromJSON(self._content.get(key, {}), self.repo)
- setattr(self, key, value)
- data[key] = value
- return data
-
- def editEntity(self, data=None, **kwargs) -> None:
- """
- Edit an entity using Wikibase wbeditentity API.
-
- :param data: Data to be saved
- :type data: dict, or None to save the current content of the entity.
- """
- if data is None:
- data = self.toJSON(diffto=getattr(self, '_content', None))
- else:
- data = self._normalizeData(data)
-
- baserevid = getattr(self, '_revid', None)
-
- updates = self.repo.editEntity(
- self, data, baserevid=baserevid, **kwargs)
-
- # the attribute may have been unset in ItemPage
- if getattr(self, 'id', '-1') == '-1':
- self.__init__(self.repo, updates['entity']['id'])
-
- # the response also contains some data under the 'entity' key
- # but it is NOT the actual content
- # see also [[d:Special:Diff/1356933963]]
- # TODO: there might be some circumstances under which
- # the content can be safely reused
- if hasattr(self, '_content'):
- del self._content
- self.latest_revision_id = updates['entity'].get('lastrevid')
-
- def concept_uri(self) -> str:
- """
- Return the full concept URI.
-
- :raise NoWikibaseEntityError: if this entity doesn't exist
- """
- entity_id = self.getID()
- if entity_id == '-1':
- raise NoWikibaseEntityError(self)
- return '{}{}'.format(self.repo.concept_base_uri, entity_id)
-
-
-class MediaInfo(WikibaseEntity):
-
- """Interface for MediaInfo entities on Commons.
-
- .. versionadded:: 6.5
- """
-
- title_pattern = r'M[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- # TODO: 'statements': ClaimCollection,
- }
-
- @property
- def file(self) -> FilePage:
- """Get the file associated with the mediainfo."""
- if not hasattr(self, '_file'):
- if self.id == '-1':
- # if the above doesn't apply, this entity is in an invalid
- # state which needs to be raised as an exception, but also
- # logged in case an exception handler is catching
- # the generic Error
- pywikibot.error('{} is in invalid state'
- .format(self.__class__.__name__))
- raise Error('{} is in invalid state'
- .format(self.__class__.__name__))
-
- page_id = self.getID(numeric=True)
- result = list(self.repo.load_pages_from_pageids([page_id]))
- if not result:
- raise Error('There is no existing page with id "{}"'
- .format(page_id))
-
- page = result.pop()
- if page.namespace() != page.site.namespaces.FILE:
- raise Error('Page with id "{}" is not a file'.format(page_id))
-
- self._file = FilePage(page)
-
- return self._file
-
- def get(self, force: bool = False) -> dict:
- """Fetch all MediaInfo entity data and cache it.
-
- :param force: override caching
- :raise NoWikibaseEntityError: if this entity doesn't exist
- :return: actual data which entity holds
- """
- if self.id == '-1':
- if force:
- if not self.file.exists():
- exc = NoPageError(self.file)
- raise NoWikibaseEntityError(self) from exc
- # get just the id for Wikibase API call
- self.id = 'M' + str(self.file.pageid)
- else:
- try:
- data = self.file.latest_revision.slots['mediainfo']['*']
- except NoPageError as exc:
- raise NoWikibaseEntityError(self) from exc
-
- self._content = jsonlib.loads(data)
- self.id = self._content['id']
-
- return super().get(force=force)
-
- def getID(self, numeric: bool = False):
- """
- Get the entity identifier.
-
- :param numeric: Strip the first letter and return an int
- """
- if self.id == '-1':
- self.get()
- return super().getID(numeric=numeric)
-
-
-class WikibasePage(BasePage, WikibaseEntity):
-
- """
-    Mixin base class for Wikibase entities which are also pages (e.g. items).
-
- There should be no need to instantiate this directly.
- """
-
- _cache_attrs = BasePage._cache_attrs + ('_content', )
-
- def __init__(self, site, title: str = '', **kwargs) -> None:
- """
- Initializer.
-
- If title is provided, either ns or entity_type must also be provided,
- and will be checked against the title parsed using the Page
- initialisation logic.
-
- :param site: Wikibase data site
- :type site: pywikibot.site.DataSite
- :param title: normalized title of the page
- :type title: str
- :keyword ns: namespace
- :type ns: Namespace instance, or int
- :keyword entity_type: Wikibase entity type
- :type entity_type: str ('item' or 'property')
-
- :raises TypeError: incorrect use of parameters
- :raises ValueError: incorrect namespace
- :raises pywikibot.exceptions.Error: title parsing problems
- :raises NotImplementedError: the entity type is not supported
- """
- if not isinstance(site, pywikibot.site.DataSite):
- raise TypeError('site must be a pywikibot.site.DataSite object')
- if title and ('ns' not in kwargs and 'entity_type' not in kwargs):
- pywikibot.debug('{}.__init__: {} title {!r} specified without '
- 'ns or entity_type'
- .format(self.__class__.__name__, site,
- title),
- layer='wikibase')
-
- self._namespace = None
-
- if 'ns' in kwargs:
- if isinstance(kwargs['ns'], Namespace):
- self._namespace = kwargs.pop('ns')
- kwargs['ns'] = self._namespace.id
- else:
- # numerical namespace given
- ns = int(kwargs['ns'])
- if site.item_namespace.id == ns:
- self._namespace = site.item_namespace
- elif site.property_namespace.id == ns:
- self._namespace = site.property_namespace
- else:
- raise ValueError('{!r}: Namespace "{}" is not valid'
- .format(site, int(ns)))
-
- if 'entity_type' in kwargs:
- entity_type = kwargs.pop('entity_type')
- try:
- entity_type_ns = site.get_namespace_for_entity_type(
- entity_type)
- except EntityTypeUnknownError:
- raise ValueError('Wikibase entity type "{}" unknown'
- .format(entity_type))
-
- if self._namespace:
- if self._namespace != entity_type_ns:
- raise ValueError('Namespace "{}" is not valid for Wikibase'
- ' entity type "{}"'
- .format(int(kwargs['ns']), entity_type))
- else:
- self._namespace = entity_type_ns
- kwargs['ns'] = self._namespace.id
-
- BasePage.__init__(self, site, title, **kwargs)
-
- # If a title was not provided,
- # avoid checks which may cause an exception.
- if not title:
- WikibaseEntity.__init__(self, site)
- return
-
- if self._namespace:
- if self._link.namespace != self._namespace.id:
- raise ValueError("'{}' is not in the namespace {}"
- .format(title, self._namespace.id))
- else:
-            # Neither ns nor entity_type was provided.
- # Use the _link to determine entity type.
- ns = self._link.namespace
- if self.site.item_namespace.id == ns:
- self._namespace = self.site.item_namespace
- elif self.site.property_namespace.id == ns:
- self._namespace = self.site.property_namespace
- else:
- raise ValueError('{!r}: Namespace "{!r}" is not valid'
- .format(self.site, ns))
-
- WikibaseEntity.__init__(
- self,
- # .site forces a parse of the Link title to determine site
- self.site,
- # Link.__init__, called from Page.__init__, has cleaned the title
- # stripping whitespace and uppercasing the first letter according
- # to the namespace case=first-letter.
- self._link.title)
-
- def namespace(self) -> int:
- """
- Return the number of the namespace of the entity.
-
- :return: Namespace id
- """
- return self._namespace.id
-
- def exists(self) -> bool:
- """Determine if an entity exists in the data repository."""
- if not hasattr(self, '_content'):
- try:
- self.get(get_redirect=True)
- return True
- except NoPageError:
- return False
- return 'missing' not in self._content
-
- def botMayEdit(self) -> bool:
- """
- Return whether bots may edit this page.
-
-        Because there is currently no system on Wikibase pages to mark a
-        page that it shouldn't be edited by bots, this method always
-        returns True. The content of the page is not text but a dict, so
-        the original approach (searching for a template) doesn't apply.
-
- :return: True
- """
- return True
-
- def get(self, force: bool = False, *args, **kwargs) -> dict:
- """
- Fetch all page data, and cache it.
-
- :param force: override caching
-        :raise NotImplementedError: if a value is given in args or kwargs
- :return: actual data which entity holds
-        :note: dicts returned by this method are references to the
-            content of this entity and modifying them may indirectly
-            cause unwanted changes to the live content
- """
- if args or kwargs:
- raise NotImplementedError(
- '{}.get does not implement var args: {!r} and {!r}'.format(
- self.__class__.__name__, args, kwargs))
-
- # todo: this variable is specific to ItemPage
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
- try:
- data = WikibaseEntity.get(self, force=force)
- except NoWikibaseEntityError:
- if lazy_loading_id:
- p = Page(self._site, self._title)
- if not p.exists():
- raise NoPageError(p)
- # todo: raise a nicer exception here (T87345)
- raise NoPageError(self)
-
- if 'pageid' in self._content:
- self._pageid = self._content['pageid']
-
- # xxx: this is ugly
- if 'claims' in data:
- self.claims.set_on_item(self)
-
- return data
-
- @property
- def latest_revision_id(self) -> int:
- """
- Get the revision identifier for the most recent revision of the entity.
-
- :rtype: int
- :raise pywikibot.exceptions.NoPageError: if the entity doesn't exist
- """
- if not hasattr(self, '_revid'):
- self.get()
- return self._revid
-
- @latest_revision_id.setter
- def latest_revision_id(self, value) -> None:
- self._revid = value
-
- @latest_revision_id.deleter
- def latest_revision_id(self) -> None:
- # fixme: this seems too destructive in comparison to the parent
- self.clear_cache()
-
- @allow_asynchronous
- def editEntity(self, data=None, **kwargs) -> None:
- """
- Edit an entity using Wikibase wbeditentity API.
-
-        This method is wrapped by:
- - editLabels
- - editDescriptions
- - editAliases
- - ItemPage.setSitelinks
-
- :param data: Data to be saved
- :type data: dict, or None to save the current content of the entity.
- :keyword asynchronous: if True, launch a separate thread to edit
- asynchronously
- :type asynchronous: bool
- :keyword callback: a callable object that will be called after the
- entity has been updated. It must take two arguments: (1) a
- WikibasePage object, and (2) an exception instance, which will be
- None if the page was saved successfully. This is intended for use
- by bots that need to keep track of which saves were successful.
- :type callback: callable
- """
- # kept for the decorator
- super().editEntity(data, **kwargs)
-
- def editLabels(self, labels, **kwargs) -> None:
- """
- Edit entity labels.
-
-        Labels should be a dict, with a language code or a site object
-        as the key and the label string as the value. Set the value
-        to '' to remove the label.
- """
- data = {'labels': labels}
- self.editEntity(data, **kwargs)
-
- def editDescriptions(self, descriptions, **kwargs) -> None:
- """
- Edit entity descriptions.
-
-        Descriptions should be a dict, with a language code or a site
-        object as the key and the description string as the value.
-        Set the value to '' to remove the description.
- """
- data = {'descriptions': descriptions}
- self.editEntity(data, **kwargs)
-
- def editAliases(self, aliases, **kwargs) -> None:
- """
- Edit entity aliases.
-
-        Aliases should be a dict, with a language code or a site
-        object as the key and a list of strings as the value.
- """
- data = {'aliases': aliases}
- self.editEntity(data, **kwargs)
-
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Set target of a redirect for a Wikibase page.
-
- Has not been implemented in the Wikibase API yet, except for ItemPage.
- """
- raise NotImplementedError
-
- @allow_asynchronous
- def addClaim(self, claim, bot: bool = True, **kwargs):
- """
- Add a claim to the entity.
-
- :param claim: The claim to add
- :type claim: pywikibot.page.Claim
- :param bot: Whether to flag as bot (if possible)
- :keyword asynchronous: if True, launch a separate thread to add claim
- asynchronously
- :type asynchronous: bool
- :keyword callback: a callable object that will be called after the
- claim has been added. It must take two arguments:
- (1) a WikibasePage object, and (2) an exception instance,
- which will be None if the entity was saved successfully. This is
- intended for use by bots that need to keep track of which saves
- were successful.
- :type callback: callable
- """
- if claim.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- self.repo.addClaim(self, claim, bot=bot, **kwargs)
- claim.on_item = self
-
- def removeClaims(self, claims, **kwargs) -> None:
- """
- Remove the claims from the entity.
-
- :param claims: list of claims to be removed
- :type claims: list or pywikibot.Claim
- """
- # this check allows single claims to be removed by pushing them into a
- # list of length one.
- if isinstance(claims, pywikibot.Claim):
- claims = [claims]
- data = self.repo.removeClaims(claims, **kwargs)
- for claim in claims:
- claim.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- claim.on_item = None
- claim.snak = None
-
-
-class ItemPage(WikibasePage):
-
- """
- Wikibase entity of type 'item'.
-
- A Wikibase item may be defined by either a 'Q' id (qid),
- or by a site & title.
-
- If an item is defined by site & title, once an item's qid has
- been looked up, the item is then defined by the qid.
- """
-
- _cache_attrs = WikibasePage._cache_attrs + (
- 'labels', 'descriptions', 'aliases', 'claims', 'sitelinks')
- entity_type = 'item'
- title_pattern = r'Q[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- 'descriptions': LanguageDict,
- 'aliases': AliasesDict,
- 'claims': ClaimCollection,
- 'sitelinks': SiteLinkCollection,
- }
-
- def __init__(self, site, title=None, ns=None) -> None:
- """
- Initializer.
-
- :param site: data repository
- :type site: pywikibot.site.DataSite
- :param title: identifier of item, "Q###",
- -1 or None for an empty item.
- :type title: str
-        :param ns: namespace
- :type ns: Namespace instance, or int, or None
- for default item_namespace
- """
- if ns is None:
- ns = site.item_namespace
- # Special case for empty item.
- if title is None or title == '-1':
- super().__init__(site, '-1', ns=ns)
- assert self.id == '-1'
- return
-
- # we don't want empty titles
- if not title:
- raise InvalidTitleError("Item's title cannot be empty")
-
- super().__init__(site, title, ns=ns)
-
- assert self.id == self._link.title
-
- def _defined_by(self, singular: bool = False) -> dict:
- """
- Internal function to provide the API parameters to identify the item.
-
- The API parameters may be 'id' if the ItemPage has one,
- or 'site'&'title' if instantiated via ItemPage.fromPage with
- lazy_load enabled.
-
- Once an item's Q## is looked up, that will be used for all future
- requests.
-
- An empty dict is returned if the ItemPage is instantiated without
- either ID (internally it has id = '-1') or site&title.
-
- :param singular: Whether the parameter names should use the
- singular form
- :return: API parameters
- """
- params = {}
- if singular:
- id = 'id'
- site = 'site'
- title = 'title'
- else:
- id = 'ids'
- site = 'sites'
- title = 'titles'
-
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
-
- # id overrides all
- if hasattr(self, 'id'):
- if self.id != '-1':
- params[id] = self.id
- elif lazy_loading_id:
- params[site] = self._site.dbName()
- params[title] = self._title
- else:
- # if none of the above applies, this item is in an invalid state
-            # which needs to be raised as an exception, but also logged in case
- # an exception handler is catching the generic Error.
- pywikibot.error('{} is in invalid state'
- .format(self.__class__.__name__))
- raise Error('{} is in invalid state'
- .format(self.__class__.__name__))
-
- return params
-
- def title(self, **kwargs):
- """
- Return ID as title of the ItemPage.
-
- If the ItemPage was lazy-loaded via ItemPage.fromPage, this method
- will fetch the Wikibase item ID for the page, potentially raising
- NoPageError with the page on the linked wiki if it does not exist, or
- does not have a corresponding Wikibase item ID.
-
- This method also refreshes the title if the id property was set.
-        e.g. item.id = 'Q60'
-
- All optional keyword parameters are passed to the superclass.
- """
- # If instantiated via ItemPage.fromPage using site and title,
- # _site and _title exist, and id does not exist.
- lazy_loading_id = not hasattr(self, 'id') and hasattr(self, '_site')
-
- if lazy_loading_id or self._link._text != self.id:
- # If the item is lazy loaded or has been modified,
- # _link._text is stale. Removing _link._title
- # forces Link to re-parse ._text into ._title.
- if hasattr(self._link, '_title'):
- del self._link._title
- self._link._text = self.getID()
- self._link.parse()
- # Remove the temporary values that are no longer needed after
- # the .getID() above has called .get(), which populated .id
- if hasattr(self, '_site'):
- del self._title
- del self._site
-
- return super().title(**kwargs)
-
- def getID(self, numeric: bool = False, force: bool = False):
- """
- Get the entity identifier.
-
- :param numeric: Strip the first letter and return an int
- :param force: Force an update of new data
- """
- if not hasattr(self, 'id') or force:
- self.get(force=force)
- return super().getID(numeric=numeric)
-
- @classmethod
- def fromPage(cls, page, lazy_load: bool = False):
- """
- Get the ItemPage for a Page that links to it.
-
- :param page: Page to look for corresponding data item
- :type page: pywikibot.page.Page
- :param lazy_load: Do not raise NoPageError if either page or
- corresponding ItemPage does not exist.
- :rtype: pywikibot.page.ItemPage
-
- :raise pywikibot.exceptions.NoPageError: There is no corresponding
- ItemPage for the page
- :raise pywikibot.exceptions.WikiBaseError: The site of the page
- has no data repository.
- """
- if hasattr(page, '_item'):
- return page._item
- if not page.site.has_data_repository:
- raise WikiBaseError('{} has no data repository'
- .format(page.site))
- if not lazy_load and not page.exists():
- raise NoPageError(page)
-
- repo = page.site.data_repository()
- if hasattr(page,
- '_pageprops') and page.properties().get('wikibase_item'):
- # If we have already fetched the pageprops for something else,
- # we already have the id, so use it
- page._item = cls(repo, page.properties().get('wikibase_item'))
- return page._item
- i = cls(repo)
-    # clear id, and temporarily store data needed for lazy loading the item
- del i.id
- i._site = page.site
- i._title = page.title(with_section=False)
- if not lazy_load and not i.exists():
- raise NoPageError(i)
- page._item = i
- return page._item
-
- @classmethod
- def from_entity_uri(cls, site, uri: str, lazy_load: bool = False):
- """
- Get the ItemPage from its entity uri.
-
- :param site: The Wikibase site for the item.
- :type site: pywikibot.site.DataSite
- :param uri: Entity uri for the Wikibase item.
- :param lazy_load: Do not raise NoPageError if ItemPage does not exist.
- :rtype: pywikibot.page.ItemPage
-
- :raise TypeError: Site is not a valid DataSite.
- :raise ValueError: Site does not match the base of the provided uri.
- :raise pywikibot.exceptions.NoPageError: Uri points to non-existent
- item.
- """
- if not isinstance(site, DataSite):
- raise TypeError('{} is not a data repository.'.format(site))
-
- base_uri, _, qid = uri.rpartition('/')
- if base_uri != site.concept_base_uri.rstrip('/'):
- raise ValueError(
- 'The supplied data repository ({repo}) does not correspond to '
- 'that of the item ({item})'.format(
- repo=site.concept_base_uri.rstrip('/'),
- item=base_uri))
-
- item = cls(site, qid)
- if not lazy_load and not item.exists():
- raise NoPageError(item)
-
- return item
-
- def get(
- self,
- force: bool = False,
- get_redirect: bool = False,
- *args,
- **kwargs
- ) -> Dict[str, Any]:
- """
- Fetch all item data, and cache it.
-
- :param force: override caching
- :param get_redirect: return the item content, do not follow the
- redirect, do not raise an exception.
- :raise NotImplementedError: a value in args or kwargs
- :return: actual data which entity holds
-        :note: dicts returned by this method are references to content of this
-            entity and modifying them may indirectly cause unwanted changes to
-            the live content
- """
- data = super().get(force, *args, **kwargs)
-
- if self.isRedirectPage() and not get_redirect:
- raise IsRedirectPageError(self)
-
- return data
-
- def getRedirectTarget(self):
- """Return the redirect target for this page."""
- target = super().getRedirectTarget()
- cmodel = target.content_model
- if cmodel != 'wikibase-item':
- raise Error('{} has redirect target {} with content model {} '
- 'instead of wikibase-item'
- .format(self, target, cmodel))
- return self.__class__(target.site, target.title(), target.namespace())
-
- def iterlinks(self, family=None):
- """
- Iterate through all the sitelinks.
-
- :param family: string/Family object which represents what family of
- links to iterate
- :type family: str|pywikibot.family.Family
- :return: iterator of pywikibot.Page objects
- :rtype: iterator
- """
- if not hasattr(self, 'sitelinks'):
- self.get()
- if family is not None and not isinstance(family, Family):
- family = Family.load(family)
- for sl in self.sitelinks.values():
- if family is None or family == sl.site.family:
- pg = pywikibot.Page(sl)
- pg._item = self
- yield pg
-
- def getSitelink(self, site, force: bool = False) -> str:
- """
- Return the title for the specific site.
-
-        If the item doesn't have a link to that site, raise NoPageError.
-
- :param site: Site to find the linked page of.
- :type site: pywikibot.Site or database name
- :param force: override caching
- """
- if force or not hasattr(self, '_content'):
- self.get(force=force)
-
- if site not in self.sitelinks:
- raise NoPageError(self)
-
- return self.sitelinks[site].canonical_title()
-
- def setSitelink(self, sitelink, **kwargs) -> None:
- """
-        Set a sitelink. Calls setSitelinks().
-
- A sitelink can be a Page object, a BaseLink object
- or a {'site':dbname,'title':title} dictionary.
- """
- self.setSitelinks([sitelink], **kwargs)
-
- def removeSitelink(self, site, **kwargs) -> None:
- """
- Remove a sitelink.
-
- A site can either be a Site object, or it can be a dbName.
- """
- self.removeSitelinks([site], **kwargs)
-
- def removeSitelinks(self, sites, **kwargs) -> None:
- """
- Remove sitelinks.
-
- Sites should be a list, with values either
- being Site objects, or dbNames.
- """
- data = []
- for site in sites:
- site = SiteLinkCollection.getdbName(site)
- data.append({'site': site, 'title': ''})
- self.setSitelinks(data, **kwargs)
-
- def setSitelinks(self, sitelinks, **kwargs) -> None:
- """
- Set sitelinks.
-
- Sitelinks should be a list. Each item in the
- list can either be a Page object, a BaseLink object, or a dict
- with a value for 'site' and 'title'.
- """
- data = {'sitelinks': sitelinks}
- self.editEntity(data, **kwargs)
-
- def mergeInto(self, item, **kwargs) -> None:
- """
- Merge the item into another item.
-
- :param item: The item to merge into
- :type item: pywikibot.page.ItemPage
- """
- data = self.repo.mergeItems(from_item=self, to_item=item, **kwargs)
- if not data.get('success', 0):
- return
- self.latest_revision_id = data['from']['lastrevid']
- item.latest_revision_id = data['to']['lastrevid']
- if data.get('redirected', 0):
- self._isredir = True
- self._redirtarget = item
-
- def set_redirect_target(
- self,
- target_page,
- create: bool = False,
- force: bool = False,
- keep_section: bool = False,
- save: bool = True,
- **kwargs
- ):
- """
- Make the item redirect to another item.
-
-        You need to supply an extra argument such as save=True to make this work.
-
- :param target_page: target of the redirect, this argument is required.
- :type target_page: pywikibot.page.ItemPage or string
-        :param force: if True, set the redirect target even if the page
-            is not a redirect.
- """
- if isinstance(target_page, str):
- target_page = pywikibot.ItemPage(self.repo, target_page)
- elif self.repo != target_page.repo:
- raise InterwikiRedirectPageError(self, target_page)
- if self.exists() and not self.isRedirectPage() and not force:
- raise IsNotRedirectPageError(self)
- if not save or keep_section or create:
- raise NotImplementedError
- data = self.repo.set_redirect_target(
- from_item=self, to_item=target_page,
- bot=kwargs.get('botflag', True))
- if data.get('success', 0):
- del self.latest_revision_id
- self._isredir = True
- self._redirtarget = target_page
-
- def isRedirectPage(self):
- """Return True if item is a redirect, False if not or not existing."""
- if hasattr(self, '_content') and not hasattr(self, '_isredir'):
- self._isredir = self.id != self._content.get('id', self.id)
- return self._isredir
- return super().isRedirectPage()
-
-
-class Property:
-
- """
- A Wikibase property.
-
- While every Wikibase property has a Page on the data repository,
- this object is for when the property is used as part of another concept
- where the property is not _the_ Page of the property.
-
- For example, a claim on an ItemPage has many property attributes, and so
-    it subclasses this Property class, but a claim does not have Page-like
-    behaviour and semantics.
- """
-
- types = {'wikibase-item': ItemPage,
- # 'wikibase-property': PropertyPage, must be declared first
- 'string': str,
- 'commonsMedia': FilePage,
- 'globe-coordinate': pywikibot.Coordinate,
- 'url': str,
- 'time': pywikibot.WbTime,
- 'quantity': pywikibot.WbQuantity,
- 'monolingualtext': pywikibot.WbMonolingualText,
- 'math': str,
- 'external-id': str,
- 'geo-shape': pywikibot.WbGeoShape,
- 'tabular-data': pywikibot.WbTabularData,
- 'musical-notation': str,
- }
-
-    # the value types, where they differ from the type
- value_types = {'wikibase-item': 'wikibase-entityid',
- 'wikibase-property': 'wikibase-entityid',
- 'commonsMedia': 'string',
- 'url': 'string',
- 'globe-coordinate': 'globecoordinate',
- 'math': 'string',
- 'external-id': 'string',
- 'geo-shape': 'string',
- 'tabular-data': 'string',
- 'musical-notation': 'string',
- }
-
- def __init__(self, site, id: str, datatype: Optional[str] = None) -> None:
- """
- Initializer.
-
- :param site: data repository
- :type site: pywikibot.site.DataSite
- :param id: id of the property
- :param datatype: datatype of the property;
- if not given, it will be queried via the API
- """
- self.repo = site
- self.id = id.upper()
- if datatype:
- self._type = datatype
-
- @property
- def type(self) -> str:
- """Return the type of this property."""
- if not hasattr(self, '_type'):
- self._type = self.repo.getPropertyType(self)
- return self._type
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this property.
-
- :param numeric: Strip the first letter and return an int
- """
- if numeric:
- return int(self.id[1:])
- return self.id
-
-
-class PropertyPage(WikibasePage, Property):
-
- """
- A Wikibase entity in the property namespace.
-
- Should be created as::
-
- PropertyPage(DataSite, 'P21')
-
- or::
-
- PropertyPage(DataSite, datatype='url')
- """
-
- _cache_attrs = WikibasePage._cache_attrs + (
- '_type', 'labels', 'descriptions', 'aliases', 'claims')
- entity_type = 'property'
- title_pattern = r'P[1-9]\d*'
- DATA_ATTRIBUTES = {
- 'labels': LanguageDict,
- 'descriptions': LanguageDict,
- 'aliases': AliasesDict,
- 'claims': ClaimCollection,
- }
-
- def __init__(self, source, title=None, datatype=None) -> None:
- """
- Initializer.
-
- :param source: data repository property is on
- :type source: pywikibot.site.DataSite
- :param title: identifier of property, like "P##",
- "-1" or None for an empty property.
- :type title: str
- :param datatype: Datatype for a new property.
- :type datatype: str
- """
- # Special case for new property.
- if title is None or title == '-1':
- if not datatype:
- raise TypeError('"datatype" is required for new property.')
- WikibasePage.__init__(self, source, '-1',
- ns=source.property_namespace)
- Property.__init__(self, source, '-1', datatype=datatype)
- assert self.id == '-1'
- else:
- if not title:
- raise InvalidTitleError(
- "Property's title cannot be empty")
-
- WikibasePage.__init__(self, source, title,
- ns=source.property_namespace)
- Property.__init__(self, source, self.id)
-
- def get(self, force: bool = False, *args, **kwargs) -> dict:
- """
- Fetch the property entity, and cache it.
-
- :param force: override caching
- :raise NotImplementedError: a value in args or kwargs
- :return: actual data which entity holds
-        :note: dicts returned by this method are references to content of this
-            entity and modifying them may indirectly cause unwanted changes to
-            the live content
- """
- if args or kwargs:
- raise NotImplementedError(
- 'PropertyPage.get only implements "force".')
-
- data = WikibasePage.get(self, force)
- if 'datatype' in self._content:
- self._type = self._content['datatype']
- data['datatype'] = self._type
- return data
-
- def newClaim(self, *args, **kwargs):
- """
- Helper function to create a new claim object for this property.
-
- :rtype: pywikibot.page.Claim
- """
- # todo: raise when self.id is -1
- return Claim(self.site, self.getID(), datatype=self.type,
- *args, **kwargs)
-
- def getID(self, numeric: bool = False):
- """
- Get the identifier of this property.
-
- :param numeric: Strip the first letter and return an int
- """
- # enforce this parent's implementation
- return WikibasePage.getID(self, numeric=numeric)
-
- def get_data_for_new_entity(self):
- """Return data required for creation of new property."""
- return {'datatype': self.type}
-
-
-# Add PropertyPage to the class attribute "types" after its declaration.
-Property.types['wikibase-property'] = PropertyPage
-
-
-class Claim(Property):
-
- """
- A Claim on a Wikibase entity.
-
- Claims are standard claims as well as references and qualifiers.
- """
-
- TARGET_CONVERTER = {
- 'wikibase-item': lambda value, site:
- ItemPage(site, 'Q' + str(value['numeric-id'])),
- 'wikibase-property': lambda value, site:
- PropertyPage(site, 'P' + str(value['numeric-id'])),
- 'commonsMedia': lambda value, site:
- FilePage(pywikibot.Site('commons', 'commons'), value), # T90492
- 'globe-coordinate': pywikibot.Coordinate.fromWikibase,
- 'geo-shape': pywikibot.WbGeoShape.fromWikibase,
- 'tabular-data': pywikibot.WbTabularData.fromWikibase,
- 'time': pywikibot.WbTime.fromWikibase,
- 'quantity': pywikibot.WbQuantity.fromWikibase,
- 'monolingualtext': lambda value, site:
- pywikibot.WbMonolingualText.fromWikibase(value)
- }
-
- SNAK_TYPES = ('value', 'somevalue', 'novalue')
-
- def __init__(
- self,
- site,
- pid,
- snak=None,
- hash=None,
- is_reference: bool = False,
- is_qualifier: bool = False,
- rank: str = 'normal',
- **kwargs
- ) -> None:
- """
- Initializer.
-
- Defined by the "snak" value, supplemented by site + pid
-
- :param site: repository the claim is on
- :type site: pywikibot.site.DataSite
- :param pid: property id, with "P" prefix
- :param snak: snak identifier for claim
- :param hash: hash identifier for references
- :param is_reference: whether specified claim is a reference
- :param is_qualifier: whether specified claim is a qualifier
- :param rank: rank for claim
- """
- Property.__init__(self, site, pid, **kwargs)
- self.snak = snak
- self.hash = hash
- self.rank = rank
- self.isReference = is_reference
- self.isQualifier = is_qualifier
- if self.isQualifier and self.isReference:
- raise ValueError('Claim cannot be both a qualifier and reference.')
- self.sources = []
- self.qualifiers = OrderedDict()
- self.target = None
- self.snaktype = 'value'
- self._on_item = None # The item it's on
-
- @property
- def on_item(self):
- """Return item this claim is attached to."""
- return self._on_item
-
- @on_item.setter
- def on_item(self, item) -> None:
- self._on_item = item
- for values in self.qualifiers.values():
- for qualifier in values:
- qualifier.on_item = item
- for source in self.sources:
- for values in source.values():
- for source in values:
- source.on_item = item
-
- def __repr__(self) -> str:
- """Return the representation string."""
- return '{cls_name}.fromJSON({}, {})'.format(
- repr(self.repo), self.toJSON(), cls_name=type(self).__name__)
-
- def __eq__(self, other):
- if not isinstance(other, self.__class__):
- return False
-
- return self.same_as(other)
-
- def __ne__(self, other):
- return not self.__eq__(other)
-
- @staticmethod
- def _claim_mapping_same(this, other) -> bool:
- if len(this) != len(other):
- return False
- my_values = list(chain.from_iterable(this.values()))
- other_values = list(chain.from_iterable(other.values()))
- if len(my_values) != len(other_values):
- return False
- for val in my_values:
- if val not in other_values:
- return False
- for val in other_values:
- if val not in my_values:
- return False
- return True
-
- def same_as(
- self,
- other,
- ignore_rank: bool = True,
- ignore_quals: bool = False,
- ignore_refs: bool = True
- ) -> bool:
- """Check if two claims are same."""
- if ignore_rank:
- attributes = ['id', 'snaktype', 'target']
- else:
- attributes = ['id', 'snaktype', 'rank', 'target']
- for attr in attributes:
- if getattr(self, attr) != getattr(other, attr):
- return False
-
- if not ignore_quals:
- if not self._claim_mapping_same(self.qualifiers, other.qualifiers):
- return False
-
- if not ignore_refs:
- if len(self.sources) != len(other.sources):
- return False
- for source in self.sources:
- same = False
- for other_source in other.sources:
- if self._claim_mapping_same(source, other_source):
- same = True
- break
- if not same:
- return False
-
- return True
-
- def copy(self):
- """
- Create an independent copy of this object.
-
- :rtype: pywikibot.page.Claim
- """
- is_qualifier = self.isQualifier
- is_reference = self.isReference
- self.isQualifier = False
- self.isReference = False
- copy = self.fromJSON(self.repo, self.toJSON())
- for cl in (self, copy):
- cl.isQualifier = is_qualifier
- cl.isReference = is_reference
- copy.hash = None
- copy.snak = None
- return copy
-
- @classmethod
- def fromJSON(cls, site, data):
- """
- Create a claim object from JSON returned in the API call.
-
- :param data: JSON containing claim data
- :type data: dict
-
- :rtype: pywikibot.page.Claim
- """
- claim = cls(site, data['mainsnak']['property'],
- datatype=data['mainsnak'].get('datatype', None))
- if 'id' in data:
- claim.snak = data['id']
- elif 'hash' in data:
- claim.hash = data['hash']
- claim.snaktype = data['mainsnak']['snaktype']
- if claim.getSnakType() == 'value':
- value = data['mainsnak']['datavalue']['value']
- # The default covers string, url types
- if claim.type in cls.types or claim.type == 'wikibase-property':
- claim.target = cls.TARGET_CONVERTER.get(
- claim.type, lambda value, site: value)(value, site)
- else:
- pywikibot.warning(
- '{} datatype is not supported yet.'.format(claim.type))
- claim.target = pywikibot.WbUnknown.fromWikibase(value)
- if 'rank' in data: # References/Qualifiers don't have ranks
- claim.rank = data['rank']
- if 'references' in data:
- for source in data['references']:
- claim.sources.append(cls.referenceFromJSON(site, source))
- if 'qualifiers' in data:
- for prop in data['qualifiers-order']:
- claim.qualifiers[prop] = [
- cls.qualifierFromJSON(site, qualifier)
- for qualifier in data['qualifiers'][prop]]
- return claim
-
- @classmethod
- def referenceFromJSON(cls, site, data) -> dict:
- """
- Create a dict of claims from reference JSON returned in the API call.
-
- Reference objects are represented a bit differently, and require
- some more handling.
- """
- source = OrderedDict()
-
- # Before #84516 Wikibase did not implement snaks-order.
- # https://gerrit.wikimedia.org/r/c/84516/
- if 'snaks-order' in data:
- prop_list = data['snaks-order']
- else:
- prop_list = data['snaks'].keys()
-
- for prop in prop_list:
- for claimsnak in data['snaks'][prop]:
- claim = cls.fromJSON(site, {'mainsnak': claimsnak,
- 'hash': data.get('hash')})
- claim.isReference = True
- if claim.getID() not in source:
- source[claim.getID()] = []
- source[claim.getID()].append(claim)
- return source
-
- @classmethod
- def qualifierFromJSON(cls, site, data):
- """
- Create a Claim for a qualifier from JSON.
-
-        Qualifier objects are represented a bit differently,
-        like references, but I'm not
-        sure if this even requires its own function.
-
- :rtype: pywikibot.page.Claim
- """
- claim = cls.fromJSON(site, {'mainsnak': data,
- 'hash': data.get('hash')})
- claim.isQualifier = True
- return claim
-
- def toJSON(self) -> dict:
- """Create dict suitable for the MediaWiki API."""
- data = {
- 'mainsnak': {
- 'snaktype': self.snaktype,
- 'property': self.getID()
- },
- 'type': 'statement'
- }
- if hasattr(self, 'snak') and self.snak is not None:
- data['id'] = self.snak
- if hasattr(self, 'rank') and self.rank is not None:
- data['rank'] = self.rank
- if self.getSnakType() == 'value':
- data['mainsnak']['datatype'] = self.type
- data['mainsnak']['datavalue'] = self._formatDataValue()
- if self.isQualifier or self.isReference:
- data = data['mainsnak']
- if hasattr(self, 'hash') and self.hash is not None:
- data['hash'] = self.hash
- else:
- if self.qualifiers:
- data['qualifiers'] = {}
- data['qualifiers-order'] = list(self.qualifiers.keys())
- for prop, qualifiers in self.qualifiers.items():
- for qualifier in qualifiers:
- assert qualifier.isQualifier is True
- data['qualifiers'][prop] = [
- qualifier.toJSON() for qualifier in qualifiers]
-
- if self.sources:
- data['references'] = []
- for collection in self.sources:
- reference = {
- 'snaks': {}, 'snaks-order': list(collection.keys())}
- for prop, val in collection.items():
- reference['snaks'][prop] = []
- for source in val:
- assert source.isReference is True
- src_data = source.toJSON()
- if 'hash' in src_data:
- reference.setdefault('hash', src_data['hash'])
- del src_data['hash']
- reference['snaks'][prop].append(src_data)
- data['references'].append(reference)
- return data
-
- def setTarget(self, value):
- """
- Set the target value in the local object.
-
- :param value: The new target value.
- :type value: object
-
- :exception ValueError: if value is not of the type
- required for the Claim type.
- """
- value_class = self.types[self.type]
- if not isinstance(value, value_class):
- raise ValueError('{} is not type {}.'
- .format(value, value_class))
- self.target = value
-
- def changeTarget(
- self,
- value=None,
- snaktype: str = 'value',
- **kwargs
- ) -> None:
- """
- Set the target value in the data repository.
-
- :param value: The new target value.
- :type value: object
- :param snaktype: The new snak type ('value', 'somevalue', or
- 'novalue').
- """
- if value:
- self.setTarget(value)
-
- data = self.repo.changeClaimTarget(self, snaktype=snaktype,
- **kwargs)
- # TODO: Re-create the entire item from JSON, not just id
- self.snak = data['claim']['id']
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
-
- def getTarget(self):
- """
- Return the target value of this Claim.
-
-        None is returned if no target is set.
-
- :return: object
- """
- return self.target
-
- def getSnakType(self) -> str:
- """
- Return the type of snak.
-
- :return: str ('value', 'somevalue' or 'novalue')
- """
- return self.snaktype
-
- def setSnakType(self, value):
- """
- Set the type of snak.
-
- :param value: Type of snak
- :type value: str ('value', 'somevalue', or 'novalue')
- """
- if value in self.SNAK_TYPES:
- self.snaktype = value
- else:
- raise ValueError(
- "snaktype must be 'value', 'somevalue', or 'novalue'.")
-
- def getRank(self):
- """Return the rank of the Claim."""
- return self.rank
-
- def setRank(self, rank) -> None:
- """Set the rank of the Claim."""
- self.rank = rank
-
- def changeRank(self, rank, **kwargs):
- """Change the rank of the Claim and save."""
- self.rank = rank
- return self.repo.save_claim(self, **kwargs)
-
- def changeSnakType(self, value=None, **kwargs) -> None:
- """
- Save the new snak value.
-
- TODO: Is this function really needed?
- """
- if value:
- self.setSnakType(value)
- self.changeTarget(snaktype=self.getSnakType(), **kwargs)
-
- def getSources(self) -> list:
- """Return a list of sources, each being a list of Claims."""
- return self.sources
-
- def addSource(self, claim, **kwargs) -> None:
- """
- Add the claim as a source.
-
- :param claim: the claim to add
- :type claim: pywikibot.Claim
- """
- self.addSources([claim], **kwargs)
-
- def addSources(self, claims, **kwargs):
- """
- Add the claims as one source.
-
- :param claims: the claims to add
- :type claims: list of pywikibot.Claim
- """
- for claim in claims:
- if claim.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- if self.on_item is not None:
- data = self.repo.editSource(self, claims, new=True, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for claim in claims:
- claim.hash = data['reference']['hash']
- claim.on_item = self.on_item
- source = defaultdict(list)
- for claim in claims:
- claim.isReference = True
- source[claim.getID()].append(claim)
- self.sources.append(source)
-
- def removeSource(self, source, **kwargs) -> None:
- """
- Remove the source. Call removeSources().
-
- :param source: the source to remove
- :type source: pywikibot.Claim
- """
- self.removeSources([source], **kwargs)
-
- def removeSources(self, sources, **kwargs) -> None:
- """
- Remove the sources.
-
- :param sources: the sources to remove
- :type sources: list of pywikibot.Claim
- """
- data = self.repo.removeSources(self, sources, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for source in sources:
- source_dict = defaultdict(list)
- source_dict[source.getID()].append(source)
- self.sources.remove(source_dict)
-
- def addQualifier(self, qualifier, **kwargs):
- """Add the given qualifier.
-
- :param qualifier: the qualifier to add
- :type qualifier: pywikibot.page.Claim
- """
- if qualifier.on_item is not None:
- raise ValueError(
- 'The provided Claim instance is already used in an entity')
- if self.on_item is not None:
- data = self.repo.editQualifier(self, qualifier, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- qualifier.on_item = self.on_item
- qualifier.isQualifier = True
- if qualifier.getID() in self.qualifiers:
- self.qualifiers[qualifier.getID()].append(qualifier)
- else:
- self.qualifiers[qualifier.getID()] = [qualifier]
-
- def removeQualifier(self, qualifier, **kwargs) -> None:
- """
- Remove the qualifier. Call removeQualifiers().
-
- :param qualifier: the qualifier to remove
- :type qualifier: pywikibot.page.Claim
- """
- self.removeQualifiers([qualifier], **kwargs)
-
- def removeQualifiers(self, qualifiers, **kwargs) -> None:
- """
- Remove the qualifiers.
-
- :param qualifiers: the qualifiers to remove
- :type qualifiers: list Claim
- """
- data = self.repo.remove_qualifiers(self, qualifiers, **kwargs)
- self.on_item.latest_revision_id = data['pageinfo']['lastrevid']
- for qualifier in qualifiers:
- self.qualifiers[qualifier.getID()].remove(qualifier)
- qualifier.on_item = None
-
- def target_equals(self, value) -> bool:
- """
-        Check whether the Claim's target is equal to the specified value.
-
- The function checks for:
-
- - WikibasePage ID equality
- - WbTime year equality
- - Coordinate equality, regarding precision
- - WbMonolingualText text equality
- - direct equality
-
- :param value: the value to compare with
- :return: true if the Claim's target is equal to the value provided,
- false otherwise
- """
- if (isinstance(self.target, WikibasePage)
- and isinstance(value, str)):
- return self.target.id == value
-
- if (isinstance(self.target, pywikibot.WbTime)
- and not isinstance(value, pywikibot.WbTime)):
- return self.target.year == int(value)
-
- if (isinstance(self.target, pywikibot.Coordinate)
- and isinstance(value, str)):
- coord_args = [float(x) for x in value.split(',')]
- if len(coord_args) >= 3:
- precision = coord_args[2]
- else:
- precision = 0.0001 # Default value (~10 m at equator)
- with suppress(TypeError):
- if self.target.precision is not None:
- precision = max(precision, self.target.precision)
-
- return (abs(self.target.lat - coord_args[0]) <= precision
- and abs(self.target.lon - coord_args[1]) <= precision)
-
- if (isinstance(self.target, pywikibot.WbMonolingualText)
- and isinstance(value, str)):
- return self.target.text == value
-
- return self.target == value
-
- def has_qualifier(self, qualifier_id: str, target) -> bool:
- """
-        Check whether the Claim contains the specified qualifier.
-
- :param qualifier_id: id of the qualifier
- :param target: qualifier target to check presence of
- :return: true if the qualifier was found, false otherwise
- """
- if self.isQualifier or self.isReference:
- raise ValueError('Qualifiers and references cannot have '
- 'qualifiers.')
-
- for qualifier in self.qualifiers.get(qualifier_id, []):
- if qualifier.target_equals(target):
- return True
- return False
-
- def _formatValue(self) -> dict:
- """
- Format the target into the proper JSON value that Wikibase wants.
-
- :return: JSON value
- """
- if self.type in ('wikibase-item', 'wikibase-property'):
- value = {'entity-type': self.getTarget().entity_type,
- 'numeric-id': self.getTarget().getID(numeric=True)}
- elif self.type in ('string', 'url', 'math', 'external-id',
- 'musical-notation'):
- value = self.getTarget()
- elif self.type == 'commonsMedia':
- value = self.getTarget().title(with_ns=False)
- elif self.type in ('globe-coordinate', 'time',
- 'quantity', 'monolingualtext',
- 'geo-shape', 'tabular-data'):
- value = self.getTarget().toWikibase()
- else: # WbUnknown
- pywikibot.warning(
- '{} datatype is not supported yet.'.format(self.type))
- value = self.getTarget().toWikibase()
- return value
-
- def _formatDataValue(self) -> dict:
- """
- Format the target into the proper JSON datavalue that Wikibase wants.
-
- :return: Wikibase API representation with type and value.
- """
- return {
- 'value': self._formatValue(),
- 'type': self.value_types.get(self.type, self.type)
- }
-
-
-class FileInfo:
-
- """
-    A structure holding imageinfo of the latest revision of a FilePage.
-
-    All keys of the API imageinfo dictionary are mapped to FileInfo
-    attributes. Attributes can be retrieved either as self['key'] or
-    as self.key.
-
-    The following attributes will be returned:
-        - timestamp, user, comment, url, size, sha1, mime, metadata
-        - archivename (not for latest revision)
-
-    See Site.loadimageinfo() for details.
-
-    Note: timestamp will be cast to pywikibot.Timestamp.
- """
-
- def __init__(self, file_revision) -> None:
- """Initiate the class using the dict from L{APISite.loadimageinfo}."""
- self.__dict__.update(file_revision)
- self.timestamp = pywikibot.Timestamp.fromISOformat(self.timestamp)
-
- def __getitem__(self, key):
- """Give access to class values by key."""
- return getattr(self, key)
-
- def __repr__(self) -> str:
- """Return a more complete string representation."""
- return repr(self.__dict__)
-
- def __eq__(self, other):
- """Test if two File_info objects are equal."""
- return self.__dict__ == other.__dict__
diff --git a/pywikibot/page/_basepage.py b/pywikibot/page/_pages.py
similarity index 100%
copy from pywikibot/page/_basepage.py
copy to pywikibot/page/_pages.py
diff --git a/pywikibot/page/_basepage.py b/pywikibot/page/_wikibase.py
similarity index 100%
rename from pywikibot/page/_basepage.py
rename to pywikibot/page/_wikibase.py
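
The moved code above is Pywikibot's Wikibase entity API: WikibasePage with
its ItemPage and PropertyPage subclasses, plus Property and Claim. A minimal
usage sketch, assuming a configured account on Wikidata; the ids Q42, P31
and Q5 below are illustrative only::

    import pywikibot

    site = pywikibot.Site('wikidata', 'wikidata')
    repo = site.data_repository()

    # ItemPage.get() fetches and caches the entity content; the dicts
    # it returns are references to that content, so modifying them may
    # indirectly change the live data (see the :note: in get() above).
    item = pywikibot.ItemPage(repo, 'Q42')
    data = item.get()

    # editLabels/editDescriptions/editAliases are thin wrappers around
    # editEntity() with the matching data dict.
    item.editLabels({'en': 'Douglas Adams'}, summary='set English label')

    # A Claim is built from a property id plus a target and attached
    # with addClaim(); a claim already used in an entity is rejected.
    claim = pywikibot.Claim(repo, 'P31')
    claim.setTarget(pywikibot.ItemPage(repo, 'Q5'))
    item.addClaim(claim)

ItemPage.fromPage(page, lazy_load=True) is the alternative constructor shown
above: it defers the qid lookup, and _defined_by() then identifies the item
by site and title until the id is known.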
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/772012
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I09959ee747a39aafa3b7cb5084e69df42619c674
Gerrit-Change-Number: 772012
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771901 )
Change subject: [IMPR] Improvements for using site.proofread_levels
......................................................................
[IMPR] Improvements for using site.proofread_levels
- use list(site.proofread_levels) instead of site.proofread_levels.keys()
when printing, so that a list is shown instead of a dict_keys view
- use list(site.proofread_levels) instead of list(site.proofread_levels.keys()),
which gives the same result
Change-Id: Ic5f33615dfe96333fb629e28c25bf2f17a6ef20c
---
M pywikibot/proofreadpage.py
1 file changed, 6 insertions(+), 6 deletions(-)
Approvals:
Mpaa: Looks good to me, approved
Xqt: Verified; Looks good to me, approved
diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 4c03176..6ea2ad1 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -209,10 +209,10 @@
raise ValueError('Page {} must belong to {} namespace'
.format(self.title(), site.proofread_page_ns))
# Ensure that constants are in line with Extension values.
- if list(self.site.proofread_levels.keys()) != self.PROOFREAD_LEVELS:
+ level_list = list(self.site.proofread_levels)
+ if level_list != self.PROOFREAD_LEVELS:
raise ValueError('QLs do not match site values: {} != {}'
- .format(self.site.proofread_levels.keys(),
- self.PROOFREAD_LEVELS))
+ .format(level_list, self.PROOFREAD_LEVELS))
self._base, self._base_ext, self._num = self._parse_title()
self._multi_page = self._base_ext in self._MULTI_PAGE_EXT
@@ -350,7 +350,7 @@
def ql(self, value: int) -> None:
if value not in self.site.proofread_levels:
raise ValueError('Not valid QL value: {} (legal values: {})'
- .format(value, self.site.proofread_levels))
+ .format(value, list(self.site.proofread_levels)))
# TODO: add logic to validate ql value change, considering
# site.proofread_levels.
self._full_header.ql = value
@@ -375,7 +375,7 @@
except KeyError:
pywikibot.warning('Not valid status set for {}: quality level = {}'
.format(self.title(as_link=True), self.ql))
- return None
+ return None
def without_text(self) -> None:
"""Set Page QL to "Without text"."""
@@ -1024,7 +1024,7 @@
# All but 'Without Text'
if filter_ql is None:
- filter_ql = list(self.site.proofread_levels.keys())
+ filter_ql = list(self.site.proofread_levels)
filter_ql.remove(ProofreadPage.WITHOUT_TEXT)
gen = (self.get_page(i) for i in range(start, end + 1))
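
site.proofread_levels is a mapping from quality-level numbers to their
labels, and iterating a mapping yields its keys, so list(site.proofread_levels)
equals list(site.proofread_levels.keys()) while printing as a plain list
rather than a dict_keys view. A small sketch; the level numbers shown are
the usual ProofreadPage defaults and are assumed here::

    import pywikibot

    site = pywikibot.Site('en', 'wikisource')
    levels = list(site.proofread_levels)    # e.g. [0, 1, 2, 3, 4]
    print('legal values: {}'.format(levels))
    # Without list(), dict_keys([0, 1, 2, 3, 4]) would be printed.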
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771901
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ic5f33615dfe96333fb629e28c25bf2f17a6ef20c
Gerrit-Change-Number: 771901
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/769728 )
Change subject: [IMPR] use pg.XMLDumpPageGenerator in replace.py
......................................................................
[IMPR] use pg.XMLDumpPageGenerator in replace.py
isTitleExcepted() and isTextExcepted() are already implemented in ReplaceRobot.
It is not necessary to filter pages from the xml dump twice.
Therefore deprecate XmlDumpReplacePageGenerator in favour of
pagegenerators.XMLDumpPageGenerator
Bug: T85334
Change-Id: I30a4ecfecbd449a2357f69aa3a629a7d8e34dd05
---
M scripts/replace.py
1 file changed, 22 insertions(+), 29 deletions(-)
Approvals:
Mpaa: Looks good to me, approved
Xqt: Verified; Looks good to me, approved
diff --git a/scripts/replace.py b/scripts/replace.py
index 7e4b2da..2410c19 100755
--- a/scripts/replace.py
+++ b/scripts/replace.py
@@ -146,14 +146,14 @@
import re
from collections.abc import Sequence
from contextlib import suppress
-from typing import Optional
+from typing import Any, Optional
import pywikibot
from pywikibot import editor, fixes, i18n, pagegenerators, textlib
from pywikibot.backports import Dict, Generator, List, Pattern, Tuple
from pywikibot.bot import ExistingPageBot, SingleSiteBot
from pywikibot.exceptions import InvalidPageError, NoPageError
-from pywikibot.tools import chars
+from pywikibot.tools import chars, deprecated
# This is required for the text that is shown when you run this script
@@ -382,6 +382,7 @@
return _get_text_exceptions(self.fix_set.exceptions or {})
+@deprecated('pagegenerators.XMLDumpPageGenerator', since='7.1.0')
class XmlDumpReplacePageGenerator:
"""
@@ -389,26 +390,23 @@
These pages will be retrieved from a local XML dump file.
+ .. deprecated:: 7.1
+
:param xmlFilename: The dump's path, either absolute or relative
- :type xmlFilename: str
:param xmlStart: Skip all articles in the dump before this one
- :type xmlStart: str
:param replacements: A list of 2-tuples of original text (as a
compiled regular expression) and replacement text (as a string).
- :type replacements: list of 2-tuples
:param exceptions: A dictionary which defines when to ignore an
occurrence. See docu of the ReplaceRobot initializer below.
:type exceptions: dict
"""
- def __init__(
- self,
- xmlFilename,
- xmlStart,
- replacements,
- exceptions,
- site
- ) -> None:
+ def __init__(self,
+ xmlFilename: str,
+ xmlStart: str,
+ replacements: List[Tuple[Any, str]],
+ exceptions: Dict[str, Any],
+ site) -> None:
"""Initializer."""
self.xmlFilename = xmlFilename
self.replacements = replacements
@@ -488,7 +486,6 @@
:param replacements: a list of Replacement instances or sequences of
length 2 with the original text (as a compiled regular expression)
and replacement text (as a string).
- :type replacements: list
:param exceptions: a dictionary which defines when not to change an
occurrence. This dictionary can have these keys:
@@ -508,17 +505,16 @@
dictionary in textlib._create_default_regexes() or must be
accepted by textlib._get_regexes().
- :type exceptions: dict
- :param allowoverlap: when matches overlap, all of them are replaced.
+ :keyword allowoverlap: when matches overlap, all of them are replaced.
:type allowoverlap: bool
- :param recursive: Recurse replacement as long as possible.
+ :keyword recursive: Recurse replacement as long as possible.
:type recursive: bool
:warning: Be careful, this might lead to an infinite loop.
- :param addcat: category to be added to every page touched
+ :keyword addcat: category to be added to every page touched
:type addcat: pywikibot.Category or str or None
- :param sleep: slow down between processing multiple regexes
+ :keyword sleep: slow down between processing multiple regexes
:type sleep: int
- :param summary: Set the summary message text bypassing the default
+ :keyword summary: Set the summary message text bypassing the default
:type summary: str
:keyword always: the user won't be prompted before changes are made
:type always: bool
@@ -528,13 +524,10 @@
about the missing site
"""
- def __init__(
- self,
- generator,
- replacements,
- exceptions=None,
- **kwargs
- ) -> None:
+ def __init__(self, generator,
+ replacements: List[Tuple[Any, str]],
+ exceptions: Optional[Dict[str, Any]] = None,
+ **kwargs) -> None:
"""Initializer."""
self.available_options.update({
'addcat': None,
@@ -1086,8 +1079,8 @@
precompile_exceptions(exceptions, regex, flags)
if xmlFilename:
- gen = XmlDumpReplacePageGenerator(xmlFilename, xmlStart,
- replacements, exceptions, site)
+        gen = pagegenerators.XMLDumpPageGenerator(
+ xmlFilename, xmlStart, namespaces=genFactory.namespaces, site=site)
elif sql_query is not None:
# Only -excepttext option is considered by the query. Other
# exceptions are taken into account by the ReplaceRobot
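
With this change a dump-driven run builds its generator directly from
pagegenerators and leaves title/text exception filtering to ReplaceRobot.
A sketch mirroring the call above; the dump path, start title and
namespace list are placeholders::

    import pywikibot
    from pywikibot import pagegenerators

    site = pywikibot.Site()
    gen = pagegenerators.XMLDumpPageGenerator(
        'dump.xml.bz2',    # placeholder path to a local XML dump
        'Berlin',          # skip all articles in the dump before this one
        namespaces=[0],    # assumed filter: main namespace only
        site=site)
    for page in gen:
        ...                # each page is then checked by ReplaceRobot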
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/769728
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I30a4ecfecbd449a2357f69aa3a629a7d8e34dd05
Gerrit-Change-Number: 769728
Gerrit-PatchSet: 3
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771986 )
Change subject: [IMPR] Outsource Category to its own file (part 2)
......................................................................
[IMPR] Outsource Category to its own file (part 2)
Merge branch 'category1' into category3
Change-Id: I68925222c543c9c0703c44c7753144ba4977a869
---
R pywikibot/page/_basepage.py
2 files changed, 0 insertions(+), 0 deletions(-)
Approvals:
Xqt: Verified; Looks good to me, approved
diff --git a/pywikibot/page/basepage.py b/pywikibot/page/_basepage.py
similarity index 100%
rename from pywikibot/page/basepage.py
rename to pywikibot/page/_basepage.py
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/771986
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I68925222c543c9c0703c44c7753144ba4977a869
Gerrit-Change-Number: 771986
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged