Dear Community Members,
We are pleased to announce that registration for Wiki Indaba 2021
<https://meta.wikimedia.org/wiki/WikiIndaba_conference_2021> is now open.
This year’s Wiki Indaba will be hosted by the Wikimedia Community User
Group Uganda and held virtually from 5 to 7 November 2021 under the
theme “Rethink + Reset: Visions of the Future”.
Register via this link: http://bit.ly/wikiindaba2021
The call for submissions is open until September 24th, 2021. Please
submit your session proposal as a presentation, panel discussion,
lightning talk, or workshop, simply stating the title of your session
and providing a brief description, via
https://pretalx.com/wiki-indaba-2021/cfp.
Please reach out to wikiuganda(a)gmail.com if you have any questions.
Regards,
Geoffrey Kateregga, on behalf of the Wiki Indaba 2021 Local Organizing
Committee.
Dear Cristina,
You are likely to find more researchers and people who regularly work with
our metadata on the research mailing list.
Send Wiki-research-l mailing list submissions to
wiki-research-l(a)lists.wikimedia.org
To subscribe or unsubscribe, please visit
https://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.wik…
Regards
WSC
>
> 2. Re: Accessing wikipedia metadata (Gava, Cristina)
>
>
> On Thu, 16 Sept 2021 at 14:04, Gava, Cristina via Wikimedia-l <
> wikimedia-l(a)lists.wikimedia.org> wrote:
>
> > Hello everyone,
> >
> >
> >
> > It is my first time interacting in this mailing list, so I will be happy
> > to receive further feedbacks on how to better interact with the
> community :)
> >
> >
> >
> > I am trying to access Wikipedia meta data in a streaming and
> time/resource
> > sustainable manner. By meta data I mean many of the voices that can be
> > found in the statistics of a wiki article, such as edits, editors list,
> > page views etc.
> >
> > I would like to do such for an online classifier type of structure:
> > retrieve the data from a big number of wiki pages every tot time and use
> it
> > as input for predictions.
> >
> >
> >
> > I tried to use the Wiki API, however it is time and resource expensive,
> > both for me and Wikipedia.
> >
> >
> >
> > My preferred choice now would be to query the specific tables in the
> > Wikipedia database, in the same way this is done through the Quarry tool.
> > The problem with Quarry is that I would like to build a standalone
> script,
> > without having to depend on a user interface like Quarry. Do you think
> that
> > this is possible? I am still fairly new to all of this and I don’t know
> > exactly which is the best direction.
> >
> > I saw [1] <https://meta.wikimedia.org/wiki/Research:Data> that I could
> > access wiki replicas both through Toolforge and PAWS, however I didn’t
> > understand which one would serve me better, could I ask you for some
> > feedback?
> >
> >
> >
> > Also, as far as I understood [2]
> > <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake>, directly
> > accessing the DB through Hive is too technical for what I need, right?
> > Especially because it seems that I would need an account with production
> > shell access and I honestly don’t think that I would be granted access to
> > it. Also, I am not interested in accessing sensible and private data.
> >
> >
> >
> > Last resource is parsing analytics dumps, however this seems less organic
> > in the way of retrieving and polishing the data. As also, it would be
> > strongly decentralised and physical-machine dependent, unless I upload
> the
> > polished data online every time.
> >
> >
> >
> > Sorry for this long message, but I thought it was better to give you a
> > clearer picture (hoping this is clear enough). If you could give me even
> > some hint it would be highly appreciated.
> >
> >
> >
> > Best,
> >
> > Cristina
> >
> >
> >
> > [1] https://meta.wikimedia.org/wiki/Research:Data
> >
> > [2] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake
> > _______________________________________________
> > Wikimedia-l mailing list -- wikimedia-l(a)lists.wikimedia.org, guidelines
> > at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> > https://meta.wikimedia.org/wiki/Wikimedia-l
> > Public archives at
> >
> https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org…
> > To unsubscribe send an email to wikimedia-l-leave(a)lists.wikimedia.org
>
Hello everyone,
This is my first time posting to this mailing list, so I will be happy to receive feedback on how to better interact with the community :)
I am trying to access Wikipedia metadata in a streaming and time/resource-sustainable manner. By metadata I mean many of the items that can be found in the statistics of a wiki article, such as edits, the list of editors, page views, etc.
I would like to do this for an online-classifier type of structure: retrieve the data from a large number of wiki pages at regular intervals and use it as input for predictions.
I tried using the Wikipedia API, but it is time- and resource-expensive, both for me and for Wikipedia.
My preferred choice now would be to query the specific tables in the Wikipedia database, in the same way this is done through the Quarry tool. The problem with Quarry is that I would like to build a standalone script, without depending on a user interface like Quarry. Do you think this is possible? I am still fairly new to all of this and I don't know exactly which direction is best.
I saw [1]<https://meta.wikimedia.org/wiki/Research:Data> that I could access the wiki replicas both through Toolforge and PAWS, but I didn't understand which one would serve me better; could I ask you for some feedback?
Also, as far as I understood [2]<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake>, directly accessing the database through Hive is too technical for what I need, right? Especially because it seems I would need an account with production shell access, and I honestly don't think I would be granted it. Also, I am not interested in accessing sensitive or private data.
The last resort is parsing the analytics dumps, but this seems a less organic way of retrieving and cleaning the data. It would also be strongly decentralised and dependent on a physical machine, unless I upload the cleaned data every time.
Sorry for the long message, but I thought it was better to give you a clearer picture (hoping this is clear enough). Even a few hints would be highly appreciated.
Best,
Cristina
[1] https://meta.wikimedia.org/wiki/Research:Data
[2] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake
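One lightweight way to retrieve the page-view part of the metadata described above, without replica access or heavy action-API polling, is the public Wikimedia Analytics REST API. A minimal sketch in Python, using only the standard library (the helper names `pageviews_url` and `fetch_daily_views` are illustrative, not part of any Wikimedia tool):

```python
import json
import urllib.request

REST_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews"

def pageviews_url(project, article, start, end):
    """Build the per-article daily pageviews endpoint URL.

    project is e.g. "en.wikipedia"; start/end are YYYYMMDD strings;
    article uses underscores for spaces (e.g. "Albert_Einstein").
    """
    return (f"{REST_BASE}/per-article/{project}/all-access/all-agents/"
            f"{article}/daily/{start}/{end}")

def fetch_daily_views(project, article, start, end):
    """Fetch daily view counts and return a list of (date, views) tuples."""
    req = urllib.request.Request(
        pageviews_url(project, article, start, end),
        # The Analytics API asks clients to send a descriptive User-Agent.
        headers={"User-Agent": "metadata-sketch/0.1 (research use)"},
    )
    with urllib.request.urlopen(req) as resp:
        items = json.load(resp)["items"]
    return [(item["timestamp"][:8], item["views"]) for item in items]

# Example (requires network access):
# fetch_daily_views("en.wikipedia", "Albert_Einstein", "20210901", "20210907")
```

Edit counts and editor lists are not served by this endpoint; for those, the action API's revisions query or the wiki replicas remain the options discussed above.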
Hello all,
The September Wikimedia Research Showcase will be on September 15 at 16:30
UTC (9:30am PT / 12:30pm ET / 18:30 CEST). The theme will be "socialization
on Wikipedia", with speakers Rosta Farzan and J. Nathan Matias.
Livestream: https://www.youtube.com/watch?v=YVqabVvLIZU
Talk 1
Speaker: Rosta Farzan (School of Computing and Information, University of
Pittsburgh)
Title: Unlocking the Wikipedia clubhouse to newcomers: results from two
studies
Abstract: It is no news to any of us that the success of online production
communities such as Wikipedia relies heavily on a continuous stream of
newcomers to replace the inevitably high turnover and to bring on board new
sources of ideas and labor. However, these communities have been
struggling to attract newcomers, especially from a diverse population
of users, and to retain them. In this talk, I will present two
different approaches to engaging new editors on Wikipedia: (1)
newcomers joining through the Wiki Ed program, an online program in which
college students edit Wikipedia articles as class assignments; and (2)
newcomers joining through a Wikipedia Art+Feminism edit-a-thon. I will show
how each approach incorporated techniques for engaging newcomers and how
each succeeded in attracting and retaining them.
More information:
- Bring on Board New Enthusiasts! A Case Study of Impact of Wikipedia
Art + Feminism Edit-A-Thon Events on Newcomers
<https://link.springer.com/chapter/10.1007/978-3-319-47880-7_2>, SocInfo
2016 (pdf
<http://saviaga.com/wp-content/uploads/2016/06/socinfo_ediathons.pdf>)
- Successful Online Socialization: Lessons from the Wikipedia Education
Program <https://dl.acm.org/doi/abs/10.1145/3392857>, CSCW 2020 (pdf
<https://www.cc.gatech.edu/~dyang888/docs/cscw_li_2020_wiki.pdf>)
Talk 2
Speaker: J. Nathan Matias <http://natematias.com/> (Citizens and Technology
Lab <http://citizensandtech.org/>, Cornell University Departments of
Communication and Information Science)
Title: The Effect of Receiving Appreciation on Wikipedias: A Community
Co-Designed Field Experiment
Abstract: Can saying “thank you” make online communities stronger & more
inclusive? Or does thanking others for their voluntary efforts have little
effect? To answer this question, the Citizens and Technology Lab (CAT Lab)
organized 344 volunteers to send thanks to Wikipedia contributors across
the Arabic, German, Polish, and Persian languages. We then observed the
behavior of 15,558 newcomers and experienced contributors to Wikipedia. On
average, we found that organizing volunteers to thank others increases
two-week retention of newcomers and experienced accounts. It also caused
people to send more thanks to others. This study was a field experiment, a
randomized trial that sent thanks to some people and not to others. These
experiments can help answer questions about the impact of community
practices and platform design. But they can sometimes face community
mistrust, especially when researchers conduct them without community
consent. In this talk, I will share CAT Lab's approach to community-led
research and discuss open questions about best practices.
More information:
- Volunteers Thanked Thousands of Wikipedia Editors to Learn the Effects
of Receiving Thanks
<https://citizensandtech.org/2020/06/effects-of-saying-thanks-on-wikipedia/>,
blogpost (in EN, DE, AR, PL, FA) <https://osf.io/ueq5f/>
- The Diffusion and Influence of Gratitude Expressions in Large-Scale
Cooperation: A Field Experiment in Four Knowledge Networks
<https://osf.io/ueq5f/>, paper preprint
More information: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
--
Janna Layton (she/her)
Administrative Associate - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
Hi all,
Of course I am not from WMCUG, but they issued a statement regarding the
series of serious office actions:
https://qiuwen.wmcug.org.cn/archives/390/on-wmf-office-action-zh-1/
Here is an archive.is link, in case they take the original down:
https://archive.is/EE6AD
It will be interesting to see whether they translate this into English.
Regards,
William
Dear Wikimedians,
Below you can find the report of the Wikimedia Community User Group
Albania for 2020: https://meta.wikimedia.org/wiki/Wikimedia_Community_User_Group_Albania…
If you have any questions or suggestions, please let us know.
Best regards,
Nafie
On behalf of Wikimedia Community User Group Albania members
_______________________________________________
Please note: all replies sent to this mailing list will be immediately directed to Wikimedia-l, the public mailing list of the Wikimedia community. For more information about Wikimedia-l:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
_______________________________________________
WikimediaAnnounce-l mailing list -- wikimediaannounce-l(a)lists.wikimedia.org
To unsubscribe send an email to wikimediaannounce-l-leave(a)lists.wikimedia.org