On Mon, 4 Feb 2019 15:36:08 +0100,
Kévin Bois <kevin.bois(a)biblissima-condorcet.fr> wrote:
Hello,
I'm trying to write a pywikibot script which reads and creates items /
properties on my Wikibase instance. Following pieces of tutorials and
script examples, I managed to write something that works.
1/ The idea is to read a CSV file and create an item, with its
properties, for each line. So I have to loop over thousands of lines
and create an item plus multiple associated claims, and that takes
quite some time (at least 1 hour to create 1000 items). I guess it's
because for each line I create a new entity and new claims, which
means multiple requests per line. Some pseudo-code I use in my
script: to create a new item, I use repo.editEntity({}, {},
summary='new item'), assuming repo = site.data_repository(); to
create a new claim, I use self.user_add_claim_unless_exists(item,
claim), assuming my bot inherits from WikidataBot.
Is there a better way to optimize that kind of bulk import?
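One way to cut the per-line request count is to build the full entity
payload, claims included, and pass it to a single editEntity() call
instead of adding claims one by one. A minimal sketch of that payload,
assuming a string-datatype property P123 (hypothetical) and the
wbeditentity JSON layout; check both against your own instance:

```python
# Sketch: create an item together with all its statements in one
# editEntity() call. P123 and the snak layout below are assumptions
# based on the wbeditentity JSON format; verify them on your Wikibase.

def claim_json(prop_id, value):
    """JSON for one string-valued statement (assumes the property's
    datatype is 'string')."""
    return {
        'mainsnak': {
            'snaktype': 'value',
            'property': prop_id,
            'datavalue': {'value': value, 'type': 'string'},
        },
        'type': 'statement',
        'rank': 'normal',
    }

def item_payload(row):
    """Turn one CSV row (a dict) into the `data` argument for
    repo.editEntity({}, data, summary='new item')."""
    return {
        'labels': {'en': {'language': 'en', 'value': row['label']}},
        'claims': [claim_json('P123', row['custom_id'])],
    }

# for row in csv.DictReader(open('items.csv')):
#     repo.editEntity({}, item_payload(row), summary='new item')
```

This turns N+1 write requests per line into one, at the cost of
building the statement JSON yourself.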
--
2/ I kind of have the same problem if I want to check whether an item
already exists, because first I need to get all existing items and
check whether they are in my CSV or not. (The CSV does not contain
QIDs, but it does contain a "custom" ID I've created and added as a
property to each item.)
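If the instance exposes a SPARQL endpoint (an assumption), the whole
custom-ID → QID mapping can be fetched once up front, so the per-row
existence check becomes a dictionary lookup instead of an API call. A
sketch, where the prop/direct IRI is a placeholder for your property:

```python
# Sketch: fetch every (item, custom id) pair once, then check each
# CSV row against a local dict. The IRI below is hypothetical; use
# your instance's prop/direct IRI for the custom-ID property.

def build_query(prop_iri):
    """SPARQL selecting every item that has a custom-ID value."""
    return 'SELECT ?item ?id WHERE { ?item <%s> ?id . }' % prop_iri

def index_results(rows):
    """Turn query rows like {'item': '...entity/Q42', 'id': 'ABC-1'}
    into a {custom_id: QID} dict."""
    return {r['id']: r['item'].rsplit('/', 1)[-1] for r in rows}

# existing = index_results(run_sparql(build_query(
#     'http://my.wikibase.example/prop/direct/P123')))
# if row['custom_id'] in existing: skip or update instead of create
```

run_sparql stands in for however you query the endpoint (e.g.
pywikibot.data.sparql or plain requests).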
--
I hope I was clear enough; any relevant example, idea, or advice
would be much appreciated. Bear in mind I'm a beginner with the whole
ecosystem, so I'm open to any recommendation. Thanks!
_______________________________________________ pywikibot mailing list
pywikibot(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot
I do not know if this message will be delivered. I hope so.
About the first question, I think you can split the workload among
different Python threads.
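A minimal sketch of that idea with concurrent.futures; note that
pywikibot's built-in throttle and the server's rate limits may cap the
real speed-up, so it is worth measuring first. create_item here is a
placeholder for the real per-row work:

```python
from concurrent.futures import ThreadPoolExecutor

def create_item(row):
    # Placeholder for the real per-row work, e.g.
    # repo.editEntity({}, payload_for(row), summary='new item')
    return row['custom_id']

rows = [{'custom_id': 'ID%d' % i} for i in range(8)]

# Each worker thread handles rows concurrently; map() keeps the
# results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    created = list(pool.map(create_item, rows))
```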
About the second: could you generate the QID with an injective
function of your ID? Then you would just run the function on your ID
and check whether the corresponding QID exists.
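For what it's worth, this only works if the QIDs were assigned in a
known, fixed order relative to the custom IDs, since Wikibase
auto-increments them rather than letting you choose them. A toy sketch
under that strong assumption (Q_OFFSET and the ordering are
hypothetical and must be verified against the actual import):

```python
# Toy sketch of an injective custom-ID -> QID mapping. It assumes
# items were created exactly in the order of ordered_ids, starting
# right after item Q<Q_OFFSET> -- a strong assumption to verify.
Q_OFFSET = 0  # highest QID that existed before the import began

def qid_for(custom_id, ordered_ids):
    """Position-based mapping from custom ID to expected QID."""
    return 'Q%d' % (Q_OFFSET + 1 + ordered_ids.index(custom_id))
```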
Pellegrino