Hi Imre,
we can encode these rules using the JSON MongoDB database we created in
the GlobalFactSync project
(https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
as the basis for the GFS Data Browser. The database has open read access.
Is there a list of geodata issues somewhere? Can you give some examples?
GFS focuses on both overall quality measures and very domain-specific
adaptations. We will also try to flag these issues for Wikipedians.
So I see that there is some notion of what is good and what is not,
depending on the source. Do you have a reference dataset as well, or
would that be NaturalEarth itself? What would help you to measure
completeness when adding concordances to NaturalEarth?
-- Sebastian
On 24.08.19 21:26, Imre Samu wrote:
For geodata (human settlements/rivers/mountains/...) (with GPS
coordinates) my simple rules:
- if it has a "local Wikipedia page" or any big
  lang ["EN/FR/PT/ES/RU/.."] Wikipedia page .. then it is OK.
- if it is only in "cebuano" AND outside of "cebuano BBOX" -> then
  .... this is lower quality
- only {shwiki+srwiki} AND outside of "sh" & "sr" BBOX -> this is lower
  quality
- only {huwiki} AND outside of CentralEuropeBBOX -> this is lower quality
- geodata without GPS coordinate -> ...
- ....
So my rules are based on Wikipedia pages and language areas ... and I
prefer Wikidata items with local Wikipedia pages.
This is based on my experience adding Wikidata ID concordances to
NaturalEarth (https://www.naturalearthdata.com/blog/)
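The rules quoted above could be sketched roughly as follows. This is only an illustration and not the GFS or NaturalEarth implementation. The function name `rate`, the labels it returns, and above all the bounding-box numbers are assumptions made up for the example. A "local" page is approximated here as a page on a wiki whose home bounding box contains the coordinate. The outcome for items without coordinates is left open in the original list, so the sketch only labels them.

```python
from typing import Optional, Set, Tuple

# Big-language wikis named in the rules (the original list is open-ended).
BIG_WIKIS = {"enwiki", "frwiki", "ptwiki", "eswiki", "ruwiki"}

# ASSUMPTION: rough placeholder boxes (min_lon, min_lat, max_lon, max_lat).
# They are NOT authoritative language areas; replace them with real ones.
HOME_BBOX = {
    "cebwiki": (116.0, 4.0, 127.0, 21.0),
    "shwiki": (13.0, 40.0, 24.0, 47.0),
    "srwiki": (13.0, 40.0, 24.0, 47.0),
    "huwiki": (9.0, 42.0, 27.0, 52.0),
}


def in_bbox(lon: float, lat: float, bbox: Tuple[float, float, float, float]) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def rate(sitelinks: Set[str], coord: Optional[Tuple[float, float]]) -> str:
    """Rate one item from its Wikipedia sitelinks and its (lon, lat) coordinate."""
    # Checked first because every other rule needs a coordinate.
    if coord is None:
        return "no_coordinates"  # the source leaves the verdict open ("-> ...")
    lon, lat = coord
    # A page in any big-language Wikipedia counts as OK.
    if sitelinks & BIG_WIKIS:
        return "ok"
    # A page on a wiki whose home box contains the point counts as "local".
    if any(w in HOME_BBOX and in_bbox(lon, lat, HOME_BBOX[w]) for w in sitelinks):
        return "ok"
    # Only wikis with a known home box, and the point is outside all of them.
    if sitelinks and all(w in HOME_BBOX for w in sitelinks):
        return "lower_quality"
    # Anything else is not covered by the rules listed so far.
    return "unrated"
```

For example, `rate({"cebwiki"}, (2.35, 48.85))` gives `"lower_quality"` with these placeholder boxes, while `rate({"cebwiki"}, (123.9, 10.3))` gives `"ok"`. Note that the last "lower quality" branch generalizes the three listed cases to any combination of wikis that have a home box, which goes slightly beyond what the list states.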
--
All the best,
Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT)
Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects:
http://dbpedia.org,
http://nlp2rdf.org,
http://linguistics.okfn.org,
https://www.w3.org/community/ld4lt
Homepage:
http://aksw.org/SebastianHellmann
Research Group:
http://aksw.org