Hello,
Could you explain why the non-standard ShEx has been chosen rather than
the W3C Recommendation SHACL?
I would assume that if one has several options for bringing a
functionality to something that largely promotes interoperability (like
Wikidata), the default choice should be a standard, and /only if/ one
has a carefully crafted argument for rejecting it would one opt for
something else.
For those who may not know, the W3C RDF Data Shapes Working Group worked
between 2014 and 2017 on defining a standard for describing data shapes
in RDF. ShEx existed already and was a candidate for standardisation.
Eventually, a different language became the standard: the Shapes
Constraint Language (SHACL).
Disclaimer: I did not contribute to either SHACL or ShEx, and I do not
know them enough to judge which one is better.
Best,
--AZ
On 19/05/2019 15:32, Léa Lacroix wrote:
Hello all,
After several months of development and testing together with the
WikiProject ShEx
<https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx>, Shape
Expressions are about to be enabled on Wikidata.
*First of all, what are Shape Expressions?*
ShEx (Q29377880) <https://www.wikidata.org/wiki/Q29377880> is a concise,
formal modeling and validation language for RDF structures. Shape
Expressions can be used to define shapes within the RDF graph. In the
case of Wikidata, this would be sets of properties, qualifiers and
references that describe the domain being modeled.
See also:
* a short video about ShEx
<https://www.youtube.com/watch?v=AR75KhEoRKg> made by community
members during the Wikimedia hackathon 2019
* introduction to ShEx <http://shex.io/shex-primer/>
* more details about the language <http://shex.io/shex-semantics/>
*What can it be used for?*
On Wikidata, the main goal of Shape Expressions is to describe the
basic structure an item should have. For example, for a human,
we probably want to have a date of birth, a place of birth, and many
other important statements. But we would also like to make sure that if
a statement with the property “children” exists, the value(s) of this
property should be humans as well. Schemas will describe in detail what
is expected in the structure of items, statements and values of these
statements.
Once Schemas are created for various types of items, it is possible to
test some existing items against the Schema, and highlight possible
errors or lack of information. Subsets of the Wikidata graph can be
tested to see whether or not they conform to a specific shape through
the use of validation tools. Schemas will therefore be very useful in
helping editors improve data quality. We imagine this to be
especially useful for wiki projects to more easily discuss and ensure
the modeling of items in their domain. In the spirit of Wikidata not
restricting the world, Shape Expressions are a tool to highlight, not
prevent, errors.
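
To make the idea concrete, here is a toy Python sketch of "check an item
against a shape and report, rather than block, problems". It is only an
illustration under stated assumptions: real Schemas are written in ShExC
and checked by a ShEx validator, not by code like this. The property and
class IDs (P569 date of birth, P19 place of birth, P40 child, P31 instance
of, Q5 human) are real Wikidata IDs; the dictionary layout, the function
name check_item and the item IDs such as Q_parent are made up for the
example.

```python
# Toy illustration only: a hand-rolled check, not ShEx and not EntitySchema.

HUMAN = "Q5"  # Wikidata class "human"

# Hypothetical, simplified stand-in for a shape describing a human item.
HUMAN_SHAPE = {
    # Properties we expect to find: date of birth, place of birth.
    "required": ["P569", "P19"],
    # Values of "child" (P40) should themselves be instances of human.
    "value_must_be_instance_of": {"P40": HUMAN},
}


def check_item(item_id, items, shape):
    """Return a list of issues for one item; nothing is blocked or edited."""
    issues = []
    statements = items[item_id]
    for prop in shape["required"]:
        if not statements.get(prop):
            issues.append(f"{item_id}: missing {prop}")
    for prop, expected_class in shape["value_must_be_instance_of"].items():
        for value in statements.get(prop, []):
            classes = items.get(value, {}).get("P31", [])
            if expected_class not in classes:
                issues.append(
                    f"{item_id}: {prop} value {value} "
                    f"is not an instance of {expected_class}"
                )
    return issues


# Made-up items, keyed by made-up IDs, mapping property -> list of values.
ITEMS = {
    "Q_parent": {
        "P31": ["Q5"],
        "P569": ["1950-01-01"],
        "P19": ["Q_city"],
        "P40": ["Q_child", "Q_boat"],
    },
    "Q_child": {"P31": ["Q5"], "P569": ["1980-05-05"]},
    "Q_boat": {"P31": ["Q_ship"]},
}

for issue in check_item("Q_parent", ITEMS, HUMAN_SHAPE):
    print(issue)
```

The function only returns a list of findings, which mirrors the point
above: the check highlights possible errors or missing information and
leaves the decision to the editors.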
On top of this, one could imagine other uses of Schemas in the future,
for example a tool that, when a new item is created, suggests the basic
structure for this item and helps add statements or values, a bit like
the existing tool Cradle
<https://tools.wmflabs.org/wikidata-todo/cradle/#/>, which is currently
not based on ShEx.
*What is going to change on Wikidata?*
* A new extension will be added to Wikidata: EntitySchema
<https://www.mediawiki.org/wiki/Extension:EntitySchema>, defining
the Schema namespace and its behavior as well as special pages
related to it.
* A new entity type, EntitySchema, will be enabled to store Shape
Expressions. Schemas will be identified with the letter E.
* The Schemas will have multilingual labels, descriptions and aliases
(quite similar to the termbox on Items), along with the schema text,
which is written in a syntax called ShEx Compact Syntax (ShExC)
<http://shex.io/shex-semantics/#shexc>. You can see an example here
<https://wikidata-shex.wmflabs.org/wiki/EntitySchema:E2>.
* The external tool shex-simple
<https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html?schemaURL=https%3A%2F%2Fwikidata-shex.wmflabs.org%2Fwiki%2FSpecial%3AEntitySchemaText%2FE2>
is directly linked from the Schema pages in order to check entities
of your choice against the schema.
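
The shex-simple link above carries the address of the schema text as a
percent-encoded schemaURL query parameter. As a small sketch of how such
a link could be assembled for another Schema, assuming the URL pattern of
the E2 example holds for other IDs (the helper name shex_simple_link is
made up, and both base URLs are simply copied from the example link):

```python
from urllib.parse import quote

# Both base URLs are taken from the example link for E2 above.
SHEX_SIMPLE = (
    "https://tools.wmflabs.org/shex-simple/wikidata/packages/"
    "shex-webapp/doc/shex-simple.html"
)
SCHEMA_TEXT_BASE = (
    "https://wikidata-shex.wmflabs.org/wiki/Special:EntitySchemaText/"
)


def shex_simple_link(schema_id):
    """Build a shex-simple link for a Schema ID such as "E2" (assumed pattern)."""
    schema_url = SCHEMA_TEXT_BASE + schema_id
    # safe="" also encodes ":" and "/", as in the example link.
    return SHEX_SIMPLE + "?schemaURL=" + quote(schema_url, safe="")


print(shex_simple_link("E2"))
```

For "E2" this reproduces the link given in the list above; whether other
Schema IDs follow the same pattern is an assumption.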
*When is this happening?*
Schemas will be enabled on test.wikidata.org <http://test.wikidata.org>
on May 21st and on wikidata.org <http://wikidata.org> on May 28th. After
this release, they will be integrated into regular maintenance, just
like the rest of Wikidata’s features.
*How can you help?*
* Before the release, you can try to edit or create Shape Expressions
on our test system <https://wikidata-shex.wmflabs.org/wiki/Main_Page>
* If you find an issue or think of a feature you’d like to have, feel
free to create a new task on Phabricator with the tag |shape-expressions|
* Once Schemas are enabled, you can discuss them in your favorite
WikiProjects: for example, what types of items would you like to model?
* You can also get more information about how to create a Schema
<https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx/How_to_get_started%3F>
*See also:*
* Main Phabricator board
<https://phabricator.wikimedia.org/tag/shape_expressions/>
* Technical documentation of the extension
<https://meta.wikimedia.org/wiki/Extension:EntitySchema>
* To enhance the interface, you can use this user script
<https://www.wikidata.org/wiki/User:Zvpunry/EntitySchemaHighlighter.js>
to highlight items and properties in the schema code and turn the
IDs into links
If you have any questions, feel free to reach out to me. Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de <http://www.wikimedia.de>
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.