Hey everyone,
Lydia is focusing on some Outreach tasks at the moment so I have
volunteered to make this announcement. The development team are currently
planning to enable Phase 2 for all language editions of Wikiquote on June
10th. For those who don't know, Phase 2 enables data access from
Wikiquote to Wikidata and vice versa. Lydia also wants to thank all users
who helped make the Phase 1 deployment a successful launch on all Wikiquote
language editions.
Thanks, John Lewis
David,
I am not familiar with Wiktionary and its data model, but from your summary
SKOS [1] looks like a good fit, also for your proposal to
extend the Wikidata data model. In short, SKOS distinguishes between
concepts (they carry the semantics, roughly a Q item) and labels (they are,
well, just labels). Concepts and labels are connected via a handful of
properties, e.g. skos:prefLabel or skos:altLabel. In ordinary SKOS,
labels are simple strings, but in SKOS-XL (also part of the spec) they
are objects, and thus can have properties and relations to other labels
or to anything else.
Furthermore, SKOS is extensible: it is based on RDF, so one can
define subclasses of skos:Concept and skosxl:Label, and one can define
subproperties of skos:prefLabel and skos:altLabel with particular
semantics, which might be relevant for Wikidata. This way, some
label-like Wikidata properties could be defined as subproperties of,
say, skos:altLabel so that they show up in pick lists etc.
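
The subproperty idea can be sketched in a few lines of plain Python, without
an RDF library. Everything under the ex: prefix (ex:nickname, ex:shortName,
ex:city, ex:population) is made up for illustration and is not a real
Wikidata or SKOS identifier; the helpers subproperties_of and labels_of are
hypothetical as well. The sketch collects every value whose property is
skos:altLabel or is declared, directly or transitively, as a subproperty of it.

```python
# Toy triple store: (subject, predicate, object). All ex:* names are hypothetical.
TRIPLES = [
    ("ex:nickname", "rdfs:subPropertyOf", "skos:altLabel"),
    ("ex:shortName", "rdfs:subPropertyOf", "ex:nickname"),
    ("ex:city", "skos:prefLabel", "New York City"),
    ("ex:city", "skos:altLabel", "NYC"),
    ("ex:city", "ex:nickname", "The Big Apple"),
    ("ex:city", "ex:shortName", "New York"),
    ("ex:city", "ex:population", "8400000"),
]


def subproperties_of(prop, triples):
    """Return prop plus every property declared (transitively) as its subproperty."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subPropertyOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def labels_of(concept, triples, base="skos:altLabel"):
    """Collect the values of base and of all its subproperties for one concept."""
    props = subproperties_of(base, triples)
    return sorted(o for s, p, o in triples if s == concept and p in props)


alt_labels = labels_of("ex:city", TRIPLES)
print(alt_labels)
```

A pick list built this way would offer the nickname and the short name next to
the plain alternative label, while unrelated properties such as ex:population
stay out.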
just my 2 cents,
michael
[1] The spec: http://www.w3.org/TR/skos-reference/
The primer: http://www.w3.org/TR/2009/NOTE-skos-primer-20090818
On 06.06.2014 14:00, wikidata-l-request(a)lists.wikimedia.org wrote:
> Message: 3
> Date: Thu, 5 Jun 2014 16:28:30 +0200
> From: David Cuenca<dacuetu(a)gmail.com>
> To: "Discussion list for the Wikidata project."
> <wikidata-l(a)lists.wikimedia.org>
> Subject: [Wikidata-l] What is the point of labels?
> Message-ID:
> <CAJBSGSoO60AsQbUFkmefqvpE_miwFYxO2vs8jSeq0p0D82JChg(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
--
Dr. Michael Erdmann | erdmann(a)diqa-pm.com | +49 151 6140 1790
Since the very beginning I have kept myself busy with properties, thinking
about which ones fit, which ones are missing to better describe reality, and
how to integrate them with the ones that we have. The thing is that the more
I work with them, the less difference I see from normal items... and if
statements are soon allowed on property pages, the difference will
blur even more.
I can understand that from the software development point of view it might
make sense to have a clear difference, or that it might help the community
to get a deeper understanding of the underlying concepts represented by words.
But semantically I see no difference between:
cement (Q45190) <emissivity (P1295)> 0.54
and
cement (Q45190) <emissivity (Q899670)> 0.54
Am I missing something here? Are properties really needed or are we adding
unnecessary artificial constraints?
Cheers,
Micru
When I drafted the functional structure that is appearing on items [1],
Gerard pointed out that it is drifting into the lexical area. That made me
think that, while it is useful to have lexical data as an independent item, as we
discussed last year for Wiktionary, the current structure "q item <label>
string" doesn't seem to be compatible with that wish, or at least it would
be more difficult to maintain the same label twice. And it is not just one
label per item, there are many, and each one might have different lexical
properties.
For more efficiency, it seems that we would need statements like "q item
<label> lexical item" to reflect that separation, but that adds further
complexity, because according to the latest Wikidata:Wiktionary proposal
[2], the "lexical item" (W) also contains senses/meanings (S). This is
recurrent, as we already have Q items as the basis for meaning... or at
least a concept that is more or less shared among languages. The only
difference between "Q items" and the proposed "S items" is that S items
represent only one of the lexeme meanings for one particular language, but
other than that they have the same nature as Q items (it should be possible
to add "subclass of" and other statements to them).
Labels, aliases, and name properties are just normal statements where one
of them is preferred, so I have been wondering why we don't treat them as
such... That way we could have some coherence, and have both "Q items" and
"S items" as the units of meaning/sense, and later on move the labels
(lexemes), which are now strings, to the lexical items ("W items" in the
example on the page Wikidata:Wiktionary).
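
As a rough sketch of that idea (plain Python, not the actual Wikibase data
model): labels and aliases become statements of a single label-like kind,
each with a language and a rank, and the displayed label is simply the
preferred statement. The class LabelStatement, the rank names, and the item
ID Q_example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelStatement:
    item: str      # the Q item the label is attached to
    text: str      # a plain string today; later it could point to a W item
    language: str
    rank: str      # "preferred" or "normal" (assumed rank names)


STATEMENTS = [
    LabelStatement("Q_example", "piano", "en", "preferred"),
    LabelStatement("Q_example", "pianoforte", "en", "normal"),
    LabelStatement("Q_example", "Klavier", "de", "preferred"),
]


def display_label(item, language, statements):
    """Return the preferred label text for an item in one language, or None."""
    for st in statements:
        if (st.item, st.language, st.rank) == (item, language, "preferred"):
            return st.text
    return None


def aliases(item, language, statements):
    """Return every non-preferred label text for an item in one language."""
    return [
        st.text
        for st in statements
        if st.item == item and st.language == language and st.rank != "preferred"
    ]


print(display_label("Q_example", "en", STATEMENTS))
print(aliases("Q_example", "en", STATEMENTS))
```

With this shape, "label" and "alias" stop being separate fields: the only
thing that distinguishes them is the rank of the statement.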
Summing up, labels in their current form make complete sense now, but when
considered together with lexical information, it seems that it would be
convenient to treat all of them as statements that later on could link with
"W items". And as Joe pointed out, there are many more properties that are
equivalent to a label, just more specific, and that currently show up
neither in the suggester nor at the top of the page, where they should.
I know that Wiktionary is still in the future and that there are many other
priorities on the way, however since the representation of the items is
being re-considered, I think it is a good moment to think about how to move
little by little in the right direction. I also would like to point out
that by keeping lexical information in Wikidata, its complexity is going to
increase inevitably. If new users are already struggling to understand it now,
I cannot imagine how they will cope with added elements...
Micru
[1] http://lists.wikimedia.org/pipermail/wikidata-l/2014-June/003941.html
[2] https://www.wikidata.org/wiki/Wikidata:Wiktionary
Continuing last week's discussion about the nature of properties,
I carry on with my personal crusade to foster a better understanding of
Wikidata (which sometimes means asking difficult questions :)). This time I
ask about items, or concepts for that matter.
To start with I cherry-pick a very insightful question posed by Markus last
week, that unfortunately I left unanswered:
"The main question is "Did the reference say that pianos are instruments?"
but not "Did the reference say pianos are instruments because of the
definition of 'piano'?" Therefore, we don't need to put this information in
our labels."
To my mind that is a problem that, like the chicken and the egg, can be
settled with just one word: emergence. There is no such thing as a piano or a
concept of a piano. But both of them, concept and object, co-evolved over
time, and now we recognize certain objects as "pianos". Timeline:
https://en.wikipedia.org/wiki/Piano#History
There have been so many innovations upon innovations, versions, and even
name changes, that what we call now "piano" is very different from what it
was long ago. The same can be said about other concepts like "country of
citizenship", which is not a valid concept when talking about historical
people.
When we are creating an item we are capturing a moment of time of the past,
according to a source in a different past. Eventually this item might
change its label, change its meaning, or become obsolete. So when I look in
Wikidata for:
- a way to reflect label changes over time: yes, that will be possible with
the mono-lingual datatype + qualifiers, creating a property "label"
- a way to reflect that the concept is obsolete: perhaps it could be
reflected with start/end date
- a way to indicate a different item with a related meaning: it can be done
with properties
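
The first two points can be sketched as a label statement carrying start and
end qualifiers. This is only an illustration in plain Python: the class
DatedLabel, the helper label_in_year, and the example labels and years are
placeholders, not sourced data and not the Wikibase API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatedLabel:
    text: str
    language: str
    start_year: Optional[int] = None  # qualifier: label in use from this year
    end_year: Optional[int] = None    # qualifier: last year the label was in use


# Illustrative data only; names and years are placeholders.
LABELS = [
    DatedLabel("old name", "en", start_year=1700, end_year=1799),
    DatedLabel("new name", "en", start_year=1800),
]


def label_in_year(labels, language, year):
    """Return the label text valid in the given year, or None if none applies."""
    for lab in labels:
        if lab.language != language:
            continue
        started = lab.start_year is None or lab.start_year <= year
        not_ended = lab.end_year is None or year <= lab.end_year
        if started and not_ended:
            return lab.text
    return None


print(label_in_year(LABELS, "en", 1750))
print(label_in_year(LABELS, "en", 1900))
```

An obsolete concept would look the same from the outside: every one of its
labels has an end qualifier, so a lookup for a later year returns nothing.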
This information is not about the item itself, but we treat it like any
other statement.
In my opinion these kinds of statements are different (like labels or
descriptions), since they don't refer to the represented entity, but to the
container that represents the entity. Like the walls of a bubble.
I can imagine that there will be some confusion between labels that can
accept qualifiers and others that can't, aliases that can be edited in one
language but not in another, and all of this not being grouped with the other
statements that belong to the same metadata group.
So I candidly ask: does it make sense to treat item metadata statements
just as any other statement? Would it bring more confusion or less?
Cheers,
Micru