My two cents:
Please let us collect the arguments for and against using the IRI data value
*structure* here (not for being able to *identify* whether a string is an IRI
or a string).
I see no advantage in using the IriValue internally, as we have no use case for
accessing parts of the URL individually.
I see some slight disadvantages (a little extra code to be written for plain
text display and validation of IriValues).
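
To illustrate the point about not needing a structured value: if some consumer
ever does need a component of the URL, it can be derived on demand from the
plain string. A minimal Python sketch, using only the standard library; the
function name `url_parts` and the example URL are made up for illustration and
are not part of Wikibase:

```python
from urllib.parse import urlsplit


def url_parts(stored_value: str):
    """Split a URL that is stored as a plain string into its components.

    Hypothetical helper: the value stays a simple string in storage and is
    only parsed at the point where a component is actually needed.
    """
    return urlsplit(stored_value)


parts = url_parts("http://example.org/path?x=1")
scheme, host, path, query = parts.scheme, parts.netloc, parts.path, parts.query
```

Display and validation can then work on the string directly, and only code
that needs the host or path pays for the parsing.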
* should, in the external JSON structure, for every snak the data value type be
listed (as it currently is)? I.e. should it state "string" instead of "Commons
media filename"?
I think it's useless and even misleading to have that info in the external
format. It should never have been there.
However, now that it is there, it may be prudent to keep it. The biggest problem
with it is that people mistake/misuse it for identifying the type of the snak
value - which is *not* what this represents.
* should, in the external JSON structure, for every snak the data type of the
property used be listed? This would then say URL, and this would solve all the
use cases mentioned by Markus, which rely on *identifying* this distinction,
not on the actual IRI data structure.
Yes, I think that would be very helpful for various reasons.
I would even go so far as to say that Snaks should also have this information
internally (in PHP and in the internal JSON format).
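
As a rough illustration of the difference, here is a sketch in Python of how a
snak might look with both pieces of information. The key names (`datatype`,
`datavalue`, `type`), the property IDs and the datatype identifiers are
assumptions for the sake of the example, not the actual external format:

```python
# Two hypothetical snaks. Both carry the same data *value* type ("string"),
# but the data type of the property used differs.
url_snak = {
    "property": "P1",  # made-up property ID
    "datatype": "url",  # assumed identifier for the property's data type
    "datavalue": {"type": "string", "value": "http://example.org/"},
}
media_snak = {
    "property": "P2",  # made-up property ID
    "datatype": "commonsMedia",  # assumed identifier
    "datavalue": {"type": "string", "value": "Example.jpg"},
}


def is_url_snak(snak: dict) -> bool:
    """Identify URL snaks by the property's data type, not the value type."""
    return snak.get("datatype") == "url"


# The value type alone cannot tell the two snaks apart.
same_value_type = url_snak["datavalue"]["type"] == media_snak["datavalue"]["type"]
```

A consumer that only needs to *identify* URLs can check the property data type
and never has to look at the structure of the value.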
We need to have an export of the whole Wikidata knowledge base in the external
JSON format, sooner rather than later, and hopefully also in RDF.
The lack of these dumps should not influence our decision right now, imho :)
Oh yes. It's really bad that people are relying on the internal format found in
the XML dumps.
-- daniel