On Monday 12 March 2007 00:59, Oldak Quill wrote:
A link to this project was posted on the Semantic MediaWiki developers
mailing list.
In the project's own words: "The dbpedia.org project approaches both
problems by extracting structured information from Wikipedia and by
making this information available on the Semantic Web. dbpedia.org
allows you to ask sophisticated queries against Wikipedia and to link
other datasets on the Web to dbpedia data."
The website has a demonstration of the kind of semantic information
that can be extracted from the main infobox on English Wikipedia's
Innsbruck article.
"The dbpedia dataset currently consists of around 25 million RDF
triples, which have been extracted from the English, German, French,
Spanish, Italian, Portuguese, Polish, Swedish, Dutch and Japanese
version of Wikipedia."
An example of reuse of Wikipedia data in a very useful way.
Yes, we are currently using this data for setting up large wikis for testing
Semantic MediaWiki. The point SMW is arguing for is that automatic extraction
of data is extremely useful, but not satisfactory as a sole basis for query
answering. The reason is that it is quite hard to do proper extraction from
MW templates (since they often contain text and mixed-type data values). If
you look at Dbpedia's data for a while, you can find many smaller bugs or
omissions that would require manual fixing -- it is just very "noisy".
Various people in Natural Language Processing are looking at the problem of
extracting semantic data from Wikipedia, but many such methods are tailored
to a single topic area and are computationally expensive.
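To illustrate why template values are hard to extract cleanly, here is a
minimal sketch in Python. It is not Dbpedia's extraction code; the infobox
text, the field names and the helpers parse_infobox and as_number are all
made up for illustration. It only shows how a naive "key = value" parser
ends up with values that mix numbers, units and free text.

```python
import re

# Hypothetical infobox snippet, invented for illustration
# (not copied from any real Wikipedia article).
INFOBOX = """{{Infobox City
| name       = Exampletown
| districts  = 9
| population = 117,346 (2006 estimate)
| area       = 104.91 km²
| elevation  = about 574 m
}}"""


def parse_infobox(wikitext):
    """Split the '| key = value' lines of a template into raw strings."""
    fields = {}
    for line in wikitext.splitlines():
        match = re.match(r"\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def as_number(raw):
    """Return a float only if the whole value is a plain number, else None."""
    try:
        return float(raw.replace(",", ""))
    except ValueError:
        return None


fields = parse_infobox(INFOBOX)
typed = {key: as_number(value) for key, value in fields.items()}

for key, value in fields.items():
    status = "ok" if typed[key] is not None else "needs manual review"
    print(f"{key}: {value!r} -> {typed[key]} ({status})")
```

In this sketch only the "districts" value parses as a number. The population,
area and elevation values carry notes, units or qualifiers, so a simple
extractor must either guess or leave them for manual fixing.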
This is why SMW argues for manual annotation. You can use Dbpedia data to
bootstrap a semantic wiki, and then do queries right within the wiki, just
like the ones on the Dbpedia demo pages. But you can also directly fix or
extend the semantic data, without requiring computationally expensive
processing to update your data. Since SMW also exports all semantic data, it
can then again be fed into external query answering tools that are currently
used with Dbpedia. That's (part of) the Semantic Web idea of cross-site
interoperability ...
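The bootstrapping step can be sketched roughly as follows. This is only an
illustration under assumptions: the triples are invented, the helpers
annotations_by_page and apply_fix are hypothetical, and the
[[property::value]] annotation form is assumed here (the exact syntax
depends on the SMW version). It is not how our internal test wiki is
actually populated.

```python
from collections import defaultdict

# Hypothetical extracted triples (page, property, value),
# invented for illustration.
TRIPLES = [
    ("Exampletown", "population", "117346"),
    ("Exampletown", "located in", "Exampleland"),
    ("Otherville", "population", "5200"),
]


def annotations_by_page(triples):
    """Group triples by page and render each one as an inline annotation.

    The [[property::value]] form is an assumption about the wiki syntax.
    """
    pages = defaultdict(list)
    for page, prop, value in triples:
        pages[page].append(f"[[{prop}::{value}]]")
    return dict(pages)


def apply_fix(triples, page, prop, new_value):
    """Return a copy of the triples with one value corrected by hand."""
    return [
        (p, pr, new_value if (p, pr) == (page, prop) else v)
        for p, pr, v in triples
    ]


# Bootstrap from the extracted data, then correct one value manually.
fixed = apply_fix(TRIPLES, "Otherville", "population", "5300")
for page, annotations in annotations_by_page(fixed).items():
    print(page, " ".join(annotations))
```

The point of the sketch is that a manual correction is a cheap local edit to
one annotation, with no need to rerun extraction over the whole dataset.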
Maybe we will at some stage publish the semantic demo Wikipedia we currently
run internally, but for now we need to keep it private for performance testing.
-- Markus
--
Markus Krötzsch
Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe
mak(a)aifb.uni-karlsruhe.de phone +49 (0)721 608 7362
www.aifb.uni-karlsruhe.de/WBS/ fax +49 (0)721 693 717