SJ:
> With a brief discussion about preserving privacy in aggregate data,
> randomizing test and control samples, and a tweak to allow web forms
> on pages that are aware of your wikipedia userid, we could have a
> simple projects-wide survey completed within a month. Let's make this
> a priority and make such a thing happen -- then figure out how to
> optimize future iterations.
>
> The latest discussions on meta are here:
> http://meta.wikimedia.org/wiki/General_User_Survey
>
SJ, great to hear you welcome the survey. After Wikimania 2005 the project
fell idle because I had too many other WM obligations and a winter that was
not so good health-wise.
Wikimania 2006 gave the project new élan, and now someone else will code it.
Status:
Technical design has started but needs some more work:
I'll make a mockup input script for the form generator.
http://meta.wikimedia.org/wiki/General_User_Survey/Implementation
Programming will start in a reasonable time frame, see Kevin's earlier post.
I'm not so sure this will take only one month :(
Major issues:
1 Authentication
Best after single login is active, in a few weeks' time?
2 Anonymisation of results
May need some more thinking; this is a sensitive matter.
We had a heated debate about this in Frankfurt; we'll probably get into
this further when we have a proof of concept and more people show up to
give feedback.
3 Translation issues
A MediaWiki-wide survey needs to be held in many languages to reduce bias
where opinions are asked.
4 Results should be script-processable, i.e. no free-format feedback.
Thus all answers should be on a numeric scale or predefined
(e.g. country codes instead of country names in all esoteric languages
that no script can handle).
Because of 3, a survey form needs to be built dynamically:
no English/German/Japanese/etc. texts intermingled with PHP script.
That would be a maintenance nightmare.
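To make the idea concrete, here is a minimal sketch (in Python, since the actual PHP implementation does not exist yet) of the separation described above: question definitions and answer codes are language-neutral, and per-language label tables live in their own data structure, so no English/German/Japanese text is mixed into the script. All names here (QUESTIONS, LABELS, the specific codes) are invented for illustration, not the real survey schema.

```python
# Hypothetical sketch: language-neutral question definitions plus
# separate translation tables, so answers stay numerically coded.

# Each question has a stable id, a type, and numeric answer codes only.
QUESTIONS = [
    {"id": "q_country", "type": "choice", "codes": [1, 2, 3]},
    {"id": "q_satisfaction", "type": "scale", "codes": list(range(1, 6))},
]

# Per-language labels, keyed by question id (and "id.code" for options).
# Adding a language means adding a table here, not touching any code.
LABELS = {
    "en": {"q_country": "Country", "q_country.1": "Germany",
           "q_country.2": "Japan", "q_country.3": "Netherlands",
           "q_satisfaction": "How satisfied are you? (1-5)"},
    "de": {"q_country": "Land", "q_country.1": "Deutschland",
           "q_country.2": "Japan", "q_country.3": "Niederlande",
           "q_satisfaction": "Wie zufrieden sind Sie? (1-5)"},
}

def render_question(q, lang):
    """Return the label and the coded options for one question in one language."""
    labels = LABELS[lang]
    options = [(code, labels.get(f"{q['id']}.{code}", str(code)))
               for code in q["codes"]]
    return labels[q["id"]], options

# Answers are recorded as numeric codes only, so the results remain
# script-processable regardless of which language the form was shown in.
label, options = render_question(QUESTIONS[0], "de")
```

The point of the design is that translators only ever touch the label tables, and the analysis scripts only ever see the numeric codes.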
Depending on how much time the programmer can spend on the project, we could
probably show an alpha version about 4 weeks after he starts.
Then start a major discussion on the final questions (this will work better
when people see an alpha version to play with),
and finally freeze the questions and invite translators.
I would be happy if we did a major survey in November/December.
Please don't ask for quick hacks. I know all this sounds like an invitation
for some self-proclaimed code magician to make something barely functional
in a weekend, pronounce the job done and then leave the 'dirty details'
(usually 80% of what needs to be done) for others to clean up. I'd rather
see to it that the first version is usable and a good platform for future
reuse and extension.
Erik Zachte :)
Dear Ladies and Gentlemen,
Good day. I am Mike Tian-Jian Jiang (
http://www.linkedin.com/in/barabbas ), a team member of Wikimania 2007.
I'm currently trying to arrange Hacking Days; here is a rough plan
that needs your suggestions.
We would like to invite experts like you to give talks and to
encourage hackers to get involved in MediaWiki/Wikimedia
development.
Please let me know if you have any advice on these outlines. For
academic professionals, we will also hold a
conference for oral and poster paper presentations.
Also, please forward this mail to anyone who may be interested.
Thank you very much!
Sincerely,
Mike
Hacking Days Agenda plan:
* Wikimania 2006:
o http://wikimania2006.wikimedia.org/wiki/Hacking_Days
(Schedule MindMap is missing...)
o http://wikimania2006.wikimedia.org/wiki/Hacking_Days_Extras
* Wikimania 2005
o http://meta.wikimedia.org/wiki/Wikimania_2005_hacking_days or
o http://meta.wikimedia.org/wiki/Wikimania_2005:Hacking_Days
* MediaWiki API Introduction: following the Web 2.0 trend...
o Query API
o http://meta.wikimedia.org/wiki/API
o Python Bot Framework
* MediaWiki API application contest: a contest like Google's and
Yahoo!'s, to try to attract hackers.
o Wikipedia Gadget
o Wikipedia Yahoo! Widget
* MediaWiki enhancement
o Wikiwyg
+ Ingy's AJAX version
+ Flash version
o Collaboration editing/versioning
o Site searching (Is there any plan to make Lucene available
for other languages besides English?)
o Improving Simplified-Traditional Chinese conversion
o Audio/Video processing/streaming
o Community: from "Talk" pages to forum/BBS with a more
flexible reputation system.
o Spam/Captcha
* MediaWiki system administration
o Large-scale data processing; please refer to
+ http://radar.oreilly.com/tag/database
+ http://labs.google.com/papers/bigtable.html
o Load balancing
o "The great wall" problem
* Wikipedia content applications
o (Crosslingual) search: especially for
translation/transliteration of named entities from Wikimedia
content.
o (Crosslingual) question-answering: CLEF 2006 already has a
pilot task WiQA.
Hey All,
I want to introduce myself - I'm Rut Jesus (vulpeto) - Portuguese but
studying in Copenhagen. I am starting a PhD under the title 'Cooperation in
Emergent Cognition of Socio-Technological Networks'. It's quite
interdisciplinary and it can (still) go in many different directions.
I've studied Physics and Philosophy before - and now I'm again at their
crossroads: at the Center for Philosophy of Nature and Science Studies
(humanities-ish), which is at the Niels Bohr Institute (physics-ish).
For now I am mostly interested in being an observer - certainly asking
questions, perhaps conducting interviews, engaging in discussions, etc. -
but later I would also like to collaborate, especially in bringing some of
the discussions/good ideas/practices to Wikipedias smaller than the English
one (probably Esperanto, perhaps Portuguese, perhaps Danish).
A virtual hug.
Best,
rut
Hello everybody !
I'm a graduate student at the Institute of Political Studies, Lyon, France,
and I'm writing my MA thesis on the philosophical and political grounds of
Wikipedia. The subject sounds broad, and it surely is; I'll try to make it
more precise in the future. I hope this mailing list will be a source of
help and discovery for all of us. Lastly, please excuse my English; I'll
try to improve it! ;)
All best,
Sylvain
Dear colleagues,
We would like to announce a new research paper that uses Wikipedia
for computing semantic relatedness of natural language texts.
Evgeniy Gabrilovich and Shaul Markovitch (2007).
''Computing Semantic Relatedness using Wikipedia-based Explicit Semantic
Analysis''.
Proceedings of The 20th International Joint Conference on Artificial
Intelligence (IJCAI),
Hyderabad, India, January 2007
http://www.cs.technion.ac.il/~gabr/papers/ijcai-2007-sim.pdf
ABSTRACT
Computing semantic relatedness of natural language texts requires
access to vast amounts of common-sense and domain-specific world
knowledge. We propose Explicit Semantic Analysis (ESA), a novel
method that represents the meaning of texts in a high-dimensional
space of concepts derived from Wikipedia. We use machine learning
techniques to explicitly represent the meaning of any text as a
weighted vector of Wikipedia-based concepts. Assessing the
relatedness of texts in this space amounts to comparing the
corresponding vectors using conventional metrics (e.g., cosine).
Compared with the previous state of the art, using ESA results in
substantial improvements in correlation of computed relatedness
scores with human judgments: from r=0.56 to 0.75 for individual
words and from r=0.60 to 0.72 for texts. Importantly, due to the use
of natural concepts, the ESA model is easy to explain to human users.
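The comparison step the abstract describes can be sketched in a few lines: each text becomes a weighted vector over Wikipedia concepts, and relatedness is the cosine of the angle between two such vectors. The concept names and weights below are invented for illustration; how ESA actually derives the weights is described in the paper itself, not here.

```python
# Illustrative sketch of comparing two texts represented as sparse
# weighted vectors of Wikipedia-based concepts, using cosine similarity
# (one of the "conventional metrics" the abstract mentions).
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Made-up concept weights for two short texts; real ESA vectors would
# be built by the paper's machine-learning pipeline over Wikipedia.
text_a = {"Bank": 0.8, "Finance": 0.5, "Loan": 0.3}
text_b = {"Bank": 0.6, "Finance": 0.7, "Interest": 0.2}
similarity = cosine(text_a, text_b)
```

Because the concepts are named Wikipedia articles rather than latent dimensions, the highest-weighted shared concepts also serve as a human-readable explanation of why two texts were judged related.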
Kind regards,
Evgeniy.
--
Evgeniy Gabrilovich
Ph.D. student in Computer Science
Department of Computer Science, Technion - Israel Institute of Technology
Technion City, Haifa 32000, Israel
Email: gabr(a)cs.technion.ac.il WWW: http://www.cs.technion.ac.il/~gabr