---- Original message ----
>Date: Sun, 09 Mar 2003 12:31:36 -0800
>From: Ray Saintonge <saintonge(a)telus.net>
>Subject: Re: [Wikipedia-l] Death to the comma count!
>To: wikipedia-l(a)wikipedia.org
>
>Takuya Murata wrote:
>
>>My only concern is that we should not make a decision in a
>>place where people can't participate. Yes, everyone can
>>discuss here, but only if they speak English. It seems most
>>Wikipedians on ja.wikipedia know almost nothing about
>>what is going on at the administrative level, such as here or
>>at the village pump on en.wikipedia.
>>
>>If you are talking only about en.wikipedia (since this
>>mailing list is in English), forgive my
>>misunderstanding.
>>
>>My suspicion is that we should stop using one universal system
>>for all editions. Different languages need different
>>coordination. How articles are counted should depend on the
>>language of the edition. Therefore, the discussion here should
>>be irrelevant and we should move this to each Wikipedia's
>>village pump.
>>
>There are very few rules that should apply across all language
>Wikipedias. Perhaps some very serious things like NPOV or not insulting
>each other, but in most areas I agree with your intent.
>
>At the same time, the much larger number of participants in the English
>Wikipedia means that more subjects get discussed with a wider range of
>opinions.
Needless to say, we can't use this as an excuse for ignoring
the minority.
>When much later an issue arises in one of the other languages it would
>be a good idea for somebody to translate a summary of the English
>discussion with a fair representation of all sides. After that there is
>no need to come to the same conclusion as the English speakers.
I prefer a United Nations style to that of the federal
government and states in the United States. We should probably
reflect the opinion of each edition rather than convey a
central opinion to each.
I posted some explanations of the matter at the Japanese village pump and
solicited opinions from those who prefer communicating in Japanese.
Here is input from Gombe (as I understood it).
Gombe thinks that while it's a little disappointing that the article count
doesn't really increase, counting commas is not totally unacceptable because
the main focus of the project is elsewhere anyway.
As an alternative, Gombe suggested that counting pages of more than two
paragraphs could be good.
----
By the way, for those who are interested, the Japanese Wikipedia looks like
this right now:
3762 pages
296 counted as articles
Distribution of file sizes in the article namespace:
   7 = 0 bytes
 127 < 50 bytes
 270 < 100 bytes
 812 < 200 bytes
1132 < 300 bytes
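A size table like the one above could be produced with something along these lines. This is only a rough sketch: it assumes the figures are cumulative (each row counts every page smaller than the threshold), which the message does not actually state, and the function name and sample sizes are made up for illustration.

```python
def size_distribution(sizes, thresholds=(50, 100, 200, 300)):
    """Return (number of empty pages, {threshold: pages strictly smaller})."""
    # Pages with no content at all.
    empty = sum(1 for s in sizes if s == 0)
    # Cumulative counts: a 10-byte page is counted under every threshold.
    below = {t: sum(1 for s in sizes if s < t) for t in thresholds}
    return empty, below


# Invented page sizes in bytes, not real ja.wikipedia data.
sample_sizes = [0, 12, 49, 50, 99, 150, 250, 299, 300, 4096]
empty, below = size_distribution(sample_sizes)

print(f"{empty} = 0 bytes")
for threshold, count in below.items():
    print(f"{count} < {threshold} bytes")
```

Because the rows are cumulative under this reading, the empty pages are included in every "<" row as well.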
>Aha, again demonstrating the obsession over the count. Why was it
>important to hit or not hit 100,000? Because of an offhand remark made a
>couple years ago about "we hope to reach 100,000 articles"?
Actually, milestones are important. While some statistics seem
meaningless, such as the list of most active Wikipedians, others
are important, including the number of registered users and
the number of articles.
We want to know what we have achieved and where we are
heading, and better still, we want to show it off.
>Why? What's *wrong* with small articles?
Because some people think short articles are useless. That is a
mistaken idea. Needless to say, the length of an article
indicates nothing about the usefulness of the article.
I strongly believe that a mere stub about a Japanese author,
written in English, is much more valuable than a list of
songs chosen by some stupid criterion.
>> > Unless a better count system is proposed, I will replace the comma check
>> > with a greater-than-zero-size check within twelve hours.
>>
>> And what about the people who get the digest after your 12 hour deadline? How
>> about the other people who only check or respond to Wikipedia posts during
>> the week? Shouldn't they have a say in this?
Brion's proposal is fair. While there are still objections, a
greater-than-zero-size check is fairer than comma counting.
After the change, we can still keep debating.
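To make the difference between the two checks concrete, here is a minimal sketch. It assumes the comma check simply means "the page text contains at least one ASCII comma", and the function names and sample pages are invented for illustration; the real counting code may differ.

```python
def comma_check(text: str) -> bool:
    # Assumed meaning of the comma count: any ASCII comma in the page text.
    return "," in text


def non_blank_check(text: str) -> bool:
    # The proposed replacement: any page whose size is greater than zero.
    return len(text) > 0


def count_articles(pages, check):
    """Count the pages that pass the given check."""
    return sum(1 for text in pages if check(text))


# Invented sample pages. The Japanese sentence is punctuated with the
# full stop U+3002 and contains no ASCII comma.
sample_pages = [
    "",
    "夏目漱石は日本の小説家。",
    "Natsume Soseki, Japanese novelist.",
    "A stub",
]

print(count_articles(sample_pages, comma_check))      # pages with a comma
print(count_articles(sample_pages, non_blank_check))  # non-blank pages
```

Under these assumptions, text written with Japanese punctuation never passes the comma check, which is one way the same rule can treat editions differently.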
>They had their say months ago when no one was able to decide what to do.
>Do you really think a new consensus is going to come in 24 hours? 48? A
>week? A month? A year? I think you're sorely mistaken if so. But,
>please, feel free to prove me wrong.
>
>Tell you what. I'll hold off until Wednesday night. Come up with a
>consensus on a better system by then, or comma-count shall be replaced
>with not-blank-count circa 07:00 UTC, 13 March. (11pm on the 12th here
>in PST.)
I am afraid we won't reach a consensus this time either. I
suggest that we at least change the counting system
of ja.wikipedia in exactly the way you proposed. Leaving an
unfair system in place is more unjust than continuing to debate
in the hope of reaching a consensus.
Besides, it appears no one supports the current comma
counting system, so why do we have to wait to overthrow it?
On Sunday 09 March 2003 12:01 pm, Brion Vibber wrote:
> Aha, again demonstrating the obsession over the count. Why was it
> important to hit or not hit 100,000? Because of an offhand remark made a
> couple years ago about "we hope to reach 100,000 articles"?
>
> When did this become our holy mission?
Round numbers, especially large ones, are milestones that get people's
attention. That is why x.0 is so important in the software world, why cities
celebrate the day they reach 1,000,000 inhabitants, why there was so much
mania when our calendars hit the year 2000, why the first billion-dollar
business and billionaires are mentioned in history books, and why we got a lot
of media attention after en.wiki hit the 100,000 count.
The article count is also a measure (however crude) of our progress. So there
is nothing wrong with trying to improve that measure and make it more
conservative where it makes sense (Jimbo has already stated he wanted a more
conservative count. However, right after he said that, we had already hit the
100,000 mark and were being slashdotted).
> Did the messianic age begin when the counter flipped into six digits?
> Have we all been betrayed by a sinister being who wants to make us look
> bad by leading us astray and "inflating our count"?
>
> What the *heck* does it matter?
Boy are you in a really bad mood today. See above.
> Bad to whom? Embarrassing to whom? Is it solely the use of the word
> "article" that throws us off? Are we obsessed with proving that our
> "articles" are so fricking wonderful that every single one of them must
> be the greatest pinnacle of writing prowess or we must lock it in the
> basement of shame and never admit its existence?
No - a simple automatic measure is all that is needed. We mention the
definition of the count on en.wiki's [[Wikipedia:What is an article]] page.
> Go open up a paper encyclopedia sometime. Look at it. A fair chunk of
> the articles are *one paragraph long*. Do their editors worry themselves
> over the metric they use to stamp "over 60,000 articles!" on the cover?
> Or do they just count the number of entries at some point and say "at
> least this many"?
Exactly - and how many bytes would a smallish complete paragraph be in such an
encyclopedia? Around 500 bytes. Then we could say that we *at least* have x
number of articles. Right now the count includes many entries that do not
consist of even one complete paragraph. A {{HEADLINEARTICLECOUNT}} set per
language would be flexible enough for both large and small
wikis. {{NUMBEROFARTICLES}} would be used for comparison purposes.
> Mav, thanks for proving my point again about count-mania. Are you
> seriously suggesting that the pseudo-random number spit out on the front
> page actually *defines* what articles are in a meaningful way?
Again, more unnecessary anger. Please calm down - we are not talking about
anything of such cosmic importance as to warrant such feelings. :-)
The answer to your question is above (the part talking about tracking our
progress and how the outside world sees our progress). So, yes it is
important to have a conservative estimate of the number of articles we have.
That's not to say that everything a computer would recognize as an article is
actually what a human would consider to be one. But since the computer will
also miss entries that /could/ be considered articles, then everything
averages out in the end (some really obscure subjects can, in fact, be
covered in a sub-500 byte entry).
In short, I'm not asking for an AI article count - I just would like to see a
more conservative crude method used on en.wiki that excludes more entries
that are probably not articles (however, we shouldn't go live with such a
count until after we have enough entries to still be above 100,000 - otherwise
we could get some negative media attention and a drop in morale).
IMO the best way to do that is to have a {{HEADLINEARTICLECOUNT}} set per wiki
in addition to {{NUMBEROFARTICLES}}. It would be up
to each language to define their own byte threshold for their own headline
count (or they could choose to ignore {{HEADLINEARTICLECOUNT}} and use the
much less conservative {{NUMBEROFARTICLES}}). Of course, each wiki that uses
{{HEADLINEARTICLECOUNT}} would then have to publicly document their threshold
for their own headline count.
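As a rough sketch of how a per-wiki headline count could sit next to the general count: the dictionary, the function names, and the "ja" threshold below are hypothetical, and {{NUMBEROFARTICLES}} is assumed here to mean "non-blank pages". Only the 500-byte figure comes from the discussion above.

```python
# Hypothetical per-wiki byte thresholds. 500 is the figure suggested above
# for en.wiki; the "ja" value is invented purely for illustration.
HEADLINE_THRESHOLDS = {"en": 500, "ja": 200}


def number_of_articles(page_sizes):
    # Assumed stand-in for {{NUMBEROFARTICLES}}: every non-blank page.
    return sum(1 for size in page_sizes if size > 0)


def headline_article_count(page_sizes, wiki):
    # Assumed stand-in for {{HEADLINEARTICLECOUNT}}: pages at or above the
    # wiki's own threshold. A wiki that defines no threshold falls back to
    # the less conservative general count.
    threshold = HEADLINE_THRESHOLDS.get(wiki)
    if threshold is None:
        return number_of_articles(page_sizes)
    return sum(1 for size in page_sizes if size >= threshold)


# Invented page sizes in bytes.
sample_sizes = [0, 120, 480, 500, 2300]

print(number_of_articles(sample_sizes))
print(headline_article_count(sample_sizes, "en"))
print(headline_article_count(sample_sizes, "ja"))
print(headline_article_count(sample_sizes, "eo"))  # no threshold defined
```

Keeping the thresholds in one visible table is one way each wiki could publicly document the value it chose.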
-- Daniel Mayer (aka mav)
WikiKarma
The usual at [[March 8]] (I'm fresh out of WikiKarma so I need to work on
creating some more balance in the Universe before I respond to your
response).
Hallo,
I just found a source for tons of bird images at
http://www.biologie.uni-hamburg.de/b-online/birds/naumann.htm, and I
am eager to use them. But I am not 100% sure whether I am allowed to use
them.
All images on that site are drawings from a 1905 field guide, so
they should be in the public domain. BUT all images were (as they describe on
their page) scanned and reworked with a graphics tool in order to
change contrast, lightness and hue. I don't know whether this makes the
owners of the website new copyright holders.
Of course I tried to contact them, but the e-mail address on that page
is invalid, so there is no contact person.
Can anyone tell me whether I am allowed to use these images?
Mirko.
--
Mirko Thiessen
http://www.mirko-thiessen.de
On the Dutch Wikipedia I, together with the others of course, am looking
out for possible copyright violations in images.
We have a system in which permission is requested from the owner of a picture,
and the credit and the permission are added to the picture's description
page.
The problem is that users are importing pictures from other Wikipedias with a
note like "from the English wikipedia" or "from the german wikipedia".
And when you look at that image on that Wikipedia, there is almost never a
comment about the source of the image.
Also, some Wikipedias take a different view of clear copyright
violations, like this:
http://sv.wikipedia.org/wiki/Tintin
I have to explain to a user of Wikipedia NL that an image she made by
taking a frame from a television broadcast cannot be used. But it is very
difficult to enforce a strong copyright policy when the other Wikipedias
are not taking image copyright more seriously.
--
Contact: giskart AT wikipedia.be
Want to write a little article too? WikipediaNL, the free GNU/FDL encyclopedia
http://www.wikipedia.be
Ray Saintonge wrote:
> BTW the Esperanto Wikipedia seems to have the greatest number of zero
> length articles.
A while ago we had a guy come in and add a zillion pages consisting of
externally-linked images with little or no text, and no reference as to
copyright status or permission for use. After some yakking, we got him
to peacefully retract them. Those that never had any text are thus left
empty until they get either filled out or deleted outright.
-- brion vibber (brion @ pobox.com)
>Hi,
>
>The comma count doesn't make sense to me.
>I would suggest size with a threshold of 100 bytes, or even 200 bytes.
We should not set the threshold arbitrarily. Needless to say,
the size of an article indicates nothing about its usefulness.
Neither does a comma. On this point,
Brion's idea is clever, and it seems the fairest of the
suggestions.
I've been having troubles with my e-mail server, which should now be
resolved; anyone who's sent me any direct e-mail (not through the lists)
in the last day or so, I haven't gotten it -- please re-send.
-- brion vibber (brion @ pobox.com)
>Hum, you are very right with language issue
>But, even if this list is in english, it is the main
>list, so issues discussed here are likely to impact
>all wikipedias
Then we should stop this. Please don't claim that this is a
policy and that you have to conform to it.
>Besides, it is the French silliness with comma hunting,
>which has been going on for about a month, that explains
>this subject being raised again. But the issue is not
>new :-)
My opinion is simple enough. We should stop governing
non-English Wikipedias through an English-language government.
>> Therefore, the discussion here should
>> be irrelevant and we should move this to each wikipedia's
>> village pump.
>
>Yes and no.
>It is not irrelevant, for some people thrive on
>monitoring the number of articles, and many
>internationals are eager to go up in the rank.
>For those, it is important that the numbering
>technique be the same, for the comparison to stay
>valid.
Why do we have to compare in the first place? We are not
running a race to see which of the non-English editions
reaches a milestone first.
>On the other hand, it would be interesting if
>discussions went on in parallel in the pump (or coffee shop
>:-)) for everyone to feel involved
Not just feel involved: everyone must actually be involved. Otherwise, we
cannot claim that the consensus is based on the public.
>My feeling is that counting technique should maybe be
>related to the language family.
That is my point. Counting should vary according to the
language, or something like the language family.
By the way, I posted about this issue at the village pump of
ja.wikipedia. Hopefully they will give me (not us, though) more
thoughts.