Illario, Latin doesn't have L1 speakers. And data about languages are such a mess that I would stick with Ethnologue's data for L1 speakers, although they are not reliable. Ethnologue counts like this: "there are 100,000 speakers of language X in country A and 34 in country B, thus there are 100,034 speakers in total" (although the likely error margin of the first number is 150 times larger than the second number). It also has numerous other flaws, like the fringe "macrolanguage" category. However, besides counting the same way, English Wikipedia has much worse failures once we leave the safety of the ~50 major languages, wherever it is not based on Ethnologue's data. (It's mostly about wishful thinking of ethnic nationalists and a chronic lack of manpower to fix that bullshit promptly.)
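
To put numbers on the counting problem: if the error margin of the 100,000 figure is about 150 times the 34 from country B, that is roughly 5,100 speakers, so the last digits of 100,034 carry no information. A minimal sketch of the arithmetic follows; the helper name round_to_uncertainty and the rounding rule (round to the decimal place of the margin's leading digit) are my own assumptions, not anything Ethnologue does.

```python
import math

def round_to_uncertainty(total, margin):
    """Round total to the decimal place of the margin's leading digit.

    Hypothetical helper: one simple convention for not reporting
    more precision than the error margin supports.
    """
    if margin <= 0:
        return total
    step = 10 ** math.floor(math.log10(margin))
    return round(total / step) * step

# Figures taken from the example in the text.
count_a = 100_000          # speakers reported for country A
count_b = 34               # speakers reported for country B
margin_a = 150 * count_b   # assumed error margin of count_a

naive_total = count_a + count_b
honest_total = round_to_uncertainty(naive_total, margin_a)

print(naive_total, margin_a, honest_total)
```

Under these assumptions the 34 speakers from country B vanish inside the uncertainty of the first figure.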

Nemo, yes, I was thinking about various data, not only article count and GDP/PPP per capita, so here are my thoughts, including those two parameters:

* Article count per speaker gives one nice pseudo-hyperbolic curve. Basically, you can see a hyperbolic curve by drawing a line over the highest points: Hawaiian-Upper Sorbian-Basque-Swedish-Dutch-English. By normalizing the numbers, we could get targets per language.

* However, edit count seems like a better idea. I think, though it has to be proved, that such numbers won't have to be adjusted for the number of speakers themselves.

* We could count various numbers related to users. For example, it seems that the smaller the ratio between the number of active and very active users, the healthier the community. Also, the number of editors per million speakers per GDP or HDI could be a useful parameter.

* I was thinking about HDI yesterday. But then I realized that it would be good to create all of the possibly relevant charts and see what information they bring. I am interested in comparing Wikipedia stats with the Gini coefficient, for example.
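
A rough sketch of how the first and third points above could be computed, assuming a table of (language, speakers, articles) rows. All names and figures here are invented placeholders, not real data. The "line over the highest points" is read as the set of languages that no larger language beats on articles per speaker, and the per-language target is interpolated along that line in log-log space. Dividing editors per million speakers by HDI is only one possible reading of "per GDP or HDI".

```python
import math

def upper_envelope(rows):
    """Return (name, speakers, articles_per_speaker) for the languages
    that no language with more speakers beats on articles per speaker,
    sorted by number of speakers (ascending)."""
    frontier = []
    best = -math.inf
    # Walk from the largest language down, keeping each new record ratio.
    for name, speakers, articles in sorted(rows, key=lambda r: r[1], reverse=True):
        ratio = articles / speakers
        if ratio > best:
            frontier.append((name, speakers, ratio))
            best = ratio
    frontier.reverse()
    return frontier

def target_ratio(frontier, speakers):
    """Interpolate the envelope in log-log space; clamp outside its range."""
    if speakers <= frontier[0][1]:
        return frontier[0][2]
    if speakers >= frontier[-1][1]:
        return frontier[-1][2]
    for (_, s0, r0), (_, s1, r1) in zip(frontier, frontier[1:]):
        if s0 <= speakers <= s1:
            t = (math.log(speakers) - math.log(s0)) / (math.log(s1) - math.log(s0))
            return math.exp(math.log(r0) + t * (math.log(r1) - math.log(r0)))

def community_indicators(active, very_active, speakers, hdi):
    """Active-to-very-active ratio, and active editors per million
    speakers divided by HDI (an assumed normalization)."""
    per_million = active / (speakers / 1_000_000)
    return active / very_active, per_million / hdi

# Invented placeholder rows: (name, speakers, articles).
SAMPLE = [
    ("small_a", 10_000, 5_000),
    ("small_b", 20_000, 2_000),
    ("mid_a", 1_000_000, 100_000),
    ("mid_b", 4_000_000, 40_000),
    ("big_a", 100_000_000, 1_000_000),
]

frontier = upper_envelope(SAMPLE)
envelope_names = [name for name, _, _ in frontier]
# Target article count for a hypothetical language with 100,000 speakers.
target_articles = target_ratio(frontier, 100_000) * 100_000
ratio, editors_index = community_indicators(
    active=200, very_active=20, speakers=2_000_000, hdi=0.8
)
print(envelope_names, round(target_articles), ratio, editors_index)
```

Whether the envelope should be interpolated like this or fitted with an actual hyperbola is an open choice; the charts would have to show which one describes the data better.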

And I will do that, after I finish the most frustrating part of the job: drawing the line between Wikipedia editions, Ethnologue data and actual languages. The good news is that I am on the ~150th of ~280 Wikipedia editions and it's likely I will finish during the next week. (After almost eight years of dealing with this matter, whenever someone says that there are two hundred eighty something Wikipedia languages or that there are 7000 languages in the world, I reach for my revolver.)

On Jun 12, 2015 20:51, "Federico Leva (Nemo)" <nemowiki@gmail.com> wrote:
Milos Rancic, 08/06/2015 00:23:
And I suppose somebody with statistical knowledge would be able to
give us the number which would have meaning "ability to create
Wikipedia article".

Why not use the human development index (HDI) as factor? Also, instead of the number of articles I'd rather use database size or number of words.

Nemo

_______________________________________________
Languages mailing list
Languages@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/languages