Is there any reason why when inserting a new article from outside
EditPage.php doesn't update the category links?
$title = Title::newFromText("Hellos");
$article = new Article($title);
$ret = $article->insertNewArticle("hi there![[Category:Greetings]]",
"", false, false, false)
The article is successfully created and links to the Greetings
category. However, the "Greetings" category does not show the new
article in the list of articles in that category.
Travis
Hey Brion,
Doh! It does look like the UTM code that Google Analytics uses does
cookie anonymous users, will this prevent us from configuring Squid
properly? The cookies won't affect the display of the pages loaded, is
there a way to ignore a set of cookies that start with __utm in the
context of the cache?
Travis
> Message: 10
> Date: Mon, 05 Dec 2005 11:32:34 -0800
> From: Brion Vibber <brion(a)pobox.com>
> Subject: Re: [Wikitech-l] Squid configuration problem
> To: Wikimedia developers <wikitech-l(a)wikimedia.org>
> Message-ID: <439495D2.3050400(a)pobox.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Travis Derouin wrote:
> > Hey,
> >
> > Does anyone have any suggestions about configuring Squid in reverse proxy
> > for MW? We're seeing that our articles aren't being cached and Squid is
> > still making several requests per second for the same article, despite
> > configuring Squid as instructed at meta.wikimedia.org. Here are our
> > settings:
> >
> > $wgUseSquid = true;
> > $wgUseESI = false;
> > $wgInternalServer = $wgServer;
> > $wgSquidMaxage = 18000;
> > $wgSquidServers = array('10.234.169.202');
> > $wgSquidServersNoPurge = array();
> > $wgMaxSquidPurgeTitles = 400;
> > $wgSquidFastPurge = true;
> >
> > Squid and Apache are running on different machines, and we're seeing several
> > 200 response codes for unchanged articles from Apache, even in a small time
> > span of a few minutes. While the apache load has been reduced, we should be
> > seeing Squid handling more of the serving of normal article pages, instead
> > of forwarding the request to Apache almost 100% of the time.
>
> Can you confirm that anonymous users have no cookies? If you manually hit the
> server with manual HTTP can you confirm cache hits/misses?
>
> -- brion vibber (brion @ pobox.com)
>
>
This question belongs here, I believe. Would such a patch be possible?
----- Forwarded message from Gerrit Holl <gerrit(a)nl.linux.org> -----
Date: Mon, 5 Dec 2005 11:12:36 +0100
Subject: Whitelisting 'members'
From: Gerrit Holl <gerrit(a)nl.linux.org>
To: helpdesk-l-owner(a)wikimedia.org
Cc: helpdesk-l(a)wikimedia.org
User-Agent: Mutt/1.4.1i
Lines: 32
Gerrit Holl wrote:
> After Christmas I might have time to help in creating such a software
> patch, it should not be too difficult and it would probably make a lot
> of difference in the moderation queue work.
I might have time before as well. What version of mailman do the
mailinglist servers run?
Is read-only access to the database available from the mailinglist
server? It would require but a single SQL query:
select "" from user where user_email=email_address limit 1;
If this has a result, the e-mail is let through.
If it has not, it is moderated.
Almost all, if not all, of those e-mails will be good faith. A few lines
of code added to Mailman/Handlers/Moderate.py function process would do
the trick, I think before line 93 or 109: hold for approval if the action for
non-members is 'hold'. We could change that into: hold for approval if
the address does not occur in the enwiki database, and is not a member
of the mailing list, and the mailing list default is set to 'hold'.
What do you think?
Gerrit.
--
Temperature in Luleå, Norrbotten, Sweden:
| Current temperature 05-12-05 10:49:53 -4.5 degrees Celsius ( 23.8F) |
--
Det finns inte dåligt väder, bara dåliga kläder.
----- End forwarded message -----
--
Temperature in Luleå, Norrbotten, Sweden:
| Current temperature 05-12-05 11:19:50 -4.5 degrees Celsius ( 23.9F) |
--
Det finns inte dåligt väder, bara dåliga kläder.
Hi,
I'm having a go at coding a feature request, and I have a question
about messages. The feature is a new special page, so all of the code
is within one file, the only thing that is outside that file is a
require_once and some messages.
I was wondering where is the place to put new messages? I figured in
Language.php, but how then do I update the database with them and
create the relevant MediaWiki: pages?
Also, http://meta.wikimedia.org/wiki/How_to_become_a_MediaWiki_hacker
recommends creating a patch and posting it at bugzilla, but I'm on
Windows - how do I create a cvs diff in this environment?
Thanks,
--
Stephen Bain
stephen.bain(a)gmail.com
Folks,
I hope I am not off-topic to refer this RfC to the mailing list.
http://en.wikipedia.org/wiki/Wikipedia_talk:Requests_for_comment/Roylee
Please direct all discussion to the Talk page above, and not to this list ?
I include a little context below.
Cheers, Andy!
Purpose of this RfC
Personally, I am not interested in sanctioning the editor Roylee.
What I think is most important, and accordingly what should be the
aim of this RfC, is finding a solution to the underlying social and
technical problems of Wikipedia as exposed by this issue. As BanyanTree
said here, "it took over a hundred edits before Mark began to reel
Roylee in. Is there another user whose made 50 similar edits, who has
not been discovered? Are there a hundred such users?"
Simple-minded Linus's law-derivates are not the answer here. Many
eyes have looked at the articles that were affected, and did not
recognize any problems. More paradoxically, if literally everyone
would have assumed good faith, only brute force fact-checking could
have detected the problem eventually. This is profoundly worrying.
Quite frankly, all this has made me doubt, maybe for the first time,
the long-term viability of Wikipedia as a trustworthy resource. --
mark 14:46, 3 December 2005 (UTC)
A clever manipulator will always be able to insert unwarranted
material into Wikipedia for a time, but many eyes eventually
discover even believable nonsense. This issue is a case in
point. The principle of caveat lector should not be restricted
to elite education: we should all have been raised as doubters
of text, right from the start. Even the Encyclopaedia
Britannica. As for me, so far am I from continuing to Assume
Good Faith, in the face of bad edits, when I find a vandal I
try to check through that IP's contributions, and sometimes
discover previously unnoticed vandalism.
The question remains, how many of these bad edits remain in
Wikipedia articles? --Wetman 03:05, 4 December 2005 (UTC)
I disagree with your first point ("many eyes eventually
discover..."), which is of course a truism among Wikipedians as an
extension of the proposition that vandalized articles constitute a
fraction of one percent of all articles. This case appears to indicate
that "believable nonsense" can remain on Wikipedia for unacceptably long
periods. Cases such as the long-lasting misinformation found recently
on John Seigenthaler Sr. can perhaps be written off as "believable
nonsense" that was missed because the vandal did not edit a number of
articles, which has a better chance of rousing suspicion and being
reverted (another truism of vandal fighting). This is not the case with
Roylee, who created believable nonsense in self-supporting webs of
article across the normal Wikipedian topics of specialization over a
period of months.
This simply shouldn't have have been possible, and that fact that
it did happen indicates that Wikipedia's processes are not as
robust as they are advertised. As I mention in the post that Mark
links to above, Roylee throws any blanket reassurance given for
Wikipedia's credibility into doubt. - BanyanTree 19:59, 4 December
2005 (UTC)
Wikipedia has credibility? Have you seen pages like Nietzsche or Khmer
Rouge? Wikipedia has many, many roadblocks to overcome before it has any
hint of credibility collectively. I think currently, each article has
to prove it's own credibility, it's not simply inherited because it's a
Wikipedia article. (Bjorn Tipling 21:23, 4 December 2005 (UTC))
I basically concur (see Wikipedia:Researching with Wikipedia, largely
written by me and Dan Keshet), but the goal is to work toward having
credibility, and it looks like Roylee's contributions have been a
detriment. And that is what this RfC is about, no? -- Jmabel | Talk
03:53, 5 December 2005 (UTC)
This is especially the case as we cannot be sure if he is unique
or if his detection is unique, e.g. are there numerous users adding
misinformation using similar patterns? I would like to think not, but I
don't know how anyone can guarantee it. I had hoped to hear a solution
to the problem from people reading this RfC, and the lack makes me think
that the problem is structural rather than individual. I would love to
be proved wrong on this. - BanyanTree 19:59, 4 December 2005 (UTC)
I am currently working on a Java applet for JSP wiki which I would
eventually like to migrate for MediaWiki. It highlights wiki syntax to make
it easier for people to learn and use wikis (for details see
http://www.jspwiki.org/wiki/WikiWizard). Right now I want it to copy text
from MS Word and format it into wiki syntax. I have already figured out how
to get the Word content from the clipboard as HTML (for details see
http://forum.java.sun.com/thread.jspa?threadID=688889) and I was wondering
if anyone knows of any open source Java code that converts HTML to wiki
syntax. Any help would be greatly appreciated.
Thanks,
Chuck
PS: For those interested in the differences between MediaWiki syntax and JSP
Wiki syntax, I have created the following page:
http://www.jspwiki.org/wiki/MigratingFromMediaWiki
Thanks to Brion, who pointed out the matter of readabilty to me.
Accordingly, please allow me another, nicely formatted and more to the
point, posting:
It has been said that Wikipedia is „work in progress“ and will probably
continue to do so. On the other hand it ails from the fact that at no
given point in time you can be certain to have a simultaneously
1.
consistent (with respect to various articles on a similar topic)
2.
unvandalized and
3.
correct (with respect to a single article) throughout Wikipedia
From my point of view, compared to those three points the shortcoming
of the non-completeness of WP dwindles to almost nothing.
Let me draw your attention to the fact that the construction plans for
roads to stability – or at least local optima – have long been laid out
by physics. Heat a dynamic system quickly then let it cool down in a
slower and controlled fashion, allowing less and less dramatic changes
to take place as time passes. Simulated annealing
(http://en.wikipedia.org/wiki/Simulated_annealing) is the magic spell
that might work for wikixyzs in a way similar to that in the real world.
The rationale behind my suggestion is of course that articles that have
matured over time are - statistically speaking - less likely to improve
when large modifications are made than relatively new ones. Some of the
articles have reached a stage where well-meant editing effectively mucks
up the inner structure and logic.
What I think reasonable is to lift the threshold for substantial edits,
maybe not by limiting access but by asking for more substantial
background information from the authors (references, printed,
electronic,...) than the simple comment line. There is too much unproven
and partially unprovable information in the WP. That could have been
prevented long ago by obliging the authors to give references for their
information. Besides, this task would make it successively harder to
simply put established statements upside down. Whereas scientific
journals have peer review to prevent superfluous or erroneous
contributions, WP only offers the weak weapons of discussion pages (for
everyone) and reverts (mostly by admins, who can't always claim
erudition in all the domains they are watching, I guess).
So why not confer a little bit more of responsibility to the authors!?
He/she could be aided by predefined lists, checkboxes, comboboxes (for
ref.type, etc.). Asking a little more information from authors could be
a substantial part of the rising editing threshold necessary for
"cooling down" WP a bit.
I find myself increasingly involved in hunting down vandals and their
work – partly due to the ease of use WP offers for non-serious edits,
too, and I can‘t help feeling that a larger and larger part of WP keeps
a larger and larger part of the community busy with just keeping up the
existing standard. We mustn't be sure of still finding enthusiatic
acclaim in the years to come when WP becomes a battlefield in a fight
against distracting, redundant or plain wrong infobits.
Comments from both the user/admin and developer side welcome.
Best,
kai (kku)
Hello folks,
lately again we've had some stuff going on...
Though Brion was implementing that anon-blocking stuff (yay, more
blocking - faster performance!)
we were targeting other performance issues as well.
Tim did rewrite ip block code (did cut 50ms or so ;-) as well as made
lots of other nice stuff,
and now we implemented Mark's idea to run diskless squids (well, they
have disks, but no
cache on them).
Lots of our new servers have joined object cache running (hehe,
again) Tugela, instead of
memcached. It's interesting to see how it should grow. Sadly, no
expiration (memory->disk)
of objects happened yet in a week, so we can't measure anything.
BerkeleyDB standalone
might be a bit faster than memcached, though, benchmarks on same
hardware were not
conducted.
~22G of data is cached in object cache now - parser objects, image
metadata, diffs,
sessions, user objects, 'you have new messages' bits and language
objects. So far we didn't
notice any of glitches that forced us to remove Tugela from service
before (some cosmetic
patches were done). Anyway, we have more RAM, that didn't cost
millions, we use it.
Anyway, today with squids running from memory only we managed to
achieve 0.09s
average response times for logged in users, at least those who go
directy to Florida.
Before that Squid efficiency was really distorted by somewhat
blocking async i/o (if it
really existed there), poor sibling relations and memory leak.
We still have that memory leak and are somewhat lost with it.. Squid
'accounts' for 1G
memory, uses >2G, and it grows, until restarted. We need to solve
that, but nobody has
every really touched valgrind at such loads (eh, today squid servers
were serving like
>700 requests per second each), and I'm not sure if anyone touched
valgrind properly
at all ;-) We'll soon have a bunch of servers suitable for squid
task, but still, using them
more efficiently would help. We will always lack resources at some
place :-)
Guidelines could help, as well we could simply provide our sources, a
bit of configuration
and load documentation. *shrug*.
Another troubling part is sibling relations - right now each proxy
marks others as siblings
and proxy-only, that is shouldn't save contents into cache.
Eventually they do not talk
to each other at all and hit backend, and all have their separate
caches.
I'm not sure if that's related with equal object expiration times or
any other hypothesis.
If anyone has had experience with squids in such setups, where
there're lots of objects
and lots of servers and efficiency was managed, it would be sure nice
to hear it.
It is still strange that it blocks quite a bit at some housekeeping
operations on i/o.
BTW, it took a while today to detect a serious packet loss in our
upstream providers.
It does slightly affect client network performance, but quite stalls
communication
between our distributed clusters. Looking for such problems becomes a
bit of witch hunt :)
So much of today's experiences and joys ;-)
Cheers,
Domas
Thanks to all who replied to my previous message:
http://mail.wikipedia.org/pipermail/wikitech-l/2005-December/032856.html
In the end I used TortoiseCVS, which recommended WinMerge for diffs.
Apologies in advance for another newbie question, this is the first
time I've worked with CVS. I've finished my patch, and posted it at
MediaZilla:
http://bugzilla.wikimedia.org/show_bug.cgi?id=04028
How do I add this to the CVS, and more importantly, in which package
do I add my changes? Since the feture request is for a Special: page,
almost all the code is self-contained, and so it should be fairly safe
to upload.
--
Stephen Bain
stephen.bain(a)gmail.com