An automated run of parserTests.php showed the following failures:
Running test TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html)... FAILED!
Running test TODO: Link containing double-single-quotes '' (bug 4598)... FAILED!
Running test Magic Word: {{CURRENTMONTHNAMEGEN}}... FAILED!
Running test TODO: Template with thumb image (with link in description)... FAILED!
Running test TODO: message transform: <noinclude> in transcluded template (bug 4926)... FAILED!
Running test TODO: message transform: <onlyinclude> in transcluded template (bug 4926)... FAILED!
Running test BUG 1887, part 2: A <math> with a thumbnail- math enabled... FAILED!
Running test TODO: HTML bullet list, unclosed tags (bug 5497)... FAILED!
Running test TODO: HTML ordered list, unclosed tags (bug 5497)... FAILED!
Running test TODO: HTML nested bullet list, open tags (bug 5497)... FAILED!
Running test TODO: HTML nested ordered list, open tags (bug 5497)... FAILED!
Running test TODO: Parsing optional HTML elements (Bug 6171)... FAILED!
Running test TODO: Inline HTML vs wiki block nesting... FAILED!
Running test TODO: Mixing markup for italics and bold... FAILED!
Running test TODO: 5 quotes, code coverage +1 line... FAILED!
Running test TODO: HTML Hex character encoding.... FAILED!
Running test TODO: dt/dd/dl test... FAILED!
Passed 412 of 429 tests (96.04%) FAILED!
Hello,
some spammer was (or still is) sending spam with faked @wikipedia.org
sender addresses. The bounces for this spam were sent back to our mail
gateways, overloading them yesterday and today.
We changed the setup of the backup MX. It now knows about the existing
mailboxes and rejects mail to unknown recipients directly. In the past,
it accepted any mail for wikipedia.org or wikimedia.org. With this
change, the load on the primary mail server has gone down dramatically.
At noon, our primary MX goeje was handling 200 concurrent mail connections,
which was its hard limit. After we raised the limit to 500, 500
connections were established, but the box started swapping heavily.
The secondary MX has a relay_recipient_maps list configured, which is
updated every 15 minutes from goeje. If a new mailbox or mailing list is
set up, it takes up to 15 minutes until this mailbox is accessible via
the backup MX.
Regards,
jens
Does anyone know what all the steps are for the process of wikifying
(and de-wikifying) a string for URI use? e.g., replacing spaces with
underscores, urlencode()ing, etc.
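
A minimal sketch of the steps named in the question, written in Python rather than PHP. The function names `wikify` and `dewikify` are hypothetical, and capitalising the first letter is an assumption about how the wiki is configured; this is not MediaWiki's own code.

```python
from urllib.parse import quote, unquote

def wikify(title: str) -> str:
    """Turn a page title into a URL path segment (illustrative only)."""
    title = " ".join(title.split())        # trim and collapse whitespace runs
    title = title[:1].upper() + title[1:]  # assumption: first letter is capitalised
    title = title.replace(" ", "_")        # spaces become underscores
    return quote(title, safe="_:/()")      # percent-encode the rest as UTF-8

def dewikify(segment: str) -> str:
    """Reverse the steps: percent-decode, then underscores back to spaces."""
    return unquote(segment).replace("_", " ")

print(wikify("café au lait"))
print(dewikify("Caf%C3%A9_au_lait"))
```

The round trip is lossy in one direction: a title that really contains an underscore comes back with a space, because both map to the same encoded form.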
Cheers,
Saqib
Hello,
I have written a "Google Suggest"-like service for Wikipedia under the GPL
licence [1].
I am not sure whether this project will interest you, but I am open
to all comments from your community.
I wrote a simple web site for English/French at
http://suggest.speedblue.org; all the software is available in the download
section.
Best Regards.
Julien Lemoine
[1] If you want me to change the licence to something other than GPL,
please let me know.
On 8/3/06, Mark Bell <typewritermark(a)gmail.com> wrote:
> I would like stats or a graph that shows:
>
> Number of unique editors per article X number of articles with that many
> unique editors.
>
> So for example
>
> Number of unique editors    Number of articles with that many unique editors
> 15                          356
> 30                          3455
>
>
> And so on. Is this even possible?
Certainly. If you know SQL and a programming language that can talk to
MySQL, it shouldn't be hard to
generate such stats from a database dump (see
http://meta.wikimedia.org/wiki/Database_dumps). If you don't know how
to, maybe someone who does will have pity on you and do it for you. I
haven't gotten around to learning MySQL yet, but it looks to be
probably fairly simple if you have access to a dump.
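
As a rough illustration of the counting step, here is a sketch in Python. It assumes the (article, editor) pairs have already been extracted from a dump, which is not shown; the function name `editor_histogram` and the sample data are made up for the example.

```python
from collections import Counter

def editor_histogram(revisions):
    """revisions: iterable of (article_title, editor_name) pairs, one per edit."""
    editors_by_article = {}
    for article, editor in revisions:
        # A set keeps each editor once per article, however many edits they made.
        editors_by_article.setdefault(article, set()).add(editor)
    # Map "number of unique editors" -> "number of articles with that many".
    return Counter(len(editors) for editors in editors_by_article.values())

# Hypothetical sample data standing in for rows read from a dump.
sample = [
    ("Trie", "alice"), ("Trie", "bob"), ("Trie", "alice"),
    ("Postfix", "carol"),
    ("MediaWiki", "alice"), ("MediaWiki", "dave"),
]

for n_editors, n_articles in sorted(editor_histogram(sample).items()):
    print(n_editors, n_articles)
```

Each printed row is one line of the requested table: the first column is the number of unique editors, the second the number of articles with that many.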
Julien Lemoine wrote:
> With this solution, you will not be able to handle a lot of queries per
> second (it will probably take more than 1 second to query 2.3 million
> entries). If you use a trie (yes, I know this word now :)), everything
> is pre-computed, a query will use very little CPU, and you will be able
> to handle a lot of queries per second.
:-)
I'm sorry I hadn't taken the time to read about how you are doing it - I
just looked up what a "trie" is, and I see why it's a hurdle - the whole
structure is based on the first x characters. It sounds like you are using
a file to store the trie data structure, and not using SQL at all? I've got
to say that I only know basic SQL, PHP, and some C - I don't have a
background to speak intelligently about optimising code for zillions of
hits, but I would approach this by taking all the article titles, and
creating an index with every significant word in the title. After that, I'm
afraid I'd have to use SQL, and likely fall back onto using "LIKE" somewhere
if I was going to return partial-word matches (an "obvious and naive"
solution). For whole word matches (but matching multiple words, or words
that are not the first word in the page title) you could query the index
directly. I suppose one could build an index of partial words... would that
be faster than using "LIKE"? Hmm...
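
For readers who have not met the structure, a small in-memory trie might look like the sketch below. It is only an illustration of prefix lookup; the class and method names are hypothetical, titles are lowercased for matching, and it makes no claim about how the suggest service actually stores or pre-computes its data.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # next character -> TrieNode
        self.is_title = False  # True if a stored title ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, title):
        node = self.root
        for ch in title.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.is_title = True

    def suggest(self, prefix, limit=10):
        """Return up to `limit` stored (lowercased) titles starting with `prefix`."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []      # no stored title has this prefix
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_title:
                results.append(text)
            # Push children in reverse order so they pop in alphabetical order.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results

trie = Trie()
for title in ["Laurel and Hardy", "Laurel", "Laura", "Hardy"]:
    trie.insert(title)
print(trie.suggest("lau"))
```

The cost of a lookup depends on the length of the prefix and the number of results returned, not on the total number of titles, which is why it scales better than scanning every row. The same property explains the limitation: only the start of a title can be matched.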
I'm throwing queries at my category_links table that match category_a OR
category_b, then group by page and count results to perform a category math
/ category intersections function. It seems pretty fast, but I'm not using
ajax, it's just a posted form. It would be fun to write an ajax
implementation though :-)
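
The "match either category, group by page, count" idea can be sketched as follows. SQLite (via Python's `sqlite3` module) stands in for MySQL here, and the `category_links(page, category)` layout, the sample rows, and the function name `pages_in_all` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category_links (page TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO category_links VALUES (?, ?)",
    [
        ("Laurel and Hardy", "Comedians"),
        ("Laurel and Hardy", "Film duos"),
        ("Charlie Chaplin", "Comedians"),
        ("Fred and Ginger", "Film duos"),
    ],
)

def pages_in_all(conn, categories):
    """Return pages that belong to every one of the given categories."""
    wanted = sorted(set(categories))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT page FROM category_links "
        f"WHERE category IN ({placeholders}) "   # rows matching any category
        "GROUP BY page "
        "HAVING COUNT(DISTINCT category) = ? "   # keep pages that matched all
        "ORDER BY page"
    )
    rows = conn.execute(sql, [*wanted, len(wanted)]).fetchall()
    return [page for (page,) in rows]

print(pages_in_all(conn, ["Comedians", "Film duos"]))
```

Changing the `HAVING` condition to `>= 1` would give the union instead of the intersection.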
I have always assumed that MySQL is internally optimised enough that if one
sticks to simple queries and whole word matches, you get pretty good
performance - but that's an assumption on my part. I'm very interested in
that because I'd like to know more about optimising search queries, but of
course it's not your problem to teach me.
Best Regards and good luck with your project.
And Timwi wrote:
> > I set mine up to require at least 4 characters and
> > to break on whitespace, so the SQL (pseudocode) is WHERE LIKE(wordone)
> > AND LIKE(wordtwo) and so on.
>
> This is, of course, the most obvious and most naive solution. For a data
> set as large as Wikipedia's list of article titles, this is far too slow
> and inefficient and would kill the site for everyone if it was used on
> the live database.
>
>
Timwi, from your other posts I can see that you know a lot more about
computer programming than I do, but I would hope to be able to offer some
opinion without getting slapped with "obvious and naive".
Regards,
Aerik
An automated run of parserTests.php showed the following failures:
Running test Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html)... FAILED!
Running test Link containing double-single-quotes '' (bug 4598)... FAILED!
Running test Magic Word: {{CURRENTMONTHNAMEGEN}}... FAILED!
Running test Template with thumb image (wiht link in description)... FAILED!
Running test message transform: <noinclude> in transcluded template (bug 4926)... FAILED!
Running test message transform: <onlyinclude> in transcluded template (bug 4926)... FAILED!
Running test BUG 1887, part 2: A <math> with a thumbnail- math enabled... FAILED!
Running test HTML bullet list, unclosed tags (bug 5497)... FAILED!
Running test HTML ordered list, unclosed tags (bug 5497)... FAILED!
Running test HTML nested bullet list, open tags (bug 5497)... FAILED!
Running test HTML nested ordered list, open tags (bug 5497)... FAILED!
Running test Parsing optional HTML elements (Bug 6171)... FAILED!
Running test Inline HTML vs wiki block nesting... FAILED!
Running test Mixing markup for italics and bold... FAILED!
Running test 5 quotes, code coverage +1 line... FAILED!
Running test HTML Hex character encoding.... FAILED!
Running test dt/dd/dl test... FAILED!
Passed 412 of 429 tests (96.04%) FAILED!
Hi all!
I am currently developing an extension for MediaWiki 1.6.7 where I want to
add a textarea to a page in edit mode right next to the summary field. I had
no problems creating a textarea but I don't know how I can show this
textarea only in "edit mode" next to the "Summary" textfield.
I also wanted to make a "select distinct" query to the database but I
haven't been able to find the right function or how to do that.
It would be great if someone could help me out with that.
Thank you.
Jennifer
Julien Lemoine wrote:
> Steve Bennett wrote:
> > Just one more request: what about if the search string matches
> > *within* the phrase, but not necessarily at the start. Like "Hardy"
> > matching "Laurel and Hardy"? Is that possible with this data
> > structure?
> >
> No, it will not. But I don't think this would be really useful:
> imagine a query with a single letter ("a" or "e", for example); it would
> match at least half of the articles, and the results would be meaningless.
>
> Best Regards.
> Julien Lemoine
>
>
I wrote a "suggest" function for finding part numbers for a request form at
work - enabling multi-word search is really useful and powerful. I haven't
looked at your code, so I don't know if you're using a keyword index, or
LIKE SQL functions. I set mine up to require at least 4 characters and
to break on whitespace, so the SQL (pseudocode) is WHERE LIKE(wordone) AND
LIKE(wordtwo) and so on.
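
Spelled out, the pseudocode above could look something like this sketch. It uses Python with SQLite as a stand-in database; the `parts` table, the `name` column, and the function names are hypothetical, and the 4-character minimum is assumed to apply to the whole input rather than to each word.

```python
import sqlite3

MIN_CHARS = 4  # minimum input length before a search is attempted

def escape_like(word):
    """Escape LIKE wildcards so user input is matched literally."""
    return word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")

def build_suggest_query(text, table="parts", column="name"):
    """Return (sql, params) for an AND-of-LIKEs search, or None if too short.

    `table` and `column` are hypothetical names and must never come from
    user input, since they are interpolated into the SQL text.
    """
    text = text.strip()
    if len(text) < MIN_CHARS:
        return None
    words = text.split()  # break on whitespace
    clauses = " AND ".join(f"{column} LIKE ? ESCAPE '\\'" for _ in words)
    sql = f"SELECT {column} FROM {table} WHERE {clauses} ORDER BY {column}"
    params = [f"%{escape_like(w)}%" for w in words]
    return sql, params

def suggest(conn, text):
    query = build_suggest_query(text)
    if query is None:
        return []
    sql, params = query
    return [row[0] for row in conn.execute(sql, params)]
```

Because each pattern starts with `%`, the database generally cannot use an ordinary index on the column and has to scan the rows, which is acceptable for a small parts list but is the cost raised elsewhere in this thread for very large title sets.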
Best Regards,
Aerik