Brion Vibber saith:
> The wikidown address has been getting spam in the last few days... I
> don't know if Lee still has it forwarding to his beeper, but that could
> be awful annoying. :P
>
> Suggestions?
Contact the spammers and say that this isn't a personal e-mail address
and that any advertisements will be ignored? 'Contact' meaning go to
the spammers' website and find an e-mail address, not (of course) reply
to the spam. Okay, maybe that's not the best idea...
Ask for the message to be quoted in its entirety? I myself already do
that and most others likely do. This is simpler than asking for 2pi,
"managing flamingos", etc. in the message, and serves the same purpose.
Change the wikidown address, say, to wiki.down(a)bomis.com? Or preferably
wikidown(a)wikipedia.org and set mail.bomis.com as a backup MX? I doubt
anyone has the wikidown address in his/her address book, so it wouldn't
be hard to change the address.
Is it a virus or HTML mail? Virus spam and HTML mail often contain
large attachments. If we require that the wikidown mail be in plaintext,
most people can do that, and there should be enough people reporting to
make up for those who can't send plaintext (if any).
Best idea: if the wiki's down in the sense of blocked database - not
in the sense of server crashed and the page won't load - let larousse
automatically send e-mail to an address at bomis.com that's hidden from
the public.
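Very roughly, something like this in the code path that notices the lock
(the address and the check below are just placeholders, not real ones):

  # when the wiki notices the database is locked, mail a hidden address
  if ( $wgDatabaseLocked ) {                    # however the lock is actually detected
      mail( "wikidown-internal(a)bomis.com",     # placeholder for the hidden address
            "Wikipedia database locked",
            "The database was found locked at " . date( "r" ) . "\n" );
  }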
-Geoffrey
=====
-Geoffrey Thomas
geoffreyerffoeg(a)yahoo.com
Erik wrote:
>I still don't see the problem. If you dual license
>Wikitravel, your authors will have no direct
>disadvantage whatsoever. It's just another
>sentence on the article submission form. Both
>licenses grant derivative rights.
I see a problem: none of the current Wikimedia projects wants to be a travel
guide, so where, oh where, will Wikitravel's text be used? The last thing I want
is travel-guide-style text such as "Good places to eat in Mexico City" in our
Wikipedia article on Mexico City. That type of material is not appropriate for
an encyclopedia. Just provide an external link to the Wikitravel article at
[[Mexico City]].
We plan on coercing both the CC and GNU people to make their copyleft content
licenses copy/paste-compatible anyway, so I don't see any real benefit in
Wikitravel being dual-licensed now; all that would accomplish is confusing
their contributors (part of the whole point of the CC/Att-SA license is to be
/less/ confusing than other copyleft content licenses - such as the FDL).
-- Daniel Mayer (aka mav)
On Wednesday 06 August 2003 01:43, Brion Vibber wrote:
> The wikidown address has been getting spam in the last few days... I
> don't know if Lee still has it forwarding to his beeper, but that could
> be awful annoying. :P
>
> Suggestions?
Auto-reject anything that doesn't have a certain string in it?
Naturally this can't be something simply tossed alongside the address. I'd
suggest something like "Add pi to itself. For the purposes of this exercise,
pi is three-point-one-four. Include the result in the subject." It takes two
seconds, it's not worth the effort for spammers to try to parse automatically
(yet), and it's not terribly likely to show up as an actual part of the spam.
Hey, this is Marumari - how does one go about banning a user account
(rather than an IP address)? Can we even do this? I'm getting requests
to deal with Michael again...
--
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
>>>>> "JW" == Jimmy Wales <jwales(a)bomis.com> writes:
JW> I notice that you're currently using the Creative Commons
JW> by-sa (attribution share-alike) license. That's a good
JW> license, nearly identical in spirit to the GNU FDL but not, in
JW> my opinion, strictly compatible.
Yes, I'm pretty sure they're incompatible, actually.
JW> Since you're only a few days in, you could still switch to GNU
JW> FDL easily enough -- this might be best for you, if you
JW> envision a lot of content sharing between your site and
JW> wikipedia (which seems natural).
I realize that the license incompatibility with Wikipedia articles is
a serious problem.
http://www.wikitravel.org/article/Wikitravel:Why_Wikitravel_isn't_GFDL
However, I think that the GFDL isn't really compatible with our goals
for Wikitravel.
http://www.wikitravel.org/article/Wikitravel:Goals_and_non-goals
Specifically, we want to make it really easy for a tourist information
office, a hotel or guesthouse, or a party planner to have a stack of
copies of a Wikitravel article for use by customers/visitors.
The heavyweight nature of the GFDL would make it hard for them to
comply: just having to redistribute the entire license text and a
changelog makes it unreasonable for 1-2 page articles, and having to
distribute source code is just out of the question.
The Attribution-ShareAlike 1.0 license just requires a copyright notice and
the URL of the license -- small enough to fit in a little paragraph at
the end of an article.
It really bothers me a lot that the license incompatibilities make it
hard to borrow content from Wikipedia. I've thought about doing a dual
license, so that at least Wikitravel content could be shared into
Wikipedia, if not vice versa, but it seems like a lot of trouble (and
explaining).
The only consolation I have is that a lot of encyclopedic content
isn't really applicable for a travel guide. But there is a lot of
crossover. The trade-off is painful but apparently necessary.
JW> I'm enthusiastic about your project, and I think the idea is
JW> excellent.
Thanks for the support. We're excited about the project, too!
~ESP
--
Evan Prodromou
evan(a)prodromou.san-francisco.ca.us
There is now the information that you can find at xx.wikipedia.org/stats.
That is good, but I would like more details.
Is there a different tool for viewing the logs, instead of Webalizer, that
gives more information?
If the information is too sensitive to be public, make it sysop-only.
If Webalizer is the best that can be provided, can the settings be changed to
make the lists longer?
I would like:
Top 30 of 15178 Total URLs -> to 100
Top 10 of 4943 Total Entry Pages -> to 50
Top 30 of 10407 Total Referrers -> to 150
Top 20 of 1882 Total Search Strings -> to 50
If this is only done for the Dutch Wikipedia, that is fine too.
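If I'm reading the webalizer.conf directives right, that would just mean
something like this in the configuration (directive names as I understand
them - please correct me if they're off):

  TopURLs      100
  TopEntry     50
  TopReferrers 150
  TopSearch    50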
Walter
I've set up a static copy of the search index table to run on Larousse,
the web server, separately from the main database server, and modified
the wiki to run searches through it.
This is rather experimental, and I don't know if it'll remain smooth. If
it doesn't play nice, it'll get disabled again. For now this is a static
copy of the search index, so it won't automatically update with new
pages, changed pages etc. But, it's better than nothing and should be
more up to date than Google for the time being.
(I'd prefer to be running these sorts of things on a third machine,
capable of being a full live backup database server, but we don't yet
have one. Larousse does have plenty of free memory at the moment, so
stealing some to run a limited-purpose mysqld shouldn't hurt it too much
in the short term.)
Technical notes: the duplicate searchindex table has been modified to
include the cur_is_redirect and cur_namespace columns out of cur, since
these are used to narrow down the search. The modified search engine
grabs a series of matching page ID numbers out of the alternate
database, then grabs the current title and contents from the real
cur table using the index numbers for just the matches that it needs to
display, which should be quite fast. I've also dropped the default
number of results per page from 20 to 10.
I haven't checked in these experimental changes to CVS; a diff to
SearchEngine.php is attached. There's also a slight change to
DatabaseFunctions.php to support more options to wfGetDB(), but it
doesn't diff cleanly against the current version and I'm too tired to
sort it out right now.
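For reference, the new globals that the patch expects would go into
LocalSettings.php along these lines (the values here are only examples, not
what's actually configured on larousse):

  $wgDBsearchServer   = "localhost";    # host running the spare mysqld
  $wgDBsearchUser     = "wikisearch";   # example account for the search copy
  $wgDBsearchPassword = "xxxxx";
  $wgDBsearchName     = "searchcopy";   # database holding the static searchindex copy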
-- brion vibber (brion @ pobox.com)
Index: SearchEngine.php
===================================================================
RCS file: /cvsroot/wikipedia/phase3/includes/SearchEngine.php,v
retrieving revision 1.12
diff -u -r1.12 SearchEngine.php
--- SearchEngine.php 13 Jul 2003 23:19:56 -0000 1.12
+++ SearchEngine.php 6 Aug 2003 11:28:29 -0000
@@ -170,27 +170,64 @@
$searchnamespaces = $this->queryNamespaces();
$redircond = $this->searchRedirects();
- $sql = "SELECT cur_id,cur_namespace,cur_title," .
- "cur_text FROM cur,searchindex " .
- "WHERE cur_id=si_page AND {$this->mTitlecond} " .
+ global $wgDBsearchServer, $wgDBsearchUser, $wgDBsearchPassword, $wgDBsearchName;
+ if( $wgDBsearchServer ) {
+ wfGetDB( $wgDBsearchUser, $wgDBsearchPassword, $wgDBsearchServer, $wgDBsearchName );
+ #$meep = mysql_connect( $wgDBsearchServer, $wgDBsearchUser, $wgDBsearchPassword );
+ #mysql_select_db( $wgDBsearchName, $meep );
+ }
+ $sql = "SELECT si_page " .
+ "FROM searchindex " .
+ "WHERE {$this->mTitlecond} " .
"{$searchnamespaces} {$redircond}" .
"LIMIT {$offset}, {$limit}";
- $res1 = wfQuery( $sql, $fname );
+ $res1 = wfQuery( $sql, $fname ); #FIXME
$num = wfNumRows($res1);
-
+
if ( $wgDisableTextSearch ) {
$res2 = 0;
+ $num2 = 0;
} else {
- $sql = "SELECT cur_id,cur_namespace,cur_title," .
- "cur_text FROM cur,searchindex " .
- "WHERE cur_id=si_page AND {$this->mTextcond} " .
+ $sql = "SELECT si_page " .
+ "FROM searchindex " .
+ "WHERE {$this->mTextcond} " .
"{$searchnamespaces} {$redircond} " .
"LIMIT {$offset}, {$limit}";
$res2 = wfQuery( $sql, $fname );
- $num = $num + wfNumRows($res2);
+ #$num = $num + wfNumRows($res2);
+ $num2 = wfNumRows( $res2 );
+ }
+ if( $wgDBsearchServer ) {
+ global $wgDBconnection;
+ $wgDBconnection = NULL;
+ wfGetDB();
}
- if ( $num == $limit ) {
+ if($num) {
+ $ids = array();
+ while( $s = wfFetchObject( $res1 ) ) {
+ array_push( $ids, $s->si_page );
+ }
+ $sql = "SELECT cur_id,cur_namespace,cur_title," .
+ "cur_text FROM cur " .
+ "WHERE cur_id IN (" . implode( ",", $ids ) . ")";
+ $res1 = wfQuery( $sql, $fname );
+ $num = wfNumRows( $res1 );
+ }
+ if($num2) {
+ $ids = array();
+ while( $s = wfFetchObject( $res2 ) ) {
+ array_push( $ids, $s->si_page );
+ }
+ $sql = "SELECT cur_id,cur_namespace,cur_title," .
+ "cur_text FROM cur " .
+ "WHERE cur_id IN (" . implode( ",", $ids ) . ")";
+ $res2 = wfQuery( $sql, $fname );
+ $num2 = wfNumRows( $res2 );
+ }
+ $num += $num2;
+
+ if ( $num == $limit ) {
$top = wfShowingResults( $offset, $limit);
} else {
$top = wfShowingResultsNum( $offset, $limit, $num );
@@ -415,7 +452,9 @@
$wgOut->redirect( wfLocalUrl( $wgTitle->getPrefixedURL() ) );
return;
}
+ $wgTitle = Title::newFromText( $search );
+ /*
# Try a near match
#
$this->parseQuery();
@@ -424,7 +463,8 @@
if ( "" != $this->mTitlecond ) {
$res = wfQuery( $sql, $fname );
- }
+ }
+ */
if ( isset( $res ) && 0 != wfNumRows( $res ) ) {
$s = wfFetchObject( $res );
JeLuF wrote:
>This all seems very unlikely to me. I think Mav and
>perhaps some other "power user" are using the alexa
>toolbar and their many hits adulterate the statistics.
Is that an allegation that I'm rigging the system?
Instead of throwing around baseless accusations, why don't you look up the
criteria for hit counting: one IP is counted only once a day per page (see
http://pages.alexa.com/exec/faqsidos/help/index.html?index=1 ). Oh, and I use
a non-Windows operating system, so the Alexa Toolbar doesn't work on my
computer - the same goes for a disproportionately large number of daily
contributors. So if anything, Wikipedia's stats are undercounted because of
its "power users".
Good day.
-- Daniel Mayer (aka mav)
Hi, folks!
I just stumbled over the "FIXME" in SpecialUserlogin.php. The message
that Phase3 sends to users has one hardcoded From field and one
hardcoded Reply-To field, which are different. Does it make sense to use
different mail addresses in this context? I propose we just use one
address that is defined in a global variable. If there are really weird
setups that require a Reply-To field different from the From field, the
maintainer will know how to work around this (namely $sender =
"forget(a)wiki.org\r\nReply-To: admin(a)wiki.org").
Bye!
Matthias
Pablo Saratxaga wrote:
> Sorting order is language specific, and some languages handle some
> accented letters as different letters of their own, not as simple
> variations of the base letter.
Sorting is one thing, searching is another. Both are determined by
the language "collation" order. For example: In Swedish, A and Ä are
two different letters, but Á is just an accented version of A.
I just finished reading chapter 8 of the MySQL manual, covering new
features in MySQL version 4.1, http://www.mysql.com/doc/en/Charset.html
The contents sound promising for applications like Wikipedia. As far
as I know, Wikipedia still only uses MySQL 4.0.12, but when 4.1 becomes
more stable it would be possible to use these new features. Among
them is the possibility of specifying the collation order for a fulltext
search (SELECT ... MATCH()) or a sort (ORDER BY) operation. Each database,
table, or column can also have a default collation. This means the
Hungarian Wikipedia could have the Hungarian collation as its default,
but a user could specify as her personal preference to use the English
collation (accent-ignorant) for her searches. I'm looking forward to
this.
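For example (the collation names are only my guess at what 4.1 will ship
with, and the table is made up for illustration):

  -- give a column a Hungarian default collation when creating the table
  CREATE TABLE title_test (
      title VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_hungarian_ci
  );
  -- override it per query for a user who prefers accent-ignorant sorting
  SELECT title FROM title_test ORDER BY title COLLATE utf8_general_ci;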
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se/