So, I just installed the CRM114 Markovian spam filtering software:
http://crm114.sourceforge.net/
The whole thing is based on Bayesian filtering, which is just a way to
make very dumb software make really smart decisions. With sufficient
training, a very simple piece of software can make very accurate
distinctions between spam and non-spam email messages. See Paul
Graham's famous "A Plan for Spam" about this:
http://www.paulgraham.com/spam.html
The CRM114 stuff is Markovian, which means it's _even_dumber_ than
Bayesian stuff, and makes _even_smarter_ decisions. More or less.
Anyways, one thing that's mentioned on the crm114 page is that folks
use the same technology for different kinds of text sorting. Like, for
system administrators, they can sort log file entries into ones
they're interested in and ones they're not.
And I was thinking: you know, it'd be nice to be able to flag
acceptable and problematic articles in MediaWiki Web sites. Like, say,
an admin sees some vandalism going on, and goes to fix it. One of the
checkmarks on saving is "Vandalism fix" or some such. This would tag
the previous version as... ungood. Something.
And then after a while the software gets good at understanding what's
ungood and what's not. And there could be a tracking page to say,
"These seem to be pages in an ungood state." And it would be easier to
find those and fix 'em.
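
To make the idea concrete, here is a minimal sketch of the training loop described above. It uses plain naive Bayes with add-one smoothing, not CRM114's Markovian scheme, and nothing in it is MediaWiki code: the class name, the "good"/"ungood" labels, and the sample texts are all made up for illustration.

```python
import math
import re
from collections import Counter


class RevisionClassifier:
    """Tiny naive Bayes text classifier with two labels (hypothetical names)."""

    LABELS = ("good", "ungood")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # Called when an admin marks a revision, e.g. via a "Vandalism fix" box.
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        # log P(label) + sum of log P(word | label), with add-one smoothing.
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        logp = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.LABELS)))
        for word in self.tokenize(text):
            count = self.word_counts[label][word]
            logp += math.log((count + 1) / (total_words + len(vocab) + 1))
        return logp

    def classify(self, text):
        # Pick whichever label gives the text the higher score.
        return max(self.LABELS, key=lambda label: self.score(text, label))


clf = RevisionClassifier()
clf.train("Paris is the capital of France and a major travel destination", "good")
clf.train("The museum opens daily and tickets are sold at the entrance", "good")
clf.train("lol lol lol this page is stupid", "ungood")
clf.train("asdf asdf stupid stupid lol", "ungood")
print(clf.classify("stupid lol asdf"))
print(clf.classify("The capital of France"))
```

A tracking page could then list the current revisions that classify as "ungood". With only a handful of training examples the results would be unreliable; the point of the approach is that it improves as admins keep tagging.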
~ESP
--
Evan Prodromou <evan(a)wikitravel.org>
Wikitravel - http://www.wikitravel.org/
The free, complete, up-to-date and reliable world-wide travel guide
Changes from 1.2.0rc3:
* Fixed new bug that broke 'remember my password' logins
* Fixed text alignment for thumbnail images, enhanced recentchanges in
RTL languages
* $wgSitename auto-detection now includes port number if not running on
port 80
* Tightened up indexes on links tables to try to prevent corrupted
entries
https://sourceforge.net/project/showfiles.php?group_id=34373
Anyone using rc3 is encouraged to upgrade. (To fix the login bug alone,
you can copy in User.php and Setup.php from rc4 to your rc3
installation.)
If you're using MySQL 4.0, you should also update the linkscc table to
fix the index there. update.php will do this, or remove
LocalSettings.php and re-run the in-place installer. (Or manually run
maintenance/archives/patch-linkscc.sql.)
-- brion vibber (brion @ pobox.com)
Because Brion asked for feedback, I've taken some time to test the new
in-place installer. It works great, and thanks for the free link to
OpenFacts! ;-)
Two things I noticed:
- In Mozilla, the dropdown box with the languages seems to really screw
things up. When I scroll through it, the fonts suddenly change, and
afterwards the form elements become unusable. Could be my setup, but I'd
still recommend using just the Latin transliterations here.
- There seems to be no option to create an SQL database account; instead,
the standard account seems to be used for that purpose. That's probably
not the most secure setup. Is there a particular reason for this?
On a more general note, as we move developer status away from being an
actual developer, we should probably rename the rights flag. I suggest
something like "siteadmin", which is intuitive and instantly recognizable
from other applications. The first user the script would create would
always be a site-admin, and that user would be able to assign arbitrary
rights to other users.
I think we should deprecate the command line scripts ASAP, but preferably
after my Linux Magazine article which explicitly refers to them comes out
;-) (early April, I think - the German version is already out).
Regards,
Erik
I have ~200 files that I would like to upload to the en:Wikipedia. They
are recordings of all the sounds in the International Phonetic Alphabet.
I don't want to upload them by hand, so I am hacking together a little
libwww-perl-based script to upload them automatically under my username.
Before I do this, I want to run this by wikitech-l:
Should I register a separate account for use with my bot, like NohatBot?
I've set the useragent of my script to NohatBot/0.9b. Will it be blocked?
Finally, should I make the script available in case others want to do
bulk uploads? If so, where?
Anything else?
- David [[User:Nohat]]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I was reading "developer policy" and noticed that cvs access seemed to
be different from developer access. Is there a different policy for each?
I assume that developer access means direct ssh access to the server.
-----BEGIN PGP SIGNATURE-----
Note: This signature can be verified at https://www.hushtools.com/verify
Version: Hush 2.3
wkYEARECAAYFAkBS5g8ACgkQImu8Pbyf50EgngCaAjESvFeCw+LxAtsex9Of1FHqfewA
n0XqLb5D3zvU3tRjvTtW1DgppViZ
=eEqz
-----END PGP SIGNATURE-----
Concerned about your privacy? Follow this link to get
FREE encrypted email: https://www.hushmail.com/?l=2
Free, ultra-private instant messaging with Hush Messenger
https://www.hushmail.com/services.php?subloc=messenger&l=434
Promote security and make money with the Hushmail Affiliate Program:
https://www.hushmail.com/about.php?subloc=affiliate&l=427
I've just added a manual purge action that can be invoked like this:
http://en.wikipedia.org/w/wiki.phtml?title=Main_Page&action=purge
Works for any title, in all wikis. Hope this makes it easier to update
pages that pull most of their content from other pages, such as the front
page. A small, hidden link on the main page or similar could make this
more accessible if needed.
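
For anyone who wants to script this, a small sketch of building the purge URL for an arbitrary title follows. The function name and the default base URL are only illustrative assumptions; the `title` and `action=purge` parameters are the ones shown in the URL above.

```python
from urllib.parse import urlencode


def purge_url(title, base="http://en.wikipedia.org/w/wiki.phtml"):
    """Build the URL that requests a manual purge of the given page title."""
    # urlencode takes care of escaping titles that contain special characters.
    return base + "?" + urlencode({"title": title, "action": "purge"})


print(purge_url("Main_Page"))
# http://en.wikipedia.org/w/wiki.phtml?title=Main_Page&action=purge
```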
--
Gabriel Wicke
http://en.wikipedia.org/stats/
How accurate do we think these numbers are?
Obviously, Dec and Jan are low, way low, and I believe I heard that
some logfiles were/are missing?
February shows a much higher average than March, 3 million pageviews
per day as opposed to 2 million. Does that seem plausible? Both
numbers are more than double our previous "best" month, which was
November of last year. (But of course Dec and Jan seem obviously
broken.)
I'm just looking for a nice number to tell reporters who ask me.
Can I safely say 75 million pageviews per month?
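
As a rough check on that figure, the per-day averages above can be scaled to full months. This sketch assumes the year is 2004 (so February has 29 days), treats the 3 million and 2 million averages as exact, and extrapolates March to a full month; those are assumptions, not numbers taken from the stats page.

```python
import calendar

# Days per month, assuming the year in question is 2004 (a leap year).
feb_days = calendar.monthrange(2004, 2)[1]
mar_days = calendar.monthrange(2004, 3)[1]

# Rounded per-day averages quoted in the message.
feb_total = 3_000_000 * feb_days
mar_total = 2_000_000 * mar_days
midpoint = (feb_total + mar_total) // 2

print(f"February: {feb_total:,}")
print(f"March:    {mar_total:,}")
print(f"Midpoint: {midpoint:,}")
```

Under those assumptions February comes to 87 million and March to 62 million, so 75 million sits roughly midway between the two.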
--Jimbo
I wrote:
>> user    page  throttle.minute  throttle.hour  throttle.day  expiration
>> jwales  DNA   2                10             25            (timestamp)
>> wik     *     *                *              3
And Jimbo responded:
>>There may be some ambiguity here in the meaning of '*'. To me, the
>>Wik line says "for each article, Wik may make up to 3 edits per day",
>>as opposed to "3 edits per day on all articles combined".
It might be useful to have a way of signalling both. How about "*" to
mean "for each article," and "#" to mean "on all articles combined"?
It seems to me being able to limit contributions to "all articles
combined" offers a useful tool in throttling vandals. An easy way to
vandalize is to go through and systematically delete the text of
entire articles. I had someone do this a couple of times on the
Disinfopedia. In the space of less than half an hour, the vandal
systematically replaced the text of around 100 pages with nonsense
phrases and profanity. A restriction that says someone can only make
three changes to each article wouldn't stop that kind of abuse at
all, because the vandal's M.O. consists of only making a single
change to each article.
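
The difference between the two scopes can be sketched in a few lines. This is a hypothetical illustration, not existing MediaWiki code: the class and method names are invented, only a per-day limit is modelled, and resetting the counters at the end of each day is left out.

```python
from collections import defaultdict


class EditThrottle:
    """Hypothetical per-user daily edit limit with '*' or '#' page scope."""

    def __init__(self):
        self.rules = {}                  # user -> (scope, max edits per day)
        self.counts = defaultdict(int)   # (user, page or '#') -> edits so far

    def add_rule(self, user, scope, per_day):
        if scope not in ("*", "#"):
            raise ValueError("scope must be '*' or '#'")
        self.rules[user] = (scope, per_day)

    def allow_edit(self, user, page):
        """Record the edit and return True if it is within the limit."""
        if user not in self.rules:
            return True
        scope, per_day = self.rules[user]
        # '*' counts edits separately per article; '#' pools all articles.
        key = (user, page if scope == "*" else "#")
        if self.counts[key] >= per_day:
            return False
        self.counts[key] += 1
        return True


per_article = EditThrottle()
per_article.add_rule("vandal", "*", 3)
combined = EditThrottle()
combined.add_rule("vandal", "#", 3)

pages = [f"Page{i}" for i in range(100)]
print(sum(per_article.allow_edit("vandal", p) for p in pages))
print(sum(combined.allow_edit("vandal", p) for p in pages))
```

With the "*" rule, a vandal making one edit to each of 100 pages is never stopped; with the "#" rule, only the first three edits get through.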
I would also suggest generalizing the ability to block IP numbers, so
that for example you could easily place a block on 142.177.*.*. This
could be useful in dealing with a situation I encountered a while back
in which a vandal used an anonymizer that kept changing his IP number
after every 10 edits or so. If someone were to start that sort
of attack, you could thwart most of it with the following line:
*.*.*.* * * 3 *
This would say that each unique IP number is allowed to make no more
than three edits per hour -- a restriction that would be a minor
nuisance to most serious contributors who use anonymous IP numbers,
but it would inconvenience and slow down someone on a vandalism spree.
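
Matching an address against a wildcard pattern such as 142.177.*.* is simple to do octet by octet. The helper below is a hypothetical sketch (the function name is invented, and it handles dotted IPv4 addresses only):

```python
def ip_matches(pattern, address):
    """Return True if a dotted IPv4 address matches a pattern like 142.177.*.*"""
    pattern_parts = pattern.split(".")
    address_parts = address.split(".")
    if len(pattern_parts) != 4 or len(address_parts) != 4:
        return False
    # Each octet must either be a wildcard or match exactly.
    return all(p == "*" or p == a for p, a in zip(pattern_parts, address_parts))


print(ip_matches("142.177.*.*", "142.177.10.5"))
print(ip_matches("142.177.*.*", "142.178.10.5"))
print(ip_matches("*.*.*.*", "10.0.0.1"))
```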
--Sheldon Rampton
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I try to run apache but get this error:
root@localhost mediawiki-1.2.0rc3 # apache2
apache2: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[Fri Mar 12 17:44:21 2004] [crit] (92)Protocol not available: make_sock: for address [::]:80, apr_socket_opt_set: (IPV6_V6ONLY)
no listening sockets available, shutting down
Unable to open logs
root@localhost mediawiki-1.2.0rc3 #
Any help would be appreciated.
Thanks.
On Fri, 12 Mar 2004 14:08:07 -0800 Brion Vibber <brion(a)pobox.com> wrote:
>On Mar 12, 2004, at 13:48, <perl(a)hush.ai> wrote:
>> I have tried to install mediawiki (so I can submit some patches) using
>> the php script but it fails:
>>
>> root@localhost mediawiki-1.2.0rc3 # php install.php
>>
>> Warning: mkdir(/usr/local/apache/htdocs/wiki): No such file or directory
>> in /home/perl/mediawiki-1.2.0rc3/install.php on line 166
>> Could not create directory "/usr/local/apache/htdocs/wiki".
>>
>> This is probably a simple problem. Can someone tell me what I have done
>> wrong?
>
>Sounds like there is no directory /usr/local/apache/htdocs; check that
>this is your web server's actual DocumentRoot.
>
>(I'd highly recommend you try the new in-place installation, please see
>the file INSTALL.)
>
>-- brion vibber (brion @ pobox.com)
>
-----BEGIN PGP SIGNATURE-----
Note: This signature can be verified at https://www.hushtools.com/verify
Version: Hush 2.3
wkYEARECAAYFAkBR9y4ACgkQImu8Pbyf50Hr4gCeN3ddhTzNoG2WY7FRP5u72xkAxlcA
nRu6oBt8VCL5j54H3ATnlK+v6lTF
=ErQF
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I have tried to install mediawiki (so I can submit some patches) using
the php script but it fails:
root@localhost mediawiki-1.2.0rc3 # php install.php
Warning: mkdir(/usr/local/apache/htdocs/wiki): No such file or directory
in /home/perl/mediawiki-1.2.0rc3/install.php on line 166
Could not create directory "/usr/local/apache/htdocs/wiki".
This is probably a simple problem. Can someone tell me what I have done
wrong?
Thanks!
-----BEGIN PGP SIGNATURE-----
Note: This signature can be verified at https://www.hushtools.com/verify
Version: Hush 2.3
wkYEARECAAYFAkBR6mIACgkQImu8Pbyf50G0rQCgoyHDRdSoKk6MUVwE0YL+/YAaOxEA
oKP64VERLVo3jwDJ6Av9RnVIov7V
=Cehw
-----END PGP SIGNATURE-----