[WikiEN-l] Atomz Search?

Catherine Munro artslave at usa.net
Wed May 19 05:35:25 UTC 2004


>Fennec Foxen wrote:
>But we like our own database servers and searches, don't we? And we
>have a perfectly good search system in place (it's only turned off...
>something related to b0rk3d hardware. =/) Why pay even more for
>external searches when we can have our own?
>
>in reply to Catherine who wrote:
>Has anyone explored the Atomz Search application for use on
>Wikipedia?  (see http://www.atomz.com/applications/search/ and also
>their FAQ http://www.atomz.com/applications/search/faq.htm

We do not have a perfectly good search system -- it's been turned off for 
many, many months.  And from my understanding it's not a hardware problem; 
it's that we don't have anyone who is really expert at optimizing 
MySQL database searches.  The search as written now puts a massive load on 
the servers, slowing down overall performance unacceptably even though we 
have better hardware now.  Ideally, we could develop this internally, and 
perhaps run the searches on a mirror site of some kind to minimize 
slowdowns, but either it's not a priority among the developers or the 
expertise we need isn't in our volunteer pool yet.

Google is a wonderful thing, but it has its glitches, as documented on the 
w:External Search Engines page: we don't control when and which of our 
pages are indexed, it often puts tangentially related topics in the top ten 
ahead of the specific page you're looking for, it indexes empty edit 
(&action=edit) pages, and so on.  I admit I haven't tested the Yahoo 
external search as much; results seem comparable.

I'm not necessarily saying Atomz is THE answer, just asking if other people 
with more intimate knowledge of our needs would take the time to evaluate 
it.  If the price is outrageous, then it's not an option, but if not, it 
has many of the things we need *right now*, in a plug-and-play solution 
that will work with our structure.

Things that look appealing to me, quoted from their FAQ:

* "a professional application hosted and maintained entirely on our 
high-performance servers." - "High-capacity for searching millions of pages"
* "24x7 emergency support and other high end support services"
* "with a single Atomz Customer Login you can manage a different search 
engine for many different Web sites" (int'l wikis)
* "can be configured to access more than one domain by using our URL 
Entrypoints feature" (en, en2, etc.)
* "indexes HTML Web pages, static or dynamically generated, including pages 
built from databases."
* "can easily enable full content searching [...] or a more narrow 
topic-based search [...] in the title, meta-description or meta-keywords tags"
* "A Collection allows visitors to search specific areas of your Web site 
[..via..] a Drop down list or a group of check boxes" (namespaces)
* "URL Masks specify which pages on your site should be included and 
excluded from indexing; also respects <noindex> tags and robots.txt."
* "Incremental indexing enables companies to frequently index those 
portions of their site which are dynamically changing."
* "can utilize scripts or programs to initiate an incremental index"
* "will properly index Web pages using all of the common character sets in 
use on the Internet today" including bidi, UTF-8, Arabic, Chinese, etc.
* "you can control [ordering of search results], which pages are returned 
with 100% relevancy for a given search"
* "search statistics for searches made by visitors on your site"
* template feature to match search results page to rest of site; construct 
a results page that uses the language of your choice
* template gives full control over what is displayed on results page 
(title, context excerpt, relevance score, many more options)
* can use simple single-text-box search on every page, and provide link to 
an "Advanced Search" page with many more options, including:
** Case sensitivity controls
** Excluded words controls
** Bidirectional multiword acronym support
** Synonym, phrase and acronym file upload from CSV or other formats.
** Sound-alike matching
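
To make the "URL Masks" item above concrete, here is a minimal sketch in 
Python of include/exclude filtering, such as keeping empty edit 
(&action=edit) pages out of an index.  This is NOT Atomz's actual mask 
syntax; the pattern lists, the wildcard style, and the should_index 
function are all hypothetical names chosen for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical mask lists using shell-style wildcards (an assumption,
# not the Atomz format).  Exclusions take priority over inclusions.
EXCLUDE_MASKS = ["*action=edit*", "*/wiki/Special:*"]
INCLUDE_MASKS = ["http://en.wikipedia.org/*"]


def should_index(url, include=INCLUDE_MASKS, exclude=EXCLUDE_MASKS):
    """Return True if url matches an include mask and no exclude mask."""
    if any(fnmatchcase(url, mask) for mask in exclude):
        return False
    return any(fnmatchcase(url, mask) for mask in include)


if __name__ == "__main__":
    for url in [
        "http://en.wikipedia.org/wiki/Fennec",
        "http://en.wikipedia.org/w/wiki.phtml?title=Fennec&action=edit",
    ]:
        print(url, should_index(url))
```

Checking exclusions first means a page is dropped whenever any exclude 
mask matches, which is the behaviour we would want for edit pages.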

All of this would be very complicated for us to code by hand to the same 
standard.  Can we at least find out what the cost would be?

Just my two cents,
Catherine