Re: [Wikitech-l] Full text search

29 Mar 2005

...
  Actually, I think the more generic Lucene library which Nutch is built
  upon will be more useful. We should be indexing the wikitext, not the
  HTML (which is a lower quality version ;))
This is the only open issue if you plan to use Lucene: you need a
good parser for the wikitext syntax, and that is very difficult.
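
To illustrate why the parser is the hard part, here is a minimal sketch of a preprocessing pass that turns wikitext into plain text before it is handed to an indexer. The function name strip_wikitext and the handful of rules are assumptions made for illustration only; this is not MediaWiki's parser. It handles only simple, non-nested templates, internal links, bold/italic quotes, and headings, and real wikitext (nested templates, tables, references, parser functions) needs far more than a few regular expressions.

```python
import re


def strip_wikitext(text: str) -> str:
    """Very rough wikitext-to-plain-text pass for indexing (illustrative only)."""
    # Drop simple, non-nested templates such as {{stub}}.
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # [[target|label]] -> label, [[target]] -> target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # Remove bold/italic quote runs ('' and ''').
    text = re.sub(r"'{2,}", "", text)
    # == Heading == -> Heading
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text, flags=re.MULTILINE)
    # Collapse runs of spaces and tabs left behind by removed markup.
    return re.sub(r"[ \t]+", " ", text).strip()


if __name__ == "__main__":
    sample = "== History ==\n'''Lucene''' is a [[search engine|search]] library. {{stub}}"
    print(strip_wikitext(sample))
```

The resulting plain text is what would be passed to the indexer; anything this sketch fails to strip ends up in the index as noise, which is exactly the quality problem a proper parser has to solve.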

...

 Seriously, we also don't want a crawler. What is left in Nutch's 
 favour?
 Nothing! Use Lucene - trust me. :-)
It will definitely save Wikipedia a great deal of load!
