On Sat, Feb 23, 2008 at 8:32 PM, Ragib Hasan <ragibhasan(a)gmail.com> wrote:
Hi,
I need to extract only the text from a Wikipedia page. I.e., I
need to remove all wiki markup, section headings, etc., to extract only
the text a reader will read.
Get the rendered HTML, and remove all the HTML markup.
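
As a rough illustration of the second step, here is a minimal sketch in Python that strips tags from an HTML string using only the standard library. The names TextExtractor and html_to_text are made up for this example, and fetching the rendered page is left out; the sketch assumes you already have the HTML as a string. It also skips the contents of <script> and <style> elements, which is my own assumption about what "text a reader will read" should exclude.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags (hypothetical helper)."""

    # Elements whose contents are never shown to a reader (an assumption).
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML string, whitespace-normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the pieces, then collapse runs of whitespace to single spaces.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = "<h2>History</h2><p>Dhaka is the <b>capital</b> of Bangladesh.</p>"
    print(html_to_text(sample))
```

Note that this keeps the text of section headings, since they are ordinary text in the HTML. If you want them gone as well, one option would be to add the heading tags (h1 through h6) to SKIP.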