Dear all,
is there an API for extracting the previous section title from a wiki page?
My situation is the following. I have a wiki page that looks like this:
<page>
intro
==section1==
text
<math id=1>...</math>
text
<math id=2>...</math>
===section 2===
text
<math id=3>...</math>
</page>
And I want to know the previous section title for each math object on that page:
1->section1
2->section1
3->section 2
It's certainly doable to write a program that extracts that
information from the wiki page, but I guess rare special cases
would cause a lot of long-tail trouble.
So is there an API that could be used for that? Either Parsoid or the old
regular parser works for me.
Best
Physikerwelt
Moritz,
you can certainly do this in HTML, either using the PHP parser output or
Parsoid. Parsoid output makes it easier to identify math extension output.
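As a rough illustration of the HTML approach, here is a minimal Python sketch that walks the rendered HTML in document order, remembers the most recent heading, and records it for every math element. It assumes that math output carries typeof="mw:Extension/math" (the usual Parsoid marker for extension output, but please check it against real output). The sample HTML is a hand-written stand-in for the page above, not actual parser output, and the class and function names are made up for this example.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class MathSectionMapper(HTMLParser):
    """Record the most recent heading text for every math element."""

    def __init__(self):
        super().__init__()
        self.current_title = None  # None means we are still in the intro
        self._in_heading = False
        self._buffer = []
        self.results = []          # list of (math_index, section_title)

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._in_heading = True
            self._buffer = []
            return
        # Assumption: math extension output is marked with this typeof value.
        typeof = dict(attrs).get("typeof") or ""
        if "mw:Extension/math" in typeof.split():
            self.results.append((len(self.results) + 1, self.current_title))

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._in_heading:
            self.current_title = "".join(self._buffer).strip()
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)


def map_math_to_sections(html):
    mapper = MathSectionMapper()
    mapper.feed(html)
    mapper.close()
    return mapper.results


# Hand-written stand-in for the rendered page (not real parser output).
SAMPLE_HTML = """
<p>intro</p>
<h2>section1</h2>
<p>text <span typeof="mw:Extension/math">...</span> text
<span typeof="mw:Extension/math">...</span></p>
<h3>section 2</h3>
<p>text <span typeof="mw:Extension/math">...</span></p>
"""

for index, title in map_math_to_sections(SAMPLE_HTML):
    print(f"{index}->{title}")
```

Math elements are numbered here in document order; a math element that appears before the first heading gets None as its title.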
If you need the wikitext for the heading, then Parsoid can also give you
the source offsets of the heading in data-parsoid (see the dsr property in
there, it encodes startOffset, endOffset, startTagWidth, endTagWidth).
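To make the dsr layout concrete, here is a small Python sketch that slices the heading source out of the wikitext using those four values. The data-parsoid value below is written by hand for the example page, and the sketch assumes the offsets can be used directly as Python string indices, which is only safe for ASCII-only text; the offset unit on real output should be checked. The function name heading_source is made up for this example.

```python
import json


def heading_source(wikitext, data_parsoid_json):
    """Return (full heading source, bare title) for one heading element."""
    # dsr = [startOffset, endOffset, startTagWidth, endTagWidth]
    start, end, open_width, close_width = json.loads(data_parsoid_json)["dsr"]
    full = wikitext[start:end]
    # Strip the opening and closing "==" markers using the tag widths.
    title = wikitext[start + open_width:end - close_width].strip()
    return full, title


# Hand-written example values (assumed, not taken from a real response).
WIKITEXT = "intro\n==section1==\ntext\n"
DATA_PARSOID = '{"dsr": [6, 18, 2, 2]}'

print(heading_source(WIKITEXT, DATA_PARSOID))
```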
Gabriel