mm wrote:
Is there a place where I can read about
the relation between cur_title and the last part of the URL?
I am writing a script and can't find some rows in cur using a query like:
SELECT cur_id, cur_title from cur WHERE cur_title BETWEEN 'title1' AND
'title1'
for some title1 from URL like:
Ok, several things:
1) URLs may be URL-encoded. Titles stored in the database are not.
2) A title contains a backslash only if the page name really has one.
Apostrophes are stored as plain apostrophes, not as \'. (But remember
to escape values when you build SQL queries: SQL is *not* text, it's a
protocol.)
3) page_namespace and page_title are always used in a pair, never
page_title alone. (It's cur_namespace and cur_title, and old_namespace
and old_title, in 1.4 and earlier.) page_namespace is an enumerated
field; see Defines.php for the list of key numbers, and the various
Language*.php files for the localized names of each.
4) If you're looking at an old SQL dump for the English Wikipedia or a
handful of other languages, the data is encoded in ISO 8859-1 (or in
places the nonstandard superset Windows-1252). In current databases all
data is encoded in UTF-8. So if using an old database as a data source
you may need to convert.
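
Points 1 to 4 can be put together in a rough Python sketch. It is only an illustration under several assumptions: the NAMESPACES table is a tiny hand-copied subset using the English names (the real list is in Defines.php and the Language*.php files), titles are assumed to be stored with underscores instead of spaces, sqlite3 stands in for whatever database driver you actually use, and the helper names (url_part_to_key, find_page, to_utf8) are made up for this example.

```python
import sqlite3
from urllib.parse import unquote

# Assumption: a small subset of namespace numbers with their English names.
# The main namespace is 0 and has no prefix.
NAMESPACES = {"Talk": 1, "User": 2, "User_talk": 3}


def url_part_to_key(url_part):
    """Turn the last part of a URL into a (namespace, title) pair."""
    # Point 1: the URL may be URL-encoded, the stored title is not.
    text = unquote(url_part).replace(" ", "_")
    # Point 3: split off a known namespace prefix, if there is one.
    prefix, sep, rest = text.partition(":")
    if sep and prefix in NAMESPACES:
        return NAMESPACES[prefix], rest
    return 0, text


def find_page(conn, url_part):
    """Look up one row in cur by namespace and title together."""
    namespace, title = url_part_to_key(url_part)
    # Point 2: pass values as parameters so the driver does the escaping.
    # sqlite3 uses "?" placeholders; other drivers may use a different style.
    return conn.execute(
        "SELECT cur_id, cur_title FROM cur"
        " WHERE cur_namespace = ? AND cur_title = ?",
        (namespace, title),
    ).fetchone()


def to_utf8(raw):
    """Point 4: convert bytes from an old dump to UTF-8."""
    try:
        # Windows-1252 covers the nonstandard superset mentioned above.
        return raw.decode("cp1252").encode("utf-8")
    except UnicodeDecodeError:
        # A few byte values are undefined in Windows-1252; fall back to 8859-1.
        return raw.decode("latin-1").encode("utf-8")
```

With this, find_page(conn, "Rock_%27n%27_Roll") searches namespace 0 for the title Rock_'n'_Roll, with the apostrophes passed to the database as plain data rather than pasted into the SQL string.
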
See Title::getLocalUrl() etc. if you want a closer look at how the URLs
are generated.
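
For the forward direction, here is a very rough Python approximation of going from a title to a URL. It is not a port of Title::getLocalUrl(); the "/wiki/" article path, the choice of characters left unescaped, and the function name local_url are all assumptions made for illustration.

```python
from urllib.parse import quote, unquote


def local_url(title, namespace_name="", article_path="/wiki/"):
    """Build an approximate local URL for a page title."""
    # Namespace and title are always used as a pair; main namespace has no prefix.
    full = f"{namespace_name}:{title}" if namespace_name else title
    # Assumption: spaces become underscores, and ":" and "/" stay unescaped.
    return article_path + quote(full.replace(" ", "_"), safe="/:")
```

Decoding the last part of such a URL with unquote() gives back the underscored title, which is the form to compare against the title column.
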
-- brion vibber (brion @ pobox.com)