Just a couple of comments on this.
Rob Lanphier wrote:
Ok, that's good. What about the ramifications for relative URL handling
in RFC 2396?
http://www.ietf.org/rfc/rfc2396
I haven't found any immediate problems, but it would take me a while
reading through the BNF to figure out if there are places where it
breaks.
Not that it's a huge deal if relative URLs don't work, since MW can
always just stick to absolute references, but it's one area where things
can go wrong.
Just a comment in passing: RFC 3986 obsoletes RFC 2396
http://www.ietf.org/rfc/rfc3986.txt
The only specific prohibition of a double slash in the path applies at
the start of the path, when the URI has no authority component.
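
To illustrate that rule, here is a minimal Python sketch using the
standard library's urllib.parse.urlsplit. The reference strings are only
examples built from the URLs in this thread. The point is that a
reference starting with "//" and carrying no authority of its own gets
read as if its first segment were the authority, whereas extra slashes
after a real authority simply stay in the path.

```python
from urllib.parse import urlsplit

# A reference that starts with "//" is parsed as a network-path
# reference: the first segment becomes the authority, not the path.
ambiguous = urlsplit("//foundation/faq.html")

# With an explicit authority, repeated slashes stay in the path untouched.
explicit = urlsplit("http://apache.org///foundation////faq.html")

print(ambiguous.netloc, ambiguous.path)
print(explicit.netloc, explicit.path)
```

So the parser itself does not discard the extra slashes in the second
case; whatever collapsing happens must come from a later layer.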
but it does fight typical conventions, which is kind of a bad thing.
For example, it appears that Apache throws away extra slashes,
as can be seen here:
http://apache.org///foundation////faq.html
http://apache.org/foundation/faq.html
IIS seems to do the same thing:
http://www.microsoft.com////windowsserversystem///default.mspx
I assure you that Apache does not throw away extra slashes. I have
already done the necessary programming to do URLs such as those I have
mentioned. The examples you mentioned don't say anything about the
webservers themselves because both URLs are obviously mappings to a
filesystem (whether virtual or not); it is that filesystem that throws
away the extra slashes (which you can easily test: both Linux and
Windows accept a doubled / or a doubled \, respectively, in a path
without complaint).
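
The filesystem claim is easy to check. The following is a rough Python
sketch, not anything taken from Apache or IIS: it creates a throwaway
file in a temporary directory (the names "foundation" and "faq.html" are
just borrowed from the example URL) and then reaches it again through a
path whose separators are doubled.

```python
import os
import posixpath
import tempfile

def doubled_separator_hits_same_file():
    """Create a file, then reach it through a path with doubled separators."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "foundation"))
        single = os.path.join(root, "foundation", "faq.html")
        with open(single, "w") as handle:
            handle.write("faq")
        # Same path, but with every separator after the root doubled.
        doubled = root + os.sep * 2 + "foundation" + os.sep * 2 + "faq.html"
        return os.path.samefile(single, doubled)

# Pure string view of the same collapsing, using POSIX path rules.
collapsed = posixpath.normpath("/foundation////faq.html")

print(doubled_separator_hits_same_file())
print(collapsed)
```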
Compare:
http://www.livejournal.com/manage/index.bml
and
http://www.livejournal.com/manage//index.bml
Both are served from the same file because the path is a mapping to a
filesystem, but the rendered pages differ because the individual strings
on them are retrieved by codes that are based on the path. Those codes contain
only single slashes, so the second page is missing those strings. This
clearly shows that it's the filesystem and not Apache that "throws away"
double-slashes.
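
A hypothetical sketch of that behaviour in Python, which is certainly not
LiveJournal's actual code: the string table, its key format and the
title text are all made up for illustration. The file lookup ignores
repeated slashes, while the string lookup uses the path exactly as
requested.

```python
import posixpath

# Hypothetical string table keyed by a code derived from the request path.
STRINGS = {"/manage/index.bml.title": "Manage Your Account"}

def resolve_file(request_path):
    # Stand-in for the filesystem mapping, which ignores repeated slashes.
    return posixpath.normpath(request_path)

def page_title(request_path):
    # The lookup code uses the raw path, so extra slashes cause a miss.
    return STRINGS.get(request_path + ".title", "[missing string]")

for path in ("/manage/index.bml", "/manage//index.bml"):
    print(resolve_file(path), "->", page_title(path))
```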
Ok, that's good.
Still, I maintain that assigning unique semantics to "//" versus "/"
when used in that part of a URL doesn't have a lot of precedent, which
also means that there are probably a lot of places it can break. I admit
that's a vague criticism, but I just have a bad gut feeling about going
down that road.
FWIW, I share your bad gut feeling. Superfluous slashes in the path may
be used as an evasion technique, and some security packages (for
example, mod_security for Apache) normalize the path, stripping out the
extra slashes. While this particular case might not pertain to our
installation at this time, it is an example of the kind of unforeseen
problems that may lie down this road.
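
To make the concern concrete, here is a small Python sketch of what such
a normalizing filter does to the path. It is only an approximation of
the idea, not mod_security's implementation, and the /wiki/ paths are
hypothetical.

```python
import re

def collapse_slashes(path):
    """Replace every run of two or more slashes with a single slash."""
    return re.sub(r"/{2,}", "/", path)

# Two paths that the application would want to treat as distinct...
plain = "/wiki/Foo/Bar"
doubled = "/wiki/Foo//Bar"

# ...become indistinguishable once a filter has normalized them.
print(collapse_slashes(plain) == collapse_slashes(doubled))
```

Any component that sits in front of the application and does this would
silently erase whatever meaning had been assigned to "//".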