Discussion:
Valid characters in url
Jane A Taber
2009-04-26 11:23:51 UTC
Are these characters valid in a url without escaping?

-
_
!
'
(
)
~
¨

If they are valid, will they work regardless of browser?
Robert Aldwinckle
2009-04-28 23:08:47 UTC
Post by Jane A Taber
Are these characters valid in a url without escaping?
-
_
!
'
(
)
~
Apparently...

http://www.ietf.org/rfc/rfc2396.txt

(Live Search for
RFC URI escaped "special characters"
- first hit
)


And a Ctrl-F Find for reserved...

<quote>
Both RFC 1738 and RFC 1808 refer to the "reserved" set of characters
as if URI-interpreting software were limited to a single set of characters
with a reserved purpose (i.e., as meaning something other than the data
to which the characters correspond), and that this set was fixed by the
URI scheme. However, this has not been true in practice; any character
that is interpreted differently when it is escaped is, in effect, reserved.
Furthermore, the interpreting engine on a HTTP server is often dependent
on the resource, not just the URI scheme. The description of reserved
characters has been changed accordingly.
</quote>

and

<quote>
2.3. Unreserved Characters

Data characters that are allowed in a URI but do not have a reserved
purpose are called unreserved. These include upper and lower case
letters, decimal digits, and a limited set of punctuation marks and
symbols.

unreserved = alphanum | mark

mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

Unreserved characters can be escaped without changing the semantics
of the URI, but this should not be done unless the URI is being used
in a context that does not allow the unescaped character to appear.

</quote>


So, you appear to have picked most of the "mark" characters,
which are included in the unreserved set of characters.
Did you know that you were asking about these? ; )
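
For anyone who wants to check characters against that grammar, here is a minimal sketch in Python. It only encodes the `unreserved = alphanum | mark` rule quoted above from RFC 2396; the helper name `is_unreserved` is made up for illustration, and the last entry in the list assumes the final character in the original question was a diaeresis (U+00A8).

```python
import string

# RFC 2396 section 2.3: unreserved = alphanum | mark
ALPHANUM = set(string.ascii_letters + string.digits)
MARK = set("-_.!~*'()")
UNRESERVED = ALPHANUM | MARK


def is_unreserved(ch: str) -> bool:
    """Return True if ch is a single RFC 2396 unreserved character."""
    return len(ch) == 1 and ch in UNRESERVED


# Characters from the question; the last one is assumed to be U+00A8.
asked = ["-", "_", "!", "'", "(", ")", "~", "\u00a8"]
for ch in asked:
    print(repr(ch), is_unreserved(ch))
```

Under that assumption, the seven ASCII characters all come back True and the diaeresis comes back False, since it is neither alphanum nor mark and would need escaping.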
Post by Jane A Taber
If they are valid, will they work regardless of browser?
I wouldn't want to bet on it and they might not stay unescaped
everywhere:

<quote>
In some cases, data that could be represented by an unreserved
character may appear escaped; for example, some of the unreserved
"mark" characters are automatically escaped by some systems.
</quote>
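
As one concrete illustration of that (my own example, not something the RFC names), Python's standard `urllib.parse.quote` escapes several of the mark characters by default. This sketch assumes Python 3.7 or later, where `~` is left alone; the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

# The nine "mark" characters from RFC 2396 section 2.3.
marks = "-_.!~*'()"

# safe="" asks quote() to escape everything it does not always treat as safe.
escaped = quote(marks, safe="")
print(escaped)  # -_.%21~%2A%27%28%29

# Unescaping restores the original characters, so the meaning is unchanged.
print(unquote(escaped) == marks)  # True
```

So `!`, `*`, `'`, `(` and `)` get percent-encoded even though they are unreserved, which is the kind of automatic escaping the quoted passage warns about.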


HTH

Robert Aldwinckle
---
Jane A Taber
2009-05-02 22:54:07 UTC
Thanks for your help and time Robert! It really cleared up my questions.
Thanks again ;)
