Re: Ahhh.. URL wants to get encoded. Does Java wanna?
Wayne wrote:
Roedy Green wrote:
On Tue, 06 Nov 2007 05:04:05 -0000, François
<francois.x.hetu@gmail.com> wrote, quoted or indirectly quoted someone
who said :
. Just want to encode a string into a
readable URL (RFC2396:
see http://mindprod.com/jgloss/urlencoded.html
Roedy,
I just tried using URI, it doesn't seem to escape/encode
an ampersand in any part of the URI. Also, what about the
new IRIs? A Java program should be robust enough to
handle legal URLs/URIs/IRIs, converting the (up to)
nine parts of an IRI correctly. My understanding of
your (excellent) urlencoded page and the API docs means this:
URI uri = new URI("http", "//www.example.com/you & I 10%? wierd & wierder", null);
System.out.println( uri.toURL() );
should produce:
http://www.example.com/you%20&%20I%2010%25?%20wierd%20%26%20wierder
But it produces:
http://www.example.com/you%20&%20I%2010%25?%20wierd%20&%20wierder
(The ampersand is not encoded.) What did I do wrong?
-Wayne
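For what it's worth, the output above looks consistent with the documented behaviour of the multi-argument URI constructors: they quote only characters that are illegal in the component, plus '%' itself. An ampersand is a legal (reserved) character, so it is passed through untouched. Here is a minimal sketch that reproduces this; the class name UriQuotingDemo and the helper quote() are made up for illustration and are not part of any API:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQuotingDemo {
    // Hypothetical helper: builds a URI from an unencoded scheme-specific
    // part using the three-argument constructor (scheme, ssp, fragment).
    static String quote(String scheme, String schemeSpecificPart) {
        try {
            return new URI(scheme, schemeSpecificPart, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces and '%' are quoted; '&' and '?' are legal and left alone.
        System.out.println(
            quote("http", "//www.example.com/you & I 10%? wierd & wierder"));
        // prints http://www.example.com/you%20&%20I%2010%25?%20wierd%20&%20wierder
    }
}
```

So the constructor is not misbehaving here; if a literal '&' inside a query value has to become %26, that has to happen before or outside of the URI class.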
I guess the answer is to encode the query part separately, if needed.
The following code seems to work:
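// Needs: import java.io.UnsupportedEncodingException; import java.net.*;
public String encodeURL ( String initialURL, boolean parseQuery )
    throws MalformedURLException, URISyntaxException, UnsupportedEncodingException
{
    // Parse the URL (without encoding):
    URL url = new URL( initialURL );
    String scheme = url.getProtocol();      // E.g., "http"
    String authority = url.getAuthority();  // E.g., "user@host:port" (no leading "//")
    String path = url.getPath();            // E.g., "/foo/bar.htm"
    String query = url.getQuery();          // E.g., "foo=bar" (no leading '?'); null if absent
    String fragment = url.getRef();         // I.e., the "anchor"; null if absent

    // Assemble the encoded URL, using URI class to properly
    // encode each part. The query is left out when it is encoded
    // separately, because URI would re-quote every '%' that
    // URLEncoder produces (giving "%2526" instead of "%26"):
    boolean encodeQuery = parseQuery && query != null;
    URI uri = new URI( scheme, authority, path, encodeQuery ? null : query, fragment );
    String result = uri.toString();
    if ( !encodeQuery )
        return result;

    // Note: this encodes the whole query, including any '=' and '&'.
    String encodedQuery = "?" + URLEncoder.encode( query, "UTF-8" );

    // Insert the encoded query ahead of the fragment, if there is one:
    int hash = result.indexOf( '#' );
    return hash < 0 ? result + encodedQuery
                    : result.substring( 0, hash ) + encodedQuery + result.substring( hash );
}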
-Wayne
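One thing to watch when combining URLEncoder with the URI constructors: the multi-argument constructors always quote '%', so a query that has already been through URLEncoder gets its escapes escaped a second time. The sketch below illustrates the difference, assuming a fixed host and path chosen only for the example; the class QueryEncodingSketch and its helpers enc(), viaConstructor() and appended() are hypothetical names, not library methods:

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class QueryEncodingSketch {
    // Hypothetical helper: form-encodes a single name or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Passes an already-encoded query through the five-argument constructor.
    static String viaConstructor(String encodedQuery) {
        try {
            return new URI("http", "www.example.com", "/search",
                           encodedQuery, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Lets URI encode everything except the query, then appends the
    // already-encoded query by hand.
    static String appended(String encodedQuery) {
        try {
            String base = new URI("http", "www.example.com", "/search",
                                  null, null).toASCIIString();
            return base + "?" + encodedQuery;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Encode only the value, so '=' keeps its meaning as a separator.
        String query = "q=" + enc("you & I");        // q=you+%26+I
        System.out.println(viaConstructor(query));
        // prints http://www.example.com/search?q=you+%2526+I
        System.out.println(appended(query));
        // prints http://www.example.com/search?q=you+%26+I
    }
}
```

Encoding each name and value on its own, rather than the whole query string, also keeps the '=' and '&' separators intact, which is usually what a server expects.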