java.lang.Object
  org.apache.commons.httpclient.URI
public class URI
The interface for the URI (Uniform Resource Identifier) version of RFC 2396. The purpose of this class is to support parsing a URI reference so that specific protocols can extend it, taking into account the character encoding of the protocol to be transported and the charset of the document.
A URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics.
Implementers should be careful not to escape or unescape the same string more than once, since unescaping an already unescaped string might lead to misinterpreting a percent data character as another escaped character, or vice versa in the case of escaping an already escaped string.
To avoid these problems, the following data types are used:
- URI character sequence: char
- octet sequence: byte
- original character sequence: String
So a URI is a sequence of characters held as an array of char, which is not always represented as a sequence of octets held as an array of byte.
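The double-escaping hazard described above can be illustrated with a small, self-contained sketch. It does not use this class: escape and unescape are hypothetical helpers written only for the illustration, UTF-8 is an assumed charset, and the escaped input is assumed to be plain ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class DoubleEscapeDemo {

    private static final String HEX = "0123456789ABCDEF";

    // Hypothetical helper: percent-escape every octet that is not an ASCII letter or digit.
    static String escape(String original) {
        StringBuilder out = new StringBuilder();
        for (byte b : original.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            boolean alphanum = (octet >= 'a' && octet <= 'z')
                    || (octet >= 'A' && octet <= 'Z')
                    || (octet >= '0' && octet <= '9');
            if (alphanum) {
                out.append((char) octet);
            } else {
                out.append('%').append(HEX.charAt(octet >> 4)).append(HEX.charAt(octet & 0xF));
            }
        }
        return out.toString();
    }

    // Hypothetical helper: turn each well-formed "%XX" triplet back into its octet.
    // Anything else is copied through unchanged (ASCII input assumed).
    static String unescape(String escaped) {
        ByteArrayOutputStream octets = new ByteArrayOutputStream();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%' && i + 2 < escaped.length()
                    && HEX.indexOf(Character.toUpperCase(escaped.charAt(i + 1))) >= 0
                    && HEX.indexOf(Character.toUpperCase(escaped.charAt(i + 2))) >= 0) {
                int hi = HEX.indexOf(Character.toUpperCase(escaped.charAt(i + 1)));
                int lo = HEX.indexOf(Character.toUpperCase(escaped.charAt(i + 2)));
                octets.write((hi << 4) | lo);
                i += 2;
            } else {
                octets.write(c);
            }
        }
        return new String(octets.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String once = escape("a b");          // escaped exactly once
        String twice = escape(once);          // the "%" itself gets escaped again
        System.out.println(once);
        System.out.println(twice);
        // Data that literally contains "%20" is corrupted by a second unescape.
        String literal = "a%20b";
        System.out.println(unescape(escape(literal)));            // round trip, data preserved
        System.out.println(unescape(unescape(escape(literal))));  // one unescape too many
    }
}
```

Keeping escaped data in char[] and unescaped data in String, as the class does, makes it harder to apply either step twice by accident.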
URI Syntactic Components
- In general, written as follows:
   Absolute URI = <scheme>:<scheme-specific-part>
   Generic URI = <scheme>://<authority><path>?<query>
- Syntax
   absoluteURI = scheme ":" ( hier_part | opaque_part )
   hier_part = ( net_path | abs_path ) [ "?" query ]
   net_path = "//" authority [ abs_path ]
   abs_path = "/" path_segments
The following examples illustrate URIs that are in common use.
 ftp://ftp.is.co.za/rfc/rfc1808.txt
    -- ftp scheme for File Transfer Protocol services
 gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
    -- gopher scheme for Gopher and Gopher+ Protocol services
 http://www.math.uio.no/faq/compression-faq/part1.html
    -- http scheme for Hypertext Transfer Protocol services
 mailto:mduerst@ifi.unizh.ch
    -- mailto scheme for electronic mail addresses
 news:comp.infosystems.www.servers.unix
    -- news scheme for USENET news groups and articles
 telnet://melvyl.ucop.edu/
    -- telnet scheme for interactive services via the TELNET Protocol
 
 Please note that there are many modifications from URL (RFC 1738) and
 relative URL (RFC 1808).
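As a rough illustration of the generic syntax above, the following sketch splits a URI reference into its components with the regular expression given in RFC 2396, Appendix B. It is not how this class parses its input; the class name UriSplitSketch and the method split are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UriSplitSketch {

    // Regular expression from RFC 2396, Appendix B.
    // Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    private static final Pattern URI_REFERENCE = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    /** Returns {scheme, authority, path, query, fragment}; absent components are null. */
    static String[] split(String uriReference) {
        Matcher m = URI_REFERENCE.matcher(uriReference);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a URI reference: " + uriReference);
        }
        return new String[] {m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)};
    }

    public static void main(String[] args) {
        String[] samples = {
            "ftp://ftp.is.co.za/rfc/rfc1808.txt",
            "mailto:mduerst@ifi.unizh.ch",
            "telnet://melvyl.ucop.edu/"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + Arrays.toString(split(s)));
        }
    }
}
```

The expression only separates components; it does not validate the characters inside them.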
 The expressions for a URI
For escaped URI forms
- URI(char[]) // constructor
- char[] getRawXxx() // method
- String getEscapedXxx() // method
- String toString() // method
For unescaped URI forms
- URI(String) // constructor
- String getXxx() // method
| Nested Class Summary | |
|---|---|
| static class | URI.DefaultCharsetChanged - The charset-changed normal operation, used to alert the user that the default charset has changed. |
| static class | URI.LocaleToCharsetMap - A mapping to determine the (somewhat arbitrarily) preferred charset for a given locale. |
| Field Summary | |
|---|---|
| protected char[] | _authority - The authority. |
| protected char[] | _fragment - The fragment. |
| protected char[] | _host - The host. |
| protected boolean | _is_abs_path |
| protected boolean | _is_hier_part |
| protected boolean | _is_hostname |
| protected boolean | _is_IPv4address |
| protected boolean | _is_IPv6reference |
| protected boolean | _is_net_path |
| protected boolean | _is_opaque_part |
| protected boolean | _is_reg_name |
| protected boolean | _is_rel_path |
| protected boolean | _is_server |
| protected char[] | _opaque - The opaque. |
| protected char[] | _path - The path. |
| protected int | _port - The port. |
| protected char[] | _query - The query. |
| protected char[] | _scheme - The scheme. |
| protected char[] | _uri - This Uniform Resource Identifier (URI). |
| protected char[] | _userinfo - The userinfo. |
| protected static BitSet | abs_path - URI absolute path. |
| protected static BitSet | absoluteURI - BitSet for absoluteURI. |
| static BitSet | allowed_abs_path - Those characters that are allowed for the abs_path. |
| static BitSet | allowed_authority - Those characters that are allowed for the authority component. |
| static BitSet | allowed_fragment - Those characters that are allowed for the fragment component. |
| static BitSet | allowed_host - Those characters that are allowed for the host component. |
| static BitSet | allowed_IPv6reference - Those characters that are allowed for the IPv6reference component. |
| static BitSet | allowed_opaque_part - Those characters that are allowed for the opaque_part. |
| static BitSet | allowed_query - Those characters that are allowed for the query component. |
| static BitSet | allowed_reg_name - Those characters that are allowed for the reg_name. |
| static BitSet | allowed_rel_path - Those characters that are allowed for the rel_path. |
| static BitSet | allowed_userinfo - Those characters that are allowed for the userinfo component. |
| static BitSet | allowed_within_authority - Those characters that are allowed for the authority component. |
| static BitSet | allowed_within_path - Those characters that are allowed within the path. |
| static BitSet | allowed_within_query - Those characters that are allowed within the query component. |
| static BitSet | allowed_within_userinfo - Those characters that are allowed within the userinfo component. |
| protected static BitSet | alpha - BitSet for alpha. |
| protected static BitSet | alphanum - BitSet for alphanum (join of alpha & digit). |
| protected static BitSet | authority - BitSet for authority. |
| static BitSet | control - BitSet for control. |
| protected static String | defaultDocumentCharset - The default charset of the document. |
| protected static String | defaultDocumentCharsetByLocale |
| protected static String | defaultDocumentCharsetByPlatform |
| protected static String | defaultProtocolCharset - The default charset of the protocol. |
| static BitSet | delims - BitSet for delims. |
| protected static BitSet | digit - BitSet for digit. |
| static BitSet | disallowed_opaque_part - Disallowed opaque_part before escaping. |
| static BitSet | disallowed_rel_path - Disallowed rel_path before escaping. |
| protected static BitSet | domainlabel - BitSet for domainlabel. |
| protected static BitSet | escaped - BitSet for escaped. |
| protected static BitSet | fragment - BitSet for fragment (alias for uric). |
| protected int | hash - Cache the hash code for this URI. |
| protected static BitSet | hex - BitSet for hex. |
| protected static BitSet | hier_part - BitSet for hier_part. |
| protected static BitSet | host - BitSet for host. |
| protected static BitSet | hostname - BitSet for hostname. |
| protected static BitSet | hostport - BitSet for hostport. |
| protected static BitSet | IPv4address - BitSet that combines digit and dot for IPv4address. |
| protected static BitSet | IPv6address - RFC 2373. |
| protected static BitSet | IPv6reference - RFC 2732, 2373. |
| protected static BitSet | mark - BitSet for mark. |
| protected static BitSet | net_path - BitSet for net_path. |
| protected static BitSet | opaque_part - URI bitset that combines uric_no_slash and uric. |
| protected static BitSet | param - BitSet for param (alias for pchar). |
| protected static BitSet | path - URI bitset that combines absolute path and opaque part. |
| protected static BitSet | path_segments - BitSet for path segments. |
| protected static BitSet | pchar - BitSet for pchar. |
| protected static BitSet | percent - The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI. |
| protected static BitSet | port - Port, a logical alias for digit. |
| protected String | protocolCharset - The charset of the protocol used by this URI instance. |
| protected static BitSet | query - BitSet for query (alias for uric). |
| protected static BitSet | reg_name - BitSet for reg_name. |
| protected static BitSet | rel_path - BitSet for rel_path. |
| protected static BitSet | rel_segment - BitSet for rel_segment. |
| protected static BitSet | relativeURI - BitSet for relativeURI. |
| protected static BitSet | reserved - BitSet for reserved. |
| protected static char[] | rootPath - The root path. |
| protected static BitSet | scheme - BitSet for scheme. |
| protected static BitSet | segment - BitSet for segment. |
| protected static BitSet | server - BitSet for server. |
| static BitSet | space - BitSet for space. |
| protected static BitSet | toplabel - BitSet for toplabel. |
| protected static BitSet | unreserved - Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved. |
| static BitSet | unwise - BitSet for unwise. |
| protected static BitSet | URI_reference - BitSet for URI-reference. |
| protected static BitSet | uric - BitSet for uric. |
| protected static BitSet | uric_no_slash - URI bitset for encoding typical non-slash characters. |
| protected static BitSet | userinfo - BitSet for userinfo. |
| static BitSet | within_userinfo - BitSet for within the userinfo component like user and password. |
| Constructor Summary | |
|---|---|
| protected | URI() - Create an instance for internal use. |
|   | URI(char[] escaped) - Deprecated. Use URI(String, boolean) |
|   | URI(char[] escaped, String charset) - Deprecated. Use URI(String, boolean, String) |
|   | URI(String original) - Deprecated. Use URI(String, boolean) |
|   | URI(String s, boolean escaped) - Construct a URI from a string with the given charset. |
|   | URI(String s, boolean escaped, String charset) - Construct a URI from a string with the given charset. |
|   | URI(String original, String charset) - Deprecated. Use URI(String, boolean, String) |
|   | URI(String scheme, String schemeSpecificPart, String fragment) - Construct a general URI from the given components. |
|   | URI(String scheme, String userinfo, String host, int port) - Construct a general URI from the given components. |
|   | URI(String scheme, String userinfo, String host, int port, String path) - Construct a general URI from the given components. |
|   | URI(String scheme, String userinfo, String host, int port, String path, String query) - Construct a general URI from the given components. |
|   | URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment) - Construct a general URI from the given components. |
|   | URI(String scheme, String host, String path, String fragment) - Construct a general URI from the given components. |
|   | URI(String scheme, String authority, String path, String query, String fragment) - Construct a general URI from the given components. |
|   | URI(URI base, String relative) - Deprecated. Use URI(URI, String, boolean) |
|   | URI(URI base, String relative, boolean escaped) - Construct a general URI with the given relative URI string. |
|   | URI(URI base, URI relative) - Construct a general URI with the given relative URI. |
| Method Summary | |
|---|---|
| Object | clone() - Create and return a copy of this object, the URI-reference containing the userinfo component. |
| int | compareTo(Object obj) - Compare this URI to another object. |
| protected static String | decode(char[] component, String charset) - Decodes URI encoded string. |
| protected static String | decode(String component, String charset) - Decodes URI encoded string. |
| protected static char[] | encode(String original, BitSet allowed, String charset) - Encodes URI string. |
| protected boolean | equals(char[] first, char[] second) - Test if the first array is equal to the second array. |
| boolean | equals(Object obj) - Test whether this URI is equal to another object. |
| String | getAboveHierPath() - Get the level above this hierarchy level. |
| String | getAuthority() - Get the authority. |
| String | getCurrentHierPath() - Get the current hierarchy level. |
| static String | getDefaultDocumentCharset() - Get the recommended default charset of the document. |
| static String | getDefaultDocumentCharsetByLocale() - Get the default charset of the document by locale. |
| static String | getDefaultDocumentCharsetByPlatform() - Get the default charset of the document by platform. |
| static String | getDefaultProtocolCharset() - Get the default charset of the protocol. |
| String | getEscapedAboveHierPath() - Get the level above this hierarchy level. |
| String | getEscapedAuthority() - Get the escaped authority. |
| String | getEscapedCurrentHierPath() - Get the escaped current hierarchy level. |
| String | getEscapedFragment() - Get the escaped fragment. |
| String | getEscapedName() - Get the escaped basename of the path. |
| String | getEscapedPath() - Get the escaped path. |
| String | getEscapedPathQuery() - Get the escaped path and query. |
| String | getEscapedQuery() - Get the escaped query. |
| String | getEscapedURI() - Get the URI character sequence. |
| String | getEscapedURIReference() - Get the escaped URI reference string. |
| String | getEscapedUserinfo() - Get the escaped userinfo. |
| String | getFragment() - Get the fragment. |
| String | getHost() - Get the host. |
| String | getName() - Get the basename of the path. |
| String | getPath() - Get the path. |
| String | getPathQuery() - Get the path and query. |
| int | getPort() - Get the port. |
| String | getProtocolCharset() - Get the protocol charset used by this current URI instance. |
| String | getQuery() - Get the query. |
| char[] | getRawAboveHierPath() - Get the level above this hierarchy level. |
| char[] | getRawAuthority() - Get the raw-escaped authority. |
| char[] | getRawCurrentHierPath() - Get the raw-escaped current hierarchy level. |
| protected char[] | getRawCurrentHierPath(char[] path) - Get the raw-escaped current hierarchy level in the given path. |
| char[] | getRawFragment() - Get the raw-escaped fragment. |
| char[] | getRawHost() - Get the host. |
| char[] | getRawName() - Get the raw-escaped basename of the path. |
| char[] | getRawPath() - Get the raw-escaped path. |
| char[] | getRawPathQuery() - Get the raw-escaped path and query. |
| char[] | getRawQuery() - Get the raw-escaped query. |
| char[] | getRawScheme() - Get the scheme. |
| char[] | getRawURI() - Get the URI character sequence. |
| char[] | getRawURIReference() - Get the URI reference character sequence. |
| char[] | getRawUserinfo() - Get the raw-escaped userinfo. |
| String | getScheme() - Get the scheme. |
| String | getURI() - Get the URI character sequence. |
| String | getURIReference() - Get the original URI reference string. |
| String | getUserinfo() - Get the userinfo. |
| boolean | hasAuthority() - Tell whether or not this URI has authority. |
| boolean | hasFragment() - Tell whether or not this URI has fragment. |
| int | hashCode() - Return a hash code for this URI. |
| boolean | hasQuery() - Tell whether or not this URI has query. |
| boolean | hasUserinfo() - Tell whether or not this URI has userinfo. |
| protected int | indexFirstOf(char[] s, char delim) - Get the earliest index at which the given delimiter first occurs in the given array. |
| protected int | indexFirstOf(char[] s, char delim, int offset) - Get the earliest index at which the given delimiter first occurs in the given array. |
| protected int | indexFirstOf(String s, String delims) - Get the earliest index at which any of the given delimiters first occurs in the given string. |
| protected int | indexFirstOf(String s, String delims, int offset) - Get the earliest index at which any of the given delimiters first occurs in the given string. |
| boolean | isAbsoluteURI() - Tell whether or not this URI is absolute. |
| boolean | isAbsPath() - Tell whether or not the relativeURI or hier_part of this URI is abs_path. |
| boolean | isHierPart() - Tell whether or not the absoluteURI of this URI is hier_part. |
| boolean | isHostname() - Tell whether or not the host part of this URI is hostname. |
| boolean | isIPv4address() - Tell whether or not the host part of this URI is IPv4address. |
| boolean | isIPv6reference() - Tell whether or not the host part of this URI is IPv6reference. |
| boolean | isNetPath() - Tell whether or not the relativeURI or hier_part of this URI is net_path. |
| boolean | isOpaquePart() - Tell whether or not the absoluteURI of this URI is opaque_part. |
| boolean | isRegName() - Tell whether or not the authority component of this URI is reg_name. |
| boolean | isRelativeURI() - Tell whether or not this URI is relative. |
| boolean | isRelPath() - Tell whether or not the relativeURI of this URI is rel_path. |
| boolean | isServer() - Tell whether or not the authority component of this URI is server. |
| void | normalize() - Normalizes the path part of this URI. |
| protected char[] | normalize(char[] path) - Normalize the given hier path part. |
| protected void | parseAuthority(String original, boolean escaped) - Parse the authority component. |
| protected void | parseUriReference(String original, boolean escaped) - In order to avoid any possibility of conflict with non-ASCII characters, parse a URI reference as a String with the character encoding of the local system or the document. |
| protected boolean | prevalidate(String component, BitSet disallowed) - Pre-validate the unescaped URI string within a specific component. |
| protected char[] | removeFragmentIdentifier(char[] component) - Remove the fragment identifier of the given component. |
| protected char[] | resolvePath(char[] basePath, char[] relPath) - Resolve the base and relative path. |
| static void | setDefaultDocumentCharset(String charset) - Set the default charset of the document. |
| static void | setDefaultProtocolCharset(String charset) - Set the default charset of the protocol. |
| void | setEscapedAuthority(String escapedAuthority) - Set the authority. |
| void | setEscapedFragment(String escapedFragment) - Set the escaped fragment string. |
| void | setEscapedPath(String escapedPath) - Set the escaped path. |
| void | setEscapedQuery(String escapedQuery) - Set the escaped query string. |
| void | setFragment(String fragment) - Set the fragment. |
| void | setPath(String path) - Set the path. |
| void | setQuery(String query) - Set the query. |
| void | setRawAuthority(char[] escapedAuthority) - Set the authority. |
| void | setRawFragment(char[] escapedFragment) - Set the raw-escaped fragment. |
| void | setRawPath(char[] escapedPath) - Set the raw-escaped path. |
| void | setRawQuery(char[] escapedQuery) - Set the raw-escaped query. |
| protected void | setURI() - Once it's parsed successfully, set this URI. |
| String | toString() - Get the escaped URI string. |
| protected boolean | validate(char[] component, BitSet generous) - Validate the URI characters within a specific component. |
| protected boolean | validate(char[] component, int soffset, int eoffset, BitSet generous) - Validate the URI characters within a specific component. |
| Methods inherited from class java.lang.Object | 
|---|
| finalize, getClass, notify, notifyAll, wait, wait, wait | 
| Field Detail | 
|---|
protected int hash
protected char[] _uri
protected String protocolCharset
protected static String defaultProtocolCharset
protected static String defaultDocumentCharset
protected static String defaultDocumentCharsetByLocale
protected static String defaultDocumentCharsetByPlatform
protected char[] _scheme
protected char[] _opaque
protected char[] _authority
protected char[] _userinfo
protected char[] _host
protected int _port
protected char[] _path
protected char[] _query
protected char[] _fragment
protected static final char[] rootPath
protected static final BitSet percent
protected static final BitSet digit
 digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
            "8" | "9"
 
protected static final BitSet alpha
alpha = lowalpha | upalpha
protected static final BitSet alphanum
alphanum = alpha | digit
protected static final BitSet hex
 hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                         "a" | "b" | "c" | "d" | "e" | "f"
 
protected static final BitSet escaped
escaped = "%" hex hex
protected static final BitSet mark
 mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                 "(" | ")"
 
protected static final BitSet unreserved
unreserved = alphanum | mark
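The grammar rules above map naturally onto java.util.BitSet, with one bit per allowed character code. The following is a minimal sketch of that idea only; the class CharClassSketch and its constant names are hypothetical and are not the fields of this class.

```java
import java.util.BitSet;

public class CharClassSketch {

    static final BitSet DIGIT = new BitSet(256);
    static final BitSet ALPHA = new BitSet(256);
    static final BitSet ALPHANUM = new BitSet(256);
    static final BitSet MARK = new BitSet(256);
    static final BitSet UNRESERVED = new BitSet(256);

    static {
        // digit = "0" | "1" | ... | "9"
        for (int c = '0'; c <= '9'; c++) {
            DIGIT.set(c);
        }
        // alpha = lowalpha | upalpha
        for (int c = 'a'; c <= 'z'; c++) {
            ALPHA.set(c);
        }
        for (int c = 'A'; c <= 'Z'; c++) {
            ALPHA.set(c);
        }
        // alphanum = alpha | digit
        ALPHANUM.or(ALPHA);
        ALPHANUM.or(DIGIT);
        // mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
        for (char c : "-_.!~*'()".toCharArray()) {
            MARK.set(c);
        }
        // unreserved = alphanum | mark
        UNRESERVED.or(ALPHANUM);
        UNRESERVED.or(MARK);
    }

    /** True if the character may appear unescaped as plain data. */
    static boolean isUnreserved(char c) {
        return UNRESERVED.get(c);
    }

    public static void main(String[] args) {
        System.out.println(isUnreserved('~'));
        System.out.println(isUnreserved('%'));
    }
}
```

Unions of rules become BitSet.or calls, so each later rule can be built from the earlier ones.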
protected static final BitSet reserved
 reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                 "$" | ","
 
protected static final BitSet uric
uric = reserved | unreserved | escaped
protected static final BitSet fragment
fragment = *uric
protected static final BitSet query
query = *uric
protected static final BitSet pchar
 pchar         = unreserved | escaped |
                 ":" | "@" | "&" | "=" | "+" | "$" | ","
 
protected static final BitSet param
param = *pchar
protected static final BitSet segment
segment = *pchar *( ";" param )
protected static final BitSet path_segments
path_segments = segment *( "/" segment )
protected static final BitSet abs_path
abs_path = "/" path_segments
protected static final BitSet uric_no_slash
 uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                 "&" | "=" | "+" | "$" | ","
 
protected static final BitSet opaque_part
opaque_part = uric_no_slash *uric
protected static final BitSet path
path = [ abs_path | opaque_part ]
protected static final BitSet port
protected static final BitSet IPv4address
IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
protected static final BitSet IPv6address
IPv6address = hexpart [ ":" IPv4address ]
protected static final BitSet IPv6reference
IPv6reference = "[" IPv6address "]"
protected static final BitSet toplabel
toplabel = alpha | alpha *( alphanum | "-" ) alphanum
protected static final BitSet domainlabel
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
protected static final BitSet hostname
hostname = *( domainlabel "." ) toplabel [ "." ]
protected static final BitSet host
host = hostname | IPv4address | IPv6reference
protected static final BitSet hostport
hostport = host [ ":" port ]
protected static final BitSet userinfo
 userinfo      = *( unreserved | escaped |
                    ";" | ":" | "&" | "=" | "+" | "$" | "," )
 
public static final BitSet within_userinfo
protected static final BitSet server
server = [ [ userinfo "@" ] hostport ]
protected static final BitSet reg_name
 reg_name      = 1*( unreserved | escaped | "$" | "," |
                     ";" | ":" | "@" | "&" | "=" | "+" )
 
protected static final BitSet authority
authority = server | reg_name
protected static final BitSet scheme
scheme = alpha *( alpha | digit | "+" | "-" | "." )
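The scheme rule above can be checked character by character. The sketch below is a hypothetical stand-alone validator written for illustration; it is not the validation code of this class and it only accepts ASCII letters and digits.

```java
public class SchemeSketch {

    private static boolean isAlpha(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** scheme = alpha *( alpha | digit | "+" | "-" | "." ) */
    static boolean isScheme(String s) {
        if (s.isEmpty() || !isAlpha(s.charAt(0))) {
            return false; // must start with a letter
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!(isAlpha(c) || isDigit(c) || c == '+' || c == '-' || c == '.')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isScheme("http"));
        System.out.println(isScheme("1http"));
    }
}
```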
protected static final BitSet rel_segment
 rel_segment   = 1*( unreserved | escaped |
                     ";" | "@" | "&" | "=" | "+" | "$" | "," )
 
protected static final BitSet rel_path
rel_path = rel_segment [ abs_path ]
protected static final BitSet net_path
net_path = "//" authority [ abs_path ]
protected static final BitSet hier_part
hier_part = ( net_path | abs_path ) [ "?" query ]
protected static final BitSet relativeURI
relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
protected static final BitSet absoluteURI
absoluteURI = scheme ":" ( hier_part | opaque_part )
protected static final BitSet URI_reference
URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
public static final BitSet control
public static final BitSet space
public static final BitSet delims
public static final BitSet unwise
public static final BitSet disallowed_rel_path
public static final BitSet disallowed_opaque_part
public static final BitSet allowed_authority
public static final BitSet allowed_opaque_part
public static final BitSet allowed_reg_name
public static final BitSet allowed_userinfo
public static final BitSet allowed_within_userinfo
public static final BitSet allowed_IPv6reference
public static final BitSet allowed_host
public static final BitSet allowed_within_authority
public static final BitSet allowed_abs_path
public static final BitSet allowed_rel_path
public static final BitSet allowed_within_path
public static final BitSet allowed_query
public static final BitSet allowed_within_query
public static final BitSet allowed_fragment
protected boolean _is_hier_part
protected boolean _is_opaque_part
protected boolean _is_net_path
protected boolean _is_abs_path
protected boolean _is_rel_path
protected boolean _is_reg_name
protected boolean _is_server
protected boolean _is_hostname
protected boolean _is_IPv4address
protected boolean _is_IPv6reference
| Constructor Detail | 
|---|
protected URI()
public URI(String s,
           boolean escaped,
           String charset)
    throws URIException,
           NullPointerException
s - URI character sequence
escaped - true if URI character sequence is in escaped form, false otherwise
charset - the charset string to do escape encoding, if required
URIException - If the URI cannot be created.
NullPointerException - if input string is null
See also: getProtocolCharset()
public URI(String s,
           boolean escaped)
    throws URIException,
           NullPointerException
s - URI character sequence
escaped - true if URI character sequence is in escaped form, false otherwise
URIException - If the URI cannot be created.
NullPointerException - if input string is null
See also: getProtocolCharset()
public URI(char[] escaped,
           String charset)
    throws URIException,
           NullPointerException
escaped - the URI character sequence
charset - the charset string to do escape encoding
URIException - If the URI cannot be created.
NullPointerException - if escaped is null
See also: getProtocolCharset()
public URI(char[] escaped)
    throws URIException,
           NullPointerException
escaped - the URI character sequence
URIException - If the URI cannot be created.
NullPointerException - if escaped is null
See also: getDefaultProtocolCharset()
public URI(String original,
           String charset)
    throws URIException
original - the string to be represented as a URI character sequence; it is either an absoluteURI or a relativeURI
charset - the charset string to do escape encoding
URIException - If the URI cannot be created.
See also: getProtocolCharset()
public URI(String original)
    throws URIException
URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
A URI can be placed within double quotes or angle brackets, like "http://test.com/" and <http://test.com/>.
original - the string to be represented as a URI character sequence; it is either an absoluteURI or a relativeURI
URIException - If the URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String schemeSpecificPart,
           String fragment)
    throws URIException
 URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
 absoluteURI = scheme ":" ( hier_part | opaque_part )
 opaque_part = uric_no_slash *uric
This constructor is for an absolute URI of the form <scheme>:<scheme-specific-part>#<fragment>.
scheme - the scheme string
schemeSpecificPart - scheme_specific_part
fragment - the fragment string
URIException - If the URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String authority,
           String path,
           String query,
           String fragment)
    throws URIException
 URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
 absoluteURI = scheme ":" ( hier_part | opaque_part )
 relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
 hier_part = ( net_path | abs_path ) [ "?" query ]
This constructor is for an absolute URI of the form <scheme>:<path>?<query>#<fragment> and a relative URI of the form <path>?<query>#<fragment>.
scheme - the scheme string
authority - the authority string
path - the path string
query - the query string
fragment - the fragment string
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String userinfo,
           String host,
           int port)
    throws URIException
scheme - the scheme string
userinfo - the userinfo string
host - the host string
port - the port number
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String userinfo,
           String host,
           int port,
           String path)
    throws URIException
scheme - the scheme string
userinfo - the userinfo string
host - the host string
port - the port number
path - the path string
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String userinfo,
           String host,
           int port,
           String path,
           String query)
    throws URIException
scheme - the scheme string
userinfo - the userinfo string
host - the host string
port - the port number
path - the path string
query - the query string
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String userinfo,
           String host,
           int port,
           String path,
           String query,
           String fragment)
    throws URIException
scheme - the scheme string
userinfo - the userinfo string
host - the host string
port - the port number
path - the path string
query - the query string
fragment - the fragment string
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(String scheme,
           String host,
           String path,
           String fragment)
    throws URIException
scheme - the scheme string
host - the host string
path - the path string
fragment - the fragment string
URIException - If the new URI cannot be created.
See also: getDefaultProtocolCharset()
public URI(URI base,
           String relative)
    throws URIException
base - the base URI
relative - the relative URI string
URIException - If the new URI cannot be created.
public URI(URI base,
           String relative,
           boolean escaped)
    throws URIException
base - the base URI
relative - the relative URI string
escaped - true if URI character sequence is in escaped form, false otherwise
URIException - If the new URI cannot be created.
public URI(URI base,
           URI relative)
    throws URIException
 URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
 relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
Resolving Relative References to Absolute Form.
Examples of Resolving Relative URI References
Within an object with a well-defined base URI of
 http://a/b/c/d;p?q
the relative URI would be resolved as follows:
Normal Examples
 g:h = g:h
 g = http://a/b/c/g
 ./g = http://a/b/c/g
 g/ = http://a/b/c/g/
 /g = http://a/g
 //g = http://g
 ?y = http://a/b/c/?y
 g?y = http://a/b/c/g?y
 #s = (current document)#s
 g#s = http://a/b/c/g#s
 g?y#s = http://a/b/c/g?y#s
 ;x = http://a/b/c/;x
 g;x = http://a/b/c/g;x
 g;x?y#s = http://a/b/c/g;x?y#s
 . = http://a/b/c/
 ./ = http://a/b/c/
 .. = http://a/b/
 ../ = http://a/b/
 ../g = http://a/b/g
 ../.. = http://a/
 ../../ = http://a/
 ../../g = http://a/g
 Some URI schemes do not allow a hierarchical syntax matching the
  
base - the base URI
relative - the relative URI
URIException - If the new URI cannot be created.
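A few of the resolution examples listed above can be reproduced with the JDK's own java.net.URI, which is used here only as a stand-in so that the sketch runs without this library on the classpath. The class ResolveSketch and its resolve helper are hypothetical names, and no claim is made that java.net.URI and this class agree on every case.

```java
import java.net.URI;

public class ResolveSketch {

    // Hypothetical helper: resolve a relative reference against a base URI
    // using java.net.URI from the JDK.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        String[] relatives = {"g", "/g", "//g", "../g", "../../g", "g;x?y#s"};
        for (String relative : relatives) {
            System.out.println(relative + " = " + resolve(base, relative));
        }
    }
}
```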
| Method Detail | 
|---|
protected static char[] encode(String original,
                               BitSet allowed,
                               String charset)
                        throws URIException
original character sequence->octet sequence->URI character sequence
An escaped octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing the octet code. For example, "%20" is the escaped encoding for the US-ASCII space character.
Conversion from the local filesystem character set to UTF-8 will normally involve a two step process. First convert the local character set to the UCS; then convert the UCS to UTF-8. The first step in the process can be performed by maintaining a mapping table that includes the local character set code and the corresponding UCS code. The next step is to convert the UCS character code to the UTF-8 encoding.
Mapping between vendor codepages can be done in a very similar manner as described above.
The only time escape encodings can safely be made is when a URI is being created from its component parts. This method performs escaping and validation internally.
original - the original character sequence
allowed - those characters that are allowed within a component
charset - the protocol charset
URIException - null component or unsupported character encoding
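The conversion chain described above (original character sequence, then octets, then URI characters) can be sketched with the JDK alone. This is a minimal illustration, not the implementation of encode: it assumes UTF-8 as the protocol charset and an allowed set restricted to US-ASCII, and the names EscapeSketch, escape and alphanumerics are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class EscapeSketch {
    // Converts the text to UTF-8 octets and writes every octet that is not
    // in the allowed set as a "%XX" triplet. Assumes the allowed set only
    // contains US-ASCII code points.
    static String escape(String original, BitSet allowed) {
        StringBuilder out = new StringBuilder();
        for (byte b : original.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            if (allowed.get(octet)) {
                out.append((char) octet);
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(octet >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(octet & 0xF, 16)));
            }
        }
        return out.toString();
    }

    // Hypothetical allowed set for the example: ASCII letters and digits.
    static BitSet alphanumerics() {
        BitSet set = new BitSet(256);
        set.set('a', 'z' + 1);
        set.set('A', 'Z' + 1);
        set.set('0', '9' + 1);
        return set;
    }
}
```

With this allowed set, a space is written as "%20", matching the example in the description; a real component would use the allowed set defined for that component.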
protected static String decode(char[] component,
                               String charset)
                        throws URIException
URI character sequence->octet sequence->original character sequence
A URI must be separated into its components before the escaped characters within those components can be safely decoded.
Notice that there is a chance that URI characters that are not UTF-8 may be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.
Because the percent "%" character always has the reserved purpose of being the escape indicator, it must be escaped as "%25" in order to be used as data within a URI.
This method performs unescaping internally.
component - the URI character sequence
charset - the protocol charset
URIException - incomplete trailing escape pattern or unsupported
 character encoding
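The reverse chain (URI characters, then octets, then original characters) and the hazard of unescaping twice can be illustrated as follows. This is a sketch under stated assumptions, not the code of decode: it fixes the charset to UTF-8, reports errors with IllegalArgumentException instead of URIException, and the names UnescapeSketch and unescape are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class UnescapeSketch {
    // Turns "%XX" triplets back into octets, then decodes the octets as UTF-8.
    static String unescape(String component) {
        ByteArrayOutputStream octets = new ByteArrayOutputStream();
        for (int i = 0; i < component.length(); i++) {
            char c = component.charAt(i);
            if (c == '%') {
                if (i + 2 >= component.length()) {
                    throw new IllegalArgumentException("incomplete trailing escape pattern");
                }
                int hi = Character.digit(component.charAt(i + 1), 16);
                int lo = Character.digit(component.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("invalid escape at index " + i);
                }
                octets.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c < 128) {
                octets.write(c);
            } else {
                throw new IllegalArgumentException("non-ASCII URI character at index " + i);
            }
        }
        return new String(octets.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Unescaping "%2541" once yields the data "%41"; unescaping that result a second time yields "A", which is the kind of misreading that decoding the same string twice can cause.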
protected static String decode(String component,
                               String charset)
                        throws URIException
URI character sequence->octet sequence->original character sequence
A URI must be separated into its components before the escaped characters within those components can be safely decoded.
Notice that there is a chance that URI characters that are not UTF-8 may be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.
Because the percent "%" character always has the reserved purpose of being the escape indicator, it must be escaped as "%25" in order to be used as data within a URI.
This method performs unescaping internally.
component - the URI character sequence
charset - the protocol charset
URIException - incomplete trailing escape pattern or unsupported
 character encoding
protected boolean prevalidate(String component,
                              BitSet disallowed)
component - the component string within the component
disallowed - those characters disallowed within the component
protected boolean validate(char[] component,
                           BitSet generous)
component - the character sequence within the component
generous - those characters that are allowed within a component
protected boolean validate(char[] component,
                           int soffset,
                           int eoffset,
                           BitSet generous)
This validation is generous rather than strict. Strict validation may be performed before this method is called.
component - the character sequence within the component
soffset - the starting offset of the given component
eoffset - the ending offset of the given component; if -1, it means the length of the component
generous - those characters that are allowed within a component
protected void parseUriReference(String original,
                                 boolean escaped)
                          throws URIException
String with the character
 encoding of the local system or the document.
 The following line is the regular expression for breaking-down a URI reference into its components.
   ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
    12            3  4          5       6  7        8 9
 For example, matching the above expression to http://jakarta.apache.org/ietf/uri/#Related results in the following subexpression matches:
               $1 = http:
  scheme    =  $2 = http
               $3 = //jakarta.apache.org
  authority =  $4 = jakarta.apache.org
  path      =  $5 = /ietf/uri/
               $6 = 
  query     =  $7 = 
               $8 = #Related
  fragment  =  $9 = Related
   
original - the original character sequence
escaped - true if original is escaped
URIException - If an error occurs.
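As a rough illustration, the regular expression above can be applied with java.util.regex. The class UriRegexSketch and its split helper are hypothetical names for this sketch and do not reflect how parseUriReference is implemented. One difference from the table above: an optional group that takes no part in the match is reported by Java as null rather than as an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UriRegexSketch {
    // The expression from the description, with the backslash doubled for Java.
    static final Pattern URI_REFERENCE = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns {scheme, authority, path, query, fragment};
    // components that are absent from the input are null.
    static String[] split(String uriReference) {
        Matcher m = URI_REFERENCE.matcher(uriReference);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a URI reference: " + uriReference);
        }
        return new String[] {m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)};
    }
}
```

For http://jakarta.apache.org/ietf/uri/#Related this yields the scheme http, the authority jakarta.apache.org, the path /ietf/uri/, no query, and the fragment Related.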
protected int indexFirstOf(String s,
                           String delims)
s - the string to be indexed
delims - the delimiters used to index
protected int indexFirstOf(String s,
                           String delims,
                           int offset)
s - the string to be indexed
delims - the delimiters used to index
offset - the from index
protected int indexFirstOf(char[] s,
                           char delim)
s - the character array to be indexed
delim - the delimiter used to index
protected int indexFirstOf(char[] s,
                           char delim,
                           int offset)
s - the character array to be indexed
delim - the delimiter used to index
offset - the offset
protected void parseAuthority(String original,
                              boolean escaped)
                       throws URIException
original - the original character sequence of authority component
escaped - true if original is escaped
URIException - If an error occurs.
protected void setURI()
getRawURI()
public boolean isAbsoluteURI()
public boolean isRelativeURI()
public boolean isHierPart()
public boolean isOpaquePart()
public boolean isNetPath()
hasAuthority()
public boolean isAbsPath()
public boolean isRelPath()
public boolean hasAuthority()
isNetPath()
public boolean isRegName()
public boolean isServer()
public boolean hasUserinfo()
public boolean isHostname()
public boolean isIPv4address()
public boolean isIPv6reference()
public boolean hasQuery()
public boolean hasFragment()
public static void setDefaultProtocolCharset(String charset)
                                      throws URI.DefaultCharsetChanged
The character set used to store files SHALL remain a local decision and MAY depend on the capability of local operating systems. Prior to the exchange of URIs they SHOULD be converted into a ISO/IEC 10646 format and UTF-8 encoded. This approach, while allowing international exchange of URIs, will still allow backward compatibility with older systems because the code set positions for ASCII characters are identical to the one byte sequence in UTF-8.
An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.
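The backward-compatibility claim above, that ASCII characters occupy the same single-octet positions in UTF-8, can be checked with a few lines of JDK code. The class AsciiUtf8Sketch and its sameOctets helper are names invented for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AsciiUtf8Sketch {
    // True when the US-ASCII and UTF-8 encodings of the text are the same octets,
    // which holds exactly for text made only of ASCII characters.
    static boolean sameOctets(String s) {
        return Arrays.equals(s.getBytes(StandardCharsets.US_ASCII),
                             s.getBytes(StandardCharsets.UTF_8));
    }
}
```

A plain ASCII URI such as ftp://ftp.is.co.za/rfc/rfc1808.txt encodes identically under both charsets, whereas text containing a non-ASCII character does not.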
 This setter always succeeds, and it always throws a
 DefaultCharsetChanged exception afterwards.
 API programmers must therefore call it as follows:
 
  import org.apache.commons.httpclient.URI;
  import org.apache.commons.httpclient.URI.DefaultCharsetChanged;
      .
      .
      .
  try {
      URI.setDefaultProtocolCharset("UTF-8");
  } catch (DefaultCharsetChanged cc) {
      // CASE 1: the exception could be ignored, when it is set by user
      if (cc.getReasonCode() == DefaultCharsetChanged.PROTOCOL_CHARSET) {
          // CASE 2: let user know the default protocol charset changed
      } else {
          // CASE 2: let user know the default document charset changed
      }
  }
  
charset - the default charset for each protocol
URI.DefaultCharsetChanged - default charset changed
public static String getDefaultProtocolCharset()
An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.
To work globally either requires support of a number of character sets and to be able to convert between them, or the use of a single preferred character set. For support of global compatibility it is STRONGLY RECOMMENDED that clients and servers use UTF-8 encoding when exchanging URIs.
public String getProtocolCharset()
getDefaultProtocolCharset()
public static void setDefaultDocumentCharset(String charset)
                                      throws URI.DefaultCharsetChanged
Note that a URI may contain mixed character sets (e.g. ftp://host/KoreanNamespace/ChineseResource). To handle the bidirectional display of these character sets, the protocol charset could simply be used again, because extracting the BIDI control characters inserted at different points during composition is not yet implemented.
 This setter always succeeds, and it always throws a
 DefaultCharsetChanged exception afterwards.
 API programmers must therefore call it as follows:
 
  import org.apache.commons.httpclient.URI;
  import org.apache.commons.httpclient.URI.DefaultCharsetChanged;
      .
      .
      .
  try {
      URI.setDefaultDocumentCharset("EUC-KR");
  } catch (DefaultCharsetChanged cc) {
      // CASE 1: the exception could be ignored, when it is set by user
      if (cc.getReasonCode() == DefaultCharsetChanged.DOCUMENT_CHARSET) {
          // CASE 2: let user know the default document charset changed
      } else {
          // CASE 2: let user know the default protocol charset changed
      }
  }
  
charset - the default charset for the document
URI.DefaultCharsetChanged - default charset changed
public static String getDefaultDocumentCharset()
public static String getDefaultDocumentCharsetByLocale()
public static String getDefaultDocumentCharsetByPlatform()
public char[] getRawScheme()
public String getScheme()
public void setRawAuthority(char[] escapedAuthority)
                     throws URIException,
                            NullPointerException
authority = server | reg_name
escapedAuthority - the raw escaped authority
URIException - If parseAuthority(java.lang.String,boolean) fails
NullPointerException - null authority
public void setEscapedAuthority(String escapedAuthority)
                         throws URIException
escapedAuthority - the escaped authority string
URIException - If parseAuthority(java.lang.String,boolean) fails
public char[] getRawAuthority()
public String getEscapedAuthority()
public String getAuthority()
                    throws URIException
URIException - If decode(char[], java.lang.String) fails
public char[] getRawUserinfo()
getAuthority()
public String getEscapedUserinfo()
getAuthority()
public String getUserinfo()
                   throws URIException
URIException - If decode(char[], java.lang.String) fails
getAuthority()
public char[] getRawHost()
host = hostname | IPv4address | IPv6reference
getAuthority()
public String getHost()
               throws URIException
host = hostname | IPv4address | IPv6reference
URIException - If decode(char[], java.lang.String) fails
getAuthority()
public int getPort()
public void setRawPath(char[] escapedPath)
                throws URIException
escapedPath - the path character sequence
URIException - encoding error or not proper for initial instance
encode(java.lang.String, java.util.BitSet, java.lang.String)
public void setEscapedPath(String escapedPath)
                    throws URIException
escapedPath - the escaped path string
URIException - encoding error or not proper for initial instance
encode(java.lang.String, java.util.BitSet, java.lang.String)
public void setPath(String path)
             throws URIException
path - the path string
URIException - set incorrectly or fragment only
encode(java.lang.String, java.util.BitSet, java.lang.String)
protected char[] resolvePath(char[] basePath,
                             char[] relPath)
                      throws URIException
basePath - a character array of the basePath
relPath - a character array of the relPath
URIException - no more higher path level to be resolved
protected char[] getRawCurrentHierPath(char[] path)
                                throws URIException
path - the path
URIException - no hierarchy level
public char[] getRawCurrentHierPath()
                             throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
public String getEscapedCurrentHierPath()
                                 throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
public String getCurrentHierPath()
                          throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
decode(char[], java.lang.String)
public char[] getRawAboveHierPath()
                           throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
public String getEscapedAboveHierPath()
                               throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
public String getAboveHierPath()
                        throws URIException
URIException - If getRawCurrentHierPath(char[]) fails.
decode(char[], java.lang.String)
public char[] getRawPath()
path = [ abs_path | opaque_part ]
public String getEscapedPath()
path = [ abs_path | opaque_part ]
abs_path = "/" path_segments
opaque_part = uric_no_slash *uric
public String getPath()
               throws URIException
path = [ abs_path | opaque_part ]
URIException - If decode(char[], java.lang.String) fails.
decode(char[], java.lang.String)
public char[] getRawName()
public String getEscapedName()
public String getName()
               throws URIException
URIException - incomplete trailing escape pattern or unsupported
 character encoding
decode(char[], java.lang.String)
public char[] getRawPathQuery()
public String getEscapedPathQuery()
public String getPathQuery()
                    throws URIException
URIException - incomplete trailing escape pattern or unsupported
 character encoding
decode(char[], java.lang.String)
public void setRawQuery(char[] escapedQuery)
                 throws URIException
escapedQuery - the raw-escaped query
URIException - escaped query not valid
public void setEscapedQuery(String escapedQuery)
                     throws URIException
escapedQuery - the escaped query string
URIException - escaped query not valid
public void setQuery(String query)
              throws URIException
When the reserved special characters ("&", "=", "+", ",", and "$") carry no special meaning within the query component, it is recommended to encode the whole query with this method.
 Additional APIs that give these reserved special characters a
 protocol-specific meaning are implemented in the protocol classes
 that inherit from URI. Refer to the same-named APIs
 implemented in each specific protocol class.
query - the query string.
URIException - incomplete trailing escape pattern or unsupported
 character encoding
encode(java.lang.String, java.util.BitSet, java.lang.String)
public char[] getRawQuery()
public String getEscapedQuery()
public String getQuery()
                throws URIException
URIException - incomplete trailing escape pattern or unsupported
 character encoding
decode(char[], java.lang.String)
public void setRawFragment(char[] escapedFragment)
                    throws URIException
escapedFragment - the raw-escaped fragment
URIException - escaped fragment not valid
public void setEscapedFragment(String escapedFragment)
                        throws URIException
escapedFragment - the escaped fragment string
URIException - escaped fragment not valid
public void setFragment(String fragment)
                 throws URIException
fragment - the fragment string.
URIException - If an error occurs.
public char[] getRawFragment()
The optional fragment identifier is not part of a URI, but is often used in conjunction with a URI.
The format and interpretation of fragment identifiers is dependent on the media type [RFC2046] of the retrieval result.
A fragment identifier is only meaningful when a URI reference is intended for retrieval and the result of that retrieval is a document for which the identified fragment is consistently defined.
public String getEscapedFragment()
public String getFragment()
                   throws URIException
URIException - incomplete trailing escape pattern or unsupported
 character encoding
decode(char[], java.lang.String)
protected char[] removeFragmentIdentifier(char[] component)
component - the component that may include a fragment
protected char[] normalize(char[] path)
                    throws URIException
Algorithm taken from URI reference parser at http://www.apache.org/~fielding/uri/rev-2002/issues.html.
path - the path to normalize
URIException - no more higher path level to be normalized
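Path normalization of this kind removes "." segments and collapses "segment/.." pairs. The sketch below uses the JDK's java.net.URI.normalize as a stand-in rather than the algorithm referenced above, so details may differ; in particular, the error reported by this class for paths that climb above the root is not modelled. The class NormalizeSketch and its normalizePath helper are hypothetical.

```java
import java.net.URI;

final class NormalizeSketch {
    // Normalizes a hierarchical path with the JDK's normalizer
    // (a stand-in, not the algorithm used by this class).
    static String normalizePath(String path) {
        return URI.create(path).normalize().getPath();
    }

    public static void main(String[] args) {
        System.out.println(normalizePath("/a/b/../c/./d"));
    }
}
```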
public void normalize()
               throws URIException
URIException - no more higher path level to be normalized
isAbsPath()
protected boolean equals(char[] first,
                         char[] second)
first - the first character array
second - the second character array
public boolean equals(Object obj)
equals in class Object
obj - an object to compare
public int hashCode()
hashCode in class Object
public int compareTo(Object obj)
              throws ClassCastException
compareTo in interface Comparable
obj - the object to be compared.
ClassCastException - not URI argument
public Object clone()
             throws CloneNotSupportedException
String.
 
 Use this method to obtain an identical copy of the URI object,
 including the userinfo component.
clone in class Object
CloneNotSupportedException
public char[] getRawURI()
It is clearly unwise to use a URL that contains a password which is intended to be secret. In particular, the use of a password within the 'userinfo' component of a URL is strongly disrecommended except in those rare cases where the 'password' parameter is intended to be public.
To obtain the individual parts of the userinfo, use the methods provided by the specific URL class; the details depend on the specific URL.
public String getEscapedURI()
public String getURI()
              throws URIException
URIException - incomplete trailing escape pattern or unsupported
 character encoding
decode(char[], java.lang.String)
public char[] getRawURIReference()
public String getEscapedURIReference()
public String getURIReference()
                       throws URIException
URIException - If decode(char[], java.lang.String) fails.
public String toString()
For security reasons, documents use the URI-reference form only without the userinfo component, as in http://jakarta.apache.org/. A URI-reference that contains the userinfo component can still be parsed.
 In other words, this URI class and any of its subclasses must not expose
 a URI-reference expression that contains the userinfo component, such as
 http://user:password@hostport/restricted_zone.
 The API client programmer should therefore extract the user and
 password manually in order to gain access. Subclasses will probably
 support this, but not as a whole URI-reference expression.
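The idea of exposing a URI-reference without its userinfo component can be sketched with the JDK's java.net.URI. This is an illustration only and not what toString does internally; the class UserinfoSketch and its withoutUserinfo helper are hypothetical, and the sketch assumes a server-based authority (a host name, optionally with a port).

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UserinfoSketch {
    // Rebuilds the URI from its components, leaving the userinfo out.
    static String withoutUserinfo(String uri) {
        try {
            URI u = new URI(uri);
            return new URI(u.getScheme(), null, u.getHost(), u.getPort(),
                           u.getPath(), u.getQuery(), u.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("not a valid URI: " + uri, e);
        }
    }
}
```

Applied to http://user:password@hostport/restricted_zone, the helper returns http://hostport/restricted_zone, so the password never appears in the printed form.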
toString in class Object
clone()