Class Utilities
- Author:
- Mark Allen
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static final class | Utilities.EffectiveOriginResolver | Builder for extractEffectiveOrigin(EffectiveOriginResolver).
-
Method Summary
Method | Description
encodeQueryParameters(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> queryParameters, @NonNull QueryFormat queryFormat) | Encodes decoded query parameters into a raw query string.
extractCharsetFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers) | Extracts the Charset from the first Content-Type header, if present and valid.
extractCharsetFromHeaderValue(@Nullable String contentTypeHeaderValue) | Extracts the charset=... parameter from a Content-Type header value.
extractContentTypeFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers) | Extracts the media type (without parameters) from the first Content-Type header.
extractContentTypeFromHeaderValue(@Nullable String contentTypeHeaderValue) | Extracts the media type (without parameters) from a Content-Type header value.
extractCookiesFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers) | Parses Cookie request headers into a map of cookie names to values.
extractEffectiveOrigin(@NonNull Utilities.EffectiveOriginResolver effectiveOriginResolver) | Best-effort attempt to determine a client's effective origin by examining request headers.
extractHeadersFromRawHeaderLines(@NonNull List<@NonNull String> rawHeaderLines) | Given a list of raw HTTP header lines, convert them into a normalized case-insensitive, order-preserving map which "inflates" comma-separated headers into distinct values where permitted according to RFC 7230/9110.
extractLocalesFromAcceptLanguageHeaderValue(@NonNull String acceptLanguageHeaderValue) | Parses an Accept-Language header value into a best-effort ordered list of Locales.
extractPathFromUrl(@NonNull String url, @NonNull Boolean performDecoding) | Normalizes a URL or path into a canonical request path and optionally performs percent-decoding on the path.
extractQueryParametersFromQuery(@NonNull String query, @NonNull QueryFormat queryFormat) | Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
extractQueryParametersFromQuery(@NonNull String query, @NonNull QueryFormat queryFormat, @NonNull Charset charset) | Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
extractQueryParametersFromUrl(@NonNull String url, @NonNull QueryFormat queryFormat) | Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
extractQueryParametersFromUrl(@NonNull String url, @NonNull QueryFormat queryFormat, @NonNull Charset charset) | Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
extractRawQueryFromUrl | Extracts the raw (un-decoded) query component from a URL.
trimAggressively(@Nullable String string) | A "stronger" version of String.trim() which discards any kind of whitespace or invisible separator.
trimAggressivelyToEmpty(@Nullable String string) | Aggressively trims Unicode whitespace from the given string and returns "" if the input is null.
trimAggressivelyToNull(@Nullable String string) | Aggressively trims Unicode whitespace from the given string and returns null if the result is empty.
-
Method Details
-
extractQueryParametersFromQuery
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractQueryParametersFromQuery(@NonNull String query, @NonNull QueryFormat queryFormat)
Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
Decodes percent-escapes using UTF-8, which is usually what you want (see extractQueryParametersFromQuery(String, QueryFormat, Charset) if you need to specify a different charset).
Pairs missing a name are ignored.
Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
- Parameters:
- query - a raw query string such as "a=1&b=2&c=%20"
- queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
- Returns:
- a map of parameter names to their distinct values, preserving first-seen name order; empty if none
- Throws:
- IllegalRequestException - if the query string contains malformed percent-encoding
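The parsing rules above (nameless pairs ignored, repeated names collected into an insertion-ordered Set) can be illustrated with a minimal sketch that uses only the JDK. This is not Soklet's implementation: QueryParsingSketch and its parse method are hypothetical names, the decoding follows form-urlencoded rules via URLDecoder, and mapping a name without "=" to the empty string is an assumption.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Soklet.
final class QueryParsingSketch {
    static Map<String, Set<String>> parse(String query) {
        // LinkedHashMap/LinkedHashSet preserve first-seen order and drop duplicate values
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) continue;
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1); // assumption: no '=' means empty value
            if (rawName.isEmpty()) continue; // pairs missing a name are ignored
            // URLDecoder applies form-urlencoded rules ('+' becomes a space) and throws
            // IllegalArgumentException on malformed percent-escapes
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            result.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1&b=2&c=%20&a=1&a=3"));
    }
}
```

In this sketch the repeated a=1 is stored once and a=3 is appended after it, which mirrors the de-duplication and ordering described above.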
-
extractQueryParametersFromQuery
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractQueryParametersFromQuery(@NonNull String query, @NonNull QueryFormat queryFormat, @NonNull Charset charset)
Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
Decodes percent-escapes using the specified charset.
Pairs missing a name are ignored.
Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
- Parameters:
- query - a raw query string such as "a=1&b=2&c=%20"
- queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
- charset - the charset to use when decoding percent-escapes
- Returns:
- a map of parameter names to their distinct values, preserving first-seen name order; empty if none
- Throws:
- IllegalRequestException - if the query string contains malformed percent-encoding
-
extractQueryParametersFromUrl
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractQueryParametersFromUrl(@NonNull String url, @NonNull QueryFormat queryFormat)
Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
Decodes percent-escapes using UTF-8, which is usually what you want (see extractQueryParametersFromUrl(String, QueryFormat, Charset) if you need to specify a different charset).
Pairs missing a name are ignored.
Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
- Parameters:
- url - a relative or absolute URL/URI string
- queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
- Returns:
- a map of parameter names to their distinct values, preserving first-seen name order; empty if none
- Throws:
- IllegalRequestException - if the URL or query contains malformed percent-encoding
-
extractQueryParametersFromUrl
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractQueryParametersFromUrl(@NonNull String url, @NonNull QueryFormat queryFormat, @NonNull Charset charset)
Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
Decodes percent-escapes using the specified charset.
Pairs missing a name are ignored.
Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
- Parameters:
- url - a relative or absolute URL/URI string
- queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
- charset - the charset to use when decoding percent-escapes
- Returns:
- a map of parameter names to their distinct values, preserving first-seen name order; empty if none
- Throws:
- IllegalRequestException - if the URL or query contains malformed percent-encoding
-
extractCookiesFromHeaders
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractCookiesFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers)
Parses Cookie request headers into a map of cookie names to values.
Header name matching is case-insensitive ("Cookie" vs "cookie"), but cookie names are case-sensitive. Values are parsed per the following liberal rules:
- Components are split on ';' unless inside a quoted string.
- Quoted values have surrounding quotes removed and common backslash escapes unescaped.
- Percent-escapes are decoded as UTF-8. '+' is not treated specially.
Multiple occurrences of the same cookie name are collected into a Set in insertion order.
- Parameters:
- headers - request headers as a multimap of header name to values (must be non-null)
- Returns:
- a map of cookie name to distinct values; empty if no valid cookies are present
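The liberal cookie rules listed above can be sketched for a single Cookie header value using only the JDK. This is an illustration under stated assumptions, not Soklet's code: CookieParsingSketch is a hypothetical name, only the backslash-quote escape is unescaped, and '+' is protected before decoding so that URLDecoder does not turn it into a space.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Soklet.
final class CookieParsingSketch {
    static Map<String, Set<String>> parse(String cookieHeaderValue) {
        Map<String, Set<String>> cookies = new LinkedHashMap<>();
        for (String component : splitOutsideQuotes(cookieHeaderValue)) {
            int eq = component.indexOf('=');
            if (eq <= 0) continue; // no '=' or nothing before it
            String name = component.substring(0, eq).trim(); // names stay case-sensitive
            String value = component.substring(eq + 1).trim();
            if (name.isEmpty()) continue;
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                // Strip surrounding quotes; assumption: only \" is unescaped here
                value = value.substring(1, value.length() - 1).replace("\\\"", "\"");
            }
            // Protect '+' so URLDecoder does not treat it as an encoded space
            value = URLDecoder.decode(value.replace("+", "%2B"), StandardCharsets.UTF_8);
            cookies.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(value);
        }
        return cookies;
    }

    // Splits on ';' except where the ';' sits inside a double-quoted string
    static List<String> splitOutsideQuotes(String s) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && inQuotes && i + 1 < s.length()) {
                current.append(c).append(s.charAt(++i)); // keep the escaped character together
            } else if (c == '"') {
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ';' && !inQuotes) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1; b=\"x;y\"; c=d%20e+f; a=2"));
    }
}
```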
-
extractPathFromUrl
public static @NonNull String extractPathFromUrl(@NonNull String url, @NonNull Boolean performDecoding)
Normalizes a URL or path into a canonical request path and optionally performs percent-decoding on the path.
For example, "https://www.soklet.com/ab%20c?one=two" would be normalized to "/ab c".
The OPTIONS * special case returns "*".
Behavior:
- If input starts with http:// or https://, the path portion is extracted.
- Ensures the result begins with '/'.
- Removes any trailing '/' (except for the root path '/').
- Safely normalizes path traversals, e.g. path '/a/../b' would be normalized to '/b'.
- Strips any query string.
- Applies aggressive trimming of Unicode whitespace.
- Rejects malformed percent-encoding when decoding is enabled.
- Parameters:
- url - a URL or path to normalize
- performDecoding - true if decoding should be performed on the path (e.g. replace %20 with a space character), false otherwise
- Returns:
- the normalized path, or "/" for empty input
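The normalization steps listed above can be sketched with JDK classes only. PathNormalizationSketch is a hypothetical name and this is not Soklet's implementation; in particular, collapsing empty segments, decoding each segment after traversal handling, and using String.strip() in place of the aggressive trim are assumptions of this sketch.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not part of Soklet.
final class PathNormalizationSketch {
    static String normalize(String url, boolean performDecoding) {
        String path = url.strip();
        if (path.equals("*")) return "*"; // OPTIONS * special case

        // Strip the query string first so a '/' inside it cannot be mistaken for the path
        int queryStart = path.indexOf('?');
        if (queryStart >= 0) path = path.substring(0, queryStart);

        // For absolute http(s) URLs, keep only what follows the authority
        if (path.regionMatches(true, 0, "http://", 0, 7) || path.regionMatches(true, 0, "https://", 0, 8)) {
            int authorityStart = path.indexOf("://") + 3;
            int slash = path.indexOf('/', authorityStart);
            path = slash < 0 ? "/" : path.substring(slash);
        }

        // Resolve "." and ".." segments; ".." at the root is dropped rather than escaping
        Deque<String> segments = new ArrayDeque<>();
        for (String segment : path.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) continue;
            if (segment.equals("..")) {
                segments.pollLast();
                continue;
            }
            // URLDecoder throws IllegalArgumentException on malformed percent-escapes;
            // '+' is protected so that it is not decoded to a space
            segments.addLast(performDecoding
                    ? URLDecoder.decode(segment.replace("+", "%2B"), StandardCharsets.UTF_8)
                    : segment);
        }
        // Joining segments yields a leading '/' and no trailing '/'
        return "/" + String.join("/", segments);
    }

    public static void main(String[] args) {
        System.out.println(normalize("https://www.soklet.com/ab%20c?one=two", true));
        System.out.println(normalize("/a/../b/", false));
    }
}
```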
-
extractRawQueryFromUrl
Extracts the raw (un-decoded) query component from a URL.
For example, "/path?a=b&c=d%20e" would return "a=b&c=d%20e".
- Parameters:
- url - a raw URL or path
- Returns:
- the raw query component, or Optional.empty() if none
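A minimal sketch of raw query extraction follows. RawQuerySketch is a hypothetical name; treating an empty query as absent and cutting the query at a '#' fragment are assumptions of this sketch, not documented Soklet behavior.

```java
import java.util.Optional;

// Hypothetical illustration only; not part of Soklet.
final class RawQuerySketch {
    static Optional<String> rawQuery(String url) {
        int queryStart = url.indexOf('?');
        if (queryStart < 0) return Optional.empty();
        // Assumption: a '#' fragment, if present, ends the query component
        int fragmentStart = url.indexOf('#', queryStart);
        String query = fragmentStart < 0
                ? url.substring(queryStart + 1)
                : url.substring(queryStart + 1, fragmentStart);
        // Assumption: "?" followed by nothing counts as no query; no decoding is performed
        return query.isEmpty() ? Optional.empty() : Optional.of(query);
    }

    public static void main(String[] args) {
        System.out.println(rawQuery("/path?a=b&c=d%20e"));
    }
}
```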
-
encodeQueryParameters
public static @NonNull String encodeQueryParameters(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> queryParameters, @NonNull QueryFormat queryFormat)
Encodes decoded query parameters into a raw query string.
For example, given {a=[b], c=[d e]} and QueryFormat.RFC_3986_STRICT, returns "a=b&c=d%20e".
- Parameters:
- queryParameters - the decoded query parameters
- queryFormat - the encoding strategy
- Returns:
- the encoded query string, or the empty string if no parameters
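The strict RFC 3986 example above can be reproduced with a small JDK-only sketch. QueryEncodingSketch is a hypothetical name, and adapting URLEncoder output (which is form-urlencoded) to RFC 3986 unreserved-character rules is a common workaround rather than Soklet's actual approach.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical illustration only; not part of Soklet.
final class QueryEncodingSketch {
    // URLEncoder emits '+' for spaces, leaves '*' unencoded, and escapes '~';
    // the replacements below adjust that output to RFC 3986 unreserved rules
    static String encodeStrict(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8)
                .replace("+", "%20")
                .replace("*", "%2A")
                .replace("%7E", "~");
    }

    static String encode(Map<String, Set<String>> queryParameters) {
        StringJoiner joiner = new StringJoiner("&");
        queryParameters.forEach((name, values) -> {
            for (String value : values) {
                joiner.add(encodeStrict(name) + "=" + encodeStrict(value));
            }
        });
        return joiner.toString(); // empty string when there are no parameters
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parameters = new LinkedHashMap<>();
        parameters.put("a", Set.of("b"));
        parameters.put("c", Set.of("d e"));
        System.out.println(encode(parameters));
    }
}
```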
-
extractLocalesFromAcceptLanguageHeaderValue
public static @NonNull List<@NonNull Locale> extractLocalesFromAcceptLanguageHeaderValue(@NonNull String acceptLanguageHeaderValue)
Parses an Accept-Language header value into a best-effort ordered list of Locales.
Quality weights are honored by Locale.LanguageRange.parse(String); results are then converted to Locale instances that represent the client-supplied language tags. Wildcard ranges are ignored unless they include a language component (e.g. en-* becomes en). On parse failure, an empty list is returned.
- Parameters:
- acceptLanguageHeaderValue - the raw header value (must be non-null)
- Returns:
- locales in descending preference order; empty if none could be resolved
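The approach described above (weights handled by Locale.LanguageRange.parse(String), then conversion to Locale) can be sketched as follows. AcceptLanguageSketch is a hypothetical name, and the way wildcard subtags are trimmed here is an assumption about one reasonable reading of "en-* becomes en", not Soklet's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Soklet.
final class AcceptLanguageSketch {
    static List<Locale> parse(String acceptLanguageHeaderValue) {
        List<Locale> locales = new ArrayList<>();
        try {
            // LanguageRange.parse returns ranges sorted by descending weight
            for (Locale.LanguageRange range : Locale.LanguageRange.parse(acceptLanguageHeaderValue)) {
                String tag = range.getRange();
                if (tag.equals("*")) continue; // bare wildcard carries no language
                int wildcard = tag.indexOf("-*");
                if (wildcard >= 0) tag = tag.substring(0, wildcard); // assumption: "en-*" is reduced to "en"
                Locale locale = Locale.forLanguageTag(tag);
                if (!locale.getLanguage().isEmpty() && !locales.contains(locale)) {
                    locales.add(locale);
                }
            }
        } catch (IllegalArgumentException e) {
            return List.of(); // malformed header: empty list
        }
        return locales;
    }

    public static void main(String[] args) {
        System.out.println(parse("fr-CA;q=0.8, en-US, *;q=0.1"));
    }
}
```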
-
extractEffectiveOrigin
public static @NonNull Optional<String> extractEffectiveOrigin(@NonNull Utilities.EffectiveOriginResolver effectiveOriginResolver)
Best-effort attempt to determine a client's effective origin by examining request headers.
An effective origin in this context is defined as <scheme>://host<:optional port>, with no path or query components.
Soklet is generally the "last hop" behind a load balancer/reverse proxy but may also be accessed directly by clients.
Normally a load balancer/reverse proxy/other upstream proxies will provide information about the true source of the request through headers like the following:
- Host
- Forwarded
- Origin
- X-Forwarded-Proto
- X-Forwarded-Protocol
- X-Url-Scheme
- Front-End-Https
- X-Forwarded-Ssl
- X-Forwarded-Host
- X-Forwarded-Port
This method may take these and other headers into account when determining an effective origin.
For example, the following would be legal effective origins returned from this method:
- https://www.soklet.com
- http://www.fake.com:1234
The following would NOT be legal effective origins:
- www.soklet.com (missing protocol)
- https://www.soklet.com/ (trailing slash)
- https://www.soklet.com/test (trailing slash, path)
- https://www.soklet.com/test?abc=1234 (trailing slash, path, query)
Origin is treated as a fallback signal only and will not override a conflicting Host or forwarded host value.
Forwarded headers are only used when permitted by Utilities.EffectiveOriginResolver.TrustPolicy. When using Utilities.EffectiveOriginResolver.TrustPolicy.TRUST_PROXY_ALLOWLIST, you must provide a trusted proxy predicate or allowlist. If the remote address is missing or not trusted, forwarded headers are ignored.
Extraction order is: trusted forwarded headers → Host → (optional) Origin fallback. If Utilities.EffectiveOriginResolver.allowOriginFallback(Boolean) is unset, Origin fallback is enabled only for Utilities.EffectiveOriginResolver.TrustPolicy.TRUST_ALL.
- Parameters:
- effectiveOriginResolver - request headers and trust settings
- Returns:
- the effective origin, or Optional.empty() if it could not be determined
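The shape of a legal effective origin (scheme, host, optional port, nothing else) can be checked with a small regular expression. This sketch only validates the format shown in the examples above; it does not reproduce the header resolution or trust policy logic. EffectiveOriginFormatSketch is a hypothetical name, and the pattern deliberately ignores bracketed IPv6 hosts.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Soklet.
final class EffectiveOriginFormatSketch {
    // scheme "://" host [":" port], with no path, query, or fragment.
    // Assumption: bracketed IPv6 literals such as [::1] are out of scope for this sketch.
    private static final Pattern EFFECTIVE_ORIGIN =
            Pattern.compile("[A-Za-z][A-Za-z0-9+.-]*://[^/?#:]+(:\\d{1,5})?");

    static boolean isEffectiveOrigin(String candidate) {
        return EFFECTIVE_ORIGIN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isEffectiveOrigin("https://www.soklet.com"));
        System.out.println(isEffectiveOrigin("https://www.soklet.com/"));
    }
}
```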
-
extractContentTypeFromHeaders
public static @NonNull Optional<String> extractContentTypeFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers)
Extracts the media type (without parameters) from the first Content-Type header.
For example, "text/html; charset=UTF-8" → "text/html".
- Parameters:
- headers - request/response headers (must be non-null)
- Returns:
- the media type if present; otherwise Optional.empty()
-
extractContentTypeFromHeaderValue
public static @NonNull Optional<String> extractContentTypeFromHeaderValue(@Nullable String contentTypeHeaderValue)
Extracts the media type (without parameters) from a Content-Type header value.
For example, "application/json; charset=UTF-8" → "application/json".
- Parameters:
- contentTypeHeaderValue - the raw header value; may be null or blank
- Returns:
- the media type if present; otherwise Optional.empty()
-
extractCharsetFromHeaders
public static @NonNull Optional<Charset> extractCharsetFromHeaders(@NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> headers)
Extracts the Charset from the first Content-Type header, if present and valid.
Tolerates additional parameters and arbitrary whitespace. Invalid or unknown charset tokens yield Optional.empty().
- Parameters:
- headers - request/response headers (must be non-null)
- Returns:
- the charset declared by the header; otherwise Optional.empty()
-
extractCharsetFromHeaderValue
public static @NonNull Optional<Charset> extractCharsetFromHeaderValue(@Nullable String contentTypeHeaderValue)
Extracts the charset=... parameter from a Content-Type header value.
Parsing is forgiving: parameters may appear in any order and with arbitrary spacing. If a charset is found, it is validated via Charset.forName(String); invalid names result in Optional.empty().
- Parameters:
- contentTypeHeaderValue - the raw header value; may be null or blank
- Returns:
- the resolved charset if present and valid; otherwise Optional.empty()
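The forgiving Content-Type parsing described in the header-value methods above can be sketched with JDK classes. ContentTypeSketch and its method names are hypothetical, and stripping double quotes around the charset token is an assumption of this sketch rather than documented behavior.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical illustration only; not part of Soklet.
final class ContentTypeSketch {
    // Media type is everything before the first ';', trimmed
    static Optional<String> mediaType(String contentTypeHeaderValue) {
        if (contentTypeHeaderValue == null) return Optional.empty();
        String mediaType = contentTypeHeaderValue.split(";", 2)[0].strip();
        return mediaType.isEmpty() ? Optional.empty() : Optional.of(mediaType);
    }

    // Scans parameters in any order for charset=..., tolerating extra whitespace
    static Optional<Charset> charset(String contentTypeHeaderValue) {
        if (contentTypeHeaderValue == null) return Optional.empty();
        String[] parts = contentTypeHeaderValue.split(";");
        for (int i = 1; i < parts.length; i++) {
            String[] nameAndValue = parts[i].split("=", 2);
            if (nameAndValue.length == 2 && nameAndValue[0].strip().equalsIgnoreCase("charset")) {
                String charsetName = nameAndValue[1].strip().replace("\"", ""); // assumption: quotes are dropped
                try {
                    return Optional.of(Charset.forName(charsetName));
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(mediaType("text/html; charset=UTF-8"));
        System.out.println(charset("application/json; boundary=x;  charset = UTF-8"));
    }
}
```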
-
trimAggressively
A "stronger" version of String.trim() which discards any kind of whitespace or invisible separator.
In a web environment with user-supplied inputs, this is the behavior we want the vast majority of the time. For example, users copy-paste URLs from Microsoft Word or Outlook and it's easy to accidentally include a U+202F "Narrow No-Break Space (NNBSP)" character at the end, which might break parsing.
See https://www.compart.com/en/unicode/U+202F for details.
- Parameters:
- string - the string to trim
- Returns:
- the trimmed string, or null if the input string is null or the trimmed representation is of length 0
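To show why String.trim() is not enough for characters such as U+202F, here is a minimal sketch of a Unicode-aware trim. AggressiveTrimSketch is a hypothetical name, and the exact set of code points Soklet removes is not stated here, so the combination of Character.isWhitespace and Character.isSpaceChar is an assumption.

```java
// Hypothetical illustration only; not part of Soklet.
final class AggressiveTrimSketch {
    // isWhitespace misses no-break spaces such as U+00A0 and U+202F;
    // isSpaceChar covers the Unicode space separator categories, so both are checked
    static boolean isTrimmable(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    static String trimToNull(String string) {
        if (string == null) return null;
        int start = 0;
        int end = string.length();
        while (start < end) {
            int codePoint = string.codePointAt(start);
            if (!isTrimmable(codePoint)) break;
            start += Character.charCount(codePoint);
        }
        while (end > start) {
            int codePoint = string.codePointBefore(end);
            if (!isTrimmable(codePoint)) break;
            end -= Character.charCount(codePoint);
        }
        return start == end ? null : string.substring(start, end);
    }

    public static void main(String[] args) {
        // The trailing U+202F would survive String.trim()
        System.out.println(trimToNull("https://www.soklet.com\u202F"));
    }
}
```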
-
trimAggressivelyToNull
Aggressively trims Unicode whitespace from the given string and returns null if the result is empty.
See trimAggressively(String) for details on which code points are removed.
- Parameters:
- string - the input string; may be null
- Returns:
- a trimmed, non-empty string; or null if input was null or trimmed to empty
-
trimAggressivelyToEmpty
Aggressively trims Unicode whitespace from the given string and returns "" if the input is null.
See trimAggressively(String) for details on which code points are removed.
- Parameters:
- string - the input string; may be null
- Returns:
- a trimmed string (never null); "" if input was null
-
extractHeadersFromRawHeaderLines
public static @NonNull Map<@NonNull String, @NonNull Set<@NonNull String>> extractHeadersFromRawHeaderLines(@NonNull List<@NonNull String> rawHeaderLines)
Given a list of raw HTTP header lines, convert them into a normalized case-insensitive, order-preserving map which "inflates" comma-separated headers into distinct values where permitted according to RFC 7230/9110.
For example, given these raw header lines:
    List<String> lines = List.of(
      "Cache-Control: no-cache, no-store",
      "Set-Cookie: a=b; Path=/; HttpOnly",
      "Set-Cookie: c=d; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/"
    );
The result of parsing would look like this:
    result.get("cache-control") -> [
      "no-cache",
      "no-store"
    ]
    result.get("set-cookie") -> [
      "a=b; Path=/; HttpOnly",
      "c=d; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/"
    ]
Keys in the returned map are case-insensitive and are guaranteed to be in the same order as encountered in rawHeaderLines.
Values in the returned map are guaranteed to be in the same order as encountered in rawHeaderLines.
- Parameters:
- rawHeaderLines - the raw HTTP header lines to parse
- Returns:
- a normalized mapping of header name keys to values
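The "inflating" behavior in the example above can be approximated with a short JDK-only sketch. HeaderLinesSketch is a hypothetical name and this is not Soklet's implementation: lowercasing names stands in for a case-insensitive map, the allowlist of comma-separated headers is a small illustrative assumption, and commas inside quoted strings are not handled.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Soklet.
final class HeaderLinesSketch {
    // Assumption: an illustrative subset of headers whose values are comma-separated lists.
    // Set-Cookie is intentionally absent because its Expires attribute contains a comma.
    private static final Set<String> COMMA_SEPARATED =
            Set.of("accept", "accept-encoding", "accept-language", "cache-control", "vary");

    static Map<String, Set<String>> parse(List<String> rawHeaderLines) {
        // Lowercased keys approximate case-insensitive lookup while keeping first-seen order
        Map<String, Set<String>> headers = new LinkedHashMap<>();
        for (String line : rawHeaderLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) continue; // not a "Name: value" line
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            Set<String> values = headers.computeIfAbsent(name, k -> new LinkedHashSet<>());
            if (COMMA_SEPARATED.contains(name)) {
                for (String part : value.split(",")) {
                    if (!part.isBlank()) values.add(part.trim());
                }
            } else {
                values.add(value);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(parse(List.of(
                "Cache-Control: no-cache, no-store",
                "Set-Cookie: a=b; Path=/; HttpOnly")));
    }
}
```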
-