java.lang.Object

com.soklet.Utilities

@ThreadSafe public final class Utilities extends Object

A non-instantiable collection of utility methods.

Author: Mark Allen

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static String | encodeQueryParameters(Map<String, Set<String>> queryParameters, QueryFormat queryFormat) | Encodes decoded query parameters into a raw query string. |
| static Optional<Charset> | extractCharsetFromHeaders(Map<String, Set<String>> headers) | Extracts the Charset from the first Content-Type header, if present and valid. |
| static Optional<Charset> | extractCharsetFromHeaderValue(String contentTypeHeaderValue) | Extracts the charset=... parameter from a Content-Type header value. |
| static Optional<String> | extractClientUrlPrefixFromHeaders(Map<String, Set<String>> headers) | Best-effort attempt to determine a client's URL prefix by examining request headers. |
| static Optional<String> | extractContentTypeFromHeaders(Map<String, Set<String>> headers) | Extracts the media type (without parameters) from the first Content-Type header. |
| static Optional<String> | extractContentTypeFromHeaderValue(String contentTypeHeaderValue) | Extracts the media type (without parameters) from a Content-Type header value. |
| static Map<String, Set<String>> | extractCookiesFromHeaders(Map<String, Set<String>> headers) | Parses Cookie request headers into a map of cookie names to values. |
| static Map<String, Set<String>> | extractHeadersFromRawHeaderLines(List<String> rawHeaderLines) | Given a list of raw HTTP header lines, convert them into a normalized case-insensitive, order-preserving map which "inflates" comma-separated headers into distinct values where permitted according to RFC 7230/9110. |
| static List<Locale> | extractLocalesFromAcceptLanguageHeaderValue(String acceptLanguageHeaderValue) | Parses an Accept-Language header value into a best-effort ordered list of Locales. |
| static String | extractPathFromUrl(String url, Boolean performDecoding) | Normalizes a URL or path into a canonical request path and optionally performs percent-decoding on the path. |
| static Map<String, Set<String>> | extractQueryParametersFromQuery(String query, QueryFormat queryFormat) | Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values. |
| static Map<String, Set<String>> | extractQueryParametersFromQuery(String query, QueryFormat queryFormat, Charset charset) | Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values. |
| static Map<String, Set<String>> | extractQueryParametersFromUrl(String url, QueryFormat queryFormat) | Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values. |
| static Map<String, Set<String>> | extractQueryParametersFromUrl(String url, QueryFormat queryFormat, Charset charset) | Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values. |
| static Optional<String> | extractRawQueryFromUrl(String url) | Extracts the raw (un-decoded) query component from a URL. |
| static String | trimAggressively(String string) | A "stronger" version of String.trim() which discards any kind of whitespace or invisible separator. |
| static String | trimAggressivelyToEmpty(String string) | Aggressively trims Unicode whitespace from the given string and returns "" if the input is null. |
| static String | trimAggressivelyToNull(String string) | Aggressively trims Unicode whitespace from the given string and returns null if the result is empty. |

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- extractQueryParametersFromQuery
  
  @Nonnull public static Map<String, Set<String>> extractQueryParametersFromQuery(@Nonnull String query, @Nonnull QueryFormat queryFormat)
  
  Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
  Decodes percent-escapes using UTF-8, which is usually what you want (see extractQueryParametersFromQuery(String, QueryFormat, Charset) if you need to specify a different charset).
  Pairs missing a name are ignored.
  Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
  
  Parameters:
  
  query - a raw query string such as "a=1&b=2&c=%20"
  
  queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
  
  Returns:
  
  a map of parameter names to their distinct values, preserving first-seen name order; empty if none
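
The parsing rules described for this method can be approximated with the JDK alone. The sketch below is not Soklet's implementation: the class `QueryParsingSketch`, its `parse` method, and the `formStyle` flag (standing in for `QueryFormat`) are hypothetical names, and treating a bare name without '=' as having an empty value is an assumption.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class QueryParsingSketch {
    // formStyle=true: '+' decodes to a space (application/x-www-form-urlencoded).
    // formStyle=false: '+' stays literal ("strict" RFC 3986).
    static Map<String, Set<String>> parse(String query, boolean formStyle) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1); // assumption: bare name => ""
            if (rawName.isEmpty()) continue; // pairs missing a name are ignored
            // LinkedHashMap keeps first-seen name order; LinkedHashSet de-duplicates values.
            result.computeIfAbsent(decode(rawName, formStyle), k -> new LinkedHashSet<>())
                  .add(decode(rawValue, formStyle));
        }
        return result;
    }

    private static String decode(String s, boolean formStyle) {
        // URLDecoder always maps '+' to ' ', so '+' is protected first in strict mode.
        // Malformed percent-escapes make URLDecoder throw IllegalArgumentException.
        String input = formStyle ? s : s.replace("+", "%2B");
        return URLDecoder.decode(input, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1&b=2&c=%20&a=1&a=3&=x", true));
        System.out.println(parse("q=a+b", false));
    }
}
```

The `LinkedHashMap`/`LinkedHashSet` pairing is one simple way to get both the ordering and the de-duplication guarantees listed above.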
- extractQueryParametersFromQuery
  
  @Nonnull public static Map<String, Set<String>> extractQueryParametersFromQuery(@Nonnull String query, @Nonnull QueryFormat queryFormat, @Nonnull Charset charset)
  
  Parses a query string such as "a=1&b=2&c=%20" into a multimap of names to values.
  Decodes percent-escapes using the specified charset.
  Pairs missing a name are ignored.
  Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
  
  Parameters:
  
  query - a raw query string such as "a=1&b=2&c=%20"
  
  queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
  
  charset - the charset to use when decoding percent-escapes
  
  Returns:
  
  a map of parameter names to their distinct values, preserving first-seen name order; empty if none
- extractQueryParametersFromUrl
  
  @Nonnull public static Map<String, Set<String>> extractQueryParametersFromUrl(@Nonnull String url, @Nonnull QueryFormat queryFormat)
  
  Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
  Decodes percent-escapes using UTF-8, which is usually what you want (see extractQueryParametersFromUrl(String, QueryFormat, Charset) if you need to specify a different charset).
  Pairs missing a name are ignored.
  Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
  
  Parameters:
  
  url - a relative or absolute URL/URI string
  
  queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
  
  Returns:
  
  a map of parameter names to their distinct values, preserving first-seen name order; empty if none/invalid
- extractQueryParametersFromUrl
  
  @Nonnull public static Map<String, Set<String>> extractQueryParametersFromUrl(@Nonnull String url, @Nonnull QueryFormat queryFormat, @Nonnull Charset charset)
  
  Parses query strings from relative or absolute URLs such as "/example?a=1&b=2&c=%20" or "https://www.soklet.com/example?a=1&b=2&c=%20" into a multimap of names to values.
  Decodes percent-escapes using the specified charset.
  Pairs missing a name are ignored.
  Multiple occurrences of the same name are collected into a Set in insertion order (duplicates are de-duplicated).
  
  Parameters:
  
  url - a relative or absolute URL/URI string
  
  queryFormat - how to decode: application/x-www-form-urlencoded or "strict" RFC 3986
  
  charset - the charset to use when decoding percent-escapes
  
  Returns:
  
  a map of parameter names to their distinct values, preserving first-seen name order; empty if none/invalid
- extractCookiesFromHeaders
  @Nonnull public static Map<String, Set<String>> extractCookiesFromHeaders(@Nonnull Map<String, Set<String>> headers)
  
  Parses Cookie request headers into a map of cookie names to values.
  Header name matching is case-insensitive ("Cookie" vs "cookie"), but cookie names are case-sensitive. Values are parsed per the following liberal rules:
  
  - Components are split on ';' unless inside a quoted string.
  - Quoted values have surrounding quotes removed and common backslash escapes unescaped.
  - Percent-escapes are decoded as UTF-8. '+' is not treated specially.
  - Multiple occurrences of the same cookie name are collected into a Set in insertion order.
  
  Parameters:
  
  headers - request headers as a multimap of header name to values (must be non-null)
  
  Returns:
  
  a map of cookie name to distinct values; empty if no valid cookies are present
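
The liberal cookie rules listed for this method can be illustrated with a JDK-only sketch. It is not Soklet's code: `CookieParsingSketch` and its methods are hypothetical, it takes a single Cookie header value instead of the full headers map, and keeping the raw value when a percent-escape is malformed is an assumption.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class CookieParsingSketch {
    static Map<String, Set<String>> parse(String cookieHeaderValue) {
        Map<String, Set<String>> cookies = new LinkedHashMap<>();
        for (String component : splitOutsideQuotes(cookieHeaderValue)) {
            int eq = component.indexOf('=');
            if (eq < 0) continue;
            String name = component.substring(0, eq).trim(); // cookie names stay case-sensitive
            String value = component.substring(eq + 1).trim();
            if (name.isEmpty()) continue;
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                // Drop the surrounding quotes, then unescape backslash pairs such as \" and \\.
                value = value.substring(1, value.length() - 1).replaceAll("\\\\(.)", "$1");
            }
            try {
                // '+' is protected so that it is not decoded to a space.
                value = URLDecoder.decode(value.replace("+", "%2B"), StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                // Assumption: keep the raw value if a percent-escape is malformed.
            }
            cookies.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(value);
        }
        return cookies;
    }

    // Splits on ';' but ignores any ';' that appears inside a double-quoted string.
    static List<String> splitOutsideQuotes(String s) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && inQuotes && i + 1 < s.length()) {
                current.append(c).append(s.charAt(++i)); // keep the escape pair intact
            } else if (c == '"') {
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ';' && !inQuotes) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1; b=\"x;y\"; c=hello%20world; a=2; d=1+1"));
    }
}
```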
- extractPathFromUrl
  @Nonnull public static String extractPathFromUrl(@Nonnull String url, @Nonnull Boolean performDecoding)
  
  Normalizes a URL or path into a canonical request path and optionally performs percent-decoding on the path.
  For example, "https://www.soklet.com/ab%20c?one=two" would be normalized to "/ab c".
  The OPTIONS * special case returns "*".
  Behavior:
  
  - If input starts with http:// or https://, the path portion is extracted.
  - Ensures the result begins with '/'.
  - Removes any trailing '/' (except for the root path '/').
  - Safely normalizes path traversals, e.g. path '/a/../b' would be normalized to '/b'.
  - Strips any query string.
  - Applies aggressive trimming of Unicode whitespace.
  
  Parameters:
  
  url - a URL or path to normalize
  
  performDecoding - true if decoding should be performed on the path (e.g. replace %20 with a space character), false otherwise
  
  Returns:
  
  the normalized path, "/" for empty input
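
The normalization steps listed for this method can be approximated as follows. This is a sketch under assumptions, not Soklet's implementation: `PathNormalizationSketch` is a hypothetical name, `String.strip()` stands in for the aggressive trimming, and decoding each segment before checking for "." and ".." is a design choice of the sketch.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class PathNormalizationSketch {
    static String normalize(String url, boolean performDecoding) {
        String s = url.strip(); // approximation of "aggressive" trimming
        if (s.equals("*")) return "*"; // OPTIONS * special case
        int query = s.indexOf('?');
        if (query >= 0) s = s.substring(0, query); // strip any query string
        String lower = s.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            int pathStart = s.indexOf('/', s.indexOf("://") + 3);
            s = pathStart < 0 ? "" : s.substring(pathStart); // keep only the path portion
        }
        Deque<String> segments = new ArrayDeque<>();
        for (String rawSegment : s.split("/")) {
            String segment = performDecoding ? decode(rawSegment) : rawSegment;
            if (segment.isEmpty() || segment.equals(".")) continue;
            if (segment.equals("..")) {
                segments.pollLast(); // never climbs above the root
                continue;
            }
            segments.addLast(segment);
        }
        // A leading '/' is always present and no trailing '/' is ever produced.
        return "/" + String.join("/", segments);
    }

    private static String decode(String segment) {
        // '+' is protected because it has no special meaning in a path.
        return URLDecoder.decode(segment.replace("+", "%2B"), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(normalize("https://www.soklet.com/ab%20c?one=two", true)); // prints "/ab c"
        System.out.println(normalize("/a/../b/", false)); // prints "/b"
    }
}
```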
- extractRawQueryFromUrl
  
  @Nonnull public static Optional<String> extractRawQueryFromUrl(@Nonnull String url)
  
  Extracts the raw (un-decoded) query component from a URL.
  For example, "/path?a=b&c=d%20e" would return "a=b&c=d%20e".
  
  Parameters:
  
  url - a raw URL or path
  
  Returns:
  
  the raw query component, or Optional.empty() if none
- encodeQueryParameters
  
  @Nonnull public static String encodeQueryParameters(@Nonnull Map<String, Set<String>> queryParameters, @Nonnull QueryFormat queryFormat)
  
  Encodes decoded query parameters into a raw query string.
  For example, given {a=[b], c=[d e]} and QueryFormat.RFC_3986_STRICT, returns "a=b&c=d%20e".
  
  Parameters:
  
  queryParameters - the decoded query parameters
  
  queryFormat - the encoding strategy
  
  Returns:
  
  the encoded query string, or the empty string if no parameters
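
The encoding direction can be sketched with `java.net.URLEncoder`. The names `QueryEncodingSketch`, `encode`, and the `formStyle` flag (standing in for `QueryFormat`) are hypothetical, and the strict mode here is only an approximation of RFC 3986 because `URLEncoder` follows HTML form rules for a few characters such as '~' and '*'.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class QueryEncodingSketch {
    static String encode(Map<String, Set<String>> params, boolean formStyle) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : params.entrySet()) {
            for (String value : entry.getValue()) {
                pairs.add(encodeComponent(entry.getKey(), formStyle)
                        + "=" + encodeComponent(value, formStyle));
            }
        }
        return String.join("&", pairs); // empty string when there are no parameters
    }

    private static String encodeComponent(String s, boolean formStyle) {
        // URLEncoder emits '+' for spaces and "%2B" for literal plus signs,
        // so rewriting '+' to "%20" afterwards is safe in strict mode.
        String encoded = URLEncoder.encode(s, StandardCharsets.UTF_8);
        return formStyle ? encoded : encoded.replace("+", "%20");
    }

    public static void main(String[] args) {
        Map<String, Set<String>> params = new LinkedHashMap<>();
        params.put("a", Set.of("b"));
        params.put("c", Set.of("d e"));
        System.out.println(encode(params, false)); // prints "a=b&c=d%20e"
        System.out.println(encode(params, true));  // prints "a=b&c=d+e"
    }
}
```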
- extractLocalesFromAcceptLanguageHeaderValue
  
  @Nonnull public static List<Locale> extractLocalesFromAcceptLanguageHeaderValue(@Nonnull String acceptLanguageHeaderValue)
  
  Parses an Accept-Language header value into a best-effort ordered list of Locales.
  Quality weights are honored by Locale.LanguageRange.parse(String); results are then mapped to available JVM locales. Unknown or unavailable language ranges are skipped. On parse failure, an empty list is returned.
  
  Parameters:
  
  acceptLanguageHeaderValue - the raw header value (must be non-null)
  
  Returns:
  
  locales in descending preference order; empty if none could be resolved
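
`Locale.LanguageRange.parse(String)` and `Locale.lookup(...)` are standard JDK calls, so the general approach can be sketched without Soklet. The class `AcceptLanguageSketch` is hypothetical, and resolving each range with `Locale.lookup` against `Locale.getAvailableLocales()` is an assumption about how the mapping to JVM locales might be done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class AcceptLanguageSketch {
    static List<Locale> parse(String acceptLanguageHeaderValue) {
        try {
            // parse(...) returns ranges sorted by descending quality weight.
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguageHeaderValue);
            List<Locale> available = Arrays.asList(Locale.getAvailableLocales());
            List<Locale> result = new ArrayList<>();
            for (Locale.LanguageRange range : ranges) {
                // lookup(...) returns null when no available locale matches the range.
                Locale match = Locale.lookup(List.of(range), available);
                if (match != null && !result.contains(match)) result.add(match);
            }
            return result;
        } catch (IllegalArgumentException e) {
            return List.of(); // ill-formed header value
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("en-US;q=0.1, en;q=0.9"));
    }
}
```

Which locales are resolved depends on the locale data installed in the running JVM, so results for less common language tags can differ between environments.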
- extractClientUrlPrefixFromHeaders
  @Nonnull public static Optional<String> extractClientUrlPrefixFromHeaders(@Nonnull Map<String, Set<String>> headers)
  
  Best-effort attempt to determine a client's URL prefix by examining request headers.
  A URL prefix in this context is defined as <scheme>://<host><:optional port>, with no path or query components.
  Soklet is generally the "last hop" behind a load balancer/reverse proxy and is typically not accessed directly by clients.
  Normally a load balancer/reverse proxy/other upstream proxies will provide information about the true source of the request through headers like the following:
  
  - Host
  - Forwarded
  - Origin
  - X-Forwarded-Proto
  - X-Forwarded-Protocol
  - X-Url-Scheme
  - Front-End-Https
  - X-Forwarded-Ssl
  - X-Forwarded-Host
  - X-Forwarded-Port
  
  This method may take these and other headers into account when determining URL prefix.
  For example, the following would be legal URL prefixes returned from this method:
  
  - https://www.soklet.com
  - http://www.fake.com:1234
  
  The following would NOT be legal URL prefixes:
  
  - www.soklet.com (missing scheme)
  - https://www.soklet.com/ (trailing slash)
  - https://www.soklet.com/test (path)
  - https://www.soklet.com/test?abc=1234 (path, query)
  
  Parameters:
  
  headers - HTTP request headers
  
  Returns:
  
  the URL prefix, or Optional.empty() if it could not be determined
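
As a rough illustration of how forwarding headers might be combined, the sketch below uses only two of the headers listed for this method. Everything in it is an assumption: the class `ClientUrlPrefixSketch`, the precedence of X-Forwarded-Host over Host, and the decision to return empty when X-Forwarded-Proto is absent. Soklet's actual resolution logic is not described here and may differ.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class ClientUrlPrefixSketch {
    static Optional<String> prefix(Map<String, Set<String>> headers) {
        // Assumption: a proxy-supplied host wins over the plain Host header.
        Optional<String> host = first(headers, "X-Forwarded-Host").or(() -> first(headers, "Host"));
        Optional<String> scheme = first(headers, "X-Forwarded-Proto");
        if (host.isEmpty() || scheme.isEmpty()) return Optional.empty();
        // No trailing slash, path, or query is ever appended.
        return Optional.of(scheme.get().toLowerCase(Locale.ROOT) + "://" + host.get());
    }

    // Case-insensitive lookup of the first non-blank value for a header name.
    private static Optional<String> first(Map<String, Set<String>> headers, String name) {
        return headers.entrySet().stream()
                .filter(entry -> entry.getKey().equalsIgnoreCase(name))
                .flatMap(entry -> entry.getValue().stream())
                .map(String::strip)
                .filter(value -> !value.isEmpty())
                .findFirst();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> headers = Map.of(
                "host", Set.of("www.soklet.com"),
                "X-Forwarded-Proto", Set.of("https"));
        System.out.println(prefix(headers)); // prints "Optional[https://www.soklet.com]"
    }
}
```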
- extractContentTypeFromHeaders
  @Nonnull public static Optional<String> extractContentTypeFromHeaders(@Nonnull Map<String, Set<String>> headers)
  
  Extracts the media type (without parameters) from the first Content-Type header.
  For example, "text/html; charset=UTF-8" → "text/html".
  
  Parameters:
  
  headers - request/response headers (must be non-null)
  
  Returns:
  
  the media type if present; otherwise Optional.empty()
  
  See Also:
  
  extractContentTypeFromHeaderValue(String)
- extractContentTypeFromHeaderValue
  
  @Nonnull public static Optional<String> extractContentTypeFromHeaderValue(@Nullable String contentTypeHeaderValue)
  
  Extracts the media type (without parameters) from a Content-Type header value.
  For example, "application/json; charset=UTF-8" → "application/json".
  
  Parameters:
  
  contentTypeHeaderValue - the raw header value; may be null or blank
  
  Returns:
  
  the media type if present; otherwise Optional.empty()
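
Dropping the parameters from a Content-Type value amounts to taking everything before the first ';'. A minimal sketch, with the hypothetical class `ContentTypeSketch`; whether Soklet also normalizes case is not stated above, so the sketch leaves case untouched.

```java
import java.util.Optional;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class ContentTypeSketch {
    static Optional<String> mediaType(String contentTypeHeaderValue) {
        if (contentTypeHeaderValue == null) return Optional.empty();
        int semicolon = contentTypeHeaderValue.indexOf(';');
        String mediaType = (semicolon < 0
                ? contentTypeHeaderValue
                : contentTypeHeaderValue.substring(0, semicolon)).strip();
        // Blank input (or a value that starts with ';') yields no media type.
        return mediaType.isEmpty() ? Optional.empty() : Optional.of(mediaType);
    }

    public static void main(String[] args) {
        System.out.println(mediaType("application/json; charset=UTF-8")); // prints "Optional[application/json]"
        System.out.println(mediaType(null)); // prints "Optional.empty"
    }
}
```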
- extractCharsetFromHeaders
  @Nonnull public static Optional<Charset> extractCharsetFromHeaders(@Nonnull Map<String, Set<String>> headers)
  
  Extracts the Charset from the first Content-Type header, if present and valid.
  Tolerates additional parameters and arbitrary whitespace. Invalid or unknown charset tokens yield Optional.empty().
  
  Parameters:
  
  headers - request/response headers (must be non-null)
  
  Returns:
  
  the charset declared by the header; otherwise Optional.empty()
  
  See Also:
  
  extractCharsetFromHeaderValue(String)
- extractCharsetFromHeaderValue
  
  @Nonnull public static Optional<Charset> extractCharsetFromHeaderValue(@Nullable String contentTypeHeaderValue)
  
  Extracts the charset=... parameter from a Content-Type header value.
  Parsing is forgiving: parameters may appear in any order and with arbitrary spacing. If a charset is found, it is validated via Charset.forName(String); invalid names result in Optional.empty().
  
  Parameters:
  
  contentTypeHeaderValue - the raw header value; may be null or blank
  
  Returns:
  
  the resolved charset if present and valid; otherwise Optional.empty()
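
The forgiving charset lookup can be sketched as a scan over the ';'-separated parameters followed by `Charset.forName(String)`. The class `CharsetSketch` is hypothetical, and stripping optional double quotes around the charset token is an assumption of the sketch.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class CharsetSketch {
    static Optional<Charset> charset(String contentTypeHeaderValue) {
        if (contentTypeHeaderValue == null) return Optional.empty();
        String[] parts = contentTypeHeaderValue.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the media type
            String[] nameValue = parts[i].split("=", 2);
            if (nameValue.length != 2 || !nameValue[0].strip().equalsIgnoreCase("charset")) continue;
            String token = nameValue[1].strip();
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1); // assumption: tolerate quoted tokens
            }
            try {
                return Optional.of(Charset.forName(token));
            } catch (IllegalArgumentException e) {
                // Covers both illegal and unsupported charset names.
                return Optional.empty();
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charset("text/html; boundary=x ;  charset = utf-8")); // prints "Optional[UTF-8]"
        System.out.println(charset("text/html; charset=nope-not-real")); // prints "Optional.empty"
    }
}
```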
- trimAggressively
  
  @Nullable public static String trimAggressively(@Nullable String string)
  
  A "stronger" version of String.trim() which discards any kind of whitespace or invisible separator.
  In a web environment with user-supplied inputs, this is the behavior we want the vast majority of the time. For example, users copy-paste URLs from Microsoft Word or Outlook and it's easy to accidentally include a U+202F "Narrow No-Break Space (NNBSP)" character at the end, which might break parsing.
  See https://www.compart.com/en/unicode/U+202F for details.
  
  Parameters:
  
  string - the string to trim
  
  Returns:
  
  the trimmed string, or null if the input string is null or the trimmed representation is of length 0
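
To see why `String.trim()` is not enough for the U+202F case, the sketch below strips leading and trailing code points that are either `Character.isWhitespace` or `Character.isSpaceChar`. The class `AggressiveTrimSketch` is hypothetical, the exact set of code points Soklet removes is not listed above, and the null and empty-result handling of the real method is deliberately left out.

```java
// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class AggressiveTrimSketch {
    // Strips leading and trailing Unicode whitespace and space separators from a non-null string.
    static String strip(String s) {
        int start = 0;
        int end = s.length();
        while (start < end) {
            int codePoint = s.codePointAt(start);
            if (!isTrimmable(codePoint)) break;
            start += Character.charCount(codePoint);
        }
        while (end > start) {
            int codePoint = s.codePointBefore(end);
            if (!isTrimmable(codePoint)) break;
            end -= Character.charCount(codePoint);
        }
        return s.substring(start, end);
    }

    private static boolean isTrimmable(int codePoint) {
        // isWhitespace skips non-breaking spaces such as U+00A0 and U+202F;
        // isSpaceChar covers them because they are in the Zs category.
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    public static void main(String[] args) {
        String pasted = "https://www.soklet.com\u202F";
        System.out.println(pasted.trim().length());  // prints 23: trim() keeps U+202F
        System.out.println(strip(pasted).length());  // prints 22
    }
}
```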
- trimAggressivelyToNull
  
  @Nullable public static String trimAggressivelyToNull(@Nullable String string)
  
  Aggressively trims Unicode whitespace from the given string and returns null if the result is empty.
  See trimAggressively(String) for details on which code points are removed.
  
  Parameters:
  
  string - the input string; may be null
  
  Returns:
  
  a trimmed, non-empty string; or null if input was null or trimmed to empty
- trimAggressivelyToEmpty
  
  @Nonnull public static String trimAggressivelyToEmpty(@Nullable String string)
  
  Aggressively trims Unicode whitespace from the given string and returns "" if the input is null.
  See trimAggressively(String) for details on which code points are removed.
  
  Parameters:
  
  string - the input string; may be null
  
  Returns:
  
  a trimmed string (never null); "" if input was null
- extractHeadersFromRawHeaderLines
  @Nonnull public static Map<String, Set<String>> extractHeadersFromRawHeaderLines(@Nonnull List<String> rawHeaderLines)
  
  Given a list of raw HTTP header lines, convert them into a normalized case-insensitive, order-preserving map which "inflates" comma-separated headers into distinct values where permitted according to RFC 7230/9110.
  For example, given these raw header lines:

  List<String> lines = List.of(
    "Cache-Control: no-cache, no-store",
    "Set-Cookie: a=b; Path=/; HttpOnly",
    "Set-Cookie: c=d; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/"
  );

  The result of parsing would look like this:

  result.get("cache-control") -> [ "no-cache", "no-store" ]
  result.get("set-cookie") -> [ "a=b; Path=/; HttpOnly", "c=d; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/" ]
  
  Keys in the returned map are case-insensitive and are guaranteed to be in the same order as encountered in rawHeaderLines.
  Values in the returned map are guaranteed to be in the same order as encountered in rawHeaderLines.
  
  Parameters:
  
  rawHeaderLines - the raw HTTP header lines to parse
  
  Returns:
  
  a normalized mapping of header name keys to values
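
The "inflating" behavior can be sketched with a small deny-list of headers whose values must stay whole. This is not Soklet's implementation: `HeaderLinesSketch` and the `UNSPLITTABLE` set are hypothetical and incomplete, lowercasing keys stands in for a truly case-insensitive map, and the naive comma split ignores quoted strings.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not Soklet's implementation.
final class HeaderLinesSketch {
    // Hypothetical, incomplete list of headers whose values may legitimately contain commas.
    private static final Set<String> UNSPLITTABLE =
            Set.of("set-cookie", "date", "expires", "last-modified");

    static Map<String, Set<String>> parse(List<String> rawHeaderLines) {
        Map<String, Set<String>> headers = new LinkedHashMap<>(); // keeps first-seen key order
        for (String line : rawHeaderLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) continue; // not a "name: value" line
            String name = line.substring(0, colon).strip().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).strip();
            Set<String> values = headers.computeIfAbsent(name, k -> new LinkedHashSet<>());
            if (UNSPLITTABLE.contains(name)) {
                values.add(value); // keep the value whole
            } else {
                for (String part : value.split(",")) {
                    if (!part.isBlank()) values.add(part.strip());
                }
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Cache-Control: no-cache, no-store",
                "Set-Cookie: a=b; Path=/; HttpOnly",
                "Set-Cookie: c=d; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/");
        Map<String, Set<String>> result = parse(lines);
        System.out.println(result.get("cache-control")); // prints "[no-cache, no-store]"
        System.out.println(result.get("set-cookie").size()); // prints 2
    }
}
```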
