Client Configuration¶
When Nessie is integrated into a broader data processing environment, authentication settings need to be provided in a way specific to the tool used.
Nessie client configuration options¶
See also Authentication Settings below.
Common settings¶
Property | Description |
---|---|
nessie.uri | URL of the Nessie service. |
nessie.authentication.type | ID of the authentication provider to use; the default is no authentication. Valid values are BASIC, BEARER, OAUTH2 and AWS. The value is matched against the values returned as the supported auth-type by implementations of NessieAuthenticationProvider across all available authentication providers. Note that “basic” HTTP authentication is not considered secure; use BEARER instead. |
nessie.ref | Name of the initial Nessie reference, usually main . |
nessie.ref.hash | Commit ID (hash) on “nessie.ref”, usually not specified. |
nessie.tracing | Enable adding the HTTP headers of an active OpenTracing span to all Nessie requests. Disabled by default. |
nessie.client-builder-name | Name of the Nessie client to use. If not specified, the implementation prefers the new Java HTTP client ( JavaHttp ), if running on Java 11 or newer, or the Java URLConnection client. The Apache HTTP client ( ApacheHttp ) can be used, if it has been made available on the classpath. |
nessie.client-builder-impl | Similar to “nessie.client-builder-name”, but uses a class name. Deprecated: prefer Nessie client implementation names, configured via “nessie.client-builder-name”. |
nessie.enable-api-compatibility-check | Enables API compatibility check when creating the Nessie client. The default is true . You can also control this setting by setting the system property nessie.client.enable-api-compatibility-check to true or false . |
nessie.client-api-version | Explicitly specify the Nessie API version number to use. The default for this setting depends on the client being used. |
nessie.commit-meta.message | Override all commit messages with the configured value. The corresponding HTTP header is Nessie-Commit-Message . |
nessie.commit-meta.authors | Set/override the author of all commits. The corresponding HTTP header is Nessie-Commit-Authors . Multiple authors can be specified, separated with , . |
nessie.commit-meta.signed-off-by | Set/override the signed-off-by of all commits. The corresponding HTTP header is Nessie-Commit-SignedOffBy . Multiple signed-off-by can be specified, separated with , . |
Network settings¶
Property | Description |
---|---|
nessie.transport.read-timeout | Network level read timeout in milliseconds. When running with Java 11, this becomes a request timeout. Default is 25000 ms. |
nessie.transport.connect-timeout | Network level connect timeout in milliseconds, default is 5000. |
nessie.transport.disable-compression | Disables compression on the network layer if set to true. |
nessie.ssl.no-certificate-verification | Optional, disables certificate verifications, if set to true . Can be useful for testing purposes, not recommended for production systems. |
nessie.ssl.cipher-suites | Optional, list of comma-separated cipher suites for SSL connections. This parameter only works on Java 11 and newer with the Java HTTP client. |
nessie.ssl.protocols | Optional, list of comma-separated protocols for SSL connections. This parameter only works on Java 11 and newer with the Java HTTP client. |
nessie.ssl.sni-hosts | Optional, comma-separated list of SNI host names for SSL connections. This parameter only works on Java 11 and newer with the Java HTTP client. |
nessie.ssl.sni-matcher | Optional, a single SNI matcher for SSL connections. Takes a single SNI hostname matcher, a regular expression representing the SNI hostnames to match. This parameter only works on Java 11 and newer with the Java HTTP client. |
HTTP settings¶
Property | Description |
---|---|
nessie.http2-upgrade | Optional, allow HTTP/2 upgrade, if set to true . This parameter only works on Java 11 and newer with the Java HTTP client. |
nessie.http-redirects | Optional, specify how redirects are handled. * NEVER : Never redirect. * ALWAYS : Always redirect. * NORMAL : Always redirect, except from HTTPS URLs to HTTP URLs. This parameter only works on Java 11 and newer with the Java HTTP client. |
Bearer authentication settings¶
See also Authentication Settings below.
Property | Description |
---|---|
nessie.authentication.token | Token used for BEARER authentication. |
OAuth2 settings¶
General OAuth2 settings. See also Authentication Settings below.
Property | Description |
---|---|
nessie.authentication.oauth2.issuer-url | OAuth2 issuer URL. The root URL of the OpenID Connect identity issuer provider, which will be used for discovering supported endpoints and their locations. For Keycloak, this is typically the realm URL: https://<keycloak-server>/realms/<realm-name> . Endpoint discovery is performed using the OpenID Connect Discovery metadata published by the issuer. See OpenID Connect Discovery 1.0 for more information. Either this property or ( nessie.authentication.oauth2.token-endpoint ) must be set. |
nessie.authentication.oauth2.token-endpoint | URL of the OAuth2 token endpoint. For Keycloak, this is typically https://<keycloak-server>/realms/<realm-name>/protocol/openid-connect/token . Either this property or ( nessie.authentication.oauth2.issuer-url ) must be set. In case it is not set, the token endpoint will be discovered from the issuer URL (nessie.authentication.oauth2.issuer-url ), using the OpenID Connect Discovery metadata published by the issuer. |
nessie.authentication.oauth2.grant-type | The grant type to use when authenticating against the OAuth2 server. Valid values are: * “client_credentials” * “password” * “authorization_code” * “device_code” * “token_exchange” Optional, defaults to “client_credentials”. |
nessie.authentication.oauth2.client-id | Client ID to use when authenticating against the OAuth2 server. Required if using OAuth2 authentication, ignored otherwise. |
nessie.authentication.oauth2.client-secret | Client secret to use when authenticating against the OAuth2 server. Required if using OAuth2 authentication, ignored otherwise. |
nessie.authentication.oauth2.client-scopes | Space-separated list of scopes to include in each request to the OAuth2 server. Optional, defaults to empty (no scopes). The scope names will not be validated by the Nessie client; make sure they are valid according to RFC 6749 Section 3.3 . |
OAuth2 Resource Owner Password Credentials settings¶
OAuth2 settings relevant when using the password grant type. See below for details.
Property | Description |
---|---|
nessie.authentication.oauth2.username | Username to use when authenticating against the OAuth2 server. Required if using OAuth2 authentication and the “password” grant type, ignored otherwise. |
nessie.authentication.oauth2.password | Password to use when authenticating against the OAuth2 server. Required if using OAuth2 authentication and the “password” grant type, ignored otherwise. |
OAuth2 Authorization Code Grant settings¶
OAuth2 settings relevant when using the authorization_code grant type. See below for details.
Property | Description |
---|---|
nessie.authentication.oauth2.auth-endpoint | URL of the OAuth2 authorization endpoint. For Keycloak, this is typically https://<keycloak-server>/realms/<realm-name>/protocol/openid-connect/auth . If using the “authorization_code” grant type, either this property or ( nessie.authentication.oauth2.issuer-url ) must be set. In case it is not set, the authorization endpoint will be discovered from the issuer URL (nessie.authentication.oauth2.issuer-url ), using the OpenID Connect Discovery metadata published by the issuer. |
nessie.authentication.oauth2.auth-code-flow.web-port | Port of the internal web server that listens for the authorization code callback. This is only used if the grant type is “authorization_code”. Optional; if not present, a random port will be used. When running a client inside a container, make sure to specify a port and forward it to the container host. |
nessie.authentication.oauth2.auth-code-flow.timeout | Defines how long the client should wait for the authorization code flow to complete. This is only used if the grant type to use is “authorization_code”. Optional, defaults to “PT5M”. |
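Timeout values such as “PT5M” are ISO-8601 durations. As a rough illustration of how such strings map to concrete time spans, here is a minimal Python sketch that parses only the time-based subset (hours, minutes, seconds). The function name `parse_duration` is hypothetical, and this is not the parser used by the Nessie client.

```python
import re
from datetime import timedelta

# Accepts only the "PT..." time-based subset, e.g. PT5M, PT30S, PT1H30M.
# The lookahead rejects a bare "PT" with no components.
_DURATION = re.compile(r"^PT(?=\d)(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_duration(text: str) -> timedelta:
    """Convert a simple ISO-8601 duration string into a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported ISO-8601 duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


# The documented default for the authorization code flow timeout.
auth_code_timeout = parse_duration("PT5M")
```

With this sketch, `parse_duration("PT5M")` yields a five-minute `timedelta`, while a non-ISO string such as `"5 minutes"` raises `ValueError`.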
OAuth2 Device Authorization Grant settings¶
OAuth2 settings relevant when using the device_code grant type. See below for details.
Property | Description |
---|---|
nessie.authentication.oauth2.device-auth-endpoint | URL of the OAuth2 device authorization endpoint. For Keycloak, this is typically https://<keycloak-server>/realms/<realm-name>/protocol/openid-connect/auth/device . If using the “device_code” grant type, either this property or ( nessie.authentication.oauth2.issuer-url ) must be set. |
nessie.authentication.oauth2.device-code-flow.timeout | Defines how long the client should wait for the device code flow to complete. This is only used if the grant type to use is “device_code”. Optional, defaults to “PT5M”. |
nessie.authentication.oauth2.device-code-flow.poll-interval | Defines how often the client should poll the OAuth2 server for the device code flow to complete. This is only used if the grant type to use is “device_code”. Optional, defaults to “PT5S”. |
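To illustrate how the timeout and poll interval interact, the following Python sketch shows a generic polling loop. It is an assumption-laden illustration, not the Nessie client's implementation: `poll_once` is a hypothetical callback standing in for the request to the OAuth2 server, and the clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def poll_until_complete(
    poll_once: Callable[[], Optional[str]],
    timeout_s: float = 300.0,      # PT5M, the documented default timeout
    poll_interval_s: float = 5.0,  # PT5S, the documented default poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call poll_once() every poll_interval_s seconds until it returns a token.

    poll_once (hypothetical) returns None while the user has not finished
    authenticating, and the access token once authentication is complete.
    """
    deadline = clock() + timeout_s
    while True:
        token = poll_once()
        if token is not None:
            return token
        # Give up if the next attempt would start after the deadline.
        if clock() + poll_interval_s > deadline:
            raise TimeoutError("device code flow did not complete in time")
        sleep(poll_interval_s)
```

A shorter poll interval detects completion sooner at the cost of more requests to the OAuth2 server; the timeout bounds the total waiting time either way.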
OAuth2 Token Exchange Grant settings¶
OAuth2 settings relevant when using the token_exchange grant type. See below for details.
Warning
The feature is experimental and subject to change.
Property | Description |
---|---|
nessie.authentication.oauth2.token-exchange.resource | For token exchanges only. A URI that indicates the target service or resource where the client intends to use the requested security token. Optional. |
nessie.authentication.oauth2.token-exchange.audience | For token exchanges only. The logical name of the target service where the client intends to use the requested security token. This serves a purpose similar to the resource parameter but with the client providing a logical name for the target service. |
nessie.authentication.oauth2.token-exchange.subject-token | For token exchanges only. The subject token to exchange. This can take 3 kinds of values: * The value “current_access_token”, if the client should use its current access token; * The value “current_refresh_token”, if the client should use its current refresh token (a refresh token must be available in this case); * An arbitrary token: in this case, the client will always use the static token provided here. The default is to use the current access token. Note: when using token exchange as the initial grant type, no current access token will be available: in this case, a valid, static subject token to exchange must be provided via configuration. |
nessie.authentication.oauth2.token-exchange.subject-token-type | For token exchanges only. The type of the subject token. Must be a valid URN. The default is either urn:ietf:params:oauth:token-type:access_token or urn:ietf:params:oauth:token-type:refresh_token , depending on the value of “nessie.authentication.oauth2.token-exchange.subject-token”. If the client is configured to use its access or refresh token as the subject token, please note that if an incorrect token type is provided here, the token exchange could fail. |
nessie.authentication.oauth2.token-exchange.actor-token | For token exchanges only. The actor token to exchange. This can take 4 kinds of values: * The value “no_token”, if the client should not include any actor token in the exchange request; * The value “current_access_token”, if the client should use its current access token; * The value “current_refresh_token”, if the client should use its current refresh token (if available); * An arbitrary token: in this case, the client will always use the static token provided here. The default is to not include any actor token. |
nessie.authentication.oauth2.token-exchange.actor-token-type | For token exchanges only. The type of the actor token. Must be a valid URN. The default is either urn:ietf:params:oauth:token-type:access_token or urn:ietf:params:oauth:token-type:refresh_token , depending on the value of “nessie.authentication.oauth2.token-exchange.actor-token”. If the client is configured to use its access or refresh token as the actor token, please note that if an incorrect token type is provided here, the token exchange could fail. |
OAuth2 impersonation settings¶
OAuth2 settings relevant when using impersonation. See below for details.
Warning
The feature is experimental and subject to change.
Property | Description |
---|---|
nessie.authentication.oauth2.impersonation.enabled | Whether to enable “impersonation” mode. If enabled, each access token obtained from the OAuth2 server using the configured initial grant type will be exchanged for a new token, using the token exchange grant type. |
nessie.authentication.oauth2.impersonation.issuer-url | For impersonation only. The root URL of an alternate OpenID Connect identity issuer provider, to use when exchanging tokens only. If neither this property nor “nessie.authentication.oauth2.impersonation.token-endpoint” are defined, the global token endpoint will be used. This means that the same authorization server will be used for both the initial token request and the token exchange. Endpoint discovery is performed using the OpenID Connect Discovery metadata published by the issuer. See OpenID Connect Discovery 1.0 for more information. |
nessie.authentication.oauth2.impersonation.token-endpoint | For impersonation only. The URL of an alternate OAuth2 token endpoint to use when exchanging tokens only. If neither this property nor “nessie.authentication.oauth2.impersonation.issuer-url” are defined, the global token endpoint will be used. This means that the same authorization server will be used for both the initial token request and the token exchange. |
nessie.authentication.oauth2.impersonation.client-id | For impersonation only. An alternate client ID to use. If not provided, the global client ID will be used. If provided, and if the client is confidential, then its secret must be provided as well with “nessie.authentication.oauth2.impersonation.client-secret” – the global client secret will NOT be used. |
nessie.authentication.oauth2.impersonation.client-secret | For impersonation only. The client secret to use, if “nessie.authentication.oauth2.impersonation.client-id” is defined and the token exchange client is confidential. |
nessie.authentication.oauth2.impersonation.scopes | For impersonation only. Space-separated list of scopes to include in each token exchange request to the OAuth2 server. Optional. If undefined, the global scopes configured through “nessie.authentication.oauth2.client-scopes” will be used. If defined and null or empty, no scopes will be used. The scope names will not be validated by the Nessie client; make sure they are valid according to RFC 6749 Section 3.3 . |
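The fallback rules in the table above can be summarized in a short Python sketch. The function name `effective_impersonation_settings` is hypothetical, and one detail is an assumption rather than something stated above: when no alternate client ID is configured, the sketch reuses the global client secret together with the global client ID.

```python
from typing import Dict, List, Mapping, Optional, Union

PREFIX = "nessie.authentication.oauth2."


def effective_impersonation_settings(
    cfg: Mapping[str, str],
) -> Dict[str, Union[Optional[str], List[str]]]:
    """Resolve the client ID, secret and scopes used for the token exchange."""
    alt_id_key = PREFIX + "impersonation.client-id"
    if alt_id_key in cfg:
        # An alternate client ID never falls back to the global client secret.
        client_id = cfg[alt_id_key]
        client_secret = cfg.get(PREFIX + "impersonation.client-secret")
    else:
        client_id = cfg.get(PREFIX + "client-id")
        # Assumption: the global secret accompanies the global client ID.
        client_secret = cfg.get(PREFIX + "client-secret")

    scopes_key = PREFIX + "impersonation.scopes"
    if scopes_key in cfg:
        # Defined, possibly empty: an empty value means "no scopes".
        scopes = cfg[scopes_key].split()
    else:
        # Undefined: inherit the global, space-separated scopes.
        scopes = cfg.get(PREFIX + "client-scopes", "").split()

    return {"client_id": client_id, "client_secret": client_secret, "scopes": scopes}
```

Note the difference between leaving impersonation.scopes undefined (global scopes are inherited) and setting it to an empty value (no scopes are sent).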
OAuth2 token refresh settings¶
OAuth2 settings related to token refreshes. You should rarely need to change the defaults.
Property | Description |
---|---|
nessie.authentication.oauth2.default-access-token-lifespan | Default access token lifespan; if the OAuth2 server returns an access token without specifying its expiration time, this value will be used. Optional, defaults to “PT1M”. Must be a valid ISO-8601 duration. |
nessie.authentication.oauth2.default-refresh-token-lifespan | Default refresh token lifespan. If the OAuth2 server returns a refresh token without specifying its expiration time, this value will be used. Optional, defaults to “PT30M”. Must be a valid ISO-8601 duration. |
nessie.authentication.oauth2.refresh-safety-window | Refresh safety window to use; a new token will be fetched when the current token’s remaining lifespan is less than this value. Optional, defaults to “PT10S”. Must be a valid ISO-8601 duration. |
nessie.authentication.oauth2.preemptive-token-refresh-idle-timeout | Defines for how long the OAuth2 provider should keep the tokens fresh, if the client is not being actively used. Setting this value too high may cause an excessive usage of network I/O and thread resources; conversely, when setting it too low, if the client is used again, the calling thread may block if the tokens are expired and need to be renewed synchronously. Optional, defaults to “PT30S”. Must be a valid ISO-8601 duration. |
nessie.authentication.oauth2.background-thread-idle-timeout | Defines how long the background thread should be kept running if the client is not being actively used, or no token refreshes are being executed. Optional, defaults to “PT30S”. Setting this value too high will cause the background thread to keep running even if the client is not used anymore, potentially leaking thread and memory resources; conversely, setting it too low could cause the background thread to be restarted too often. Must be a valid ISO-8601 duration. |
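As a rough illustration of how the default access token lifespan and the refresh safety window relate, consider the following Python sketch. The function names are hypothetical and the logic is a simplified reading of the descriptions above, not the client's actual refresh scheduler.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Defaults taken from the table above.
DEFAULT_ACCESS_TOKEN_LIFESPAN = timedelta(minutes=1)  # PT1M
REFRESH_SAFETY_WINDOW = timedelta(seconds=10)         # PT10S


def token_expiry(issued_at: datetime, expires_in: Optional[timedelta]) -> datetime:
    """Expiration time; uses the default lifespan if the server gave none."""
    lifespan = expires_in if expires_in is not None else DEFAULT_ACCESS_TOKEN_LIFESPAN
    return issued_at + lifespan


def needs_refresh(
    expires_at: datetime,
    now: datetime,
    safety_window: timedelta = REFRESH_SAFETY_WINDOW,
) -> bool:
    """True when the token's remaining lifespan is less than the safety window."""
    return expires_at - now < safety_window


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = token_expiry(issued, None)  # server did not report an expiration
```

With the defaults, a token whose expiration was not reported by the server is treated as valid for one minute and becomes eligible for refresh once fewer than ten seconds of that minute remain.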
AWS authentication settings¶
Additional AWS authentication configuration should be provided via standard AWS configuration files.
See also Authentication Settings below.
Property | Description |
---|---|
nessie.authentication.aws.region | AWS region used for AWS authentication, must be configured to the same region as the Nessie server. |
nessie.authentication.aws.profile | AWS profile name used for AWS authentication (optional). |
Basic authentication settings¶
See also Authentication Settings below.
Property | Description |
---|---|
nessie.authentication.username | Username used for the insecure BASIC authentication. Deprecated: “basic” HTTP authentication is not considered secure. Use ( nessie.authentication.token ) instead. |
nessie.authentication.password | Password used for the insecure BASIC authentication. Deprecated: “basic” HTTP authentication is not considered secure. Use ( nessie.authentication.token ) instead. |
Java 11 connection pool options¶
The Java 11 HTTP client can be configured using Java system properties. Since Java’s HttpClient API does not support the configuration of these properties programmatically, Nessie cannot expose them via its configuration mechanism.
System property | Meaning |
---|---|
jdk.httpclient.connectionPoolSize | The size of the HTTP connection pool. Defaults to 0, which means the number of connections is unlimited. |
jdk.httpclient.keepalive.timeout | Number of seconds an idle HTTP connection will be kept alive. The default is 1200 seconds. |
jdk.httpclient.receiveBufferSize | Size of the network-level receive buffer. Defaults to 0, which means the operating system defaults apply. |
jdk.httpclient.sendBufferSize | Size of the network-level send buffer. Defaults to 0, which means the operating system defaults apply. |
Note
See the Javadoc of javax.net.ssl.SSLParameters for valid options/values for the configuration parameters with the nessie.ssl. prefix.
Note
See the Javadoc of org.projectnessie.client.NessieConfigConstants as well.
Note
If you run into issues with Nessie’s new HTTP client for Java 11 and newer, you can try the legacy URLConnection-based HTTP client by setting the system property or configuration option nessie.client-builder-name to URLConnection.
Spark¶
When Nessie is used in Spark-based environments (with Iceberg), the Nessie authentication settings are configured via Spark session properties (replace <catalog_name> with the name of your catalog).
// local spark instance, assuming NONE authentication
SparkConf conf = new SparkConf();
conf.set("spark.sql.catalog.<catalog_name>", "org.apache.iceberg.spark.SparkCatalog")
    .set("spark.sql.catalog.<catalog_name>.authentication.type", "NONE")
    .set(...);
SparkSession spark = SparkSession.builder()
    .master("local[2]")
    .config(conf)
    .getOrCreate();
# local spark instance, assuming NONE authentication
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.sql.catalog.<catalog_name>", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.<catalog_name>.authentication.type", "NONE") \
    .config(...) \
    .getOrCreate()
Property Prefixes¶
The spark.sql.catalog.<catalog_name> prefix identifies properties for the Nessie catalog. The <catalog_name> part is just the name of the catalog in this case (not to be confused with the Nessie project name).
Multiple Nessie catalogs can be configured in the same Spark environment, each with its own set of configuration properties and its own property name prefix.
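The prefixing scheme can be sketched with a small Python helper that expands un-prefixed Nessie options into Spark property names for a given catalog. The helper name `spark_catalog_conf` and the catalog names `nessie_dev` and `nessie_prod` are hypothetical; only the spark.sql.catalog.<catalog_name> prefix and the catalog class come from the examples above.

```python
from typing import Dict, Mapping


def spark_catalog_conf(catalog_name: str, options: Mapping[str, str]) -> Dict[str, str]:
    """Expand un-prefixed options into Spark properties for one catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    # The bare prefix selects the catalog implementation.
    conf = {prefix: "org.apache.iceberg.spark.SparkCatalog"}
    for key, value in options.items():
        conf[f"{prefix}.{key}"] = value
    return conf


# Two independent catalogs, each with its own prefix and settings.
conf: Dict[str, str] = {}
conf.update(spark_catalog_conf("nessie_dev", {"authentication.type": "NONE"}))
conf.update(
    spark_catalog_conf(
        "nessie_prod",
        {"authentication.type": "BEARER", "authentication.token": "<token>"},
    )
)
```

Each key/value pair in `conf` could then be passed to the Spark session builder's `config(key, value)` method.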
Flink¶
When Nessie is used in Flink with Iceberg, the Nessie authentication settings are configured when creating the Nessie catalog in Flink (replace <catalog_name> with the name of your catalog):
table_env.execute_sql(
"""CREATE CATALOG <catalog_name> WITH (
'type'='iceberg',
'catalog-impl'='org.apache.iceberg.nessie.NessieCatalog',
'authentication.type'='NONE')""")
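When several authentication properties need to be passed, it may be convenient to generate the statement from a dictionary. The following Python sketch does that; the helper name `create_catalog_sql` is hypothetical, and the quoting (doubling single quotes inside values) follows standard SQL string-literal escaping, which is assumed to apply here.

```python
from typing import Mapping


def create_catalog_sql(catalog_name: str, nessie_options: Mapping[str, str]) -> str:
    """Render a CREATE CATALOG statement for a Nessie-backed Iceberg catalog."""
    options = {
        "type": "iceberg",
        "catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        **nessie_options,
    }
    rendered = ",\n".join(
        # Double any single quote so the value stays one SQL string literal.
        "  '{}'='{}'".format(key, str(value).replace("'", "''"))
        for key, value in options.items()
    )
    return f"CREATE CATALOG {catalog_name} WITH (\n{rendered})"


# "nessie" is a placeholder catalog name.
statement = create_catalog_sql("nessie", {"authentication.type": "NONE"})
```

The resulting string can be passed to `table_env.execute_sql(...)` as in the example above.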
Authentication Settings¶
The sections below discuss specific authentication settings. The property names are shown without environment-specific prefixes for brevity. Nonetheless, in practice the property names should be given appropriate prefixes (as in the examples above) for them to be recognized by the tools and Nessie code.
The value of the authentication.type property can be one of the following:
- NONE (default)
- BEARER
- OAUTH2
- AWS
Authentication Type NONE¶
For the Authentication Type NONE only the authentication.type property needs to be set.
This is also the default authentication type if nothing else is configured.
Authentication Type BEARER¶
For the BEARER Authentication Type the authentication.token property should be set to a valid OpenID token.
This authentication type is recommended only when the issued access token has a lifespan large enough to cover the duration of the entire Nessie client’s session. Once the token is expired, the Nessie client will not be able to refresh it and will have to be restarted with a different token. If the token needs to be refreshed periodically, then the OAUTH2 authentication type should be preferred to this one.
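Before starting a long session with a static token, it can help to check how long the token remains valid. The Python sketch below assumes the token is a JWT carrying an `exp` claim, which is common for OpenID providers but not guaranteed; opaque tokens cannot be inspected this way. The function names are hypothetical, and the signature is deliberately not verified.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[int]:
    """Return the `exp` claim (seconds since the epoch) of a JWT, or None.

    Only the payload is decoded; the signature is NOT verified.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return None  # not a JWT, for example an opaque token
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None
    return claims.get("exp") if isinstance(claims, dict) else None


def covers_session(token: str, session_seconds: float, now: Optional[float] = None) -> bool:
    """True if the token is known to outlive a session of the given length."""
    exp = jwt_expiry(token)
    if exp is None:
        return False  # lifespan unknown: cannot confirm coverage
    current = time.time() if now is None else now
    return exp >= current + session_seconds
```

If `covers_session` returns False for the expected session length, the OAUTH2 authentication type is likely the better fit.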
Authentication Type OAUTH2¶
The OAUTH2 Authentication Type is able to authenticate against an OAuth2 server and obtain a valid access token. Only Bearer access tokens are currently supported. The access token is then used to authenticate against Nessie. The client will automatically refresh the access token. This authentication type is recommended when the access token has a lifespan shorter than the Nessie client’s session lifespan.
Note that the Nessie server must be configured to accept OAuth2 tokens from the same server. For example, if the OAuth2 server is Keycloak, this can be done by defining the following properties in the application.properties file of the Nessie server:
nessie.server.authentication.enabled=true
quarkus.oidc.auth-server-url=https://<keycloak-server>/realms/<realm-name>
OAuth is a complex framework and usually requires many configuration settings on the client side. The full list of available settings is shown above, but here are some general configuration guidelines:
Configuring endpoints¶
The Nessie client interacts with the OAuth2 server by contacting its endpoints, in order to authenticate and obtain access tokens, using various grants. The main endpoint is the token endpoint and is always required, but other endpoints may also be required, depending on the grant type being used.
The endpoints can be provided with the following properties:
- Token endpoint: nessie.authentication.oauth2.token-endpoint (always required);
- Authorization endpoint: nessie.authentication.oauth2.auth-endpoint (required when using the authorization_code grant);
- Device authorization endpoint: nessie.authentication.oauth2.device-auth-endpoint (required when using the device_code grant).
However, instead of specifying the endpoints individually, it is recommended to use the all-in-one property nessie.authentication.oauth2.issuer-url whenever possible. When this property is provided, the client is capable of discovering all the required endpoints automatically by querying the authorization server’s well-known metadata endpoint.
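To make the discovery step more concrete, the following Python sketch shows where the metadata document lives according to OpenID Connect Discovery 1.0 and how its standard fields correspond to the three endpoint properties. It illustrates the mapping only and is not the Nessie client's discovery code; the host `keycloak.example.com` and realm `demo` are placeholders, and the sample metadata is hand-written rather than fetched.

```python
from typing import Dict, Mapping, Optional

P = "nessie.authentication.oauth2."


def discovery_url(issuer_url: str) -> str:
    """Location of the OpenID Connect Discovery metadata for an issuer."""
    return issuer_url.rstrip("/") + "/.well-known/openid-configuration"


def endpoints_from_metadata(metadata: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Map standard metadata fields to the corresponding Nessie properties."""
    return {
        P + "token-endpoint": metadata.get("token_endpoint"),
        P + "auth-endpoint": metadata.get("authorization_endpoint"),
        P + "device-auth-endpoint": metadata.get("device_authorization_endpoint"),
    }


# Hand-written sample of what a Keycloak realm might publish (placeholder host).
issuer = "https://keycloak.example.com/realms/demo"
sample_metadata = {
    "issuer": issuer,
    "token_endpoint": issuer + "/protocol/openid-connect/token",
    "authorization_endpoint": issuer + "/protocol/openid-connect/auth",
    "device_authorization_endpoint": issuer + "/protocol/openid-connect/auth/device",
}
endpoints = endpoints_from_metadata(sample_metadata)
```

A server that does not support a given grant may omit the corresponding field, in which case the sketch returns None for that property.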
Configuring grant types¶
Another important property is authentication.oauth2.grant-type, which defines the grant type to use when authenticating against the OAuth2 server. Valid values are:
- client_credentials: enables the Client Credentials grant (default);
- password: enables the Resource Owner Password Credentials grant;
- authorization_code: enables the Authorization Code grant;
- device_code: enables the Device Authorization grant;
- token_exchange: enables the Token Exchange grant.
Note
The Device Authorization grant can also be specified using its canonical URN: urn:ietf:params:oauth:grant-type:device_code.
Note
The Token Exchange grant can also be specified using its canonical URN: urn:ietf:params:oauth:grant-type:token-exchange.
The client_credentials grant type is the simplest one, but it requires the client to be granted enough permissions to access the Nessie server on behalf of the user. This is not always possible, and should be avoided if the session is interactive (that is, when the client is being controlled by a human).
For this grant type, the following properties must be provided:
- nessie.authentication.oauth2.issuer-url or nessie.authentication.oauth2.token-endpoint;
- nessie.authentication.oauth2.client-id;
- nessie.authentication.oauth2.client-secret (unless the client is public).
The password grant type is also simple, but it requires passing the user’s password to the client, which may not be acceptable in some cases for security reasons. Many identity providers forbid its usage.
All the properties required for client_credentials are also required for this grant type, as well as the following ones:
- nessie.authentication.oauth2.username;
- nessie.authentication.oauth2.password.
For real users trying to authenticate within a terminal session, such as a Spark shell, the authorization_code grant type is recommended. It requires the user to authenticate in a browser window, thus sparing the need to provide the user’s password directly to the client. The user will be prompted to authenticate in a separate browser window, and the Nessie client will be notified when the authentication is complete.
All the properties required for client_credentials are also required for this grant type. As explained above, if nessie.authentication.oauth2.issuer-url is provided, then no further configuration is required. Otherwise, in addition to the token endpoint, the authorization endpoint must also be provided (nessie.authentication.oauth2.auth-endpoint).
However, if the terminal session is running remotely, or inside an embedded device, then the authorization_code grant type may not be suitable, as the browser and the terminal session must be running on the same machine. In this case, the device_code grant type is recommended. Similar to the authorization_code grant type, it requires the user to authenticate in a browser window, but it does not require the browser and the terminal session to be running on the same machine. The user will be prompted to authenticate in a local browser window, and the remote Nessie client will poll the OAuth2 server for the authentication status until the authentication is complete.
All the properties required for client_credentials are also required for this grant type. As explained above, if nessie.authentication.oauth2.issuer-url is provided, then no further configuration is required. Otherwise, in addition to the token endpoint, the device authorization endpoint must also be provided (nessie.authentication.oauth2.device-auth-endpoint).
Finally, the token_exchange grant type is the most complex one. In-depth configuration of a token exchange grant is outside the scope of this document, but in general, two use cases can be envisaged:
- Initial token exchange: enabled when authentication.oauth2.grant-type is token_exchange. In this scenario, the client will use token exchange as the primary grant. A subject token must be provided with nessie.authentication.oauth2.token-exchange.subject-token. Other properties under nessie.authentication.oauth2.token-exchange.* may also be required.
- Impersonation or delegation: this is the most typical usage, enabled when nessie.authentication.oauth2.impersonation.enabled is true. Here, the client will first obtain an initial token using another grant type, then exchange the received access token for another access token, possibly from a second OAuth2 server. If a second OAuth2 server must be contacted, use the properties under nessie.authentication.oauth2.impersonation.*. Finally, since impersonation uses the token exchange grant type behind the scenes, properties under nessie.authentication.oauth2.token-exchange.* may also be relevant.
Warning
When using impersonation, the property authentication.oauth2.grant-type must be set to a grant type other than token_exchange.
Warning
If a second OAuth2 server is required to perform impersonation, the admin user is responsible for configuring the trust relationship between the two servers.
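The per-grant requirements described in this section can be condensed into a small pre-flight check. The Python sketch below is one possible reading of those rules, not a validator shipped with Nessie: the function name `missing_oauth2_properties` and the `public_client` flag are hypothetical, and property names are written with the nessie. prefix (add any tool-specific prefix as needed).

```python
from typing import List, Mapping

P = "nessie.authentication.oauth2."

# Canonical URNs accepted as aliases for two of the grant types.
GRANT_ALIASES = {
    "urn:ietf:params:oauth:grant-type:device_code": "device_code",
    "urn:ietf:params:oauth:grant-type:token-exchange": "token_exchange",
}


def missing_oauth2_properties(cfg: Mapping[str, str], public_client: bool = False) -> List[str]:
    """List the problems found in an OAuth2 configuration (empty if none)."""
    grant = cfg.get(P + "grant-type", "client_credentials")
    grant = GRANT_ALIASES.get(grant, grant)
    has_issuer = P + "issuer-url" in cfg
    problems: List[str] = []

    def require(name: str) -> None:
        if P + name not in cfg:
            problems.append(P + name)

    # Needed by every grant type; the issuer URL allows endpoint discovery.
    if not has_issuer:
        require("token-endpoint")
    require("client-id")
    if not public_client:
        require("client-secret")

    if grant == "password":
        require("username")
        require("password")
    elif grant == "authorization_code" and not has_issuer:
        require("auth-endpoint")
    elif grant == "device_code" and not has_issuer:
        require("device-auth-endpoint")
    elif grant == "token_exchange":
        # No current access token exists yet, so a static one is needed.
        require("token-exchange.subject-token")
        if cfg.get(P + "impersonation.enabled", "false").lower() == "true":
            problems.append("impersonation requires a grant type other than token_exchange")
    return problems
```

An empty result means the configuration satisfies the rules sketched here; it does not guarantee that the OAuth2 server will accept the request.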