Server Configuration¶
The Nessie server is configurable via properties as listed in the application.properties file.
These properties can be set when starting the Docker image in two different ways. For example, to make Nessie use the JDBC
version store and provide a JDBC connection URL, you can either:
- Set these values via the JAVA_OPTS_APPEND option in the Docker invocation. Each setting should be inserted inside the variable’s value as -D<name>=<value> pairs:
docker run -p 19120:19120 \
  -e JAVA_OPTS_APPEND="-Dnessie.version.store.type=JDBC -Dquarkus.datasource.jdbc.url=jdbc:postgresql://host.com:5432/db" \
  ghcr.io/projectnessie/nessie
- Alternatively, set them via the --env (or -e) option in the Docker invocation. Each setting must be provided separately as --env NAME=value options:
docker run -p 19120:19120 \
  --env NESSIE_VERSION_STORE_TYPE=JDBC \
  --env QUARKUS_DATASOURCE_JDBC_URL="jdbc:postgresql://host.com:5432/db" \
  ghcr.io/projectnessie/nessie
Note how the original property name is converted to an environment variable, e.g. nessie.version.store.type becomes NESSIE_VERSION_STORE_TYPE. The conversion is done by replacing all . (and other non-alphanumeric characters, such as -) with _ and converting the name to upper case. See here for more details.
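The conversion rule can be sketched as a small shell helper. This is only an illustration, not part of Nessie: the function name to_env_name is hypothetical, and it assumes the usual MicroProfile Config mapping in which every character other than a letter, digit, or underscore becomes an underscore before upper-casing.

```shell
# Hypothetical helper: converts a property name to its environment variable form.
# Assumption: every character that is not a letter, digit, or underscore maps to "_".
to_env_name() {
  printf '%s\n' "$1" | sed 's/[^A-Za-z0-9_]/_/g' | tr '[:lower:]' '[:upper:]'
}

to_env_name nessie.version.store.type      # prints NESSIE_VERSION_STORE_TYPE
to_env_name quarkus.datasource.jdbc.url    # prints QUARKUS_DATASOURCE_JDBC_URL
to_env_name nessie.server.default-branch   # prints NESSIE_SERVER_DEFAULT_BRANCH
```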
For more information on Docker images, see Docker image options below.
Core Nessie Configuration Settings¶
Core Settings¶
Nessie server configuration to be injected into the JAX-RS application.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.server.default-branch | main | String | The default branch to use if not provided by the user. |
nessie.server.send-stacktrace-to-client | false | boolean | Whether stack traces should be sent to the client in case of error. The default is false to not expose internal details for security reasons. |
nessie.server.access-checks-batch-size | 100 | int | The number of entity-checks that are grouped into a call to BatchAccessChecker. The default value is quite conservative; it is the responsibility of the operator to adjust this value according to the capabilities of the actual authz implementation. Note that the number of checks can be slightly exceeded by the implementation, depending on the call site. |
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.http.port | 19120 | int | Sets the HTTP port |
Info
A complete set of configuration options for Quarkus can be found on quarkus.io
Version Store Settings¶
Version store configuration.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.type | IN_MEMORY | IN_MEMORY, ROCKSDB, DYNAMODB, MONGODB, CASSANDRA, JDBC, BIGTABLE | Sets the type of version store that Nessie uses. |
nessie.version.store.events.enable | true | boolean | Sets whether events for the version-store are enabled. In order for events to be published, it’s not enough to enable them in the configuration; you also need to provide at least one implementation of Nessie’s EventListener SPI. |
Support for the database specific implementations¶
Database | Status | Configuration value for nessie.version.store.type | Notes |
---|---|---|---|
“in memory” | only for development and local testing | IN_MEMORY | Do not use for any serious use case. |
RocksDB | production, single node only | ROCKSDB | |
Google BigTable | production | BIGTABLE | |
MongoDB | production | MONGODB | |
Amazon DynamoDB | beta, only tested against the simulator | DYNAMODB | |
PostgreSQL | production | JDBC | |
CockroachDB | experimental, known issues | JDBC | Known to raise user-facing “write too old” errors under contention. |
Apache Cassandra | experimental, known issues | CASSANDRA | Known to raise user-facing errors due to Cassandra’s concept of letting the driver time out too early, or database timeouts. |
ScyllaDB | experimental, known issues | CASSANDRA | Known to raise user-facing errors due to Cassandra’s concept of letting the driver time out too early, or database timeouts. Known to be slow in container-based testing. It is unclear how well Scylla’s LWT implementation performs. |
BigTable Version Store Settings¶
When setting nessie.version.store.type=BIGTABLE, which enables Google BigTable as the version store used by the Nessie server, the following configurations are applicable.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.persist.bigtable.instance-id | nessie | String | Sets the instance-id to be used with Google BigTable. |
nessie.version.store.persist.bigtable.emulator-port | 8086 | int | When using the BigTable emulator, used to configure the port. |
nessie.version.store.persist.bigtable.enable-telemetry | true | boolean | Enables telemetry with OpenCensus. |
nessie.version.store.persist.bigtable.table-prefix | | String | Prefix for tables, default is no prefix. |
nessie.version.store.persist.bigtable.no-table-admin-client | false | boolean | |
nessie.version.store.persist.bigtable.app-profile-id | | String | Sets the profile-id to be used with Google BigTable. |
nessie.version.store.persist.bigtable.quota-project-id | | String | Google BigTable quota project ID (optional). |
nessie.version.store.persist.bigtable.endpoint | | String | Google BigTable endpoint (if not default). |
nessie.version.store.persist.bigtable.mtls-endpoint | | String | Google BigTable MTLS endpoint (if not default). |
nessie.version.store.persist.bigtable.emulator-host | | String | When using the BigTable emulator, used to configure the host. |
nessie.version.store.persist.bigtable.jwt-audience-mapping.<mapping> | | String | Google BigTable JWT audience mappings (if necessary). |
nessie.version.store.persist.bigtable.initial-retry-delay | | Duration | Initial retry delay. |
nessie.version.store.persist.bigtable.max-retry-delay | | Duration | Max retry-delay. |
nessie.version.store.persist.bigtable.retry-delay-multiplier | | double | |
nessie.version.store.persist.bigtable.max-attempts | | int | Maximum number of attempts for each Bigtable API call (including retries). |
nessie.version.store.persist.bigtable.initial-rpc-timeout | | Duration | Initial RPC timeout. |
nessie.version.store.persist.bigtable.max-rpc-timeout | | Duration | |
nessie.version.store.persist.bigtable.rpc-timeout-multiplier | | double | |
nessie.version.store.persist.bigtable.total-timeout | | Duration | Total timeout (including retries) for Bigtable API calls. |
nessie.version.store.persist.bigtable.min-channel-count | | int | Minimum number of gRPC channels. Refer to Google docs for details. |
nessie.version.store.persist.bigtable.max-channel-count | | int | Maximum number of gRPC channels. Refer to Google docs for details. |
nessie.version.store.persist.bigtable.initial-channel-count | | int | Initial number of gRPC channels. Refer to Google docs for details. |
nessie.version.store.persist.bigtable.min-rpcs-per-channel | | int | Minimum number of RPCs per channel. Refer to Google docs for details. |
nessie.version.store.persist.bigtable.max-rpcs-per-channel | | int | Maximum number of RPCs per channel. Refer to Google docs for details. |
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.google.cloud.project-id | | String | The Google project ID, mandatory. |
(Google authentication) | | | See Quarkiverse for documentation. |
Info
A complete set of Google Cloud & BigTable configuration options for Quarkus can be found on Quarkiverse.
JDBC Version Store Settings¶
Setting nessie.version.store.type=JDBC enables a transactional RDBMS as the version store used by the Nessie server. Configuration of the datastore is done by Quarkus and depends on many factors, such as the actual database in use. A complete set of JDBC configuration options can be found on quarkus.io.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.persist.jdbc.catalog | | String | The JDBC catalog name. If not provided, will be inferred from the datasource. |
nessie.version.store.persist.jdbc.schema | | String | The JDBC schema name. If not provided, will be inferred from the datasource. |
RocksDB Version Store Settings¶
When setting nessie.version.store.type=ROCKSDB, which enables RocksDB as the version store used by the Nessie server, the following configurations are applicable.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.persist.rocks.database-path | /tmp/nessie-rocksdb-store | Path | Sets RocksDB storage path. |
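Because the default path lives under /tmp inside the container, a RocksDB store is usually pointed at a mounted volume so that data survives container restarts. The following is a minimal sketch: the env file name, the volume name nessie-data, and the /data mount point are arbitrary choices for illustration, and the container user is assumed to have write access to the mounted path.

```shell
# Write the RocksDB settings to an env file (file name chosen for this example).
cat > nessie-rocksdb.env <<'EOF'
NESSIE_VERSION_STORE_TYPE=ROCKSDB
NESSIE_VERSION_STORE_PERSIST_ROCKS_DATABASE_PATH=/data/nessie-rocksdb
EOF

# Then start the container with a volume mounted at /data (hypothetical names):
#   docker run -p 19120:19120 -v nessie-data:/data \
#     --env-file nessie-rocksdb.env ghcr.io/projectnessie/nessie
```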
Cassandra Version Store Settings¶
When setting nessie.version.store.type=CASSANDRA, which enables Apache Cassandra or ScyllaDB as the version store used by the Nessie server, the following configurations are applicable.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.cassandra.dml-timeout | PT3S | Duration | Timeout used for queries and updates. |
nessie.version.store.cassandra.ddl-timeout | PT5S | Duration | Timeout used when creating tables. |
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.cassandra.keyspace | | String | The Cassandra keyspace to use. |
quarkus.cassandra.contact-points | | String | The Cassandra contact points, see Quarkus docs. |
quarkus.cassandra.local-datacenter | | String | The Cassandra local datacenter to use, see Quarkus docs. |
quarkus.cassandra.auth.username | | String | Cassandra authentication username, see Quarkus docs. |
quarkus.cassandra.auth.password | | String | Cassandra authentication password, see Quarkus docs. |
quarkus.cassandra.health.enabled | false | boolean | See Quarkus docs. |
Info
A complete set of the Quarkus Cassandra extension configuration options can be found on quarkus.io
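As a rough illustration, the Cassandra settings above could be collected in an env file and passed to Docker. The host name cassandra-host, the keyspace nessie, the datacenter name datacenter1, and the raised DML timeout are placeholder values for this sketch, not recommendations.

```shell
# Write the Cassandra settings to an env file (all values are placeholders).
cat > nessie-cassandra.env <<'EOF'
NESSIE_VERSION_STORE_TYPE=CASSANDRA
NESSIE_VERSION_STORE_CASSANDRA_DML_TIMEOUT=PT5S
QUARKUS_CASSANDRA_KEYSPACE=nessie
QUARKUS_CASSANDRA_CONTACT_POINTS=cassandra-host:9042
QUARKUS_CASSANDRA_LOCAL_DATACENTER=datacenter1
EOF

# Then start the container with these settings:
#   docker run -p 19120:19120 --env-file nessie-cassandra.env ghcr.io/projectnessie/nessie
```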
DynamoDB Version Store Settings¶
When setting nessie.version.store.type=DYNAMODB, which enables DynamoDB as the version store used by the Nessie server, the following configurations are applicable.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.persist.dynamodb.table-prefix | | String | Prefix for tables, default is no prefix. |
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.dynamodb.aws.region | | String | Sets DynamoDB AWS region. |
quarkus.dynamodb.aws.credentials.type | default | String | See Quarkiverse docs for possible values. Sets the credentials provider that should be used to authenticate with AWS. |
quarkus.dynamodb.endpoint-override | | URI | Sets the endpoint URI with which the SDK should communicate. If not specified, an appropriate endpoint is used for the given service and region. |
quarkus.dynamodb.sync-client.type | url | String | Possible values are: url, apache. Sets the type of the sync HTTP client implementation. |
Info
A complete set of DynamoDB configuration options for Quarkus can be found on Quarkiverse.
MongoDB Version Store Settings¶
When setting nessie.version.store.type=MONGODB, which enables MongoDB as the version store used by the Nessie server, the following configurations are applicable in combination with nessie.version.store.type.
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.mongodb.database | | String | Sets MongoDB database name. |
quarkus.mongodb.connection-string | | String | Sets MongoDB connection string. |
Info
A complete set of MongoDB configuration options for Quarkus can be found on quarkus.io.
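A MongoDB-backed server could be configured along these lines. This is a sketch only: the database name nessie and the host mongo-host are placeholders, and a real connection string would typically also carry credentials.

```shell
# Write the MongoDB settings to an env file (values are placeholders).
cat > nessie-mongodb.env <<'EOF'
NESSIE_VERSION_STORE_TYPE=MONGODB
QUARKUS_MONGODB_DATABASE=nessie
QUARKUS_MONGODB_CONNECTION_STRING=mongodb://mongo-host:27017
EOF

# Then start the container with these settings:
#   docker run -p 19120:19120 --env-file nessie-mongodb.env ghcr.io/projectnessie/nessie
```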
In-Memory Version Store Settings¶
No special configuration options for this store type.
Version Store Advanced Settings¶
The following advanced settings control how Nessie stores data in the configured data store.
Usually, only the cache capacity needs to be adjusted to the amount of Java heap “available” for the cache. The default is conservative; increasing the cache size is recommended.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.version.store.persist.repository-id | (empty) | String | Nessie repository ID (optional) that identifies a particular Nessie storage repository. When remote (shared) database is used, multiple Nessie repositories may co-exist in the same database (and in the same schema). In that case this configuration parameter can be used to distinguish those repositories. |
nessie.version.store.persist.commit-retries | 2147483647 | int | Maximum retries for CAS-like operations. Used when committing to Nessie: when the HEAD (or tip) of a branch changed during the commit, this value defines the maximum number of retries. Default means unlimited. See: nessie.version.store.persist.retry-max-sleep-millis |
nessie.version.store.persist.commit-timeout-millis | 5000 | long | Timeout for CAS-like operations in milliseconds. See: nessie.version.store.persist.retry-max-sleep-millis |
nessie.version.store.persist.retry-initial-sleep-millis-lower | 5 | long | When the commit logic has to retry an operation due to a concurrent, conflicting update to the database state, usually a concurrent change to a branch HEAD, this parameter defines the initial lower bound of the exponential backoff. See: nessie.version.store.persist.retry-max-sleep-millis |
nessie.version.store.persist.retry-initial-sleep-millis-upper | 25 | long | When the commit logic has to retry an operation due to a concurrent, conflicting update to the database state, usually a concurrent change to a branch HEAD, this parameter defines the initial upper bound of the exponential backoff. See: nessie.version.store.persist.retry-max-sleep-millis |
nessie.version.store.persist.retry-max-sleep-millis | 250 | long | When the commit logic has to retry an operation due to a concurrent, conflicting update to the database state, usually a concurrent change to a branch HEAD, this parameter defines the maximum sleep time. Each retry doubles the lower and upper bounds of the random sleep time, unless the doubled upper bound would exceed the value of this configuration property. See: nessie.version.store.persist.retry-initial-sleep-millis-upper |
nessie.version.store.persist.parents-per-commit | 20 | int | Number of parent-commit-hashes stored in each commit. This is used to allow bulk-fetches when accessing the commit log. |
nessie.version.store.persist.max-serialized-index-size | 204800 | int | The maximum allowed serialized size of the content index structure in a reference index segment. This value is used to determine when elements in a reference index segment need to be split. Note: this value must be smaller than a database’s hard item/row size limit. |
nessie.version.store.persist.max-incremental-index-size | 51200 | int | The maximum allowed serialized size of the content index structure in a Nessie commit, called incremental index. This value is used to determine when elements in an incremental index, which were kept from previous commits, need to be pushed to a new or updated reference index. Note: this value must be smaller than a database’s hard item/row size limit. |
nessie.version.store.persist.max-reference-stripes-per-commit | 50 | int | Maximum number of referenced index objects stored inside commit objects. If the external reference index for this commit consists of up to this amount of stripes, the references to the stripes will be stored inside the commit object. If there are more than this amount of stripes, an external index segment will be created instead. |
nessie.version.store.persist.assumed-wall-clock-drift-micros | 5000000 | long | Assumed wall-clock drift between multiple Nessie instances in microseconds. |
nessie.version.store.persist.namespace-validation | true | boolean | Whether namespace validation is enabled. Changing this to false will break the Nessie specification! Committing operations by default enforce that all (parent) namespaces exist. This configuration setting is only present for a few Nessie releases to work around potential migration issues and is subject to removal. Since: 0.52.0. Deprecated: this setting will be removed. |
nessie.version.store.persist.ref-previous-head-count | 20 | int | Named references keep a history of up to this amount of previous HEAD pointers, and up to the configured age. |
nessie.version.store.persist.ref-previous-head-time-span-seconds | 300 | long | Named references keep a history of previous HEAD pointers with this age in seconds, and up to the configured amount. |
nessie.version.store.persist.cache-capacity-mb | | int | Fixed amount of heap used to cache objects, set to 0 to disable the cache entirely. Must not be used with fractional cache sizing. See description for cache-capacity-fraction-of-heap for the default value. |
nessie.version.store.persist.cache-capacity-fraction-min-size-mb | | int | When using fractional cache sizing, this amount in MB is the minimum cache size. |
nessie.version.store.persist.cache-capacity-fraction-of-heap | | double | Fraction of Java’s max heap size to use for cache objects, set to 0 to disable. Must not be used with fixed cache sizing. If neither this value nor a fixed size is configured, a default of 0.7 (70%) is assumed. |
nessie.version.store.persist.cache-capacity-fraction-adjust-mb | | int | When using fractional cache sizing, this amount in MB of the heap will always be “kept free” when calculating the cache size. |
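To get a feel for how the retry settings in the table interact, the sketch below prints the sleep bounds for successive retries using the default values (5, 25, and 250 milliseconds). It is an approximation of the described behavior, not Nessie's implementation: the function name backoff_bounds is hypothetical, and it assumes that the bounds simply stay unchanged once doubling the upper bound would exceed the maximum.

```shell
# Hypothetical helper: prints the random-sleep bounds (ms) for each retry.
# Arguments: initial lower bound, initial upper bound, max sleep, number of retries.
# Assumption: bounds are held once the doubled upper bound would exceed the max.
backoff_bounds() {
  lower=$1; upper=$2; max=$3; retries=$4
  i=1
  while [ "$i" -le "$retries" ]; do
    echo "retry $i: sleep between $lower and $upper ms"
    doubled=$((upper * 2))
    if [ "$doubled" -le "$max" ]; then
      lower=$((lower * 2))
      upper=$doubled
    fi
    i=$((i + 1))
  done
}

backoff_bounds 5 25 250 5
# retry 1: sleep between 5 and 25 ms
# retry 2: sleep between 10 and 50 ms
# retry 3: sleep between 20 and 100 ms
# retry 4: sleep between 40 and 200 ms
# retry 5: sleep between 40 and 200 ms
```

Under these assumptions, the defaults reach their ceiling after three doublings, so raising retry-max-sleep-millis mainly matters for branches under sustained contention.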
Authentication settings¶
Configuration for Nessie authentication settings.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.server.authentication.enabled | false | boolean | Enable Nessie authentication. |
Related Quarkus settings:
Property | Default values | Type | Description |
---|---|---|---|
quarkus.oidc.auth-server-url | | String | Sets the base URL of the OpenID Connect (OIDC) server if nessie.server.authentication.enabled=true. |
quarkus.oidc.client-id | | String | Sets client-id of the application if nessie.server.authentication.enabled=true. Each application has a client-id that is used to identify the application. |
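Put together, enabling authentication against an OIDC server might look like the following sketch. The server URL and the client id nessie-server are made-up placeholders; the actual values depend on the identity provider in use.

```shell
# Write the authentication settings to an env file (URL and client id are placeholders).
cat > nessie-auth.env <<'EOF'
NESSIE_SERVER_AUTHENTICATION_ENABLED=true
QUARKUS_OIDC_AUTH_SERVER_URL=https://idp.example.com/realms/nessie
QUARKUS_OIDC_CLIENT_ID=nessie-server
EOF

# Then start the container with these settings:
#   docker run -p 19120:19120 --env-file nessie-auth.env ghcr.io/projectnessie/nessie
```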
Authorization settings¶
Configuration for Nessie authorization settings.
Property | Default Value | Type | Description |
---|---|---|---|
nessie.server.authorization.enabled | false | boolean | Enable Nessie authorization. |
nessie.server.authorization.type | CEL | String | Sets the authorizer type to use. |
nessie.server.authorization.rules.<name> | | String | CEL authorization rules where the key represents the rule id and the value the CEL expression. |
Metrics¶
Metrics are published using Prometheus and can be collected via standard methods. See: Prometheus.
Traces¶
Since Nessie 0.46.0, traces are published using OpenTelemetry. See Using OpenTelemetry in the Quarkus documentation.
In order for the server to enable OpenTelemetry and publish its traces, the quarkus.otel.exporter.otlp.traces.endpoint property must be defined. Its value must be a valid collector endpoint URL, with either the http:// or https:// scheme. The collector must speak the OpenTelemetry protocol (OTLP) and the port must be its gRPC port (by default 4317), e.g. “http://otlp-collector:4317”. If this property is not set, the server will not publish traces.
Alternatively, it’s possible to forcibly disable OpenTelemetry at runtime by setting the following property: quarkus.otel.sdk.disabled=true.
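With the Docker image, the trace endpoint can be supplied as an environment variable using the same name conversion as for other properties. A minimal sketch, reusing the example collector address from above (the env file name is arbitrary):

```shell
# Write the tracing setting to an env file; the collector address is the example value.
cat > nessie-tracing.env <<'EOF'
QUARKUS_OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://otlp-collector:4317
EOF

# Then start the container with tracing enabled:
#   docker run -p 19120:19120 --env-file nessie-tracing.env ghcr.io/projectnessie/nessie
```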
Troubleshooting traces¶
If the server is unable to publish traces, first check the logs for a message like the following:
SEVERE [io.ope.exp.int.grp.OkHttpGrpcExporter] (OkHttp http://localhost:4317/...) Failed to export spans.
The request could not be executed. Full error message: Failed to connect to localhost/0:0:0:0:0:0:0:1:4317
This means that the server is unable to connect to the collector. Check that the collector is running and that the URL is correct.
Swagger UI¶
The Swagger UI allows for testing the REST API and reading the API docs. It is available via localhost:19120/q/swagger-ui
Docker image options¶
By default, Nessie listens on port 19120. To expose that port on the host, use -p 19120:19120. To expose that port on a different port on the host system, use the -p option and map the internal port to some port on the host. For example, to expose Nessie on port 8080 of the host system, use the following command:
docker run -p 8080:19120 ghcr.io/projectnessie/nessie
Then you can browse Nessie’s UI on the host by pointing your browser to http://localhost:8080.
Note: this doesn’t change the port Nessie listens on; it only changes the port on the host system that is mapped to the port Nessie listens on. Nessie still listens on port 19120 inside the container. If you want to change the port Nessie listens on, you can use the QUARKUS_HTTP_PORT environment variable. For example, to make Nessie listen on port 8080 inside the container, and expose it to the host system also on 8080, use the following command:
docker run -p 8080:8080 -e QUARKUS_HTTP_PORT=8080 ghcr.io/projectnessie/nessie
Nessie Docker image types¶
Nessie publishes a Java-based multi-platform (for amd64, arm64, ppc64le, s390x) image running on OpenJDK 17.
Advanced Docker image tuning (Java images only)¶
There are many environment variables available to configure the Docker image. If in doubt, leave everything at its default. You can configure the behavior using the following environment variables. They come from the base image used by Nessie, ubi8/openjdk-17. The extensive list of supported environment variables can be found here.
Examples¶
Example | docker run option |
---|---|
Using another GC | -e GC_CONTAINER_OPTIONS="-XX:+UseShenandoahGC" lets Nessie use Shenandoah GC instead of the default parallel GC. |
Set the Java heap size to a fixed amount | -e JAVA_OPTS_APPEND="-Xms8g -Xmx8g" lets Nessie use a Java heap of 8g. |
Reference¶
Environment variable | Description |
---|---|
JAVA_OPTS or JAVA_OPTIONS | NOT RECOMMENDED. JVM options passed to the java command (example: “-verbose:class”). Setting this variable will override all options set by any of the other variables in this table. To pass extra settings, use JAVA_OPTS_APPEND instead. |
JAVA_OPTS_APPEND | User specified Java options to be appended to generated options in JAVA_OPTS (example: “-Dsome.property=foo”). |
JAVA_TOOL_OPTIONS | This variable is defined and honored by all OpenJDK distros, see here. Options defined here take precedence over all else; using this variable is generally not necessary, but can be useful e.g. to enforce JVM startup parameters, to set up remote debug, or to define JVM agents. |
JAVA_MAX_MEM_RATIO | Is used when no -Xmx option is given in JAVA_OPTS. This is used to calculate a default maximal heap memory based on a container’s restriction. If used in a container without any memory constraints for the container then this option has no effect. If there is a memory constraint then -Xmx is set to a ratio of the container’s available memory as set here. The default is 50 which means 50% of the available memory is used as an upper boundary. You can skip this mechanism by setting this value to 0 in which case no -Xmx option is added. |
JAVA_INITIAL_MEM_RATIO | Is used when no -Xms option is given in JAVA_OPTS. This is used to calculate a default initial heap memory based on the maximum heap memory. If used in a container without any memory constraints for the container then this option has no effect. If there is a memory constraint then -Xms is set to a ratio of the -Xmx memory as set here. The default is 25 which means 25% of the -Xmx is used as the initial heap size. You can skip this mechanism by setting this value to 0 in which case no -Xms option is added (example: “25”) |
JAVA_MAX_INITIAL_MEM | Is used when no -Xms option is given in JAVA_OPTS. This is used to calculate the maximum value of the initial heap memory. If used in a container without any memory constraints for the container then this option has no effect. If there is a memory constraint then -Xms is limited to the value set here. The default is 4096MB which means the calculated value of -Xms never will be greater than 4096MB. The value of this variable is expressed in MB (example: “4096”) |
JAVA_DIAGNOSTICS | Set this to get some diagnostics information to standard output when things are happening. This option, if set to true, will set -XX:+UnlockDiagnosticVMOptions . Disabled by default (example: “true”). |
JAVA_DEBUG | If set, remote debugging will be switched on. Disabled by default (example: “true”). |
JAVA_DEBUG_PORT | Port used for remote debugging. Defaults to 5005 (example: “8787”). |
CONTAINER_CORE_LIMIT | A calculated core limit as described in https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt. (example: “2”) |
CONTAINER_MAX_MEMORY | Memory limit given to the container (example: “1024”). |
GC_MIN_HEAP_FREE_RATIO | Minimum percentage of heap free after GC to avoid expansion. (example: “20”) |
GC_MAX_HEAP_FREE_RATIO | Maximum percentage of heap free after GC to avoid shrinking. (example: “40”) |
GC_TIME_RATIO | Specifies the ratio of the time spent outside the garbage collection. (example: “4”) |
GC_ADAPTIVE_SIZE_POLICY_WEIGHT | The weighting given to the current GC time versus previous GC times. (example: “90”) |
GC_METASPACE_SIZE | The initial metaspace size. (example: “20”) |
GC_MAX_METASPACE_SIZE | The maximum metaspace size. (example: “100”) |
GC_CONTAINER_OPTIONS | Specify Java GC to use. The value of this variable should contain the necessary JRE command-line options to specify the required GC, which will override the default of -XX:+UseParallelGC (example: -XX:+UseG1GC). |
HTTPS_PROXY | The location of the https proxy. (example: “myuser@127.0.0.1:8080”) |
HTTP_PROXY | The location of the http proxy. (example: “myuser@127.0.0.1:8080”) |
NO_PROXY | A comma-separated list of hosts, IP addresses or domains that can be accessed directly. (example: “foo.example.com,bar.example.com”) |