Nessie Spark SQL Extension Reference
See also the Nessie Spark SQL Extensions main page.
Nessie SQL commands reference
The following syntax descriptions illustrate how commands are used and the order of the clauses.
The commands provided by the Nessie Spark SQL extensions are a subset of the commands available in the Nessie CLI. Unlike the CLI commands, the Nessie Spark SQL commands accept an `IN <catalog-name>` clause, which is not needed in the Nessie CLI.
Info
`CODE` style means the term is a keyword.
**BoldTerms** mean variable input, see <u>[Description of Command Parts below](#command-parts)</u>.
Square brackets `[` `]` mean that the contents are optional (0 or 1 occurrence).
Curly brackets `{` `}` mean that the contents can be repeated 0 or more times.
CREATE BRANCH / TAG
`CREATE` **ReferenceType** [ `IF` `NOT` `EXISTS` ] **ReferenceName**
[ `IN` **CatalogName** ]
[ `FROM` **ExistingReference** ]
[ `AT` [ `TIMESTAMP` | `COMMIT` ] **TimestampOrCommit** ]
Creates a new Nessie branch or tag with the name given by the **ReferenceName** parameter. The reference type is given by the **ReferenceType** parameter.
By default, the new branch or tag is created from the latest commit on the current reference of the Nessie CLI (see the `USE` statement). Another source reference can be specified using the `FROM` clause. The optional `AT` clause allows specifying a different commit ID (hash) to create the new reference from.
This command fails if a reference with the name **ReferenceName** already exists, unless the optional `IF NOT EXISTS` is specified.
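As a minimal sketch of how these clauses combine, the Python helper below assembles a `CREATE` statement in the clause order shown above. The function name `create_reference_sql`, the branch name `dev`, and the catalog name `nessie` are hypothetical. The `AT` clause is left out to keep the example short.

```python
def create_reference_sql(ref_type, name, catalog=None, from_ref=None,
                         if_not_exists=False):
    """Build a CREATE BRANCH / TAG statement (hypothetical helper)."""
    if ref_type not in ("BRANCH", "TAG"):
        raise ValueError("ref_type must be BRANCH or TAG")
    parts = ["CREATE", ref_type]
    if if_not_exists:
        parts += ["IF", "NOT", "EXISTS"]
    parts.append(name)
    if catalog:
        parts += ["IN", catalog]      # Spark catalog name (assumed: "nessie")
    if from_ref:
        parts += ["FROM", from_ref]   # source reference instead of the current one
    return " ".join(parts)


stmt = create_reference_sql("BRANCH", "dev", catalog="nessie",
                            from_ref="main", if_not_exists=True)
print(stmt)
```

In a PySpark session with the Nessie extensions enabled, the resulting string would typically be passed to `spark.sql(stmt)`.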
DROP BRANCH / TAG
`DROP` **ReferenceType** [ `IF` `EXISTS` ] **ExistingReference**
[ `IN` **CatalogName** ]
Drops the Nessie branch or tag with the name given by the **ExistingReference** parameter. The reference type is given by the **ReferenceType** parameter.
This command fails if a reference with the name **ExistingReference** does not exist, unless the optional `IF EXISTS` is specified.
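The following sketch shows one way to assemble a `DROP` statement from the syntax above. The helper name `drop_reference_sql`, the tag name `old_tag`, and the catalog name `nessie` are hypothetical.

```python
def drop_reference_sql(ref_type, name, catalog=None, if_exists=False):
    """Build a DROP BRANCH / TAG statement (hypothetical helper)."""
    if ref_type not in ("BRANCH", "TAG"):
        raise ValueError("ref_type must be BRANCH or TAG")
    parts = ["DROP", ref_type]
    if if_exists:
        # With IF EXISTS, a missing reference does not fail the command.
        parts += ["IF", "EXISTS"]
    parts.append(name)
    if catalog:
        parts += ["IN", catalog]
    return " ".join(parts)


stmt = drop_reference_sql("TAG", "old_tag", catalog="nessie", if_exists=True)
print(stmt)
```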
ASSIGN BRANCH / TAG
`ASSIGN` **ReferenceType** [ **ExistingReference** ]
[ `TO` **ExistingReference** [ `AT` [ `TIMESTAMP` | `COMMIT` ] **TimestampOrCommit** ] ]
[ `IN` **CatalogName** ]
Assigns the Nessie branch or tag named by the first **ExistingReference** parameter to another commit. The reference type is given by the **ReferenceType** parameter.
By default, the branch or tag is updated to the latest commit on the current reference of the Nessie CLI (see the `USE` statement). Another target reference can be specified using the `TO` clause. The optional `AT` clause allows specifying a different commit ID (hash) to assign the reference to.
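A small sketch of the `ASSIGN` clause order described above, with the `AT` clause omitted for brevity. The helper name `assign_reference_sql` and the reference and catalog names are hypothetical.

```python
def assign_reference_sql(ref_type, name, to_ref=None, catalog=None):
    """Build an ASSIGN BRANCH / TAG statement (hypothetical helper)."""
    if ref_type not in ("BRANCH", "TAG"):
        raise ValueError("ref_type must be BRANCH or TAG")
    parts = ["ASSIGN", ref_type, name]
    if to_ref:
        # Without TO, the latest commit of the current reference is used.
        parts += ["TO", to_ref]
    if catalog:
        parts += ["IN", catalog]
    return " ".join(parts)


stmt = assign_reference_sql("BRANCH", "dev", to_ref="main", catalog="nessie")
print(stmt)
```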
LIST REFERENCES
`LIST` `REFERENCES`
[ `FILTER` **Value** | [ `STARTING` `WITH` **Value** ] [ `CONTAINING` **Value** ] ]
[ `IN` **CatalogName** ]
Lists all named references.
An optional CEL filter can be specified using the `FILTER` clause; it is evaluated on the server side.
The optional `STARTING WITH` clause starts the output at the content-key with the given value.
The optional `CONTAINING` clause only outputs entities with a content-key that contains the given value.
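The syntax above makes `FILTER` an alternative to `STARTING WITH` / `CONTAINING`. The sketch below enforces that choice while building the statement. The helper name `list_references_sql` is hypothetical, and values are emitted exactly as passed, because how **Value** has to be quoted is not stated here.

```python
def list_references_sql(cel_filter=None, starting_with=None,
                        containing=None, catalog=None):
    """Build a LIST REFERENCES statement (hypothetical helper)."""
    if cel_filter and (starting_with or containing):
        # The grammar offers FILTER *or* STARTING WITH / CONTAINING.
        raise ValueError("FILTER cannot be combined with "
                         "STARTING WITH or CONTAINING")
    parts = ["LIST", "REFERENCES"]
    if cel_filter:
        parts += ["FILTER", cel_filter]
    if starting_with:
        parts += ["STARTING", "WITH", starting_with]
    if containing:
        parts += ["CONTAINING", containing]
    if catalog:
        parts += ["IN", catalog]
    return " ".join(parts)


# The bare word "dev" is an illustrative value; quoting is left to the caller.
stmt = list_references_sql(containing="dev", catalog="nessie")
print(stmt)
```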
MERGE BRANCH
`MERGE` [ `DRY` ] [ **ReferenceType** ] **ExistingReference**
[ `AT` [ `TIMESTAMP` | `COMMIT` ] **TimestampOrCommit** ]
[ `INTO` **ExistingReference** ]
[ `BEHAVIOR` **MergeBehaviorKind** ]
[ `BEHAVIORS` **ContentKey** `=` **MergeBehaviorKind** { `AND` **ContentKey** `=` **MergeBehaviorKind** } ]
[ `IN` **CatalogName** ]
Merges a branch or tag into another branch, supporting manual conflict resolution.
The optional `DRY` keyword makes Nessie simulate the merge operation. This is useful to check whether a merge operation would succeed.
Specifying the name of the “from” reference is mandatory. By default, the latest commit of the “from” branch or tag is merged, which can be overridden using the `AT` clause.
By default, `MERGE` uses the CLI’s current reference as the target branch. The `INTO` clause can be used to specify another target branch.
Nessie merge operations currently support three different merge behaviors:
`NORMAL`: a merge succeeds if the content does not have a conflicting change in the target branch.
`FORCE`: a merge always succeeds; the content from the “from” reference is applied onto the target branch.
`DROP`: like `NORMAL`, but conflicting content does not cause a conflict, so it does not fail the whole merge operation.
The merge behavior for all contents defaults to `NORMAL` and can be changed using the `BEHAVIOR` clause.
Merge behaviors for individual content keys can be specified using the `BEHAVIORS` clause.
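To illustrate how the `BEHAVIOR` and `BEHAVIORS` clauses fit into a statement, here is a minimal sketch that builds a dry-run merge with per-key behaviors. The helper name `merge_sql` and the content keys `table1` and `table2` are hypothetical. The `AT` clause is omitted, and the source is always emitted as a `BRANCH`.

```python
MERGE_BEHAVIORS = ("NORMAL", "FORCE", "DROP")


def merge_sql(from_ref, into=None, catalog=None, dry=False,
              behavior=None, key_behaviors=None):
    """Build a MERGE BRANCH statement (hypothetical helper)."""
    key_behaviors = key_behaviors or {}
    for kind in [behavior, *key_behaviors.values()]:
        if kind is not None and kind not in MERGE_BEHAVIORS:
            raise ValueError(f"unknown merge behavior: {kind}")
    parts = ["MERGE"]
    if dry:
        parts.append("DRY")          # simulate the merge only
    parts += ["BRANCH", from_ref]
    if into:
        parts += ["INTO", into]      # target branch instead of the current one
    if behavior:
        parts += ["BEHAVIOR", behavior]
    if key_behaviors:
        pairs = [f"{key} = {kind}" for key, kind in key_behaviors.items()]
        parts += ["BEHAVIORS", " AND ".join(pairs)]
    if catalog:
        parts += ["IN", catalog]
    return " ".join(parts)


stmt = merge_sql("dev", into="main", catalog="nessie", dry=True,
                 key_behaviors={"table1": "FORCE", "table2": "DROP"})
print(stmt)
```

Running the statement with `DRY` first and dropping the keyword once it succeeds is one way to use the simulation described above.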
SHOW LOG
`SHOW` `LOG`
[ [ `ON` [ **ReferenceType** ] ] **ExistingReference** ]
[ `AT` [ `TIMESTAMP` | `COMMIT` ] **TimestampOrCommit** ]
[ `LIMIT` **PositiveInt** ]
[ `IN` **CatalogName** ]
Shows the Nessie commit log.
By default, the commit log is fetched for the current reference of the Nessie CLI, or for the branch or tag given as **ExistingReference**. By default, entities on the latest commit of the branch or tag are listed, which can be overridden using the `AT` clause.
The output can be limited using the `LIMIT` clause. It is safe to omit the `LIMIT` clause for ANSI terminals, because the commit log is paged without overloading either the Nessie CLI or the Nessie server.
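A short sketch of a `SHOW LOG` statement with an explicit `LIMIT`, following the clause order above. The helper name `show_log_sql` and the names `dev` and `nessie` are hypothetical. The `AT` clause is omitted.

```python
def show_log_sql(ref=None, ref_type=None, limit=None, catalog=None):
    """Build a SHOW LOG statement (hypothetical helper)."""
    parts = ["SHOW", "LOG"]
    if ref:
        parts.append("ON")
        if ref_type:
            parts.append(ref_type)   # BRANCH or TAG
        parts.append(ref)
    if limit is not None:
        # PositiveInt: reject bools, non-integers, zero and negatives.
        if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
            raise ValueError("limit must be a positive integer")
        parts += ["LIMIT", str(limit)]
    if catalog:
        parts += ["IN", catalog]
    return " ".join(parts)


stmt = show_log_sql(ref="dev", ref_type="BRANCH", limit=10, catalog="nessie")
print(stmt)
```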
SHOW REFERENCE
`SHOW` `REFERENCE` [ **ExistingReference** ]
[ `AT` [ `TIMESTAMP` | `COMMIT` ] **TimestampOrCommit** ]
[ `IN` **CatalogName** ]
Shows information about the current or given reference.
If no reference is specified, information about the current reference of the Nessie CLI is shown; otherwise, information about the given reference is shown. By default, information for the latest commit of the branch or tag is shown, which can be overridden using the `AT` clause.
Command parts
CatalogName
Spark catalog name.
ReferenceType
`BRANCH` | `TAG`
ExistingReference
Name of an existing reference in Nessie.
ReferenceName
Nessie reference name.
TimestampOrCommit
Either a Nessie commit ID (hash) or a timestamp in ISO format. Examples:
`2024-04-26T10:31:05.277650575Z` is a valid ISO timestamp
`fa32a50d5303a53826f65649277561f5c6772eba019e7f1e01a359becb764877` is a valid Nessie commit ID (hash)
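When building an `AT` clause programmatically, it can help to decide whether a value looks like a commit ID or a timestamp. The sketch below is a simplification under two assumptions that are not stated in this reference: commit IDs are lowercase hexadecimal strings, and timestamps are UTC values ending in `Z`, as in the examples above. The function name `at_keyword` is hypothetical.

```python
import re

# Assumption: commit IDs are lowercase hex; timestamps look like the example above.
_COMMIT_ID = re.compile(r"[0-9a-f]+")
_ISO_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z")


def at_keyword(value):
    """Return "COMMIT" or "TIMESTAMP" for a TimestampOrCommit value."""
    if _COMMIT_ID.fullmatch(value):
        return "COMMIT"
    if _ISO_TIMESTAMP.fullmatch(value):
        return "TIMESTAMP"
    raise ValueError(f"neither a commit ID nor an ISO timestamp: {value!r}")


print(at_keyword("2024-04-26T10:31:05.277650575Z"))
print(at_keyword(
    "fa32a50d5303a53826f65649277561f5c6772eba019e7f1e01a359becb764877"))
```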