forked from delta-io/delta-sharing
Commit 043a39e
fix(deps): update dependency io.delta:delta-standalone_2.13 to v3 (#170)
This PR contains the following updates:
| Package | Change |
|---|---|
| [io.delta:delta-standalone_2.13](https://delta.io/) ([source](https://togithub.com/delta-io/delta)) | `0.6.0` -> `3.0.0` |
---
### Release Notes
<details>
<summary>delta-io/delta (io.delta:delta-standalone_2.13)</summary>
### [`v3.0.0`](https://togithub.com/delta-io/delta/releases/tag/v3.0.0): Delta Lake 3.0.0
We are excited to announce the final release of Delta Lake 3.0.0. This
release includes several exciting new features and artifacts.
#### Highlights
Here are the most important aspects of 3.0.0:
##### Spark 3.5 Support
Unlike the initial preview release, Delta Spark is now built on top of
Apache Spark™ 3.5. See the Delta Spark section below for more details.
##### Delta Universal Format (UniForm)
- Documentation: https://docs.delta.io/3.0.0/delta-uniform.html
- Maven artifacts:
[delta-iceberg_2.12](https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.12/3.0.0/),
[delta-iceberg_2.13](https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.13/3.0.0/)
Delta Universal Format (UniForm) will allow you to read Delta tables
with Hudi and Iceberg clients. Iceberg support is available with this
release. UniForm takes advantage of the fact that all table storage
formats, such as Delta, Iceberg, and Hudi, actually consist of Parquet
data files and a metadata layer. In this release, UniForm automatically
generates Iceberg metadata and commits it to the Hive metastore, allowing
Iceberg clients to read Delta tables as if they were Iceberg tables.
Create a UniForm-enabled table using the following command:
```sql
CREATE TABLE T (c1 INT) USING DELTA TBLPROPERTIES (
'delta.universalFormat.enabledFormats' = 'iceberg');
```
Every write to this table will automatically keep Iceberg metadata
updated. See the documentation
[here](https://docs.delta.io/3.0.0/delta-uniform.html) for more details,
and the key implementations
[here](https://togithub.com/delta-io/delta/commit/9b50cd206004ae28105846eee9d910f39019ab8b)
and [here](https://togithub.com/delta-io/delta/commit/01fee68c).
##### Delta Kernel
- API documentation:
https://docs.delta.io/3.0.0/api/java/kernel/index.html
- Maven artifacts:
[delta-kernel-api](https://repo1.maven.org/maven2/io/delta/delta-kernel-api/3.0.0/),
[delta-kernel-defaults](https://repo1.maven.org/maven2/io/delta/delta-kernel-defaults/3.0.0/)
The Delta Kernel project is a set of Java libraries (Rust will be coming
soon!) for building Delta connectors that can read (and, soon, write to)
Delta tables without needing to understand the [Delta protocol
details](https://togithub.com/delta-io/delta/blob/master/PROTOCOL.md).
You can use this library to do the following:
- Read data from Delta tables in a single thread in a single process.
- Read data from Delta tables using multiple threads in a single
process.
- Build a complex connector for a distributed processing engine and read
very large Delta tables.
- \[soon!] Write to Delta tables from multiple threads / processes /
distributed engines.
Reading a Delta table with the Kernel APIs looks like this:
```java
TableClient myTableClient = DefaultTableClient.create();           // define a client
Table myTable = Table.forPath(myTableClient, "/delta/table/path"); // define which table to scan
Snapshot mySnapshot = myTable.getLatestSnapshot(myTableClient);    // define which version of the table to scan
Predicate scanFilter = ...;                                        // define the predicate
Scan myScan = mySnapshot.getScanBuilder(myTableClient)             // specify the scan details
    .withFilters(scanFilter)
    .build();
Scan.readData(...);                                                // returns the table data
```
Full example code can be found
[here](https://togithub.com/delta-io/delta/blob/branch-3.0/kernel/examples/table-reader/src/main/java/io/delta/kernel/examples/SingleThreadedTableReader.java).
For more information, refer to:
- [User
guide](https://togithub.com/delta-io/delta/blob/branch-3.0/kernel/USER_GUIDE.md)
on the step-by-step process of using Kernel in a standalone Java program
or in a distributed processing connector.
- [Slides](https://docs.google.com/presentation/d/1PGSSuJ8ndghucSF9GpYgCi9oeRpWolFyehjQbPh92-U/edit)
explaining the rationale behind Kernel and the API design.
- Example [Java
programs](https://togithub.com/delta-io/delta/tree/branch-3.0/kernel/examples/table-reader/src/main/java/io/delta/kernel/examples)
that illustrate how to read Delta tables using the Kernel APIs.
- Table and default TableClient API Java
[documentation](https://docs.delta.io/3.0.0/api/java/kernel/index.html)
This release of Delta contains the Kernel Table API and default
TableClient API definitions and implementations, which allow:
- Reading Delta tables, optionally with Deletion Vectors or column
mapping (name mode only) enabled.
- Partition pruning optimization to reduce the number of data files to
read.
##### Welcome Delta Connectors to the Delta repository!
All previous connectors from https://github.com/delta-io/connectors have
been moved to this repository (https://github.com/delta-io/delta) as we
aim to unify our Delta connector ecosystem structure. This includes
Delta-Standalone, Delta-Flink, Delta-Hive, PowerBI, and
SQL-Delta-Import. The repository https://github.com/delta-io/connectors
is now deprecated.
#### Delta Spark
Delta Spark 3.0.0 is built on top of [Apache Spark™
3.5](https://spark.apache.org/releases/spark-release-3-5-0.html).
Similar to Apache Spark, we have released Maven artifacts for both Scala
2.12 and Scala 2.13. Note that the Delta Spark Maven artifact has been
renamed from **delta-core** to **delta-spark**.
- Documentation: https://docs.delta.io/3.0.0/index.html
- API documentation:
https://docs.delta.io/3.0.0/delta-apidoc.html#delta-spark
- Maven artifacts:
[delta-spark_2.12](https://repo1.maven.org/maven2/io/delta/delta-spark_2.12/3.0.0/),
[delta-spark_2.13](https://repo1.maven.org/maven2/io/delta/delta-spark_2.13/3.0.0/),
[delta-contribs_2.12](https://repo1.maven.org/maven2/io/delta/delta-contribs_2.12/3.0.0/),
[delta-contribs_2.13](https://repo1.maven.org/maven2/io/delta/delta-contribs_2.13/3.0.0/),
[delta-storage](https://repo1.maven.org/maven2/io/delta/delta-storage/3.0.0/),
[delta-storage-s3-dynamodb](https://repo1.maven.org/maven2/io/delta/delta-storage-s3-dynamodb/3.0.0/),
[delta-iceberg_2.12](https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.12/3.0.0/),
[delta-iceberg_2.13](https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.13/3.0.0/)
- Python artifacts: https://pypi.org/project/delta-spark/3.0.0/
The key features of this release are:
- [Support for Apache Spark
3.5](https://togithub.com/delta-io/delta/commit/4f9c8b9cc294ec7b321847115bf87909c356bc5a)
- [Delta Universal
Format](https://togithub.com/delta-io/delta/commit/9b50cd206004ae28105846eee9d910f39019ab8b)
- Write as Delta, read as Iceberg! See the highlighted section above.
- [Up to 10x performance improvement of UPDATE using Deletion
Vectors](https://togithub.com/delta-io/delta/commit/0a0ea97b) - Delta
UPDATE operations now support writing Deletion Vectors. When enabled,
the performance of UPDATEs will receive a significant boost.
- [More than 2x performance improvement of DELETE using Deletion
Vectors](https://togithub.com/delta-io/delta/commit/fc39f78d) - This fix
improves the file path canonicalization logic by avoiding calling
expensive `Path.toUri.toString` calls for each row in a table, resulting
in a several hundred percent speed boost on DELETE operations (only when
Deletion Vectors have been
[enabled](https://docs.delta.io/latest/delta-deletion-vectors.html#enable-deletion-vectors)
on the table).
- [Up to 2x faster MERGE
operation](https://togithub.com/delta-io/delta/issues/1827) - MERGE now
better leverages data skipping, uses the insert-only code path in more
cases, and has an overall improved execution, achieving up to 2x better
performance in various scenarios.
- [Support streaming reads from column mapping enabled
tables](https://togithub.com/delta-io/delta/commit/3441df16) when `DROP
COLUMN` and `RENAME COLUMN` have been used. This includes streaming
support for Change Data Feed. See the documentation
[here](https://docs.delta.io/3.0.0/delta-streaming.html#tracking-non-additive-schema-changes)
for more details.
- [Support specifying the columns for which Delta will collect
file-skipping
statistics](https://togithub.com/delta-io/delta/commit/8f2b532a) via the
table property `delta.dataSkippingStatsColumns`. Previously, Delta would
only collect file-skipping statistics for the first N columns in the
table schema (defaulting to 32). Now, users can easily customize this.
- [Support](https://togithub.com/delta-io/delta/commit/d9a5f9f9)
zero-copy [convert to Delta from
Iceberg](https://docs.delta.io/3.0.0/delta-utility.html#convert-an-iceberg-table-to-a-delta-table)
tables on Apache Spark 3.5 using `CONVERT TO DELTA`. This feature was
excluded from the Delta Lake 2.4 release since Iceberg did not yet
support Apache Spark 3.4 (or 3.5). This command generates a Delta table
in the same location and does not rewrite any parquet files.
- [Checkpoint
V2](https://togithub.com/delta-io/delta/blob/master/PROTOCOL.md#v2-checkpoint-table-feature)
- Introduced a new [Checkpoint V2
format](https://togithub.com/delta-io/delta/blob/master/PROTOCOL.md#v2-checkpoint-table-feature)
in Delta Protocol Specification and implemented
[read](https://togithub.com/delta-io/delta/commit/6859c863e88bfe7be6d5ccbb0c221bdde57a00c3)/[write](https://togithub.com/delta-io/delta/commit/7442ebfb8df1ae7ed8630d092abd617c110be5d6)
support in Delta Spark. The new checkpoint v2 format improves
reliability over the existing v1 checkpoint format.
- [Log
Compactions](https://togithub.com/delta-io/delta/commit/5d43f1db5975dca31da29f714b1a155aa4367aee)
- Introduced new log compaction files in the Delta Protocol
Specification, which can reduce the frequency of Delta checkpoints.
Added [read
support](https://togithub.com/delta-io/delta/commit/0e05caf5c2124f61da69dc6671c8011450a6e831)
for log compaction files in Delta Spark.
- [Safe casts enabled by default for UPDATE and MERGE
operations](https://togithub.com/delta-io/delta/commit/6d78d434) - Delta
UPDATE and MERGE operations now result in an error when values cannot be
safely cast to the type in the target table schema. All implicit casts
in Delta now follow `spark.sql.storeAssignmentPolicy` instead of
`spark.sql.ansi.enabled`.
- [General Apache Spark catalog support for auxiliary
commands](https://togithub.com/delta-io/delta/commit/4eb177eaf4c16080887d78407bb64a4183832686) -
Several popular auxiliary commands now support general table resolution
in Apache Spark. This simplifies the code and also makes it possible to
use these commands with custom table catalogs based on Delta Lake
tables. The following commands are now supported in this way: VACUUM,
RESTORE TABLE, DESCRIBE DETAIL, DESCRIBE HISTORY, SHALLOW CLONE,
OPTIMIZE.
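As an illustration of the `delta.dataSkippingStatsColumns` property described above, here is a minimal sketch; the `events` table and its columns are hypothetical, while the property name comes from these release notes:

```sql
-- Collect file-skipping statistics only for the listed columns
-- (hypothetical table and columns; property name from the release notes)
CREATE TABLE events (ts TIMESTAMP, user_id BIGINT, payload STRING)
USING DELTA
TBLPROPERTIES ('delta.dataSkippingStatsColumns' = 'ts,user_id');

-- The property can also be changed on an existing table
ALTER TABLE events
SET TBLPROPERTIES ('delta.dataSkippingStatsColumns' = 'ts,user_id');
```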
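The zero-copy Iceberg conversion described above can be sketched as follows; the table path is hypothetical, and the `iceberg.` path prefix follows the linked conversion documentation:

```sql
-- Generates Delta metadata in the same location; no Parquet files
-- are rewritten (the path is hypothetical)
CONVERT TO DELTA iceberg.`/data/events/iceberg_table`;
```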
Other notable changes include:
- [Fix](https://togithub.com/delta-io/delta/commit/7251507fd83518fd206e54574968054f77a11cc0)
for a bug in MERGE statements that contain a scalar subquery with
non-deterministic results. Such a subquery can return different results
during source materialization, while finding matches, and while writing
modified rows. This can cause rows to be either dropped or duplicated.
- [Fix](https://togithub.com/delta-io/delta/commit/2d922660) for a
potential resource leak when a DV file is not found during a Parquet read
- [Support](https://togithub.com/delta-io/delta/commit/f0a38649)
protocol version downgrade
- [Fix](https://togithub.com/delta-io/delta/commit/9a5eeb73) to initial
preview release to support converting null partition values in UniForm
- [Fix](https://togithub.com/delta-io/delta/commit/d9ba620c) so the
WRITE command no longer commits empty transactions, matching what the
DELETE, UPDATE, and MERGE commands already do
- [Support](https://togithub.com/delta-io/delta/commit/3ff4075d) 3-part
table name identifier. Now, commands like `OPTIMIZE
<catalog>.<db>.<tbl>` will work.
- [Performance
improvement](https://togithub.com/delta-io/delta/commit/d19e989e) to CDF
read queries: scanning in batches reduces the number of cloud requests
and the pressure on the Spark scheduler
- [Fix](https://togithub.com/delta-io/delta/commit/8a2da73d) for an edge
case in CDF read query optimization caused by an incorrect statistic value
- [Fix](https://togithub.com/delta-io/delta/commit/d36623f0) for an edge
case in streaming reads where having the same file with different DVs in
the same batch would yield incorrect results, as the wrong file and DV
pair would be read
- [Prevent](https://togithub.com/delta-io/delta/commit/d9070685) table
corruption by disallowing `overwriteSchema` when `partitionOverwriteMode`
is set to dynamic
- [Fix](https://togithub.com/delta-io/delta/commit/e41db5c1) a bug where
DELETE with DVs would not work on Column Mapping-enabled tables
- [Support](https://togithub.com/delta-io/delta/commit/dbb22100)
automatic schema evolution in structs that are inside maps
- [Minor fix](https://togithub.com/delta-io/delta/commit/7e51538d) to
Delta table path URI concatenation
- [Support](https://togithub.com/delta-io/delta/commit/84c869c5) writing
parquet data files to the `data` subdirectory via the SQL configuration
`spark.databricks.delta.write.dataFilesToSubdir`. This is used to add
UniForm support on BigQuery.
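The last item above can be sketched as a session-level setting; the configuration name comes from these notes, while the `SET` scope shown is an assumption (it could also be set in the Spark session config):

```sql
-- Write new parquet data files under the `data` subdirectory
SET spark.databricks.delta.write.dataFilesToSubdir = true;
```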
#### Delta Flink
Delta-Flink 3.0.0 is built on top of Apache Flink™ 1.16.1.
- Documentation:
https://github.com/delta-io/delta/tree/branch-3.0/connectors/flink
- API Documentation:
https://docs.delta.io/3.0.0/api/java/flink/index.html
- Maven artifact:
[delta-flink](https://repo1.maven.org/maven2/io/delta/delta-flink/3.0.0/)
The key features of this release are:
- Support for [Flink SQL and
Catalog](https://togithub.com/delta-io/delta/commit/47ae5a35). You can
now use the Flink/Delta connector for Flink SQL jobs. You can CREATE
Delta tables, SELECT data from them (uses the Delta Source), and INSERT
new data into them (uses the Delta Sink). Note: for correct operations
on the Delta tables, you must first configure the Delta Catalog using
CREATE CATALOG before running a SQL command on Delta tables. For more
information, please see the documentation
[here](https://togithub.com/delta-io/delta/blob/branch-3.0/connectors/flink/README.md).
- [Significant performance
improvement](https://togithub.com/delta-io/delta/commit/5759de83) to
Global Committer initialization - The delta version last successfully
committed by a given Flink application is now loaded lazily, significantly
reducing CPU utilization in the most common scenarios.
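A minimal sketch of the Flink SQL flow described above, assuming the Flink/Delta connector is on the classpath; the catalog name, table, and the exact `WITH` options are assumptions, so consult the linked README for the authoritative options:

```sql
-- Configure the Delta Catalog first (required for correct operation)
CREATE CATALOG delta_catalog WITH ('type' = 'delta-catalog');
USE CATALOG delta_catalog;

-- Hypothetical table backed by a Delta table on disk
CREATE TABLE events (ts TIMESTAMP, payload STRING)
WITH ('connector' = 'delta', 'table-path' = '/data/events');

INSERT INTO events VALUES (CURRENT_TIMESTAMP, 'hello'); -- uses the Delta Sink
SELECT * FROM events;                                   -- uses the Delta Source
```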
Other notable changes include:
- [Fix](https://togithub.com/delta-io/delta/commit/23826a3b) a bug where
Flink STRING types were incorrectly truncated to type VARCHAR with
length 1
#### Delta Standalone
- Documentation: https://docs.delta.io/3.0.0/delta-standalone.html
- API Documentation:
https://docs.delta.io/3.0.0/api/java/standalone/index.html
- Maven artifacts:
[delta-standalone_2.12](https://repo1.maven.org/maven2/io/delta/delta-standalone_2.12/3.0.0/),
[delta-standalone_2.13](https://repo1.maven.org/maven2/io/delta/delta-standalone_2.13/3.0.0/)
The key features in this release are:
- [Support](https://togithub.com/delta-io/delta/commit/baf54ffd) for
disabling Delta checkpointing during commits - For very large tables
with millions of files, performing Delta checkpoints can become an
expensive overhead during writes. Users can now disable this
checkpointing by setting the Hadoop configuration property
`io.delta.standalone.checkpointing.enabled` to `false`. This is safe,
and suggested, only if another job will periodically perform the
checkpointing.
- [Performance](https://togithub.com/delta-io/delta/commit/f11c3556)
improvement to snapshot initialization - When a Delta table is loaded at
a particular version, the snapshot must contain, at a minimum, the
latest protocol and metadata. This change improves the snapshot load
performance for repeated table changes.
- [Support adding absolute
paths](https://togithub.com/delta-io/delta/commit/02a46d19) to the Delta
log - This now enables users to manually perform `SHALLOW CLONE`s and
create Delta tables with external files.
- [Fix](https://togithub.com/delta-io/delta/commit/4dadc028) in schema
evolution to prevent adding non-nullable columns to existing Delta
tables
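The checkpoint-disabling property from the first item above could be wired up roughly like this. This is a sketch, assuming `delta-standalone` and `hadoop-client` are on the classpath; the class name and table path are hypothetical, while `DeltaLog.forTable` is the standard Standalone entry point:

```java
import org.apache.hadoop.conf.Configuration;
import io.delta.standalone.DeltaLog;

public class NoCheckpointWriter {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Property name from the release notes: skip checkpoint creation
        // during commits; another job is expected to checkpoint periodically.
        conf.set("io.delta.standalone.checkpointing.enabled", "false");
        DeltaLog log = DeltaLog.forTable(conf, "/delta/table/path");
        // ... commit via log.startTransaction() as usual; this process
        // will no longer write checkpoint files.
    }
}
```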
#### Credits
Adam Binford, Ahir Reddy, Ala Luszczak, Alex, Allen Reese, Allison
Portis, Ami Oka, Andreas Chatzistergiou, Animesh Kashyap, Anonymous,
Antoine Amend, Bart Samwel, Bo Gao, Boyang Jerry Peng, Burak Yavuz,
CabbageCollector, Carmen Kwan, ChengJi-db, Christopher Watford, Christos
Stavrakakis, Costas Zarifis, Denny Lee, Desmond Cheong, Dhruv Arya, Eric
Maynard, Eric Ogren, Felipe Pessoto, Feng Zhu, Fredrik Klauss, Gengliang
Wang, Gerhard Brueckl, Gopi Krishna Madabhushi, Grzegorz Kołakowski,
Hang Jia, Hao Jiang, Herivelton Andreassa, Herman van Hovell, Jacek
Laskowski, Jackie Zhang, Jiaan Geng, Jiaheng Tang, Jiawei Bao, Jing
Wang, Johan Lasperas, Jonas Irgens Kylling, Jungtaek Lim, Junyong Lee,
K.I. (Dennis) Jung, Kam Cheung Ting, Krzysztof Chmielewski, Lars Kroll,
Lin Ma, Lin Zhou, Luca Menichetti, Lukas Rupprecht, Martin Grund, Min
Yang, Ming DAI, Mohamed Zait, Neil Ramaswamy, Ole Sasse, Olivier
NOUGUIER, Pablo Flores, Paddy Xu, Patrick Pichler, Paweł Kubit, Prakhar
Jain, Pulkit Singhal, RunyaoChen, Ryan Johnson, Sabir Akhadov, Satya
Valluri, Scott Sandre, Shixiong Zhu, Siying Dong, Son, Tathagata Das,
Terry Kim, Tom van Bussel, Venki Korukanti, Wenchen Fan, Xinyi, Yann
Byron, Yaohua Zhao, Yijia Cui, Yuhong Chen, Yuming Wang, Yuya Ebihara,
Zhen Li, aokolnychyi, gurunath, jintao shen, maryannxue, noelo,
panbingkun, windpiger, wwang-talend, sherlockbeard
</details>
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box
---
This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/agile-lab-dev/whitefox).
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

1 parent f1a8b9e · commit 043a39e
File changed: server/core/build.gradle.kts (1 file changed, 1 addition,
1 deletion) - the `io.delta:delta-standalone_2.13` dependency is bumped
from `0.6.0` to `3.0.0`.