@wenym1 wenym1 commented Dec 6, 2024

Previously, we used async_trait for the IcebergWriter and IcebergWriterBuilder traits. For traits implemented with async_trait, every call to an async method creates a BoxedFuture, which may incur an unnecessary box allocation.

In this PR, we stop using async_trait for the two traits, so that the BoxedFuture can be avoided where it is not needed. To retain object safety, we provide object-safe counterparts to the two traits, named DynIcebergWriter and DynIcebergWriterBuilder. We impl IcebergWriter for Box<dyn DynIcebergWriter> and impl IcebergWriterBuilder for Box<dyn DynIcebergWriterBuilder>, so that a type-erased dyn trait object can still be used as an IcebergWriter or IcebergWriterBuilder. For the two dyn traits, however, the futures returned by their async methods are still boxed futures.
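The split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Writer`, `DynWriter`, `SumWriter`, `write_all` and the tiny `block_on` helper are hypothetical names, and the real traits carry input/output types and return `Result`. It assumes Rust 1.75 or later for `impl Future` in trait methods.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Statically dispatched trait: the returned future is a concrete, unboxed type.
trait Writer: Send {
    fn write(&mut self, input: u64) -> impl Future<Output = u64> + Send;
}

// Object-safe counterpart: it returns a boxed future, so `dyn DynWriter` is allowed.
trait DynWriter: Send {
    fn write_dyn(&mut self, input: u64) -> BoxFuture<'_, u64>;
}

// Every `Writer` is a `DynWriter`; the box allocation happens only on this path.
impl<W: Writer> DynWriter for W {
    fn write_dyn(&mut self, input: u64) -> BoxFuture<'_, u64> {
        Box::pin(self.write(input))
    }
}

// The type-erased object can be passed wherever a `Writer` is expected.
impl Writer for Box<dyn DynWriter> {
    fn write(&mut self, input: u64) -> impl Future<Output = u64> + Send {
        // Dereference to the trait object so the call goes through its vtable.
        (**self).write_dyn(input)
    }
}

// Hypothetical concrete writer: keeps a running sum of its inputs.
struct SumWriter {
    total: u64,
}

impl Writer for SumWriter {
    async fn write(&mut self, input: u64) -> u64 {
        self.total += input;
        self.total
    }
}

// Generic consumer: monomorphized per writer type, so no boxing is required here.
async fn write_all<W: Writer>(writer: &mut W, inputs: &[u64]) -> u64 {
    let mut last = 0;
    for &i in inputs {
        last = writer.write(i).await;
    }
    last
}

// Minimal executor, sufficient for futures that never actually suspend.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn demo() -> (u64, u64) {
    let mut concrete = SumWriter { total: 0 };
    let unboxed = block_on(write_all(&mut concrete, &[1, 2, 3]));
    let mut erased: Box<dyn DynWriter> = Box::new(SumWriter { total: 10 });
    let boxed = block_on(write_all(&mut erased, &[1, 2, 3]));
    (unboxed, boxed)
}
```

In this sketch the same generic `write_all` accepts both the concrete writer and the boxed trait object; only the second pays for a boxed future on each call.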

Note that this PR can introduce a backward compatibility issue in the public API of the library. Code that implements the two traits may have to remove the async_trait attribute that wraps the impl block, as we did in this PR.

@wenym1
Author

wenym1 commented Dec 6, 2024

cc @ZENOTME

@ZENOTME
Contributor

ZENOTME commented Dec 6, 2024

Thanks, @wenym1! I think this PR is great because it lets us avoid some of the limits that object safety places on the writer API design. cc @liurenjie1024 @Xuanwo @Fokko

@ZENOTME
Contributor

ZENOTME commented Dec 10, 2024

This PR needs to resolve the conflicts introduced by #741, which changes the interface of the writer builder.

@wenym1 wenym1 requested a review from ZENOTME December 10, 2024 04:43
@wenym1
Author

wenym1 commented Dec 10, 2024

@ZENOTME Comments are addressed. PTAL

@wenym1 wenym1 requested a review from ZENOTME December 10, 2024 04:56
@liurenjie1024
Contributor

Thanks @wenym1 for this PR. Could you elaborate on the benefit of this change? As you said, it may introduce a breaking API change, so why do we need it? One point you mentioned is the box allocation. Do we have any measurement of how large this cost is compared with the actual I/O?

@ZENOTME
Contributor

ZENOTME commented Dec 10, 2024

Personally, I think the more important benefit of this PR is that it provides extra dyn traits for object safety. After separating the two kinds of trait, we can design the inner trait without worrying about object safety, which avoids problems like #703 (comment). Also, with this PR our writer builder becomes object safe, which it originally was not. In practice, we found this useful because in some cases users want to store a writer builder somewhere, and wrapping it in a Box makes that easier.

@ZENOTME
Contributor

ZENOTME commented Dec 12, 2024

I find that the implementation currently has a problem: it recurses endlessly and eventually overflows the stack.

Reproduce:

    #[tokio::test]
    async fn test_box_writer() {
        let temp_dir = TempDir::new().unwrap();
        let file_io = FileIOBuilder::new_fs_io().build().unwrap();
        let location_gen =
            MockLocationGenerator::new(temp_dir.path().to_str().unwrap().to_string());
        let file_name_gen =
            DefaultFileNameGenerator::new("test".to_string(), None, DataFileFormat::Parquet);

        let pw = ParquetWriterBuilder::new(
            WriterProperties::builder().build(),
            file_io.clone(),
            location_gen,
            file_name_gen,
        );
        let data_file_builder =
            DataFileWriterBuilder::new(Arc::new(Schema::builder().build().unwrap()), pw, None).boxed();

        // stack overflow here.
        let mut writer = data_file_builder.build().await.unwrap();
    }

@wenym1
Author

wenym1 commented Dec 12, 2024

Thanks for pointing it out. The stack overflow was caused by the build methods of IcebergWriterBuilder and DynIcebergWriterBuilder accidentally calling each other in an endless loop. It is fixed now.
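How two such traits can bounce between each other is easiest to see in a reduced, synchronous form. The sketch below is hypothetical (`Builder`, `DynBuilder` and `ParquetLikeBuilder` are made-up names, and the exact call path that recursed in this PR is not shown in the thread); it only illustrates one way the cycle can arise and how dispatching through the trait object breaks it.

```rust
// Static builder trait: consumes the builder.
trait Builder {
    fn build(self) -> String;
}

// Object-safe counterpart: consumes a boxed builder.
trait DynBuilder {
    fn build_dyn(self: Box<Self>) -> String;
}

// Every `Builder` is also a `DynBuilder`. Note that this includes
// `Box<dyn DynBuilder>` itself, because of the impl further down.
impl<B: Builder> DynBuilder for B {
    fn build_dyn(self: Box<Self>) -> String {
        (*self).build()
    }
}

impl Builder for Box<dyn DynBuilder> {
    fn build(self) -> String {
        // Pitfall (one hypothetical way to get the cycle): writing
        //     Box::new(self).build_dyn()
        // would select the blanket impl above with B = Box<dyn DynBuilder>,
        // which calls this `build` again, and so on until the stack overflows.
        //
        // Calling `build_dyn` on the `Box<dyn DynBuilder>` itself resolves to
        // the trait object's method and dispatches to the concrete builder.
        self.build_dyn()
    }
}

// Hypothetical concrete builder used only for this illustration.
struct ParquetLikeBuilder {
    prefix: String,
}

impl Builder for ParquetLikeBuilder {
    fn build(self) -> String {
        format!("{}-writer", self.prefix)
    }
}

fn build_erased() -> String {
    let erased: Box<dyn DynBuilder> = Box::new(ParquetLikeBuilder {
        prefix: "data".to_string(),
    });
    // Uses `impl Builder for Box<dyn DynBuilder>`, then the vtable.
    Builder::build(erased)
}
```

The compiler does not flag the cyclic variant, since each method only calls the other indirectly, so a test that builds through the boxed type is a cheap guard.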

@Xuanwo
Member

Xuanwo commented Dec 13, 2024

Hi, thank you @wenym1 for your work on this, and thanks to @ZENOTME and @liurenjie1024 for their reviews.

I'm a bit concerned about the complexity this PR introduces.

> One point you mentioned is the box allocation. Do we have any measurement of how large this cost is compared with the actual I/O?

Given that users always utilize the dyn-compatible API from outside, I believe the box allocation cannot be avoided.

> Personally, I think the more important benefit of this PR is that it provides extra dyn traits for object safety.

I thought IcebergWriter was already a dyn-compatible trait, but IcebergWriterBuilder is not. Maybe we could work on IcebergWriterBuilder directly? Perhaps we could modify it to take &self instead of self.
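A rough sketch of what the `&self` suggestion buys, using hypothetical stand-in types (`Builder`, `Product`, `FileBuilder`, `FileProduct`) rather than the real IcebergWriterBuilder API: a builder whose `build` borrows instead of consuming can be stored behind `Box<dyn ...>` and called more than once.

```rust
// Hypothetical product trait standing in for a writer.
trait Product {
    fn name(&self) -> String;
}

// Builder that borrows itself: callable through `&dyn Builder`,
// and reusable because `build` does not consume the builder.
trait Builder {
    fn build(&self) -> Box<dyn Product>;
}

struct FileProduct {
    path: String,
}

impl Product for FileProduct {
    fn name(&self) -> String {
        self.path.clone()
    }
}

struct FileBuilder {
    dir: String,
}

impl Builder for FileBuilder {
    fn build(&self) -> Box<dyn Product> {
        Box::new(FileProduct {
            path: format!("{}/data", self.dir),
        })
    }
}

// Works on any type-erased builder and builds from it twice.
fn build_twice(builder: &dyn Builder) -> Vec<String> {
    vec![builder.build().name(), builder.build().name()]
}
```

The trade-off in this shape is that `build` has to clone or share whatever state the product needs, since it can no longer move it out of the builder.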

@Xuanwo
Member

Xuanwo commented Apr 14, 2025

Hi, I personally feel that the benefits of avoiding the use of async_trait at our FileWrite level aren't worth it. I'm going to close this PR now. Feel free to start a discussion about your motivation behind this!

@Xuanwo Xuanwo closed this Apr 14, 2025
@wenym1
Author

wenym1 commented Apr 14, 2025

Sorry for leaving these PRs for a while.

As for motivation, it's more of a general Rust improvement: moving from dynamic dispatch to static dispatch. With async_trait, every call to an async method creates a BoxedFuture, which involves a box allocation. The relative cost of this allocation depends on how much work the function body actually does. If we write or read a large chunk in a single call, the cost is negligible, but if we only read or write a small slice, the allocation becomes a noticeable share of the total cost.

I haven't done a benchmark on iceberg-rust yet. In RisingWave, we had a similar PR, risingwavelabs/risingwave#4182. RisingWave's LSM SST iterator used to create a BoxedFuture for every row in an SST because of async_trait. After switching to a statically typed future, we observed up to a 20% improvement in the benchmark.

@liurenjie1024
Contributor

I'm also not a big fan of the approach used in this PR; it makes things too complicated to maintain. Performance improvements claimed without a clear description of the experiment setup and execution seem meaningless to me.

@Xuanwo
Member

Xuanwo commented Apr 14, 2025

Thank you for sharing this. However, the difference between having loops inside a Box<dyn Iterator> and impl Iterator can be quite significant compared to the difference between a Box<dyn Future> and impl Future, especially when I/O is involved.

I suggest having a simple benchmark for us to evaluate in these cases.

@wenym1
Author

wenym1 commented Apr 14, 2025

Just to clarify, the mentioned usage is more like Box<dyn Stream> vs impl Stream, because the LSM SST iterator also involves I/O, and this is the ultimate reason for using async.

As for Box<dyn Future>, the point is that only some of the calls, rather than all of them, involve I/O. For example, a concrete FileWrite implementation may have a large write buffer: every call to write appends new data to the buffer, and I/O happens only when the buffer is full and the buffered data is sent. In this scenario, only the few writes that fill the buffer involve I/O. In the worst case, when we write data byte by byte, we have to create a Box<dyn Future> for each byte just to append it to the write buffer.
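The buffered-writer scenario can be made concrete with a small counting sketch. It is not a benchmark and measures no time: `BufferedWriter`, `write_boxed`, `run` and the `block_on` helper are hypothetical, and `flush` only stands in for real I/O. It shows that with a boxed future per call, the number of allocations follows the number of `write` calls rather than the number of flushes.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct BufferedWriter {
    buf: Vec<u8>,
    capacity: usize,
    flushes: usize,
}

impl BufferedWriter {
    // Stand-in for the real I/O that sends the buffered data.
    async fn flush(&mut self) {
        self.flushes += 1;
        self.buf.clear();
    }

    // Statically dispatched write: most calls only append to the buffer.
    async fn write(&mut self, byte: u8) {
        self.buf.push(byte);
        if self.buf.len() >= self.capacity {
            self.flush().await;
        }
    }

    // Roughly what a boxed-future API does: one heap allocation per call,
    // whether or not the call ends up doing I/O.
    fn write_boxed(&mut self, byte: u8) -> Pin<Box<dyn Future<Output = ()> + Send + '_>> {
        Box::pin(self.write(byte))
    }
}

// Minimal executor, sufficient for futures that never actually suspend.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

// Writes `bytes` single bytes and returns (boxed futures created, flushes done).
fn run(bytes: usize, capacity: usize) -> (usize, usize) {
    let mut writer = BufferedWriter {
        buf: Vec::new(),
        capacity,
        flushes: 0,
    };
    let mut boxed_futures = 0;
    for i in 0..bytes {
        block_on(writer.write_boxed(i as u8));
        boxed_futures += 1;
    }
    (boxed_futures, writer.flushes)
}
```

With ten one-byte writes and a four-byte buffer, this creates ten boxed futures for two flushes. Whether that overhead matters in practice is exactly what a real benchmark would have to show.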

Anyway, I will run some simple benchmarks to see the effect when I have time.

@Xuanwo
Member

Xuanwo commented Apr 14, 2025

Thank you @wenym1 for the explanation. This makes much more sense—the cost of building a Box<dyn Future> for each byte can be relatively high.

I'm looking forward to seeing your results on this. I'm happy to continue the discussion if you're willing to start a separate one.
