Add datafusion cli for iceberg #1143

Closed
wants to merge 2 commits into from
Conversation

liurenjie1024
Contributor

Which issue does this PR close?

What changes are included in this PR?

Initial check-in of the iceberg CLI.

Are these changes tested?

Yes, with unit tests.

liurenjie1024 requested a review from Xuanwo on March 27, 2025 04:47

github-actions bot left a comment

license-eye has checked 285 files.

| Valid | Invalid | Ignored | Fixed |
|-------|---------|---------|-------|
| 241 | 5 | 39 | 0 |

Invalid files:
  • crates/integrations/cli/Cargo.toml
  • crates/integrations/cli/README.md
  • crates/integrations/cli/src/catalog.rs
  • crates/integrations/cli/src/lib.rs
  • crates/integrations/cli/src/main.rs

Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

liurenjie1024 requested a review from sdd on March 27, 2025 04:48
Member

Maybe we can exclude crates/integrations from the workspace and have a separate rust-toolchain.toml for them.

Contributor Author

What's the benefit of this?

Member

Because we already use a different MSRV for the datafusion integration.

https://github.com/apache/iceberg-rust/blame/e1de63d3ec47ee97d2b22cd193634e9aa9ff6ed8/crates/integrations/datafusion/Cargo.toml#L23-L26

And in a previous discussion:

> We currently only have msrv for iceberg. I'm not sure if our goal is to maintain msrv for all our integration crates, such as datafusion.
>
> Originally posted by @Xuanwo in #849 (comment)

For example, this PR is blocked by this. Actually, we can upgrade the toolchain version used by the datafusion integration without upgrading the project MSRV.

And the CLI project, for example, depends on the datafusion integration, so it needs to use max(project_msrv, datafusion_integration_msrv). That's why I proposed having a separate toolchain for all the integrations.
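
To make the max(project_msrv, datafusion_integration_msrv) rule concrete, here is a minimal Rust sketch. The helper names `parse_version` and `effective_msrv` are hypothetical and chosen only for illustration; nothing here is taken from the iceberg-rust or datafusion manifests.

```rust
/// A Rust version as (major, minor, patch).
/// Tuples compare lexicographically, so `max` picks the newest version.
type Version = (u32, u32, u32);

/// Parse "1.81" or "1.81.0" into a `Version`; a missing patch defaults to 0.
/// Returns `None` for anything that is not two or three dot-separated numbers.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.trim().split('.').map(|p| p.parse::<u32>());
    let major = parts.next()?.ok()?;
    let minor = parts.next()?.ok()?;
    let patch = match parts.next() {
        Some(p) => p.ok()?,
        None => 0,
    };
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// The MSRV a crate effectively needs: the highest MSRV among the
/// project itself and the crates it depends on.
/// Returns `None` if the list is empty or any entry fails to parse.
fn effective_msrv(msrvs: &[&str]) -> Option<Version> {
    msrvs
        .iter()
        .map(|s| parse_version(s))
        .collect::<Option<Vec<_>>>()?
        .into_iter()
        .max()
}
```

With made-up inputs, `effective_msrv(&["1.77.1", "1.81"])` returns `Some((1, 81, 0))`: a crate that depends on both must build with whichever toolchain is newer.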

Member

But it seems our new MSRV policy (3 months) can cover datafusion's 4-month policy, so maybe there's no need to have a separate MSRV anymore.

However, if we only upgrade the MSRV "on release", that may not be enough (imagine datafusion releases more frequently, and we can't bump MSRV until next release). Maybe we could just bump the MSRV "on demand" in this case? We just need to guarantee that the 3-month time range is covered at any time.
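
The coverage argument can be sketched as a small check. This assumes, as a simplification, that both policies have the shape "a toolchain may become the MSRV once it is at least N months old" (N = 3 for this project, N = 4 for datafusion). The exact policy wording is not quoted in this thread, and the names and dates below are hypothetical.

```rust
/// A calendar month as (year, month), with month in 1..=12.
type YearMonth = (i32, u32);

/// Whole months elapsed from `from` to `to` (negative if `to` is earlier).
fn months_between(from: YearMonth, to: YearMonth) -> i32 {
    (to.0 - from.0) * 12 + (to.1 as i32 - from.1 as i32)
}

/// Assumed policy shape (not quoted from either project): a toolchain
/// released in `released` may be adopted as the MSRV in `today` once it
/// is at least `min_age_months` old.
fn allowed_as_msrv(released: YearMonth, today: YearMonth, min_age_months: i32) -> bool {
    months_between(released, today) >= min_age_months
}
```

Under this reading, any toolchain old enough for a 4-month rule also satisfies a 3-month rule, which is the sense in which the shorter window covers the longer one. The remaining gap is timing: the check would have to be applied when the dependency is upgraded, not only at release time.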

Contributor Author

> (imagine datafusion releases more frequently, and we can't bump MSRV until next release.)

Do we need to catch up with every latest datafusion release? In this case I would prefer to release iceberg-rust more often.

Member

I don't have strong opinions on this. It would be nice to have more frequent iceberg-rs releases.

But it seems currently we want to wait for some internal milestones to be finished before release. So if some user/developer wants to upgrade datafusion, they have to wait for it. This slowness seems a little unnecessary to me. 🤔

To summarize:

  • approach 1: use a separate toolchain/MSRV for integrations (currently used).
  • approach 2: use the same toolchain/MSRV.
    • to do this, we would either need to bump the MSRV on demand (not only on release) or release more frequently.

Also cc @Fokko @Xuanwo @sdd @c-thiel for opinions.

@xxchan
Member

xxchan commented Mar 28, 2025

Maybe we could have a tracking issue and create subtasks for feature ideas?

github-actions bot left a comment

license-eye has checked 291 files.

| Valid | Invalid | Ignored | Fixed |
|-------|---------|---------|-------|
| 245 | 2 | 44 | 0 |

Invalid files:
  • crates/integrations/cli/Cargo.toml
  • crates/integrations/cli/README.md

Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

@liurenjie1024
Contributor Author

> Maybe we could have a tracking issue and create subtasks for feature ideas?

Do you mean to have a tracking issue for all feature ideas?

@liurenjie1024
Contributor Author

It seems we can't merge this, since datafusion-cli requires a newer version of rustc. We should postpone it until after we release 0.5.0.

@xxchan
Member

xxchan commented Apr 7, 2025

> > Maybe we could have a tracking issue and create subtasks for feature ideas?
>
> Do you mean to have a tracking issue for all feature ideas?

Yes

> It seems we can't merge this, since datafusion-cli requires a newer version of rustc. We should postpone it until after we release 0.5.0.

I replied with some thoughts in #1143 (comment)

github-actions bot left a comment

license-eye has checked 292 files.

| Valid | Invalid | Ignored | Fixed |
|-------|---------|---------|-------|
| 246 | 2 | 44 | 0 |

Invalid files:
  • crates/integrations/cli/Cargo.toml
  • crates/integrations/cli/README.md

Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

@liurenjie1024
Contributor Author

Closed in favor of #1193.


Successfully merging this pull request may close these issues.

An iceberg cli tool
2 participants