Skip to content

Conversation

magentaqin
Copy link
Contributor

@magentaqin magentaqin commented Aug 4, 2025

Description

This PR is to support uploading by host. It's based on my previous PR: #1575

Why we need this host support?

upload function in pixi and rattler_build is different. In pixi, it only supports upload by url, but in rattler_build, it only supports upload by subcommands "prefix", "anaconda"...etc. We need to align these two.

Code changes I made

1.support parsing url, match it to specific server type and extract base_url and channel from the host
2.align progress bar style: migrate pixi_progress and create rattler_progress that serves the uploading part.

impl ServerPatterns {
fn new() -> Self {
Self {
// Prefix.dev patterns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using regex is the best choice. We could probably also do this with regular URL parsing and some contains queries and similar, e.g.:

url.host_str.contains(prefix.dev)
url.path().ends_with(/conda-forge)
url.protocol() == "s3" ... 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Indeed, Regex is fragile and has some maintainability issues...I'll switch to the URL parsing.

pub fn extract_artifactory_info(url: &Url) -> Result<(Url, String), Box<dyn std::error::Error>> {
let url_str = url.as_str();
let artifactory_pattern =
Regex::new(r"^(https?://[^/]+)/artifactory/([^/]+)(?:/(.+)/([^/]+))?").unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think here just using regular URL parsing and then skipping components will be more robust than regex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example:

let path_segments: Vec<&str> = url.path_segments()
    .ok_or("Invalid URL: cannot extract path segments")?
    .collect();

if path_segments.len() >= 2 && path_segments[0] == "artifactory" {
    // 1. Create base URL by removing path components
    let mut base_url = url.clone();
    base_url.set_path("");
    base_url.set_query(None);
    base_url.set_fragment(None);

    // 2. Extract channel
    let channel = path_segments[1].to_string();

    Ok((base_url, channel))
} else {
    Err("Invalid Artifactory URL format".into())
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

Copy link
Collaborator

@baszalmstra baszalmstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk if a "generic" rattler_progress makes sense. The implementation is very specific. Id much rather have a slimmed down version of progress reporting in the rattler_upload code itself that can be enabled with a feature (we do the same for rattler install code).

@magentaqin
Copy link
Contributor Author

Idk if a "generic" rattler_progress makes sense. The implementation is very specific. Id much rather have a slimmed down version of progress reporting in the rattler_upload code itself that can be enabled with a feature (we do the same for rattler install code).

Thanks @baszalmstra for your advice! My original thought is not only for rattler_upload, but to align "progress" in pixi and rattler-build, so I directly moved the progress bar from pixi to rattler. Do you think it's necessary to have this rattler_progress crate?

let host_url = args.host.as_ref().unwrap();
let (base_url, channel) =
extract_prefix_info(host_url).expect("Failed to parse Prefix URL");
ServerType::Prefix(PrefixOpts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we fill in e.g.. API key from the args here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Wolf. There're three things I need to explain.

  1. According to pixi doc, upload doesn't have api_key arg. https://pixi.sh/latest/reference/cli/pixi/upload/#description

  2. The common args was designed like this(not having api_key) option.

Screenshot 2025-09-23 at 09 39 54
  1. There're two ways to upload packages: URL and subcommand.

Only when you specify the subcommand as "prefix", can you pass the arg api_key. But in our case, we upload package through URL(prefix pattern), api_key is not allowed to pass through the args. Users must pass the token by setting them locally.


/// The region of the S3 backend
pub region: String,
pub region: Option<String>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this now optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it supports upload by URL now, and the region can be written in the URL. For example, the user can upload like this " https://.s3..amazonaws.com/my-channel"

/// Options for uploading to S3
#[cfg(feature = "s3")]
#[derive(Clone, Debug, PartialEq, Parser)]
pub struct S3Opts {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reuse the S3CredentialsOpts from crates/rattler_s3/src/clap.rs here to get a consistent CLI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do it! thanks for reminding!

#[derive(Parser, Debug)]
pub struct UploadOpts {
/// The host + channel (optional if the server type is provided)
pub host: Option<Url>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are clap arguments here to make sure that if host is provide that channel must also be provide. Or that it conflicts with server_type. Use those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Bas! Nice catch! I mean the hostURL should consider "host" + "channel", and in check_server_type(https://github.com/magentaqin/rattler/blob/14a0e826791da57dbe5e02db24597b6cf5b8873a/crates/rattler_upload/src/utils/server_util.rs#L25), I will check whether the URL contains channel.
But you're right, I should use clap arguments to make "host" and "server_type" mutually exclusive.

#[cfg(feature = "s3")]
#[derive(Debug)]
#[allow(missing_docs)]
pub struct S3Data {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this struct? The name is quite ambigous. Also, doesnt the rattler_s3 crate provide similar functionality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, Bas. S3Data was migrated before we had rattler_s3 crate. Let me see how to reuse the functionality!

/// * `SimpleServerType` - The detected server type or Unknown
///
/// ```
pub fn check_server_type(host_url: &Url) -> SimpleServerType {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implement From or SimpleServerType::from_url.

#[cfg(feature = "s3")]
S3,
CondaForge,
Unknown,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cant we represent this as an Option<SimpleServerType>?


// 3. Check Conda-forge (GitHub)
if host.contains("conda-forge") {
return SimpleServerType::CondaForge;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is conda-forge a server?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's originally from here: https://github.com/prefix-dev/rattler-build/blob/v0.44.0/src/upload/conda_forge.rs I just migrated it to rattler! And for the URL part, I add the corresponding part.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file needs a lot of tests to verify that the SimpleServerType is correct for a number of URLs.

Anaconda,
#[cfg(feature = "s3")]
S3,
CondaForge,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of server is this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's originally from here: https://github.com/prefix-dev/rattler-build/blob/v0.44.0/src/upload/conda_forge.rs I just migrated it to rattler!

"rt-multi-thread",
"process",
] }
regex = "1.10"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it. It was used before refactor.

@wolfv
Copy link
Contributor

wolfv commented Oct 13, 2025

Hey @magentaqin I'll take this PR over and get it merged! I'll remove some changes from the S3 upload because IMO progress bar shoudl go to a different PR.

@magentaqin
Copy link
Contributor Author

Hey @magentaqin I'll take this PR over and get it merged! I'll remove some changes from the S3 upload because IMO progress bar shoudl go to a different PR.

Sounds good, thanks for taking it over! 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants