Add support for s3-compatible storage #601
Conversation
```rust
/// An Amazon Web Services storage bucket.
#[cfg(feature = "server-aws")]
Aws {
    /// Region in which the bucket is located.
```
Let's add some docs here as to when this is and is not required.
How does the current (code) documentation feel here?
I'm slowly getting my local rust development environment up to snuff, running docs and tests locally :D. I fixed the existing tests just by skipping the new optional fields.

What would good tests here look like? Do we want a series of tests that require a real, running S3-compatible storage, the way we have the AWS tests? Is there a nice Rust mechanism ("features" maybe?) for running the same tests on different backends?

... or we could keep using the existing AWS implementation, and just pass in the endpoint_url as s3.us-east-1.amazonaws.com 🤔 (although that wouldn't be a great test, since if endpoint_url wasn't piped through correctly, it would still work).

For this PR, should I be doing anything for the failed semver check? I'm pretty sure we expected that in #546, mentioning in #599 that we were already planning a breaking change.
djmitche left a comment:
> I'm slowly getting my local rust development environment up to snuff, running docs and tests locally :D. I fixed the existing tests just by skipping the new optional fields.

Nice, and that also verifies that existing users have a simple migration path.

> What would good tests here look like? Do we want a series of tests that require a real, running S3-compatible storage the way we have the AWS tests? Is there a nice rust mechanism ("features" maybe?) for running the same tests on different backends?

For the moment, I think we can document your manual testing against a specific backend, with an issue filed to automate that testing. I think we will want to document which backends we have tested with, and where possible automate that testing -- but some of those backends are probably too complicated to set up.

> ... or we could keep using the existing AWS implementation, and just pass in the endpoint_url as s3.us-east-1.amazonaws.com 🤔 (although that wouldn't be a great test, since if endpoint_url wasn't piped through correctly, it would still work).

This is a good idea just to test the codepaths.

> For this PR, should I be doing anything for the failed semver check? I'm pretty sure we expected that in #546, mentioning that in #599 we were already planning a breaking change.
Yes, you can bump the version in Cargo.toml:

```toml
version = "4.0.0-pre"
```
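On the "features" question raised earlier in this review, the pattern could look roughly like the following sketch. This is illustrative only, not code from this repository: the `Service` trait, `InMemory` backend, and feature names are assumptions; the idea is a shared test body invoked from `cfg`-gated modules, one per backend.

```rust
use std::collections::HashMap;

// Illustrative storage trait; the real crate's `Service` trait differs.
trait Service {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

// Shared test body, reusable by every backend's test module.
fn roundtrip_test<S: Service>(mut svc: S) {
    svc.put("testy", b"foo".to_vec());
    assert_eq!(svc.get("testy"), Some(b"foo".to_vec()));
}

// An in-memory backend standing in for a real one.
struct InMemory(HashMap<String, Vec<u8>>);

impl Service for InMemory {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.0.insert(key.to_string(), value);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

// In a real crate, each backend's tests would live behind its feature flag:
//
//     #[cfg(all(test, feature = "server-aws"))]
//     mod aws_tests {
//         #[test]
//         fn roundtrip() { super::roundtrip_test(make_aws_service()) }
//     }

fn main() {
    roundtrip_test(InMemory(HashMap::new()));
    println!("in-memory roundtrip ok");
}
```

The shared-body approach keeps the assertions in one place, so adding a new backend only means wiring up its constructor behind a feature.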
I figured out how to use the …

> Step up from "manual test" of …

Done below!
Was 4.x intentional? It looks like 3.x was the next bump! I went for that, but can happily change up to 4.

Alright, good and bad! I've actually updated those manual tests to allow specifying our two new parameters, endpoint url and force path style. Here's what works:
If we modify the code to make AWS_TEST_REGION optional, we would still get an error if we did not specify it:

```
Caused by:
    Error { code: "PermanentRedirect", message: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.", s3_extended_request_id: "sIBjQDiUbiaghYjxvN1sY4FHbz6kb3Z071s6u0+8lrTX29KLNOkyL1lx+otmOBw5ECKgUBo6ynQ=", aws_request_id: "GB3PYDCYZGCRFZ47" }
```

If my AWS_PROFILE is set up to use that region (or just the aws-sdk supported env var), I think that's OK.

Now, what's worse is that despite manual testing being successful, the test suite fails:

```
$ AWS_TEST_REGION=fake AWS_TEST_BUCKET=task-test-be13a225-d157-4488-8c56-65cbb9a7a522 AWS_TEST_ACCESS_KEY_ID=travis AWS_TEST_SECRET_ACCESS_KEY=hunter2 AWS_TEST_ENDPOINT_URL=https://mydomain AWS_TEST_FORCE_PATH_STYLE=true cargo test
....
failures:

---- server::cloud::aws::tests::compare_and_swap_matches stdout ----
thread 'server::cloud::aws::tests::compare_and_swap_matches' (252165) panicked at src/server/cloud/test.rs:189:5:
assertion failed: svc.compare_and_swap(&pfx("testy"), Some(b"foo2".to_vec()), b"bar".to_vec())?
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    server::cloud::aws::tests::compare_and_swap_matches
```

(The reason my manual testing worked: I only added new tasks; I don't think I modified an existing one when I attempted to re-sync?)

After a lot of debugging this morning I've gotten it down to:
And here is what's interesting. When I printed out the etag with some printf-style debugging, I noticed what it was printing. I set up two aliases, `aaws` and `caws` (the latter pointing at my Ceph object gateway):

```
$ aaws s3api put-object --bucket test-travis-a96d --key foo --body file2 | jq -r '.ETag'
"b1946ac92492d2347c6235b4d2611184"
$ caws s3api put-object --bucket task-test-be13a225-d157-4488-8c56-65cbb9a7a522 --key foo --body file2 | jq -r '.ETag'
"b1946ac92492d2347c6235b4d2611184"
```

Note that the ETag value has quotes in it! OK, let's test that out:

```
$ aaws s3api put-object --bucket test-travis-a96d --key foo --body file --if-match '"b1946ac92492d2347c6235b4d2611184"'
{
    "ETag": "\"a10edbbb8f28f8e98ee6b649ea2556f4\"",
    "ChecksumCRC64NVME": "NiebZSsdOpc=",
    "ChecksumType": "FULL_OBJECT",
    "ServerSideEncryption": "AES256"
}
$ caws s3api put-object --bucket task-test-be13a225-d157-4488-8c56-65cbb9a7a522 --key foo --body file --if-match '"b1946ac92492d2347c6235b4d2611184"'
argument of type 'NoneType' is not iterable
```

And let's try if we strip the quotes:

```
$ aaws s3api put-object --bucket test-travis-a96d --key foo --body file2 | jq -r '.ETag'
"b1946ac92492d2347c6235b4d2611184"
$ caws s3api put-object --bucket task-test-be13a225-d157-4488-8c56-65cbb9a7a522 --key foo --body file2 | jq -r '.ETag'
"b1946ac92492d2347c6235b4d2611184"
$ aaws s3api put-object --bucket test-travis-a96d --key foo --body file --if-match 'b1946ac92492d2347c6235b4d2611184'
{
    "ETag": "\"a10edbbb8f28f8e98ee6b649ea2556f4\"",
    "ChecksumCRC64NVME": "NiebZSsdOpc=",
    "ChecksumType": "FULL_OBJECT",
    "ServerSideEncryption": "AES256"
}
$ caws s3api put-object --bucket task-test-be13a225-d157-4488-8c56-65cbb9a7a522 --key foo --body file --if-match 'b1946ac92492d2347c6235b4d2611184'
{
    "ETag": "\"a10edbbb8f28f8e98ee6b649ea2556f4\""
}
```

So, the AWS API supports using the quoted value from the ETag response directly (which includes quotes intentionally; there are some previous aws-sdk issues on GitHub about it I can add later), but at least my Ceph object gateway does not! I'm not sure if the right answer is for us to strip quotes here, because it appears to work with AWS S3 without them (do they document anywhere that it should work?), to add more configuration options to support this for s3-"compatible" APIs, or to submit an upstream ticket to Ceph (I probably should do that either way).

We also can generate the etag from the file contents instead of relying on an extra get_object call and comparing content. The ETag that S3 returns is just an md5sum of the file contents:

```
$ md5sum < file2
b1946ac92492d2347c6235b4d2611184  -
```

I think I want to check if one more s3-compatible API has this same problem.

[edit] https://tracker.ceph.com/issues/64439 rhymes with our current problem, but I can't quite say for sure if it's the same.
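The quoting behavior above can be reproduced in plain Rust. This is an illustrative sketch (not code from the patch) showing why comparing the raw ETag header value against a bare md5 hex digest fails until the surrounding quotes are stripped:

```rust
fn main() {
    // S3 returns the ETag header with literal double quotes embedded in the value.
    let etag_from_response = "\"b1946ac92492d2347c6235b4d2611184\"";
    // The bare md5 hex digest of the object body.
    let md5_hex = "b1946ac92492d2347c6235b4d2611184";

    // Comparing the raw header value against the digest fails...
    assert_ne!(etag_from_response, md5_hex);
    // ...but stripping the surrounding quotes makes them match.
    assert_eq!(etag_from_response.trim_matches('"'), md5_hex);
    println!("quote-stripped comparison matches");
}
```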
Testing minio fails, but on a different test 🙄

```
failures:

---- server::cloud::aws::tests::compare_and_swap_disappears stdout ----
deleting object fcacc68a90e3426b81a400a551299155-racing-delete
thread 'server::cloud::aws::tests::compare_and_swap_disappears' (254663) panicked at src/server/cloud/test.rs:250:5:
assertion failed: !svc.compare_and_swap(&pfx("racing-delete"), Some(b"foo1".to_vec()),
    b"bar".to_vec())?
```

(it succeeded …)
This patch switches to calculating the md5sum, avoiding the extra get_object call:

```diff
diff --git a/Cargo.lock b/Cargo.lock
index 26cf0ae9d..b966e1678 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1639,6 +1639,12 @@ dependencies = [
  "digest",
 ]
 
+[[package]]
+name = "md5"
+version = "0.8.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ae960838283323069879657ca3de837e9f7bbb4c7bf6ea7f1b290d5e9476d2e0"
+
 [[package]]
 name = "memchr"
 version = "2.7.4"
@@ -2675,6 +2681,7 @@ dependencies = [
  "flate2",
  "google-cloud-storage",
  "log",
+ "md5",
  "pretty_assertions",
  "proptest",
  "ring",
diff --git a/Cargo.toml b/Cargo.toml
index 6faf0908b..4b8dd0726 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -67,6 +67,7 @@ thiserror = "2.0"
 ureq = { version = "^2.12.1", features = ["tls"], optional = true }
 uuid = { version = "^1.16.0", features = ["serde", "v4"] }
 url = { version = "2", optional = true }
+md5 = "0.8.0"
 
 [dev-dependencies]
 proptest = "^1.7.0"
diff --git a/src/server/cloud/aws.rs b/src/server/cloud/aws.rs
index 7f74c80f3..6ede4888b 100644
--- a/src/server/cloud/aws.rs
+++ b/src/server/cloud/aws.rs
@@ -226,35 +226,8 @@ impl Service for AwsService {
     ) -> Result<bool> {
         self.block_on(async {
             validate_object_name(name);
-            let get_res = if_key_exists(
-                self.client
-                    .get_object()
-                    .bucket(self.bucket.clone())
-                    .key(name)
-                    .send()
-                    .await
-                    .map_err(aws_err),
-            )?;
-            // Check the expectation and gather the e_tag for the existing value.
-            let e_tag;
-            if let Some(get_res) = get_res {
-                // If a value was not expected but one exists, that expectation has not been met.
-                let Some(existing_value) = existing_value else {
-                    return Ok(false);
-                };
-                e_tag = get_res.e_tag.clone();
-                let body = get_body(get_res).await?;
-                if body != existing_value {
-                    return Ok(false);
-                }
-            } else {
-                // If a value was expected but none exists, that expectation has not been met.
-                if existing_value.is_some() {
-                    return Ok(false);
-                }
-                e_tag = None;
-            };
+            let e_tag = existing_value.map(md5::compute).map(|d| format!("{:x}", d));
             // When testing, an object named "$pfx-racing-delete" is deleted between get_object and
             // put_object.
```

(I have no idea if the …) Thoughts? I'm not sure how I feel about a change like this to get around another service's bug :|, but on the bright side we reduce an extra API call?
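The core of the change above is that the expected ETag is now derived from the expected value itself, rather than fetched with a GET. A minimal sketch of that mapping follows; note that the real patch uses `md5::compute` from the `md5` crate, and `hex_digest` below is a dependency-free stand-in hash (NOT md5) used only to keep this sketch self-contained.

```rust
// Stand-in for `md5::compute` in this sketch: FNV-1a over the bytes,
// hex-encoded. The real patch produces a genuine md5 digest instead.
fn hex_digest(data: &[u8]) -> String {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in data {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", hash)
}

// Mirrors `existing_value.map(md5::compute).map(|d| format!("{:x}", d))`:
// when no existing value is expected (None), no If-Match tag is produced.
fn expected_etag(existing_value: Option<&[u8]>) -> Option<String> {
    existing_value.map(hex_digest)
}

fn main() {
    assert_eq!(expected_etag(None), None);
    let tag = expected_etag(Some(&b"foo2"[..])).unwrap();
    assert_eq!(tag.len(), 16); // u64 hash renders as 16 hex chars
    println!("etag for b\"foo2\": {tag}");
}
```

The upside, as noted above, is one fewer API call per compare-and-swap; the downside is that it assumes the service's ETag is the md5 of the body, which (as pointed out later in the thread) does not hold for all bucket configurations.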
The reason minio does not pass the racing delete test:

```
$ maws s3api head-object --bucket test --key foo

An error occurred (404) when calling the HeadObject operation: Not Found
$ maws s3api put-object --bucket test --key foobar --body file2 --if-match b1946ac92492d2347c6235b4d2611184
{
    "ETag": "\"b1946ac92492d2347c6235b4d2611184\"",
    "ChecksumCRC64NVME": "akP7S61aVgc=",
    "ChecksumType": "FULL_OBJECT"
}
```

minio ignores the `--if-match` condition when the object does not exist.
We can't/shouldn't assume the ETag is an md5sum: https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity-upload.html#checking-object-integrity-etag-and-md5 (if you use a bucket with KMS encryption, it won't be).
It seems like Minio doesn't provide a way to reliably compare-and-swap a value for the absence of a value (that is, to prevent two racing processes from both succeeding at putting objects when none exists). So that's not actually enough to support TaskChampion.

I think the Ceph issue can be worked around by just stripping the quotes: both services include quotes in responses, and both services will accept an unquoted string in a request. Or have I missed a further issue there?
Exciting news! minio was fixed in minio/minio#21526 (comment). And as of their version: …

On the Ceph front, 👍 stripping quotes seems to work, and we could get tests to pass with … That being said:

It looks like stripping quotes would technically be non-conformant with the spec, as they consider the quotes part of the ETag. We can't say for sure whether this could break other implementations (there are others, like Hetzner's, that I won't be able to test). Do we want to intentionally release something non-conformant, keeping in mind that it works in practice for the implementations we see? And if a new implementation comes along and we have to choose between breaking Ceph (which is non-compliant) and supporting $newthing (which is compliant), make the choice then? (Or flag-ify it?)
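The "flag-ify it" option floated above could look roughly like this sketch. Everything here is hypothetical (the `S3Compat` struct and `strip_etag_quotes` knob are not in this PR): the workaround is applied only when a user opts in for a backend known to reject quoted ETags.

```rust
/// Hypothetical config for an S3-compatible backend. `strip_etag_quotes`
/// is an illustrative opt-in knob, not an option this PR actually adds.
struct S3Compat {
    strip_etag_quotes: bool,
}

impl S3Compat {
    /// Produce the value to send in an If-Match header. The spec treats the
    /// surrounding quotes as part of the ETag, so stripping them is a
    /// pragmatic workaround for non-conformant backends, not default behavior.
    fn if_match_value<'a>(&self, etag: &'a str) -> &'a str {
        if self.strip_etag_quotes {
            etag.trim_matches('"')
        } else {
            etag
        }
    }
}

fn main() {
    let etag = "\"b1946ac92492d2347c6235b4d2611184\"";

    // Default, spec-conformant behavior: pass the quoted ETag through.
    let conformant = S3Compat { strip_etag_quotes: false };
    assert_eq!(conformant.if_match_value(etag), etag);

    // Opt-in workaround for backends (e.g. the Ceph gateway discussed
    // above) that reject the quoted form.
    let workaround = S3Compat { strip_etag_quotes: true };
    assert_eq!(workaround.if_match_value(etag), "b1946ac92492d2347c6235b4d2611184");
    println!("both modes behave as expected");
}
```

Gating the behavior keeps the default conformant, so a future compliant backend would work without changes while the workaround stays available for the implementations that need it.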
https://tracker.ceph.com/issues/64439 does seem like the same issue, and I think that's a bug in Ceph. We should probably try harder to get that fixed, and only build in a workaround if both (a) someone really wants Ceph support and (b) Ceph provides a reasonable justification for not fixing the bug. I realize I'm contradicting an earlier comment of mine, sorry :)

But for the moment it looks like we could advertise support for minio, with the minimum version you specified -- that seems like a great thing to get landed now. We can include a note in the docs that Ceph support is waiting on a fix for the linked bug.
(Will add description and move out of draft)