Skip to content

Conversation

elwincheng
Copy link
Collaborator

@elwincheng elwincheng commented Aug 17, 2025

elwin@DESKTOP-46267GC:~/dev/Mist/src/images$ go test
2025/09/15 18:52:19 WARN Volume does not exist volumeName=nonexistent_volume_t4
2025/09/15 18:52:21 ERROR Failed to remove volume volumeName=test_volume_t5 error="Error response from daemon: remove test_volume_t5: volume is in use - [b3dc76981255bcae4547ef997970146d478c2c8dcb5ed1f88f44c69a48d8412a, ec89342dfcb1a93ae63f0b0ad93b3d38cfa0b7ee1561825a7e34f8f3f53269cf]"
2025/09/15 18:52:30 ERROR Failed to remove volume volumeName=test_volume_t5 error="Error response from daemon: remove test_volume_t5: volume is in use - [b3dc76981255bcae4547ef997970146d478c2c8dcb5ed1f88f44c69a48d8412a]"
2025/09/15 18:52:53 ERROR Failed to remove volume volumeName=test_volume_t7 error="Error response from daemon: remove test_volume_t7: volume is in use - [6cb2b2cc2d799d0e89b5598a0d1df347cde053e485def4c0d7ba14e187d12acc, 1f00b4847e15778db0976b305976981b5a8d49e2821fa6af0ce1f583ab645d92, 8826fc56492f60d4f60bd3e79311dc9d317320da3e0fb3d6d682ba80694f8a19]"
2025/09/15 18:53:14 WARN Volume limit reached limit=100
2025/09/15 18:53:22 WARN Container limit reached limit=10
2025/09/15 18:54:59 ERROR Failed to remove volume volumeName=test_volume_t10 error="Error response from daemon: remove test_volume_t10: volume is in use - [da4c01ee51ae2939f5f6935f04cdb0c93ef125d5910432b81b35cd664e030243, c2cd99981648f58a85ac61857d14c1139f690ba9da335448cd58d4e4214344ec, d46c023de92057fa0f5984500be0f55185f101830f2274e7be3a38c6128bfe7d, a25e7bcac907360331392f5f1a19b4a49dff798193f1ffb64a0f116e59888313, 315aebee4f5aa3f5e9f98a21b0ca0cbca439c44bde3cd4a811c741a52561f0f6, 18a9cbbf388cfd72a83be637deeae7ddd11bbf33f7aeb3fa8acfa0b5d51ef42b, c0e8771cf2583b1187a713a114c183a78d234ec9b819d405b92a383dd91a7346, 1e7a8bb38c07e3887968490c819439778cf8ba076643316736970708697a99dc, 4e8c0a7bd18bbca04f1cd509ee575f1e26bdab78109d4970dd0a4f5cc850b2b8, 34667eb684c077cae995c5a6177c8d9687fff2c8561822cdf470c61a8cbacac2, d5245332e4b810511144700cf1c14ee5be590da05f0e79dbb101d5b81e5da54a, 2155a5d5af438b0656ca81521ade194ac3dc15df46692b67fce66a4040e3b842, 1552b023fabbcc0c7994738e375398ce6b9a26b1015833231131965b54c783bc, 6290884ae1e77e0f0e6bf88717342f18eb848395cea64fecb31d7bb5e69c51c3]"
PASS
ok mist/images 170.630s

Copy link
Collaborator

@KCui0327 KCui0327 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! General Comments:

  1. Push changes in go.mod and go.sum
  2. Please provide documentation on the APIs and the existing test plans
  3. Add some comments in the code for logic that may be tricky to understand or ambigious
  4. Add more logging messages (i.e. if we enter error branch, we should log explicitly what is the error)
  5. Avoid using panic for code that will go into prod. but rather handle them in the caller

}
}

func (mgr *ContainerMgr) stopContainer(containerID string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return status when stopContainer exits

Copy link
Collaborator

@blobcode blobcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: looks ok; though should probably be relocated / better integrated under the supervisor, esp. ContainerMgr (probably out of scope for this PR though)

@elwincheng elwincheng force-pushed the elwin_dev_container_squashed branch from 641739c to 258ce6e Compare September 3, 2025 19:09
Copy link
Collaborator

@KCui0327 KCui0327 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You still need add race condition protections on volumes and containers.
  2. For next iterations, please do not amend your changes and force push, this makes it hard for reviews to look for diffs since last review

@elwincheng
Copy link
Collaborator Author

elwincheng commented Sep 7, 2025

  1. You still need add race condition protections on volumes and containers.

@KCui0327 Sorry I am not understanding. I am acquiring and releasing a mutex lock when calling volume/container functions.

@KCui0327
Copy link
Collaborator

KCui0327 commented Sep 7, 2025

  1. You still need add race condition protections on volumes and containers.

@KCui0327 Sorry I am not understanding. I am acquiring and releasing a mutex lock when calling volume/container functions.

ah I see it now, I saw there was no diff in the review so I thought you didn't add it, I will re-review my apologies. But yeah I think for next iterations, please do not amend then force push, it makes it hard to see diffs between revisions

Copy link
Collaborator

@KCui0327 KCui0327 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Add docstrings
  2. include log file containing the output of the unit tests


---

### `func (mgr *ContainerMgr) runContainerCuda(volumeName string) (string, error)`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no runContainerCuda in the code, could you please double check

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that is a good catch

Copy link
Collaborator

@KCui0327 KCui0327 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be last change, need to include proper logging that was committed Tara, refer to the documentation. Provide proper logging in the API and levels

@elwincheng elwincheng force-pushed the elwin_dev_container_squashed branch from 271c05d to 5f9b42f Compare September 15, 2025 22:57
@elwincheng
Copy link
Collaborator Author

elwincheng commented Sep 15, 2025

I had to force push because I rebased off main cause I thought I needed Tara's commit for slog mb.

return NewContainerMgr(cli, 10, 100)
}

// T1: create a volume, check exists, delete, check not exists
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't number these

return "", fmt.Errorf("volume %s does not exist", volumeName)
}

resp, err := cli.ContainerCreate(ctx, &container.Config{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix formatting

return resp.ID, err
}

io.Copy(os.Stdout, out)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't dump container output to console - @KCui0327 should be another task to log this better? - we probably want container logs user accessible, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct, we can add that in a separate PR

@blobcode blobcode mentioned this pull request Sep 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants