-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config: base GID must be present in the supplementary GIDs array #1168
base: main
Are you sure you want to change the base?
config: base GID must be present in the supplementary GIDs array #1168
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but please squash commits (EDIT: Let me withdraw my LGTM, for compatibility sake)
The intention is to make review easier by not destroying review history (as Github does not have a concept of patch sets/series like Gerrit or I'll likely wait until more maintainers have had time to review to squash, unless you think this is ready for merge now, in which case I can perform the rebase. |
thank you @neersighted for opening this PR |
I am not fully convinced about this change. It is a breaking change and there won't be a way to keep the existing behavior even if desired. Personally, I still believe it is the caller's responsibility to ensure the gid is part of the additional gids. |
For what its worth, I don't think that this precludes something like #1020. There is also the potential of adding a new field so we can have a backwards compatible behavior; however I do strongly feel that patching the existing spec is best as there does not seem to be any case where it is desirable to have the primary GID set but not in the supplementary GIDs. |
900ec22
to
7706c0a
Compare
I've pushed up a squashed version as it doesn't seem too disruptive; would still like more eyes and opinions on this. I still strongly feel that this is the best way to address the mismatch between 'real' Linux systems and OCI containers, as otherwise some implementations will always be deficient. Note that entities that use an OCI runtime SHOULD still specify the primary group ID explicitly, for older implementations. |
@cyphar what do you think about this change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@neersighted can you squash the commits? Otherwise LGTM
Currently, the spec is unclear whether the list of [supplementary GIDs][POSIX-sgids-def] used to create a container process should include the 'base' GID implicitly, or whether the config needs to specify this explicitly if desired. While [per POSIX][POSIX-sgids-rat] it is permissible for a system to include or exclude the base GID from the list of supplementary GIDs, in all Runtime Spec platforms the base GID is always added, and omitting it can have [real security consequences][benthams-gaze] as fully dropping a group is not typically allowed in Unix. This recently led to a number of CVEs in OCI Runtime Spec implementations, as it was concluded that it is necessary for a Unix container to always include the base GID in the list of supplementary GIDs, as originally established by 4.4BSD. Some of the CVEs include: * [Podman (CVE-2022-2989)][CVE-2022-2989] * [Moby (CVE-2022-36109)][CVE-2022-36109] * [Buildah (CVE-2022-2990)][CVE-2022-2990] * [CRI-O (CVE-2022-2995)][CVE-2022-2995] Some examples of how existing implementations handle this: * util-linux [calls][util-linux] [initgroups(3)][initgroups.3-linux] with the user's primary GID. * shadowutils (Linux) [calls][shadowutils] [initgroups(3)][initgroups.3-linux] with the user's primary GID. * FreeBSD [calls][freebsd-setusercontext] [initgroups(3)][initgroups.3-freebsd] with the user's GID from the password file (aka the primary GID). * Solaris [calls][solaris-setup_credentials] [initgroups(3)][initgroups.3-solaris] with the user's primary GID. * Z/OS's session creation code is not available; however [initgroups(3)][initgroups.3-zos] specifies a convention of including including the real group ID from the user database (aka the primary GID). * OpenSSH [calls][openssh] initgroups(3) with the user's primary GID; on all of the above platforms this will have the same result as a login(1), including the primary GID in the list of supplementary GIDs. While login(1) has generally been used as the example above, the same holds true for su(1) and other methods of starting a new session (including OpenSSH, as explained above). Given this seems clearly desirable and the OCI runtime is effectively the equivalent of login(1)/su(1)/any other program that sets up a new session, the OCI runtime is the best place to ensure that the list of supplementary group IDs contains the base GID. [POSIX-sgids-def]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_378 [POSIX-sgids-rat]: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap03.html#tag_21_03_00_73 [CVE-2022-2989]: https://access.redhat.com/security/cve/cve-2022-2989 [CVE-2022-36109]: GHSA-rc4r-wh2q-q6c4 [CVE-2022-2990]: https://access.redhat.com/security/cve/cve-2022-2990 [CVE-2022-2995]: https://access.redhat.com/security/cve/cve-2022-2995 [benthams-gaze]: https://www.benthamsgaze.org/2022/08/22/vulnerability-in-linux-containers-investigation-and-mitigation/ [util-linux]: https://github.com/util-linux/util-linux/blob/96ccdc00e1fcf1684f9734a189baf90e00ff0c9a/login-utils/login.c#L1443 [shadowutils]: https://github.com/shadow-maint/shadow/blob/eaebea55a495a56317ed85e959b3599f73c6bdf2/libmisc/setugid.c#L55 [freebsd-setusercontext]: https://github.com/freebsd/freebsd-src/blob/eeaf9d562fe137e0c52b8c346742dccfc8bde015/lib/libutil/login_class.c#L486 [solaris-setup_credentials]: https://github.com/illumos/illumos-gate/blob/d9c3e05c2d8261e3f133b5e96a300b4fa6c0f1b7/usr/src/cmd/login/login.c#L1926 [openssh]: https://github.com/openssh/openssh-portable/blob/25bd659cc72268f2858c5415740c442ee950049f/session.c#L1379 [initgroups.3-linux]: https://man7.org/linux/man-pages/man3/initgroups.3.html [initgroups.3-freebsd]: https://www.freebsd.org/cgi/man.cgi?initgroups(3) [initgroups.3-solaris]: https://illumos.org/man/3C/initgroups [initgroups.3-zos]: https://www.ibm.com/docs/en/zos/2.2.0?topic=functions-initgroups-initialize-supplementary-group-id-list-process Signed-off-by: Bjorn Neergaard <[email protected]>
1c74533
to
d143e99
Compare
Pushed a squashed version with no other changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I am still against this change as it is since there is no way to keep the current behavior. Could we at least add a new knob to maintain the existing behavior?
Let me withdraw my LGTM on Nov 3, for compatibility sake |
Seems like there's a consensus that we should introduce a new field to preserve existing behavior. I'll try and work up a draft next week. |
There was a proposal by Cory Snider in the private discussion of GHSA-rc4r-wh2q-q6c4 which I think intended to preserve backwards compatibility. I don't have any particular views on it, but it might be a useful starting point for discussion. Since it was private, I don't think I should copy and paste it here, but those who can see the issue can read it and Cory could be asked for permission to publish it. |
Cory and I have discussed it outside of that security bulletin and iterated on the idea a bit more (we both work on the same team at Mirantis); I will be work-shopping something similar to his design with him before presenting it here 😄 |
Here is what I had proposed in the GHSA discussion linked above:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a squash, and your Co-authored-by:
trailer.
@neersighted not squashing was intentional. (Omitting the |
Currently, the spec is unclear whether the list of supplementary GIDs[1] used to create a container process should include the 'base' GID implicitly, or whether the config needs to specify this explicitly if desired. While per POSIX[2] it is permissible for a system to include or exclude the base GID from the list of supplementary GIDs, in all Runtime Spec platforms the base GID is always added, and omitting it can have real security consequences[3] as fully dropping a group is not typically allowed in Unix. This recently led to a number of CVEs in OCI Runtime Spec implementations, as it was concluded that it is necessary for a Unix container to always include the base GID in the list of supplementary GIDs, as originally established by 4.4BSD. Some of the CVEs include: * Podman (CVE-2022-2989) * Moby (CVE-2022-36109) * Buildah (CVE-2022-2990) * CRI-O (CVE-2022-2995) Some examples of how existing implementations handle this: * util-linux calls initgroups(3) with the user's primary GID. [4,5] * shadowutils (Linux) calls initgroups(3) with the user's primary GID. [5,6] * FreeBSD calls initgroups(3) with the user's GID from the password file (aka the primary GID). [7,8] * Solaris calls initgroups(3) with the user's primary GID. [9,10] * Z/OS's session creation code is not available; however initgroups(3) specifies a convention of including the real group ID from the user database (aka the primary GID). [11] * OpenSSH[12] calls initgroups(3) with the user's primary GID; on all of the above platforms this will have the same result as a login(1), including the primary GID in the list of supplementary GIDs. While login(1) has generally been used as the example above, the same holds true for su(1) and other methods of starting a new session (including OpenSSH, as explained above). Given this seems clearly desirable and the OCI runtime is effectively the equivalent of login(1)/su(1)/any other program that sets up a new session, the OCI runtime is the best place to ensure that the list of supplementary group IDs contains the base GID. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_378 [2]: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap03.html#tag_21_03_00_73 [3]: https://www.benthamsgaze.org/2022/08/22/vulnerability-in-linux-containers-investigation-and-mitigation/ [4]: https://github.com/util-linux/util-linux/blob/96ccdc00e1fcf1684f9734a189baf90e00ff0c9a/login-utils/login.c#L1443 [5]: https://man7.org/linux/man-pages/man3/initgroups.3.html [6]: https://github.com/shadow-maint/shadow/blob/eaebea55a495a56317ed85e959b3599f73c6bdf2/libmisc/setugid.c#L55 [7]: https://github.com/freebsd/freebsd-src/blob/eeaf9d562fe137e0c52b8c346742dccfc8bde015/lib/libutil/login_class.c#L486 [8]: https://www.freebsd.org/cgi/man.cgi?initgroups(3) [9]: https://github.com/illumos/illumos-gate/blob/d9c3e05c2d8261e3f133b5e96a300b4fa6c0f1b7/usr/src/cmd/login/login.c#L1926 [10]: https://illumos.org/man/3C/initgroups [11]: https://www.ibm.com/docs/en/zos/2.2.0?topic=functions-initgroups-initialize-supplementary-group-id-list-process [12]: https://github.com/openssh/openssh-portable/blob/25bd659cc72268f2858c5415740c442ee950049f/session.c#L1379 [CVE-2022-2989]: https://access.redhat.com/security/cve/cve-2022-2989 [CVE-2022-36109]: GHSA-rc4r-wh2q-q6c4 [CVE-2022-2990]: https://access.redhat.com/security/cve/cve-2022-2990 [CVE-2022-2995]: https://access.redhat.com/security/cve/cve-2022-2995 Signed-off-by: Bjorn Neergaard <[email protected]> Co-authored-by: Cory Snider <[email protected]> Signed-off-by: Cory Snider <[email protected]>
efd45cc
to
2d3f86c
Compare
I feel this is too complicated. Runtime implementations may raise a warning when |
That would still leave the generator of the container config as the single point of failure. The purpose of this proposal is to provide defense-in-depth: the bundle generator (container engine) and the runtime would both have to be out-of-date or otherwise get things wrong for a container to run with broken negative group permissions. My proposal, described simply, is:
The complexity of how that is specified comes from having to work within the confines of what is, and is not, defined by POSIX. Privileged operations including
Logging a warning wouldn't be of much help as the container would still be started with a broader scope of privilege than intended. And runtime-spec doesn't mandate that warnings logged by the runtime reach the user.
Image? This is runtime-spec. Images are out of scope for this discussion as there's no guarantee that a container bundle was even derived from an image. |
I agree, the current design is too complicated and IMO not worth it. I still feel this is not the OCI runtime responsibility and the security issue is anyway difficult to find in reality as it affects only images that use negative DACs and setuid binaries. |
This is a case where the OCI runtime is creating a userspace environment that is not aligned with what any Unix has done, historically. Other programs that create a login session get this right (e.g. OpenSSH, systemd, LXC) -- there is no reason for us to do the incorrect thing and create a not-quite-Unix environment. The question is how we get there -- if simplicity is desired, I think my original version might still be more acceptable. |
All these programs need to implement the same behavior, they don't expect |
@giuseppe all those programs call |
Currently, the spec is unclear whether the list of supplementary GIDs used to create a container process should include the 'base' GID implicitly, or whether the config needs to specify this explicitly if desired.
While per POSIX it is permissible for a system to include or exclude the base GID from the list of supplementary GIDs, in all Runtime Spec platforms the base GID is always added, and omitting it can have real security consequences as fully dropping a group is not typically allowed in Unix.
This recently led to a number of CVEs in OCI Runtime Spec implementations, as it was concluded that it is necessary for a Unix container to always include the base GID in the list of supplementary GIDs, as originally established by 4.4BSD.
Some of the CVEs include:
Some examples of how existing implementations handle this:
While login(1) has generally been used as the example above, the same holds true for su(1) and other methods of starting a new session (including OpenSSH, as explained above).
Given this seems clearly desirable and the OCI runtime is effectively equivalent of login(1)/su(1)/any other program that sets up a new session, the OCI runtime is the best place to ensure that the list of supplementary group IDs contains the base GID.