Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chd: Fix various memory and overflow bugs #132

Merged
merged 13 commits into from
Nov 10, 2024
Merged

Conversation

stenzek
Copy link
Contributor

@stenzek stenzek commented Oct 26, 2024

As part of trying to improve the robustness of input file handling for my emulator, I have identified and fixed a number of issues in libchdr's CHD handling. Most of these were discovered through some very brief fuzzing with LibFuzzer.

The majority are straightforward - missing error checking, or assuming input was structurally valid. The only one I'm a bit iffy on is d5935ca - is a 128MB maximum hunk size and 10GB data size reasonable?

It's one of the first things that libfuzzer hit, so I had to add some basic checks, and even if the allocation will fail on most systems. Should you have a corrupted file, you don't really want them to swap out all their working sets to satisfy it, only to fail loading later.

5eb9a28 is related to #118, however it only fixes the malicious/corrupted case, I didn't have any very low-compression CHDs to check with.

I fully acknowledge that many of these would require some serious work to actually accomplish anything malicious with, and many of them are just reads, not writes, but IMO it's best to validate everything where possible, especially when there is basically no performance cost to doing so. And it saves OOM/crashing if the user does have accidental file corruption.

I've included the fuzz target in this PR as well. If that's not something you want upstream, let me know and I'll drop the commit. If anyone wants to spend more time fuzzing the library, simply enable the CMake option and use make fuzz to invoke it. You'll probably also need to set -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ as well on most systems. There are no doubt more data validation issues... :)


/* input may be truncated, double-check */
if ((ecc_bytes + 2) > complen)
return CHDERR_DECOMPRESSION_ERROR;

/* extract compressed length of base */
uint32_t complen_base = (src[ecc_bytes + 0] << 8) | src[ecc_bytes + 1];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an inconsistency in variable declarations here; Your new code declares variables at the start of functions, but here (and in other places) a variable is declared in between statements.
If the latter is acceptable, then perhaps declare the error variables only when they are first assigned to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for declare-at-time-of-use. If you don't care about C89, then I'd absolutely move it down.
Actually, having the bounds check in there would probably break C89 anyway, so I'd have to move complen_base up.. ugh.

const uint32_t header_bytes = ecc_bytes + complen_bytes;

/* input may be truncated, double-check */
if ((ecc_bytes + 2) > complen)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most other comparisons in this file seem to mention the variable first, and the limit it's compared to second.
This line, and a few others you've introduced have it the other way around.
Your choice whether it's worth gaining consistency by swapping these.

Copy link

@PatrickvL PatrickvL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential improvements

@rtissera
Copy link
Owner

rtissera commented Nov 2, 2024

Everything looks very good and tidy to me.
@stenzek is the PR ready to merge ?

@stenzek
Copy link
Contributor Author

stenzek commented Nov 3, 2024

Yep, it is. I just tacked the last commit on there since it was straightforward, the others have been tested by DS users for a week or so.

@@ -1491,6 +1544,11 @@ static inline void map_assemble(uint8_t *base, map_entry *entry)
-------------------------------------------------*/
static inline int map_size_v5(chd_header* header)
{
// Avoid overflow due to corrupted data.
const uint32_t max_hunkcount = (UINT32_MAX / header->mapentrybytes);
if (header->hunkcount > max_hunkcount)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the sole remaining newly introduced comparison that doesn't have the variable on the left side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable from the file is on the left in this case, max_hunkcount <= header->hunkcount definitely does not read correctly to me.

src/libchdr_chd.c Outdated Show resolved Hide resolved
src/libchdr_chd.c Outdated Show resolved Hide resolved
@@ -1980,10 +2064,7 @@ CHD_EXPORT chd_error chd_precache(chd_file *chd)

if (chd->file_cache == NULL)
{
size = core_fsize(chd->file);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size variable is no longer initialized by this change, leading to undefined behaviour when it's accessed further down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, C89 strikes again. Indeed, this would break precaching. I'm using a rewritten version of this function in DS with progress reports back, hence why I never noticed...

Copy link

@PatrickvL PatrickvL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More suggestions and one potential bug spotted.

@rtissera
Copy link
Owner

rtissera commented Nov 3, 2024

@stenzek if you can address / answer @PatrickvL comments when you have some time I would appreciate and happily merge

@silverbacknet
Copy link

Why is C89 a goal? Just a personal flex? C11 should be the bare minimum supported across compilers, even ones for relatively older hardware.

@stenzek
Copy link
Contributor Author

stenzek commented Nov 4, 2024

Why is C89 a goal? Just a personal flex? C11 should be the bare minimum supported across compilers, even ones for relatively older hardware.

FWIW I agree 100%, but that is a can of worms that is best kept closed.

Unless things have changed of course - it would be nice to see.

@rtissera
Copy link
Owner

rtissera commented Nov 4, 2024

C89 is not a goal. It’s 2024 and C99 or C11 are fine to me.
I know it matters for libretro Crown however.

@stenzek
Copy link
Contributor Author

stenzek commented Nov 7, 2024

Had to re-do 45670a9, apparently if you don't specify at least 4 codecs (i.e. 1-3 manually via -c), it duplicates the none codec.

It seems unlikely that the same compressed codec would intentionally be specified multiple times (it would double free/alloc before), but in the interest of not breaking compatibility, I added some ugly checks around it.

@rtissera
Copy link
Owner

LGTM

@rtissera rtissera merged commit 51d1f28 into rtissera:master Nov 10, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants