Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINT - Implement GPU accelerated nvJPEG encoder #20

Merged
merged 5 commits into from
Mar 22, 2025

Conversation

jestrada-atlassian
Copy link

Adding a GPU accelerated jpeg encoder. This will allow us to accelerate parts of our transcoder thumbnail job that we previously have been unable to run on GPU.

https://www.loom.com/share/9ee806965ce244189b3de057b82dcae7

@jestrada-atlassian jestrada-atlassian requested review from a team, sjhsieh, jpujol, w3sip, ayushgr, yiming-loom, ncooleyy and mchin3loom and removed request for a team March 16, 2025 22:34
Copy link

@w3sip w3sip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least one critical problem that I see (memory leak), but looks good otherwise.

}

// Handle YUV input formats and populate nv_image data
if (sw_format == AV_PIX_FMT_YUV420P || sw_format == AV_PIX_FMT_YUVJ420P) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want three different conditions / implementations here? Seems the same.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a leftover from trying to get NV12 to work. Unfortunately couldn't get it to work. Will consolidate.

}

// Retrieve the bitstream again to populate output buffer
CHECK_NVJPEG(nvjpegEncodeRetrieveBitstream(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would leak out_buf

NvjpegContext* ctx = avctx->priv_data;

if (ctx->encoder_state) {
CHECK_NVJPEG(nvjpegEncoderStateDestroy(ctx->encoder_state));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we continue with cleanup even if one of these fails?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be safe for now. We'll emit a log if this does happen but once we get to this point we'd have a valid output so I don't think we should throw an error even if this fails. We should still try to destroy the rest of the states if possible anyway.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But CHECK_NVJPEG would return on error, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Removed it from all these and just manually emitting a log


ctx->width = avctx->width;
ctx->height = avctx->height;
if ((ret = nvjpeg_init(avctx)) < 0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal nit: I usually try to avoid assignment and check of a condition in the same line. Depends on the codebase convention, though.
Another nit, I often prefer to have curly braces even on single-line scope. Again, depends on codebase convention.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm roughly following the ffmpeg code convention found here. This implementation doesn't follow it perfectly but I'll clean it up before submitting the patch to them,

https://ffmpeg.org/developer.html#Coding-Rules-1

@jpujol
Copy link

jpujol commented Mar 17, 2025

@jestrada-atlassian : do we expect a lot of gain here? even the software encoder for jpeg is extremely fast
EDITED: just saw your video

@jpujol
Copy link

jpujol commented Mar 17, 2025

is this something we can submit upstream? It would be nice to have it reviewed by the nvidia guys

@jestrada-atlassian
Copy link
Author

is this something we can submit upstream? It would be nice to have it reviewed by the nvidia guys

I have this on my TODO list alongside getting pad_npp reviewed. Shooting for sometime mid quarter.

}

// Retrieve the bitstream again to populate output buffer
ret = CHECK_NVJPEG(nvjpegEncodeRetrieveBitstream(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove CHECK_NVJPEG (still would return without freeing the buffer)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

nv_image.channel[2] = frame->data[2];
nv_image.pitch[2] = frame->linesize[2];
} else if (input_format != NVJPEG_INPUT_TYPE_RGB) {
// Handle BGR/RGB input formats
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be misunderstanding, bit either the comment is wrong, or != is? Could be totally misreading this, though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified this. We're already checking the software format earlier so it doesn't make sense to check it here again

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could simplify it further with
int planes = (input_format == NVJPEG_INPUT_TYPE_YUV) ? 3 : NVJPEG_MAX_COMPONENT;
and then

for (i = 0; i < planes; i++) {
            nv_image.channel[i] = frame->data[i];
            nv_image.pitch[i] = frame->linesize[i];
        }

, up to you.

Copy link

@w3sip w3sip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Couple of things to look at, but neither is a must.

nv_image.channel[2] = frame->data[2];
nv_image.pitch[2] = frame->linesize[2];
} else if (input_format != NVJPEG_INPUT_TYPE_RGB) {
// Handle BGR/RGB input formats
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could simplify it further with
int planes = (input_format == NVJPEG_INPUT_TYPE_YUV) ? 3 : NVJPEG_MAX_COMPONENT;
and then

for (i = 0; i < planes; i++) {
            nv_image.channel[i] = frame->data[i];
            nv_image.pitch[i] = frame->linesize[i];
        }

, up to you.

ctx->nvjpeg_handle, ctx->encoder_state, out_buf, &out_buf_size, NULL);

if (ret != NVJPEG_STATUS_SUCCESS) {
av_free(out_buf);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a log statement here?

@jestrada-atlassian jestrada-atlassian merged commit d4fc3bd into n7.1.loom-patch3 Mar 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants