Streaming encoder #23

onel · 2019-01-08T15:03:41Z

First of all, thanks for this great library.

I have a question: is there a way to encode a specific audio buffer and get back only the encoded data for that buffer, not the whole recording?
For example, I send a Float32Array, vmsg encodes it and sends the result back.
Right now I think everything is held in memory during a recording and returned when calling vmsg_flush().
This would be useful for longer recordings, where you want to encode a chunk and maybe upload it without keeping it in memory.

I've tried to do something similar, by calling vmsg_init, vmsg_encode and then vmsg_flush, inside the data event listener for the worker. I don't think this is the right way to do it.

  case "data":

    if (!vmsg_init(msg.rate)) return postMessage({type: "error", data: "vmsg_init"});

    if (!vmsg_encode(msg.data)) return postMessage({type: "error", data: "vmsg_encode"});

    const blob = vmsg_flush();
    if (!blob) {
      return postMessage({type: "error", data: "vmsg_flush"});
    }

    postMessage({
      type: "blob",
      data: blob
    });
    
    break;

Is there a way to do that? A change would also need to be made inside vmsg.c, right?
Thanks


Kagami · 2019-01-08T15:31:42Z

Yes, it's possible. You just need to make the vmsg_encode C function return the number of bytes written (n), so that you can send the bytes from v->mp3+v->size-n to v->mp3+v->size to the main thread via postMessage. At the end you should also fix the LAME tag (lame_get_lametag_frame), which needs an additional message.
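
A rough sketch of the worker side of that idea, assuming vmsg_encode has been changed to return the number of bytes written by the current call. The helper name sliceNewBytes and the plain ArrayBuffer standing in for the WASM memory are illustrative only and not part of vmsg:

```javascript
// Hypothetical helper: copy only the bytes produced by the latest encode call.
// memory  - ArrayBuffer backing the WASM heap (simulated below)
// mp3Ptr  - byte offset of v->mp3 inside that heap
// size    - v->size after the call (total bytes encoded so far)
// written - assumed return value of the modified vmsg_encode
function sliceNewBytes(memory, mp3Ptr, size, written) {
  if (written < 0 || written > size) {
    throw new RangeError("invalid byte count: " + written);
  }
  const start = mp3Ptr + size - written;
  const end = mp3Ptr + size;
  // slice() copies, so the result stays valid if the heap is later overwritten.
  return new Uint8Array(memory.slice(start, end));
}

// Simulated heap: the MP3 buffer starts at offset 8 and holds 6 bytes,
// of which the last 2 came from the latest call.
const heap = new Uint8Array(32);
heap.set([1, 2, 3, 4, 5, 6], 8);
const chunk = sliceNewBytes(heap.buffer, 8, 6, 2);
console.log(Array.from(chunk)); // [ 5, 6 ]
```

In the worker, the copied chunk could then be posted with its buffer in the transfer list, e.g. postMessage({type: "chunk", data: chunk}, [chunk.buffer]), where the "chunk" message type is again just an assumed name.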

I'm not sure we want to use that method for normal recordings, because it would require sending every encoded chunk back to the main thread and copying it into a buffer, which might introduce additional delay. But it should be OK to make it optional.
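
To make the copying concern concrete, here is a minimal sketch of what the main thread might do with incoming chunks. ChunkCollector is a made-up name and not part of vmsg. For the upload use case each chunk could instead be sent to the server as it arrives and never stored.

```javascript
// Hypothetical collector for encoded chunks arriving from the worker.
class ChunkCollector {
  constructor() {
    this.chunks = [];
    this.length = 0;
  }

  // Store one chunk (a Uint8Array); nothing is copied here.
  push(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.length;
  }

  // Join all chunks into one Uint8Array; this is where the copy happens.
  concat() {
    const out = new Uint8Array(this.length);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}

const collector = new ChunkCollector();
collector.push(new Uint8Array([1, 2]));
collector.push(new Uint8Array([3]));
console.log(Array.from(collector.concat())); // [ 1, 2, 3 ]
```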

onel · 2019-01-09T18:28:53Z

Ok, I understand.
I don't have experience with C, but maybe I'll try that in a fork.
Thank you so much for the details.

onel · 2019-03-15T17:11:53Z

Hi there, I took a stab at making this work and I wanted to check with you if this is the right way to do it.
I haven't created a PR for this because I don't know if you would want to integrate it, but let me know if you would.
The idea is that on each buffer we would do vmsg_encode, vmsg_flush and then a new method vmsg_reset.
Inside the worker this would look like this:

  case "data":

    if (!vmsg_encode(msg.data)) return postMessage({type: "error", data: "vmsg_encode"});

    const blob = vmsg_flush();
    if (!blob) {
      return postMessage({type: "error", data: "vmsg_flush"});
    }

    postMessage({
      type: "blob",
      data: blob
    });

    FFI.vmsg_reset()
    
    break;

This will return the blob for that specific buffer each time.

The changes that I've made are:
For vmsg_encode the size is returned each time:

WASM_EXPORT
int vmsg_encode(vmsg *v, int nsamples) {
  if (nsamples > MAX_SAMPLES)
    return -1;

  if (fix_mp3_size(v) < 0)
    return -1;

  uint8_t *buf = v->mp3 + v->size;
  int n = lame_encode_buffer_ieee_float(v->gfp, v->pcm_l, NULL, nsamples, buf, BUF_SIZE);

  if (n < 0)
    return n;

  v->size += n;
  return v->size;
}

And the new method:

WASM_EXPORT
int vmsg_reset(vmsg *v, int rate) {
  if (v) {
    lame_close(v->gfp);
    v->size = 0;

    v->gfp = lame_init();
    if (!v->gfp) {
      vmsg_free(v);
      return -1;
    }
    
    lame_set_mode(v->gfp, MONO);
    lame_set_num_channels(v->gfp, 1);
    lame_set_in_samplerate(v->gfp, rate);
    lame_set_VBR(v->gfp, vbr_default);
    lame_set_VBR_quality(v->gfp, 5);

    if (lame_init_params(v->gfp) < 0) {
      vmsg_free(v);
      return -1;
    }
    
  }

  return 0;
}

This basically looks like init but without the memory allocation.
The problem I'm having is that the resulting mp3 blob is not actually usable. I think in vmsg_reset the encoder is not set up correctly.
My questions are:
Do you think this is a good way to do buffer encoding?
And what would you recommend we do in vmsg_reset?
Thanks

flieks · 2020-01-16T10:16:39Z

@onel did you get it working? I am also interested in this for live speech-to-text (on the server).

stefan-reich · 2021-07-30T23:11:47Z

Damn. I want this too. What if we fake it and just swap the encoder with a new one every few seconds? I'm fine with lots of relatively short mp3s.
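
A minimal sketch of the segmenting half of that idea, assuming the PCM samples are available as a Float32Array. splitIntoSegments is a hypothetical helper; each segment would then go to a fresh encoder instance, which is not shown here. Since every segment is encoded independently, there may be small gaps at the boundaries when the resulting files are played back to back.

```javascript
// Split PCM samples into segments of at most `seconds` each.
// subarray() returns views, so no sample data is copied.
function splitIntoSegments(samples, rate, seconds) {
  const perSegment = Math.floor(rate * seconds);
  if (perSegment <= 0) {
    throw new RangeError("segment must contain at least one sample");
  }
  const segments = [];
  for (let start = 0; start < samples.length; start += perSegment) {
    segments.push(samples.subarray(start, start + perSegment));
  }
  return segments;
}

// 10 samples at a pretend rate of 4 Hz with 1-second segments.
const pcm = new Float32Array(10);
const lengths = splitIntoSegments(pcm, 4, 1).map((s) => s.length);
console.log(lengths); // [ 4, 4, 2 ]
```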

stefan-reich · 2021-07-30T23:18:04Z

Ah I think I'll simply use MediaRecorder. It should record as .webm, right?
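
The container MediaRecorder produces depends on the browser, so it is probably safer to check than to assume .webm. A small sketch: pickMimeType is a hypothetical helper, and the predicate is passed in so the example also runs outside a browser.

```javascript
// Return the first candidate the predicate accepts, or null if none is.
function pickMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return null;
}

const candidates = ["audio/webm;codecs=opus", "audio/ogg;codecs=opus", "audio/mp4"];

// In a browser the predicate would be (t) => MediaRecorder.isTypeSupported(t).
// Here a stand-in predicate pretends that only Ogg is available.
const fakeSupport = (t) => t.startsWith("audio/ogg");
console.log(pickMimeType(candidates, fakeSupport)); // audio/ogg;codecs=opus
```

The chosen type can be passed as new MediaRecorder(stream, { mimeType }), and calling start() with a timeslice in milliseconds delivers the recording in pieces through dataavailable events.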

Kagami changed the title ~~Encode a specific audio buffer~~ Streaming encoder Jan 8, 2019

Kagami mentioned this issue Jun 30, 2019

memory access out of bounds #25

Open
