Incorrect model selection by pipeline dtype parameter #1581

@starkdg

System Info

transformers.js@3.8.1
Brave Browser v1.87 (Chromium 145.0)
debian linux 6

dependencies:
├── @babel/core@7.29.0
├── @babel/preset-env@7.29.0
├── @huggingface/transformers@3.8.1
├── @mozilla/readability@0.6.0
├── babel-loader@10.1.1
├── buffer@6.0.3
├── copy-webpack-plugin@13.0.1
├── css-loader@7.1.4
├── html-webpack-plugin@5.6.6
├── idb@8.0.3
├── style-loader@4.0.0
├── webpack-cli@6.0.1
└── webpack@5.105.4

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I stumbled on a curious issue. Technically, I suppose it may not be a bug so much as weird behavior by the model. However, something is definitely wrong here, because the results are not what they should be.

Using a pipeline object for summarization produces a corrupted, nonsensical summary. There appears to be some confusion as to which model is downloaded with dtype: "int8". Frankly, I don't know which one it downloaded, but it's producing nonsensical output. Here's an example of the kind of summary it produces:

The most addictive illegal illegal illegal drugs, the cocaine. The drug is a well of the most addictive drug. In the study, the most illegal illegal drug. It has an addiction to the cocaine, and the cocaine in the cocaine of the cocaine and the drug. I. cocaine. I have the cocaine that they are the most addicted. I know. 

The article summarized was an article about addiction, but it's like the summarizer was on cocaine! I've had no issues up to now; just a short while ago it was working fine. Recently, though, it only works when I remove the dtype parameter and let it fall back to its default, which is reported as "q8" in my console. I presume that must be the int8 quantized model, because there's no model file marked "q8". This is for the "Xenova/bart-large-cnn" model. I can't figure out which dtype to use, and although it works fine without the dtype parameter, I would like to be able to experiment with some of the other dtypes.

I think I've provided everything I can think of. If you need any more info just let me know.

Reproduction

Steps to reproduce the behavior:

  1. Create a pipeline with:
const summarizer = await pipeline('summarization', 'Xenova/bart-large-cnn', { dtype: "int8" });
  2. Feed it some content. Feed it whatever parameters you want; it doesn't appear to matter. For what it's worth, this is what I used:
const summaryResponse = await summarizer(text, {
    num_beams: 1,
    early_stopping: true,
    max_new_tokens: 150,
    do_sampling: false,
    repetition_penalty: 1.2,
});

It doesn't matter how I set any of the parameters. I've tried adjusting the repetition penalty and a few others, but it produces the same output.
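To narrow down which dtype settings produce the corrupted summary, the same text could be run through one pipeline per dtype and the outputs compared side by side. The following is only a minimal sketch: `compareDtypes` is a hypothetical helper (not part of transformers.js), and the loader is passed in as an argument so that any way of creating a summarizer can be plugged in.

```javascript
// Hypothetical helper: runs the same input through one summarizer per dtype
// and collects the resulting summaries (or the error message) keyed by dtype.
// `createSummarizer` is supplied by the caller, for example:
//   (dtype) => pipeline('summarization', 'Xenova/bart-large-cnn', { dtype })
async function compareDtypes(createSummarizer, text, dtypes, generationOptions = {}) {
  const results = {};
  for (const dtype of dtypes) {
    try {
      const summarizer = await createSummarizer(dtype);
      const output = await summarizer(text, generationOptions);
      // A summarization pipeline is expected to return [{ summary_text: "..." }].
      results[dtype] = output[0].summary_text;
    } catch (err) {
      // A dtype with no matching model file will most likely fail to load.
      results[dtype] = `error: ${err.message}`;
    }
  }
  return results;
}
```

Calling it with something like `['q8', 'int8', 'fp32']` and logging the returned object would show at a glance which dtypes give a sensible summary. Note that each dtype triggers its own model download.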

I would not have bothered reporting this, but I stumbled on a fix by simply deleting the dtype parameter from the pipeline constructor call just to see what it would download. This is what happened:

 background.js:75 Unable to determine content-length from response headers. Will expand buffer when needed.
background.js:75 Unable to determine content-length from response headers. Will expand buffer when needed.
background.js:75 dtype not specified for "encoder_model". Using the default dtype (q8) for this device (wasm).
background.js:75 dtype not specified for "decoder_model_merged". Using the default dtype (q8) for this device (wasm).

It defaults to "q8". What is this "q8"? I don't see any q8 files as an option. I ima
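For context on the "q8" question, here is a rough sketch of how a dtype name might translate into an ONNX file name. The suffix table is an assumption based on the naming convention that transformers.js v3 appears to use and should be checked against the files actually present in the model repository's onnx/ folder. `onnxFileFor` is a hypothetical helper, not a library function.

```javascript
// Assumed dtype -> file-name suffix convention (transformers.js v3).
// Verify against the files listed in the model repo before relying on it.
const DTYPE_SUFFIX = new Map([
  ['fp32', ''],
  ['fp16', '_fp16'],
  ['q8', '_quantized'],
  ['int8', '_int8'],
  ['uint8', '_uint8'],
  ['q4', '_q4'],
  ['q4f16', '_q4f16'],
  ['bnb4', '_bnb4'],
]);

// Hypothetical helper: builds the path of the ONNX file a dtype would select.
function onnxFileFor(baseName, dtype) {
  if (!DTYPE_SUFFIX.has(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return `onnx/${baseName}${DTYPE_SUFFIX.get(dtype)}.onnx`;
}

console.log(onnxFileFor('encoder_model', 'q8'));          // onnx/encoder_model_quantized.onnx
console.log(onnxFileFor('decoder_model_merged', 'int8')); // onnx/decoder_model_merged_int8.onnx
```

Under this assumption, "q8" and "int8" point at different files, so the two settings would not load the same weights. That alone does not explain the corrupted output, but it suggests checking the `_int8` files specifically.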
