[Bug] Running a case on a machine without a GPU fails at runtime with "dataset not registered" (although it is registered) #1725

Open
2 tasks done
Caeser-SONG opened this issue Nov 29, 2024 · 1 comment

@Caeser-SONG

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

  1. conda, Python 3.10
  2. Configured to run in CPU mode (num_gpus=0)
  3. The runtime environment has no GPU
  4. The dataset is already registered in LOAD_DATASET
  5. Location of the error:
opencompass/tasks/openicl_infer.py

def run(self, cur_model=None, cur_model_abbr=None):
        self.logger.info(f'Task {task_abbr_from_cfg(self.cfg)}')
        for model_cfg, dataset_cfgs in zip(self.model_cfgs, self.dataset_cfgs):
            self.max_out_len = model_cfg.get('max_out_len', None)
            self.batch_size = model_cfg.get('batch_size', None)
            self.min_out_len = model_cfg.get('min_out_len', None)
            if cur_model and cur_model_abbr == model_abbr_from_cfg(model_cfg):
                self.model = cur_model
            else:
                self.model = build_model_from_cfg(model_cfg)

            for dataset_cfg in dataset_cfgs:
                self.model_cfg = model_cfg
                self.dataset_cfg = dataset_cfg
                self.infer_cfg = self.dataset_cfg['infer_cfg']
                self.dataset = build_dataset_from_cfg(self.dataset_cfg)

Reproduces the problem - code/configuration sample

The config being run:

from docs.en.conf import project
from opencompass.datasets.mymodel.load_data import MyDataset
from opencompass.openicl.icl_retriever import ZeroRetriever
from opencompass.openicl.icl_inferencer import GenInferencer
from opencompass.runners import LocalRunner
from opencompass.tasks import OpenICLInferTask
from opencompass.openicl.icl_prompt_template import PromptTemplate
from opencompass.partitioners import NaivePartitioner

from opencompass.metrics.mymodel.eval_metrics import ObjectiveEvaluator
from opencompass.models.mymodel.llama_cpp_model import LlamaCppModel
from opencompass.models import HuggingFacewithChatTemplate,HuggingFaceBaseModel

reader_cfg = dict(
    input_columns=['question', 'prompt'],
    output_column='reference',
)

infer_cfg = dict(
    # Prompt generation config
    prompt_template=dict(
        type=PromptTemplate,
        # Prompt template; its form must match the inferencer type specified below
        # Here, to compute PPL, a prompt template must be given for each answer
        template=dict(
            begin=[
                dict(role='system', prompt='{prompt}\n')
            ],
            round=[
                dict(role='user', prompt='{question}\n'),
            ])),
    retriever=dict(type=ZeroRetriever),
    # Inference method config
    #   - PPLInferencer uses PPL (perplexity) to obtain the answer
    #   - GenInferencer uses the model's generated output to obtain the answer
    inferencer=dict(type=GenInferencer),
    partitioner=dict(type=NaivePartitioner),
    runner=dict(type=LocalRunner, max_num_workers=16, task=dict(type=OpenICLInferTask)),
)

project_ch=['文本扩写']
project_en=['text_expansion']

datasets = []

for ch,en in zip(project_ch,project_en):
    eval_cfg = dict(evaluator=dict(type=ObjectiveEvaluator, project=ch), num_gpus=0)
    datasets.append(dict(
            type=MyDataset,
            abbr=en,
            filename=en,
            path=f'./data/{en}',
            reader_cfg=reader_cfg,
            infer_cfg=infer_cfg,
            eval_cfg=eval_cfg)
    )

models = [
    dict(
        abbr='qwen2.5-1.5B-instruct',
        type=HuggingFacewithChatTemplate,
        path='opencompass/models/mymodel/modelscope/qwen/Qwen2___5-1___5B-instruct',
        max_out_len=5000,
        batch_size=16,
        run_cfg=dict(num_gpus=0),
        stop_words=['<|im_end|>', '<|im_start|>'],
        max_seq_len=5000
        # model_kwargs=dict(tensor_parallel_size=1, gpu_memory_utilization=0.8),
    )
]
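
# Dataset definition (lives in its own module, not in the config file)
@LOAD_DATASET.register_module()
class MyDataset(BaseDataset):
    ...  # class body not included in the report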

Reproduces the problem - command or script

python3 run.py configs/objective_testcase/eval_all.py

Reproduces the problem - error message

Standard output:

launch OpenICLInfer[qwen2.5-1.5B-instruct/schedule_extraction] on CPU                                                                   
  0%|                                                                                                             | 0/1 [00:00<?, ?it/s]11/29 14:19:04 - OpenCompass - ERROR - /home/caesar/codes/opencompass/opencompass/runners/local.py - _launch - 236 - task OpenICLInfer[qwen2.5-1.5B-instruct/schedule_extraction] fail, see
outputs/all/20241129_141900/logs/infer/qwen2.5-1.5B-instruct/schedule_extraction.out
100%| 1/1 [00:04<00:00,  4.80s/it]
11/29 14:19:04 - OpenCompass - ERROR - /home/caesar/codes/opencompass/opencompass/runners/base.py - summarize - 64 - OpenICLInfer[qwen2.5-1.5B-instruct/schedule_extraction] failed with code 1
11/29 14:19:04 - OpenCompass - INFO - Partitioned into 1 tasks.

Output log:

11/29 15:08:35 - OpenCompass - INFO - Task [qwen2.5-1.5B-instruct/schedule_extraction]
WARNING 11-29 15:08:37 _custom_ops.py:19] Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
11/29 15:08:40 - OpenCompass - INFO - using stop words: ['<|endoftext|>', '<|im_start|>', '<|im_end|>']
Traceback (most recent call last):
  File "/home/caesar/codes/opencompass/opencompass/tasks/openicl_infer.py", line 161, in <module>
    inferencer.run()
  File "/home/caesar/codes/opencompass/opencompass/tasks/openicl_infer.py", line 79, in run
    self.dataset = build_dataset_from_cfg(self.dataset_cfg)
  File "/home/caesar/miniconda3/envs/opencompass/lib/python3.10/site-packages/opencompass/utils/build.py", line 13, in build_dataset_from_cfg
    return LOAD_DATASET.build(dataset_cfg)
  File "/home/caesar/miniconda3/envs/opencompass/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/home/caesar/miniconda3/envs/opencompass/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 100, in build_from_cfg
    raise KeyError(
KeyError: 'opencompass.datasets.mymodel.mymodel.MyDataset is not in the opencompass::load_dataset registry. Please check whether the value of `opencompass.datasets.mymodel.mymodel.MyDataset` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'
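
One detail in the traceback may be worth checking: openicl_infer.py is loaded from /home/caesar/codes/opencompass, while build.py is loaded from site-packages/opencompass. That suggests two copies of the package may be in play, so a dataset added to the source checkout might not exist in the installed copy. A minimal sketch for checking which copy Python resolves, using only the standard library (the helper name `module_origin` is hypothetical):

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None.

    find_spec() locates the package without importing it.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == '__main__':
    # Prints the path of opencompass/__init__.py that this interpreter
    # would use, or None if the package cannot be found.
    print(module_origin('opencompass'))
```

If the printed path points into site-packages rather than the checkout, reinstalling in editable mode (`pip install -e .` from the checkout) is one possible remedy, though whether it applies here is an assumption about this setup.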

Other information

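Printing LOAD_DATASET in the main function always shows MyDataset, but when the failing code runs, MyDataset is no longer in LOAD_DATASET. The print in main shows it both before and after the task is executed. I don't know whether this is caused by multithreading.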

@tonysy
Collaborator

tonysy commented Dec 5, 2024

You need to import the dataset in opencompass/datasets/__init__.py
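
A sketch of why this is likely needed: the traceback shows openicl_infer.py executing at `<module>` level, which suggests the task runs in a separate Python process rather than in a thread. A fresh process starts with an empty registry and only sees classes whose modules it imports itself. The toy package below reproduces that behaviour with a plain dict in place of the registry; `demo_pkg`, `REGISTRY` and both helper functions are hypothetical stand-ins, not OpenCompass code.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path, import_in_init: bool) -> None:
    """Write a tiny package `demo_pkg` with a dict-based registry under root."""
    pkg = root / 'demo_pkg'
    pkg.mkdir()
    init_src = 'REGISTRY = {}\n'
    if import_in_init:
        # Mirrors the suggested fix: the package imports the dataset module.
        init_src += 'from . import my_dataset\n'
    (pkg / '__init__.py').write_text(init_src)
    # Registration is a side effect of importing this module.
    (pkg / 'my_dataset.py').write_text(
        'from . import REGISTRY\n'
        '\n'
        'class MyDataset:\n'
        '    pass\n'
        '\n'
        "REGISTRY['MyDataset'] = MyDataset\n"
    )


def fresh_process_sees_dataset(import_in_init: bool) -> bool:
    """Start a new interpreter and report whether MyDataset is registered."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp), import_in_init)
        env = dict(os.environ, PYTHONPATH=tmp)
        code = "import demo_pkg; print('MyDataset' in demo_pkg.REGISTRY)"
        result = subprocess.run(
            [sys.executable, '-c', code],
            env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip() == 'True'


if __name__ == '__main__':
    print(fresh_process_sees_dataset(import_in_init=False))
    print(fresh_process_sees_dataset(import_in_init=True))
```

Applied to this issue, the analogous change would be an import of the module that defines MyDataset inside opencompass/datasets/__init__.py. The exact module path is an assumption: the config imports the class from opencompass.datasets.mymodel.load_data, while the error message names opencompass.datasets.mymodel.mymodel.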
