About link prediction #4

Closed
ZihaoZheng98 opened this issue Jun 7, 2022 · 22 comments
Labels
question Further information is requested

Comments

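@ZihaoZheng98

Hi, a question about the link prediction part: the paper seems to say that BERT's vocabulary was extended. Shouldn't the corresponding vocab or config files be uploaded? As far as I can tell, the code only touches this in the place shown below. If I have missed something, please let me know. Thanks!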

vision_config = CLIPConfig.from_pretrained('/home/lilei/package/clip-vit-base-patch32').vision_config
text_config = BertConfig.from_pretrained('/home/lilei/package/bert-base-uncased')
bert = BertModel.from_pretrained('/home/lilei/package/bert-base-uncased')
@flow3rdown
Contributor

Hello, the entities are added at runtime, so there is no need to modify the vocab or config files explicitly.

For example, the tokenizer is updated with:

num_added_tokens = self.tokenizer.add_special_tokens({'additional_special_tokens': entity_list})

The model's embedding layer is updated with:

# resize the word embedding layer
self.model.resize_token_embeddings(len(self.tokenizer))
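
Note: the two calls above can be pictured with a small library-free sketch. This is a toy illustration under stated assumptions, not the transformers implementation: the helper functions, the tiny vocabulary, the embedding size, and the entity token names are all hypothetical. It only shows the general idea that new tokens receive fresh IDs at the end of the vocabulary and that the embedding table grows to match while the existing rows are kept.

```python
import random


def add_special_tokens(vocab, new_tokens):
    """Append tokens that are not yet in the vocab; return how many were added."""
    added = 0
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # next free ID
            added += 1
    return added


def resize_token_embeddings(embeddings, new_size, dim, rng):
    """Return a table with new_size rows: old rows copied, extra rows freshly initialised."""
    resized = [row[:] for row in embeddings[:new_size]]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return resized


rng = random.Random(0)
dim = 4  # toy embedding size (assumption)
vocab = {"[PAD]": 0, "[UNK]": 1, "dog": 2, "cat": 3}  # toy vocabulary (assumption)
embeddings = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in vocab]

entity_list = ["[ENTITY_0]", "[ENTITY_1]"]  # hypothetical entity tokens
num_added_tokens = add_special_tokens(vocab, entity_list)
new_embeddings = resize_token_embeddings(embeddings, len(vocab), dim, rng)

print(num_added_tokens, len(new_embeddings))
```

Because the vocabulary and the embedding table are both extended in memory each time the program starts, nothing has to be written back to the vocab or config files on disk.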

@ZihaoZheng98
Author

OK, thanks! One more suggestion: the README appears to say "bash **.py", which should be corrected.

@flow3rdown
Contributor

Fixed. Thanks for pointing it out.

@flow3rdown added the question (Further information is requested) label on Jun 9, 2022
@liang-ry

What format should the WN18-images dataset have for the MKG task? I downloaded it following the RSME tutorial, but the model cannot be trained.

@flow3rdown
Contributor

WN18-images is laid out as follows:
[image]
Each subfolder is named with the entity ID prefixed by 'n'.
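
Note: a layout like this can be sanity-checked with a few lines of Python. The sketch below is only an assumption-laden helper, not part of the repository: the function name check_image_layout is hypothetical, the folder pattern ('n' followed by digits) is inferred from the description above, and the file pattern follows the 'n00004475_0.JPEG' example reported later in this thread. The ID 00001740 in the demo is a made-up example.

```python
import re
import tempfile
from pathlib import Path

# Assumption: a valid subfolder is 'n' followed by the numeric entity ID.
FOLDER_RE = re.compile(r"^n\d+$")


def check_image_layout(root):
    """Return a list of human-readable problems found under root."""
    problems = []
    for sub in sorted(Path(root).iterdir()):
        if not sub.is_dir():
            continue
        if not FOLDER_RE.match(sub.name):
            problems.append(f"unexpected folder name: {sub.name}")
            continue
        files = [f for f in sub.iterdir() if f.is_file()]
        if not files:
            problems.append(f"no images in {sub.name}")
        for f in files:
            # Assumption: files are named <folder>_<index>.<ext>, e.g. n00004475_0.JPEG
            if not re.match(rf"^{re.escape(sub.name)}_\d+\.\w+$", f.name):
                problems.append(f"unexpected file name: {sub.name}/{f.name}")
    return problems


# Self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "n00004475"
    good.mkdir()
    (good / "n00004475_0.JPEG").touch()
    bad = Path(tmp) / "00001740"  # hypothetical folder missing the 'n' prefix
    bad.mkdir()
    demo_problems = check_image_layout(tmp)

print(demo_problems)
```

Pointing check_image_layout at the real image directory lists any folder or file that does not follow the expected naming, which rules out a layout mismatch before looking at other causes.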

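@liang-ry

I am using the same layout, and the images in each subfolder are named like 'n00004475_0.JPEG'. But the training results always look like this:
[image]
Have you run into this problem on your side?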

@flow3rdown
Contributor

The results are normal on my side. How did the model perform in the entity-embedding pretraining stage?

@liang-ry

The entity-embedding results look like this:
[image]
I am not sure whether my dataset is the problem; I used the WN18 dataset from KG-Bert directly. Could you provide a copy of the original WN18 dataset? Thank you.

@flow3rdown
Contributor

The WN18 entity and relation data has been uploaded to MKG/dataset/WN18. Because of license issues, it is not convenient for us to make the data public.

@Maigewm

Maigewm commented Oct 25, 2022

Hello, I have run into the same problem. What exactly went wrong, and how was it solved?

@flow3rdown
Contributor

Hello, do you mean that the training results on WN18 are very poor?

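@Maigewm

Maigewm commented Oct 25, 2022

Yes. Here are my training results, which look very strange.
[image]
This is wn18; the hits are surprisingly high.
[image]
This is fbk; hit@1 is 0. I have retrained many times and always get this result, and I do not know what is going wrong.
All datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?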

@Maigewm

Maigewm commented Oct 26, 2022

@liang-ry Hello, I am seeing the same thing you described. Have you solved it, and if so, how?

@flow3rdown
Contributor

Did you download the data from the Baidu Cloud link in the README? Did you change the scripts you ran? And how were the results in the pretraining stage?

@Maigewm

Maigewm commented Oct 26, 2022

Yes, I used the data from the Baidu Cloud link throughout and did not change the scripts. The pretraining results are also very poor, as shown below:
max number of filter entities : 4364 954
convert text to examples: 100%|██████████████████████████████████████████████| 14951/14951 [00:00<00:00, 213264.86it/s]
100%|██████████████████████████████████████████████████████████████████████████| 14951/14951 [00:07<00:00, 1951.95it/s]
delete entities without text name.: 100%|███████████████████████████████████| 20466/20466 [00:00<00:00, 1050025.39it/s]
total entity not in text : 0
max number of filter entities : 4364 954
convert text to examples: 100%|███████████████████████████████████████████████| 14951/14951 [00:00<00:00, 40823.01it/s]
100%|██████████████████████████████████████████████████████████████████████████| 14951/14951 [00:08<00:00, 1835.56it/s]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
Testing: 0it [00:00, ?it/s]/home/wangmeng/miniconda3/envs/MKG/lib/python3.7/site-packages/transformers/feature_extraction_utils.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484809535/work/torch/csrc/utils/tensor_new.cpp:201.)
tensor = as_tensor(value)
Testing DataLoader 0: 100%|██████████████████████████████████████████████████████████| 234/234 [03:00<00:00, 1.30it/s]
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Test metric DataLoader 0
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Test/hits1 0.0
Test/hits10 0.0006019664236505919
Test/hits20 0.0011370476891177847
Test/hits3 0.0002006554745501973
Test/mean_rank 7485.40171226005
Test/mrr 0.0005941319530979928
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[{'Test/hits10': 0.0006019664236505919, 'Test/hits20': 0.0011370476891177847, 'Test/hits3': 0.0002006554745501973, 'Test/hits1': 0.0, 'Test/mean_rank': 7485.40171226005, 'Test/mrr': 0.0005941319530979928}]
pathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpathpath
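
Note: for reading numbers like these, it can help to compare them with what pure chance would give. The sketch below is an illustration only and does not diagnose the cause. The function names are hypothetical, and treating 14951 (the count shown in the progress bars above) as the number of candidate entities is an assumption. It computes hits@k, mean rank, and MRR from a list of 1-based ranks, plus the expected values if the correct entity's rank were uniformly random.

```python
def ranking_metrics(ranks, ks=(1, 3, 10, 20)):
    """Compute hits@k, mean rank and MRR from 1-based ranks of the correct entity."""
    n = len(ranks)
    metrics = {f"hits{k}": sum(r <= k for r in ranks) / n for k in ks}
    metrics["mean_rank"] = sum(ranks) / n
    metrics["mrr"] = sum(1.0 / r for r in ranks) / n
    return metrics


def random_baseline(num_candidates, ks=(1, 3, 10, 20)):
    """Expected metrics if the correct rank were uniform over 1..num_candidates."""
    base = {f"hits{k}": min(k, num_candidates) / num_candidates for k in ks}
    base["mean_rank"] = (num_candidates + 1) / 2
    base["mrr"] = sum(1.0 / r for r in range(1, num_candidates + 1)) / num_candidates
    return base


# Assumption: 14951 candidates, taken from the example count in the log above.
baseline = random_baseline(14951)
for name, value in sorted(baseline.items()):
    print(name, value)
```

Under that assumption the chance-level mean rank is (14951 + 1) / 2 = 7476, which is close to the reported Test/mean_rank of about 7485. Numbers in that range suggest the model is ranking entities roughly at random rather than being slightly under-trained.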

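@KiteYN

KiteYN commented Nov 7, 2022

Hello, I also get very poor pretraining results on fb15k-237 with the original, unmodified scripts, which in turn leads to very poor results on the MKGC task. What could be the problem?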

@zxlzr
Contributor

zxlzr commented Nov 7, 2022

Hello, please see #17 (comment) and #16.

@KiteYN

KiteYN commented Nov 7, 2022 via email

@flow3rdown
Contributor

If the results are very poor, it is most likely an environment version issue. Please keep the versions of pytorch, pytorch_lightning, and transformers consistent with requirement.txt.
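
Note: one way to compare an environment against pinned versions is sketched below. It is a rough helper under stated assumptions, not a script from the repository: the function names are hypothetical, only lines of the form name==version are handled, and the single pin in the demo (torchmetrics==0.7.3) is the version the maintainer reports later in this thread. Distribution names may differ from import names, so the names in the requirements file are looked up as written.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: needs the importlib_metadata backport
    import importlib_metadata as metadata


def parse_pins(requirements_text):
    """Extract {package: version} from lines of the form 'name==version'."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (pinned, installed)} for every package that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


# Demo with one pin quoted in this thread; a real run would read the requirements file instead.
pins = parse_pins("torchmetrics==0.7.3\n")
print(find_mismatches(pins))
```

To check a whole environment, the text passed to parse_pins would come from the repository's requirement.txt; an empty dictionary then means every pinned package matches.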

@KiteYN

KiteYN commented Nov 7, 2022

Got it. After changing the pytorch and pytorch_lightning versions, I now get an error:

import pytorch_lightning as pl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

It looks like a torchmetrics issue. Which torchmetrics version are you using on your side?

@flow3rdown
Contributor

torchmetrics==0.7.3

@KiteYN

KiteYN commented Nov 7, 2022

Thanks! The pretraining performance is normal now! ^_^

@zjunlp deleted a comment from Maigewm on Dec 2, 2022
6 participants