Is the data directory already the complete dataset? #1

Closed
bojone opened this issue Jan 27, 2021 · 1 comment

Comments

@bojone

bojone commented Jan 27, 2021

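Hello, I just came across this paper on arXiv today and find the task quite interesting. The dataset under the data directory looks very small, so I assume it is not the complete dataset? Would it be possible to apply for access to the full one?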

@Pzoom522
Owner

Pzoom522 commented Jan 27, 2021

Hello, Su Jianlin! Thank you for your interest in our work :heart:

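Unfortunately, as we mention in the paper, the data released here, 100 entries for each of the two languages, is already the complete corpus. We cannot scale it up at this stage because annotating data for this task is extremely costly; for that reason, all of it should be used for model testing rather than training. Around the time of the EACL 2021 conference, we will fully release our pretrained word embeddings, transfer-learning summarization models, and related resources. You can then build on them to explore new strategies for classical-text summarization that do not depend on large-scale annotated training data.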

If you have any questions about our project or paper, feel free to leave a comment and discuss~
