content/labs/bioasq.md
Lines changed: 46 additions & 23 deletions
Original file line number
Diff line number
Diff line change
@@ -2,7 +2,7 @@
2
2
title: "BioASQ"
3
3
draft: false
4
4
params:
5
-
subtitle: "Large-scale biomedical semantic indexing and question answering"
5
+
subtitle: "A challenge in large-scale biomedical semantic indexing and question answering"
6
6
url: https://www.bioasq.org/workshop2026
7
7
menu:
8
8
main:
@@ -13,29 +13,52 @@ menu:
13
13
14
14
15
15
16
-
BioASQ organizes a series of challenges (shared tasks) for biomedical information access and machine learning systems in two complementary research directions: (a) the automated indexing of large volumes of unlabelled data, such as scientific articles, with biomedical concepts, (b) the processing of biomedical questions and the generation of comprehensible answers. Regarding the first direction, this year BioASQ introduces i) the new Task BioNNE-R, Nested Relation Extraction in Russian and English, ii) a new edition of Task ELCardioCC on Clinical Coding of Greek Cardiology Discharge Letters written in Greek, that focuses on document-level annotation, and iii) a new edition of Task GutBrainIE on Gut-Brain Interplay Information Extraction, incorporating more diverse and relevant biomedical literature. Regarding biomedical Question Answering (QA) direction, a whole infrastructure has been developed to support the established QA task (Task B), as well as the innovative Task Synergy, on QA for developing problems. In addition, a new edition of Task MultiClinSum on multilingual summarization of clinical case reports is introduced this year (Task MultiClinSum-2), extended to additional languages, namely German, Dutch, Catalan, Swedish, Norwegian, and Italian.
17
-
16
+
The aim of the BioASQ Lab is to push the research frontier towards systems that use the diverse and voluminous information available online to respond directly to the information needs of biomedical scientists.
18
17
<!--more-->
19
18
20
-
## Organizers
19
+
## Tasks
20
+
21
+
The BioASQ Lab features six different tasks:
21
22
22
-
- Anastasia Krithara (NCSR "Demokritos")
23
-
- Anastasios Nentidis (NSCR "Demokritos")
24
-
- Martin Krallinger (BSC)
25
-
- Miguel Rodriguez Ortega (BSC)
26
-
- Elena Tutubalina (KFU)
27
-
- Natalia Loukachevitch (Research Computing Center of Moscow State University)
28
-
- Igor Rozhkov (Lomonosov Moscow State University)
29
-
- Giorgio Maria Di Nunzio (University of Padova)
30
-
- Nicola Ferro (University of Padova)
31
-
- Stefano Marchesin (University of Padova)
32
-
- Marco Martinelli (University of Padova)
33
-
- Gianmaria Silvello (University of Padua)
34
-
- Grigorios Tsoumakas (Aristotle University of Thessaloniki)
35
-
- George Giannakoulas (Aristotle University of Thessaloniki)
36
-
- Dimitris Dimitriadis (Aristotle University of Thessaloniki)
37
-
- Alexandra Bekiaridou (Northwell Health)
38
-
- Athanasios Samaras (Aristotle University of Thessaloniki)
39
-
- Vasiliki Patsiou (Aristotle University of Thessaloniki)
The task uses benchmark datasets of biomedical questions in English, along with gold-standard (reference) answers constructed by a team of biomedical experts. Participants respond with relevant articles and snippets from designated resources, as well as exact and "ideal" answers.
26
+
{{< /accordion-item >}}
27
+
{{< accordion-item title="Synergy: Question Answering for developing problems" >}}
28
+
Biomedical experts pose unanswered questions for developing problems, such as COVID-19, receive the responses provided by the participating systems, and provide feedback, together with updated questions in an iterative procedure that aims to facilitate the incremental understanding of developing problems in biomedicine and public health.
A shared task on the automatic summarization of lengthy clinical case reports in multiple languages. The organizers distribute reports written in English, Spanish, French, Portuguese, German, Dutch, Catalan, Swedish, Norwegian, and Italian. Participants generate summaries, which are evaluated against manually written summaries of the same reports.
32
+
{{< /accordion-item >}}
33
+
{{< accordion-item title="BioNNE-R: Nested Relation Extraction in Russian and English" >}}
34
+
A shared task on NLP challenges in nested entity linking and relation extraction for English and Russian. The training and development sets will include relations among the most popular entities found in the NEREL-BIO dataset, such as disorders, anatomical structures, procedures, and chemicals. The evaluation is based on a comparison with manual nested relation annotations.
35
+
{{< /accordion-item >}}
36
+
{{< accordion-item title="ELCardioCC: Clinical Coding in Cardiology" >}}
37
+
The ELCardioCC 2026 shared task concerns the automatic assignment of cardiology-related ICD-10 codes to hospital discharge letters at the document level. The dataset comprises 5,000 documents for training and development and 1,000 documents for testing.
38
+
{{< /accordion-item >}}
39
+
{{< accordion-item title="GutBrainIE: Gut-Brain Interplay Information Extraction" >}}
40
+
The GutBrainIE task aims to foster the development of Information Extraction (IE) systems that support experts by automatically extracting and linking knowledge from scientific literature, facilitating the understanding of gut-brain interplay and its role in neurological disease. The task is divided into three subtasks: i) extracting named entities, ii) identifying binary relations between entity pairs, and iii) linking entities to concepts in a reference ontology.
41
+
{{< /accordion-item >}}
42
+
{{< /accordion >}}
43
+
44
+
## Organizers
41
45
46
+
- Anastasia Krithara (National Center for Scientific Research "Demokritos", Greece)
47
+
- Anastasios Nentidis (National Center for Scientific Research "Demokritos", Greece)
48
+
- Martin Krallinger (Barcelona Supercomputing Center, Spain)
49
+
- Miguel Rodriguez Ortega (Barcelona Supercomputing Center, Spain)
50
+
- Elena Tutubalina (Artificial Intelligence Research Institute, Russia & Kazan Federal University, Russia)
51
+
- Natalia Loukachevitch (Moscow State University, Russia)
52
+
- Igor Rozhkov (Moscow State University, Russia)
53
+
- Giorgio Maria Di Nunzio (University of Padua, Italy)
54
+
- Nicola Ferro (University of Padua, Italy)
55
+
- Stefano Marchesin (University of Padua, Italy)
56
+
- Marco Martinelli (University of Padua, Italy)
57
+
- Gianmaria Silvello (University of Padua, Italy)
58
+
- Grigorios Tsoumakas (Aristotle University of Thessaloniki, Greece)
59
+
- George Giannakoulas (Aristotle University of Thessaloniki, Greece)
60
+
- Dimitris Dimitriadis (Aristotle University of Thessaloniki, Greece)
61
+
- Alexandra Bekiaridou (Northwell Health, USA)
62
+
- Athanasios Samaras (Aristotle University of Thessaloniki, Greece)
63
+
- Vasiliki Patsiou (Aristotle University of Thessaloniki, Greece)
64
+
- Georgios Paliouras (National Center for Scientific Research "Demokritos", Greece)
content/labs/checkthat.md
Lines changed: 21 additions & 8 deletions
Original file line number
Diff line number
Diff line change
@@ -13,14 +13,27 @@ menu:
13
13
14
14
15
15
16
-
The 9th edition of the CheckThat! lab at CLEF targets three tasks: (i) scientific web discourse, (ii) generating full-fact-checking articles, and (iii) fact-checking numerical and temporal claims. These tasks represent challenging classification and retrieval problems, including multilingual settings.
16
+
The main objective of the lab is to advance the development of innovative technologies combating disinformation and manipulation efforts in online communication across a multitude of languages and platforms. The 9th edition of the CheckThat! lab at CLEF targets three tasks: (i) source retrieval for scientific web claims, (ii) fact-checking numerical claims, and (iii) generating full fact-checking articles.
17
17
18
18
<!--more-->
19
19
20
-
- Julia Maria Struß (University of Applied Sciences Potsdam)
21
-
- Venktesh V (Delft University of Technology)
22
-
- Vinay Setty (University of Stavanger)
23
-
- Stefan Dietze (GESIS Leibniz Institute for the Social Sciences)
24
-
- Tanmoy Chakraborty (IIT Delhi)
25
-
- Preslav Nakov (Mohamed bin Zayed University of Artificial Intelligence)
26
-
- Sebastian Schellhammer (GESIS Leibniz Institute for the Social Sciences)
20
+
## Tasks
21
+
22
+
The CheckThat! Lab features three different tasks:
23
+
24
+
{{< accordion >}}
25
+
{{< accordion-item title="Source Retrieval for Scientific Web Claims" >}}
26
+
Given a social media post that contains a scientific claim and an implicit reference to a scientific paper (that is, the post mentions the paper without a URL), retrieve the mentioned paper from a pool of candidate papers.
This task involves verifying naturally occurring claims containing numerical quantities and temporal expressions by improving the reasoning process of Large Language Models (LLMs) through test-time scaling.
30
+
{{< /accordion-item >}}
31
+
{{< accordion-item title="Generating Full Fact-Checking Articles" >}}
32
+
Given a claim, its veracity, and a set of evidence documents consulted for fact-checking the claim, generate a full fact-checking article.
33
+
{{< /accordion-item >}}
34
+
{{< /accordion >}}
35
+
36
+
## Organizers
37
+
38
+
- Julia Maria Struß (University of Applied Sciences Potsdam, Germany)
39
+
- Sebastian Schellhammer (GESIS Leibniz Institute for the Social Sciences, Germany)
content/labs/eloquent.md
Lines changed: 29 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -2,7 +2,7 @@
2
2
title: "ELOQUENT"
3
3
draft: false
4
4
params:
5
-
subtitle: "New evaluation methods for generative language models"
5
+
subtitle: "Lab for evaluation of generative language model quality"
6
6
url: https://eloquent-lab.github.io/
7
7
menu:
8
8
main:
@@ -13,17 +13,41 @@ menu:
13
13
14
14
15
15
16
-
The ELOQUENT evaluation lab experiments with new evaluation methods for generative language models to meet some of the challenges in the path from laboratory to application. The organisers include commercially active AI developers as well as research groups. This lab explores the following important characteristics of generative language
17
-
model quality: (1) Trustworthiness, a many-faceted notion which involves topical relevance and truthfulness, discourse competence, reasoning in language, controllability, and robustness across varied input, which is at the forefront of current development projects for generative language models; (2) Multi-linguality and cultural fit: the suitability of a language model for some cultural and linguistic area which is at top of attention, not least for the European arena; (3) Self-assessment: the reliability of a language model to assess the quality of itself or some other language model, using as little human effort as possible; (4) Limits of language models: the delimitation of world knowledge and generative capacity.
16
+
The ELOQUENT lab for evaluation of generative language model quality and usefulness addresses high-level quality criteria for generative language models through a set of open-ended shared tasks.
18
17
19
18
<!--more-->
20
19
20
+
## Tasks
21
+
22
+
The ELOQUENT Lab features three different tasks:
23
+
24
+
{{< accordion >}}
25
+
{{< accordion-item title="Voight-Kampff" >}}
26
+
Can machine-generated text be distinguished from human-authored text?
27
+
{{< /accordion-item >}}
28
+
{{< accordion-item title="Robustness" >}}
29
+
Will a generative language model's output reflect cultural variety? Will it be able to provide robust responses irrespective of interaction language?
30
+
{{< /accordion-item >}}
31
+
{{< accordion-item title="Topical PISA Quiz" >}}
32
+
Can a generative language model create a useful topical quiz from given text? Can it score responses to such quiz questions?
33
+
{{< /accordion-item >}}
34
+
{{< /accordion >}}
35
+
36
+
21
37
## Organizers
22
38
23
39
- Jussi Karlgren (AMD Silo AI)
24
-
- Ondřej Bojar (Charles University in Prague, ÚFAL)
25
-
- Marie Engels (Fraunhofer IAIS)
40
+
- Marie Isabel Engels (Fraunhofer IAIS)
41
+
- Maria Barrett
42
+
- Diandra Fabre
26
43
- Pavel Šindelář (Charles University)
44
+
- Ondřej Bojar (Charles University in Prague, ÚFAL)
content/labs/erisk.md
Lines changed: 22 additions & 7 deletions
Original file line number
Diff line number
Diff line change
@@ -3,24 +3,39 @@ title: "eRisk"
3
3
draft: false
4
4
params:
5
5
subtitle: "Early risk prediction on the internet"
6
-
url: https://erisk.irlab.org
6
+
url: https://erisk.irlab.org/
7
7
menu:
8
8
main:
9
9
identifier: "lab-erisk"
10
10
parent: "labs"
11
11
weight: 40
12
12
---
13
13
14
+
eRisk explores the evaluation methodology, effectiveness metrics, and practical applications of early risk detection on the Internet. Early detection technologies can be employed in different areas, particularly those related to health and safety. Our main goal is to pioneer a new interdisciplinary research area that would be potentially applicable to a wide variety of situations and to many different personal profiles.
14
15
16
+
<!--more-->
15
17
16
-
We propose eRisk 2026, the next edition of CLEF’s lab series on early risk prediction in online data. Building on nine previous editions (2017–2025), that explored important tasks such as depression, anorexia, self-harm, pathological gambling or eating disorders. This edition, eRisk 2026, introduces three main challenges: The first task is related with the interaction with conversational agents who have been instructed to simulate different user behaviours and conditions. Participants need to interact with the LLMs, predict the depression severity and their main symptoms (if exists). The second task corresponds to the second edition of the Contextualised Early-Depression Detection task, leveraging full Reddit conversation threads for richer conversational and contextual scenarios to emit timely risk predictions. Finally, the third task, symptom sentence ranking for Attention-Deficit Hyperactivity Disorder (ADHD), extends our ranking framework to a previously unexplored condition. For considering the ADHD symptoms, we use the ASRS-v1.1 clinical questionnaire. The lab continues the established three-year task cycle, offers baselines and high-quality datasets, and advances conversational and symptom level analysis as key elements for mental health solutions.
Detecting depression through conversational agents. Participants will interact with LLM personas fine-tuned with diverse user histories and released on Hugging Face. The challenge is to determine whether each persona exhibits signs of depression and, within a limited conversational window, identify active depressive symptoms and the overall depression level.
25
+
{{< /accordion-item >}}
26
+
{{< accordion-item title="Contextualised Early Detection of Depression" >}}
27
+
This task continues last year’s shift from isolated posts to full conversational contexts, aiming to capture real interaction dynamics across multiple speakers. Participants must process dialogues sequentially and accumulate evidence, since a message may become informative only when interpreted alongside preceding or subsequent posts.
28
+
{{< /accordion-item >}}
29
+
{{< accordion-item title="Sentence Ranking for ADHD Symptoms" >}}
30
+
This new task targets sentence-level retrieval for the 18 symptoms defined in the Adult ADHD Self-Report Scale (ASRS-v1.1). Participants must rank candidate sentences by their relevance to each symptom. A sentence is considered relevant when it conveys information about the user’s state with respect to the target ADHD symptom (irrespective of polarity or stance), encouraging models to capture clinically meaningful evidence rather than surface keywords.
31
+
{{< /accordion-item >}}
32
+
{{< /accordion >}}
17
33
18
-
<!--more-->
19
34
20
35
## Organizers
21
36
22
-
- Anxo Pérez (University of A Coruña)
23
-
-Xi Wang (University of Sheffield)
24
-
-Javier Parapar (University of A Coruña)
25
-
- Fabio Crestani (Università della Svizzera Italiana)
37
+
- Anxo Pérez (University of A Coruña, Spain)
38
+
- Javier Parapar (University of A Coruña, Spain)
39
+
- Xi Wang (University of Sheffield, United Kingdom)
40
+
- Fabio Crestani (Università della Svizzera Italiana, Switzerland)
content/labs/exist.md
Lines changed: 34 additions & 8 deletions
Original file line number
Diff line number
Diff line change
@@ -3,6 +3,7 @@ title: "EXIST"
3
3
draft: false
4
4
params:
5
5
subtitle: "Sexism identification in social networks"
6
+
url: https://nlp.uned.es/exist2026/
6
7
menu:
7
8
main:
8
9
identifier: "lab-exist"
@@ -12,16 +13,41 @@ menu:
12
13
13
14
14
15
15
-
This lab focuses on the detection of sexist messages in social networks (SN). Inequality and discrimination against women that remains embedded in society is increasingly being replicated online. Internet perpetuates and even naturalizes gender differences and sexist attitudes. The EXIST 2026 lab will continue to focus on the detection of sexism in social networks, while introducing a novel paradigm that integrates human-centered signals into the AI development pipeline. In this edition, we extend the "learning with disagreement" framework by incorporating sensor-based data from people exposed to potentially sexist content. This includes measurements such as skin conductance, heart rate variability, and other sensor data that reflect unconscious responses to sexism. Given the nature of these multimodal signals, we will concentrate on analyzing memes and short videos—formats that combine visual and textual cues and are especially suited for capturing the emotional and cognitive impact of online content. This human-in-the-loop approach not only acknowledges the diversity of subjective reactions to sexism, but also opens new avenues for building more robust, equitable, and interpretable systems. By integrating both conscious feedback and unconscious reactions from annotators, EXIST 2026 aims to foster a more nuanced and ethically grounded understanding of sexism across platforms and formats.
16
+
This lab focuses on the detection of sexist messages in social networks (SN), particularly in complex multimedia formats such as memes and short videos. The inequality and discrimination against women that remain embedded in society are increasingly being replicated online. In this edition, we extend the Learning with Disagreement (LwD) framework by incorporating sensor-based data from people exposed to potentially sexist content. This includes measurements such as heart rate variability, EEG, eye-tracking, and other sensor data.
This task aims to categorize each meme into different types of sexism, according to an expert-proposed categorization that considers the different facets of women that are undermined.
33
+
{{< /accordion-item >}}
34
+
{{< /accordion >}}
35
+
19
36
## Organizers
20
37
21
-
- Jorge Carrillo-De-Albornoz (Universidad Nacional de Educacion a Distancia)
22
-
- Laura Plaza (Universidad Nacional de Educación a Distancia)
23
-
- Damiano Spina (RMIT University)
24
-
- Paolo Rosso (Universitat Politècnica de València)
25
-
- Iván Arcos (Universitat Politècnica de València)
26
-
- Elena Gomis (Universitat Politècnica de València)
27
-
- María Aloy (Universitat Politècnica de València)
38
+
- Laura Plaza (Universidad Nacional de Educación a Distancia, Spain)
39
+
- Jorge Carrillo-De-Albornoz (Universidad Nacional de Educación a Distancia, Spain)
40
+
- Iván Arcos (Universitat Politècnica de València, Spain)
41
+
- María Aloy (Universitat Politècnica de València, Spain)
42
+
- Paolo Rosso (Universitat Politècnica de València, Spain)