- (ICLR 2025) Can Knowledge Editing Really Correct Hallucinations?
  - We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations across five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. We find that their effectiveness can fall far short of what their performance on existing datasets suggests, and that performance beyond Efficacy is generally unsatisfactory for all methods (see the evaluation sketch after this list).
- (Preprint) Can Editing LLMs Inject Harm?
  - We propose to reformulate knowledge editing as a new type of safety threat for LLMs, namely Editing Attack, and discover its emerging risk of stealthily injecting misinformation or bias into LLMs, indicating the feasibility of disseminating misinformation or bias with LLMs as new channels.
- (SIGKDD Explorations 2024) Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges
  - This survey paper systematically categorizes authorship attribution in the era of LLMs into four problems: attributing unknown texts to human authors, detecting LLM-generated texts, identifying specific LLMs or human authors, and classifying texts as human-authored, machine-generated, or co-authored by both, while also highlighting key challenges and open problems.
- (EMNLP 2024 Findings) Can Large Language Models Identify Authorship?
  - We propose a Linguistically Informed Prompting (LIP) strategy, which offers in-context linguistic guidance to boost LLMs' reasoning capacity for authorship verification and attribution tasks while also providing natural language explanations (see the prompting sketch after this list).
- (AI Magazine 2024) Combating Misinformation in the Age of LLMs: Opportunities and Challenges
  - A survey of the opportunities (can we utilize LLMs to combat misinformation?) and challenges (how to combat LLM-generated misinformation?) of combating misinformation in the age of LLMs.
- (ICLR 2024) Can LLM-Generated Misinformation Be Detected?
  - We discover that LLM-generated misinformation can be harder for both humans and detectors to detect than human-written misinformation with the same semantics, which suggests that it can have more deceptive styles and potentially cause more harm.
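The evaluation sketch referenced above is a minimal illustration of the kind of multi-dimensional scoring HalluEditBench describes, assuming a hypothetical `edited_model` callable, per-dimension question sets, and an `answers_match` helper; it is not the benchmark's actual API or judging protocol.

```python
# Illustrative sketch only: scoring an edited model along the five HalluEditBench
# dimensions. The edited_model callable, question_sets layout, and answers_match
# helper are assumptions, not the benchmark's real interface.

DIMENSIONS = ["Efficacy", "Generalization", "Portability", "Locality", "Robustness"]

def answers_match(predicted: str, expected: str) -> bool:
    # Placeholder correctness check; the paper uses its own judging protocol.
    return predicted.strip().lower() == expected.strip().lower()

def evaluate_edited_model(edited_model, question_sets: dict) -> dict:
    """question_sets maps each dimension name to a list of (question, expected_answer) pairs."""
    scores = {}
    for dim in DIMENSIONS:
        pairs = question_sets[dim]
        correct = sum(answers_match(edited_model(q), a) for q, a in pairs)
        scores[dim] = correct / len(pairs)  # accuracy on that dimension's probes
    return scores
```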
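The prompting sketch referenced above shows what an in-context linguistic-guidance prompt for authorship verification could look like, assuming a generic chat-style `llm` callable; the guidance wording is an assumption, not the exact LIP template from the paper.

```python
# Illustrative sketch only: building a linguistically informed prompt for
# authorship verification. The guidance text and the llm() callable are
# assumptions, not the exact LIP template or interface from the paper.

LINGUISTIC_GUIDANCE = (
    "Before answering, compare the two texts along linguistic dimensions such as "
    "vocabulary richness, sentence structure, punctuation habits, discourse "
    "markers, and overall tone."
)

def build_verification_prompt(text_a: str, text_b: str) -> str:
    return (
        f"{LINGUISTIC_GUIDANCE}\n\n"
        f"Text 1:\n{text_a}\n\n"
        f"Text 2:\n{text_b}\n\n"
        "Question: Were these two texts written by the same author? "
        "Answer yes or no, and explain your reasoning in natural language."
    )

# Usage with any chat-style LLM wrapper (llm is a placeholder callable):
# response = llm(build_verification_prompt(doc1, doc2))
```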