[
{
"short": "introduction",
"summary": "Since this is a book about question answering, I should try to answer some questions to start things out. I'm going to try to answer: ``what's this book about'', ``who am I'', and ``who can use this book''."
},
{
"short": "turing",
"summary": "In the 1950s, Alan Turing proposed a parlor game that would come to define the artificial intelligence: could a wiley interrogator discern whom they were talking to just through posing clever questions. The eponymous Turing Test is the most durable (but contentious) definition of what an intelligent computer is, and this chapter reviews how it shaped the development of artificial intelligence.",
"sections": [
{"label": "legacy", "title": "Turing's Legacy", "summary": "Why Turing is widely considered the father of computer science for multiple reasons."},
{"label": "imitation", "title": "The Imitation Game", "summary": "A gay man in England---where homosexuality is criminalized---posits a games about the ability of someone to pass in different gender roles. Why this is difficult lays the foundation for why computers still struggle to capture human nuance."},
{"label": "failures", "title": "No, the Turing Test has not been Solved", "summary": "While many have claimed to pass the Turing Test, it hasn't happened yet. We review some notable claims of passing the Turing Test and why they fell short."},
{"label": "test", "title": "A Rigorous Test", "summary": "The ingredients that make a Turing Test legitimate"},
{"label": "agi", "title": "General Artificial Intelligence", "summary": "How the holy grail of AI is defined and how the Turing Test can measure it."}
]
},
{
"short": "history",
"summary": "Despite its central importance to \\abr{ai}, asking and answering questions is not new---it stretches back into our collective unconcious. Riddles and trivia have persisted for millenia not just because they're entertaining. Indeed, being able to ask and answer questions is close to godliness. From the Sphynx to Gestumblindi, this chapter examines how myths from several cultures connect answering questions to forming identity, unlocking the secrets of the universe, and gaining intelligence. And connecting this to nascent ``superhuman'' \\abr{ai}. Later, beyond proving worthiness in myth, asking questions helped combat corruption and nepotism, the downfall of multiple civilizations. However, the administration of carefully formed questions saved the China and the United States via the reforms of Wu Zetian to the Pendelton Act. Exams created meritocracy and social mobility while also building an understanding of what it means to write a ``fair'' and ``useful'' question.",
"sections": [
{"label": "epic", "title": "Epic Riddles"},
{"label": "civil-service", "title": "How Exams Saved China and the US"}
]
},
{
"short": "classic",
"summary": "The earliest approaches to question answering.",
"sections": [
{"label": "baseball", "title": "Who's on First: Answering Questions about Baseball", "summary": "Green et al.'s 1979 Baseball system"},
{"label": "lunar", "title": "Can you Believe they Put a Man on the Moon?", "summary": "Woods and Kaplan's 1972 LUNAR system"},
{"label": "shrlu", "title": "A Sentient Printing Machine", "summary": "Winograd's SHRDLU"},
{"label": "kb", "title": "Knowledge Bases Get Large", "summary": "Scaling up old-fashioned methods in the Internet age"}
]
},
{
"short": "ir",
"summary": "Information retrieval is the foundation of multi-billion dollar web companies from Baidu to Yahoo! But none of these companies would exist without the ideas of reusable test collections and term-based queries, which came about because of a few crazy experiments that happened in a small UK University in Cranfield. This chapter lays out the history of this methodology (which has become known as the Cranfield paradigm) and how it can answer questions.",
"sections": [
{"label": "cranfield", "title": "Reusible test collections"},
{"label": "tf-idf", "title": "Automatic indexing"},
{"label": "trec", "title": "Making Hide and Seek a Professional Game"}
]
},
{
"short": "manchester",
"summary": "While computer scientists were getting their feet wet answering questions in the wake of World War II, trivia enthusiasts were perfecting how to ask the perfect question. This chapter outlines the conventions and processes of the highest form of question answering---in the biased opinion of the author---and argues for why many of these standards could be a part of creating data for computer question answering. Previously published as \\citet{boyd-graber-20}.",
"sections": [
{"label": "name", "title": "Two British University Towns Alike in Question Answering"},
{"label": "university-challenge", "summary": "Grenada Studios in Manchester's QA format."},
{"label": "qb", "title": "It Takes a Village to Ask a Question", "summary": "The norms of the trivia community"},
{"label": "naming", "title": "Cranfield vs. Manchester", "summary": "The distinction between the two paradigms and why we call it that."}
]
},
{
"short": "watson",
"summary": "It's been nearly a decade since IBM Watson crushed two puny humans on Jeopardy! Some people took that to mean that computers were definitively better than humans at trivia. But that isn't the complete answer---this chapter, inspired by Jeopardy!'s gimmick of responding to answers to questions, questions some of the ``answers'' that emerged from IBM's tour de force."
},
{
"short": "formats",
"summary": "How do you ask a question: is it text, a picture, or a conversation? This chapter reviews the different forms question answering can take and what complexities that can introduce.",
"sections": [
{"label": "task", "title": "Task vs. Format"},
{"label": "question", "title": "What is a Question?", "summary": ""},
{"label": "evidence", "title": "Where do you get an Answer?", "summary": ""},
{"label": "conversation", "title": "What did you Mean?", "summary": ""},
{"label": "domain", "title": "A Picture is worth a Thousand Questions", "summary": ""},
{"label": "language", "title": "Questions Beyond English", "summary": ""},
{"label": "skills", "title": "The Skills Needed to Answer Questions", "summary": ""},
{"label": "taxonomy", "title": "A Taxonomy of Question Answering", "summary": ""}
]
},
{
"short": "datasets",
"summary": "The cliche is that data are the new oil, powering \\abr{ai}. Fortunately, because humans naturally ask questions, there are many datasets that we can find 'for free'. However, these datasets still come at a cost: many of these datasets have inherent problems (e.g., ambiguities and false presuppositions) or oddities (only talking about American men) that make them difficult to use for question answering. This chapter discusses these datasets that have formed the foundation of much of modern \\abr{ai}. If you can't find the data you need, build it yourself. But this is not always a perfect solution. This chapter discusses the datasets that people have built and the new problems that this can create.",
"sections": [
{"label": "nq", "title": "Natural Questions Five Years Later", "summary": ""},
{"label": "iid", "title": "Should QA be IID?", "summary": ""},
{"label": "badq", "title": "How questions go Wrong", "summary": ""},
{"label": "crowdsourcing", "title": "Constructed Datasets from Crowdsourcing", "summary": ""},
{"label": "generating", "title": "Generating Questions", "summary": ""},
{"label": "adversarial", "title": "", "summary": "If existing datasets and game show appearances aren't enough to tell whether humans or computers are better at answering questions, what can we do? While there is an argument for focusing on natural data, modern language models are changing not just what is possible computationally but changing the language itself. Thus, we need to select examples specifically to challenge computers. These examples are called \\emph{adversarial examples}, and this chapter presents how to gather them and how they can reveal the strengths and weaknesses of computer question answering."}
]
},
{
"short": "methods",
"summary": "Having discussed how we create datasets that can teach computers how to answer questions, we now explore how those datasets can train modern computer methods.",
"sections": [
{"label": "kb", "title": "Modern Knowledge", "summary": "This section reviews how we can turn natural language queries into actionable queries in databases."},
{"label": "mr", "title": "Johnny 5 Can't Read", "summary": "However, putting information in a database is difficult and time-consuming\\dots not all information is in a database (indeed, some information defies strict database schemas). Thus, we need to teach computers how to read. This chapter reviews the process of ``machine reading'', where computers find information in a large text corpus and then extracts the answer from it."},
{"label": "deep-retrieval", "title": "A Vector in a Haystack", "summary": "As deep learning became practical, the field has moved from representing text with discrete words to continuous vectors. This is also the case for question answering. This chapter reviews how these representations can help us find relevant answers to questions."},
{"label": "generation", "title": "We Couldn't Find an Answer, so we had to Generate One", "summary": "The deep learning revolution not just helps us find answers but to generate them. This chapter talks about the promise and peril of these approaches: how we can synthesize much richer information, make stuff up, encourage these agents to align to our wishes, and how it's hard to tell if an answer is any good."},
{"label": "cooperative", "title": "Cooperative QA", "summary": "Humans and Computers Working Together can Answer Questions Better"}
]
},
{
"short": "leaderboards",
"summary": "How do we know how smart a machine is? Like a human, we typically give it a test; the difference is that tests for computers are called `leaderboards'. This chapter talks about the pros and cons of leaderboards and how some of the methods used to analyze human standardized tests can help us understand the strengths and weaknesses of not just computer question answering specifically but \\abr{ai} generally. This also marks the reappearance of item response theory, which can help make deciding the smartest computer more efficient. Previously published as \\citet{Rodriguez-21:leaderboards} and \\citet{mgor-24}.",
"sections": [
{"label": "cheat", "title":"When 21 questions are worthless", "summary": "We compare how a trivia cheating scandal extends to AI leaderboards."}
]
},
{
"short": "gameshow",
"summary": "The year is 2025, and there's a new gameshow that not only showcases the most advanced \\abr{ai} available but also keeps the public informed about the limitations and struggles of current technology. This chapter outlines the ten seasons of the show and how it tracks the development of machine intelligence, leading to passing Turing Test."
},
{
"short": "scifi",
"summary": "While the book began with how question answering in myth helped the ancients grapple with a changing and uncertain world, today's fictions represent our contemporary attempts to understand the advent of intelligent computers. This chapter reviews human--computer question answering in 2001, Star Trek, The Terminator, Blade Runner, Futurama, and what these depictions reveal about both human conceptions of artificial intelligence and how it might shape future deployments of question answering.",
"sections": [
{"label": "presentations", "title": "Computer, speculate", "summary": "A review of the role of QA in science fiction: voigt-kampf, star trek, etc."},
{"label": "fears", "title": "p(Doom)", "summary": "Beyond the imagining of science fiction, what are the actual downsides to the widespread availability and deployment of `intelligent' \\abr{ai}? This chapter talks about \\abr{ai}'s ability to amplify existing negative \\emph{human} tendencies in responding to questions and how---while dangerous and worthy of mitigation---don't demand us to halt the development of modern \\abr{ai}."}
]
}
]