Open
Description
Here is an example of 0th instead of 5th (2nd line of tifu_all_tokenized_and_filtered.json):
"selftext_html": "[...] Confuse a 5th grade girl for a boy in front of half of her class. Kids are mean. Sorry Sandra.</strong></p>\n</div><!-- SC_ON -->",
"tldr_tokenized": [
"confuse",
"a",
"0th",
"grade",
"girl",
"for",
"a",
"boy",
"in",
"front",
"of",
"half",
"of",
"her",
"class",
"kids",
"are",
"mean",
"sorry",
"sandra",
"*"
],
Is this an error, or is it intended for some reason?
PS: Additionally, I just realized that the trailing * token is erroneous as well, isn't it? It probably comes from the bold text in the original string (see https://www.reddit.com/r/tifu/comments/1ggydk/tifu_by_genderstereotyping/).
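To check how widespread this is, a rough sketch like the following could flag tokens that never occur in the raw text. The helper name `find_unmatched_tokens` and the simple word regex are my own assumptions, not part of the dataset's tooling, and the raw string is the TL;DR sentence copied from the excerpt above with the HTML removed by hand.

```python
import re


def find_unmatched_tokens(raw_text, tokens):
    """Return tokens that do not occur as words in raw_text (case-insensitive).

    Hypothetical helper: words are runs of letters, digits and apostrophes,
    which is only an approximation of whatever tokenizer built the dataset.
    """
    raw_words = set(re.findall(r"[a-z0-9']+", raw_text.lower()))
    return [tok for tok in tokens if tok.lower() not in raw_words]


# TL;DR sentence from the excerpt above, HTML markup removed by hand.
raw = ("Confuse a 5th grade girl for a boy in front of half of her class. "
       "Kids are mean. Sorry Sandra.")

# The "tldr_tokenized" list exactly as it appears in the JSON record.
tokens = ["confuse", "a", "0th", "grade", "girl", "for", "a", "boy", "in",
          "front", "of", "half", "of", "her", "class", "kids", "are", "mean",
          "sorry", "sandra", "*"]

unmatched = find_unmatched_tokens(raw, tokens)
print(unmatched)  # ['0th', '*']
```

Running the same check over every record would show whether all digits are rewritten to 0 or only some, which might help decide whether this is intended.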