Consistent analysis of `after all` #19

AngledLuffa · 2024-12-28T01:29:10Z

There are a few different analyses of after all in this treebank:

# sent_id = en_lines-ud-train-doc6-2241
# text = We felt it was his day, after all.
1       We      we      PRON    PERS-P1PL-NOM   Case=Nom|Number=Plur|Person=1|PronType=Prs      2       nsubj   _       _
2       felt    feel    VERB    PAST    Mood=Ind|Tense=Past|VerbForm=Fin        0       root    _       _
3       it      it      PRON    PERS-SG _       6       nsubj   _       _
4       was     be      AUX     PAST    Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin   6       cop     _       _
5       his     his     PRON    P3SG-GEN        Case=Gen|Gender=Masc|Number=Sing|Person=3|Poss=Yes|PronType=Prs 6       nmod:poss       _       _
6       day     day     NOUN    SG-NOM  Number=Sing     2       xcomp   _       SpaceAfter=No
7       ,       ,       PUNCT   Comma   _       8       punct   _       _
8       after   after   ADV     _       _       6       advmod  _       _
9       all     all     ADV     _       _       8       fixed   _       SpaceAfter=No
10      .       .       PUNCT   Period  _       2       punct   _       _

# sent_id = en_lines-ud-train-doc3-1011
# text = The United States is, after all, the prime revolutionary country.
1       The     the     DET     DEF     Definite=Def|PronType=Art       2       det     _       _
2       United  United  PROPN   SG-NOM  Number=Sing     12      nsubj   _       _
3       States  States  PROPN   SG-NOM  Number=Plur     2       flat    _       _
4       is      be      AUX     PRES    Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   12      cop     _       SpaceAfter=No
5       ,       ,       PUNCT   Comma   _       7       punct   _       _
6       after   after   ADP     _       _       7       case    _       _
7       all     all     PRON    TOT-SG  Case=Nom        12      nmod    _       SpaceAfter=No

# sent_id = en_lines-ud-train-doc4-1418
# text = After all, I also was a part of the great cause of these high and just proceedings.
1       After   after   ADP     _       _       2       case    _       _
2       all     all     PRON    TOT-SG  Case=Nom        8       nmod    _       SpaceAfter=No
3       ,       ,       PUNCT   Comma   _       2       punct   _       _
4       I       I       PRON    PERS-P1SG-NOM   Case=Nom|Number=Sing|Person=1|PronType=Prs      8       nsubj   _       _
5       also    also    ADV     _       _       8       advmod  _       _
6       was     be      AUX     PAST    Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin   8       cop     _       _
7       a       a       DET     IND-SG  Definite=Ind|PronType=Art       8       det     _       _
8       part    part    NOUN    SG-NOM  Number=Sing     0       root    _       _
9       of      of      ADP     _       _       12      case    _       _
...

and elsewhere

Actually, none of these quite agree with what is done in EWT, which treats all as a DET in this expression:

# sent_id = weblog-blogspot.com_dakbangla_20050311135387_ENG_20050311_135387-0218
# text = We are, after all, in this together.
1       We      we      PRON    PRP     Case=Nom|Number=Plur|Person=1|PronType=Prs      8       nsubj   8:nsubj _
2       are     be      AUX     VBP     Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin   8       cop     8:cop   SpaceAfter=No
3       ,       ,       PUNCT   ,       _       2       punct   2:punct _
4       after   after   ADP     IN      _       5       case    5:case  _
5       all     all     DET     DT      PronType=Tot    8       obl     8:obl:after     SpaceAfter=No
6       ,       ,       PUNCT   ,       _       5       punct   5:punct _
7       in      in      ADP     IN      _       8       case    8:case  _
8       this    this    PRON    DT      Number=Sing|PronType=Dem        0       root    0:root  _
9       together        together        ADV     RB      _       8       advmod  8:advmod        SpaceAfter=No
10      .       .       PUNCT   .       _       8       punct   8:punct _

Open to having a PR which unifies these treatments?
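To see how many distinct analyses are actually in play before writing such a PR, something like the following could tally them. This is only a rough sketch using the Python standard library: the function names `parse_sentences` and `after_all_analyses` are made up for illustration, the input is assumed to be ordinary tab-separated CoNLL-U text, and it matches every adjacent "after all" pair, so non-idiomatic hits (e.g. "after all these years") would need to be filtered by hand.

```python
from collections import Counter

def parse_sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows (comments skipped)."""
    rows = []
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            if rows:
                yield rows
            rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain word rows only; multiword-token ranges (1-2) and
            # empty nodes (1.1) are skipped.
            if len(cols) == 10 and cols[0].isdigit():
                rows.append(cols)
    if rows:
        yield rows

def after_all_analyses(conllu_text):
    """Count (UPOS, DEPREL, UPOS, DEPREL) signatures for adjacent 'after all' pairs."""
    counts = Counter()
    for rows in parse_sentences(conllu_text):
        for first, second in zip(rows, rows[1:]):
            if first[1].lower() == "after" and second[1].lower() == "all":
                counts[(first[3], first[7], second[3], second[7])] += 1
    return counts
```

Calling `after_all_analyses(open(path, encoding="utf-8").read())` on a treebank file (with `path` being whichever `.conllu` file is of interest) returns a `Counter` whose keys show each combination of tags and relations found.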

nschneid · 2024-12-28T01:35:26Z

I would favor the EWT approach. Note also that "all" attaches as obl, not nmod.
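For illustration, a conversion toward the EWT-style analysis might look roughly like the sketch below. It is not an existing script: `to_ewt_style` is a hypothetical name, sentences are assumed to be lists of 10-column CoNLL-U rows, and the target values (ADP/case for "after"; DET, PronType=Tot, obl for "all") simply mirror the EWT example quoted above. XPOS and DEPS are left untouched, and any structure other than the two patterns seen in this issue is skipped for manual review.

```python
def to_ewt_style(rows):
    """Return a copy of one sentence's rows with 'after all' re-annotated.

    Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
    """
    rows = [list(r) for r in rows]  # copy so the input is not modified
    for after, all_ in zip(rows, rows[1:]):
        if (after[1].lower(), all_[1].lower()) != ("after", "all"):
            continue
        after_id, all_id = after[0], all_[0]
        if all_[6] == after_id:
            # 'after' heads the pair (the advmod + fixed analysis):
            # move 'all' into the position 'after' occupied ...
            all_[6] = after[6]
            # ... and reattach other dependents of 'after' (e.g. a comma).
            for r in rows:
                if r[6] == after_id and r is not all_:
                    r[6] = all_id
        elif after[6] != all_id:
            continue  # unrecognized structure: leave as is
        after[3], after[6], after[7] = "ADP", all_id, "case"
        all_[3], all_[5] = "DET", "PronType=Tot"
        # Keep 'root' if the expression happens to head the sentence.
        all_[7] = "root" if all_[6] == "0" else "obl"
    return rows
```

Applied to the first LinES sentence above, this would make "all" an obl dependent of "day", with "after" and the comma attached to "all". Like the tally sketch, it matches every adjacent "after all" pair, so the output would need checking for non-idiomatic uses.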

LarsAhrenberg · 2024-12-28T14:05:34Z

Thanks for pointing out the inconsistencies. I will fix them.

As regards the choice of DET vs. PRON for all, I have followed the same annotation guidelines as for the Swedish treebanks and, apparently, generally for UD v.1. The UPOS would depend on the presence or absence of a head word. One reason is that I want the annotation for English-LinES and Swedish-LinES to be as similar as possible. And I'm not sure that the guidelines for English DET vs. PRON are in total agreement with the general guidelines for DET. What is the "hypothetical modified noun" in cases such as first of all, above all, at all?

I am also puzzled why demonstratives are treated differently from the quantifiers all, some, each, ... in following the v.1 guidelines. The general guidelines say that DETs in comparison to PRONs "are more likely to be used attributively (modifying a noun phrase) than substantively (replacing a noun phrase)". Is there a difference here for English between quantifiers and demonstratives?

I note that German has a stricter division: only words that cannot be used attributively are assigned the UPOS PRON. This means that English, German and Swedish apply different principles for the choice of DET vs. PRON.

It would be easy to make English_LinES follow the current principles; it would just mean changing the UPOS and FEATS in accordance with the proposed values when there is a deviation. (And it would not be so difficult to change back.) However, I'd prefer waiting some time in the hope that there may be more agreement among the Germanic language family as a whole.

nschneid · 2024-12-28T14:17:42Z

You are right that there is an exception for demonstratives in the English guidelines, which otherwise generally follow Penn conventions that never treat "all", "some", etc. as pronouns.

That decision predated me. Perhaps @jnivre can weigh in on whether a uniform definition of DET vs. PRON for Germanic languages is desirable (and worth the disruption to longstanding within-language practices).

jnivre · 2024-12-28T14:19:09Z

I agree with @LarsAhrenberg that it would be good to discuss this more generally, at least for Germanic languages. I think the special treatment of demonstratives in English is a heritage from the PTB. However, it is not clear that this should carry over to UPOS tags (as opposed to XPOS tags).

LarsAhrenberg mentioned this issue Dec 28, 2024

Each as PRON instead of DET #11

Open
