Nmoddesc #13
If you are able to share the Ssurgeon script, that would be good for future reference.
Also, it would help if we could keep a list of known titles somewhere. I'm trying to figure out how to implement this for GUM... just ran into "Tsarevna" :)
15e432d
to
9f99123
Compare
@nschneid good call. I just put it into the changes themselves, e.g. the last three changes, which affected the train set. The test & dev sets I had done by hand.
@amir-zeldes I used this regex based on what I saw when checking by hand:
but that clearly misses quite a few.
other nobility titles, such as Queen or Duke ... (actually, seems I missed Queen myself, better go back and revise that)
other jobs: historian showed up in PUD, and if I recall was part of how this whole thing was kicked off. "Simple tailor Garak and Captain Sisko were observed loudly fighting shortly after the news of Senator Vreenak's death"
sports (could consider it part of job): defender, forward, goalie, quarterback, etc. E.g. "Quarterbacks Jalen Hurts and Jayden Daniels meet for the third time this weekend"
135f6be
to
31f7a1b
Compare
Alright, went back and touched up my current changes with Queen included. Happy to take any other suggestions for added titles to search for.
There are way more personal roles. See e.g. the list in PersonalRoles.pm.
Very useful, thanks for pointing it out!
Yes, quite helpful, thanks! I will point out again there are quite a few sports ones missing: winger (or just wing), forward, center, quarterback... trying to hit a variety of team sports here
also some female versions of titles: empress, duchess
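For keeping a shared list of known titles, one option is to hold them in a plain list and build the case-insensitive pattern from it. Below is a minimal Python sketch. The `TITLES` list is only a sample of words mentioned in this thread, and `find_title_candidates` and the `(form, upos)` token format are hypothetical names for illustration, not part of any UD tool.

```python
import re

# Sample titles mentioned in this thread; not an exhaustive or official list.
TITLES = ["Queen", "Duke", "empress", "duchess", "historian",
          "quarterback", "forward", "winger", "Captain", "Senator"]

# Case-insensitive match built from the list, similar in spirit to a (?i:...) group.
TITLE_RE = re.compile(r"(?i:" + "|".join(map(re.escape, TITLES)) + r")")


def find_title_candidates(tokens):
    """Return (title, name) pairs where a title token is immediately
    followed by a PROPN token. `tokens` is a list of (form, upos) pairs."""
    pairs = []
    for (form, _), (next_form, next_upos) in zip(tokens, tokens[1:]):
        # fullmatch: the whole token must be a listed title
        if TITLE_RE.fullmatch(form) and next_upos == "PROPN":
            pairs.append((form, next_form))
    return pairs


sentence = [("Simple", "ADJ"), ("tailor", "NOUN"), ("Garak", "PROPN"),
            ("and", "CCONJ"), ("Captain", "PROPN"), ("Sisko", "PROPN"),
            ("were", "AUX"), ("observed", "VERB")]
print(find_title_candidates(sentence))
```

In this sketch "tailor Garak" is missed because "tailor" is not in the sample list, and a plural such as "Quarterbacks" would not match as a whole token, which is the kind of gap being discussed here.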
The script linked from UniversalDependencies/UD_English-EWT#561 has a bunch
Admiral? ADMIRAL Kirk? ... but in general I found a few missing ones in ParTUT thanks to this list, so thanks again for sharing. In the phrase
👍
Right, "his sister" and "Laure" are separate nominals connected by
0aa2225
to
4641048
Compare
Alright, Martin's list helped me find one more in the dev set I had missed and a couple others in the train set. Thanks for the help! Will merge and call it a day
…n the train set *without* flipping the other edges... yet

Also flip nmod:desc for sentences where the word in question was the root

Will need to carefully adjust the links that used to go to Minister (and similar words), as presumably Prime should not modify Shinzo for example

Adjust words such as critic w/o doing any 'the critic ...'

Attempt to fix "poet and critic John Dryden" "Marxist playwright and director Bertolt Brecht" compare to the original by using the first title as the nmod:desc

Ssurgeon script used follows. Note that this clearly indicates a missing feature in Ssurgeon: macros

Also note that there could be an additional clause, "!> /nmod:poss/ {}", in each of the expressions. There are two sentences affected by such a change:

should be updated:
# text = **his Captain Ahab** in Moby-Dick is a classic tragic hero, inspired by King Lear.

should NOT be updated:
# text = the following year he was joined by **his sister Laure** and they spent four years away from home.

Rather than hash out how to do that via Ssurgeon, if even possible, we simply edited this by hand.

# in this expression:
# the 'othertitle' is not a parent via nmod|compound|flat so that the phrase isn't
# Mr Cox, Mr Hänsch, ...
{word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=oldhead <conj ({word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=othertitle <=reln {} !>/nmod|compound|flat/ {} !>det {} !< conj ({} >det {})) >/nmod|compound|flat/=dead {}=newhead . {}=newhead
reattachNamedEdge -edge reln -dep newhead
removeNamedEdge -edge dead
addEdge -gov newhead -dep othertitle -reln nmod:desc

{word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=oldhead <conj ({word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=othertitle !< {} !>/nmod|compound|flat/ {} !>det {} !< conj ({} >det {})) >/nmod|compound|flat/=dead {}=newhead . {}=newhead
removeNamedEdge -edge dead
addEdge -gov newhead -dep othertitle -reln nmod:desc
setRoots newhead

{word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=oldhead <=reln {} >/nmod|compound|flat/=dead {}=newhead . {}=newhead !>det {} !< conj ({} >det {})
reattachNamedEdge -edge reln -dep newhead
removeNamedEdge -edge dead
addEdge -gov newhead -dep oldhead -reln nmod:desc

{word:/(?i:actor|admiral|adviser|economist|father|general|judge|justice|lieutenant|lord|miss|mother|professor|representative|scholar|scientist|sister|writer|Queen|poet|Mr|Mrs|Madam|Commissioner|Messrs|Minister|President|Governor|Chancellor|economist|fellow|Director|philosopher|critic|King|novelist|playwright|Lady|author|Captain)/}=oldhead !< {} >/nmod|compound|flat/=dead {}=newhead . {}=newhead !>det {} !< conj ({} >det {})
removeNamedEdge -edge dead
addEdge -gov newhead -dep oldhead -reln nmod:desc
setRoots newhead
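The effect of the single-title rules (a title directly before the name) can be pictured on a toy tree. The following is a minimal Python sketch, not Ssurgeon's actual implementation: the dict representation, the function name `flip_title`, and the example sentence are all assumptions made for illustration.

```python
def flip_title(tree, title_id, name_id):
    """Make the name the head and attach the title to it as nmod:desc.

    `tree` maps token id -> (head id, deprel); head 0 marks the root.
    Returns a new dict and leaves the input unchanged.
    """
    title_head, title_rel = tree[title_id]
    name_head, _ = tree[name_id]
    if name_head != title_id:
        raise ValueError("name must currently be a dependent of the title")
    new_tree = dict(tree)
    # The name takes over the title's incoming edge (or becomes the root).
    new_tree[name_id] = (title_head, title_rel)
    # The old title -> name edge is replaced by name -> title, nmod:desc.
    new_tree[title_id] = (name_id, "nmod:desc")
    return new_tree


# Invented example, "President Obama spoke": 1=President, 2=Obama, 3=spoke
before = {1: (3, "nsubj"), 2: (1, "flat"), 3: (0, "root")}
after = flip_title(before, title_id=1, name_id=2)
print(after)
```

The same function covers the case where the title was the root of the sentence, since the name simply inherits head 0. Dependents that still hang off the title are not touched here; the script handles those in separate expressions.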
# make the child come after newhead so that skipped double titles aren't captured
# eg, "poet and critic John Dryden"
{}=oldhead </nmod:desc/=nmod ({}=newhead .. {}=child) >=reln {}=child
reattachNamedEdge -edge reln -gov newhead
…, depending on what the relation is. Not amod|nmod|acl

# not nmod|amod|acl because, when checked by hand, those indicated words that modified the title, not the person
# acl for an instance of "**Managing** Director Dominique Strauss", where "Managing" is modifying "Director"
# it's not clear that would generalize for other instances of acl
{}=oldhead </nmod:desc/=nmod ({}=newhead) >/^(?!.*nmod|amod|acl).*$/=reln {}=child -- {}=child
reattachNamedEdge -edge reln -gov newhead
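To see which relation labels the negative lookahead in the last expression lets through, the pattern can be tried on a few UD labels. This is a rough Python sketch, assuming Python's `re` and the Java regex engine used by Semgrex agree on this pattern. The helper name `is_moved` and the choice of labels are illustrative only.

```python
import re

# The relation filter from the expression above, copied verbatim.
RELN_FILTER = re.compile(r"^(?!.*nmod|amod|acl).*$")


def is_moved(reln):
    """True if a dependent with this relation would be reattached to newhead."""
    return RELN_FILTER.match(reln) is not None


for reln in ["nmod", "nmod:poss", "amod", "acl", "acl:relcl",
             "det", "case", "conj", "appos"]:
    print(f"{reln:10s} {'moved' if is_moved(reln) else 'left on the title'}")
```

Note that inside the lookahead the `.*` binds only to `nmod`, so `nmod` is rejected anywhere in the label while `amod` and `acl` are rejected only at the start. For standard UD labels such as `acl:relcl` that makes no practical difference.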
add nmod:desc to various titles which are used as part of a person's name
#9