Part of Speech 's' #9
Comments
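So lemmas with partOfSpeech="s" should probably be partOfSpeech="a"? It might also be good for https://github.com/globalwordnet/schemas/ to change the partOfSpeech attribute on <Synset> in WN-LMF to sstype or something, but being a backwards-incompatible change it might be harder to get that through. |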
Good point,
@jmccrae what do you think?
In the original OMW, we actually changed them all to 'a', even at the
synset level (and lost a little information).
On Tue, Jul 20, 2021 at 10:14 PM Michael Wayne Goodman wrote:
So lemmas with partOfSpeech="s" should probably be partOfSpeech="a"?
It might also be good for https://github.com/globalwordnet/schemas/ to
change the partOfSpeech attribute on <Synset> in WN-LMF to sstype or
something, but being a backwards-incompatible change it might be harder to
get that through.
--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
|
Satellite is a fundamentally different part-of-speech in the structure of Princeton WordNet and certain parts of the structure, as well as related technical implementations (sense key calculation), rely on this. Linguistically, IMHO, it is not a sensible distinction and it leads to all kinds of issues (see OEWN's Issue globalwordnet/english-wordnet#35 for the start of the rabbit hole). My opinion is that PWN uses 'satellite' as a part-of-speech value on the same level as 'noun' and this should be respected in any export of PWN. OEWN may at some point, I hope, get round to removing this distinction. My opinion on adding |
@jmccrae I think you're addressing a different issue than what @FredsoNerd raised. Part-of-speech is, linguistically, a syntactic property and not a semantic property, and therefore in WNDB lexical entries (in the index files) have a
@FredsoNerd was pointing out that
$ grep well-connected *.adj
data.adj:00567414 00 s 01 well-connected 0 001 & 00566099 a 0000 | connected by blood or close acquaintance with people of wealth or social position; "a well-connected Edinburgh family"
index.adj:well-connected a 1 1 & 1 0 00567414
Note that the
$ grep '"well-connected"' wn30.xml
<Lemma writtenForm="well-connected" partOfSpeech="s" />
Unless I've misunderstood the WNDB format, it appears this is an error, and the |
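To make the mismatch in the grep output above concrete, here is a minimal Python sketch that pulls the ss_type out of the data.adj line and the pos out of the index.adj line. The field positions follow my reading of the wndb5wn documentation; the function names and the returned dict layout are hypothetical, made up for this illustration, and the `data.adj:` / `index.adj:` prefixes added by grep are stripped.

```python
# The two WNDB lines quoted in this thread, without the grep filename prefix.
DATA_LINE = (
    '00567414 00 s 01 well-connected 0 001 & 00566099 a 0000 '
    '| connected by blood or close acquaintance with people of wealth '
    'or social position; "a well-connected Edinburgh family"'
)
INDEX_LINE = 'well-connected a 1 1 & 1 0 00567414'


def parse_data_line(line):
    """Read offset, ss_type and words from a data.* line (assumed layout)."""
    fields, _, gloss = line.partition('|')
    parts = fields.split()
    w_cnt = int(parts[3], 16)  # word count is a two-digit hexadecimal number
    # Words and their lex_ids alternate, starting at the fifth field.
    words = [parts[4 + 2 * i] for i in range(w_cnt)]
    return {
        'offset': parts[0],
        'ss_type': parts[2],
        'words': words,
        'gloss': gloss.strip(),
    }


def parse_index_line(line):
    """Read lemma, pos and synset offsets from an index.* line (assumed layout)."""
    parts = line.split()
    synset_cnt = int(parts[2])
    p_cnt = int(parts[3])
    # Skip the pointer symbols, then sense_cnt and tagsense_cnt.
    start = 4 + p_cnt + 2
    return {
        'lemma': parts[0],
        'pos': parts[1],
        'offsets': parts[start:start + synset_cnt],
    }


synset = parse_data_line(DATA_LINE)
entry = parse_index_line(INDEX_LINE)
print(entry['lemma'], 'index pos:', entry['pos'], '| synset ss_type:', synset['ss_type'])
```

For this entry the index file carries `a` while the synset it points to carries `s`, which is the distinction being discussed.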
I guess that is an interpretation... the schema description of the format is kind of clear that that is not the interpretation we have made so far: |
But PWN never claimed that |
Oh, this was the interpretation at that time or a typo? |
Also, regardless of the interpretation and what we want to do moving forward, we should be careful that we're not changing or losing information in PWN, which is fixed (see #5 (comment)). Currently there's no info loss (in the entropy sense) as we can replace |
I think |
In the PWN:3.0 and PWN:3.1 data, one may find occurrences of PartsOfSpeech s, such as in
In https://wordnet.princeton.edu/documentation/lexnames5wn the POS described are: NOUN; VERB; ADJECTIVE; and ADVERB.
That might have been a confusion with the ss_types from https://wordnet.princeton.edu/documentation/wndb5wn: NOUN (n); VERB (v); ADJECTIVE (a); ADJECTIVE SATELLITE (s); and ADVERB (r).
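One way to picture the change discussed in this thread is a small Python sketch that maps the five ss_type codes onto the four POS values for lemmas only, leaving synsets alone. This is an illustration under assumptions, not the maintainers' fix: the mapping table name, the function name, the wrapper elements and the ids in the sample fragment are all hypothetical; only the `<Lemma>` line is taken from the thread.

```python
import xml.etree.ElementTree as ET

# ss_type codes from wndb5wn collapsed to the four POS values for lemmas.
# Treating 's' as 'a' on lemmas is the proposal under discussion, not a settled rule.
SS_TYPE_TO_LEMMA_POS = {'n': 'n', 'v': 'v', 'a': 'a', 's': 'a', 'r': 'r'}

# Hypothetical fragment; the element nesting and ids are invented for the example.
SAMPLE = '''<Lexicon>
  <LexicalEntry id="example-entry">
    <Lemma writtenForm="well-connected" partOfSpeech="s" />
  </LexicalEntry>
  <Synset id="example-synset" partOfSpeech="s" />
</Lexicon>'''


def normalize_lemma_pos(root):
    """Rewrite partOfSpeech on <Lemma> elements only; return the number changed."""
    changed = 0
    for lemma in root.iter('Lemma'):
        pos = lemma.get('partOfSpeech')
        new_pos = SS_TYPE_TO_LEMMA_POS.get(pos, pos)  # unknown codes pass through
        if new_pos != pos:
            lemma.set('partOfSpeech', new_pos)
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
print('lemmas changed:', normalize_lemma_pos(root))
print(ET.tostring(root, encoding='unicode'))
```

Because `<Synset>` elements are never touched, the satellite marking stays available at the synset level, so nothing from PWN is lost.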