Working Steps #1

Open
cometbridge1998 opened this issue Feb 26, 2025 · 0 comments
Labels
documentation Improvements or additions to documentation

Comments

@cometbridge1998
Owner

This project consists of two parts:
In the first part, a reliable collection of corpus data containing German support verb constructions (SVCs) will be gathered.
The second part will investigate the contextual embeddings of these constructions.

The provisional plan for the project is as follows:

Part 1: Data collection

  1. Find a reliable list of German SVCs and convert it to a format suitable for CQP queries.
  2. Install cwb-ccc to run systematic queries, since CQPweb is less suitable for automated querying.
  3. Run the "transformation test" to assess how well it identifies German SVCs. Since most SVCs consist of a preposition, a verbal noun, and a function verb, I will design queries following this pattern and compare the results with the available list of SVCs. To identify verbal nouns, I will write a small program based on morphological rules that compares the candidates with infinitive verb forms.
  4. Gather texts containing German SVCs, plus texts for a comparison group in which the same verbs are used as full verbs.
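
Steps 1 and 3 could be sketched roughly as follows. This is only a minimal illustration, not the final implementation: the function names `svc_to_cqp`, `verb_candidates`, and `is_verbal_noun` are hypothetical, the positional attribute `lemma` and the gap of up to five tokens are assumptions about the corpus annotation, and the morphological rules cover only a few simple cases (zero derivation, `-ung`, nominalised infinitives).

```python
def svc_to_cqp(svc: str, max_gap: int = 5) -> str:
    """Turn an SVC entry such as 'in Frage stellen' into a CQP query string.

    Assumes the corpus has a 'lemma' attribute and that the entry contains
    no regex metacharacters. Only covers the order 'preposition noun ... verb'.
    """
    *head, verb = svc.split()
    if not head:
        raise ValueError("expected at least two tokens, e.g. 'in Frage stellen'")
    parts = [f'[lemma="{tok}"]' for tok in head]
    parts.append(f"[]{{0,{max_gap}}}")      # optional gap before the verb
    parts.append(f'[lemma="{verb}"]')
    return " ".join(parts)


def verb_candidates(noun: str) -> set[str]:
    """Guess infinitives a noun might be derived from (very rough rules)."""
    n = noun.lower()
    candidates = {n, n + "n", n + "en"}     # Leben, Frage -> fragen
    if n.endswith("ung"):
        base = n[:-3]                        # Entscheidung -> entscheid
        candidates |= {base + "en", base + "n"}
    return candidates


def is_verbal_noun(noun: str, infinitives: set[str]) -> bool:
    """True if any guessed infinitive appears in the known verb list."""
    return bool(verb_candidates(noun) & infinitives)


# Toy verb list; a real run would load infinitives from a lexicon.
infinitives = {"fragen", "entscheiden", "stellen"}
query = svc_to_cqp("in Frage stellen")
```

Here `query` holds the string `[lemma="in"] [lemma="Frage"] []{0,5} [lemma="stellen"]`, which could then be passed to the corpus query tool. Nouns formed by vowel change (for example Gang from gehen) are not caught by these rules and would need a lookup table or further rules.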

Part 2: Research the embedding

  1. Run different contextual embedding models on the texts and save the embeddings from their layers.
  2. Conduct a statistical analysis of the embeddings of the function verbs.
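
One possible starting point for the analysis in Part 2 is to compare the average embedding of a verb in SVC contexts with its average embedding in full-verb contexts. The sketch below is an assumption about how this might be done, not a fixed method: the helper names `centroid` and `cosine` are hypothetical, and the two-dimensional vectors are toy data standing in for saved model embeddings.

```python
import math


def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy embeddings of the same verb in two usage groups (made-up numbers).
svc_uses = [[1.0, 0.0], [1.0, 0.0]]
full_verb_uses = [[0.0, 1.0], [0.0, 1.0]]

similarity = cosine(centroid(svc_uses), centroid(full_verb_uses))
```

A low similarity between the two centroids would suggest that the model represents the function-verb use differently from the full-verb use. A proper analysis would still need a significance test over many verbs and contexts.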
@cometbridge1998 cometbridge1998 added the documentation Improvements or additions to documentation label Feb 26, 2025