
Clarifications on annotation #211

@MikeMpapa

Hi there,
I am working on building a new dataset in Spanish (a polysyllabic language). I have gone through MakeDiffSinger, but I still have some gaps. I would be grateful if you could sanity-check my understanding and share any thoughts you might have.

Questions for clarification:

  1. ph_seq: Are these sequences of phonemes or of syllables?
    Currently I am using phonemes and their timestamps as provided by MFA, with the pre-trained Spanish model that MFA makes available. Would you recommend training a new model on my specific data?

  2. note_dur: Should the MIDI notes be estimated over phonemes, syllables, or words?
    For now I have estimated one note per phoneme and assumed ph_dur == note_dur.

  3. ph_num: Is this the number of phonemes in each word or in each syllable?
    For now I have assumed it is the number of phonemes in each word.

  4. note_seq: Do you think SOME would suffice for a first pass at this? My guess is yes.

  5. is_slur: How would you define a slur in this context? I have not found many resources on this topic.
    For now I have assumed there are no slurs at all.

  6. SPs and APs: Would you recommend annotating these manually, or would the enhance script be OK for a first pass?
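
To make my current assumptions concrete, here is a minimal Python sketch of how I am assembling one item. It only encodes the guesses listed above (one note per phoneme, note_dur equal to ph_dur, ph_num counted per word, no slurs), so it may well be wrong. The helper names `build_item` and `check_item`, the dict layout, and the example pitches and durations are made up for illustration and are not taken from MakeDiffSinger.

```python
def build_item(words):
    """Assemble one item from a list of words.

    Each word is a (phonemes, durations_sec, midi_pitches) tuple with
    one duration and one pitch per phoneme (my current assumption).
    """
    item = {"ph_seq": [], "ph_dur": [], "ph_num": [],
            "note_seq": [], "note_dur": [], "is_slur": []}
    for phonemes, durations, pitches in words:
        if not (len(phonemes) == len(durations) == len(pitches)):
            raise ValueError("each phoneme needs one duration and one pitch")
        item["ph_seq"].extend(phonemes)
        item["ph_dur"].extend(durations)
        item["ph_num"].append(len(phonemes))   # assumption: counted per word
        item["note_seq"].extend(pitches)       # assumption: one note per phoneme
        item["note_dur"].extend(durations)     # assumption: note_dur == ph_dur
        item["is_slur"].extend([0] * len(phonemes))  # assumption: no slurs
    return item


def check_item(item):
    """Check that the fields are mutually consistent under these assumptions."""
    assert sum(item["ph_num"]) == len(item["ph_seq"]) == len(item["ph_dur"])
    assert len(item["note_seq"]) == len(item["note_dur"]) == len(item["is_slur"])
    assert item["note_dur"] == item["ph_dur"]


# Hypothetical two-word example; pitches and durations are invented.
example = build_item([
    (["o", "l", "a"], [0.20, 0.10, 0.30], [60, 60, 62]),
    (["s", "o", "l"], [0.15, 0.25, 0.10], [64, 64, 64]),
])
check_item(example)
```

If any of these assumptions is off (for example, if notes should span syllables or words instead of single phonemes), the grouping inside `build_item` is the part I would need to change.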

Thanks!
