Annotation Layer and Metadata Field Names
ctschroeder edited this page Oct 9, 2014 · 17 revisions
<title>Annotation Layer and Metadata Field Names for Coptic SCRIPTORIUM Documents</title>
Annotation Layer Names | |
tok | tokens, smallest possible unit to be annotated; MAY BE SMALLER THAN THE MORPHEMES IN ORIG |
orig | smallest unit of LANGUAGE (morpheme or word level; smaller than the bound group level), see transcription guidelines; orthography follows the original text (diplomatic transcription, edition, etc.); includes supralinear strokes and other markings from the manuscript |
orig_group | bound groups using the original orthography, including supralinear strokes and other markings |
norm_group | bound groups (same structure as orig_group but with normalized spelling, etc., so content is based on norm) |
norm | normalized version of orig |
pos | part of speech tags |
lang | language of origin tags (Hebrew, Greek, Latin, Aramaic, etc.) |
morph | see transcription guidelines sections 4.3 and 4.4 for morphs that are below the word level -- this is where words containing mnt, at, ref are annotated a second time |
note | notes that normally would go in a TEI XML <note note="xxx"> tag |
hi@rend | see transcription guidelines sections 4.2 & 5; text renderings |
lb@n | line breaks -- numbered according to the original manuscript |
cb@n | column breaks -- numbered according to the original manuscript |
pb_xml@id | page numbers of original manuscript (not the current repository numbering) (TEI XML <pb xml:id="xxx">) |
ignore:note | notes that will NOT be imported into ANNIS or exported as TEI or PAULA XML; private notations from annotators/encoders/editors |
translation | English translation |
p | paragraph breaks for translation |
verse | verse of text written as number (always use for Biblical texts of any kind, including Sahidica) |
vid | formerly verse@id (Sahidica) |
chapter | chapter of text as number (not necessary -- in metadata) |
chapter@cname | chapter of text written as text and number (not necessary -- in other data) |
chapter@cid | chapter id (Sahidica -- not necessary) |
verse@vname | verse of text written as text and number (e.g. 1 Corinthians 1:10) (not necessary -- in other data) |
add_place | |
Preferred order of layers | tok, orig, orig_group, norm_group, norm, pos, morph, lang, translation, lb@n, cb@n, pb@xml_id, p |
METADATA in meta sheet | |
Coptic_edition | |
Greek_source | |
corpus | |
title | |
author | |
language | |
annotation | |
project | |
translation | |
msName | |
pages_from | |
pages_to | |
msContents_title@type | |
msContents_title@n | |
repository | |
collection | |
idno | |
version@n | |
version@date | |
source_info | |
license | use for copyright in Sahidica, CC-BY for everything else |
respStatement? | |
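
The naming conventions above can be checked mechanically. The following is a minimal Python sketch, not part of the Coptic SCRIPTORIUM tooling: the function names `sort_layers` and `unknown_meta_fields` are hypothetical, and the input is assumed to be a plain list of layer or field names already read from a spreadsheet. The layer order is copied from the "Preferred order of layers" row, and the metadata names from the meta sheet list; `respStatement?` is left out because the page marks it as uncertain.

```python
# Preferred layer order, copied from the "Preferred order of layers" row.
# The page-break layer is spelled as in that row (pb@xml_id).
PREFERRED_ORDER = [
    "tok", "orig", "orig_group", "norm_group", "norm", "pos", "morph",
    "lang", "translation", "lb@n", "cb@n", "pb@xml_id", "p",
]

# Metadata field names listed for the meta sheet.
# "respStatement?" is omitted because the page marks it with a question mark.
KNOWN_META_FIELDS = {
    "Coptic_edition", "Greek_source", "corpus", "title", "author",
    "language", "annotation", "project", "translation", "msName",
    "pages_from", "pages_to", "msContents_title@type",
    "msContents_title@n", "repository", "collection", "idno",
    "version@n", "version@date", "source_info", "license",
}


def sort_layers(layers):
    """Return layer names in the preferred order.

    Layers that are not in the preferred order (e.g. note, verse) are
    placed last and keep their relative input order, because sorted()
    is stable.
    """
    rank = {name: i for i, name in enumerate(PREFERRED_ORDER)}
    return sorted(layers, key=lambda name: rank.get(name, len(PREFERRED_ORDER)))


def unknown_meta_fields(names):
    """Return the field names that do not appear in the meta sheet list."""
    return [name for name in names if name not in KNOWN_META_FIELDS]


if __name__ == "__main__":
    print(sort_layers(["pos", "note", "tok", "norm", "verse"]))
    print(unknown_meta_fields(["corpus", "titel", "license"]))
```

A check like `unknown_meta_fields` only flags names that differ from the list on this page, such as a misspelled `title`; it does not say which fields are required, since the page does not specify that.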