From e69b687cbc938520faab65f1cd92b9da4b74827f Mon Sep 17 00:00:00 2001
From: hoytpr
Date: Thu, 28 Mar 2019 10:17:09 -0500
Subject: [PATCH] Adding glossary to replace FIXME

---
 reference.md | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/reference.md b/reference.md
index 6260be63..72396270 100644
--- a/reference.md
+++ b/reference.md
@@ -1,8 +1,72 @@
 ---
 layout: reference
-title: Reference
 ---
 
 ## Glossary
 
-FIXME
+{:auto_ids}
+accession
+: a unique identifier assigned to each sequence or set of sequences
+
+BLAST
+: The Basic Local Alignment Search Tool at NCBI that searches for similarities between known and unknown biomolecules like DNA
+
+categorical variable
+: Variables can be classified as categorical (aka, qualitative) or quantitative (aka, numerical). Categorical variables take on a fixed number of values that are names or labels.
+
+cleaned data
+: data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis
+
+conditional formatting
+: formatting that is applied to a specific cell or range of cells depending on a set of criteria
+
+CSV (comma separated values) format
+: a plain text file format in which values are separated by commas
+
+factor
+: a variable that takes on a limited number of possible values (i.e. categorical data)
+
+Gb
+: gigabyte of file storage or file size
+
+Gbase
+: a gigabase represents one billion nucleic acid bases (Gbp may indicate one billion base pairs of nucleic acid)
+
+headers
+: names at tops of columns that are descriptive about the column contents (sometimes optional)
+
+metadata
+: data which describes other data
+
+NGS
+: common acronym for "Next Generation Sequencing" currently being replaced by "High Throughput Sequencing"
+
+null value
+: a value used to record observations missing from a dataset
+
+observation
+: a single measurement or record of the object being recorded (e.g. the weight of a particular mouse)
+
+plain text
+: unformatted text
+
+quality assurance
+: any process which checks data for validity during entry
+
+quality control
+: any process which removes problematic data from a dataset
+
+raw data
+: data that has not been manipulated and represents actual recorded values
+
+rich text
+: formatted text (e.g. text that appears bolded, colored or italicized)
+
+string
+: a collection of characters (e.g. "thisisastring")
+
+TSV (tab separated values) format
+: a plain text file format in which values are separated by tabs
+
+variable
+: a category of data being collected on the object being recorded (e.g. a mouse's weight)
\ No newline at end of file