Showing 1 changed file with 66 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,72 @@ | ||
--- | ||
layout: reference | ||
title: Reference | ||
--- | ||
|
||
## Glossary | ||
|
||
FIXME | ||
{:auto_ids} | ||
accession | ||
: a unique identifier assigned to each sequence or set of sequences | ||
|
||
BLAST | ||
: the Basic Local Alignment Search Tool at NCBI, which finds regions of similarity between a query sequence and known biological sequences (DNA, RNA, or protein) | ||
|
||
categorical variable | ||
: a variable that takes on one of a fixed number of values that are names or labels (also called a qualitative variable), as opposed to a quantitative (numerical) variable | ||
|
||
cleaned data | ||
: data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis | ||
|
||
conditional formatting | ||
: formatting that is applied to a specific cell or range of cells depending on a set of criteria | ||
|
||
CSV (comma separated values) format | ||
: a plain text file format in which values are separated by commas | ||
|
||
factor | ||
: a variable that takes on a limited number of possible values (i.e., a categorical variable) | ||
|
||
Gb | ||
: gigabyte of file storage or file size (more precisely abbreviated GB; strictly, Gb denotes a gigabit) | ||
|
||
Gbase | ||
: a gigabase, i.e. one billion nucleic acid bases (Gbp may be used to indicate one billion base pairs) | ||
|
||
headers | ||
: names at the tops of columns that describe the column contents (sometimes optional) | ||
|
||
metadata | ||
: data which describes other data | ||
|
||
NGS | ||
: common acronym for "Next Generation Sequencing", a term currently being replaced by "High Throughput Sequencing" | ||
|
||
null value | ||
: a value used to record observations missing from a dataset | ||
|
||
observation | ||
: a single measurement or record of the object being recorded (e.g. the weight of a particular mouse) | ||
|
||
plain text | ||
: unformatted text | ||
|
||
quality assurance | ||
: any process that checks data for validity during entry | ||
|
||
quality control | ||
: any process that removes problematic data from a dataset | ||
|
||
raw data | ||
: data that has not been manipulated and represents actual recorded values | ||
|
||
rich text | ||
: formatted text (e.g. text that appears bolded, colored or italicized) | ||
|
||
string | ||
: a sequence of characters (e.g. "thisisastring") | ||
|
||
TSV (tab separated values) format | ||
: a plain text file format in which values are separated by tabs | ||
|
||
variable | ||
: a category of data being collected on the object being recorded (e.g. a mouse's weight) |
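
Several of the terms above (CSV and TSV formats, headers, observation, null value, categorical variable) can be seen together in a small example. The following is a minimal sketch using Python's standard `csv` module. The sample data, the column names, the helper names `read_delimited` and `to_tsv`, and the choice to treat an empty field as the null value are all assumptions made for illustration; they are not part of the lesson.

```python
import csv
import io

# Hypothetical sample data in CSV format: the first line holds the headers,
# and each following line is one observation. The empty weight for m2 is
# treated here as a null value (an assumption; datasets use other markers too).
CSV_TEXT = "mouse_id,sex,weight_g\nm1,F,21.5\nm2,M,\nm3,F,19.8\n"


def read_delimited(text, delimiter):
    """Parse delimited plain text into (headers, list of observations)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    headers = next(reader)  # first row: the column names
    observations = [dict(zip(headers, row)) for row in reader]
    return headers, observations


def to_tsv(headers, observations):
    """Write the same table back out with tabs instead of commas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(headers)
    for obs in observations:
        writer.writerow([obs[name] for name in headers])
    return buffer.getvalue()


headers, observations = read_delimited(CSV_TEXT, ",")

# Observations whose weight was recorded as a null value (empty field).
missing_weight = [obs["mouse_id"] for obs in observations if obs["weight_g"] == ""]

# "sex" behaves as a categorical variable: a fixed set of labels.
sex_levels = sorted({obs["sex"] for obs in observations})

tsv_text = to_tsv(headers, observations)
```

Reading `tsv_text` back with `read_delimited(tsv_text, "\t")` yields the same headers and observations, which illustrates that CSV and TSV differ only in the separator character.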