Adding glossary to replace FIXME
hoytpr committed Mar 28, 2019
1 parent 5324ef0 commit e69b687
Showing 1 changed file with 66 additions and 2 deletions.
68 changes: 66 additions & 2 deletions reference.md
@@ -1,8 +1,72 @@
---
layout: reference
title: Reference
---

## Glossary

FIXME
{:auto_ids}
accession
: a unique identifier assigned to each sequence or set of sequences

BLAST
: the Basic Local Alignment Search Tool, hosted at NCBI, which finds regions of similarity between a query sequence (e.g. DNA or protein) and sequences in a database

categorical variable
: a variable that takes on one of a fixed number of values that are names or labels (a.k.a. qualitative), as opposed to a quantitative (a.k.a. numerical) variable

cleaned data
: data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis

conditional formatting
: formatting that is applied to a specific cell or range of cells depending on a set of criteria

CSV (comma separated values) format
: a plain text file format in which values are separated by commas

factor
: a variable that takes on a limited number of possible values (i.e. categorical data)

Gb
: a gigabyte (one billion bytes) of file storage or file size; formally abbreviated GB, since Gb can also denote a gigabit

Gbase
: a gigabase, i.e. one billion nucleic acid bases (Gbp denotes one billion base pairs)

headers
: names at the tops of columns that describe the column contents (sometimes optional)

metadata
: data which describes other data

NGS
: common acronym for "Next-Generation Sequencing", a term currently being replaced by "High-Throughput Sequencing"

null value
: a value used to record observations missing from a dataset

observation
: a single measurement or record of the object being recorded (e.g. the weight of a particular mouse)

plain text
: unformatted text

quality assurance
: any process which checks data for validity during entry

quality control
: any process which removes problematic data from a dataset

raw data
: data that has not been manipulated and represents actual recorded values

rich text
: formatted text (e.g. text that appears bolded, colored or italicized)

string
: a sequence of characters (e.g. "thisisastring")

TSV (tab separated values) format
: a plain text file format in which values are separated by tabs

variable
: a category of data being collected on the object being recorded (e.g. a mouse's weight)
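
Several of the terms above (CSV, TSV, headers, null value, categorical variable, observation) can be seen together in a small table. The following is a minimal sketch, not part of the lesson: the mouse data, the column names, the helper functions `read_table` and `to_weight`, and the choice of `NA` as the null value are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data: the same table as CSV and as TSV plain text.
# The first line holds the headers; "NA" is the assumed null value.
CSV_TEXT = "mouse_id,sex,weight_g\nm1,F,21.5\nm2,M,NA\nm3,F,19.8\n"
TSV_TEXT = CSV_TEXT.replace(",", "\t")


def read_table(text, delimiter):
    """Parse delimited plain text into one dict per observation (row)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


def to_weight(value, null_value="NA"):
    """Convert a weight string to a float, mapping the null value to None."""
    return None if value == null_value else float(value)


csv_rows = read_table(CSV_TEXT, ",")
tsv_rows = read_table(TSV_TEXT, "\t")

# Headers name the variables; each row is one observation.
headers = list(csv_rows[0].keys())

# weight_g is quantitative; the missing observation becomes None.
weights = [to_weight(row["weight_g"]) for row in csv_rows]

# sex is categorical: it takes a fixed, small set of labels.
sex_levels = sorted({row["sex"] for row in csv_rows})

print(headers)
print(weights)
print(sex_levels)
```

Only the delimiter differs between the two formats, so both strings parse to the same rows. Every value is read as a string, which is why the null value has to be handled explicitly before the weights can be treated as numbers.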
