Merge pull request #63 from bebatut/fixes
Small fixes for the last 2 episodes
raynamharris authored Jun 13, 2018
2 parents b08467c + dcc1011 commit 0921905
Showing 2 changed files with 5 additions and 10 deletions.
8 changes: 1 addition & 7 deletions _episodes/02-project-planning.md
@@ -85,7 +85,7 @@ If you have a local high performance computing center or data storage facility o

If you don't have access to these resources, you can back up on hard drives. Have two backups, and keep the hard drives in different physical locations.

You can also use resources like [Amazon S3](https://aws.amazon.com/s3/), [Microsoft Azure](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/), [Google Cloud](https://cloud.google.com/storage/) or others for cloud storage. The (open science framework)[https://osf.io] is a free option for storing files up to 5 GB. See more in the [cloud lesson](http://www.datacarpentry.org/cloud-genomics/05-which-cloud/).
You can also use resources like [Amazon S3](https://aws.amazon.com/s3/), [Microsoft Azure](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/), [Google Cloud](https://cloud.google.com/storage/) or others for cloud storage. The [open science framework](https://osf.io) is a free option for storing files up to 5 GB. See more in the [cloud lesson](http://www.datacarpentry.org/cloud-genomics/05-which-cloud/).



@@ -100,12 +100,6 @@ be able to get to a solution on instinct alone - taking your time, using Google,
valid ways of solving your problems. As you complete the lessons you'll be able to use all of those methods more
efficiently.

#





> ## Where to go from here?
>
> More reading about core competencies
7 changes: 4 additions & 3 deletions _episodes/03-ncbi-sra.md
@@ -16,7 +16,8 @@ In our experiments we're usually generating our own genomic data, but many types
There are many repositories for public data. Some model organisms or fields have specific databases, and there are ones for particular types of data. Two of the most comprehensive are the [National Center for Biotechnology Information (NCBI)](https://www.ncbi.nlm.nih.gov) and [European Nucleotide Archive (EMBL-EBI)](https://www.ebi.ac.uk/ena). In this lesson we're working with the NCBI database, but the general process is the same for any database.

# Accessing the original archived data
The [sequencing dataset adapted for this lesson](http://www.datacarpentry.org/organization-genomics/data/) was obtained from the [NCBI Sequence Read Archive](http://www.ncbi.nlm.nih.gov/sra) which is a large (>3 quadrillion basepairs as of 2014) repository for next-generation sequence data. Like many NCBI databases, it is complex and mastering its use is greater than the scope of this lesson. Very often, as in the Lenski paper, there will be a direct link (perhaps in the supplemental information) to where on the SRA the dataset can be found. The link from the Lenski paper is: [http://www.ncbi.nlm.nih.gov/sra?term=SRA026813](http://www.ncbi.nlm.nih.gov/sra?term=SRA026813)

The [sequencing dataset (from Lenski paper) adapted for this lesson](http://www.datacarpentry.org/organization-genomics/data/) was obtained from the [NCBI Sequence Read Archive](http://www.ncbi.nlm.nih.gov/sra) which is a large (>3 quadrillion basepairs as of 2014) repository for next-generation sequence data. Like many NCBI databases, it is complex and mastering its use is greater than the scope of this lesson. Very often, as in the Lenski paper, there will be a direct link (perhaps in the supplemental information) to where on the SRA the dataset can be found. E.g. the link from the Lenski paper is [http://www.ncbi.nlm.nih.gov/sra?term=SRA026813](http://www.ncbi.nlm.nih.gov/sra?term=SRA026813)

## Locate the Run Accessory Table for the Lenski Dataset on the SRA

@@ -27,7 +28,7 @@ You will be presented with a page for the overall SRA accession SRA026813 - this

3. Click on the ['All runs'](http://www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP004752) link under where it says **Study**. This is a description of all of the NGS datasets related to the experiment.

4. Go to the top of the page and in the **Total** row you will see there are 37 runs, 10.15Gb data, and 16.45 Gbases of data. Click the 'RunInfo Table' button.
4. Go to the top of the page and in the **Total** row you will see there are 37 runs, 10.15Gb data, and 16.45 Gbases of data. Click the 'RunInfo Table' button and save the file locally.

We are not downloading any actual sequence data here! This is only a text file that fully describes the entire
dataset.
@@ -37,7 +38,7 @@ You should now have a file called `SraRunTable.txt`
## Review the SraRunTable in a spreadsheet program


Using your choice of spreadsheet program open the **SraRunTable.txt** file. If prompted this is a tab-delimited file (`.tsv`).
Using your choice of spreadsheet program open the `SraRunTable.txt` file. If prompted this is a tab-delimited file (`.tsv`).

> ## Discussion
> Discuss with the person next to you: