From ea36fb95a097e06f48a6a237739cc04fbb58e248 Mon Sep 17 00:00:00 2001
From: Edward Wallace
Date: Thu, 25 Jun 2020 12:04:01 +0100
Subject: [PATCH] Data sharing in 03-ncbi-sra.md

* added that learners are likely to upload your data to a public repository, including as a take-home.
* clarified that almost all analyses use reference data, which previously said "many" which I consider an understatement.
---
 _episodes/03-ncbi-sra.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/_episodes/03-ncbi-sra.md b/_episodes/03-ncbi-sra.md
index 32c493d2..2dc5f05f 100644
--- a/_episodes/03-ncbi-sra.md
+++ b/_episodes/03-ncbi-sra.md
@@ -9,9 +9,12 @@ objectives:
 - "Understand how to access and download this data."
 keypoints:
 - "Public data repositories are a great source of genomic data."
+- "You are likely to put your own data on a public repository."
 ---
 
-In our experiments we're usually generating our own genomic data, but many types of analyses use reference data or you may want to use it to compare your results or annotate your data with publicly available data. You may also want to do a full project or set of analyses using publicly available data. This data is a great, and essential, resource for genomic data analysis.
+In our experiments we usually think about generating our own sequencing data. However, almost all analyses use reference data, and you may want to use it to compare your results or annotate your data with publicly available data. You may also want to do a full project or set of analyses using publicly available data. This data is a great, and essential, resource for genomic data analysis.
+
+When you come to publish a paper including your sequencing data, most journals and funders require that you place your data on a public repository. Sharing your data makes it more likely that your work will be re-used and cited. It helps to prepare for this early!
 
 There are many repositories for public data. 
 Some model organisms or fields have specific databases, and there are ones for particular types of data. Two of the most comprehensive public repositories are provided by the [National Center for Biotechnology Information (NCBI)](https://www.ncbi.nlm.nih.gov) and the [European Bioinformatics Institute (EMBL-EBI)](https://www.ebi.ac.uk/). The NCBI's [Sequence Read Archive (SRA)](https://trace.ncbi.nlm.nih.gov/Traces/sra/) is the database we will be using for this lesson, but the EMBL-EBI's European Nucleotide Archive (ENA) is also useful. The general processes are similar for any database.