Submitting data to the NCBI Sequence Read Archive (SRA)
Pre-processing steps
- Collect fastq.gz files for each sample
- Rename each sample's fastq.gz files from their long names to short names
- Get biosample data early (especially if it's from collaborators)
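The renaming step above can be sketched in Python. This is a minimal sketch, not the post's own method: it assumes the long names follow the common Illumina pattern `<sample>_S<n>_L<lane>_R<read>_001.fastq.gz`, and the helper names `short_name`, `plan_renames`, and `apply_renames` are hypothetical. Adjust the pattern to match your own files.

```python
import re
from pathlib import Path

# ASSUMPTION: long names look like SampleA_S1_L001_R1_001.fastq.gz
ILLUMINA_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L\d{3}_(?P<read>R[12])_001\.fastq\.gz$"
)


def short_name(long_name: str) -> str:
    """Map a long Illumina-style file name to <sample>_<read>.fastq.gz."""
    match = ILLUMINA_NAME.match(long_name)
    if match is None:
        raise ValueError(f"unrecognized file name: {long_name}")
    return f"{match['sample']}_{match['read']}.fastq.gz"


def plan_renames(directory: str) -> dict:
    """Build a {source: target} plan without touching any files."""
    plan = {}
    for path in sorted(Path(directory).glob("*.fastq.gz")):
        try:
            target = path.with_name(short_name(path.name))
        except ValueError:
            continue  # skip files that do not match the assumed pattern
        # Refuse to overwrite: e.g. two lanes of one sample map to one name.
        if target.exists() or target in plan.values():
            raise FileExistsError(f"target already in use: {target}")
        plan[path] = target
    return plan


def apply_renames(plan: dict) -> None:
    """Perform the renames from a plan that has already been reviewed."""
    for source, target in plan.items():
        source.rename(target)
```

Building the plan separately from applying it lets you print and check the mapping before any file is renamed. Samples sequenced across several lanes would collide under this naming scheme, so the sketch stops with an error rather than overwriting a file.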
This is a place to highlight tips and software demos with the goal of preserving expertise and knowledge from within the CQLS and the departments and labs that we serve.
Looking for R tips? Python? Command line questions? Help with the CQLS infrastructure? You’ve come to the right place!
Note
This content was previously hosted at tips.cqls.oregonstate.edu and software.cqls.oregonstate.edu/tips
Do you have an idea that you'd like to highlight on the page? Do you want to convert your BUG talk into a post that other folks can follow as a guide for their own research? Let us know! We'd be happy to have guest authors for tips posts, especially as you become experts at specific software tools and pipelines.
Users who want to work with MathWorks MATLAB software can access it at the command line. MATLAB will not run on the front-end machines shell-hpc.cqls.oregonstate.edu and files-hpc.cqls.oregonstate.edu. Instead, users can run the software stack on the processing machines by passing the -nodisplay option when starting MATLAB from the command line.
The purpose of this guide is to describe the primary differences in analyzing sequences generated using the Earth Microbiome Project (EMP) or standard Illumina protocols. In other words, this post will describe the practical aspects and applications with respect to analyzing the data, rather than the theory behind the process, study design, etc.