Application Guide: SRA Toolkit
This page contains information on how to run SRA Toolkit batch jobs on BlueBEAR.
See the SRA-Toolkit page on the BEAR Apps website for information on available versions.
SRA (Sequence Read Archive) is an NCBI-defined format for next-generation sequencing (NGS) data. All data submitted to NCBI must be in SRA format. The SRA Toolkit provides tools for downloading data, for converting other data formats into SRA format and, conversely, for extracting SRA data into other formats.
Example sbatch scripts
The example below is a non-parallelised batch script that runs SRA Toolkit commands. For general information on running BlueBEAR jobs, please see Jobs on BlueBEAR.
The following example downloads run SRR014336 and converts it from the compressed SRA format to FASTQ files:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=5
#SBATCH --qos=bbshort
#SBATCH --mail-type=ALL
#SBATCH --account=_project name_
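# Stop at the first command that fails
set -e
# Start from a clean module environment
module purge; module load bluebear
module load SRA-Toolkit/2.10.9-gompi-2020b
# Directory in which the FASTQ files will be written
OUTPUT_DIR=/path/to/output/directory
# Download run SRR014336 and convert it to FASTQ
fasterq-dump --outdir "${OUTPUT_DIR}" SRR014336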
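With `set -e` alone, a misspelled variable name silently expands to an empty string, so a slightly more defensive version of the final steps may be worth considering. The following is a minimal sketch rather than a recommended BlueBEAR configuration: the default path `./sra_output`, the `ACCESSION` and `THREADS` variables, and the choice to take the thread count from `SLURM_NTASKS` are all assumptions made for illustration. It would sit after the `module load` lines of the job script.

```shell
#!/bin/bash
# Sketch only: -u turns any reference to an unset (e.g. misspelled) variable into an error.
set -eu

# Hypothetical defaults for illustration; override them via the environment.
OUTPUT_DIR="${OUTPUT_DIR:-./sra_output}"
ACCESSION="${ACCESSION:-SRR014336}"

# Assumption: match the thread count to the tasks requested from Slurm,
# falling back to 1 when the script is run outside a job.
THREADS="${SLURM_NTASKS:-1}"

# Create the output directory up front; harmless if it already exists.
mkdir -p "${OUTPUT_DIR}"

# Only attempt the download when the toolkit is on PATH (i.e. after module load).
if command -v fasterq-dump >/dev/null 2>&1; then
    fasterq-dump --threads "${THREADS}" --outdir "${OUTPUT_DIR}" "${ACCESSION}"
else
    echo "fasterq-dump not found; load the SRA-Toolkit module first" >&2
fi
```

Run with the defaults, this writes the FASTQ files for SRR014336 under `./sra_output`. Setting `OUTPUT_DIR` or `ACCESSION` in the environment before submitting the job changes the target without editing the script.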
Further documentation on the SRA Toolkit is available at https://hpc.nih.gov/apps/sratoolkit.htm