Application Guide: SRA Toolkit

This page explains how to run SRA Toolkit batch jobs on BlueBEAR.

See the SRA-Toolkit page on the BEAR Apps website for information on available versions.

SRA (Sequence Read Archive) is an NCBI-defined format for NGS (next-generation sequencing) data. All data submitted to NCBI must be in SRA format. The SRA Toolkit provides tools for downloading data, for converting other data formats into SRA format and, conversely, for extracting SRA data into other formats.

Example sbatch scripts

Below is an example of a non-parallelised batch script for SRA Toolkit commands. For further general information on running BlueBEAR jobs, please see Jobs on BlueBEAR.

The following example downloads a run in the compressed SRA format and converts it to FASTQ files:

#!/bin/bash

#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=5
#SBATCH --qos=bbshort
#SBATCH --mail-type=ALL
#SBATCH --account=_project name_

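# Stop the job at the first command that fails
set -e

# Start from a clean module environment, then load the SRA Toolkit
module purge; module load bluebear
module load SRA-Toolkit/2.10.9-gompi-2020b

# Directory that will receive the FASTQ files
OUTPUT_DIR=/path/to/output/directory

# Download run SRR014336 and convert it to FASTQ
fasterq-dump --outdir "${OUTPUT_DIR}" SRR014336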
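
To convert more than one run in the same job, the fasterq-dump command can be wrapped in a loop. The following is a minimal sketch, not a BlueBEAR-recommended script. The dump_run helper and the DRY_RUN switch are illustrative names and are not part of the SRA Toolkit. The sketch assumes the SBATCH header and the module lines from the script above are already in place. With DRY_RUN left at 1, each command is only printed, so the list of runs can be checked before anything is downloaded.

```shell
#!/bin/bash
# Sketch: convert several SRA runs to FASTQ, one after another.
# dump_run and DRY_RUN are illustrative names, not SRA Toolkit features.

OUTPUT_DIR=/path/to/output/directory

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

dump_run() {
    # $1 is a single run accession, for example SRR014336
    if [ "${DRY_RUN}" = "1" ]; then
        echo "fasterq-dump --outdir ${OUTPUT_DIR} $1"
    else
        fasterq-dump --outdir "${OUTPUT_DIR}" "$1"
    fi
}

# Add further run accessions to this list, separated by spaces.
for run in SRR014336; do
    dump_run "${run}"
done
```

Once the printed commands look right, set DRY_RUN=0 to run the conversions. A job that converts several runs may need a longer time limit than the 5 minutes requested in the example above.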

Further documentation on the SRA Toolkit is available at https://hpc.nih.gov/apps/sratoolkit.htm