Bioinformatics Tools Reference
A curated catalog of commonly used tools organized by workflow step. Each entry links to the official source so you can evaluate and install what fits your pipeline. Some tools listed here are developed by the same team behind this tutorial site.
QC is the first step in nearly every pipeline. These tools help you assess raw read quality, detect adapter contamination, and trim/filter reads before downstream analysis.
| Tool | Platform | Description |
|---|---|---|
| FastQC | Desktop / CLI | Widely-used QC report generator for FASTQ files; provides per-base quality, GC content, adapter content, and overrepresented sequence analysis. |
| MultiQC | CLI | Aggregates results from many tools (FastQC, STAR, Salmon, etc.) into a single interactive HTML report. Essential for multi-sample projects. |
| fastp | CLI | All-in-one preprocessor: quality trimming, adapter removal, filtering, and QC reporting in one pass. Very fast (multi-threaded C++). |
| Cutadapt | CLI | Flexible adapter and quality trimming tool with a rich filtering API. Great for non-standard adapter schemes. |
| Trimmomatic | CLI (Java) | Illumina-specific trimming with sliding-window quality filtering. Mature and widely cited. |
| FastQLab (Android · iOS) | Mobile (Android / iOS) | On-device FASTQ quality control for quick field or classroom use. Runs entirely on-device with no server needed, which is useful when you want to preview QC metrics on sequencing data without setting up a full workstation. |
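
To make the QC metrics in this table concrete, here is a minimal Python sketch of two of them: per-base mean quality and GC content. It is not how FastQC or fastp are implemented. It assumes four-line FASTQ records and Phred+33 quality encoding, and the names `read_fastq`, `per_base_mean_quality`, `gc_fraction`, and `TOY_FASTQ` are hypothetical, introduced only for illustration.

```python
from io import StringIO
from typing import Iterable, Iterator, List, TextIO, Tuple

# Toy two-read FASTQ used only for illustration (not real sequencing data).
TOY_FASTQ = "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nI5I5\n"


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a blank line) stops parsing
            return
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        yield header[1:], seq, qual


def per_base_mean_quality(
    records: Iterable[Tuple[str, str, str]], offset: int = 33
) -> List[float]:
    """Mean Phred score at each read position (Phred+33 assumed by default)."""
    totals: List[int] = []
    counts: List[int] = []
    for _, _, qual in records:
        for i, ch in enumerate(qual):
            if i == len(totals):  # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    reads = list(read_fastq(StringIO(TOY_FASTQ)))
    print(per_base_mean_quality(reads))
    print([gc_fraction(seq) for _, seq, _ in reads])
```

A sketch like this is fine for understanding what the reports measure; for real data, the dedicated tools above handle compressed input, paired reads, and malformed records.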
Aligners map sequencing reads to a reference genome or transcriptome. The choice depends on your experiment type (DNA vs RNA), speed requirements, and memory constraints.
| Tool | Type | Description |
|---|---|---|
| BWA-MEM2 | DNA short-read | The de-facto standard for whole-genome and exome alignment. Fast, accurate, and widely benchmarked. |
| Minimap2 | Long-read / DNA / RNA | Versatile aligner for PacBio, ONT, and short reads. Also supports splice-aware RNA mapping. |
| STAR | RNA-seq (splice-aware) | High-throughput splice-aware aligner for RNA-seq. Very fast but requires significant RAM (≥30 GB for human). |
| HISAT2 | RNA-seq / DNA | Memory-efficient splice-aware aligner using a graph FM index. Good balance of speed and RAM for RNA-seq. |
| Bowtie2 | DNA short-read | Fast, memory-efficient aligner. Common for ChIP-seq, ATAC-seq, and other assays where reads are short. |
| JapalitySplice (GitHub · Android · iOS) | RNA-seq (splice-aware) | A sparse k-mer splice-aware RNA-seq aligner written in Rust, designed for resource-constrained environments. Supports annotation-guided junction detection and runs on laptops or even mobile devices. Particularly suitable for teaching, small genomes (yeast, Arabidopsis), and edge computing scenarios. Also available as a mobile app with on-device deep-learning splice site prediction for multiple model organisms. |
Pseudoalignment and lightweight mapping approaches that quantify transcript abundance without full genome alignment.
| Tool | Description |
|---|---|
| Salmon | Fast transcript quantification using selective alignment. Widely used for bulk RNA-seq DE analysis via tximeta/DESeq2. |
| Kallisto | Near-optimal pseudoalignment for RNA-seq quantification. Extremely fast with a small memory footprint. |
Tools for identifying SNPs, indels, and structural variants from aligned reads.
| Tool | Description |
|---|---|
| GATK | Industry-standard variant calling toolkit (HaplotypeCaller, Mutect2, VQSR). Comprehensive but complex. |
| BCFtools | Lightweight variant caller and VCF manipulation suite. Great for quick analyses and scripting. |
| DeepVariant | CNN-based variant caller from Google. High accuracy, especially on whole-genome short-read data. |
Essential utilities for manipulating, filtering, and summarizing alignment files.
| Tool | Description |
|---|---|
| Samtools | The standard tool for sorting, indexing, filtering, and inspecting SAM/BAM/CRAM files. |
| Picard | Java-based BAM utilities for deduplication, metrics collection, and format conversion. |
| BEDTools | Genomic interval arithmetic: intersect, merge, subtract intervals across BED/BAM/VCF files. |
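
The interval arithmetic mentioned for BEDTools can be illustrated with a short Python sketch of merge and intersect on a single chromosome. This is an assumption-laden toy, not the BEDTools implementation: intervals are half-open `(start, end)` pairs as in BED, touching intervals are treated as mergeable, and strand and chromosome names are ignored. The function names `merge_intervals` and `intersect_intervals` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED coordinates


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into sorted, disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both interval sets (two-pointer sweep)."""
    a, b = merge_intervals(a), merge_intervals(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # non-empty overlap
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    peaks = [(1, 5), (3, 8), (10, 12)]
    genes = [(4, 11)]
    print(merge_intervals(peaks))
    print(intersect_intervals(peaks, genes))
```

Sorting first keeps both operations linear after the sort, which is the same general idea that makes sorted input faster in interval tools.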
| Tool | Description |
|---|---|
| DESeq2 | R/Bioconductor package for count-based DE using negative binomial models. |
| edgeR | Alternative DE package using empirical Bayes estimation. |
| limma-voom | Microarray-heritage DE with a voom transformation for RNA-seq count data. |
| Tool | Description |
|---|---|
| Kraken2 | Ultra-fast k-mer-based taxonomic classification for metagenomic reads. |
| MetaPhlAn | Marker-gene based metagenomic profiling at species level. |
| QIIME 2 | Comprehensive microbiome analysis platform for amplicon and shotgun data. |
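
As a rough intuition for k-mer-based classification, the Python sketch below indexes the k-mers of a few reference sequences and assigns each read to the taxon that collects the most k-mer hits. This is a deliberately simplified toy and not Kraken2's algorithm: it has no taxonomy tree, no lowest-common-ancestor logic, and no minimizers. The reference sequences, taxon labels, and function names are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterator, Optional

# Hypothetical toy references; real databases hold whole genomes.
TOY_REFERENCES = {
    "taxon_a": "ACGTACGTTAGC",
    "taxon_b": "TTGGCCAATTGG",
}


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references: Dict[str, str], k: int) -> Dict[str, str]:
    """Map each k-mer to its taxon; k-mers shared between taxa are dropped."""
    index: Dict[str, str] = {}
    ambiguous = set()
    for taxon, seq in references.items():
        for km in set(kmers(seq, k)):
            if km in ambiguous:
                continue
            if km in index and index[km] != taxon:
                # Seen in two taxa: uninformative in this toy model.
                del index[km]
                ambiguous.add(km)
            else:
                index[km] = taxon
    return index


def classify(read: str, index: Dict[str, str], k: int) -> Optional[str]:
    """Return the taxon with the most k-mer hits, or None if nothing matches."""
    votes = Counter(index[km] for km in kmers(read, k) if km in index)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    idx = build_index(TOY_REFERENCES, k=4)
    for read in ("ACGTTAGC", "GGCCAATT", "AAAAAAAA"):
        print(read, classify(read, idx, k=4))
```

Exact k-mer lookups replace alignment entirely here, which is why this family of methods is fast; the cost is sensitivity to sequencing errors and to how complete the reference database is.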
As pipelines grow in complexity, workflow managers help you orchestrate tools, handle dependencies, and ensure reproducibility.
| Tool | Description |
|---|---|
| Nextflow | DSL-based workflow manager with excellent container and cloud support. Powers nf-core community pipelines. |
| Snakemake | Python-based workflow engine with automatic dependency resolution and cluster support. |
| CWL | Common Workflow Language: a platform-agnostic specification for describing analysis workflows. |
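
The core idea behind dependency handling in workflow managers is ordering steps so that each one runs only after its inputs exist. The sketch below shows that idea with Python's standard-library `graphlib.TopologicalSorter` (Python 3.9+). It is not how Nextflow or Snakemake work internally, and the step names in `PIPELINE` are a hypothetical example pipeline.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE: Dict[str, Set[str]] = {
    "fastqc": set(),
    "trim": set(),
    "align": {"trim"},
    "sort_bam": {"align"},
    "call_variants": {"sort_bam"},
    "multiqc": {"fastqc", "sort_bam"},
}


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order in which every step follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(PIPELINE))
    try:
        # A circular dependency cannot be scheduled and raises CycleError.
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError as err:
        print("cycle detected:", err.args[0])
```

Real workflow managers typically derive this graph from declared inputs and outputs rather than from a hand-written mapping, and add caching, retries, and cluster or container execution on top.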
A growing category of tools that run directly on phones, tablets, or laptops, with no server infrastructure required. Useful for teaching, fieldwork, quick previews, and environments with limited connectivity.
| Tool | Platform | Description |
|---|---|---|
| FastQLab (Android · iOS) | Android / iOS | Perform FASTQ quality control directly on your phone or tablet. Provides per-base quality, GC content, and read-length distribution charts with zero cloud dependency. Free. |
| JapalitySplice (GitHub (CLI) · Android · iOS) | CLI (Rust) / Android / iOS | Splice-aware RNA-seq aligner with sparse k-mer indexing, built for resource-constrained devices. The mobile version adds deep-learning splice site prediction for multiple model organisms (human, Drosophila, C. elegans, yeast). Open-source CLI under GPLv3. |
Note: FastQLab and JapalitySplice are developed by Japality Limited, the same team that maintains this tutorial site.
- Community & citations: widely used tools have more documentation and community support.
- Hardware constraints: STAR needs ≥30 GB RAM for human; lighter tools exist for laptops/mobile.
- Experiment type: DNA vs RNA vs metagenomics often dictates the pipeline.
- Reproducibility: prefer tools with version-pinnable installs (Conda, containers).
- No single tool is best for every dataset and question.
- Benchmarks on one species may not transfer to another.
- Mobile/lightweight tools are great for teaching and previews, but production-scale work usually runs on servers.
- Always validate outputs with independent QC checks.