Bioinformatics Tools Reference

A curated catalog of commonly used tools organized by workflow step. Each entry links to the official source so you can evaluate and install what fits your pipeline. Some tools listed here are developed by the same team behind this tutorial site.

Quality Control & Preprocessing

QC is the first step in nearly every pipeline. These tools help you assess raw read quality, detect adapter contamination, and trim/filter reads before downstream analysis.

Tool	Platform	Description
`FastQC`	Desktop / CLI	Widely used QC report generator for FASTQ files: per-base quality, GC content, adapter content, and overrepresented sequences.
`MultiQC`	CLI	Aggregates results from many tools (FastQC, STAR, Salmon, etc.) into a single interactive HTML report. Essential for multi-sample projects.
`fastp`	CLI	All-in-one preprocessor: quality trimming, adapter removal, filtering, and QC reporting in one pass. Very fast (multi-threaded C++).
`Cutadapt`	CLI	Flexible adapter and quality trimming tool with a rich set of filtering options. Great for non-standard adapter schemes.
`Trimmomatic`	CLI (Java)	Illumina-specific trimming with sliding-window quality filtering. Mature and widely cited.
`FastQLab` Android · iOS	Mobile (Android / iOS)	FASTQ quality control for quick field or classroom use. Runs entirely on-device with no server needed, which is useful when you want to preview QC metrics on sequencing data without setting up a full workstation.
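
To make the QC metrics above concrete, here is a minimal sketch of two of them: per-base mean quality and GC content. It assumes Phred+33 quality encoding and strict four-line FASTQ records; the function names and the demo reads are hypothetical, and none of this is how FastQC or fastp are implemented.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of input (or a blank line) stops this sketch
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


def per_base_mean_quality(records, offset=33):
    """Mean Phred score at each read position (assumes Phred+33 encoding)."""
    totals, counts = [], []
    for _name, _sequence, quality in records:
        for i, char in enumerate(quality):
            if i == len(totals):  # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


def gc_fraction(sequence):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Two made-up reads: one with Phred 40 at every base, one with Phred 0.
demo = StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n")
records = list(read_fastq(demo))
mean_quality = per_base_mean_quality(records)  # [20.0, 20.0, 20.0, 20.0]
```

A per-base mean that drops toward the 3' end is the usual signal that quality trimming is worth applying before alignment.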

Read Alignment

Aligners map sequencing reads to a reference genome or transcriptome. The choice depends on your experiment type (DNA vs RNA), speed requirements, and memory constraints.

Tool	Type	Description
`BWA-MEM2`	DNA short-read	Faster successor to BWA-MEM, the de facto standard for whole-genome and exome alignment. Accurate and widely benchmarked.
`Minimap2`	Long-read / DNA / RNA	Versatile aligner for PacBio, ONT, and short reads. Also supports splice-aware RNA mapping.
`STAR`	RNA-seq (splice-aware)	High-throughput splice-aware aligner for RNA-seq. Very fast but requires significant RAM (≥30 GB for human).
`HISAT2`	RNA-seq / DNA	Memory-efficient splice-aware aligner using a graph FM index. Good balance of speed and RAM for RNA-seq.
`Bowtie2`	DNA short-read	Fast, memory-efficient aligner. Common for ChIP-seq, ATAC-seq, and other assays where reads are short.
`JapalitySplice` GitHub · Android · iOS	RNA-seq (splice-aware)	A sparse k-mer splice-aware RNA-seq aligner written in Rust, designed for resource-constrained environments. Supports annotation-guided junction detection and runs on laptops or even mobile devices. Particularly suitable for teaching, small genomes (yeast, Arabidopsis), and edge computing scenarios. Also available as a mobile app with on-device deep-learning splice site prediction for multiple model organisms.
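
What "splice-aware" means in practice is visible in the aligner's output: in SAM records, a spliced read carries an `N` operation in its CIGAR string that marks the skipped intron. The sketch below recovers intron coordinates from a 1-based alignment position and a CIGAR string. The function name is hypothetical, and the code only reads standard SAM fields; it does not reflect the internals of any aligner listed above.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REFERENCE_CONSUMING = set("MDN=X")


def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based and inclusive.

    pos is the 1-based leftmost mapped reference position (SAM POS field).
    """
    junctions = []
    ref = pos
    for length_text, op in CIGAR_ELEMENT.findall(cigar):
        length = int(length_text)
        if op == "N":  # skipped region, i.e. an intron for RNA-seq reads
            junctions.append((ref, ref + length - 1))
        if op in REFERENCE_CONSUMING:
            ref += length
    return junctions


# A 100-base read split across a 1000-base intron.
print(splice_junctions(100, "50M1000N50M"))  # [(150, 1149)]
```

A DNA aligner such as Bowtie2 would typically not produce `N` operations, which is one reason RNA-seq reads spanning exon boundaries need a splice-aware tool.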

Transcript Quantification

Pseudoalignment and lightweight mapping approaches that quantify transcript abundance without full genome alignment.

Tool	Description
`Salmon`	Fast transcript quantification using selective alignment. Widely used for bulk RNA-seq DE analysis via tximeta/DESeq2.
`kallisto`	Pseudoalignment-based RNA-seq quantification. Extremely fast with a small memory footprint.
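
Both tools report abundance in transcripts per million (TPM). As a rough illustration of what that unit means, the sketch below converts estimated counts and effective transcript lengths into TPM. The function name and the numbers are made up for the example; the tools themselves estimate counts and effective lengths with far more sophisticated models.

```python
def tpm(counts, effective_lengths):
    """Convert per-transcript counts to transcripts per million.

    Each count is divided by its effective length to get a per-base rate,
    then rates are rescaled so they sum to one million.
    """
    rates = [count / length for count, length in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [rate / total * 1_000_000 for rate in rates]


# Equal counts on transcripts of different length: the shorter transcript
# gets the higher TPM because its reads come from fewer bases.
values = tpm([10, 10], [1000, 2000])
```

Because TPM is length-normalized and sums to a constant, it is convenient for comparing transcripts within a sample; count-based DE packages generally expect counts rather than TPM as input.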

Variant Calling

Tools for identifying SNPs, indels, and structural variants from aligned reads.

Tool	Description
`GATK`	Industry-standard variant calling toolkit (HaplotypeCaller, Mutect2, VQSR). Comprehensive but complex.
`BCFtools`	Lightweight variant caller and VCF manipulation suite. Great for quick analyses and scripting.
`DeepVariant`	CNN-based variant caller from Google. High accuracy, especially on whole-genome short-read data.

SAM/BAM Utilities

Essential utilities for manipulating, filtering, and summarizing alignment files.

Tool	Description
`Samtools`	The standard tool for sorting, indexing, filtering, and inspecting SAM/BAM/CRAM files.
`Picard`	Java-based BAM utilities for deduplication, metrics collection, and format conversion.
`BEDTools`	Genomic interval arithmetic — intersect, merge, subtract intervals across BED/BAM/VCF files.
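
Much of the filtering these utilities do is driven by the SAM FLAG field, a bitwise combination of read properties. The sketch below decodes a FLAG integer into readable names. The bit values follow the SAM specification; the short names and the function are labels chosen for this example.

```python
# Bit values from the SAM specification; the names are this sketch's own.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


print(decode_flag(99))
# ['paired', 'proper_pair', 'mate_reverse_strand', 'first_in_pair']
```

Reading flags this way helps when choosing filter masks, for example excluding unmapped (0x4) or duplicate (0x400) reads before variant calling.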

Differential Expression

Tool	Description
`DESeq2`	R/Bioconductor package for count-based DE using negative binomial models.
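`edgeR`	R/Bioconductor package for count-based DE using negative binomial models with empirical Bayes dispersion estimation.
`limma-voom`	Linear-model DE framework originally built for microarrays; the voom transformation adapts it to RNA-seq count data by modeling the mean-variance trend.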
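
Count-based DE methods first have to correct for differences in sequencing depth between samples. The sketch below computes per-sample size factors in the spirit of the median-of-ratios normalization described for DESeq2. It is a simplified illustration with a hypothetical function name and toy counts, not the package's code.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts is a list of genes, each a list of raw counts per sample.
    Genes with a zero in any sample are skipped, since their geometric
    mean is zero. Raises statistics.StatisticsError if no gene remains.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(count == 0 for count in gene):
            continue
        # Log of the gene's geometric mean across samples.
        log_geo_mean = sum(math.log(count) for count in gene) / n_samples
        for sample, count in enumerate(gene):
            log_ratios[sample].append(math.log(count) - log_geo_mean)
    return [math.exp(median(ratios)) for ratios in log_ratios]


# Sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(toy_counts)  # approximately [0.707, 1.414]
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so the remaining differences are more likely to reflect expression than depth.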

Metagenomics & Taxonomy

Tool	Description
`Kraken2`	Ultra-fast k-mer-based taxonomic classification for metagenomic reads.
`MetaPhlAn`	Marker-gene based metagenomic profiling at species level.
`QIIME 2`	Comprehensive microbiome analysis platform for amplicon and shotgun data.
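
The idea behind k-mer-based classification can be shown with a toy example: index the k-mers of each reference, then assign a read to the taxon whose unique k-mers it matches most often. This is a deliberately naive sketch with made-up sequences and hypothetical function names; Kraken2 itself relies on a much more compact index and a taxonomy-aware assignment scheme that this code does not attempt to reproduce.

```python
from collections import Counter


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, sequence in references.items():
        for kmer in kmers(sequence, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon with the most unique k-mer hits, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer, ())
        if len(taxa) == 1:  # only k-mers specific to one taxon get a vote
            votes.update(taxa)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


references = {"taxon_a": "ACGTACGTTAGC", "taxon_b": "TTGGCCAAGGTT"}
index = build_index(references, k=4)
print(classify("GTACGTTA", index, k=4))  # taxon_a
```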

Workflow Managers & Reproducibility

As pipelines grow in complexity, workflow managers help you orchestrate tools, handle dependencies, and ensure reproducibility.

Tool	Description
`Nextflow`	DSL-based workflow manager with excellent container and cloud support. Powers nf-core community pipelines.
`Snakemake`	Python-based workflow engine with automatic dependency resolution and cluster support.
`CWL`	Common Workflow Language — a platform-agnostic specification for describing analysis workflows.
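
The core service a workflow manager provides is working out which steps must run, and in what order, to produce a requested file. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9+). The rule names, file names, and the `execution_order` helper are hypothetical, and real engines such as Snakemake or Nextflow add much more, including wildcards, caching, and cluster submission.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each rule declares the files it reads and writes.
RULES = {
    "trim": {"inputs": ["raw.fastq"], "outputs": ["trimmed.fastq"]},
    "align": {"inputs": ["trimmed.fastq"], "outputs": ["aligned.bam"]},
    "count": {"inputs": ["aligned.bam"], "outputs": ["counts.tsv"]},
    "qc": {"inputs": ["raw.fastq"], "outputs": ["qc.html"]},
}


def execution_order(rules, target):
    """Return the rules needed to build target, dependencies first."""
    # Which rule produces each file.
    producer = {
        output: name for name, rule in rules.items() for output in rule["outputs"]
    }
    graph = {}
    pending = [producer[target]]  # KeyError if no rule makes the target
    while pending:
        name = pending.pop()
        if name in graph:
            continue
        # Inputs with no producing rule are treated as existing source files.
        dependencies = {
            producer[path] for path in rules[name]["inputs"] if path in producer
        }
        graph[name] = dependencies
        pending.extend(dependencies)
    return list(TopologicalSorter(graph).static_order())


print(execution_order(RULES, "counts.tsv"))  # ['trim', 'align', 'count']
```

Note that the `qc` rule is left out when only `counts.tsv` is requested: working backward from the target means steps that do not contribute to it never run.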

Mobile & On-Device Bioinformatics

A growing category of tools that run directly on phones, tablets, or laptops — no server infrastructure required. Useful for teaching, fieldwork, quick previews, and environments with limited connectivity.

Tool	Platform	Description
`FastQLab` Android · iOS	Android / iOS	Perform FASTQ quality control directly on your phone or tablet. Provides per-base quality, GC content, and read-length distribution charts with zero cloud dependency. Free.
`JapalitySplice` GitHub (CLI) · Android · iOS	CLI (Rust) / Android / iOS	Splice-aware RNA-seq aligner with sparse k-mer indexing, built for resource-constrained devices. The mobile version adds deep-learning splice site prediction for multiple model organisms (human, Drosophila, C. elegans, yeast). Open-source CLI under GPLv3.

Note: FastQLab and JapalitySplice are developed by Japality Limited, the same team that maintains this tutorial site.

How to choose a tool

Consider

Community & citations — widely used tools have more documentation and community support.
Hardware constraints — STAR needs ≥30 GB RAM for human; lighter tools exist for laptops/mobile.
Experiment type — DNA vs RNA vs metagenomics often dictates the pipeline.
Reproducibility — prefer tools with version-pinnable installs (Conda, containers).

Keep in mind

No single tool is best for every dataset and question.
Benchmarks on one species may not transfer to another.
Mobile/lightweight tools are great for teaching and previews, but production-scale work usually runs on servers.
Always validate outputs with independent QC checks.