BactiSeq: Databases
Database Configuration
Introduction
Databases are fundamental to genomic analysis, and BactiSeq's customizable database parameters let the pipeline be adapted to diverse research contexts. These parameters enable researchers to:
- Utilize specific databases to maintain consistency with previously executed analyses
- Optimize database size and scope when computational resources are limited
- Customize analyses for specific organisms or research questions
The pipeline automatically downloads the default databases and stores them in the path specified by `--db_path`. If no path is given, they are saved to the current directory under `Databases/`.
Default Database Versions
When no custom database paths are provided, BactiSeq downloads the following default versions:
- Bakta database - v1.10.4
- AMRFinderPlus database - v3.12.8
- CheckM2 database - v14897628
- RGI database - v6.0.3
- Busco database - v5.8.3 (`bacteria_odb10`)
- Kraken2 database - 8GB standard database
- Gambit database - v1.0
Database Parameters
| Parameter | Description | Default Value | Resources |
|---|---|---|---|
| db_path | Storage directory for databases downloaded by the pipeline | "./Databases" | custom local path |
| bakta_db | Path to Bakta annotation database | null | https://zenodo.org/records/14916843 |
| amr_db | Path to AMRFinderPlus database | null | https://github.com/ncbi/amr/wiki/AMRFinderPlus-database |
| card_db | Path to RGI database for AMR genes | null | https://github.com/arpcard/rgi/blob/master/docs/rgi_load.rst |
| checkm2_db | Path to CheckM2 database | null | https://zenodo.org/records/14897628 |
| checkm2_ver | Version (Zenodo record ID) of the CheckM2 database that is downloaded if no path is given | 14897628 | https://zenodo.org/records/14897628 |
| busco_db_type | Lineage dataset that the Busco database contains | "bacteria_odb10" | https://busco.ezlab.org/ |
| busco_db | Path to Busco database | null | https://busco.ezlab.org/ |
| kraken2_db | Path to Kraken2 database | null | https://benlangmead.github.io/aws-indexes/k2 |
| gambit_db | Path to Gambit database | null | https://gambit-genomics.readthedocs.io/en/latest/databases.html#database-releases |
If a different version of a database is needed, the path to the manually downloaded database can be set in the pipeline's nextflow.config file. This lets researchers use a specific database version to stay consistent with previously executed analyses, or reduce database size and scope when computational resources are limited.
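As a rough illustration, a nextflow.config fragment that points the pipeline at manually downloaded databases might look like the sketch below. It assumes the parameters in the table above live in the standard Nextflow `params` scope. All directory paths are hypothetical placeholders, not locations the pipeline creates.

```groovy
// nextflow.config (fragment): a sketch, not the pipeline's shipped configuration.
// Assumption: the parameters from the table are defined in the `params` scope.
params {
    // Where databases downloaded by the pipeline itself are stored
    db_path    = "./Databases"

    // Manually downloaded databases (hypothetical paths, replace with your own)
    bakta_db   = "/data/databases/bakta/db"
    kraken2_db = "/data/databases/kraken2/standard_8gb"

    // Left as null so the pipeline downloads its default version
    checkm2_db = null
    gambit_db  = null
}
```

Any parameter left as `null` keeps its documented default behavior, so only the databases that need to be pinned have to be listed.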
⚠️ WARNING: If you change the Bakta database type to a light database, you must change the line for Bakta in `subworkflows/local/databasedownload/main.nf` so that it detects `db-light` instead of just `db`.