Matching embedded kmer tables to data
To keep the alignment process as simple as possible, the user only needs to provide POD5 file(s) and the corresponding basecalls in a BAM file. A kmer table was originally required as well, but since these tables are static, a selection is now embedded in the executables.
The following tables are embedded:
- DNA R10 400bps
- DNA R10 260bps
- DNA R9 (legacy)
- RNA004
- RNA002 (legacy)
To select an embedded table, Fishnet uses the basecall model name stored in the header of a BAM file generated by Dorado, specifically the basecall_model value of the DS tag (for more information see the Dorado docs).
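As an illustration of where this value lives, here is a minimal sketch that pulls basecall_model out of a single header line. It is not Fishnet's implementation: the function name basecall_model_from_rg is hypothetical, and the sketch assumes the DS tag sits on a tab-separated @RG line and holds space-separated key=value pairs.

```rust
/// Hypothetical helper (not part of Fishnet): extract the `basecall_model`
/// value from the DS field of one `@RG` header line.
///
/// Assumptions: header fields are tab-separated, and the DS field contains
/// space-separated `key=value` pairs in no particular order.
fn basecall_model_from_rg(line: &str) -> Option<&str> {
    // Only read group lines are considered in this sketch.
    if !line.starts_with("@RG") {
        return None;
    }
    // Find the DS field and drop its "DS:" prefix.
    let ds = line.split('\t').find_map(|field| field.strip_prefix("DS:"))?;
    // Scan the key=value pairs for the basecall model.
    ds.split_whitespace()
        .find_map(|pair| pair.strip_prefix("basecall_model="))
}
```

Searching for the key rather than relying on its position keeps the lookup independent of the order in which the pairs are written.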
The matching is performed in alignment::core::refinement::load_kmer_table. If no embedded kmer table can be matched to the given data, the user must provide one manually with the --kmer-table flag.
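The selection logic could look roughly like the sketch below. It is not the code in load_kmer_table: the types EmbeddedTable and TableSource, the functions match_embedded_table and resolve_table, and the prefix and substring rules are all assumptions made for illustration. The sketch also assumes the user-supplied path is only a fallback; whether --kmer-table overrides an embedded match is not stated here.

```rust
/// Hypothetical identifiers for the five embedded tables.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EmbeddedTable {
    DnaR10At400Bps,
    DnaR10At260Bps,
    DnaR9,
    Rna004,
    Rna002,
}

/// Where the kmer table comes from in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum TableSource {
    Embedded(EmbeddedTable),
    UserFile(String),
}

/// Map a basecall model name to an embedded table.
/// The matching rules are assumptions, not Fishnet's actual rules.
fn match_embedded_table(model: &str) -> Option<EmbeddedTable> {
    if model.starts_with("rna004") {
        Some(EmbeddedTable::Rna004)
    } else if model.starts_with("rna002") {
        Some(EmbeddedTable::Rna002)
    } else if model.starts_with("dna_r9") {
        Some(EmbeddedTable::DnaR9)
    } else if model.starts_with("dna_r10") && model.contains("400bps") {
        Some(EmbeddedTable::DnaR10At400Bps)
    } else if model.starts_with("dna_r10") && model.contains("260bps") {
        Some(EmbeddedTable::DnaR10At260Bps)
    } else {
        None
    }
}

/// Try the embedded tables first, then fall back to a user-supplied path
/// (the value of --kmer-table). Fail if neither is available.
fn resolve_table(model: &str, user_path: Option<&str>) -> Result<TableSource, String> {
    if let Some(table) = match_embedded_table(model) {
        return Ok(TableSource::Embedded(table));
    }
    match user_path {
        Some(path) => Ok(TableSource::UserFile(path.to_string())),
        None => Err(format!(
            "no embedded kmer table matches model '{model}'; provide one with --kmer-table"
        )),
    }
}
```

Returning an error that names the unmatched model and the flag makes it clear to the user what to supply when no embedded table applies.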