Command line arguments

The tables below list all arguments available in the command-line interface.

| Argument | Type | Default | Description |
|---|---|---|---|
| `-h`, `--help` | Flag | - | Print a help message explaining all flags. `-h` shows a more compact version. |
| `-v`, `--version` | Flag | - | Print the program version. |

Required Input/Output

| Argument | Type | Default | Description |
|---|---|---|---|
| `-b`, `--bam <bam>` | Path (file) | - | Path to the BAM input file. Exactly one file must be provided. |
| `-p`, `--pod5 <pod5>` | Path(s) (file or directory) | - | Path(s) to the POD5 input. Multiple paths can be given, separated by spaces. Each path can point to a file or a directory; for a directory, all `.pod5` files within it are processed. File and directory paths can be combined. |
| `-o`, `--out <output>` | Path (file) | - | Path to the output file. The file extension determines the format: `.parquet` for Parquet output, `.jsonl` for JSONL output. |
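
The path handling described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not the tool's own code: the function names `collect_pod5` and `output_format` are hypothetical, and the sketch assumes that directories are searched non-recursively and that the extension match is case-sensitive, neither of which the table states.

```python
from pathlib import Path


def collect_pod5(paths):
    """Expand a mix of file and directory paths into a list of POD5 files."""
    files = []
    for path in map(Path, paths):
        if path.is_dir():
            # Assumption: only .pod5 files directly inside the directory are used.
            files.extend(sorted(path.glob("*.pod5")))
        else:
            files.append(path)
    return files


def output_format(out_path):
    """Derive the output format from the file extension of --out."""
    suffix = Path(out_path).suffix
    if suffix == ".parquet":
        return "parquet"
    if suffix == ".jsonl":
        return "jsonl"
    # Assumption: any other extension is treated as an error.
    raise ValueError(f"unsupported output extension: {suffix!r}")
```

For example, `output_format("alignments.parquet")` selects Parquet output, while `collect_pod5(["run1/", "extra.pod5"])` would combine every `.pod5` file in `run1/` with the single file `extra.pod5`.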

Optional Arguments

Optional arguments control general behavior or fine-tune the algorithm. For clarity, they are grouped by function.

General settings

| Argument | Type | Default | Description |
|---|---|---|---|
| `-r`, `--rna` | Flag | - | Set this flag if direct RNA data is provided. The raw signal (3'->5') is then reversed to match the base-called/mapped 5'->3' orientation. |
| `-k`, `--kmer-table <kmer_table>` | Path (file) | - | Path to a k-mer level table. Only required if no embedded k-mer table can be matched to the given data. |
| `-a`, `--alignment-type <type>` | Enum (`query`, `reference`, `both`) | `query` | Determines the type of alignment to perform. `query`: aligns the signal to the base-called query sequence. `reference`: aligns the signal to the reference sequence (if mapped). `both`: performs both alignments. |

Output settings

| Argument | Type | Default | Description |
|---|---|---|---|
| `-l`, `--output-level <level>` | Enum (`1`, `2`, `3`) | `2` | Controls the output detail level. `1`: read ID + alignment(s); `2`: + sequences; `3`: + signal. Note: including the signal greatly increases file size. |
| `-f`, `--force-overwrite` | Flag | - | Overwrite existing output files if set. Otherwise, an error is raised if the output file already exists. |
| `--output-batch-size <size>` | Integer | `4000` | Number of alignments batched before writing to file. Higher values may improve speed but use more memory. |

Threading settings

| Argument | Type / Possible Values | Default | Description |
|---|---|---|---|
| `-t`, `--threads <n>` | Integer | `8` | Number of parallel threads used. Set to `1` to disable multithreading. If set to `2` or `3`, processing falls back to single-threaded mode. |
| `--queue-size <size>` | Integer | `10000` | Size of the queue for transferring data between worker threads. Only used if more than 3 threads are active. Reducing the queue size lowers memory usage. |
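
The interaction between `--threads` and `--queue-size` is easy to misread, so the rules from the table are restated below as a small Python sketch. The function names are hypothetical and the code only mirrors the documented behavior; how the tool reacts to thread counts below 1 is not documented, so raising an error there is an assumption.

```python
def threading_mode(threads: int) -> str:
    """Return the processing mode implied by a given --threads value."""
    if threads < 1:
        # Assumption: the documentation does not cover values below 1.
        raise ValueError("threads must be at least 1")
    # 1 disables multithreading; 2 and 3 fall back to single-threaded mode.
    return "single-threaded" if threads <= 3 else "multi-threaded"


def queue_is_used(threads: int) -> bool:
    """--queue-size only has an effect when more than 3 threads are active."""
    return threads > 3


if __name__ == "__main__":
    for n in (1, 3, 4, 8):
        print(n, threading_mode(n), queue_is_used(n))
```

In practice this means that `--threads 2` and `--threads 3` bring no speed-up over `--threads 1`, and that tuning `--queue-size` only affects memory usage from 4 threads upward.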

Logging settings

| Argument | Type | Default | Description |
|---|---|---|---|
| `--log-level <level>` | Enum (`off`, `error`, `warn`, `info`, `debug`, `trace`) | `off` | Sets logging verbosity. The amount of intermediate information increases from `error` to `trace`. Use `error` for a summary of failed alignments. |
| `--log-path <path>` | Path (file) | `log.txt` | Path to the log file. Only used if `--log-level` is not `off`. If the file already exists, new entries are appended to it. |

Refinement – Dynamic Programming

| Argument | Type / Possible Values | Default | Description |
|---|---|---|---|
| `-i`, `--refine-iters <n>` | Integer | `2` | Number of refinement iterations. In each iteration, alignment boundaries are adjusted to minimize signal differences, and rescaling parameters are recalculated. Set to `0` to skip refinement. |
| `--refine-algo <algo>` | Enum (`viterbi`, `dwell-penalty`) | `dwell-penalty` | Refinement algorithm. `dwell-penalty` internally uses Viterbi but penalizes short dwell times. |
| `--dwell-penalty-target <value>` | Float | `4.0` | Target dwell time used by the `dwell-penalty` algorithm. |
| `--dwell-penalty-limit <value>` | Float | `3.0` | Maximum dwell time considered for the dwell-time penalty. |
| `--dwell-penalty-weight <value>` | Float | `0.5` | Penalty weight applied to short dwell times in `dwell-penalty` refinement. |
| `--half-bandwidth <n>` | Integer | `5` | Half-width of the signal band (number of neighboring bases considered during alignment). |
| `--min-band-size <n>` | Integer | `2` | Minimum enforced sequence band size when adjusting the sequence band. |
| `--normalize-levels` | Flag | - | Normalize expected levels from the k-mer table (equivalent to `do_fix_gauge` in Remora). |

Refinement – Rescaling

| Argument | Type / Possible Values | Default | Description |
|---|---|---|---|
| `--rescale-algo <algo>` | Enum (`theil-sen`, `least-squares`) | `theil-sen` | Rescaling algorithm for normalizing the signal (`norm_signal = (signal - shift) / scale`). Uses the full signal for estimation. |
| `--rescale-dwell-filter-lower-quant <q>` | Float | `0.1` | Lower dwell-time quantile filter. Bases with dwell times below this quantile are excluded. |
| `--rescale-dwell-filter-upper-quant <q>` | Float | `0.9` | Upper dwell-time quantile filter. Bases with dwell times above this quantile are excluded. |
| `--rescale-min-abs-level <value>` | Float | `0.2` | Minimum normalized signal intensity difference required for inclusion. |
| `--rescale-num-bases-truncate <n>` | Integer | `10` | Number of bases trimmed from each end before rescaling. |
| `--rescale-min-num-filtered-levels <n>` | Integer | `10` | Minimum number of bases required after filtering for valid rescaling. |
| `--rescale-max-len <n>` | Integer | `1000` | Maximum number of bases used for rescaling. If exceeded, a random subset is used (only for `theil-sen`). Set to `0` to use all. |
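
To make the normalization formula `norm_signal = (signal - shift) / scale` concrete, the sketch below estimates `scale` and `shift` with a plain Theil-Sen fit (median of pairwise slopes, then median of residual intercepts) and applies them. It is a generic textbook version written for illustration, not the tool's implementation: the function names are hypothetical, the example values are made up, and it assumes the fit regresses measured per-base levels on expected k-mer levels. The filtering and subsampling options from the table are left out.

```python
from itertools import combinations
from statistics import median


def theil_sen(expected, measured):
    """Robust line fit: measured ~ scale * expected + shift."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(expected, measured), 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two distinct expected levels")
    scale = median(slopes)
    shift = median(y - scale * x for x, y in zip(expected, measured))
    return scale, shift


def normalize(signal, shift, scale):
    """Apply norm_signal = (signal - shift) / scale to every sample."""
    return [(s - shift) / scale for s in signal]


if __name__ == "__main__":
    # Made-up levels; the last measured value is a deliberate outlier.
    expected = [-2.0, -1.0, 0.0, 1.0, 2.0]
    measured = [60.0, 80.0, 100.0, 120.0, 500.0]
    scale, shift = theil_sen(expected, measured)
    print(scale, shift)
    print(normalize([60.0, 100.0, 140.0], shift, scale))
```

Because the estimate is built from medians, the single outlier in the example does not pull the fitted line away from the other four points, which is the usual reason to prefer Theil-Sen over least squares on noisy signal. The number of pairwise slopes grows quadratically with the number of bases, which is one plausible reason for capping the input size as `--rescale-max-len` does.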

Refinement – Rough Rescaling

| Argument | Type / Possible Values | Default | Description |
|---|---|---|---|
| `--rough-rescale-algo <algo>` | Enum (`none`, `least-squares`, `theil-sen`) | `theil-sen` | Algorithm for rough pre-refinement rescaling using signal percentiles instead of full signal data. |
| `--rough-rescale-quants-min <q>` | Float | `0.05` | Lowest signal percentile used in rough rescaling. |
| `--rough-rescale-quants-max <q>` | Float | `0.95` | Highest signal percentile used in rough rescaling. |
| `--rough-rescale-quants-steps <n>` | Integer | `19` | Number of percentile steps between min and max. Default percentiles: 0.05, 0.10, ..., 0.95. |
| `--rough-rescale-clip-bases <n>` | Integer | `10` | Number of bases trimmed from each end before rough rescaling. |
| `--rough-rescale-use-all-signal` | Flag | - | Whether to use the entire signal for percentile calculation. If unset, one measurement per base (center) is used to reduce computational load. |
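
The three `--rough-rescale-quants-*` options together define an evenly spaced grid of percentiles. The sketch below shows how the defaults produce the 19 values 0.05, 0.10, ..., 0.95 and how such a grid could be evaluated on a list of signal values. Both function names are hypothetical. The sketch assumes that the step count includes both endpoints, and the linear interpolation between neighboring samples is likewise an assumption, since the documentation does not say how percentiles are computed.

```python
def quantile_grid(q_min=0.05, q_max=0.95, steps=19):
    """Evenly spaced quantiles from q_min to q_max, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (q_max - q_min) / (steps - 1)
    # Rounding removes floating-point noise such as 0.15000000000000002.
    return [round(q_min + i * width, 10) for i in range(steps)]


def quantile(sorted_values, q):
    """Quantile of pre-sorted data using linear interpolation (assumption)."""
    if not sorted_values:
        raise ValueError("no values given")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


if __name__ == "__main__":
    grid = quantile_grid()
    print(len(grid), grid[:3], grid[-1])
    values = sorted(float(v) for v in range(101))
    print([round(quantile(values, q), 6) for q in grid[:3]])
```

Comparing a short list of percentiles instead of every sample keeps the rough rescaling step cheap, which matches its role as a pre-refinement estimate.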