Command line arguments
The tables below list all arguments available in the command line interface.
| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `-h, --help` | Flag | - | Print a help message explaining all flags. -h shows a more compact version. |
| `-v, --version` | Flag | - | Print the program version. |


| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `-b, --bam <bam>` | Path (file) | - | Path to the BAM input file. Exactly one file must be provided. |
| `-p, --pod5 <pod5>` | Path(s) (file or directory) | - | Path(s) to the POD5 input. Multiple paths can be provided, separated by spaces. Each path can point to a file or a directory. If a directory is provided, all .pod5 files within it are processed. File and directory paths can be combined. |
| `-o, --out <output>` | Path (file) | - | Path to the output file. The file extension determines the format: .parquet for Parquet output, .jsonl for JSONL output. |

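The input and output path rules above can be sketched in a few lines of Python. This is only an illustration: `collect_pod5` and `output_format` are hypothetical helper names, and the non-recursive directory search and the error for unknown extensions are assumptions, not documented behavior of the tool.

```python
from pathlib import Path


def collect_pod5(paths):
    """Expand a mix of file and directory paths into a list of POD5 files.

    Assumption: directories are searched non-recursively for *.pod5 files.
    """
    files = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # A directory contributes every .pod5 file directly inside it.
            files.extend(sorted(path.glob("*.pod5")))
        else:
            # A file path is taken as given.
            files.append(path)
    return files


def output_format(path):
    """Pick the output format from the file extension (.parquet or .jsonl)."""
    formats = {".parquet": "parquet", ".jsonl": "jsonl"}
    suffix = Path(path).suffix
    if suffix not in formats:
        # Assumption: any other extension is rejected.
        raise ValueError(f"unsupported output extension: {suffix!r}")
    return formats[suffix]


print(output_format("alignments.parquet"))
print(output_format("alignments.jsonl"))
```

Passing a directory and a single file together, for example `collect_pod5(["run_dir", "extra.pod5"])`, mirrors the statement that file and directory paths can be combined.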
Optional Arguments
Optional arguments control general settings and fine-tune the algorithm. For clarity, they are grouped by function.
General settings
| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `-r, --rna` | Flag | - | Set if direct RNA data is provided. Reverses the raw signal (3'->5') to match the base-called/mapped 5'->3' orientation. |
| `-k, --kmer-table <kmer_table>` | Path (file) | - | Path to a k-mer level table. Only required if no embedded k-mer table can be matched to the given data. |
| `-a, --alignment-type <type>` | Enum (query, reference, both) | query | Type of alignment to perform. query: aligns the signal to the base-called query sequence. reference: aligns the signal to the reference sequence (if mapped). both: performs both alignments. |

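The effect of the RNA flag on the raw signal can be pictured with a minimal NumPy sketch. The function name `orient_signal` is hypothetical; only the reversal itself is taken from the description above.

```python
import numpy as np


def orient_signal(raw_signal, is_rna):
    """Return the signal in 5'->3' order.

    Direct RNA is sequenced 3'->5', so its raw signal is reversed;
    DNA signal is returned unchanged.
    """
    signal = np.asarray(raw_signal)
    return signal[::-1] if is_rna else signal


print(orient_signal([410, 395, 502], is_rna=True))
```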
Output settings
| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `-l, --output-level <level>` | Enum (1, 2, 3) | 2 | Controls the output detail level. 1: read ID + alignment(s); 2: + sequences; 3: + signal. Note: including the signal greatly increases file size. |
| `-f, --force-overwrite` | Flag | - | Overwrite an existing output file. If unset, an error is raised when the output file already exists. |
| `--output-batch-size <size>` | Integer | 4000 | Number of alignments batched before writing to file. Higher values may improve speed but use more memory. |

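The trade-off behind the output batch size can be illustrated with a generic batching helper. This is a sketch of the general technique, not the tool's writer: `batches` is a hypothetical name, and each yielded list stands for one write to the output file.

```python
from itertools import islice


def batches(records, batch_size=4000):
    """Yield lists of up to batch_size records.

    Larger batches mean fewer writes but more records held in memory at once.
    """
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Ten records with a batch size of four give batches of 4, 4 and 2.
print([len(b) for b in batches(range(10), batch_size=4)])
```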
Threading settings
| Argument | Type / Possible Values | Default | Description |
| --- | --- | --- | --- |
| `-t, --threads <n>` | Integer | 8 | Number of parallel threads used. Set to 1 to disable multithreading. If set to 2 or 3, processing falls back to single-threaded mode. |
| `--queue-size <size>` | Integer | 10000 | Size of the queue for transferring data between worker threads. Only used if more than 3 threads are active. Reducing the queue size lowers memory usage. |

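The relationship between the thread count and the queue can be summarized as a small decision function. This is a hedged sketch: `plan_execution` is a hypothetical name, rejecting values below 1 is an assumption, and a bounded `queue.Queue` merely stands in for whatever queue the tool uses internally.

```python
import queue


def plan_execution(threads, queue_size=10000):
    """Return (mode, queue) for a given thread count.

    1 to 3 threads: single-threaded, no queue needed.
    More than 3 threads: multi-threaded, with a bounded transfer queue.
    """
    if threads < 1:
        # Assumption: non-positive thread counts are invalid.
        raise ValueError("threads must be at least 1")
    if threads <= 3:
        return "single-threaded", None
    # A bounded queue caps how many items wait in memory between workers.
    return "multi-threaded", queue.Queue(maxsize=queue_size)


for n in (1, 3, 8):
    print(n, plan_execution(n)[0])
```

A smaller `queue_size` limits the number of in-flight items, which is why reducing it lowers memory usage.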
Logging settings
| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `--log-level <level>` | Enum (off, error, warn, info, debug, trace) | off | Sets the logging verbosity. The amount of intermediate information increases from error to trace. Use error for a summary of failed alignments. |
| `--log-path <path>` | Path (file) | log.txt | Path to the log file. Only used if the log level is not off. If the file already exists, new entries are appended to it. |

Refinement – Dynamic Programming
| Argument | Type / Possible Values | Default | Description |
| --- | --- | --- | --- |
| `-i, --refine-iters <n>` | Integer | 2 | Number of refinement iterations. In each iteration, alignment boundaries are adjusted to minimize signal differences, and rescaling parameters are recalculated. Set to 0 to skip refinement. |
| `--refine-algo <algo>` | Enum (viterbi, dwell-penalty) | dwell-penalty | Refinement algorithm. dwell-penalty internally uses Viterbi but penalizes short dwell times. |
| `--dwell-penalty-target <value>` | Float | 4.0 | Target dwell time used by the dwell-penalty algorithm. |
| `--dwell-penalty-limit <value>` | Float | 3.0 | Maximum dwell time considered for the dwell-time penalty. |
| `--dwell-penalty-weight <value>` | Float | 0.5 | Penalty weight applied to short dwell times in dwell-penalty refinement. |
| `--half-bandwidth <n>` | Integer | 5 | Half-width of the signal band (number of neighboring bases considered during alignment). |
| `--min-band-size <n>` | Integer | 2 | Minimum enforced sequence band size when adjusting the sequence band. |
| `--normalize-levels` | Flag | - | Normalize expected levels from the k-mer table (equivalent to do_fix_gauge in Remora). |

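To make the interplay of the three dwell-penalty parameters more concrete, the sketch below uses a quadratic penalty on dwell times up to the limit. The quadratic form is an assumption chosen for illustration, not a formula confirmed by this documentation, and `dwell_penalty` is a hypothetical name; the tool's actual penalty may differ.

```python
def dwell_penalty(dwell, target=4.0, limit=3.0, weight=0.5):
    """Illustrative short-dwell penalty (assumed form, not the tool's own).

    Dwell times above `limit` are not penalized. Shorter dwells are penalized
    by their squared distance from `target`, scaled by `weight`.
    """
    if dwell > limit:
        return 0.0
    return weight * (dwell - target) ** 2


# With the default parameters, shorter dwells receive larger penalties.
for dwell in (1, 2, 3, 4):
    print(dwell, dwell_penalty(dwell))
```

Under this assumed form, a larger weight or a higher target makes short dwells more costly, while the limit bounds the range of dwell times that are penalized at all.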
Refinement – Rescaling
| Argument | Type / Possible Values | Default | Description |
| --- | --- | --- | --- |
| `--rescale-algo <algo>` | Enum (theil-sen, least-squares) | theil-sen | Rescaling algorithm for normalizing the signal (norm_signal = (signal - shift) / scale). Uses the full signal for estimation. |
| `--rescale-dwell-filter-lower-quant <q>` | Float | 0.1 | Lower dwell-time quantile filter. Bases with dwell times below this quantile are excluded. |
| `--rescale-dwell-filter-upper-quant <q>` | Float | 0.9 | Upper dwell-time quantile filter. Bases with dwell times above this quantile are excluded. |
| `--rescale-min-abs-level <value>` | Float | 0.2 | Minimum normalized signal intensity difference required for inclusion. |
| `--rescale-num-bases-truncate <n>` | Integer | 10 | Number of bases trimmed from each end before rescaling. |
| `--rescale-min-num-filtered-levels <n>` | Integer | 10 | Minimum number of bases required after filtering for valid rescaling. |
| `--rescale-max-len <n>` | Integer | 1000 | Maximum number of bases used for rescaling. If exceeded, a random subset is used (only for theil-sen). Set to 0 to use all. |

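The normalization formula and the idea behind a Theil-Sen fit can be sketched with NumPy. This is a textbook Theil-Sen estimator (median of pairwise slopes), shown under the assumption that observed per-base signal levels are regressed against expected k-mer levels; the function names and the toy data are hypothetical and do not reflect the tool's internals.

```python
import numpy as np


def theil_sen(x, y):
    """Robust line fit: median of all pairwise slopes, then median intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)  # all index pairs with i < j
    dx = x[j] - x[i]
    keep = dx != 0  # skip pairs with identical x to avoid division by zero
    slope = np.median((y[j] - y[i])[keep] / dx[keep])
    intercept = np.median(y - slope * x)
    return slope, intercept


def normalize(signal, shift, scale):
    """norm_signal = (signal - shift) / scale"""
    return (np.asarray(signal, dtype=float) - shift) / scale


# Toy data: observed = 20 * expected + 90, with one corrupted base.
expected = np.array([-1.2, -0.4, 0.3, 0.9, 1.5])
observed = 20.0 * expected + 90.0
observed[2] += 50.0  # outlier

scale, shift = theil_sen(expected, observed)
print(scale, shift)
print(normalize(observed, shift, scale))
```

Because medians are used, the single outlier does not move the estimate, which is the usual argument for Theil-Sen over least squares on noisy data. The number of pairwise slopes grows quadratically with the number of bases, which is consistent with capping the input length for theil-sen only.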
Refinement – Rough Rescaling
| Argument | Type / Possible Values | Default | Description |
| --- | --- | --- | --- |
| `--rough-rescale-algo <algo>` | Enum (none, least-squares, theil-sen) | theil-sen | Algorithm for rough pre-refinement rescaling, which uses signal quantiles instead of the full signal data. |
| `--rough-rescale-quants-min <q>` | Float | 0.05 | Lowest signal quantile used in rough rescaling. |
| `--rough-rescale-quants-max <q>` | Float | 0.95 | Highest signal quantile used in rough rescaling. |
| `--rough-rescale-quants-steps <n>` | Integer | 19 | Number of quantile steps from min to max (inclusive). Default quantiles: 0.05, 0.10, ..., 0.95. |
| `--rough-rescale-clip-bases <n>` | Integer | 10 | Number of bases trimmed from each end before rough rescaling. |
| `--rough-rescale-use-all-signal` | Flag | - | Use the entire signal for the quantile calculation. If unset, one measurement per base (center) is used to reduce computational load. |

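The default quantile grid and a quantile-based fit can be reproduced with NumPy. This is a rough sketch under stated assumptions: the function names are hypothetical, and a least-squares line through matched quantiles is used here for brevity, whereas the tool's default rough rescaling algorithm is theil-sen.

```python
import numpy as np


def rough_quantiles(q_min=0.05, q_max=0.95, steps=19):
    """Evenly spaced quantiles from q_min to q_max, both ends included."""
    return np.linspace(q_min, q_max, steps)


def rough_rescale(signal, expected_levels, quants):
    """Estimate (shift, scale) by matching signal and level quantiles.

    Assumption: a least-squares line is fitted through the quantile pairs.
    """
    signal_q = np.quantile(signal, quants)
    level_q = np.quantile(expected_levels, quants)
    scale, shift = np.polyfit(level_q, signal_q, 1)  # slope, intercept
    return shift, scale


quants = rough_quantiles()
print(np.round(quants, 2))

# Toy data: the signal is an exact affine transform of the expected levels.
levels = np.linspace(-2.0, 2.0, 200)
shift, scale = rough_rescale(3.0 * levels + 7.0, levels, quants)
print(shift, scale)
```

With the defaults, 19 steps between 0.05 and 0.95 give a spacing of 0.05, matching the default quantiles listed in the table.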