CLI Reference¶
For an up-to-date list of subcommands and CLI options refer to hictk --help.
Subcommands¶
Blazing fast tools to work with .hic and .cool files.
hictk [OPTIONS] [SUBCOMMANDS]
OPTIONS:
-h, --help Print this help message and exit
-V, --version Display program version information and exit
[Option Group: help]
[At most 1 of the following options are allowed]
OPTIONS:
--help-cite Print hictk's citation in Bibtex format and exit.
--help-build-meta Print information regarding hictk's build options and third-party
dependencies, and exit.
--help-docs Print the URL to hictk's documentation and exit.
--help-license Print the hictk license and exit.
--help-telemetry Print information regarding telemetry collection and exit.
SUBCOMMANDS:
balance Balance Hi-C files using ICE, SCALE, or VC.
convert Convert Hi-C files between different formats.
dump Read interactions and other kinds of data from .hic and Cooler
files and write them to stdout.
fix-mcool Fix corrupted .mcool files.
load Build .cool and .hic files from interactions in various text
formats.
merge Merge multiple Cooler or .hic files into a single file.
metadata Print file metadata to stdout.
rename-chromosomes, rename-chromsRename chromosomes found in Cooler files.
validate Validate .hic and Cooler files.
zoomify Convert single-resolution Cooler and .hic files to
multi-resolution by coarsening.
hictk balance¶
Balance Hi-C files using ICE, SCALE, or VC.
hictk balance [OPTIONS] SUBCOMMAND
OPTIONS:
-h, --help Print this help message and exit
SUBCOMMANDS:
ice Balance Hi-C files using ICE.
scale Balance Hi-C files using SCALE.
vc Balance Hi-C matrices using VC.
hictk balance ice¶
Balance Hi-C files using ICE.
hictk balance ice [OPTIONS] input
POSITIONALS:
input TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) REQUIRED
Path to the .hic, .cool or .mcool file to be balanced.
OPTIONS:
-h, --help Print this help message and exit
--mode TEXT:{gw,trans,cis} [gw]
Balance matrix using:
- genome-wide interactions (gw)
- trans-only interactions (trans)
- cis-only interactions (cis)
--tmpdir TEXT:DIR Path to a folder where to store temporary data.
--ignore-diags UINT [2]
Number of diagonals (including the main diagonal) to mask before
balancing.
--mad-max FLOAT:NONNEGATIVE [5]
Mask bins using the MAD-max filter.
Bins whose log marginal sum is less than --mad-max median
absolute deviations below the median log marginal sum of all the
bins in the same chromosome.
--min-nnz UINT [10]
Mask rows with fewer than --min-nnz non-zero entries.
--min-count UINT [0]
Mask rows with fewer than --min-count interactions.
--tolerance FLOAT:NONNEGATIVE [1e-05]
Threshold of the variance of marginals used to determine whether
the algorithm has converged.
--max-iters UINT:POSITIVE [500]
Maximum number of iterations.
--rescale-weights, --no-rescale-weights{false}
Rescale weights such that rows sum approximately to 2.
--name TEXT Name to use when writing weights to file.
Defaults to ICE, INTER_ICE and GW_ICE when --mode is cis, trans
and gw, respectively.
--create-weight-link, --no-create-weight-link{false}
Create a symbolic link to the balancing weights at
clr::/bins/weight.
Ignored when balancing .hic files
--in-memory Store all interactions in memory (greatly improves performance).
--stdout Write balancing weights to stdout instead of writing them to the
input file.
--chunk-size UINT:POSITIVE [10000000]
Number of interactions to process at once. Ignored when using
--in-memory.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
-t, --threads UINT:UINT in [1 - 32] [1]
Maximum number of parallel threads to spawn.
-l, --compression-lvl INT:INT in [0 - 19] [3]
Compression level used to compress temporary files using ZSTD.
-f, --force Overwrite existing files and datasets (if any).
hictk balance scale¶
Balance Hi-C files using SCALE.
hictk balance scale [OPTIONS] input
POSITIONALS:
input TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) REQUIRED
Path to the .hic, .cool or .mcool file to be balanced.
OPTIONS:
-h, --help Print this help message and exit
--mode TEXT:{gw,trans,cis} [gw]
Balance matrix using:
- genome-wide interactions (gw)
- trans-only interactions (trans)
- cis-only interactions (cis)
--tmpdir TEXT Path to a folder where to store temporary data.
--max-percentile FLOAT [10]
Percentile used to compute the maximum number of nnz values that
cause a row to be masked.
--max-row-sum-err FLOAT:NONNEGATIVE [0.05]
Row sum threshold used to determine whether convergence has been
achieved.
--tolerance FLOAT:NONNEGATIVE [0.0001]
Threshold of the variance of marginals used to determine whether
the algorithm has converged.
--max-iters UINT:POSITIVE [500]
Maximum number of iterations.
--rescale-weights, --no-rescale-weights{false}
Rescale weights such that the sum of the balanced matrix is
similar to that of the input matrix.
--name TEXT Name to use when writing weights to file.
Defaults to SCALE, INTER_SCALE and GW_SCALE when --mode is cis,
trans and gw, respectively.
--create-weight-link, --no-create-weight-link{false}
Create a symbolic link to the balancing weights at
clr::/bins/weight.
Ignored when balancing .hic files
--in-memory Store all interactions in memory (greatly improves performance).
--stdout Write balancing weights to stdout instead of writing them to the
input file.
--chunk-size UINT:POSITIVE [10000000]
Number of interactions to process at once. Ignored when using
--in-memory.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
-t, --threads UINT:UINT in [1 - 32] [1]
Maximum number of parallel threads to spawn.
-l, --compression-lvl INT:INT in [0 - 19] [3]
Compression level used to compress temporary files using ZSTD.
-f, --force Overwrite existing files and datasets (if any).
hictk balance vc¶
Balance Hi-C matrices using VC.
hictk balance vc [OPTIONS] input
POSITIONALS:
input TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) REQUIRED
Path to the .hic, .cool or .mcool file to be balanced.
OPTIONS:
-h, --help Print this help message and exit
--mode TEXT:{gw,trans,cis} [gw]
Balance matrix using:
- genome-wide interactions (gw)
- trans-only interactions (trans)
- cis-only interactions (cis)
--rescale-weights, --no-rescale-weights{false}
Rescale weights such that the sum of the balanced matrix is
similar to that of the input matrix.
--name TEXT Name to use when writing weights to file.
Defaults to VC, INTER_VC and GW_VC when --mode is cis, trans and
gw, respectively.
--create-weight-link, --no-create-weight-link{false}
Create a symbolic link to the balancing weights at
clr::/bins/weight.
Ignored when balancing .hic files
--stdout Write balancing weights to stdout instead of writing them to the
input file.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
-f, --force Overwrite existing files and datasets (if any).
hictk convert¶
Convert Hi-C files between different formats.
hictk convert [OPTIONS] input output
POSITIONALS:
input TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) REQUIRED
Path to the .hic, .cool or .mcool file to be converted.
output TEXT REQUIRED Output path. File extension is used to infer output format.
OPTIONS:
-h, --help Print this help message and exit
--output-fmt TEXT:{cool,mcool,hic} [auto]
Output format (by default this is inferred from the output file
extension).
Should be one of:
- cool
- mcool
- hic
-r, --resolutions UINT:POSITIVE ...
One or more resolutions to be converted. By default all
resolutions are converted.
--normalization-methods TEXT [ALL] ...
Name of one or more normalization methods to be copied.
By default, vectors for all known normalization methods are
copied.
Pass NONE to avoid copying the normalization vectors.
--fail-if-norm-not-found
Fail if any of the requested normalization vectors are missing.
-g, --genome TEXT Genome assembly name. By default this is copied from the .hic
file metadata.
--tmpdir TEXT:DIR Path where to store temporary files.
--chunk-size UINT:POSITIVE [10000000]
Batch size to use when converting .[m]cool to .hic.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
-t, --threads UINT:UINT in [2 - 32] [2]
Maximum number of parallel threads to spawn.
When converting from hic to cool, only two threads will be used.
-l, --compression-lvl UINT:INT in [1 - 12] [6]
Compression level used to compress interactions.
Defaults to 6 and 10 for .cool and .hic files, respectively.
--skip-all-vs-all, --no-skip-all-vs-all{false}
Do not generate All vs All matrix.
Has no effect when creating .[m]cool files.
--count-type TEXT:{auto,int,float} [auto]
Specify the strategy used to infer count types when converting
.hic files to .[m]cool format.
Can be one of: int, float, or auto.
-f, --force Overwrite existing files (if any).
hictk dump¶
Read interactions and other kinds of data from .hic and Cooler files and write
them to stdout.
hictk dump [OPTIONS] uri
POSITIONALS:
uri TEXT:(.[ms]cool) OR (.hic) REQUIRED
Path to a .hic, .cool or .mcool file (Cooler URI syntax
supported).
OPTIONS:
-h, --help Print this help message and exit
--resolution UINT:NONNEGATIVE
HiC matrix resolution (ignored when file is in .cool format).
--matrix-type ENUM:{observed,oe,expected} [observed]
Matrix type (ignored when file is not in .hic format).
--matrix-unit ENUM:{BP,FRAG} [BP]
Matrix unit (ignored when file is not in .hic format).
-t, --table TEXT:{chroms,bins,pixels,normalizations,resolutions,cells,weights} [pixels]
Name of the table to dump.
-r, --range TEXT [all] Excludes: --query-file --cis-only --trans-only
Coordinates of the genomic regions to be dumped following UCSC
style notation (chr1:0-1000).
--range2 TEXT [all] Needs: --range Excludes: --query-file --cis-only --trans-only
Coordinates of the genomic regions to be dumped following UCSC
style notation (chr1:0-1000).
--query-file TEXT:(FILE) OR ({-}) Excludes: --range --range2 --cis-only --trans-only
Path to a BEDPE file with the list of coordinates to be fetched
(pass - to read queries from stdin).
--cis-only Excludes: --range --range2 --query-file --trans-only
Dump intra-chromosomal interactions only.
--trans-only Excludes: --range --range2 --query-file --cis-only
Dump inter-chromosomal interactions only.
-b, --balance TEXT [NONE]
Balance interactions using the given method.
--sorted, --unsorted{false}
Return interactions in ascending order.
--join, --no-join{false}
Output pixels in BG2 format.
hictk fix-mcool¶
Fix corrupted .mcool files.
hictk fix-mcool [OPTIONS] input output
POSITIONALS:
input TEXT:.mcool REQUIRED Path to a corrupted .mcool file.
output TEXT REQUIRED Path where to store the restored .mcool.
OPTIONS:
-h, --help Print this help message and exit
--tmpdir TEXT:DIR Path to a folder where to store temporary data.
--skip-balancing Do not recompute or copy balancing weights.
--check-base-resolution
Check whether the base resolution is corrupted.
--in-memory Store all interactions in memory while balancing (greatly
improves performance).
--chunk-size UINT:POSITIVE [10000000]
Number of interactions to process at once during balancing.
Ignored when using --in-memory.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
-t, --threads UINT:UINT in [1 - 32] [1]
Maximum number of parallel threads to spawn (only applies to the
balancing stage).
-l, --compression-lvl INT:INT in [0 - 19] [3]
Compression level used to compress temporary files using ZSTD
(only applies to the balancing stage).
-f, --force Overwrite existing files (if any).
hictk load¶
Build .cool and .hic files from interactions in various text formats.
hictk load [OPTIONS] interactions output-path
POSITIONALS:
interactions TEXT:(FILE) OR ({-}) REQUIRED
Path to a file with the interactions to be loaded.
Common compression formats are supported (namely, bzip2, gzip,
lz4, lzo, xz, and zstd).
Pass "-" to indicate that interactions should be read from stdin.
output-path TEXT REQUIRED Path to output file.
File extension will be used to infer the output format.
This behavior can be overridden by explicitly specifying an
output format through option --output-fmt.
OPTIONS:
-h, --help Print this help message and exit
-c, --chrom-sizes TEXT:FILE Excludes: --bin-table
Path to .chrom.sizes file.
Required when interactions are not in 4DN pairs format.
-b, --bin-size UINT:POSITIVE Excludes: --bin-table
Bin size (bp).
Required when --bin-table is not used.
--bin-table TEXT:FILE Excludes: --chrom-sizes --bin-size
Path to a BED3+ file with the bin table.
-f, --format TEXT:{4dn,validpairs,bg2,coo} REQUIRED
Input format.
--output-fmt TEXT:{auto,cool,hic} [auto]
Output format (by default this is inferred from the output file
extension).
Should be one of:
- auto
- cool
- hic
--force Force overwrite existing output file(s).
--assembly TEXT [unknown]
Assembly name.
--drop-unknown-chroms
Ignore records referencing unknown chromosomes.
--one-based, --zero-based{false}
Interpret genomic coordinates or bins as one/zero based.
By default coordinates are assumed to be one-based for
interactions in 4dn and validpairs formats and zero-based
otherwise.
--count-as-float Interactions are floats.
--skip-all-vs-all, --no-skip-all-vs-all{false}
Do not generate All vs All matrix.
Has no effect when creating .cool files.
--assume-sorted, --assume-unsorted{false}
Assume input files are already sorted.
--validate-pixels, --no-validate-pixels{false}
Toggle pixel validation on or off.
When --no-validate-pixels is used and invalid pixels are
encountered, hictk will either crash or produce invalid files.
--transpose-lower-triangular-pixels, --no-transpose-lower-triangular-pixels{false}
Transpose pixels overlapping the lower-triangular matrix.
When --no-transpose-lower-triangular-pixels is used and one or
more pixels overlapping with the lower triangular matrix are
encountered an exception will be raised.
--chunk-size UINT [10000000]
Number of pixels to buffer in memory.
-l, --compression-lvl UINT:INT bounded to [1 - 12]
Compression level used to compress interactions.
Defaults to 6 and 10 for .cool and .hic files, respectively.
-t, --threads UINT:UINT in [2 - 32] [2]
Maximum number of parallel threads to spawn.
When loading interactions in a .cool file, only up to two threads
will be used.
--tmpdir TEXT:DIR Path to a folder where to store temporary data.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
hictk merge¶
Merge multiple Cooler or .hic files into a single file.
hictk merge [OPTIONS] input-files...
POSITIONALS:
input-files TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) x 2 REQUIRED
Path to two or more Cooler or .hic files to be merged (Cooler URI
syntax supported).
OPTIONS:
-h, --help Print this help message and exit
-o, --output-file TEXT REQUIRED
Output Cooler or .hic file (Cooler URI syntax supported).
--output-fmt TEXT:{cool,hic} [auto]
Output format (by default this is inferred from the output file
extension).
Should be one of:
- cool
- hic
--resolution UINT:NONNEGATIVE
Hi-C matrix resolution (ignored when input files are in .cool
format).
-f, --force Force overwrite output file.
--chunk-size UINT [10000000]
Number of pixels to store in memory before writing to disk.
-l, --compression-lvl UINT:INT bounded to [1 - 12]
Compression level used to compress interactions.
Defaults to 6 and 10 for .cool and .hic files, respectively.
-t, --threads UINT:UINT in [1 - 32] [1]
Maximum number of parallel threads to spawn.
When merging interactions in Cooler format, only a single thread
will be used.
--tmpdir TEXT:DIR Path to a folder where to store temporary data.
--skip-all-vs-all, --no-skip-all-vs-all{false}
Do not generate All vs All matrix.
Has no effect when merging .cool files.
--count-type TEXT:{int,float} [int]
Specify the count type to be used when merging files.
Ignored when the output file is in .hic format.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
hictk metadata¶
Print file metadata to stdout.
hictk metadata [OPTIONS] uri
POSITIONALS:
uri TEXT:(.[ms]cool) OR (.hic) REQUIRED
Path to a .hic or .[ms]cool file (Cooler URI syntax supported).
OPTIONS:
-h, --help Print this help message and exit
-f, --output-format TEXT:{json,toml,yaml} [json]
Format used to return file metadata.
Should be one of: json, toml, or yaml.
--include-file-path, --exclude-file-path{false}
Output the given input path using attribute "uri".
--recursive Print metadata for each resolution or cell contained in a
multi-resolution or single-cell file.
hictk rename-chromosomes¶
Rename chromosomes found in Cooler files.
hictk rename-chromosomes [OPTIONS] uri
POSITIONALS:
uri TEXT:.[ms]cool REQUIRED Path to a .[ms]cool file (Cooler URI syntax supported).
OPTIONS:
-h, --help Print this help message and exit
--name-mappings TEXT Excludes: --add-chr-prefix --remove-chr-prefix
Path to a two column TSV with pairs of chromosomes to be renamed.
The first column should contain the original chromosome name,
while the second column should contain the destination name to
use when renaming.
--add-chr-prefix Excludes: --name-mappings --remove-chr-prefix
Prefix chromosome names with "chr".
--remove-chr-prefix Excludes: --name-mappings --add-chr-prefix
Remove prefix "chr" from chromosome names.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.
hictk validate¶
Validate .hic and Cooler files.
hictk validate [OPTIONS] uri
POSITIONALS:
uri TEXT REQUIRED Path to a .hic or .[ms]cool file (Cooler URI syntax supported).
OPTIONS:
-h, --help Print this help message and exit
--validate-index Validate Cooler index (may take a long time).
--validate-pixels Validate pixels found in Cooler files (may take a long time).
-f, --output-format TEXT:{json,toml,yaml} [json]
Format used to report the outcome of file validation.
Should be one of: json, toml, or yaml.
--include-file-path, --exclude-file-path{false}
Output the given input path using attribute "uri".
--exhaustive, --fail-fast{false}
When processing multi-resolution or single-cell files, do not
fail as soon as the first error is detected.
--quiet Don't print anything to stdout. Success/failure is reported
through exit codes.
hictk zoomify¶
Convert single-resolution Cooler and .hic files to multi-resolution by
coarsening.
hictk zoomify [OPTIONS] cooler/hic [m]cool/hic
POSITIONALS:
cooler/hic TEXT:((.[ms]cool) OR (.hic)) AND (NOT .scool) REQUIRED
Path to a .cool or .hic file (Cooler URI syntax supported).
[m]cool/hic TEXT REQUIRED Output path.
When zoomifying Cooler files, providing a single resolution
through --resolutions and specifying --no-copy-base-resolution,
the output file will be in .cool format.
OPTIONS:
-h, --help Print this help message and exit
--force Force overwrite existing output file(s).
--resolutions UINT:POSITIVE ...
One or more resolutions to be used for coarsening.
--copy-base-resolution, --no-copy-base-resolution{false}
Copy the base resolution to the output file.
--nice-steps, --pow2-steps{false} [--nice-steps]
Use nice or power of two steps to automatically generate the list
of resolutions.
Example:
Base resolution: 1000
Pow2: 1000, 2000, 4000, 8000...
Nice: 1000, 2000, 5000, 10000...
-l, --compression-lvl UINT:INT bounded to [1 - 12] [6]
Compression level used to compress interactions.
Defaults to 6 and 10 for .mcool and .hic files, respectively.
-t, --threads UINT:UINT in [1 - 32] [1]
Maximum number of parallel threads to spawn.
When zoomifying interactions from a .cool file, only a single
thread will be used.
--chunk-size UINT [10000000]
Number of pixels to buffer in memory.
Only used when zoomifying .hic files.
--skip-all-vs-all, --no-skip-all-vs-all{false}
Do not generate All vs All matrix.
Has no effect when zoomifying .cool files.
--tmpdir TEXT:DIR Path to a folder where to store temporary data.
-v, --verbosity INT:INT in [1 - 4] [3]
Set verbosity of output to the console.