CLI Reference#

For an up-to-date list of subcommands and CLI options refer to hictk –help.

Subcommands#

Blazing fast tools to work with .hic and .cool files.
Usage: hictk [OPTIONS] SUBCOMMAND
Options:
  -h,--help                   Print this help message and exit
  -V,--version                Display program version information and exit
Subcommands:
  balance                     Balance Hi-C matrices using ICE, SCALE, or VC.
  convert                     Convert HiC matrices to a different format.
  dump                        Dump data from .hic and Cooler files to stdout.
  fix-mcool                   Fix corrupted .mcool files.
  load                        Build .cool and .hic files from interactions in various text formats.
  merge                       Merge multiple Cooler or .hic files into a single file.
  rename-chromosomes, rename-chroms
                              Rename chromosomes found in a Cooler file.
  validate                    Validate .hic and Cooler files.
  zoomify                     Convert single-resolution Cooler and .hic files to multi-resolution by coarsening.

hictk balance#

Balance Hi-C matrices using ICE, SCALE, or VC.
Usage: hictk balance [OPTIONS] [SUBCOMMAND]
Options:
  -h,--help                   Print this help message and exit
Subcommands:
  ice                         Balance Hi-C matrices using ICE.
  scale                       Balance Hi-C matrices using SCALE.
  vc                          Balance Hi-C matrices using VC.

hictk balance ice#

Balance Hi-C matrices using ICE.
Usage: hictk balance ice [OPTIONS] input
Positionals:
  input TEXT:((HiC) OR (Cooler)) OR (Multires-cooler) REQUIRED
                              Path to the .hic, .cool or .mcool file to be balanced.
Options:
  -h,--help                   Print this help message and exit
  --mode TEXT:{gw,trans,cis} [gw]
                              Balance matrix using:
                               - genome-wide interactions (gw)
                               - trans-only interactions (trans)
                               - cis-only interactions (cis)
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  --ignore-diags UINT [2]     Number of diagonals (including the main diagonal) to mask before balancing.
  --mad-max FLOAT:NONNEGATIVE [5]
                              Mask bins using the MAD-max filter.
                              bins whose log marginal sum is less than --mad-max median
                              absolute deviations below the median log marginal sum of
                              all the bins in the same chromosome.
  --min-nnz UINT [10]         Mask rows with fewer than --min-nnz non-zero entries.
  --min-count UINT [0]        Mask rows with fewer than --min-count interactions.
  --tolerance FLOAT:NONNEGATIVE [1e-05]
                              Threshold of the variance of marginals used to determine whether
                              the algorithm has converged.
  --max-iters UINT:POSITIVE [500]
                              Maximum number of iterations.
  --rescale-weights,--no-rescale-weights{false}
                              Rescale weights such that rows sum approximately to 2.
  --name TEXT                 Name to use when writing weights to file.
                              Defaults to ICE, INTER_ICE and GW_ICE when --mode is cis, trans and gw, respectively.
  --create-weight-link        Create a symbolic link to the balancing weights at clr::/bins/weight.
                              Ignored when balancing .hic files
  --in-memory                 Store all interactions in memory (greatly improves performance).
  --stdout                    Write balancing weights to stdout instead of writing them to the input file.
  --chunk-size UINT:POSITIVE [10000000]
                              Number of interactions to process at once. Ignored when using --in-memory.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn.
  -l,--compression-lvl UINT:INT in [0 - 19] []
                              Compression level used to compress temporary files using ZSTD.
  -f,--force                  Overwrite existing files and datasets (if any).

hictk balance scale#

Balance Hi-C matrices using SCALE.
Usage: hictk balance scale [OPTIONS] input
Positionals:
  input TEXT:((HiC) OR (Cooler)) OR (Multires-cooler) REQUIRED
                              Path to the .hic, .cool or .mcool file to be balanced.
Options:
  -h,--help                   Print this help message and exit
  --mode TEXT:{gw,trans,cis} [gw]
                              Balance matrix using:
                               - genome-wide interactions (gw)
                               - trans-only interactions (trans)
                               - cis-only interactions (cis)
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  --max-percentile FLOAT [10]
                              Percentile used to compute the maximum number of nnz values that cause a row to be masked.
  --max-row-sum-err FLOAT:NONNEGATIVE [0.05]
                              Row sum threshold used to determine whether convergence has been achieved.
  --tolerance FLOAT:NONNEGATIVE [1e-05]
                              Threshold of the variance of marginals used to determine whether
                              the algorithm has converged.
  --max-iters UINT:POSITIVE [500]
                              Maximum number of iterations.
  --rescale-weights,--no-rescale-weights{false}
                              Rescale weights such that the sum of the balanced matrix is similar
                              to that of the input matrix.
  --name TEXT                 Name to use when writing weights to file.
                              Defaults to SCALE, INTER_SCALE and GW_SCALE when --mode is cis, trans and gw, respectively.
  --create-weight-link        Create a symbolic link to the balancing weights at clr::/bins/weight.
                              Ignored when balancing .hic files
  --in-memory                 Store all interactions in memory (greatly improves performance).
  --stdout                    Write balancing weights to stdout instead of writing them to the input file.
  --chunk-size UINT:POSITIVE [10000000]
                              Number of interactions to process at once. Ignored when using --in-memory.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn.
  -l,--compression-lvl UINT:INT in [0 - 19] []
                              Compression level used to compress temporary files using ZSTD.
  -f,--force                  Overwrite existing files and datasets (if any).

hictk balance vc#

Balance Hi-C matrices using VC.
Usage: hictk balance vc [OPTIONS] input
Positionals:
  input TEXT:((HiC) OR (Cooler)) OR (Multires-cooler) REQUIRED
                              Path to the .hic, .cool or .mcool file to be balanced.
Options:
  -h,--help                   Print this help message and exit
  --mode TEXT:{gw,trans,cis} [gw]
                              Balance matrix using:
                               - genome-wide interactions (gw)
                               - trans-only interactions (trans)
                               - cis-only interactions (cis)
  --rescale-weights,--no-rescale-weights{false}
                              Rescale weights such that the sum of the balanced matrix is similar
                              to that of the input matrix.
  --name TEXT                 Name to use when writing weights to file.
                              Defaults to VC, INTER_VC and GW_VC when --mode is cis, trans and gw, respectively.
  --create-weight-link        Create a symbolic link to the balancing weights at clr::/bins/weight.
                              Ignored when balancing .hic files
  --stdout                    Write balancing weights to stdout instead of writing them to the input file.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.
  -f,--force                  Overwrite existing files and datasets (if any).

hictk convert#

Convert HiC matrices to a different format.
Usage: hictk convert [OPTIONS] input output
Positionals:
  input TEXT:((HiC) OR (Cooler)) OR (Multires-cooler) REQUIRED
                              Path to the .hic, .cool or .mcool file to be converted.
  output TEXT REQUIRED        Output path. File extension is used to infer output format.
Options:
  -h,--help                   Print this help message and exit
  --output-fmt TEXT:{cool,mcool,hic} [auto]
                              Output format (by default this is inferred from the output file extension).
                              Should be one of:
                              - cool
                              - mcool
                              - hic
  -r,--resolutions UINT:POSITIVE ...
                              One or more resolutions to be converted. By default all resolutions are converted.
  --normalization-methods TEXT [ALL]  ...
                              Name of one or more normalization methods to be copied.
                              By default, vectors for all known normalization methods are copied.
                              Pass NONE to avoid copying normalization vectors.
  --fail-if-norm-not-found    Fail if any of the requested normalization vectors are missing.
  -g,--genome TEXT            Genome assembly name. By default this is copied from the .hic file metadata.
  --tmpdir TEXT               Path where to store temporary files.
  --chunk-size UINT:POSITIVE [10000000]
                              Batch size to use when converting .[m]cool to .hic.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.
  -t,--threads UINT:UINT in [2 - 16] [2]
                              Maximum number of parallel threads to spawn.
                              When converting from hic to cool, only two threads will be used.
  -l,--compression-lvl UINT:INT in [1 - 12] [6]
                              Compression level used to compress interactions.
                              Defaults to 6 and 10 for .cool and .hic files, respectively.
  --skip-all-vs-all,--no-skip-all-vs-all{false}
                              Do not generate All vs All matrix.
                              Has no effect when creating .[m]cool files.
  -f,--force                  Overwrite existing files (if any).

hictk dump#

Dump data from .hic and Cooler files to stdout.
Usage: hictk dump [OPTIONS] uri
Positionals:
  uri TEXT:(((HiC) OR (Cooler)) OR (Multires-cooler)) OR (Single-cell-cooler) REQUIRED
                              Path to a .hic, .cool or .mcool file (Cooler URI syntax supported).
Options:
  -h,--help                   Print this help message and exit
  --resolution UINT:NONNEGATIVE
                              HiC matrix resolution (ignored when file is in .cool format).
  --matrix-type ENUM:value in {expected->2,observed->0,oe->1} OR {2,0,1} [observed]
                              Matrix type (ignored when file is not in .hic format).
  --matrix-unit ENUM:value in {BP->0,FRAG->1} OR {0,1} [BP]
                              Matrix unit (ignored when file is not in .hic format).
  -t,--table TEXT:{chroms,bins,pixels,normalizations,resolutions,cells,weights} [pixels]
                              Name of the table to dump.
  -r,--range TEXT [all]  Excludes: --query-file --cis-only --trans-only
                              Coordinates of the genomic regions to be dumped following UCSC-style notation (chr1:0-1000).
  --range2 TEXT [all]  Needs: --range Excludes: --query-file --cis-only --trans-only
                              Coordinates of the genomic regions to be dumped following UCSC-style notation (chr1:0-1000).
  --query-file TEXT:(FILE) OR ({-}) Excludes: --range --range2 --cis-only --trans-only
                              Path to a BEDPE file with the list of coordinates to be fetched (pass - to read queries from stdin).
  --cis-only Excludes: --range --range2 --query-file --trans-only
                              Dump intra-chromosomal interactions only.
  --trans-only Excludes: --range --range2 --query-file --cis-only
                              Dump inter-chromosomal interactions only.
  -b,--balance TEXT [NONE]    Balance interactions using the given method.
  --sorted,--unsorted{false}  Return interactions in ascending order.
  --join,--no-join{false}     Output pixels in BG2 format.

hictk fix-mcool#

Fix corrupted .mcool files.
Usage: hictk fix-mcool [OPTIONS] input output
Positionals:
  input TEXT:Multires-cooler REQUIRED
                              Path to a corrupted .mcool file.
  output TEXT REQUIRED        Path where to store the restored .mcool.
Options:
  -h,--help                   Print this help message and exit
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  --skip-balancing            Do not recompute or copy balancing weights.
  --check-base-resolution     Check whether the base resolution is corrupted.
  --in-memory                 Store all interactions in memory while balancing (greatly improves performance).
  --chunk-size UINT:POSITIVE [10000000]
                              Number of interactions to process at once during balancing.
                              Ignored when using --in-memory.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn (only applies to the balancing stage).
  -l,--compression-lvl UINT:INT in [0 - 19] []
                              Compression level used to compress temporary files using ZSTD (only applies to the balancing stage).
  -f,--force                  Overwrite existing files (if any).

hictk load#

Build .cool and .hic files from interactions in various text formats.
Usage: hictk load [OPTIONS] chrom-sizes output-path
Positionals:
  chrom-sizes TEXT:FILE REQUIRED
                              Path to .chrom.sizes file.
  output-path TEXT REQUIRED   Path to output file.
Options:
  -h,--help                   Print this help message and exit
  -b,--bin-size UINT:POSITIVE Excludes: --bin-table
                              Bin size (bp).
                              Required when --bin-table is not used.
  --bin-table TEXT:FILE Excludes: --bin-size
                              Path to a BED3+ file with the bin table.
  -f,--format TEXT:{4dn,validpairs,bg2,coo} REQUIRED
                              Input format.
  --force                     Force overwrite existing output file(s).
  --assembly TEXT [unknown]   Assembly name.
  --one-based,--zero-based{false}
                              Interpret genomic coordinates or bins as one/zero based.
                              By default coordinates are assumed to be one-based for interactions in
                              4dn and validapairs formats and zero-based otherwise.
  --count-as-float            Interactions are floats.
  --skip-all-vs-all,--no-skip-all-vs-all{false}
                              Do not generate All vs All matrix.
                              Has no effect when creating .cool files.
  --assume-sorted,--assume-unsorted{false}
                              Assume input files are already sorted.
  --chunk-size UINT [10000000]
                              Number of pixels to buffer in memory.
  -l,--compression-lvl UINT:INT bounded to [1 - 12]
                              Compression level used to compress interactions.
                              Defaults to 6 and 10 for .cool and .hic files, respectively.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn.
                              When loading interactions in a .cool file, only a single thread will be used.
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.

hictk merge#

Merge multiple Cooler or .hic files into a single file.
Usage: hictk merge [OPTIONS] input-files...
Positionals:
  input-files TEXT:(Cooler) OR (HiC) x 2 REQUIRED
                              Path to two or more Cooler or .hic files to be merged (Cooler URI syntax supported).
Options:
  -h,--help                   Print this help message and exit
  -o,--output-file TEXT REQUIRED
                              Output Cooler or .hic file (Cooler URI syntax supported).
  --resolution UINT:NONNEGATIVE
                              HiC matrix resolution (ignored when input files are in .cool format).
  -f,--force                  Force overwrite output file.
  --chunk-size UINT [10000000]
                              Number of pixels to store in memory before writing to disk.
  -l,--compression-lvl UINT:INT bounded to [1 - 12]
                              Compression level used to compress interactions.
                              Defaults to 6 and 10 for .cool and .hic files, respectively.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn.
                              When merging interactions in Cooler format, only a single thread will be used.
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  --skip-all-vs-all,--no-skip-all-vs-all{false}
                              Do not generate All vs All matrix.
                              Has no effect when merging .cool files.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.

hictk rename-chromosomes#

Rename chromosomes found in a Cooler file.
Usage: hictk rename-chromosomes [OPTIONS] uri
Positionals:
  uri TEXT REQUIRED           Path to a or .[ms]cool file (Cooler URI syntax supported).
Options:
  -h,--help                   Print this help message and exit
  --name-mappings TEXT Excludes: --add-chr-prefix --remove-chr-prefix
                              Path to a two column TSV with pairs of chromosomes to be renamed.
                              The first column should contain the original chromosome name,
                              while the second column should contain the destination name to use when renaming.
  --add-chr-prefix Excludes: --name-mappings --remove-chr-prefix
                              Prefix chromosome names with "chr".
  --remove-chr-prefix Excludes: --name-mappings --add-chr-prefix
                              Remove prefix "chr" from chromosome names.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.

hictk validate#

Validate .hic and Cooler files.
Usage: hictk validate [OPTIONS] uri
Positionals:
  uri TEXT REQUIRED           Path to a .hic or .[ms]cool file (Cooler URI syntax supported).
Options:
  -h,--help                   Print this help message and exit
  --validate-index            Validate Cooler index (may take a long time).
  --quiet                     Don't print anything to stdout. Success/failure is reported through exit codes

hictk zoomify#

Convert single-resolution Cooler and .hic files to multi-resolution by coarsening.
Usage: hictk zoomify [OPTIONS] cooler/hic mcool/hic
Positionals:
  cooler/hic TEXT:(Cooler) OR (HiC) REQUIRED
                              Path to a .cool or .hic file (Cooler URI syntax supported).
  mcool/hic TEXT REQUIRED     Output path.
Options:
  -h,--help                   Print this help message and exit
  --force                     Force overwrite existing output file(s).
  --resolutions UINT ...      One or more resolutions to be used for coarsening.
  --copy-base-resolution,--no-copy-base-resolution{false}
                              Copy the base resolution to the output file.
  --nice-steps,--pow2-steps{false} [--nice-steps]
                              Use nice or power of two steps to automatically generate the list of resolutions.
                              Example:
                              Base resolution: 1000
                              Pow2: 1000, 2000, 4000, 8000...
                              Nice: 1000, 2000, 5000, 10000...
  -l,--compression-lvl UINT:INT bounded to [1 - 12] [6]
                              Compression level used to compress interactions.
                              Defaults to 6 and 10 for .mcool and .hic files, respectively.
  -t,--threads UINT:UINT in [1 - 16] [1]
                              Maximum number of parallel threads to spawn.
                              When zoomifying interactions from a .cool file, only a single thread will be used.
  --chunk-size UINT [10000000]
                              Number of pixels to buffer in memory.
                              Only used when zoomifying .hic files.
  --skip-all-vs-all,--no-skip-all-vs-all{false}
                              Do not generate All vs All matrix.
                              Has no effect when zoomifying .cool files.
  --tmpdir TEXT [/tmp]        Path to a folder where to store temporary data.
  -v,--verbosity UINT:INT in [1 - 4] []
                              Set verbosity of output to the console.