Merge pull request #183 from paulsengroup/docs/update

Update docs
paulsengroup · Jun 14, 2024 · 0d3b21d
2 parents 88e78df + 4e6307a
commit 0d3b21d
Showing 8 changed files with 173 additions and 11 deletions.
diff --git a/docs/balancing_matrices.rst b/docs/balancing_matrices.rst
@@ -8,6 +8,7 @@ Balancing Hi-C matrices
 ``hictk`` supports balancing .hic, .cool and .mcool files using ICE (iterative correction and eigenvector decomposition), SCALE and VC:
 
 .. code-block:: console
+
   user@dev:/tmp$ hictk balance --help
   Balance Hi-C matrices using ICE, SCALE, or VC.
   Usage: hictk balance [OPTIONS] [SUBCOMMAND]

diff --git a/docs/cli_reference.rst b/docs/cli_reference.rst
@@ -114,7 +114,7 @@ hictk balance scale
                                 Percentile used to compute the maximum number of nnz values that cause a row to be masked.
     --max-row-sum-err FLOAT:NONNEGATIVE [0.05]
                                 Row sum threshold used to determine whether convergence has been achieved.
-    --tolerance FLOAT:NONNEGATIVE [1e-05]
+    --tolerance FLOAT:NONNEGATIVE [0.0001]
                                 Threshold of the variance of marginals used to determine whether
                                 the algorithm has converged.
     --max-iters UINT:POSITIVE [500]

diff --git a/docs/cpp_api/cooler.rst b/docs/cpp_api/cooler.rst
@@ -242,8 +242,6 @@ Pixel selector
   **Fetch at once**
 
   .. cpp:function:: template <typename N> [[nodiscard]] std::vector<Pixel<N>> read_all() const;
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::SparseMatrix<N> read_sparse() const;
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::Matrix<N, Eigen::Dynamic, Eigen::Dynamic> read_dense() const;
 
   **Accessors**
 

diff --git a/docs/cpp_api/generic.rst b/docs/cpp_api/generic.rst
@@ -199,8 +199,6 @@ Pixel selector
   **Fetch at once**
 
   .. cpp:function:: template <typename N> [[nodiscard]] std::vector<Pixel<N>> read_all() const;
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::SparseMatrix<N> read_sparse() const;
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::Matrix<N, Eigen::Dynamic, Eigen::Dynamic> read_dense() const;
 
   Read and return all :cpp:class:`Pixel`\s at once using a :cpp:class:`std::vector`.
 

diff --git a/docs/cpp_api/hic.rst b/docs/cpp_api/hic.rst
@@ -126,9 +126,6 @@ Pixel selector
 
   .. cpp:function:: template <typename N> [[nodiscard]] std::vector<Pixel<N>> read_all() const;
 
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::SparseMatrix<N> read_sparse() const;
-  .. cpp:function:: template <typename N> [[nodiscard]] Eigen::Matrix<N, Eigen::Dynamic, Eigen::Dynamic> read_dense() const;
-
   **Accessors**
 
   .. cpp:function:: [[nodiscard]] const PixelCoordinates &coord1() const noexcept;

diff --git a/docs/cpp_api/index.rst b/docs/cpp_api/index.rst
@@ -14,3 +14,4 @@ hictk C++ API is structured as follows:
    cooler
    hic
    shared
+   transformers
diff --git a/docs/cpp_api/transformers.rst b/docs/cpp_api/transformers.rst
@@ -0,0 +1,167 @@
+..
+   Copyright (C) 2024 Roberto Rossini <roberros@uio.no>
+   SPDX-License-Identifier: MIT
+
+.. cpp:namespace:: hictk
+
+Pixel transformers
+==================
+
+The transformer library provides a set of common algorithms used to manipulate streams of pixels.
+Classes defined in this library take a pair of pixel iterators or :cpp:class:`PixelSelector`\s directly and transform and/or aggregate them in different ways.
+
+.. cpp:namespace:: hictk::transformers
+
+Coarsening pixels
+-----------------
+
+.. cpp:class:: template <typename PixelIt> CoarsenPixels
+
+  Class used to coarsen pixels read from a pair of pixel iterators.
+  Coarsening is performed in a streaming fashion, minimizing the number of pixels that are kept in memory at any given time.
+
+  .. cpp:function:: CoarsenPixels(PixelIt first_pixel, PixelIt last_pixel,  std::shared_ptr<const BinTable> source_bins, std::size_t factor);
+
+   Constructor for :cpp:class:`CoarsenPixels` class.
+   ``first_pixel`` and ``last_pixel`` should be a pair of iterators pointing to the stream of pixels to be coarsened.
+   ``source_bins`` is a shared pointer to the bin table to which ``first_pixel`` and ``last_pixel`` refer.
+   ``factor`` should be an integer value greater than 1, and is used to determine the properties of the ``target_bins`` :cpp:class:`BinTable` used for coarsening.
+
+  **Accessors**
+
+  .. cpp:function:: [[nodiscard]] const BinTable &src_bins() const noexcept;
+  .. cpp:function:: [[nodiscard]] const BinTable &dest_bins() const noexcept;
+  .. cpp:function:: [[nodiscard]] std::shared_ptr<const BinTable> src_bins_ptr() const noexcept;
+  .. cpp:function:: [[nodiscard]] std::shared_ptr<const BinTable> dest_bins_ptr() const noexcept;
+
+  :cpp:class:`BinTable` accessors.
+
+  **Iteration**
+
+  .. cpp:function:: begin() const -> iterator;
+  .. cpp:function:: end() const -> iterator;
+  .. cpp:function:: cbegin() const -> iterator;
+  .. cpp:function:: cend() const -> iterator;
+
+  Return an `InputIterator <https://en.cppreference.com/w/cpp/named_req/InputIterator>`_ to traverse the coarsened pixels.
+
+  **Others**
+
+  .. cpp:function:: [[nodiscard]] auto read_all() const -> std::vector<ThinPixel<N>>;
+
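The coarsening step described above can be illustrated with a small standalone sketch. It does not use hictk: ``CooPixel`` and ``coarsen`` are hypothetical names, the code assumes a single chromosome with fixed-size bins so that bin ids can be divided directly by ``factor``, and it accumulates everything in a map instead of streaming as the documented class does.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COO pixel; this is not hictk's ThinPixel.
struct CooPixel {
  std::uint64_t bin1_id;
  std::uint64_t bin2_id;
  std::int64_t count;
};

// Coarsen pixels by an integer factor: every source bin id is mapped to
// id / factor, and counts landing on the same coarse bin pair are summed.
// Assumption: one chromosome with fixed-size bins.
inline std::vector<CooPixel> coarsen(const std::vector<CooPixel>& pixels,
                                     std::uint64_t factor) {
  assert(factor > 1);
  // The ordered map keeps the output sorted by (bin1_id, bin2_id).
  std::map<std::pair<std::uint64_t, std::uint64_t>, std::int64_t> acc;
  for (const auto& p : pixels) {
    acc[{p.bin1_id / factor, p.bin2_id / factor}] += p.count;
  }

  std::vector<CooPixel> out;
  out.reserve(acc.size());
  for (const auto& [key, count] : acc) {
    out.push_back({key.first, key.second, count});
  }
  return out;
}
```

With ``factor = 2``, source bins 0 and 1 collapse into coarse bin 0, so pixels (0, 0), (0, 1) and (1, 1) all contribute to the coarse pixel (0, 0).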
+Transforming COO pixels to BG2 pixels
+-------------------------------------
+
+.. cpp:class:: template <typename PixelIt> JoinGenomicCoords
+
+  Class used to join genomic coordinates onto COO pixels, effectively transforming :cpp:class:`ThinPixel`\s into :cpp:class:`Pixel`\s.
+
+  .. cpp:function:: JoinGenomicCoords(PixelIt first_pixel, PixelIt last_pixel,  std::shared_ptr<const BinTable> bins);
+
+   Constructor for :cpp:class:`JoinGenomicCoords` class.
+   ``first_pixel`` and ``last_pixel`` should be a pair of iterators pointing to the stream of pixels to be processed.
+   ``bins`` is a shared pointer to the bin table to which ``first_pixel`` and ``last_pixel`` refer.
+
+  **Iteration**
+
+  .. cpp:function:: begin() const -> iterator;
+  .. cpp:function:: end() const -> iterator;
+  .. cpp:function:: cbegin() const -> iterator;
+  .. cpp:function:: cend() const -> iterator;
+
+  Return an `InputIterator <https://en.cppreference.com/w/cpp/named_req/InputIterator>`_ to traverse the :cpp:class:`Pixel`\s.
+
+  **Others**
+
+  .. cpp:function:: [[nodiscard]] auto read_all() const -> std::vector<Pixel<N>>;
+
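As a rough illustration of what joining genomic coordinates onto COO pixels means, the sketch below looks up each bin id in a flat bin list. It is independent of hictk: ``Bin``, ``CooPixel``, ``Bg2Pixel`` and ``join_coords`` are hypothetical names, and the bin table is assumed to be a plain vector indexed by bin id.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration; these are not hictk's classes.
struct Bin {
  std::string chrom;
  std::uint32_t start;
  std::uint32_t end;
};

struct CooPixel {
  std::uint64_t bin1_id;
  std::uint64_t bin2_id;
  std::int64_t count;
};

struct Bg2Pixel {
  Bin bin1;
  Bin bin2;
  std::int64_t count;
};

// Replace each pair of bin ids with the genomic intervals they refer to.
// Assumption: bins[i] holds the interval of the bin with id i.
// std::vector::at throws std::out_of_range on an unknown bin id.
inline std::vector<Bg2Pixel> join_coords(const std::vector<CooPixel>& pixels,
                                         const std::vector<Bin>& bins) {
  std::vector<Bg2Pixel> out;
  out.reserve(pixels.size());
  for (const auto& p : pixels) {
    out.push_back({bins.at(p.bin1_id), bins.at(p.bin2_id), p.count});
  }
  return out;
}
```

The count is carried over unchanged; only the coordinates are expanded from ids to (chrom, start, end) triplets.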
+
+Merging streams of pre-sorted pixels
+------------------------------------
+
+.. cpp:class:: template <typename PixelIt> PixelMerger
+
+  Class used to merge streams of pre-sorted pixels, yielding a sequence of unique pixels sorted by their genomic coordinates.
+  Merging is performed in a streaming fashion, minimizing the number of pixels that are kept in memory at any given time.
+
+  Duplicate pixels are aggregated by summing their corresponding interactions.
+  Pixel merging also affects duplicate pixels coming from the same stream.
+
+  .. cpp:function:: PixelMerger(std::vector<PixelIt> head, std::vector<PixelIt> tail);
+  .. cpp:function:: template <typename ItOfPixelIt> PixelMerger(ItOfPixelIt first_head, ItOfPixelIt last_head, ItOfPixelIt first_tail);
+
+  Constructors taking either two vectors of `InputIterators <https://en.cppreference.com/w/cpp/named_req/InputIterator>`_ or pairs of iterators to `InputIterators <https://en.cppreference.com/w/cpp/named_req/InputIterator>`_.
+
+  The ``head`` and ``tail`` vectors should contain the iterators pointing to the beginning and end of :cpp:class:`ThinPixel` streams, respectively.
+
+  **Iteration**
+
+  .. cpp:function:: auto begin() const -> iterator;
+  .. cpp:function:: auto end() const noexcept -> iterator;
+
+  Return an `InputIterator <https://en.cppreference.com/w/cpp/named_req/InputIterator>`_ to traverse the stream of :cpp:class:`ThinPixel`\s after merging.
+
+  **Others**
+
+  .. cpp:function:: [[nodiscard]] auto read_all() const -> std::vector<PixelT>;
+
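One common way to merge pre-sorted streams while summing duplicates is a k-way merge driven by a min-heap. The sketch below shows that technique under stated assumptions; it is not necessarily how ``PixelMerger`` is implemented. ``CooPixel`` and ``merge_sorted`` are hypothetical names, and the streams are passed as vectors rather than as the ``head``/``tail`` iterator vectors taken by the documented constructor.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a COO pixel; this is not hictk's ThinPixel.
struct CooPixel {
  std::uint64_t bin1_id;
  std::uint64_t bin2_id;
  std::int64_t count;
};

// Merge streams that are each sorted by (bin1_id, bin2_id).
// Pixels with identical coordinates are collapsed by summing their counts,
// including duplicates that come from the same stream.
inline std::vector<CooPixel> merge_sorted(
    const std::vector<std::vector<CooPixel>>& streams) {
  // Heap entry: (bin1_id, bin2_id, stream index, offset within the stream).
  using Entry =
      std::tuple<std::uint64_t, std::uint64_t, std::size_t, std::size_t>;
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

  // Seed the heap with the first pixel of every non-empty stream.
  for (std::size_t i = 0; i < streams.size(); ++i) {
    if (!streams[i].empty()) {
      heap.emplace(streams[i][0].bin1_id, streams[i][0].bin2_id, i,
                   std::size_t{0});
    }
  }

  std::vector<CooPixel> merged;
  while (!heap.empty()) {
    const auto [b1, b2, i, j] = heap.top();
    heap.pop();

    const auto& p = streams[i][j];
    if (!merged.empty() && merged.back().bin1_id == b1 &&
        merged.back().bin2_id == b2) {
      merged.back().count += p.count;  // duplicate coordinates: aggregate
    } else {
      merged.push_back(p);
    }

    // Advance the stream the popped pixel came from.
    if (j + 1 < streams[i].size()) {
      heap.emplace(streams[i][j + 1].bin1_id, streams[i][j + 1].bin2_id, i,
                   j + 1);
    }
  }
  return merged;
}
```

The heap holds at most one entry per stream, which is what keeps the memory footprint small when the inputs are true streams rather than in-memory vectors.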
+
+Computing common statistics
+---------------------------
+
+.. cpp:function:: template <typename PixelIt> [[nodiscard]] double avg(PixelIt first, PixelIt last);
+.. cpp:function:: template <typename PixelIt, typename N> [[nodiscard]] N max(PixelIt first, PixelIt last);
+.. cpp:function:: template <typename PixelIt> [[nodiscard]] std::size_t nnz(PixelIt first, PixelIt last);
+.. cpp:function:: template <typename PixelIt, typename N> [[nodiscard]] N sum(PixelIt first, PixelIt last);
+
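The four statistics can all be computed in a single pass, which matters when the input is an InputIterator that can only be traversed once. The sketch below is a hypothetical standalone helper, not the hictk implementation: ``CooPixel``, ``Stats`` and ``summarize`` are made-up names, and it assumes that ``nnz`` counts pixels with a non-zero count and that ``avg`` is ``sum / nnz``.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a COO pixel; this is not hictk's ThinPixel.
struct CooPixel {
  std::uint64_t bin1_id;
  std::uint64_t bin2_id;
  std::int64_t count;
};

struct Stats {
  std::size_t nnz{};
  std::int64_t sum{};
  std::int64_t max{};
  double avg{};
};

// Single pass over [first, last).
// Assumptions: zero-count pixels are ignored, and avg is sum / nnz.
template <typename PixelIt>
Stats summarize(PixelIt first, PixelIt last) {
  Stats s{};
  for (; first != last; ++first) {
    const std::int64_t n = first->count;
    if (n == 0) {
      continue;
    }
    s.max = (s.nnz == 0) ? n : std::max(s.max, n);
    ++s.nnz;
    s.sum += n;
  }
  if (s.nnz != 0) {
    s.avg = static_cast<double>(s.sum) / static_cast<double>(s.nnz);
  }
  return s;
}
```

For an empty range every field stays at zero, which avoids a division by zero when computing the average.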
+
+Converting streams of pixels to Arrow Tables
+--------------------------------------------
+
+.. cpp:enum-class:: DataFrameFormat
+
+  .. cpp:enumerator:: COO
+  .. cpp:enumerator:: BG2
+
+.. cpp:class:: template <typename PixelIt> ToDataFrame
+
+  .. cpp:function:: ToDataFrame(PixelIt first_pixel, PixelIt last_pixel, DataFrameFormat format = DataFrameFormat::COO, std::shared_ptr<const BinTable> bins = nullptr, bool transpose = false, std::size_t chunk_size = 256'000);
+
+  Construct an instance of a :cpp:class:`ToDataFrame` converter given a stream of pixels delimited by ``first_pixel`` and ``last_pixel``, a DataFrame ``format`` and a :cpp:class:`BinTable`.
+
+  When ``transpose`` is set to true, the converter will produce a table consisting of pixels overlapping the lower-triangle of the matrix.
+
+  .. cpp:function:: [[nodiscard]] std::shared_ptr<arrow::Table> operator()();
+
+  Convert the stream of pixels into an :cpp:class:`arrow::Table`.
+
+
+Converting streams of pixels to Eigen Matrices
+----------------------------------------------
+
+.. cpp:class:: template <typename PixelSelector> ToDenseMatrix
+
+  .. cpp:function:: ToDenseMatrix(PixelSelector&& selector, N n, bool mirror = true);
+
+  Construct an instance of a :cpp:class:`ToDenseMatrix` converter given a :cpp:class:`PixelSelector` object and a count type ``n``.
+
+  When ``mirror`` is set to true, the converter will take care of mirroring the upper-triangle matrix when appropriate.
+
+  .. cpp:function:: [[nodiscard]] auto operator()() -> Eigen::Matrix<N, Eigen::Dynamic, Eigen::Dynamic>;
+
+  Convert the stream of pixels into an :cpp:class:`Eigen::Matrix`.
+
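The effect of ``mirror`` can be pictured with a small standalone sketch that fills a dense matrix from upper-triangle pixels. It uses nested ``std::vector``\s instead of Eigen to stay self-contained; ``CooPixel`` and ``to_dense`` are hypothetical names, and the code assumes a square, symmetric query where bin ids index rows and columns directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a COO pixel; this is not hictk's ThinPixel.
struct CooPixel {
  std::uint64_t bin1_id;
  std::uint64_t bin2_id;
  std::int64_t count;
};

using DenseMatrix = std::vector<std::vector<std::int64_t>>;

// Fill an n x n matrix from upper-triangle pixels (bin1_id <= bin2_id).
// When mirror is true, off-diagonal counts are copied below the diagonal.
// Assumption: the query is square and symmetric.
inline DenseMatrix to_dense(const std::vector<CooPixel>& pixels, std::size_t n,
                            bool mirror = true) {
  DenseMatrix m(n, std::vector<std::int64_t>(n, 0));
  for (const auto& p : pixels) {
    assert(p.bin1_id < n && p.bin2_id < n);
    m[p.bin1_id][p.bin2_id] = p.count;
    if (mirror && p.bin1_id != p.bin2_id) {
      m[p.bin2_id][p.bin1_id] = p.count;
    }
  }
  return m;
}
```

With ``mirror = false`` the lower triangle stays at zero, which is the layout in which the pixels are stored on disk.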
+Converting streams of pixels to Eigen Sparse Matrices
+-----------------------------------------------------
+
+.. cpp:class:: template <typename PixelSelector> ToSparseMatrix
+
+  .. cpp:function:: ToSparseMatrix(PixelSelector&& selector, N n, bool transpose = false);
+
+  Construct an instance of a :cpp:class:`ToSparseMatrix` converter given a :cpp:class:`PixelSelector` object and a count type ``n``.
+
+  When ``transpose`` is set to true, the converter will produce a matrix consisting of pixels overlapping the lower-triangle of the matrix.
+
+  .. cpp:function:: [[nodiscard]] auto operator()() -> Eigen::SparseMatrix<N>;
+
+  Convert the stream of pixels into an :cpp:class:`Eigen::SparseMatrix`.
diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -1,5 +1,5 @@
-furo==2024.1.29
-sphinx==7.2.6
+furo==2024.5.6
+sphinx==7.3.7
 sphinx-copybutton==0.5.2
-sphinxcontrib-moderncmakedomain==3.27.0
+sphinxcontrib-moderncmakedomain==3.29.0
 sphinxcontrib-svg2pdfconverter==1.2.2