Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement stitching algorithm #1032

Merged
merged 205 commits into from
Mar 11, 2024
Merged

Reimplement stitching algorithm #1032

merged 205 commits into from
Mar 11, 2024

Conversation

Donaim
Copy link
Member

@Donaim Donaim commented Nov 6, 2023

Closes #1030

@Donaim Donaim force-pushed the newfangled-stitching branch 2 times, most recently from 6dc389b to 5bd591e Compare November 6, 2023 16:41
Create a module for handling CIGAR strings and their related alignment
formats. This includes functions for managing coordinate mapping
between the query sequence and the reference sequence, as well as
handling CIGAR strings.

The added classes incorporate various methods to extend coordinates,
convert them and translate them. It also includes a class for managing
CIGAR hits, which includes functions to slice CIGAR operations,
check for overlap, and converting operations to a multiple sequence
alignment (MSA).

This update helps to provide a more comprehensive set of tools for
handling and interpreting CIGAR strings and alignments.
@Donaim Donaim force-pushed the newfangled-stitching branch from 5bd591e to 2a8aec8 Compare November 6, 2023 16:49
Copy link

codecov bot commented Nov 6, 2023

Codecov Report

Attention: Patch coverage is 97.56326% with 26 lines in your changes are missing coverage. Please review.

Project coverage is 88.06%. Comparing base (d62ac24) to head (acc110b).
Report is 5 commits behind head on master.

Files Patch % Lines
micall/core/plot_contigs.py 98.02% 13 Missing ⚠️
micall/core/contig_stitcher.py 97.97% 7 Missing ⚠️
micall/core/denovo.py 90.47% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1032      +/-   ##
==========================================
+ Coverage   86.42%   88.06%   +1.64%     
==========================================
  Files          28       29       +1     
  Lines        6109     7150    +1041     
==========================================
+ Hits         5280     6297    +1017     
- Misses        829      853      +24     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Donaim Donaim force-pushed the newfangled-stitching branch 3 times, most recently from 646736e to cde06c8 Compare November 7, 2023 16:40
Donaim added 15 commits November 7, 2023 11:15
The updated function now creates a list with the same length as the
input strings, initialized with zeros. The function then performs a
moving window average comparison on the strings in both forward and
reverse directions. This enhancement is designed to provide a more
thorough and robust analysis of the sequence comparisons.

Also add a docstring to it.
Instead of appending the newly stitched part to the end,
prepend it at the start.

This way we make sure that it will be processed on
the next loop cycle.
@Donaim Donaim force-pushed the newfangled-stitching branch 2 times, most recently from d3c4aa7 to 0164ba6 Compare November 7, 2023 19:18
If we assert that addition of cigar strings is commutative,
then it is not associative. But for predictability of addition it is
more important to have associativity.
@Donaim Donaim force-pushed the newfangled-stitching branch 2 times, most recently from 8cd66d8 to d3dd2ca Compare February 20, 2024 23:32
@Donaim Donaim force-pushed the newfangled-stitching branch from d3dd2ca to 0f21847 Compare February 20, 2024 23:43
Copy link
Member

@donkirkby donkirkby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big pull request, so I started with one module: cigar_tools. Overall, it looks pretty good, but I've started a few threads for discussion.

micall/tests/test_cigar_tools.py Outdated Show resolved Hide resolved
micall/tests/test_cigar_tools.py Outdated Show resolved Hide resolved
micall/tests/test_cigar_tools.py Outdated Show resolved Hide resolved
micall/tests/test_cigar_tools.py Outdated Show resolved Hide resolved
micall/utils/cigar_tools.py Outdated Show resolved Hide resolved
micall/tests/test_cigar_tools.py Outdated Show resolved Hide resolved
@Donaim Donaim force-pushed the newfangled-stitching branch from 1add042 to 6ea286e Compare February 26, 2024 21:37
Also add a check for the text of the error messages.

Co-authored-by: Don Kirkby <donkirkby@users.noreply.github.com>
@Donaim Donaim force-pushed the newfangled-stitching branch from 6ea286e to 8000704 Compare February 26, 2024 21:51
@Donaim Donaim force-pushed the newfangled-stitching branch from b9ff3d3 to f44e405 Compare February 27, 2024 17:29
There seem to be quite a lot of unique checks that only PyCharm
performs, and neither flake8 nor ruff have them.

For example: grammar checks!

This commit fixes all the errors that I've seen in the stitcher code,
while browsing it in PyCharm.
@Donaim Donaim force-pushed the newfangled-stitching branch from 4cad8cf to a770f1c Compare February 29, 2024 22:25
And similarly, rename contigs_stitched.csv to contigs.csv
@Donaim Donaim force-pushed the newfangled-stitching branch 2 times, most recently from b03415e to aba3d58 Compare March 5, 2024 00:30
@Donaim Donaim force-pushed the newfangled-stitching branch from aba3d58 to cb28178 Compare March 5, 2024 00:31
@Donaim Donaim force-pushed the newfangled-stitching branch from 0da7699 to 940b186 Compare March 5, 2024 01:13
Use the unstitched versions of the files.
Copy link
Member

@donkirkby donkirkby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all looking pretty good. I posted a few questions for you, along with some minor suggestions for you to consider.

micall/tests/test_contig_stitcher.py Show resolved Hide resolved
micall/tests/test_contig_stitcher.py Show resolved Hide resolved
micall/tests/test_contig_stitcher.py Show resolved Hide resolved
micall/tests/test_contig_stitcher.py Show resolved Hide resolved
micall/core/contig_stitcher.py Outdated Show resolved Hide resolved
micall/core/contig_stitcher.py Outdated Show resolved Hide resolved
micall/core/contig_stitcher.py Outdated Show resolved Hide resolved
micall/core/contig_stitcher.py Show resolved Hide resolved
Co-authored-by: Don Kirkby <donkirkby@users.noreply.github.com>
@Donaim Donaim force-pushed the newfangled-stitching branch from 2358d28 to acc110b Compare March 7, 2024 05:47
@Donaim Donaim marked this pull request as ready for review March 11, 2024 18:31
@Donaim Donaim merged commit 3200a80 into master Mar 11, 2024
3 checks passed
@Donaim Donaim deleted the newfangled-stitching branch March 11, 2024 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reimplement the stitcher
2 participants