
Question about generating "depth.txt" for metabat2 #20

Closed
hongzhonglu opened this issue Mar 21, 2021 · 5 comments

@hongzhonglu

Hi Francisco,
Could I ask a question about the usage of metabat2 in the binning analysis? I found that it needs a "depth.txt" file, but I cannot get that file from the previous step ("megahit"). Do you know how to prepare "depth.txt" as the input for metabat2?

Thanks a lot!

Best regards,
Hongzhong

@franciscozorrilla
Owner

Hi Hongzhong,

There are a few ways to do this, and the best approach depends on what you are trying to do.
Two important questions to ask yourself are:

  1. Do you just want to use metabat2 for binning, or are you using all 3 binners + metaWRAP? (The latter gives the best performance.)
  2. Are you cross-mapping each set of short reads to each assembly? (Yes = best performance.)

You can look at the tutorial/demo to get an idea of how to use the metaGEM.sh parser to interface with the Snakefile for job submissions; that section in particular shows the cross-mapping step. I just uncommented 3 lines of the Snakefile in the last commit, so you should now be able to submit crossMap jobs to generate depth files exactly as described in the tutorial. For example, to submit 2 jobs with 24 cores, 120 GB of RAM, and a 24 hour max runtime each:

bash metaGEM.sh -t crossMap -j 2 -c 24 -m 120 -h 24

Note that by default this will run the Snakefile rule crossMap, which will submit one job per sample. Within each of these jobs, a for loop maps each set of paired-end reads in your dataset to the focal sample's assembly. These mapping files are then used to generate your coverage inputs for CONCOCT, MetaBAT2, and MaxBin2.
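
For reference, if you ever want to produce a depth file manually outside of the Snakefile, the table metabat2 expects is typically generated from sorted BAM files with the jgi_summarize_bam_contig_depths utility that ships with MetaBAT2. A minimal sketch, assuming bwa and samtools are available and using placeholder file names (not the actual metaGEM paths):

# Index the focal sample's assembly and map one set of paired-end reads to it
bwa index assembly.fa
bwa mem -t 24 assembly.fa sample_R1.fastq.gz sample_R2.fastq.gz | samtools sort -@ 24 -o sample.sorted.bam
samtools index sample.sorted.bam

# Summarize contig coverage across all sorted BAM files into the depth table metabat2 reads
jgi_summarize_bam_contig_depths --outputDepth depth.txt *.sorted.bam

# Run metabat2 with the assembly and the depth table
metabat2 -i assembly.fa -a depth.txt -o metabat_bins/bin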

I should mention, as a note of caution, that this approach works well for small-to-medium-sized datasets (roughly ≤ 150 medium-sized samples), but it may become impractical for large datasets, both in terms of runtime and computational load. This is because each job needs to generate N sorted BAM files to create the CONCOCT coverage table, where N = number of samples. For example, with a dataset of 300 samples and BAM files of ~10 GB each, you would need around 3 TB of temporary storage per job, and up to ~900 TB if you ran all jobs in parallel.

In the metaGEM manuscript we processed the TARA Oceans dataset, which was quite large (~246 samples). For these larger datasets we recommend running a slightly modified workflow where each individual mapping operation is submitted as its own job and mapped using kallisto. I am now working on adding support for this alternative branch of the workflow to the metaGEM.sh parser (issue #22).
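
To give a rough idea, each individual kallisto mapping job boils down to something like the following (a general sketch, not the exact metaGEM implementation; file names are placeholders):

# Build a kallisto index for the focal sample's assembly
kallisto index -i assembly.idx assembly.fa

# Pseudoalign one set of paired-end reads against that index
kallisto quant -i assembly.idx -o sampleA_vs_focal sampleA_R1.fastq.gz sampleA_R2.fastq.gz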

Please let me know if you have further questions.
Best wishes,
Francisco

@hongzhonglu
Author

hongzhonglu commented Mar 23, 2021

Hi Francisco,
Thanks a lot for your kind help! For now I just want to use metabat2 for binning, since this is my first time running a metagenome analysis, so I am starting with the simple things. I can find the steps to generate the depth.txt file in your pipeline, and I will study how to run it.

Best regards,
Hongzhong

@franciscozorrilla
Owner

Hi Hongzhong,

In that case I recommend looking at the metabat rule on line 512 of the Snakefile.
Note that its output is currently commented out, since this is a "backup"/alternative way of running metabat2.
You will need to uncomment the output of the metabat rule on line 518 so that it looks like this:

directory(f'{config["path"]["root"]}/{config["folder"]["metabat"]}/{{IDs}}/{{IDs}}.metabat-bins')

and then comment out the output of the main metabat rule, metabatCross, on line 581 so that it looks like this:

#directory(f'{config["path"]["root"]}/{config["folder"]["metabat"]}/{{IDs}}/{{IDs}}.metabat-bins')

You need to do this so that Snakemake knows exactly which rule to execute to generate your desired files.
After making sure that only your desired metabat2 rule has an uncommented output, you can submit metabat2 jobs to the cluster using:

bash metaGEM.sh -t metabat -j N_JOBS -c N_CORES -m MEMORY -h RUN_TIME
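
For example, with the same resources as the crossMap example above (2 jobs, 24 cores, 120 GB of RAM, and a 24 hour max runtime):

bash metaGEM.sh -t metabat -j 2 -c 24 -m 120 -h 24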

I have also recently expanded the metaGEM wiki, so please check it out if you want to learn more about usage and implementation of metaGEM.

Also, just so you know, from personal experience I have found that CONCOCT tends to outperform maxbin2 and metabat2 in most cases. For reference, you can look at Supplementary Figure 2 of the metaGEM paper:

[Screenshot: Supplementary Figure 2 from the metaGEM paper]

Hope this helps and let me know if you have any other questions.
Best wishes,
Francisco

@hongzhonglu
Author

Hi Francisco,
Thanks so much! This is a very good reference for me to study.

Best regards,
Hongzhong

@franciscozorrilla
Owner

Closing this due to inactivity but please reopen/comment if you have further questions.

Repository owner locked and limited conversation to collaborators May 10, 2021
@franciscozorrilla added the question (Further information is requested) label May 22, 2021
@franciscozorrilla self-assigned this May 22, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
