Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues #80 and #81: reduce 'mclapply' memory footprint; add "num_cores" option #94

Closed
wants to merge 6 commits into from

Conversation

warrenmcg
Copy link
Collaborator

Hi,

NOTE: This is the same pull request as #93, but I made the mistake of using my master branch instead of a dedicated branch, and unrelated changes got included. I corrected that mistake here.

Below is the same message I included with #93:

This is to address issues #80 and #81 that I opened.

For issue #80, gene_summary uses a nested lapply, with the first one using mclapply. This meant that every full bootstrap object was copied to each core. With more than a few samples, this leads to a very large memory footprint (20 samples failed on a machine with 120+ GB of RAM). Using the blogpost I mentioned in #80 as a starting point (link), I switched the use of mclapply to the second nested lapply, so that only individual bootstraps will be sent out to each core. This significantly reduces the memory footprint.

For issue #81, I wanted to take full advantage of a machine that had more than 2 cores, so I simply added the option to gene_summary and then to sleuth_prep, updated the documentation for sleuth_prep, and added test conditions to make sure the choice of num_cores was reasonable, throwing an informative error if it is not.

Finally, I followed your contribution guidelines to make the code lintr clean. In the process of making sure my updated code passed all of your tests, I found that one of your tests ("give a design matrix") had a bug, so I fixed it to make sure it worked.

Hope this helps!
Warren

…dded sleuth_prep option to select number of cores for mclapply
…ons for num_cores to throw informative error
…trix did not have dim names that matched the formula or the sample ids
@roryk
Copy link
Contributor

roryk commented Mar 14, 2017

This p/r is awesome.

@warrenmcg
Copy link
Collaborator Author

Hi Rory! Thank you for your kind words! It means a lot to me as a longtime user but occasional and novice contributor.

As an addendum, I'll also note to the sleuth authors that I realized that the 'give a design matrix' test issue on the main branch was already fixed in the devel branch, so I just fixed mine to match the changes in the devel branch.

@pimentel
Copy link
Collaborator

@warrenmcg thank you very much for this! nice work!

the gene aggregation stuff is currently undergoing some pretty massive changes in terms of memory usage which will be done probably by tomorrow. I will review it carefully by the end of the week.

same goes for your other PR.

thanks again,

harold

@warrenmcg
Copy link
Collaborator Author

warrenmcg commented Mar 16, 2017

Thank you @pimentel!

I saw on some of the other threads that you were out for medical leave. I hope you're feeling better! Excellent work on the tool overall, and for the improvements made to it since its release!

I began to make my own changes to the code to try to accommodate the gene_aggregation and the devel branch changes for my own work, so I'm excited to hear that you're almost finished with the merge!

Best,
Warren

@warrenmcg
Copy link
Collaborator Author

Hi @pimentel and @roryk,

Because of my updated code, reconciling the gene aggregation and memory usage enhancements, I've decide to close this pull request in favor of that one. Harold, let me know if you wish to focus on this one, in case you have a different set of changes to reconcile the two aspects of sleuth.

@warrenmcg warrenmcg closed this Apr 22, 2017
@roryk
Copy link
Contributor

roryk commented Apr 23, 2017

Thanks for the heads up Warren.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants