sleuth "Advanced options" #168

Merged: 12 commits merged into pachterlab:devel on Jun 3, 2018

warrenmcg (Collaborator) commented

Hi @pimentel,

Here are my suggested changes to the sleuth API and Documentation:

New arguments

  • Add filter_target_id option to sleuth_prep, so users can supply a list of target_ids for filtering that was produced by an independent method. This is recommended if the preferred filtering method requires a matrix-wide transformation (e.g. edgeR's CPM filter) or otherwise requires assessing multiple features simultaneously, since the sleuth_prep filtering step is built to assess features only one at a time.
  • Add normalize option to sleuth_prep that allows the user to skip the normalization steps (and all subsequent steps) if set to FALSE. This severely reduces the functionality of the sleuth object for most downstream applications, but could be useful in certain situations (e.g. quickly checking a custom filter; quickly checking the raw data or summary of the kallisto objects; etc).
  • Add weight_func option to sleuth_results, to specify a custom weighting function that acts on the mean observations for each transcript when doing p-value aggregation.
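
As a rough illustration of the filter_target_id use case, the sketch below derives a vector of target_ids from a matrix-wide, CPM-style filter in base R. The helper name cpm_filter_ids, the thresholds, and the toy count matrix are all hypothetical. It also assumes that filter_target_id takes the IDs that pass the filter, which should be checked against the sleuth_prep documentation.

```r
# Hypothetical helper: keep features whose counts-per-million reach
# `min_cpm` in at least `min_samples` samples. This needs the whole
# matrix (library sizes), so it cannot be applied one feature at a time.
cpm_filter_ids <- function(counts, min_cpm = 1, min_samples = 2) {
  lib_sizes <- colSums(counts)
  # Divide every column by its library size, then scale to per-million.
  cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6
  keep <- rowSums(cpm >= min_cpm) >= min_samples
  rownames(counts)[keep]
}

# Toy data: 4 transcripts x 4 samples, each sample with 1000 counts.
counts <- matrix(
  c(500, 600, 550, 580,
      0,   1,   0,   2,
    300,   0,   0,   0,
    200, 399, 450, 418),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("tx1", "tx2", "tx3", "tx4"), c("s1", "s2", "s3", "s4"))
)

keep_ids <- cpm_filter_ids(counts, min_cpm = 5000, min_samples = 2)

# The result would then be handed to sleuth_prep (s2c and the formula
# are placeholders for the user's own sample table and model):
# so <- sleuth_prep(s2c, ~condition, filter_target_id = keep_ids)
```

In this toy example only tx1 and tx4 reach the threshold in two or more samples, so keep_ids contains those two IDs.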

Changed API

  • Changed API of transformation_function and transformation_function_tpm to transform_fun_counts and transform_fun_tpm, respectively, for clarity and brevity.
  • Greatly simplified the exposed API for sleuth_prep. Several options are now "advanced options" that are passed through ... and described in the Details section of the documentation. The following options were moved: filter_target_id, filter_fun, norm_fun_counts, norm_fun_tpm, extra_bootstrap_summary, read_bootstrap_tpm, max_bootstrap, transform_fun_counts, and transform_fun_tpm.
  • Simplified the exposed API for sleuth_fit. Moved which_var and the sliding_window_grouping extra options to ... with details in the Details section of the documentation.
  • Simplified the exposed API for sleuth_results, with the weight_func hidden in ..., but described in detail in the Details section of the documentation.
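
The general pattern of hiding advanced options behind ... can be sketched in base R as follows. This is only an illustration of the idea, not the code in this PR: the helper name collect_advanced_opts and the default values are hypothetical, and only the option names come from the description above.

```r
# Hypothetical helper: gather advanced options passed through `...`,
# reject anything unrecognized, and fill in defaults for the rest.
collect_advanced_opts <- function(...) {
  # Illustrative defaults only; the real defaults live in sleuth_prep.
  defaults <- list(
    normalize = TRUE,
    read_bootstrap_tpm = FALSE,
    extra_bootstrap_summary = FALSE
  )
  allowed <- c(
    "filter_target_id", "filter_fun", "normalize",
    "norm_fun_counts", "norm_fun_tpm",
    "extra_bootstrap_summary", "read_bootstrap_tpm", "max_bootstrap",
    "transform_fun_counts", "transform_fun_tpm"
  )
  extra_opts <- list(...)
  if (length(extra_opts) > 0) {
    opt_names <- names(extra_opts)
    if (is.null(opt_names) || any(opt_names == "")) {
      stop("advanced options must be named")
    }
    unknown <- setdiff(opt_names, allowed)
    if (length(unknown) > 0) {
      stop("unknown advanced option(s): ", paste(unknown, collapse = ", "))
    }
  }
  # User-supplied values override the defaults.
  modifyList(defaults, extra_opts)
}

opts <- collect_advanced_opts(normalize = FALSE, max_bootstrap = 30)
```

Validating the names up front is one way to keep a short signature without silently ignoring a misspelled option, which is the main risk of routing arguments through ....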

Other small changes:

  • Deprecate bs_sigma_summary, since it assumes that the bootstraps are summarized using the method in versions <= 0.28.1.
  • Add several sanity checks.

+ this allows the use of different weights with the observed means of transcripts for the Lancaster method
+ this prevents errors when a sleuth-ALR transformation is used
+ Changed API for 'transformation_function' and 'transformation_function_tpm' to
  'transform_fun_counts' and 'transform_fun_tpm' respectively
+ Added public API for the 'norm_fun_counts' and 'norm_fun_tpm' so that
  users can see how the data was normalized when viewing a sleuth object.
+ Added error handling if the user attempts to change 'norm_fun_counts' or 'norm_fun_tpm'
  manually.
+ Added new 'normalize' boolean to skip the normalization steps, which
  also skips the rest of the downstream processing (bootstrap summarization,
  transformation, etc.)
+ Moved several of the sleuth_prep options to a new section of 'advanced options'.
  These are now handled by the '...' argument. This includes options for
  summarizing the bootstraps ('read_bootstrap_tpm', 'extra_bootstrap_summary', 'max_bootstrap'),
  normalizing the data ('normalize' boolean, 'norm_fun_counts', 'norm_fun_tpm'),
  transforming the data ('transform_fun_counts', 'transform_fun_tpm'),
  and the old 'gene_mode' for counts aggregation. Followed the example for advanced options
  used by the 'polyester' package for its 'simulate_experiment' function.
+ Added sanity checks for the mutually exclusive 'gene_mode' & 'pval_aggregate' modes
  for gene-level aggregation. 'pval_aggregate' is the default mode if 'aggregation_column' is
  set. If the user tries to change either gene_mode or pval_aggregate manually, they receive
  a warning if these two modes conflict and if 'gene_column' has not been set.
+ Changed how the sleuth object handles manual changes to 'transform_fun_counts' or
  'transform_fun_tpm'. It now throws an error if nothing has been fit, preventing the user
  from changing the listed transformation function. This is so users can always see how the
  data was transformed when viewing fits within a sleuth object.
+ Specifies documentation for the 'which_var' argument.
+ Adds explicit documentation for the additional options to
'sliding_window_grouping': 'n_bins', 'lwr', and 'upr'.
+ Discuss the interpretation of the 'b' value from Wald test results.
+ Discuss the warning if gene aggregation is done with transcript-level
target_mappings.
+ Discuss the two aggregation modes.
+ Discuss the advanced option 'weight_func' for weighting the Lancaster
method.
+ Add the expected columns to the specification of the results if a user
sets 'pval_aggregate = TRUE'.
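
For readers unfamiliar with weighted p-value aggregation, here is a minimal base R sketch of the Lancaster method and of how a weight_func applied to mean observations could feed into it. The function lancaster_sketch, the example p-values, the mean observations, and the square-root weighting are all hypothetical; this is not the sleuth implementation.

```r
# Lancaster's method: map each p-value to a chi-squared quantile whose
# degrees of freedom equal its weight, sum the quantiles, and compare the
# sum against a chi-squared distribution with df = sum of the weights.
# With every weight equal to 2 this reduces to Fisher's method.
lancaster_sketch <- function(pvalues, weights) {
  stopifnot(length(pvalues) == length(weights), all(weights > 0))
  stats <- qchisq(pvalues, df = weights, lower.tail = FALSE)
  pchisq(sum(stats), df = sum(weights), lower.tail = FALSE)
}

# Hypothetical transcript-level inputs for a single gene.
pvals    <- c(tx1 = 0.01, tx2 = 0.20, tx3 = 0.50)
mean_obs <- c(tx1 = 8.5,  tx2 = 3.1,  tx3 = 1.2)

# A custom weighting function acting on the mean observations.
weight_func <- function(x) sqrt(x)

gene_pval <- lancaster_sketch(pvals, weight_func(mean_obs))
```

Changing weight_func changes how strongly highly expressed transcripts dominate the aggregated gene-level p-value.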
@warrenmcg warrenmcg requested a review from pimentel April 18, 2018 03:29
+ now those packages can move to the 'imports' section of the DESCRIPTION
+ this addresses issue pachterlab#56
…cter columns

+ this prevents warnings introduced when the supplied target_mapping had factors instead
+ this handles bugs seen in issues pachterlab#76 and pachterlab#169
@lynnyi lynnyi closed this Apr 30, 2018
@lynnyi lynnyi reopened this May 8, 2018
pimentel (Collaborator) commented Jun 3, 2018

JFC, @warrenmcg. This is a massive PR. I didn't think you were going to go through all of this so carefully -- thanks for that. It looks great!

@pimentel pimentel merged commit 29c0a01 into pachterlab:devel Jun 3, 2018
pimentel (Collaborator) commented Jun 3, 2018

PS: thanks for dealing with the import stuff -- this had been on my mental TODO list forever.
