Skip to content

Commit

Permalink
Merge pull request #17 from uclahs-cds/jyhan
Browse files Browse the repository at this point in the history
Jyhan
  • Loading branch information
jeeyunhan authored Nov 8, 2024
2 parents d22dfaf + 127c0db commit 1ae5524
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 110 deletions.
43 changes: 15 additions & 28 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,30 +1,17 @@
# OutSeekR 1.0.0 - 2024-11-07

## Features

### Core Functionality
- **Outlier Detection Algorithm**: Implements a statistical approach for detecting transcript-level outliers in RNA-seq or related data types, leveraging normalized data (e.g., FPKM) and several statistical metrics.
- **Multiple Statistics for Robust Analysis**: Utilizes five distinct statistics for each transcript to robustly assess outliers:
- Z-scores using mean and standard deviation.
- Z-scores using median and median absolute deviation.
- Z-scores with 5%-trimmed mean and standard deviation.
- Fraction of observations in the smaller cluster from K-means (K=2).
- Cosine similarity between extreme observed values and theoretical distribution quantiles.

### Null Data Simulation
- **Comprehensive Null Simulation**: Generates null datasets mimicking the observed data distribution, without outliers, through generalized additive modeling of four potential distributions.
# Unreleased

### Iterative Outlier Testing
- **P-value and Rank Product Calculation**: Calculates outlier p-values by comparing rank products from observed and null data across multiple rounds, refining the detection by iteratively removing the most extreme outliers.
- **FDR Adjustment**: Applies False Discovery Rate (FDR) correction to control for multiple testing.

### Performance Optimization
- **Parallel Processing Support**: Optimized for high-performance analysis using `future.apply` to enable parallelization, compatible with various computing environments.

### Output
- **Results**: Provides p-values, FDR values, the count of detected outliers, and the preferred statistical threshold for initial selection (`p-value` or `FDR`).

### Example Data and Parameters
- **Sample Data and Example Workflow**: Includes `outliers` sample data and demonstrates usage with the `detect.outliers()` function.
- **Adjustable Parameters**: Allows customization of p-value and FDR thresholds, the number of null data iterations, and the initial screen method for flexible analysis.
# OutSeekR 1.0.0 - 2024-11-07

## Added
* Implementation of core *Outlier Detection Algorithm*, a statistical approach for detecting transcript-level outliers in RNA-seq or related data types, leveraging normalized data (e.g., FPKM) and several statistical metrics.
* Five distinct statistics for robustly assessing outliers:
* Z-scores using mean and standard deviation.
* Z-scores using median and median absolute deviation.
* Z-scores with 5%-trimmed mean and standard deviation.
* Fraction of observations in the smaller cluster from K-means (K=2).
* Cosine similarity between extreme observed values and theoretical distribution quantiles.
* Comprehensive null simulation functionality. Generates null datasets mimicking the observed data distribution (without outliers) through generalized additive modeling of four potential distributions.
* Outlier p-value calculation by comparing rank products from observed and null data across multiple rounds, refining the detection by iteratively removing the most extreme outliers.
* Support for *false discovery rate* (FDR) correction to control for multiple testing.
* Optimization for high-performance analysis using `future.apply` to enable parallelization, compatible with various computing environments.
* Sample `outliers` data and usage demonstration.
82 changes: 0 additions & 82 deletions data-raw/make.outliers.R

This file was deleted.

0 comments on commit 1ae5524

Please sign in to comment.