Add Multivariate T likelihood cost #58

Merged
johannvk merged 34 commits into main from experimental/multivariate-t-mle-covariance-fitting
Jan 27, 2025


johannvk
Contributor

This PR adds a new cost function to skchange, the robust cost MultivariateTCost.

Because the maximum likelihood estimator (MLE) for the scale matrix of a multivariate Student's t distribution is not known analytically, we solve a fixed-point iteration equation to find the MLE scale matrix on each interval, and then evaluate the log-likelihood of the sample at the computed estimate. Using the capabilities of Numba, the performance is greatly improved compared to a purely NumPy- and SciPy-based implementation.
Currently the column medians are used as a quick and robust estimator of the distribution mean.
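
A minimal sketch of this kind of fixed-point iteration, for illustration only. It assumes a fixed dof, uses plain NumPy rather than the Numba implementation in this PR, and the function names `mle_scale_matrix` and `t_log_likelihood` are hypothetical, not part of skchange. The update shown is the standard reweighting step for the t scale matrix with the location held fixed at the column medians.

```python
import math

import numpy as np


def mle_scale_matrix(x, dof, rel_tol=1e-6, max_iter=500):
    """Fixed-point iteration for the t scale matrix, location fixed at column medians."""
    n, p = x.shape
    centered = x - np.median(x, axis=0)
    scale = centered.T @ centered / n  # initial guess: covariance about the medians
    for _ in range(max_iter):
        # Squared Mahalanobis distances under the current scale estimate.
        maha = np.einsum("ij,ij->i", centered @ np.linalg.inv(scale), centered)
        # Samples far from the centre receive smaller weights.
        weights = (dof + p) / (dof + maha)
        new_scale = (centered * weights[:, None]).T @ centered / n
        change = np.linalg.norm(new_scale - scale) / np.linalg.norm(scale)
        scale = new_scale
        if change < rel_tol:  # relative tolerance stopping criterion
            break
    return scale


def t_log_likelihood(x, scale, dof):
    """Multivariate t log-likelihood of x, location fixed at column medians."""
    n, p = x.shape
    centered = x - np.median(x, axis=0)
    maha = np.einsum("ij,ij->i", centered @ np.linalg.inv(scale), centered)
    _, log_det = np.linalg.slogdet(scale)
    const = (
        math.lgamma((dof + p) / 2)
        - math.lgamma(dof / 2)
        - 0.5 * p * math.log(dof * math.pi)
    )
    return n * (const - 0.5 * log_det) - 0.5 * (dof + p) * np.sum(np.log1p(maha / dof))


# Heavy-tailed toy data (independent t marginals, used only as a demo input).
rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=(200, 3))
scale = mle_scale_matrix(x, dof=5.0)
log_lik = t_log_likelihood(x, scale, dof=5.0)
```

The log-likelihood at the fitted scale matrix should be at least as high as at any other candidate matrix, for example a rescaled copy or the sample covariance.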

The fatness of the tails in the T distribution is controlled by the degrees of freedom (dof), and when using this cost one can either specify a fixed degrees of freedom for the cost object, or estimate the degrees of freedom when fitting the cost to samples.
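
To illustrate how the dof controls tail fatness, here is a small sketch that evaluates the log-density of a standard multivariate t distribution (zero location, identity scale) at an outlying point for several dof values. The helper names are hypothetical and only the standard library is used; it is not how the cost itself is evaluated in this PR.

```python
import math


def mvt_log_density(sq_dist, dim, dof):
    """Log-density of a standard multivariate t at squared distance sq_dist from the centre."""
    return (
        math.lgamma((dof + dim) / 2)
        - math.lgamma(dof / 2)
        - 0.5 * dim * math.log(dof * math.pi)
        - 0.5 * (dof + dim) * math.log1p(sq_dist / dof)
    )


def gaussian_log_density(sq_dist, dim):
    """Log-density of a standard multivariate Gaussian at the same point."""
    return -0.5 * dim * math.log(2 * math.pi) - 0.5 * sq_dist


outlier_sq_dist = 25.0  # a point 5 units away from the centre
dofs = (2.0, 5.0, 30.0, 1e6)
log_densities = {dof: mvt_log_density(outlier_sq_dist, dim=2, dof=dof) for dof in dofs}
gaussian_reference = gaussian_log_density(outlier_sq_dist, dim=2)
```

A low dof assigns much more density to the outlier than the Gaussian does, and as the dof grows the t log-density approaches the Gaussian value. This is why a small dof makes the cost less sensitive to outliers.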

References:

  • Aeschliman, Chad, Johnny Park, and Avinash C. Kak. “A Novel Parameter Estimation Algorithm for the Multivariate T-Distribution and Its Application to Computer Vision.” In Computer Vision – ECCV 2010, edited by Kostas Daniilidis, Petros Maragos, and Nikos Paragios, 594–607. Berlin, Heidelberg: Springer, 2010.
  • Ollila, Esa, Daniel P. Palomar, and Frédéric Pascal. “Shrinking the Eigenvalues of M-Estimators of Covariance Matrix.” arXiv, October 28, 2020.
  • Pascal, Frédéric, Esa Ollila, and Daniel P. Palomar. “Improved Estimation of the Degree of Freedom Parameter of Multivariate T-Distribution.” In 2021 29th European Signal Processing Conference (EUSIPCO), 860–64, 2021.

…ery close in speed... But Newton way cooler.
…w simple tests. Can use the quick (and theoretically interesting/meaningful) exponential map on SPD matrices instead.
…ivariate t distribution. Like the idea of using the geometric mean of two different simple estimators.
…int iteration as when using Newton iterations.
…mpy style, and got a 10-20x performance increase.
…tching how the leave-one-out sample was handled by the scale matrix MLE algorithm.
…w which we refine the mv_t dof estimate using loo-iterative algorithm.
and added a relative tolerance criterion when estimating
the MLE scale matrix.
…ed unused utility functions in the test file for MvTCost.
@johannvk johannvk requested a review from Tveten January 20, 2025 16:23

codecov bot commented Jan 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.74%. Comparing base (74edbf9) to head (80e5c64).
Report is 35 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #58      +/-   ##
==========================================
+ Coverage   98.53%   98.74%   +0.21%     
==========================================
  Files          49       50       +1     
  Lines        1838     2159     +321     
==========================================
+ Hits         1811     2132     +321     
  Misses         27       27              


…pecial functions used by the multivariate T cost. Clarify that they're not valid approximations at too low argument values (close to zero).
@johannvk johannvk requested a review from Tveten January 26, 2025 16:08
…put parameter to MultivariateTCost after call to .fit()
@johannvk
Contributor Author

I ran into a problem when testing this PR in another project that uses skchange. The issue was fixed when I upgraded numba to version 0.61, which also upgraded its dependency llvmlite from 0.43.0 to 0.44.0. Is it possible that we now depend on these newest versions of numba and llvmlite?
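
One way to check an environment against these versions is sketched below, using only the standard library. The minimum versions are an assumption taken from the upgrade that fixed the problem, not a confirmed requirement of skchange, and the helper names are hypothetical.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version):
    """Leading numeric components of a version string, e.g. '0.44.0' -> (0, 44, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(version, minimum):
    """True if version is installed and its release tuple is at least minimum."""
    return version is not None and release_tuple(version) >= minimum


# Assumed minimums: the versions that made the problem go away, not verified bounds.
assumed_minimums = {"numba": (0, 61), "llvmlite": (0, 44)}
status = {
    name: (installed_version(name), meets_minimum(installed_version(name), minimum))
    for name, minimum in assumed_minimums.items()
}
```

Each entry of `status` holds the installed version (or `None`) and whether it meets the assumed minimum.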

…required for an unknown reason. And move tbb threading layer priority down, to avoid constant warning messages. Omp now checked for first.
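
For reference, a sketch of one way to change the threading layer order from user code, via Numba's `NUMBA_THREADING_LAYER_PRIORITY` environment variable. This is not necessarily how the commit does it. The exact order is an assumption: the commit message only says that OpenMP is checked first and TBB is moved down.

```python
import os

# Must run before numba is imported, since Numba reads its configuration at import time.
# Assumed order: OpenMP first, TBB moved down; TBB's position relative to
# workqueue is a guess, not taken from the commit.
os.environ["NUMBA_THREADING_LAYER_PRIORITY"] = "omp tbb workqueue"
```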
@johannvk johannvk merged commit 362709f into main Jan 27, 2025
10 checks passed
@johannvk johannvk deleted the experimental/multivariate-t-mle-covariance-fitting branch January 31, 2025 16:29
2 participants