Commit

DOC fix: The algorithm explained - and implemented - in K-Medoids… (#44)
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
kno10 and rth authored Mar 29, 2020
1 parent ffcf96f commit 0008438
Showing 2 changed files with 12 additions and 12 deletions.
17 changes: 8 additions & 9 deletions doc/user_guide.rst
@@ -49,12 +49,11 @@ clusters. This makes it more suitable for smaller datasets in comparison to

**Algorithm description:**
There are several algorithms to compute K-Medoids, though :class:`KMedoids`
currently only supports Partitioning Around Medoids (PAM). The PAM algorithm
uses a greedy search, which may fail to find the global optimum. It consists of
two alternating steps commonly called the
Assignment and Update steps (BUILD and SWAP in Kaufman and Rousseeuw, 1987).
currently only supports a K-Medoids solver analogous to K-Means. Another
frequently used approach is Partitioning Around Medoids (PAM), which is
currently not implemented.

PAM works as follows:
This version works as follows:

* Initialize: Select ``n_clusters`` from the dataset as the medoids using
a heuristic, random, or k-medoids++ approach (configurable using the ``init`` parameter).
@@ -65,7 +64,7 @@ PAM works as follows:

.. topic:: References:

* "Clustering by Means of Medoids"
Kaufman, L. and Rousseeuw, P.J.,
Statistical Data Analysis Based on the L1–Norm and Related Methods, edited
by Y. Dodge, North-Holland, 405–416. 1987
* Maranzana, F.E., 1963. On the location of supply points to minimize
transportation costs. IBM Systems Journal, 2(2), pp.129-135.
* Park, H.S. and Jun, C.H., 2009. A simple and fast algorithm for K-medoids
clustering. Expert systems with applications, 36(2), pp.3336-3341.
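
The alternating scheme this hunk describes (initialize the medoids, then repeat assignment and update steps in the style of K-Means) can be sketched in plain Python. This is an illustration only, not the scikit-learn-extra implementation: the function name ``alternate_k_medoids``, its parameters, the purely random initialization, and the empty-cluster guard are all assumptions made for the example.

```python
import random


def alternate_k_medoids(points, n_clusters, dist, max_iter=100, seed=0):
    """Illustrative K-Means-style K-Medoids (hypothetical helper, not library code).

    Returns the indices of the chosen medoids and one cluster label per point.
    """
    rng = random.Random(seed)
    # Initialize: pick n_clusters distinct points at random as the medoids
    # (assumption: the simplest of the possible init strategies).
    medoids = rng.sample(range(len(points)), n_clusters)
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its closest medoid.
        labels = [
            min(range(n_clusters), key=lambda c: dist(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: within each cluster, pick the member that minimizes
        # the summed distance to all members of that cluster.
        new_medoids = []
        for c in range(n_clusters):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Guard (assumption): keep the old medoid for an empty cluster.
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # No medoid changed, so the assignment is stable.
        medoids = new_medoids
    return medoids, labels


def manhattan(a, b):
    """L1 distance between two equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoid_idx, cluster_labels = alternate_k_medoids(data, 2, manhattan)
```

Because each update step only considers members of the current cluster, the result depends on the initialization and may be a local optimum; the search differs from PAM's SWAP, which evaluates exchanging medoids with non-medoid points across the whole dataset.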
7 changes: 4 additions & 3 deletions sklearn_extra/cluster/_k_medoids.py
@@ -90,9 +90,10 @@ class KMedoids(BaseEstimator, ClusterMixin, TransformerMixin):
References
----------
Kaufman, L. and Rousseeuw, P.J., Statistical Data Analysis Based on
the L1–Norm and Related Methods, edited by Y. Dodge, North-Holland,
405–416. 1987
Maranzana, F.E., 1963. On the location of supply points to minimize
transportation costs. IBM Systems Journal, 2(2), pp.129-135.
Park, H.S. and Jun, C.H., 2009. A simple and fast algorithm for K-medoids
clustering. Expert systems with applications, 36(2), pp.3336-3341.
See also
--------