Conversation
Reviewed 2 of 4 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @tafsiri)
src/math.ts, line 19 at r1 (raw file):
/**
 * Exports under the tf.math.*
nit: append "namespace" at the end to make it a complete sentence.
src/math.ts, line 24 at r1 (raw file):
import {confusionMatrix} from './ops/confusion_matrix';
export const math = {confusionMatrix};
do export {confusionMatrix};
instead. Then in index.ts do:
import * as math from './math.ts'
export {....,math};
Like we do for other namespaces: "io", "webgl", "util" etc.
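For illustration, the suggested pattern would look roughly like this (a sketch; the other exports in index.ts are elided):

    // src/math.ts: re-export the op so it can be grouped under a namespace.
    export {confusionMatrix} from './ops/confusion_matrix';

    // src/index.ts: expose the module as the `math` namespace, like `io`, `webgl`, `util`.
    import * as math from './math';
    export {math};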
src/ops/confusion_matrix.ts, line 24 at r1 (raw file):
/**
 * Calcualte the confusion matrix.
How about: Computes the confusion matrix from predictions and labels.
src/ops/confusion_matrix.ts, line 38 at r1 (raw file):
 * ```
 *
 * @param {tf.Tensor1D} labels The target labels, assumed to be 0-based integers
jsdocs in typescript shouldn't have type annotations. We auto-generate them via our doc generator using the type information. Remove {tf.*} here and below.
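In other words, the doc comment would read something like the following sketch, with no {tf.*} annotations (the exact wording here is illustrative; the doc generator picks up types from the TypeScript signature):

    /**
     * Computes the confusion matrix from true labels and predicted labels.
     *
     * @param labels The target labels, assumed to be 0-based integers for the
     *     classes. The shape is `[numExamples]`.
     * @param predictions The predicted classes, in the same format as `labels`.
     * @param numClasses The number of distinct classes.
     */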
src/ops/confusion_matrix.ts, line 39 at r1 (raw file):
 *
 * @param {tf.Tensor1D} labels The target labels, assumed to be 0-based integers
 *     for the categories. The shape is `[numExamples]`, where
s/categories/class so we have fewer terms.
src/ops/confusion_matrix.ts, line 41 at r1 (raw file):
 *     for the categories. The shape is `[numExamples]`, where
 *     `numExamples` is the number of examples included.
 * @param {tf.Tensor1D} predictions The predicted probabilities, assumed to be
"predicted classes" since the predictions are not probabilities.
src/ops/confusion_matrix.ts, line 50 at r1 (raw file):
 *     `r` were predicted as class `c`.
 */
/** @doc {heading: 'Environment'} */
place it under heading: Operations, subheading: Evaluation , like we do for topk: https://js.tensorflow.org/api/latest/#Operations-Evaluation
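Concretely, the annotation would then read:

    /** @doc {heading: 'Operations', subheading: 'Evaluation'} */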
src/ops/confusion_matrix.ts, line 51 at r1 (raw file):
 */
/** @doc {heading: 'Environment'} */
export function confusionMatrix(
rename to confusionMatrix_
and at the bottom of the file:
export const confusionMatrix = op({confusionMatrix_});.
And no need for any tidy below.
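Put together, the file would end up shaped roughly like this (a sketch; the import paths follow the pattern of other tfjs-core ops and are not verified against this revision):

    import {oneHot} from './array_ops';
    import {op} from './operation';
    import {Tensor1D, Tensor2D} from '../tensor';

    /** @doc {heading: 'Operations', subheading: 'Evaluation'} */
    function confusionMatrix_(
        labels: Tensor1D, predictions: Tensor1D, numClasses: number): Tensor2D {
      const oneHotLabels = oneHot(labels, numClasses) as Tensor2D;
      const oneHotPredictions = oneHot(predictions, numClasses) as Tensor2D;
      // op() scopes the call, so no explicit tidy() is needed for the intermediates.
      return oneHotLabels.transpose().matMul(oneHotPredictions);
    }

    // The public symbol keeps the un-suffixed name.
    export const confusionMatrix = op({confusionMatrix_});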
src/ops/confusion_matrix.ts, line 66 at r1 (raw file):
util.assert(
    labels.shape[0] === predictions.shape[0],
    `Mismatch in the number of examples: ` +
Let's make it more explicit: Labels and predictions should have the same number of elements
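i.e. something along the lines of (a sketch of the message only):

    util.assert(
        labels.shape[0] === predictions.shape[0],
        `Labels and predictions should have the same number of elements, ` +
            `but got ${labels.shape[0]} and ${predictions.shape[0]}.`);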
src/ops/confusion_matrix.ts, line 68 at r1 (raw file):
    `Mismatch in the number of examples: ` +
        `${labels.shape[0]} vs. ${predictions.shape[0]}`);
util.assert(
do the $labels = convertToTensor(labels). Same for predictions. See other ops for this. Also add a unit test to check that the op accepts native arrays.
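A sketch of what that might look like (the helper's path and argument list, the tf.math.confusionMatrix public path, and the test harness follow other tfjs-core ops and are assumptions here):

    import {oneHot} from './array_ops';
    import {convertToTensor} from '../tensor_util';
    import {Tensor1D, Tensor2D} from '../tensor';

    function confusionMatrix_(
        labels: Tensor1D|number[], predictions: Tensor1D|number[],
        numClasses: number): Tensor2D {
      // Accept plain arrays as well as tensors, like other ops do.
      const $labels = convertToTensor(labels, 'labels', 'confusionMatrix') as Tensor1D;
      const $predictions =
          convertToTensor(predictions, 'predictions', 'confusionMatrix') as Tensor1D;
      const oneHotLabels = oneHot($labels, numClasses) as Tensor2D;
      const oneHotPredictions = oneHot($predictions, numClasses) as Tensor2D;
      return oneHotLabels.transpose().matMul(oneHotPredictions);
    }

    // Unit test checking that plain number arrays are accepted:
    it('accepts plain number arrays', () => {
      const out = tf.math.confusionMatrix([0, 1, 2, 2], [0, 2, 2, 2], 3);
      expect(out.shape).toEqual([3, 3]);
      expect(Array.from(out.dataSync())).toEqual([1, 0, 0, 0, 0, 1, 0, 0, 2]);
    });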
src/ops/confusion_matrix.ts, line 77 at r1 (raw file):
const oneHotLabels = oneHot(labels, numClasses);
const oneHotPredictions = oneHot(predictions, numClasses);
return oneHotLabels.transpose().matMul(oneHotPredictions);
This is O(C^2*N), while a CPU impl would give O(N). This means as soon as C=10, this GPU acceleration will erase all benefits since it's 100x more work than CPU. I think we should implement this in CPU and make it async with await .data(). Also, Shanqing, can you check what TF Python does? Do they have a GPU/CUDA implementation at all? cc @nsthorat.
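For reference, the O(N) CPU counting the reviewer has in mind is essentially the following (a plain TypeScript sketch, independent of the backend API):

    function confusionMatrixCpu(
        labels: Int32Array, predictions: Int32Array, numClasses: number): Int32Array {
      const result = new Int32Array(numClasses * numClasses);
      for (let i = 0; i < labels.length; ++i) {
        // Row index = true label, column index = predicted class.
        result[labels[i] * numClasses + predictions[i]] += 1;
      }
      return result;
    }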
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @tafsiri)
src/math.ts, line 19 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
nit: append "namespace" at the end to make it a complete sentence.
Done.
src/math.ts, line 24 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
do
export {confusionMatrix};
instead. Then in index.ts do: import * as math from './math.ts'; export {....,math};
Like we do for other namespaces: "io", "webgl", "util" etc.
Done.
src/ops/confusion_matrix.ts, line 24 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
How about: Computes the confusion matrix from predictions and labels.
Adopted with minor modifications.
src/ops/confusion_matrix.ts, line 38 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
jsdocs in typescript shouldn't have type annotations. We auto-generate them via our doc generator using the type information. Remove {tf.*} here and below.
Done.
src/ops/confusion_matrix.ts, line 39 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
s/categories/class so we have fewer terms.
Done.
src/ops/confusion_matrix.ts, line 41 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
"predicted classes" since the predictions are not probabilities.
Done.
src/ops/confusion_matrix.ts, line 50 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
place it under heading: Operations, subheading: Evaluation , like we do for topk: https://js.tensorflow.org/api/latest/#Operations-Evaluation
Oops. Sorry for the copy-paste error. Done.
src/ops/confusion_matrix.ts, line 51 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
rename to confusionMatrix_
and at the bottom of the file:
export const confusionMatrix = op({confusionMatrix_});. And no need for any tidy below.
Done.
src/ops/confusion_matrix.ts, line 66 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Let's make it more explicit: Labels and predictions should have the same number of elements
Done.
src/ops/confusion_matrix.ts, line 68 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
do the $labels = convertToTensor(labels). Same for predictions. See other ops for this. Also add a unit test to check that the op accepts native arrays.
Done. Thanks.
src/ops/confusion_matrix.ts, line 77 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
This is O(C^2*N), while a CPU impl would give O(N). This means as soon as C=10, this GPU acceleration will erase all benefits since it's 100x more work than CPU. I think we should implement this in CPU and make it async with await .data(). Also, Shanqing, can you check what TF Python does? Do they have a GPU/CUDA implementation at all? cc @nsthorat.
Good questions. TF Python uses their SparseTensor class, which is efficient and runs on the GPU. Unfortunately, we don't have sparse tensors yet. I think the current approach is fine, because it avoids any downloads. Making the function async would be very surprising to the user.
In terms of complexity, the matMul happens on the GPU, so it should be relatively fast. I just timed a confusionMatrix call with 1000 classes and 1000 examples; it takes about 330 ms on WebGL 2, which is acceptable IMO.
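For context, a micro-benchmark along those lines might look like this (a sketch; tf.math.confusionMatrix is the public path introduced in this PR, and the random inputs are stand-ins for real data):

    const numClasses = 1000;
    const numExamples = 1000;
    const labels =
        tf.randomUniform([numExamples], 0, numClasses, 'int32') as tf.Tensor1D;
    const predictions =
        tf.randomUniform([numExamples], 0, numClasses, 'int32') as tf.Tensor1D;

    const start = performance.now();
    const cm = tf.math.confusionMatrix(labels, predictions, numClasses);
    cm.dataSync();  // Block until the WebGL work has actually finished.
    console.log(`confusionMatrix: ${(performance.now() - start).toFixed(1)} ms`);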
+1 to usability over performance (keeping it sync is better than going async for a minor performance win).
Reviewed 4 of 4 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @tafsiri)
src/ops/confusion_matrix.ts, line 77 at r1 (raw file):
Previously, caisq (Shanqing Cai) wrote…
Good questions. TF Python uses their SparseTensor class, which is efficient and runs on the GPU. Unfortunately, we don't have sparse tensors yet. I think the current approach is fine, because it avoids any downloads. Making the function async would be very surprising to the user.
In terms of complexity, the matMul happens on the GPU, so it should be relatively fast. I just timed a confusionMatrix call with 1000 classes and 1000 examples; it takes about 330 ms on WebGL 2, which is acceptable IMO.
Got it. Let's go with this approach for now and we can revisit if users run into significant perf issues.
FEATURE
Fixes tensorflow/tfjs#771