Simplifications and some fun stuff for the MNIST Gluon tutorial #13094
Conversation
@kohr-h Thanks for your contribution! @mxnet-label-bot [pr-work-in-progress, Gluon]
Thanks for the much needed update @kohr-h. Can I bother you to update the data loading to be pure Gluon rather than using the old DataIter API? That way there is no need to reset the iterator between epochs, and each batch comes out of the iterator as a (data, label) tuple rather than as batch.data and batch.label:
train_dataset = mx.gluon.data.vision.MNIST(train=True).transform_first(mx.gluon.data.vision.transforms.ToTensor())
test_dataset = mx.gluon.data.vision.MNIST(train=False).transform_first(mx.gluon.data.vision.transforms.ToTensor())
train_data = mx.gluon.data.DataLoader(train_dataset, shuffle=True, last_batch='rollover', batch_size=batch_size, num_workers=2)
test_data = mx.gluon.data.DataLoader(test_dataset, shuffle=False, last_batch='rollover', batch_size=batch_size, num_workers=2)
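To illustrate the DataLoader contract described above without requiring MXNet, here is a toy stand-in (all names here are hypothetical, not real MXNet API): each epoch gets a fresh iterator, so no reset() call is needed, and every batch arrives as a (data, label) tuple instead of an object with .data and .label attributes.

```python
# Toy stand-in for the Gluon DataLoader iteration contract (not real MXNet).
class ToyDataLoader:
    def __init__(self, dataset, batch_size):
        self.dataset = dataset          # list of (sample, label) pairs
        self.batch_size = batch_size

    def __iter__(self):
        # A new generator is created per epoch, which is why no reset()
        # between epochs is required.
        for i in range(0, len(self.dataset), self.batch_size):
            chunk = self.dataset[i:i + self.batch_size]
            data = [x for x, _ in chunk]
            label = [y for _, y in chunk]
            yield data, label           # batch is a (data, label) tuple

dataset = [([0.1], 0), ([0.2], 1), ([0.3], 2), ([0.4], 3)]
loader = ToyDataLoader(dataset, batch_size=2)
for epoch in range(2):                  # second epoch works without a reset
    for data, label in loader:
        pass
```

The real MXNet loaders above follow the same pattern: just iterate them directly in the epoch loop.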
No problem. I've started doing that, more to come.
Thanks for your contribution @kohr-h
Thanks for the reminder @kalyc. I've got too many balls in the air. Will get this done ASAP.
I'm done with the rewording and restructuring of the example. Note that I also made the CNN part more basic by moving from a custom … Ready for review.
A lot of nit-picking on the formatting, but overall, once these are addressed, I think we should be good to merge. Great work on the wording and explanations. Thanks for the contribution 👍
- Apply hybrid blocks
- Move outputs outside of code blocks and mark for notebooks to ignore
- Remove images, use external link
- Fix a few formulations
I think I addressed all the comments from the review.
@sandeep-krishnamurthy @stu1130 for review
Looks great. Thank you.
One question: why is there an activation function in the last layer?
No real reason. I think I didn't change the original logic, but I can check again. If a different last activation is more reasonable for this tutorial, I can put it in, no problem.
Sigmoid activation is a better fit here to output the probabilities of the 10 digits.
@sandeep-krishnamurthy I switched both nets to sigmoid activation. Note, though, that they perform worse with the same settings. I've tested a few learning rates, and the current ones improve things a bit. But maybe it's okay for the scope of the tutorial that the training isn't perfectly fine-tuned.
Okay, now we're in the right ballpark with the results. Thanks @ThomasDelteil for the catch. |
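For context on why the last-layer activations ended up being removed again (see the final commit message below): Gluon's SoftmaxCrossEntropyLoss normalizes the raw output scores with a softmax internally, so squashing them through an extra sigmoid first is superfluous. A minimal pure-Python illustration of the two functions (a sketch for intuition, not the MXNet implementation):

```python
import math

def sigmoid(x):
    # Squashes a single score into (0, 1); per-class values don't sum to 1.
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]         # a proper distribution over classes

logits = [2.0, -1.0, 0.5]
probs = softmax(logits)
# softmax already yields a normalized probability distribution over the
# classes, so the network can emit raw logits and let the loss handle it.
```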
…icense file" (#13558)
* Revert "Chi_square_check for discrete distribution fix (#13543)" This reverts commit cf6e8cb.
* Revert "Updated docs for randint operator (#13541)" This reverts commit e0ff3c3.
* Revert "Simplifications and some fun stuff for the MNIST Gluon tutorial (#13094)" This reverts commit 8bbac82.
* Revert "Fix #13521 (#13537)" This reverts commit f6b4665.
* Revert "Add a retry to qemu_provision (#13551)" This reverts commit f6f8401.
* Revert "[MXNET-769] Use MXNET_HOME in a tempdir in windows to prevent access denied due t… (#13531)" This reverts commit bd8e0f8.
* Revert "[MXNET-1249] Fix Object Detector Performance with GPU (#13522)" This reverts commit 1c8972c.
* Revert "Fixing a 404 in the ubuntu setup doc (#13542)" This reverts commit cb0db29.
* Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file (#13478)" This reverts commit 40db619.
…he#13094)
* Simplify mnist Gluon tutorial and add mislabelled sample plotting
* Add mnist Gluon tutorial images
* Gluon MNIST tutorial: Use modern Gluon constructs, fix some wordings
* [Gluon] Move to data loaders and improve wording in MNIST tutorial
* Fix broken links
* Fix spelling of mislabeled
* Final rewordings and code simplifications
* Fix things according to review: apply hybrid blocks; move outputs outside of code blocks and mark for notebooks to ignore; remove images, use external link; fix a few formulations
* Change activations to sigmoid in MNIST tutorial
* Remove superfluous last layer activations in MNIST tutorial
* upstream/master: (54 commits)
  Add notes about debug with libstdc++ symbols (apache#13533)
  add cpp example inception to nightly test (apache#13534)
  Fix exception handling api doc (apache#13519)
  fix link for gluon model zoo (apache#13583)
  ONNX import/export: Size (apache#13112)
  Update MXNetTutorialTemplate.ipynb (apache#13568)
  fix the situation where idx didn't align with rec (apache#13550)
  Fix use-before-assignment in convert_dot (apache#13511)
  License update (apache#13565)
  Update version to v1.5.0 including clojure package (apache#13566)
  Fix flaky test test_random:test_randint_generator (apache#13498)
  Add workspace cleaning after job finished (apache#13490)
  Adding test for softmaxoutput (apache#13116)
  apache#13441 [Clojure] Add Spec Validations for the Random namespace (apache#13523)
  Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file" (apache#13558)
  Chi_square_check for discrete distribution fix (apache#13543)
  Updated docs for randint operator (apache#13541)
  Simplifications and some fun stuff for the MNIST Gluon tutorial (apache#13094)
  Fix apache#13521 (apache#13537)
  Add a retry to qemu_provision (apache#13551)
  ...
Description
This PR removes some of the (IMO) unnecessary complexity from the tutorial (one of the first that new users click on) by always using a single GPU or a single CPU. This makes the MLP training code much more straightforward (no data splitting, no inner loop).
I also think that there was too much staccato of obvious comments in that part, while the truly non-obvious parts were left uncommented (e.g., what is ag.record()?). I tried to straighten that part up a bit, although I think some comments are still a bit on the obvious side. Furthermore, I added a small section where some of the mislabeled (apparently it's one "l", need to fix that) images are plotted together with the predicted and true labels. I think that's more fun than just looking at an accuracy score.
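The new mislabeled-sample section boils down to comparing predicted and true labels and collecting the indices where they disagree. A minimal sketch of that selection step with made-up labels (plain Python, no MXNet; the values are purely illustrative):

```python
# Hypothetical predictions and ground-truth labels for a handful of test images.
preds  = [7, 2, 1, 0, 4, 1, 4, 9]
labels = [7, 2, 6, 0, 4, 1, 9, 9]

# Indices where the model disagrees with the ground truth; in the tutorial
# these indices select the images that get plotted together with their
# predicted and true labels.
mislabeled = [i for i, (p, t) in enumerate(zip(preds, labels)) if p != t]
print(mislabeled)  # → [2, 6]
```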
I'm not finished with the whole tutorial yet, so it's WIP, but I'd like to know what you think before I put more work into this PR.
Note: I've included the 2 new images in the same directory as the tutorial for convenience; they would obviously need to go somewhere else.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.