Update lip reading example #13647

seujung · 2018-12-14T06:11:27Z

Description

Add a lip reading model using Gluon.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

roywei · 2018-12-14T17:45:51Z

@mxnet-label-bot add[Example, Gluon, pr-awaiting-review]

aaronmarkham

I haven't been able to test this end-to-end yet. I've tried the data download process a couple of times, but have had to restart due to connectivity and space issues. I'll try again later, but I thought I'd at least give some initial feedback.
I'm looking forward to seeing this work. It seems like a very cool example. Thanks for sharing it.

example/gluon/lipnet/README.md

aaronmarkham · 2018-12-27T08:30:48Z

example/gluon/lipnet/lipnet_model.ipynb

+    "args = dict()\n",
+    "args['batch_size'] = 64\n",
+    "args['epochs'] = 100\n",
+    "args['image_path'] = '/home/ubuntu/works/2018/lips_model/data/datasets/'\n",


Might make it easier if you used paths relative to where this is in the examples folder and where the data gets downloaded.
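
A minimal sketch of that suggestion, assuming the data is downloaded into a `data/datasets` folder inside the example folder. The folder names and the `resolve_data_path` helper are illustrative and not taken from the PR:

```python
from pathlib import Path


def resolve_data_path(base_dir, *parts):
    """Join `parts` onto `base_dir` and return an absolute path as a string."""
    return str(Path(base_dir).resolve().joinpath(*parts))


# In a script, the base is usually the folder containing the file itself:
#   base_dir = Path(__file__).parent
# In a notebook `__file__` is undefined, so the working directory is used here.
base_dir = Path.cwd()

args = dict()
args['batch_size'] = 64
args['epochs'] = 100
# Hypothetical layout: <example folder>/data/datasets
# Append a trailing separator if the loading code concatenates file names
# directly onto this string.
args['image_path'] = resolve_data_path(base_dir, 'data', 'datasets')
```

With this approach the notebook no longer depends on one user's home directory.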

example/gluon/lipnet/README.md

example/gluon/lipnet/utils/download_data.py

aaronmarkham · 2018-12-27T09:11:03Z

example/gluon/lipnet/utils/multi.py

+
+def split_seq(sam_num, n_tile):
+    """
+    Spli the number(sam_num) into numbers by n_tile


Suggested change

Spli the number(sam_num) into numbers by n_tile

Split the number(sam_num) into numbers by n_tile
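
The body of `split_seq` is not shown in this thread. One plausible reading of the docstring, offered only as an assumption, is that it divides `sam_num` samples into `n_tile` contiguous, nearly equal index ranges (for example, to hand one range to each worker process). A sketch under that assumption, using the hypothetical name `split_seq_sketch`:

```python
def split_seq_sketch(sam_num, n_tile):
    """Yield (start, end) index pairs covering range(sam_num) in n_tile chunks.

    This is an illustrative guess at the intent, not the PR's implementation.
    """
    if n_tile <= 0:
        raise ValueError("n_tile must be positive")
    base, extra = divmod(sam_num, n_tile)
    start = 0
    for i in range(n_tile):
        # The first `extra` chunks get one additional sample each.
        size = base + (1 if i < extra else 0)
        yield (start, start + size)
        start += size
```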

example/gluon/lipnet/utils/multi.py

example/gluon/lipnet/utils/preprocess_data.py

Co-Authored-By: seujung <digit82@gmail.com>

aaronmarkham · 2018-12-28T06:54:11Z

Please add these to your prerequisites list:

scikit-image
scikit-video
dlib
tqdm

Also, I tried to run it without a GPU and couldn't get it to work with:

python3 main.py --use_gpu False 
# or 0

Raises this error:

mxnet.base.MXNetError: [06:52:06] src/ndarray/ndarray.cc:1233: GPU is not enabled
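
The cause is not identified in this thread. One common pitfall that produces this symptom, mentioned here only as an assumption, is declaring a boolean flag with `type=bool` in `argparse`: any non-empty string, including `'False'` and `'0'`, is truthy, so the flag can never be switched off. A sketch of a stricter parser follows; `str2bool` and `build_parser` are hypothetical names, not from the PR:

```python
import argparse


def str2bool(value):
    """Convert common true/false spellings to a bool for argparse."""
    text = str(value).strip().lower()
    if text in ('true', 't', 'yes', 'y', '1'):
        return True
    if text in ('false', 'f', 'no', 'n', '0'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got %r' % value)


def build_parser():
    """Build a parser for a hypothetical training CLI with a --use_gpu flag."""
    parser = argparse.ArgumentParser(description='hypothetical training CLI')
    parser.add_argument('--use_gpu', type=str2bool, default=True)
    return parser


# bool('False') is True, which is why type=bool cannot turn the flag off.
naive = bool('False')
parsed = build_parser().parse_args(['--use_gpu', 'False'])
```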

aaronmarkham · 2018-12-28T07:39:48Z

I built the project on a GPU instance this time and was able to run main.py. However, I immediately get a dump of a lot of these errors:

  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 166, in _recursive_fork_recordio
    if depth >= max_depth:
RecursionError: maximum recursion depth exceeded in comparison

Looks like it ran that 200 times and failed each time.

* Split data into train and valid * Update Readme * Add infer.py * Remove ipynb * Apply to continual learning

vandanavk · 2019-02-05T19:36:58Z

is this PR good to go @thomelane @larroy ?

thomelane · 2019-02-07T00:19:22Z

@aaronmarkham any chance you could zip up all of the preprocessed files, to avoid aws s3 sync downloading loads of small files? would definitely save time and bandwidth.

And then we need to get those instructions into the README, so people don't try to download and preprocess the data themselves.

aaronmarkham · 2019-02-07T01:22:34Z

@thomelane Ok, I'm creating the tar files now. The reason I didn't do that is that I kept getting disconnected and wanted to be able to resume a sync. If you're pulling a 15 GB file and have to start over, well, that's no fun.
At least people will have the option for either route once I've uploaded the files.
Rather than force another CI pass, if the code is alright here, let's merge it and then I can update the README in a follow-up PR.

aaronmarkham · 2019-02-07T22:27:06Z

I put the tar files in a separate bucket so you can pick how you want to download.

To get the tar files:

 aws s3 sync s3://mxnet-public/lipnet/data-archives .

Or to download them by link:
https://mxnet-public.s3.amazonaws.com/lipnet/data-archives/align.tgz
https://mxnet-public.s3.amazonaws.com/lipnet/data-archives/datasets.tgz

To get the folders (unzipped):

 aws s3 sync s3://mxnet-public/lipnet/data .
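
Once an archive such as `datasets.tgz` has been downloaded, it still needs to be unpacked. A minimal sketch using Python's standard `tarfile` module; the `extract_archive` helper and the target folder are illustrative and not part of the PR:

```python
import os
import tarfile


def extract_archive(archive_path, target_dir):
    """Extract a .tgz archive into target_dir and return the member names."""
    os.makedirs(target_dir, exist_ok=True)
    root = os.path.realpath(target_dir)
    with tarfile.open(archive_path, 'r:gz') as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries that would be written outside target_dir.
            dest = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, dest]) != root:
                raise ValueError('unsafe path in archive: %s' % member.name)
        archive.extractall(root)
    return [member.name for member in members]
```

For example, `extract_archive('datasets.tgz', 'data')` would unpack the archive into a local `data` folder.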

thomelane · 2019-02-08T23:52:21Z

@aaronmarkham thanks for uploading! sure, you can add the instructions to the readme in a different commit, wouldn't that need a CI run anyway, or has that been optimised now to ignore markdown changes?

@seujung the model seems to be training okay (i.e. loss going down), but still noticeable differences between target and prediction. How good are the predictions on a correctly trained model? Also noticed that things like learning rate aren't explicitly defined, are the defaults correct for this model?

soeque1 · 2019-02-10T13:22:31Z

@aaronmarkham I checked the files you uploaded. They work as intended.
@thomelane, @seujung and I are working together on this example.
(1) After some experiments, we decided to use the default learning rate.
(2) We checked the predictions, decoded using beam search.

Training this model takes a long time. The main reason is the decoding step (beam search) on the validation data (def infer_batch). We do not actually need to decode every validation example during training, so we skip it or check only one mini-batch. To speed things up, we check the decoded results only with infer.py, not with main.py (training).
Still, decoding helps to show how good the result is.

(3) Although the loss is still decreasing, I have attached a pre-trained model.
Model
Sample

You can get the result with:

python infer.py model_path='checkpoint/epoches_81_loss_15.7157'

Or you can resume training:

python main.py model_path='checkpoint/epoches_81_loss_15.7157'
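
The point in (2) about limiting decoding during training can be sketched as follows. This assumes a validation loader that yields `(data, label)` batches and two callables, `compute_loss` and `decode`, that stand in for the loss computation and the beam-search decoder. All names are hypothetical; the actual `infer_batch` in the PR may be organized differently:

```python
def validate(batches, compute_loss, decode, max_decoded_batches=1):
    """Average the loss over all batches, decoding only the first few.

    Decoding (e.g. beam search) is the slow part, so it is limited to
    `max_decoded_batches` mini-batches; the loss is still computed on all.
    """
    total_loss = 0.0
    count = 0
    decoded = []
    for index, (data, label) in enumerate(batches):
        total_loss += compute_loss(data, label)
        count += 1
        if index < max_decoded_batches:
            decoded.append(decode(data))
    mean_loss = total_loss / count if count else float('nan')
    return mean_loss, decoded
```

Setting `max_decoded_batches=0` skips decoding entirely during training, leaving full decoding to a separate inference run.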

thomelane · 2019-02-11T20:02:30Z

Great, thanks for clarifying @soeque1!

thomelane · 2019-02-11T20:14:14Z

LGTM

szha · 2019-02-13T05:37:25Z

example/gluon/lipnet/README.md

@@ -0,0 +1,194 @@
+# LipNet: End-to-End Sentence-level Lipreading


Suggested change

# LipNet: End-to-End Sentence-level Lipreading



# LipNet: End-to-End Sentence-level Lipreading

License isn't required on readme files. @szha if you feel strongly about adding it, I'm going to modify the readme in another PR later today and I can add it then.

* update lipnet
* update utils
* Update example/gluon/lipnet/README.md Co-Authored-By: seujung <digit82@gmail.com>
* Update example/gluon/lipnet/README.md Co-Authored-By: seujung <digit82@gmail.com>
* Update example/gluon/lipnet/utils/multi.py Co-Authored-By: seujung <digit82@gmail.com>
* Update example/gluon/lipnet/utils/preprocess_data.py Co-Authored-By: seujung <digit82@gmail.com>
* Update example/gluon/lipnet/utils/multi.py Co-Authored-By: seujung <digit82@gmail.com>
* Update example/gluon/lipnet/utils/download_data.py Co-Authored-By: seujung <digit82@gmail.com>
* fix error for using gpu mode
* Add requirements
* Remove unnecessary requirements
* Update .gitignore
* Remove inappropriate license file
* Changed relative path
* Fix description
* Fix description
* Fix description
* Fix description
* Change doc strings and add url reference
* Fix align_path
* Remove zip files
* Fix bugs: source_path, n_process
* Fix target_path
* Fix exception handler and resume the preprocess
* Pass the output when it fails to detect the mouth
* Add exception during collecting images
* Add the disk space and fix default align_path
* Change optimizer
* Update readme for pip
* Update README
* Add checkpoint folder
* Apply to train using multiprocess
* update network.py
* delete batchnorm comment
* fix dropout
* fix loading ndarray as F
* add space
* Update readme
* Add the info of GRID Data
* Add the info of word alignments
* Add total download size
* Add time for preprocessing
* Add test code for beamsearch
* add space
* delete line and fix code
* Add shebang in BeamSearch
* Fix trainer
* Add space line
* Fix appeding losses
* Fix trainer
* Delete debug line in data_loader
* Move transpose of input into data_loader
* Delete trailing-whitespace
* Hybridize lip model
* Hybridize model
* Refactor the len of input sequence
* Fix the shape of model
* Apply to split train and validation
* Split data into train and valid
* Update Readme
* Add infer.py
* Remove ipynb
* Apply to continual learning
* Add images
* Update readme
* Fix typo and pylint
* Fix loss digits of save_file and typo
* Add info of data split and batch size

Tech. Prototyping Group 정승환 and others added 2 commits December 14, 2018 14:56

update lipnet

4ce3c9d

update utils

a2d237c

seujung requested a review from szha as a code owner December 14, 2018 06:11

marcoabreu added the Example, Gluon, and pr-awaiting-review (PR is waiting for code review) labels Dec 14, 2018

aaronmarkham suggested changes Dec 27, 2018

aaronmarkham and others added 6 commits December 27, 2018 22:57

Update example/gluon/lipnet/README.md

c6007ea

Co-Authored-By: seujung <digit82@gmail.com>

Update example/gluon/lipnet/README.md

6cd8667

Co-Authored-By: seujung <digit82@gmail.com>

Update example/gluon/lipnet/utils/multi.py

a0071d5

Co-Authored-By: seujung <digit82@gmail.com>

Update example/gluon/lipnet/utils/preprocess_data.py

5f78f05

Co-Authored-By: seujung <digit82@gmail.com>

Update example/gluon/lipnet/utils/multi.py

089455d

Co-Authored-By: seujung <digit82@gmail.com>

Update example/gluon/lipnet/utils/download_data.py

ab79109

Co-Authored-By: seujung <digit82@gmail.com>

seujung and others added 14 commits December 28, 2018 16:58

fix error for using gpu mode

9f10967

Add requirements

4aa4640

Remove unnecessary requirements

c5503d9

Update .gitignore

efe6295

Remove inappropriate license file

a958ad9

Changed relative path

3e8a709

Fix description

4e7ba27

Fix description

b8fbb26

Fix description

ac509a5

Fix description

ddeb117

Change doc strings and add url reference

271f3ac

Fix align_path

2ba0b90

Remove zip files

71d779d

Fix bugs: source_path, n_process

a9da0e0

soeque1 added 7 commits January 25, 2019 10:01

Fix the shape of model

a18a96b

Apply to split train and validation

66c1b94

* Split data into train and valid * Update Readme * Add infer.py * Remove ipynb * Apply to continual learning

Add images

05009c8

Update readme

b2f8d51

Fix typo and pylint

97dbcde

Fix loss digits of save_file and typo

ed3e4c1

Add info of data split and batch size

de1eb6b

soeque1 approved these changes Jan 27, 2019

soeque1 approved these changes Jan 28, 2019

szha reviewed Feb 13, 2019

szha approved these changes Feb 13, 2019

aaronmarkham merged commit 7ff6ad1 into apache:master Feb 13, 2019

szha mentioned this pull request Feb 13, 2019

fix website build #14148

Merged

4 tasks

soeque1 deleted the lipnet branch March 10, 2019 12:40

-# LipNet: End-to-End Sentence-level Lipreading
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+# LipNet: End-to-End Sentence-level Lipreading
