Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

FIX: flaky test exponential generator #14287

Merged
merged 2 commits into from
Mar 4, 2019
Merged

Conversation

access2rohit
Copy link
Contributor

@access2rohit access2rohit commented Feb 28, 2019

Description

change success_rate to 0.20 for test_exponential_generator since 0.25 much stricter and fails random exponential generator tests.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Testing

Command:
ubuntu@ip-172-31-82-110 ~/Workspace/mxnet/tests/python/unittest (master) $ MXNET_TEST_COUNT=10000 nosetests --logging-level=DEBUG --verbose -s test_random.py:test_exponential_generator
Output:
[DEBUG] 10000 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=291062812 to reproduce.
ok


Ran 1 test in 46911.498s

OK

@access2rohit access2rohit requested a review from szha as a code owner February 28, 2019 22:24
@access2rohit
Copy link
Contributor Author

@anirudh2290 @eric-haibin-lin @apeforest Can you please review ?

@access2rohit access2rohit changed the title FIX: falky test exponential generator FIX: flaky test exponential generator Feb 28, 2019
@@ -1933,7 +1933,7 @@ def chi_square_check(generator, buckets, probs, nsamples=1000000):
_, p = ss.chisquare(f_obs=obs_freq, f_exp=expected_freq)
return p, obs_freq, expected_freq

def verify_generator(generator, buckets, probs, nsamples=1000000, nrepeat=5, success_rate=0.25, alpha=0.05):
def verify_generator(generator, buckets, probs, nsamples=1000000, nrepeat=5, success_rate=0.15, alpha=0.05):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did we change the value from 0.15 to 0.25?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's changed from 0.25 to 0.15, but I have the same question. Why this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question. The comment in this method says "25%" success rate. Why do we change it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuxihu @mseth10 @apeforest
It was changed in this PR: /~https://github.com/apache/incubator-mxnet/pull/13498/files
Even though It is mentioned in the comments that success rate should be 25%. There isn't any data to support why it should be kep at 25%. It could have been a typo too back then. SO, reverting it to original value seemed more logical.

Copy link
Contributor Author

@access2rohit access2rohit Feb 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is, due to increasing the strictness of this check for all generators it might cause them to become flaky. I can also explicitly relax the criteria for only this generator and we can observe the behavior of others and make changes accordingly. Setting it to a fixed value for all the generators doesn't seem like a good idea since all the random generators are different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuxihu : Actually No. He in fact made the check more strict. I am simply relaxing it. Therefore it shouldn't make another test flaky. But to make our tests better what I am suggesting is that i let it stay at 0.25 as default and explicitly pass 0.15 for exponential generator check. If others don't fail maybe only this one needed adjustment. Do you think this approach is better ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine to revert it to 0.25 to reduce flakiness. it wont make the other test flaky since this change is decreasing the failure chances.

Copy link
Member

@yuxihu yuxihu Mar 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@access2rohit I like your idea of passing 0.15 for exponential generator check and let 0.25 be the default for others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuxihu @apeforest @mseth10 @anirudh2290 : I have updated successrate to 20% which is less stricter but not too loose a check. Currently running it 10k times and will update the PR once they succeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests passed with success_rate=0.20

@anirudhacharya
Copy link
Member

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Mar 1, 2019
Copy link
Contributor

@ChaiBapchya ChaiBapchya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (after going through discussion and realizing that probably exponential generator needs relaxed checks)

Copy link
Contributor

@apeforest apeforest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@access2rohit Please make the change as you commented: make 0.25 as default and pass in 0.15 for exponential generator check.

Copy link
Member

@yuxihu yuxihu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@access2rohit
Copy link
Contributor Author

@anirudh2290 @apeforest can you review as well ?

Copy link
Member

@yuxihu yuxihu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more question: has this test been disabled? If so, do you also need to enable it in this PR?

Copy link
Contributor

@mseth10 mseth10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@access2rohit
Copy link
Contributor Author

@yuxihu : It still hasn't been merged. #14188

@anirudh2290 anirudh2290 merged commit 6152ffa into apache:master Mar 4, 2019
vdantu pushed a commit to vdantu/incubator-mxnet that referenced this pull request Mar 31, 2019
* changing success_percentage test correctness of random exponential generator to 20% to fix its flakiness

* Re-Trigger build
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* changing success_percentage test correctness of random exponential generator to 20% to fix its flakiness

* Re-Trigger build
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants