Fix function generation reproducability #240

Merged: 3 commits merged on Apr 20, 2022


jmid (Collaborator) commented Apr 19, 2022

This PR fixes #236

QCheck and QCheck2 generate functions by populating (and side-effecting) an underlying Hashtbl, packaged as Poly_tbl.
Applying the function to a new argument triggers a generator call and adds the new binding to the underlying table.
Since that generator call side-effects the state of the underlying Random.State, the number of function applications may silently affect the output of another generator, e.g., if the function generator is part of a tuple, as in the example from #236:

(triple small_int (fun1 Observable.int small_int) small_int)

This side-effecting clearly breaks (what we tend to think of as) independence of the individual tuple entry generators.
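
The effect can be reproduced in a toy model that uses only the standard library. This is a sketch, not QCheck's actual code: `gen_fun` and `draws_after_apps` are hypothetical names, and the table is a plain `Hashtbl` rather than `Poly_tbl`. It shows that the values drawn from a state after building a function depend on how many times that function has been applied.

```ocaml
(* Toy function generator (hypothetical, not QCheck's implementation):
   every application to a new argument draws from the shared state [rs]
   and memoises the result in a table. *)
let gen_fun (rs : Random.State.t) : int -> int =
  let tbl = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt tbl x with
    | Some y -> y
    | None ->
      let y = Random.State.int rs 100 in
      Hashtbl.add tbl x y;
      y

(* Build a function from [seed], apply it to [n_apps] distinct arguments,
   then return the next two raw draws from the same state. *)
let draws_after_apps (seed : int) (n_apps : int) : int * int =
  let rs = Random.State.make [| seed |] in
  let f = gen_fun rs in
  for i = 1 to n_apps do ignore (f i) done;
  let a = Random.State.bits rs in
  let b = Random.State.bits rs in
  (a, b)
```

In this model, `draws_after_apps 42 0` and `draws_after_apps 42 3` start from the same seed, yet the later draws are expected to differ because each of the three applications consumed randomness from the shared state.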

The PR fixes the function generator by splitting the Random.State and letting later function applications side-effect the split copy instead, thus restoring (some form of) independence. We do so for both QCheck and QCheck2.

RS.split is part of the Random.State interface of the forthcoming OCaml 5.0 release, but we add a "poor man's" split for backwards compatibility on OCaml < 5.0.
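
A minimal sketch of the idea, again using only the standard library. The fallback `split` below is an assumption about what such a "poor man's" split could look like (seeding a fresh state from bits drawn off the parent); it is not taken from the patch. `gen_fun_split` and `draws_after_apps_split` are hypothetical names. On OCaml 5.0 and later, `Random.State.split` could be used in place of the local `split`.

```ocaml
(* Assumed fallback for OCaml < 5.0: draw bits from the parent state and
   use them to seed a fresh, separate child state. *)
let split (rs : Random.State.t) : Random.State.t =
  Random.State.make [| Random.State.bits rs |]

(* Toy function generator that splits once, up front. Later applications
   draw from the private child state and never touch [rs] again. *)
let gen_fun_split (rs : Random.State.t) : int -> int =
  let child = split rs in
  let tbl = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt tbl x with
    | Some y -> y
    | None ->
      let y = Random.State.int child 100 in
      Hashtbl.add tbl x y;
      y

(* Build a function from [seed], apply it to [n_apps] distinct arguments,
   then return the next two raw draws from the parent state. *)
let draws_after_apps_split (seed : int) (n_apps : int) : int * int =
  let rs = Random.State.make [| seed |] in
  let f = gen_fun_split rs in
  for i = 1 to n_apps do ignore (f i) done;
  let a = Random.State.bits rs in
  let b = Random.State.bits rs in
  (a, b)
```

Here the parent state advances exactly once, when the function is created, so the draws that follow are the same whether the function is applied zero times or many times. The generated function itself stays reproducible for a given seed, since the child state is derived deterministically from the parent.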

Finally, the PR adds tests with repeated reruns. Without the fix, both of them printed different output when repeated. With the proposed patch, both are stable.

jmid (Collaborator, Author) commented Apr 19, 2022

With a revamped shrinker performance benchmark, below is the output of running it on this PR's fixed function generators.

Adding up all the timings, this PR represents

  • a speed-up for the QCheck2 shrinkers: the run of test fold_left fold_right uncurried with seed 8743 drops from 9.131s to 2.225s, accounting for most of the QCheck2 improvement
  • a slow-down for the QCheck shrinkers: the run of test fold_left test, fun first with seed 8743 goes from 20.486s to 43.448s, and from 125 successful shrink attempts out of 27750 to 191 out of 44563. That test run alone accounts for a 23s slow-down (it is fixed by Shrinker improvements #235).

                                                         iteration seed 1234                   iteration seed 8743                   iteration seed 6789               total
Shrink test name                                  Q1/s  #succ/#att   Q2/s  #succ/#att   Q1/s  #succ/#att   Q2/s  #succ/#att   Q1/s  #succ/#att   Q2/s  #succ/#att    Q1/s   Q2/s
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
big bound issue59                                - skipped as generator is stateful, making it non-repeatable
long_shrink                                       0.023  149/351     0.917 3039/3099    0.005  148/349     0.594 3068/3127    0.008  146/345     0.673 3063/3124    0.036  2.184
ints arent 0 mod 3                                0.000   84/216     0.000    2/2       0.000   71/204     0.000    1/1       0.000   88/215     0.000   88/305     0.000  0.000
ints are 0                                        0.000   62/63      0.000   61/123     0.000   61/62      0.000   61/122     0.000   62/63      0.000   61/123     0.000  0.000
ints < 209609                                    - skipped as generator is stateful, making it non-repeatable
nat < 5001                                        0.000    6/56      0.000    7/77      0.000    6/47      0.000    7/69      0.000    3/24      0.000    8/85      0.000  0.000
char never produces 'abcdef'                      0.000    0/0       0.000    1/1       0.000    0/0       0.000    0/0       0.000    0/0       0.000    0/0       0.000  0.000
strings are empty                                 0.000  249/250     0.000    8/16      0.030 4466/4467    0.001   13/26      0.000    0/1       0.000    1/2       0.030  0.001
string never has a \000 char                      0.000   25/40      0.000   22/167     0.088 4466/4519    0.001   56/254     0.000   15/18      0.000   15/48      0.088  0.001
string never has a \255 char                      0.001  249/316     0.001   59/318     0.090 4466/4520    0.002   97/529     0.378 9260/9365    0.002   41/194     0.469  0.005
strings have unique chars                         0.003  248/269     0.000   18/30      0.922 4465/4536    0.002   24/52      0.000   14/34      0.000   15/20      0.925  0.002
pairs have different components                   0.000    0/4       0.000    0/6       0.000    0/4       0.000    0/6       0.000    0/6       0.000    0/10      0.000  0.000
pairs have same components                        0.000  125/126     0.000   63/125     0.000  124/125     0.000   62/123     0.000  119/120     0.000   63/125     0.000  0.000
pairs have a zero component                       0.000  124/188     0.000  122/306     0.000  123/186     0.000  122/306     0.000  118/182     0.000  123/308     0.000  0.000
pairs are (0,0)                                   0.000  125/126     0.000   63/125     0.000  124/125     0.000   62/123     0.000  119/120     0.000   63/125     0.000  0.000
pairs are ordered                                 0.000  827/17626   0.000   94/1217    0.000  690/12946   0.000   85/865     0.000  687/13963   0.000   94/1326    0.001  0.001
pairs are ordered reversely                       0.000  124/125     0.000   62/124     0.000  123/124     0.000   62/124     0.000  122/123     0.000   62/124     0.000  0.000
pairs sum to less than 128                        0.000  116/129     0.000   56/126     0.000  120/146     0.000   59/138     0.000  119/141     0.000   57/130     0.000  0.000
pairs lists rev concat                            0.014  140/332     0.009   83/168     0.008  137/335     0.002   75/152     0.000  130/318     0.000   67/136     0.022  0.011
pairs lists no overlap                            0.001   22/47      0.003   27/60      0.000   17/41      0.002   18/41      0.000    6/20      0.000   11/28      0.001  0.005
triples have pair-wise different components       0.000    7/31      0.000    3/15      0.000    6/6       0.000    3/3       0.000    2/6       0.000    3/3       0.000  0.000
triples have same components                      0.000  188/252     0.000   64/127     0.000  177/240     0.000   64/128     0.000  182/246     0.000   62/122     0.000  0.000
triples are ordered                               0.000  188/252     0.000    3/4       0.000  177/178     0.000    3/4       0.000  187/250     0.000   91/1021    0.000  0.000
triples are ordered reversely                     0.000  188/189     0.000   64/126     0.000  177/240     0.000  124/247     0.000  182/183     0.000   65/127     0.000  0.000
quadruples have pair-wise different components    0.000   23/41      0.000    4/4       0.000   11/11      0.000    4/4       0.000   14/38      0.000    4/11      0.000  0.000
quadruples have same components                   0.000  250/377     0.000  126/313     0.000  237/424     0.000  115/292     0.000  242/425     0.000  123/307     0.000  0.000
quadruples are ordered                            0.000  251/315     0.000    5/6       0.000  239/240     0.000    4/5       0.000  244/308     0.000    5/6       0.000  0.000
quadruples are ordered reversely                  0.000  251/252     0.000   66/128     0.000  239/302     0.000  126/250     0.000  244/245     0.000   66/128     0.000  0.000
forall (a, b) in nat: a < b                       0.000   13/23      0.000    6/16      0.000   10/15      0.000    6/15      0.000    5/6       0.000    4/7       0.000  0.000
forall (a, b, c) in nat: a < b < c                0.000   15/22      0.000    3/7       0.000   26/53      0.000    7/28      0.000    9/9       0.000    3/3       0.000  0.000
forall (a, b, c, d) in nat: a < b < c < d         0.000   23/29      0.000    4/4       0.000   30/56      0.000    4/4       0.000   13/13      0.000    4/4       0.000  0.000
forall (a, b, c, d, e) in nat: a < b < c < d < e  0.000   28/28      0.000    5/5       0.000   33/33      0.000    5/5       0.000   14/14      0.000    5/5       0.000  0.000
forall (a, b, c, d, e, f) in nat: a < b < c < d   0.000   30/30      0.000    6/6       0.000   38/38      0.000    6/6       0.000   16/16      0.000    6/6       0.000  0.000
forall (a, b, c, d, e, f, g) in nat: a < b < c <  0.000   31/31      0.000    7/7       0.000   41/41      0.000    7/7       0.000   22/22      0.000    7/7       0.000  0.000
forall (a, b, c, d, e, f, g, h) in nat: a < b <   0.000   35/35      0.000    8/8       0.000   48/48      0.000    8/8       0.000   22/22      0.000    7/7       0.000  0.000
forall (a, b, c, d, e, f, g, h, i) in nat: a < b  0.000   42/42      0.000    9/9       0.000   55/55      0.000    9/9       0.000   26/26      0.000    8/8       0.000  0.000
bind ordered pairs                                0.000  125/125     0.000    1/1       0.000  124/124     0.000    1/1       0.000  120/120     0.000    1/1       0.000  0.000
bind list_size constant                           0.000   50/358     0.000   12/26      0.000   48/338     0.000   12/25      0.000   48/342     0.000   11/21      0.000  0.000
lists are empty                                   0.000   11/16      0.000    8/16      0.001   19/27      0.004   13/26      0.000    4/9       0.000    1/2       0.002  0.004
lists shorter than 10                             0.000   50/1198    0.000   16/30      0.000   71/1637    0.002   21/42      0.000   36/868     0.000   15/29      0.001  0.002
lists shorter than 432                            6.390 1696/5118102  1.069  412/457     6.171 1612/4863421  1.015  405/450     6.165 1667/5037661  0.207  419/447    18.727  2.291
lists shorter than 4332                           2.240   13/190735  3.577 4022/4087    1.609   11/126052  4.343 4020/4067    1.467    7/126607  2.690 4013/4055    5.316 10.610
lists equal to duplication                        0.152   20/23      0.494    4/7       0.000    7/13      0.000    3/6       0.021   20/25      0.105   17/35      0.173  0.600
lists have unique elems                           0.000    7/17      0.000   11/22      0.002   12/44      0.004   17/30      0.000    6/16      0.000   10/20      0.003  0.004
tree contains only 42                             0.000   10/10      0.000    2/2       0.000   10/10      0.000    2/2       0.000   12/13      0.000    2/2       0.000  0.000
fail_pred_map_commute                             0.000  107/453     0.001  122/342     0.000  108/317     0.000   19/83      0.000  118/548     0.000   14/39      0.001  0.001
fail_pred_strings                                 0.000    1/3       0.000    2/5       0.000    1/2       0.000    1/4       0.000    1/2       0.000    1/4       0.000  0.000
fold_left fold_right                              0.000   24/74      0.001   56/189     0.000   34/149     0.000   21/58      0.000   21/53      0.000   22/64      0.001  0.001
fold_left fold_right uncurried                    2.541   97/80630   0.043  376/1550    0.145   38/390     2.225 2064/8057    0.000    5/20      0.000    4/17      2.686  2.269
fold_left fold_right uncurried fun last           0.001   21/88      0.001   56/189     0.000   35/220     0.002   72/170     0.000   22/73      0.000   28/73      0.001  0.003
fold_left test, fun first                         0.001   40/57      0.001   15/28     43.448  191/44563   3.519   47/9773   11.188  223/75912   0.003   36/64     54.637  3.523
                                                                                                                                                                   83.119 21.518

jmid mentioned this pull request Apr 19, 2022
jmid merged commit fef5416 into c-cube:master Apr 20, 2022
jmid deleted the fix-fun-gen-repro branch April 20, 2022 06:58
Successfully merging this pull request may close these issues.

Generator reproducability with function generators