fix test
lhoestq committed Sep 25, 2020
1 parent 9cc507b commit 896e9dd
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions tests/test_arrow_dataset.py
@@ -577,6 +577,7 @@ def test_map_multiprocessing(self, in_memory):
             self.assertEqual(len(dset_test._data_files), 0 if in_memory else 2)
             self.assertListEqual(dset_test["id"], list(range(30)))
             self.assertNotEqual(dset_test._fingerprint, fingerprint)
+            del dset, dset_test
 
     def test_new_features(self, in_memory):
         with tempfile.TemporaryDirectory() as tmp_dir:

1 comment on commit 896e9dd

@github-actions

PyArrow==0.17.1

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.019708 / 0.011353 (0.008355) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.014208 / 0.011008 (0.003200) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.048121 / 0.038508 (0.009613) |
| read_batch_unformated after write_array2d | 0.035440 / 0.023109 (0.012331) |
| read_batch_unformated after write_flattened_sequence | 0.203453 / 0.275898 (-0.072445) |
| read_batch_unformated after write_nested_sequence | 0.234939 / 0.323480 (-0.088541) |
| read_col_formatted_as_numpy after write_array2d | 0.006640 / 0.007986 (-0.001345) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.005255 / 0.004328 (0.000926) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.006843 / 0.004250 (0.002593) |
| read_col_unformated after write_array2d | 0.049089 / 0.037052 (0.012037) |
| read_col_unformated after write_flattened_sequence | 0.221432 / 0.258489 (-0.037057) |
| read_col_unformated after write_nested_sequence | 0.243720 / 0.293841 (-0.050121) |
| read_formatted_as_numpy after write_array2d | 0.152565 / 0.128546 (0.024019) |
| read_formatted_as_numpy after write_flattened_sequence | 0.127567 / 0.075646 (0.051920) |
| read_formatted_as_numpy after write_nested_sequence | 0.443658 / 0.419271 (0.024386) |
| read_unformated after write_array2d | 0.512692 / 0.043533 (0.469159) |
| read_unformated after write_flattened_sequence | 0.205215 / 0.255139 (-0.049924) |
| read_unformated after write_nested_sequence | 0.218803 / 0.283200 (-0.064396) |
| write_array2d | 0.085386 / 0.141683 (-0.056297) |
| write_flattened_sequence | 1.836378 / 1.452155 (0.384223) |
| write_nested_sequence | 1.891431 / 1.492716 (0.398715) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.038474 / 0.037411 (0.001063) |
| shard | 0.024009 / 0.014526 (0.009483) |
| shuffle | 0.151255 / 0.176557 (-0.025302) |
| sort | 1.155322 / 0.737135 (0.418187) |
| train_test_split | 0.138366 / 0.296338 (-0.157972) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.219204 / 0.215209 (0.003995) |
| read 50000 | 2.160234 / 2.077655 (0.082579) |
| read_batch 50000 10 | 1.275465 / 1.504120 (-0.228655) |
| read_batch 50000 100 | 1.156445 / 1.541195 (-0.384750) |
| read_batch 50000 1000 | 1.173299 / 1.468490 (-0.295191) |
| read_formatted numpy 5000 | 6.876564 / 4.584777 (2.291787) |
| read_formatted pandas 5000 | 5.757948 / 3.745712 (2.012236) |
| read_formatted tensorflow 5000 | 8.112494 / 5.269862 (2.842633) |
| read_formatted torch 5000 | 6.938740 / 4.565676 (2.373063) |
| read_formatted_batch numpy 5000 10 | 0.680204 / 0.424275 (0.255929) |
| read_formatted_batch numpy 5000 1000 | 0.010719 / 0.007607 (0.003112) |
| shuffled read 5000 | 0.242420 / 0.226044 (0.016375) |
| shuffled read 50000 | 2.562125 / 2.268929 (0.293196) |
| shuffled read_batch 50000 10 | 1.731243 / 55.444624 (-53.713381) |
| shuffled read_batch 50000 100 | 1.582589 / 6.876477 (-5.293888) |
| shuffled read_batch 50000 1000 | 1.607229 / 2.142072 (-0.534843) |
| shuffled read_formatted numpy 5000 | 6.812621 / 4.805227 (2.007394) |
| shuffled read_formatted_batch numpy 5000 10 | 8.179307 / 6.500664 (1.678643) |
| shuffled read_formatted_batch numpy 5000 1000 | 8.692541 / 0.075469 (8.617072) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 11.131748 / 1.841788 (9.289960) |
| map fast-tokenizer batched | 13.106738 / 8.074308 (5.032430) |
| map identity | 15.187423 / 10.191392 (4.996031) |
| map identity batched | 0.449245 / 0.680424 (-0.231179) |
| map no-op batched | 0.305177 / 0.534201 (-0.229023) |
| map no-op batched numpy | 0.786755 / 0.579283 (0.207472) |
| map no-op batched pandas | 0.604695 / 0.434364 (0.170331) |
| map no-op batched pytorch | 0.777705 / 0.540337 (0.237368) |
| map no-op batched tensorflow | 1.632224 / 1.386936 (0.245288) |

PyArrow==1.0

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.018216 / 0.011353 (0.006863) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.015350 / 0.011008 (0.004342) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.046574 / 0.038508 (0.008066) |
| read_batch_unformated after write_array2d | 0.030666 / 0.023109 (0.007557) |
| read_batch_unformated after write_flattened_sequence | 0.334314 / 0.275898 (0.058416) |
| read_batch_unformated after write_nested_sequence | 0.361350 / 0.323480 (0.037870) |
| read_col_formatted_as_numpy after write_array2d | 0.009910 / 0.007986 (0.001924) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004399 / 0.004328 (0.000071) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.006385 / 0.004250 (0.002134) |
| read_col_unformated after write_array2d | 0.053591 / 0.037052 (0.016539) |
| read_col_unformated after write_flattened_sequence | 0.350815 / 0.258489 (0.092326) |
| read_col_unformated after write_nested_sequence | 0.371698 / 0.293841 (0.077857) |
| read_formatted_as_numpy after write_array2d | 0.154625 / 0.128546 (0.026079) |
| read_formatted_as_numpy after write_flattened_sequence | 0.122534 / 0.075646 (0.046888) |
| read_formatted_as_numpy after write_nested_sequence | 0.432384 / 0.419271 (0.013113) |
| read_unformated after write_array2d | 0.497918 / 0.043533 (0.454385) |
| read_unformated after write_flattened_sequence | 0.337054 / 0.255139 (0.081915) |
| read_unformated after write_nested_sequence | 0.346398 / 0.283200 (0.063198) |
| write_array2d | 0.091810 / 0.141683 (-0.049873) |
| write_flattened_sequence | 1.831720 / 1.452155 (0.379565) |
| write_nested_sequence | 1.842007 / 1.492716 (0.349291) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.044902 / 0.037411 (0.007491) |
| shard | 0.022635 / 0.014526 (0.008110) |
| shuffle | 0.027190 / 0.176557 (-0.149367) |
| sort | 0.084468 / 0.737135 (-0.652667) |
| train_test_split | 0.051517 / 0.296338 (-0.244822) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.275240 / 0.215209 (0.060031) |
| read 50000 | 2.728382 / 2.077655 (0.650728) |
| read_batch 50000 10 | 1.959744 / 1.504120 (0.455624) |
| read_batch 50000 100 | 1.822793 / 1.541195 (0.281599) |
| read_batch 50000 1000 | 1.812613 / 1.468490 (0.344123) |
| read_formatted numpy 5000 | 6.747467 / 4.584777 (2.162690) |
| read_formatted pandas 5000 | 5.656408 / 3.745712 (1.910696) |
| read_formatted tensorflow 5000 | 8.102175 / 5.269862 (2.832314) |
| read_formatted torch 5000 | 6.962493 / 4.565676 (2.396816) |
| read_formatted_batch numpy 5000 10 | 0.689737 / 0.424275 (0.265462) |
| read_formatted_batch numpy 5000 1000 | 0.011444 / 0.007607 (0.003837) |
| shuffled read 5000 | 0.302139 / 0.226044 (0.076095) |
| shuffled read 50000 | 3.194923 / 2.268929 (0.925994) |
| shuffled read_batch 50000 10 | 2.294409 / 55.444624 (-53.150215) |
| shuffled read_batch 50000 100 | 2.125958 / 6.876477 (-4.750519) |
| shuffled read_batch 50000 1000 | 2.162049 / 2.142072 (0.019976) |
| shuffled read_formatted numpy 5000 | 6.780170 / 4.805227 (1.974942) |
| shuffled read_formatted_batch numpy 5000 10 | 4.650308 / 6.500664 (-1.850356) |
| shuffled read_formatted_batch numpy 5000 1000 | 10.291870 / 0.075469 (10.216401) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 11.742575 / 1.841788 (9.900787) |
| map fast-tokenizer batched | 13.971786 / 8.074308 (5.897478) |
| map identity | 15.847264 / 10.191392 (5.655872) |
| map identity batched | 0.774642 / 0.680424 (0.094219) |
| map no-op batched | 0.581683 / 0.534201 (0.047482) |
| map no-op batched numpy | 0.786529 / 0.579283 (0.207246) |
| map no-op batched pandas | 0.600406 / 0.434364 (0.166042) |
| map no-op batched pytorch | 0.777705 / 0.540337 (0.229230) |
| map no-op batched tensorflow | 1.614636 / 1.386936 (0.227700) |
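
Reading the numbers: every value in the benchmarks above has the form `new / old (diff)`, where the diff equals new minus old, so a positive diff means the new run took longer. The comment does not state the units. The following is a minimal sketch for parsing one such value and checking it. The helper names `parse_cell`, `diff_is_consistent` and `ratio` are hypothetical and are not part of the benchmark tooling. The tolerance is an assumption: it allows for the reported diff having been rounded from unrounded timings, which leaves some diffs off by 0.000001.

```python
import re

# Matches a benchmark value such as "0.019708 / 0.011353 (0.008355)".
CELL = re.compile(r"(-?\d+\.\d+) / (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_cell(cell):
    """Split a 'new / old (diff)' string into three floats."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    new, old, diff = (float(group) for group in match.groups())
    return new, old, diff


def diff_is_consistent(cell, tol=2e-6):
    """Check that the reported diff equals new - old, up to rounding."""
    new, old, diff = parse_cell(cell)
    return abs((new - old) - diff) <= tol


def ratio(cell):
    """Return new / old; values above 1 mean the new run was slower."""
    new, old, _ = parse_cell(cell)
    return new / old
```

For example, `ratio("11.131748 / 1.841788 (9.289960)")` is a little over 6, so the `filter` benchmark under PyArrow==0.17.1 ran roughly six times slower than its reference value.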
