Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windows multiprocessing (and added tempfile bonus) #1117

Merged
merged 11 commits into from
Aug 17, 2017

Conversation

eric-s-s
Copy link
Contributor

the big changes are as follows:

  1. removed self.outputfile, self.writer and self.progress_bar .
    These needed to be removed so that Tournament could be pickled. The opening and closing of file objects and progress bars takes place at the level where they're needed (_run_serial and _process_done_queue).

  2. NamedTemoraryFile has been replaced by creating a file location with mkstemp.
    the method that calls the file creation (play) is responsible for cleaning up the file.
    Since file objects are open and closed in a separate place, I wasn't sure about having a separate open file object pointing to the same file. It seemed simpler to just create the file that could be accessed anywhere by the filename. this has two added bonuses:

    1. Windows can now use temp file
    2. I had the ability to test it and catch my mistake about a missing close. (by the time i figured out that removing the file at the end of play was being bypassed by _build_result_set i had a few thousand extra files in my temp folder :o)

    Now that i'm more clear on what's going on, I think I could change this back to using NamedTemporaryFile, but i'm not sure and wouldn't have the ability to thoroughly test it (being on windows)

  3. I changed all tqdm to have file=sys.stderr. This was for testing purposes.
    tqdm defaults to writing to sys.stderr, but if you leave it as the default, you can't capture the output in a context manager. It won't write to the contextual sys.stderr. it writes to the real one only. setting the file to sys.stderr solves this.

interactions : list
a list of dictionaries mapping tuples of player indices to
interactions : dict
a dictionary mapping tuples of player indices to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct that this is just a doc change? You haven't modified this behaviour (it was essentially incorrect).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. that is correct. mypy was complaining.

self.assertIsInstance(results, axelrod.ResultSet)

results = tournament.play(progress_bar=True)
# progress_bar = True
err = io.StringIO()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this setting things so that the progress bar is not output during the tests? (apologies)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is to grab the output and check that it's outputting.

@drvinceknight
Copy link
Member

Wow! This looks like a fantastic piece of work! Great job.

@marcharper and @meatballs I suggest we move slow with this one and make sure we're all happy (no rush - we're all busy) before merging.

Coverage looks like the one thing that's failing, looks trivial though:

axelrod/tests/unit/test_tournament.py                      518      2    99%   462, 1017

A quick scan: I think we can just remove the if block at l 1017. Not too sure what 462 is doing.

@drvinceknight
Copy link
Member

I have checked and a tournament with the Fortran strategies runs in parallel on this branch.

@eric-s-s
Copy link
Contributor Author

I have checked and a tournament with the Fortran strategies runs in parallel on this branch.

excellent! i had hoped so but had no way to check.

Wow! This looks like a fantastic piece of work! Great job.

thanks! :)

Copy link
Member

@drvinceknight drvinceknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking a while to get back to this (I've been away). This is looking good. A couple of small requests/questions.

Also could you take a look at fingerprint.py it makes use of a temp file (and I believe the test is skipped on Windows).

In fact could you perhaps remove the on_windows functionality altogether? If that opens up a pandoras box of more things than just the fingerprint (which I believe can be fixed as you've done it here) then don't worry about it (we'll open issues and keep track).

self.assertIsInstance(results, axelrod.ResultSet)
self.assertEqual(tournament.progress_bar.total, 15)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like these to stay in. This complemented by your file reading give a comprehensive test of the progress bar (without it, the total could be wrong for example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would love for that to stay in. i prefer it to checking the sys.stderr. i'm not sure how to do it, though. the progress bar is now a local obj that is created and destroyed in the same method. do you know/can you think of a good way to access it?

an improvement (that i just tried) is asserting that sys.stderr has "Playing matches: 100%". (instead of separating that into two test: "Playing matches:" and "100%")This would show that the progress bar reached the end in each case.

the tests for _get_progress_bar (around line 220) show that the progress bar is of total =self.match_generator.size. For completeness, I'll add a test with different edges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would love for that to stay in. i prefer it to checking the sys.stderr. i'm not sure how to do it, though. the progress bar is now a local obj that is created and destroyed in the same method. do you know/can you think of a good way to access it?

an improvement (that i just tried) is asserting that sys.stderr has "Playing matches: 100%". (instead of separating that into two test: "Playing matches:" and "100%")This would show that the progress bar reached the end in each case.

the tests for _get_progress_bar (around line 220) show that the progress bar is of total =self.match_generator.size. For completeness, I'll add a test with different edges.

I have missed that subtlety. I understand what you've done now and am happy: 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i found a solution with patch and a new class I made, RecordedTQDM(tqdm).

While testing it out, i discovered that if you play the same tournament repeatedly, num_interactions does not reset with each play. this affects the "Analysing: " progress_bar. For instance, if play creates 75 interactions, the second time you play there are 150 interactions. The "Analysing: " progress bar is now total=150 but finishes at n=75. Should num_interactions be resetting with each play?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bug. Good catch num_interactions should indeed be resetting.

"""
file, writer = self._get_file_objects()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a different filename for this; out_file perhaps?

"""
file, writer = self._get_file_objects()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this check (the one inside _get_file_objects impact performance? Can this all not just be done once at the start or is that falling over with pickling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be done at the start and was originally done that way. I doubt that performance would be affected more than a few nanoseconds either way (extra overhead for separate method call vs. return obj and passing to various methods.). the same check is still being performed exactly once, just in different places. I prefer the current way, because the original involved passing a writer variable through all the methods which had the variable for the sole purpose of passing it to the next method. It seemed cleaner to open and close the file closest to where it was needed.

original:

  • play
    • call setup_output and (possibly) create temp file
      • still needs to check filename to see if file is created
      • create writer and file obj
      • return writer and file obj
    • call _run_paralell(writer=writer)
      • call _start_workers
      • call _process_done_queue(writer=writer)
        • call _write_interactions(writer=writer)
    • close file object
    • delete temp file

newer:

  • play
    • call setup_output and (possibly) create temp file
    • call _run_paralell
      • call _start_workers
      • call _process_done_queue
        • check for file creation
        • create file obj and csv writer
        • call _write_interactions(writer=writer)
        • close file object
    • delete temp file

@eric-s-s
Copy link
Contributor Author

eric-s-s commented Aug 13, 2017

@drvinceknight i just gave fingerprint a try with mkstemp. it seems to be in working order. (haven't fiddled with on_windows yet, but i don't see a big problem there (knock on wood)). should i write up tests, and include it in this PR or do it as a separate PR after the current one is resolved?

@drvinceknight
Copy link
Member

drvinceknight commented Aug 13, 2017 via email

@drvinceknight
Copy link
Member

I prefer the current way, because the original involved passing a writer variable through all the methods which had the variable for the sole purpose of passing it to the next method.

Thanks for the detailed explanation. I agree with you. It's nicer this way. 👍

if self.use_progress_bar:
return tqdm.tqdm(total=self.match_generator.size,
desc="Playing matches")
else:
Copy link
Member

@marcharper marcharper Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this else is unnecessary

Copy link
Member

@marcharper marcharper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Copy link
Member

@drvinceknight drvinceknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great piece of work @eric-s-s. Thanks for hunting this down 👍 👏

@@ -154,20 +308,24 @@ def test_no_progress_bar_play(self):


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i noticed the same four tests being used about 5 times here. how about using this?

    def assert_play_pbar_correct_total_and_is_finished(self, pbar, total):
        self.assertEqual(pbar.desc, 'Playing matches: ')
        self.assertEqual(pbar.total, total)
        self.assertEqual(pbar.n, total)
        self.assertTrue(pbar.disable, True)

if good, i would appreciate a good name suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks fine to me and your name is fine (perhaps the slightly shorted version below).

When called can we be verbose:

assert_play_pbar_correct_total_and_finished(pbar=pbar, total=10)

progress_bar : bool
Whether or not to update the tournament progress bar
processes : int
How many processes to use.
"""
# At first sight, it might seem simpler to use the multiprocessing Pool
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i keep forgetting to ask. should this comment now be removed? when i first started working on this, i investigated Pool because of the original comment. It pointed to the possibility of using Pool if python changed. As the original code is simply more efficient, there may be no need to mention Pool at all. My updated comment may simply be useless noise. (who, after all, would not use a conventional tool if it worked better)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I think the current comment is good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(By current I mean the one you wrote on this branch.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is there because using the Pool class is simpler code and I wanted to avoid anyone in the future thinking they could do a nice cleanup here.

Copy link
Member

@meatballs meatballs Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not quite accurate, though. It should really say that bound methods can't be pickled (and that is still the case, as far as I know).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curiosity got the better of me. i just checked, and pickle can now handle bound methods (in py3.6.1)! (after the last set of work, seeing improvements to pickle fills me with joy.) I don't know about earlier versions of 3. so perhaps leave the updated comment as the bound-method information is either out-of-date, or soon to be out-of-date? (i can't imagine anyone going in to change this again before py3.6 becomes an older revision)

here's the code i ran to check.

import pickle
import axelrod as axl

tournament = axl.Tournament(players=[axl.CyclerCCCDCD(), axl.Defector()])

bm_1 = tournament.play
print('original bound method: ', bm_1)
result_set_1 = bm_1(progress_bar=False)

bm_2 = pickle.loads(pickle.dumps(bm_1))
print('second bound method: ', bm_2)
result_set_2 = bm_2(progress_bar=False)

bm_3 = pickle.loads(pickle.dumps(bm_2))
print('third bound method: ', bm_3)
result_set_3 = bm_3(progress_bar=False)

print(result_set_1 == result_set_2 == result_set_3)

results:

E:\work\environments\axelrodtest\Scripts\python.exe C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py
original bound method:  <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000002765C50>>
second bound method:  <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000002DACE48>>
third bound method:  <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000006D08C88>>
True

Process finished with exit code 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library supports py3.5 so any comment needs to reflect that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!

But, we support both 3.5 and 3.6, so I reckon that comment needs to stay there for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll make a virtualenv and check later today.

Copy link
Contributor Author

@eric-s-s eric-s-s Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr - pickling bound methods supported since py3.4 . my test on axelrod confirmed with py3.5.3. other bound method tests confirmed with 3.4 and 3.5.2 (i can't axelrod below 3.5.3 for ... reasons)

fun reading here

on a different test, i get interesting results:

import pickle

class Tourney(object):  # lazy renaming to recycle code.
    def __init__(self, a_list):
        self.lst = a_list

    def play(self):
        return self.lst.pop()

    @staticmethod
    def hi():
        return 'hi'

if __name__ == '__main__':
    tournament = Tourney([1, 2, 3])

    bm_1 = tournament.play
    print('original bound method: ', bm_1)
    result_set_1 = bm_1()

    bm_2 = pickle.loads(pickle.dumps(bm_1))
    print('second bound method: ', bm_2)
    result_set_2 = bm_2()

    bm_3 = pickle.loads(pickle.dumps(bm_2))
    print('third bound method: ', bm_3)
    result_set_3 = bm_3()

    print(result_set_1, result_set_2, result_set_3)

    um = Tourney.hi
    print(um)
    print(pickle.loads(pickle.dumps(um))())

on py3.5.2 this works as expected, but it fails on py3.4.4 because of pickling the staticmethod (but the bound methods were no problem). also fails on bound methods on 2.7.10. there was a brief window when bound methods were pickle-able but unbound methods were not.

D:\PROGRAMS\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\python.exe C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py
Traceback (most recent call last):
original bound method:  <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F62208>>
  File "C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py", line 37, in <module>
    print(pickle.loads(pickle.dumps(um))())
second bound method:  <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F622B0>>
_pickle.PicklingError: Can't pickle <function Tourney.hi at 0x0000000001FAC2F0>: attribute lookup hi on __main__ failed
third bound method:  <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F626A0>>
3 2 1
False
<function Tourney.hi at 0x0000000001FAC2F0>

Process finished with exit code 1

@drvinceknight
Copy link
Member

I'm happy with this now. I believe the only thing left to get right is that inline comment. Will leave @meatballs to sign off 👍 :)

@meatballs meatballs merged commit e021ef7 into Axelrod-Python:master Aug 17, 2017
@eric-s-s eric-s-s deleted the multi_queue branch August 17, 2017 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants