-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
windows multiprocessing (and added tempfile bonus) #1117
Conversation
interactions : list | ||
a list of dictionaries mapping tuples of player indices to | ||
interactions : dict | ||
a dictionary mapping tuples of player indices to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct that this is just a doc change? You haven't modified this behaviour (it was essentially incorrect).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. that is correct. mypy was complaining.
self.assertIsInstance(results, axelrod.ResultSet) | ||
|
||
results = tournament.play(progress_bar=True) | ||
# progress_bar = True | ||
err = io.StringIO() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this setting things so that the progress bar is not output during the tests? (apologies)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is to grab the output and check that it's outputting.
Wow! This looks like a fantastic piece of work! Great job. @marcharper and @meatballs I suggest we move slow with this one and make sure we're all happy (no rush - we're all busy) before merging. Coverage looks like the one thing that's failing, looks trivial though:
A quick scan: I think we can just remove the |
I have checked and a tournament with the Fortran strategies runs in parallel on this branch. |
excellent! i had hoped so but had no way to check.
thanks! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking a while to get back to this (I've been away). This is looking good. A couple of small requests/questions.
Also could you take a look at fingerprint.py
it makes use of a temp file (and I believe the test is skipped on Windows).
In fact could you perhaps remove the on_windows
functionality altogether? If that opens up a pandoras box of more things than just the fingerprint
(which I believe can be fixed as you've done it here) then don't worry about it (we'll open issues and keep track).
self.assertIsInstance(results, axelrod.ResultSet) | ||
self.assertEqual(tournament.progress_bar.total, 15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like these to stay in. This complemented by your file reading give a comprehensive test of the progress bar (without it, the total could be wrong for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would love for that to stay in. i prefer it to checking the sys.stderr
. i'm not sure how to do it, though. the progress bar is now a local obj that is created and destroyed in the same method. do you know/can you think of a good way to access it?
an improvement (that i just tried) is asserting that sys.stderr
has "Playing matches: 100%"
. (instead of separating that into two test: "Playing matches:" and "100%")This would show that the progress bar reached the end in each case.
the tests for _get_progress_bar
(around line 220) show that the progress bar is of total =self.match_generator.size
. For completeness, I'll add a test with different edges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would love for that to stay in. i prefer it to checking the sys.stderr. i'm not sure how to do it, though. the progress bar is now a local obj that is created and destroyed in the same method. do you know/can you think of a good way to access it?
an improvement (that i just tried) is asserting that sys.stderr has "Playing matches: 100%". (instead of separating that into two test: "Playing matches:" and "100%")This would show that the progress bar reached the end in each case.
the tests for _get_progress_bar (around line 220) show that the progress bar is of total =self.match_generator.size. For completeness, I'll add a test with different edges.
I have missed that subtlety. I understand what you've done now and am happy: 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i found a solution with patch
and a new class I made, RecordedTQDM(tqdm)
.
While testing it out, i discovered that if you play the same tournament repeatedly, num_interactions
does not reset with each play. this affects the "Analysing: "
progress_bar. For instance, if play
creates 75 interactions, the second time you play
there are 150 interactions. The "Analysing: " progress bar is now total=150
but finishes at n=75
. Should num_interactions
be resetting with each play?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bug. Good catch num_interactions
should indeed be resetting.
axelrod/tournament.py
Outdated
""" | ||
file, writer = self._get_file_objects() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a different filename for this; out_file
perhaps?
axelrod/tournament.py
Outdated
""" | ||
file, writer = self._get_file_objects() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this check (the one inside _get_file_objects
impact performance? Can this all not just be done once at the start or is that falling over with pickling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could be done at the start and was originally done that way. I doubt that performance would be affected more than a few nanoseconds either way (extra overhead for separate method call vs. return obj and passing to various methods.). the same check is still being performed exactly once, just in different places. I prefer the current way, because the original involved passing a writer
variable through all the methods which had the variable for the sole purpose of passing it to the next method. It seemed cleaner to open and close the file closest to where it was needed.
original:
play
- call
setup_output
and (possibly) create temp file- still needs to check
filename
to see if file is created - create writer and file obj
- return writer and file obj
- still needs to check
- call
_run_paralell(writer=writer)
- call
_start_workers
- call
_process_done_queue(writer=writer)
- call
_write_interactions(writer=writer)
- call
- call
- close file object
- delete temp file
- call
newer:
play
- call
setup_output
and (possibly) create temp file - call
_run_paralell
- call
_start_workers
- call
_process_done_queue
- check for file creation
- create file obj and csv writer
- call
_write_interactions(writer=writer)
- close file object
- call
- delete temp file
- call
@drvinceknight i just gave fingerprint a try with |
I'm happy for it to go in this PR.
…On Sun, 13 Aug 2017, 20:32 eric-s-s, ***@***.***> wrote:
@drvinceknight </~https://github.com/drvinceknight> i just gave fingerprint
a try with mkstemp. it seems to be in working order. should i write up
tests, and include it in this PR or do it as a separate PR after the
current one is resolved?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1117 (comment)>,
or mute the thread
</~https://github.com/notifications/unsubscribe-auth/ACCGWjf3jvtCD4ktF4wof_GRCxSpQ30Pks5sX09jgaJpZM4O0ipo>
.
|
Thanks for the detailed explanation. I agree with you. It's nicer this way. 👍 |
axelrod/tournament.py
Outdated
if self.use_progress_bar: | ||
return tqdm.tqdm(total=self.match_generator.size, | ||
desc="Playing matches") | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this else is unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great piece of work @eric-s-s. Thanks for hunting this down 👍 👏
@@ -154,20 +308,24 @@ def test_no_progress_bar_play(self): | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i noticed the same four tests being used about 5 times here. how about using this?
def assert_play_pbar_correct_total_and_is_finished(self, pbar, total):
self.assertEqual(pbar.desc, 'Playing matches: ')
self.assertEqual(pbar.total, total)
self.assertEqual(pbar.n, total)
self.assertTrue(pbar.disable, True)
if good, i would appreciate a good name suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks fine to me and your name is fine (perhaps the slightly shorted version below).
When called can we be verbose:
assert_play_pbar_correct_total_and_finished(pbar=pbar, total=10)
progress_bar : bool | ||
Whether or not to update the tournament progress bar | ||
processes : int | ||
How many processes to use. | ||
""" | ||
# At first sight, it might seem simpler to use the multiprocessing Pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i keep forgetting to ask. should this comment now be removed? when i first started working on this, i investigated Pool because of the original comment. It pointed to the possibility of using Pool if python changed. As the original code is simply more efficient, there may be no need to mention Pool at all. My updated comment may simply be useless noise. (who, after all, would not use a conventional tool if it worked better)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I think the current comment is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(By current I mean the one you wrote on this branch.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is there because using the Pool class is simpler code and I wanted to avoid anyone in the future thinking they could do a nice cleanup here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not quite accurate, though. It should really say that bound methods can't be pickled (and that is still the case, as far as I know).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curiosity got the better of me. i just checked, and pickle can now handle bound methods (in py3.6.1)! (after the last set of work, seeing improvements to pickle fills me with joy.) I don't know about earlier versions of 3. so perhaps leave the updated comment as the bound-method information is either out-of-date, or soon to be out-of-date? (i can't imagine anyone going in to change this again before py3.6 becomes an older revision)
here's the code i ran to check.
import pickle
import axelrod as axl
tournament = axl.Tournament(players=[axl.CyclerCCCDCD(), axl.Defector()])
bm_1 = tournament.play
print('original bound method: ', bm_1)
result_set_1 = bm_1(progress_bar=False)
bm_2 = pickle.loads(pickle.dumps(bm_1))
print('second bound method: ', bm_2)
result_set_2 = bm_2(progress_bar=False)
bm_3 = pickle.loads(pickle.dumps(bm_2))
print('third bound method: ', bm_3)
result_set_3 = bm_3(progress_bar=False)
print(result_set_1 == result_set_2 == result_set_3)
results:
E:\work\environments\axelrodtest\Scripts\python.exe C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py
original bound method: <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000002765C50>>
second bound method: <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000002DACE48>>
third bound method: <bound method Tournament.play of <axelrod.tournament.Tournament object at 0x0000000006D08C88>>
True
Process finished with exit code 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library supports py3.5 so any comment needs to reflect that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting!
But, we support both 3.5 and 3.6, so I reckon that comment needs to stay there for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll make a virtualenv and check later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tl;dr - pickling bound methods supported since py3.4 . my test on axelrod confirmed with py3.5.3. other bound method tests confirmed with 3.4 and 3.5.2 (i can't axelrod below 3.5.3 for ... reasons)
fun reading here
on a different test, i get interesting results:
import pickle
class Tourney(object): # lazy renaming to recycle code.
def __init__(self, a_list):
self.lst = a_list
def play(self):
return self.lst.pop()
@staticmethod
def hi():
return 'hi'
if __name__ == '__main__':
tournament = Tourney([1, 2, 3])
bm_1 = tournament.play
print('original bound method: ', bm_1)
result_set_1 = bm_1()
bm_2 = pickle.loads(pickle.dumps(bm_1))
print('second bound method: ', bm_2)
result_set_2 = bm_2()
bm_3 = pickle.loads(pickle.dumps(bm_2))
print('third bound method: ', bm_3)
result_set_3 = bm_3()
print(result_set_1, result_set_2, result_set_3)
um = Tourney.hi
print(um)
print(pickle.loads(pickle.dumps(um))())
on py3.5.2 this works as expected, but it fails on py3.4.4 because of pickling the staticmethod (but the bound methods were no problem). also fails on bound methods on 2.7.10. there was a brief window when bound methods were pickle-able but unbound methods were not.
D:\PROGRAMS\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\python.exe C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py
Traceback (most recent call last):
original bound method: <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F62208>>
File "C:/Users/Administrator/.PyCharm2017.1/config/scratches/scratch_6.py", line 37, in <module>
print(pickle.loads(pickle.dumps(um))())
second bound method: <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F622B0>>
_pickle.PicklingError: Can't pickle <function Tourney.hi at 0x0000000001FAC2F0>: attribute lookup hi on __main__ failed
third bound method: <bound method Tourney.play of <__main__.Tourney object at 0x0000000001F626A0>>
3 2 1
False
<function Tourney.hi at 0x0000000001FAC2F0>
Process finished with exit code 1
I'm happy with this now. I believe the only thing left to get right is that inline comment. Will leave @meatballs to sign off 👍 :) |
the big changes are as follows:
removed
self.outputfile
,self.writer
andself.progress_bar
.These needed to be removed so that
Tournament
could be pickled. The opening and closing of file objects and progress bars takes place at the level where they're needed (_run_serial
and_process_done_queue
).NamedTemoraryFile
has been replaced by creating a file location withmkstemp
.the method that calls the file creation (
play
) is responsible for cleaning up the file.Since file objects are open and closed in a separate place, I wasn't sure about having a separate open file object pointing to the same file. It seemed simpler to just create the file that could be accessed anywhere by the filename. this has two added bonuses:
play
was being bypassed by_build_result_set
i had a few thousand extra files in my temp folder :o)Now that i'm more clear on what's going on, I think I could change this back to using
NamedTemporaryFile
, but i'm not sure and wouldn't have the ability to thoroughly test it (being on windows)I changed all tqdm to have
file=sys.stderr
. This was for testing purposes.tqdm defaults to writing to
sys.stderr
, but if you leave it as the default, you can't capture the output in a context manager. It won't write to the contextualsys.stderr
. it writes to the real one only. setting thefile
tosys.stderr
solves this.