Implement transitive fingerprint #1125

Merged: 19 commits, Aug 25, 2017

Nikoleta-v3

Transitive fingerprints give us another fingerprint
for the library. They show the cooperation rate
of a strategy against a set of opponents over the turns.

The default opponents are a list of Random players with
increasing cooperation probability.

This is an implementation of fingerprints
similar to the ones used in:

- https://arxiv.org/abs/1707.06920
- https://arxiv.org/abs/1707.06307

We have:

  • Write tests for transitive fingerprint.
  • Write docstrings.
  • Add documentation for transitive fingerprints.
  • Remove the optional matplotlib check: this was already removed elsewhere with bc952f4.

@drvinceknight and I worked on this together.

Here are some examples.

TF1 (has a CCD handshake): [image: tf1]

TitForTat: [image: titfortat]

Alexei (Defects on last move): [image: alexei]

EvolvedHMM5 (current winner of the tournament): [image: evolvedhmm5]

DBS: [image: dbs]
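For readers who want to see what the fingerprint data looks like, the following is a minimal pure-Python sketch of the idea described above, not the library's implementation: a strategy plays a spectrum of random opponents and the mean cooperation rate is recorded per opponent (row) and turn (column). The function names `tit_for_tat` and `transitive_fingerprint_data`, the string moves, and the small parameter values are assumptions made for illustration.

```python
import random


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def transitive_fingerprint_data(strategy, number_of_opponents=5, turns=10,
                                repetitions=100, seed=0):
    """Mean cooperation rates: row i is opponent i, column j is turn j.

    Illustrative only; number_of_opponents must be at least 2.
    """
    rng = random.Random(seed)
    # Evenly spaced cooperation probabilities from 0 to 1 (the "spectrum").
    probabilities = [i / (number_of_opponents - 1)
                     for i in range(number_of_opponents)]
    data = []
    for p in probabilities:
        cooperation_counts = [0] * turns
        for _ in range(repetitions):
            own, opp = [], []
            for turn in range(turns):
                move = strategy(own, opp)
                # The random opponent cooperates with probability p.
                opponent_move = "C" if rng.random() < p else "D"
                own.append(move)
                opp.append(opponent_move)
                if move == "C":
                    cooperation_counts[turn] += 1
        data.append([count / repetitions for count in cooperation_counts])
    return data


data = transitive_fingerprint_data(tit_for_tat)
for row in data:
    print(" ".join(f"{rate:.2f}" for rate in row))
```

Under these assumptions the first row (an opponent that never cooperates) is 1 on the first turn and 0 afterwards, and the last row (an opponent that always cooperates) is all 1s. The rows in between are noisy estimates of the opponent's cooperation probability from the second turn on, which is where the number of repetitions starts to matter.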

meatballs left a comment

Looks great. Just some minor typos from me

self.data : np.array
A numpy array containing the mean cooperation rate against each
opponent in each turn. The ith row corresponds to the ith opponent
and the jth columns the jth turn.

typo: column not columns

Transitive Fingerprints
-----------------------

Another implemented fingerprint is the transitive fingerprints. The

typo: 'is the transitive fingerprints' should be 'is the transitive fingerprint'

-----------------------

Another implemented fingerprint is the transitive fingerprints. The
transitive fingerprints represents the cooperation rate of a strategy against a

typo: fingerprint rather than fingerprints

@Nikoleta-v3

I believe I addressed everything.

drvinceknight commented Aug 23, 2017

@Nikoleta-v3 I've opened https://github.com/Nikoleta-v3/Axelrod/pull/6 (to your branch), which adds a more meaningful y label when using the default.


marcharper left a comment

Some minor comments from me.

I think the default repetitions should be higher. We're estimating a proportion p and the standard error is sqrt(p(1-p) / n), so to get a visually consistent image from a stochastic strategy a fairly large n is necessary. Thoughts?
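To make the sampling-error point concrete, here is a small sketch that evaluates the quoted formula sqrt(p(1-p) / n) for a few values of n. The helper name `standard_error`, the choice of p = 0.5 (the worst case for this formula), and the particular values of n are illustrative assumptions, not settings taken from the library.

```python
import math


def standard_error(p, n):
    """Standard error of an estimated proportion p from n independent samples."""
    return math.sqrt(p * (1 - p) / n)


# Worst case p = 0.5 for a range of repetition counts.
for n in (10, 100, 1000, 10000):
    print(f"n={n:>5}: standard error = {standard_error(0.5, n):.4f}")
```

Because the error shrinks like 1/sqrt(n), a hundredfold increase in repetitions is needed to reduce it by a factor of ten.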

Also, is there a reason not to use all (perhaps short run time) strategies by default, as in the axelrod examples repository? Or are these fundamentally different in some way?

Parameters
----------
strategy : class or instance
A class that must be descended from axelrod.Player or an instance of

I think you mean "axelrod.Player or a descendent" -- I don't believe an instance of axelrod.Player would work here.

strategy : class or instance
A class that must be descended from axelrod.Player or an instance of
axelrod.Player.
opponents : list of class or instance

same: List of Player classes or something similar

So I see below that a class or instance can be given. I think we should stick to one or the other, ideally instance (since the parameters may be important).

I completely agree. This is just implemented to be analogous to the Ashlock fingerprint, which also allows both (this was a poor design decision, mainly on my part).

We could either leave them both in as is or remove them from Ashlock which adds a very minor backwards incompatibility.
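One way to support both forms, sketched here under assumptions, is to normalise the argument once at construction time. `Player` below is a stand-in class and `as_instance` a hypothetical helper, neither taken from the library; the point is only that a class is instantiated with default parameters while an instance keeps whatever parameters it was given.

```python
class Player:
    """Stand-in for a strategy base class (hypothetical, for illustration)."""

    def __init__(self, probability=0.5):
        self.probability = probability


def as_instance(strategy):
    """Return a player instance whether a class or an instance was passed."""
    if isinstance(strategy, type):
        # A class was passed: instantiate it with its default parameters.
        return strategy()
    # An instance was passed: its parameters are preserved as given.
    return strategy


from_class = as_instance(Player)
from_instance = as_instance(Player(probability=0.9))
print(from_class.probability, from_instance.probability)
```

This also shows why passing instances is the more expressive option: only the instance form can carry non-default parameters.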

A list that contains a list of opponents
Default: A spectrum of Random players
number_opponents: integer
An integer that defines the number of Random opponents

The number of Random opponents


if opponents is None:
self.opponents = [axl.Random(p) for p in
np.linspace(0, 1, number_opponents)]

spacing

noise: float = None, processes: int = None,
filename: str = None,
progress_bar: bool = True) -> np.array:
"""Build and play the spatial tournament.

For the docstring, how about something like "Creates a spatial tournament to run the necessary matches to obtain fingerprint data"?

The number of processes to be used for parallel processing
filename: string, optional
The name of the file for self.spatial_tournament's interactions.
if None and in_memory=False, will auto-generate a filename.

If ... a filename will be generated.


Parameters
----------
cmap : str, optional

The above docstrings use integer and string -- can we choose either the full words or the Python functions int and str (my preference is the latter) throughout?

@Nikoleta-v3 I expect these are also not quite right in the Ashlock fingerprint: can you change them there too? (to int and str)

drvinceknight commented Aug 24, 2017

> I think the default repetitions should be higher. We're estimating a proportion p and the standard error is sqrt(p(1-p) / n), so to get a visually consistent image from a stochastic strategy a fairly large n is necessary. Thoughts?

Let's leave it as 10. This is the same default as the tournament and the Ashlock fingerprint and at least means that a default run isn't computationally costly. As shown by the images above this already gives interpretable fingerprints: for example we clearly see TF1's handshake (admittedly TF1 is not a stochastic strategy) and how it periodically takes advantage of cooperative opponents.

When we run these on the fingerprints repo we'll bump up the repetitions (as any user can).

> Also, is there a reason not to use all (perhaps short run time) strategies by default, as in the axelrod examples repository? Or are these fundamentally different in some way?

A few reasons:

  1. They're not well defined: when new strategies come in the fingerprints would change.
  2. They're not interpretable: as implemented here you can read behaviour as a function of turns and how cooperative the other player (row) is. Using the big list of strategies (whilst nice to look at) doesn't fundamentally give you very easy to access information from the plot.
  3. Using the Random spectrum and given a stochastic process representation of a player (for example memory 1) you could get these analytically.
  4. You can still use this fingerprint to fingerprint against a set of opponents so they can easily be used to create the fingerprints of the Examples repo (we used those as we were building these to verify). How to do this is in the docs too.



class TransitiveFingerprint(object):
def __init__(self, strategy, opponents=None, number_opponents=50):

can we change this to number_of_opponents?

drvinceknight and others added 10 commits August 24, 2017 11:07
This means you can more easily change font size etc.
…a-v3/Axelrod into implement_transitive_fingerprint
@Nikoleta-v3

Hey @marcharper. I addressed your comments. Vince also added the ability to pass axes to the plot.
But I somehow managed to make a big mess of the commits. Could you perhaps squash them in the merge?

The only thing I haven't addressed is passing an instance for the player. I am happy to make the changes in the AshlockFingerprints as well once you give a thumbs up.

@marcharper

> Let's leave it as 10. This is the same default as the tournament and the Ashlock fingerprint and at least means that a default run isn't computationally costly. As shown by the images above this already gives interpretable fingerprints: for example we clearly see TF1's handshake (admittedly TF1 is not a stochastic strategy) and how it periodically takes advantage of cooperative opponents.

Could we at least give the user a substantial warning if repetitions is small? The sampling error with n=10 is quite large and you can see it in the plots above. While it may not matter for deterministic strategies (or the parts that are deterministic of stochastic strategies), isn't generalizing from the examples above also sampling / selection bias?

I use n=1000 or 10000 in the examples repository because of this, having found that smaller n isn't adequate in many cases. Because n is small for the standard tournament isn't necessarily a justification for using small n here -- we use large n for the tournaments in our papers. Moreover there's an important distinction between running a quick tournament and defining a fingerprint.

> 1. They're not well defined: when new strategies come in the fingerprints would change.
> 2. They're not interpretable: as implemented here you can read behaviour as a function of turns and how cooperative the other player (row) is. Using the big list of strategies (whilst nice to look at) doesn't fundamentally give you very easy to access information from the plot.
> 3. Using the Random spectrum and given a stochastic process representation of a player (for example memory 1) you could get these analytically.
> 4. You can still use this fingerprint to fingerprint against a set of opponents so they can easily be used to create the fingerprints of the Examples repo (we used those as we were building these to verify). How to do this is in the docs too.

There are some counterarguments:

  • more strategies and higher repetitions --> more unique fingerprints. That's important since a fingerprint is an invariant (like the fundamental group) and won't tell you for sure if two strategies are the same, rather only that they are different. It's not ideal for sampling noise to have a significant effect on this determination.
  • I understand that new strategies can change the fingerprint but that's an advantage IMO. They are well-defined, just not stable as the population changes -- the fingerprint resulting from the addition of a new strategy is a subset of the prior fingerprint. A larger set of opponents (even if fixed) still seems desirable.
  • I think the Random spectrum is a good addition but it's also another reason to dislike n=10. The overlap in the distributions is substantial for small n.
  • I don't see how the big list is any less interpretable, aside from missing the random spectrum. I'm not (yet) convinced that the spectrum is a good proxy for how cooperative a strategy is, rather it shows how a strategy responds to a random player (there are several strategies that do something special if the opponent appears to be acting randomly). We do have other measures of cooperativeness (e.g. overall cooperation rate and the eigenmeasures).
  • I'm not sure why 3 is relevant -- if someone wants an analytic fingerprint they can restrict the subset of opponents. We have a lot of long memory and stochastic players, for what % can an analytic result be computed?

Re: (4), this is a nicer way to compute the plots. The library has changed a lot over time and the examples repository hasn't been updated much, and I'm totally fine with the library absorbing this capability. But if we're going to call it a fingerprint rather than a "cooperation heatmap" or something more generic, then I think that we need to be careful choosing the defaults and that e.g. computational cost should not be a major concern.

drvinceknight commented Aug 24, 2017

> I use n=1000 or 10000 in the examples repository because of this, having found that smaller n isn't adequate in many cases. Because n is small for the standard tournament isn't necessarily a justification for using small n here -- we use large n for the tournaments in our papers. Moreover there's an important distinction between running a quick tournament and defining a fingerprint.

Yes, and I used 20000 on #1124: I'm not disagreeing with the need for large sample sizes when we want to approximate a mean. I do not want our default to be computationally costly. I don't want a warning (because we've not done that before and it would not be clear to many beginners). We can push up the default to 50 perhaps, but that's nowhere near the required number to get the precision you're worried about. I suggest leaving it as is; anyone actually needing this for a high level of precision would know to use a high number of repetitions. Someone who needs a very high level of precision and assumes the defaults give this seems like a lesser evil (and less likely) than putting off a newcomer to Python/Game Theory/Library by their machine crashing when they paste in some code. If you're specifically keen: we can make a note in the documentation about the number of repetitions expected to be required for high levels of precision.

> more strategies and higher repetitions --> more unique fingerprints. That's important since a fingerprint is an invariant (like the fundamental group) and won't tell you for sure if two strategies are the same, rather only that they are different. It's not ideal for sampling noise to have a significant effect on this determination.

Yup: and we can do this: on the fingerprints repo we will run with 20k reps and include both against "all strategies" (perhaps just short run time?) and against the random spectrum.

It's not just an invariant: it can also indicate where the differences are, which I think is easier with the spectrum (compared to Ashlock's fingerprints, which are ridiculous to interpret).

> I understand that new strategies can change the fingerprint but that's an advantage IMO. They are well-defined, just not stable as the population changes -- the fingerprint resulting from the addition of a new strategy is a subset of the prior fingerprint. A larger set of opponents (even if fixed) still seems desirable.

By well defined I meant in the mathematical sense of the term: that they should not change. To describe the current default mathematically is easy enough, to describe the alternative essentially ties this fingerprint to a specific version of the Axelrod library.

> I think the Random spectrum is a good addition but it's also another reason to dislike n=10. The overlap in the distributions is substantial for small n.

Sure.

> I don't see how the big list is any less interpretable, aside from missing the random spectrum. I'm not (yet) convinced that the spectrum is a good proxy for how cooperative a strategy is, rather it shows how a strategy responds to a random player (there are several strategies that do something special if the opponent appears to be acting randomly). We do have other measures of cooperativeness (e.g. overall cooperation rate and the eigenmeasures).

If you look at the fingerprints for TF1 in the Moran paper you can really only see the handshake. If you look at the fingerprint against the spectrum here you can see the handshake and also how it periodically defects against strategies that cooperate more often.

> I'm not sure why 3 is relevant -- if someone wants an analytic fingerprint they can restrict the subset of opponents. We have a lot of long memory and stochastic players, for what % can an analytic result be computed?

I don't know what you mean. I think my point again is simply that the current default is easily defined mathematically (akin to Ashlock's fingerprints, except we now just consider exponentiating the underlying matrix of the Markov chain) and can be studied as such (similar to a lot of Ashlock's work).
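The Markov chain remark can be illustrated with a short sketch. It assumes a memory-one player described by four cooperation probabilities, one for each outcome (own move, opponent's move) of the previous turn, plus a first-move probability, playing a random opponent that cooperates with probability q. The function name `cooperation_rates` and this tuple representation are assumptions for illustration; this is not code from the library.

```python
def cooperation_rates(memory_one, first_move, q, turns):
    """Exact per-turn cooperation probabilities against a random opponent.

    memory_one = (p_CC, p_CD, p_DC, p_DD): probability of cooperating after
    each previous outcome (own move, opponent's move).
    first_move: probability of cooperating on the first turn.
    q: probability that the random opponent cooperates on any turn.
    """
    # Transition matrix over the states CC, CD, DC, DD. From a state with
    # cooperation probability p, the next state is (own, opponent) with the
    # two moves drawn independently.
    matrix = [
        [p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)]
        for p in memory_one
    ]
    # Distribution over states after the first turn.
    state = [first_move * q, first_move * (1 - q),
             (1 - first_move) * q, (1 - first_move) * (1 - q)]
    rates = []
    for _ in range(turns):
        # The player cooperated in states CC and CD.
        rates.append(state[0] + state[1])
        # One step of the chain: multiply the row vector by the matrix.
        state = [sum(state[i] * matrix[i][j] for i in range(4))
                 for j in range(4)]
    return rates


# Tit For Tat as a memory-one player: cooperate first, then copy the opponent.
rates = cooperation_rates((1, 0, 1, 0), first_move=1, q=0.3, turns=5)
print([round(rate, 3) for rate in rates])
```

For Tit For Tat, written here as (1, 0, 1, 0) with a cooperative first move, the rates are 1 on the first turn and q on every later turn, which gives one row of the fingerprint without any simulation.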


TLDR

We're essentially disagreeing on default arguments:

  • Repetitions: the library's defaults are computationally cheap throughout. This should not change.
  • Opponents: I don't personally find the fingerprints against other strategies valuable/helpful but I know you do which is why I thought we should ensure there was the option to fp against a list of opponents. Given that there's the extra step to create the spectrum it feels sensible to have them as a default.

@drvinceknight

> But if we're going to call it a fingerprint rather than a "cooperation heatmap" or something more generic,

I'd be quite happy removing the ability to pass opponents altogether to this fingerprint, so that it can just be its own little harmless thing, and letting those examples keep the name "cooperation heatmaps", but that seemed silly at the time of writing given that it's the same code.

marcharper commented Aug 25, 2017

> Someone who needs a very high level of precision and assumes the defaults give this seems like a lesser evil (and less likely) than putting off a newcomer to Python/Game Theory/Library by their machine crashing when they paste in some code

I don't buy the beginner argument / computational cost in this case. To the contrary, I would say it's very beginner unfriendly to knowingly return something inaccurate when a beginner would potentially not be aware of the inaccuracy.

It's also not analogous to how we treat matches and tournaments -- they are single observation objects that have a distribution. This fingerprint is the whole distribution. I also doubt that the fingerprint generator is going to crash most machines, though I haven't checked the memory footprint in a broad collection of cases. Regardless we should be caching to disk by default for the spatial tournament used to compute the fingerprint.

As for CPU time: it takes <30 seconds on a laptop that cost ~$750 a few years ago to run a fingerprint with repetitions=1000 (all other defaults unchanged) against Random(), and there's a progress bar. For reference a tournament with the default options for the short_run_time_strategies takes about 9 minutes on the same machine. A default of repetitions=1000 doesn't seem like it breaks the "computationally cheap" barrier, and I'm not sure what use case repetitions=10 covers.

As a meta point, I don't think computationally cheap should take precedence over accuracy, rather the principle should be "as computationally cheap as possible given a minimum accuracy threshold" or something similarly more refined.

> By well defined I meant in the mathematical sense of the term: that they should not change. To describe the current default mathematically is easy enough, to describe the alternative essentially ties this fingerprint to a specific version of the Axelrod library.

Sorry to be pedantic, but that's what a lot of math training does to someone: I think it is well-defined for any collection of opponents. By the mathematical definition: given equivalent inputs, which includes the collection of opponents and their ordering, and a large number of repetitions, the fingerprint will be the same (within some error limit). By using a low number of repetitions the fingerprint is not well-defined in this sense, even for a fixed version of the library. I'm less concerned with the default set of opponents than I am with the repetitions -- the latter seems more likely to lead to errors in logic and understanding -- but I also don't see why we wouldn't use e.g. all the short_run_time_strategies (aka cheap strategies) in the current version. What's the big deal if the fingerprints change slightly with some versions? A paper will use a fixed version anyway.

> I don't want a warning (because we've not done that before and it would not be clear to many beginners).

I don't see why a warning is inappropriate -- lots of software warns the user that a result may be inaccurate given certain parameters. Isn't it misleading not to do so? A beginner could reach the wrong conclusions if not warned. I now think we should give more warnings -- e.g. "this match/tournament is stochastic, repeat many times for accuracy".
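A warning of the kind proposed here could look roughly like the sketch below, which uses Python's standard `warnings` module. The helper name `check_repetitions`, the threshold of 1000, and the message text are hypothetical; nothing like this is confirmed to exist in the library.

```python
import warnings


def check_repetitions(repetitions, threshold=1000):
    """Warn when repetitions is likely too small for a stable estimate.

    The threshold is an illustrative assumption, not a library setting.
    """
    if repetitions < threshold:
        warnings.warn(
            f"repetitions={repetitions} is small; cooperation rates of "
            "stochastic strategies may be noisy. Consider a larger value.",
            UserWarning,
            stacklevel=2,
        )
    return repetitions


# Usage: record warnings to show that only the small value triggers one.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_repetitions(10)
    check_repetitions(5000)

print(len(caught))
```

A warning like this leaves the default cheap while still telling the user that the result may be noisy, which is the trade-off under discussion.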

> I'd be quite happy removing the ability to pass opponents altogether to this fingerprint ...

This I would not support. It makes this into a niche use case. If you recall I wanted to add the cooperation heatmaps to the library a long time ago but this was decided against at the time, so unless something has changed I don't see why we would add a less complete feature.

In the interest of merging the PR, I don't think I'm asking for much. Either:

  • increase the reps to e.g. 1000 (std error ~1% absolute), or
  • warn the user of inaccuracy

I'd like to discuss the choice of default strategies more, but that discussion isn't merge blocking IMO.

@drvinceknight

> In the interest of merging the PR, I don't think I'm asking for much. Either:
>
>   • increase the reps to e.g. 1000 (std error ~1% absolute), or

I have pushed c0d9590 which changes the default to 1000. @Nikoleta-v3 can you double check that looks ok.

@meatballs I am assuming you're still fine with this so will merge once tests have passed.

Nikoleta-v3 commented Aug 25, 2017

@drvinceknight Your changes look good to me!

@drvinceknight drvinceknight merged commit 0d1c47e into Axelrod-Python:master Aug 25, 2017