Feature/filter duplicate in evc #630

Merged 4 commits on Jul 29, 2021

bouthilx
Member

[Fixes #628]

Why:

During execution of the experiment, the producer verifies that suggested
trials do not already exist in parents or children, but race conditions
can still lead to duplicates. Also, in an attempt to solve #576, we will
need to duplicate trials that are not completed in parents into executed
experiments to allow reserving and executing the trials. This will lead
to more potential duplicate trials and raises the importance of handling
duplicates properly.

When fetching from the EVC, we should ignore duplicates from parents or
children if the trials are available in the current experiment. Because
fetching from the EVC is recursive, this resolves the issue at every
level of the tree. It will also simplify the handling of potential
duplicates during {naive-}algorithm updates, as there will simply be no
duplicates.

How:

During the call to adaptors, a set of hashes is generated from the trials
of the current node based on hyperparameter values (the experiment id is
ignored). Any trial from the parents or children whose hash is found in
this set is filtered out. When there is a duplicate, only the trial of
the current node is kept. This also applies recursively to calls from
children experiments to grand-children.
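
The filtering described above can be sketched as follows. This is only an illustration, not the code from this PR: trials are modeled as plain dicts, and `params_hash` and `filter_duplicates` are hypothetical names chosen for the example.

```python
import hashlib
import json


def params_hash(trial):
    """Hash a trial's hyperparameter values only, ignoring the experiment id."""
    payload = json.dumps(trial["params"], sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def filter_duplicates(current_trials, related_trials):
    """Drop parent/child trials whose params already exist in the current node."""
    known = {params_hash(trial) for trial in current_trials}
    return [trial for trial in related_trials if params_hash(trial) not in known]


# Hypothetical data: the parent holds one trial that the current node also has.
current = [{"experiment": 2, "params": {"lr": 0.1}}]
parent = [
    {"experiment": 1, "params": {"lr": 0.1}},
    {"experiment": 1, "params": {"lr": 0.01}},
]
kept = filter_duplicates(current, parent)
```

Only the parent trial with `lr=0.01` survives; the `lr=0.1` trial is dropped because the current node already holds it, even though the experiment ids differ.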

bouthilx added 4 commits July 29, 2021 13:14
Why:

When a dimension is deleted or added, the adaptor should not fill it
with a default value of None if no default value was defined. This
would lead to invalid trials if None is not a valid value for this
dimension.

How:

If the default value is the unique NO_DEFAULT_VALUE object, then the
trials should be filtered out.
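
One possible reading of this rule, as a minimal sketch: the sentinel below is a local stand-in for the unique NO_DEFAULT_VALUE object, and `add_dimension` is a hypothetical helper, not the adaptor API of this project. It only covers the case where trials are adapted to an experiment that gained a dimension.

```python
NO_DEFAULT_VALUE = object()  # hypothetical stand-in for the unique sentinel


def add_dimension(trials, name, default=NO_DEFAULT_VALUE):
    """Adapt trials to an experiment that gained the dimension `name`."""
    if default is NO_DEFAULT_VALUE:
        # No usable value for the new dimension: filter the trials out
        # rather than filling in None.
        return []
    # Otherwise, copy each trial and set the new dimension to its default.
    return [
        {**trial, "params": {**trial["params"], name: default}}
        for trial in trials
    ]


# Hypothetical parent trials that predate the "momentum" dimension.
parent_trials = [{"params": {"lr": 0.1}}]
without_default = add_dimension(parent_trials, "momentum")
with_default = add_dimension(parent_trials, "momentum", default=0.9)
```

Comparing against the sentinel with `is` keeps None available as a legitimate default for dimensions where it is a valid value.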
@bouthilx bouthilx merged commit 0921b7f into Epistimio:develop Jul 29, 2021
@bouthilx bouthilx added the enhancement Improves a feature or non-functional aspects (e.g., optimization, prettify, technical debt) label Jul 29, 2021
@bouthilx bouthilx added this to the v0.1.16 milestone Jul 29, 2021
@bouthilx bouthilx mentioned this pull request Aug 23, 2021
Development

Successfully merging this pull request may close these issues.

Avoid fetching duplicates in EVC