Make sure that parent datasets are present in the same environment as all its children #152
Comments
I have an idea how to implement the first part (second point), because it's very close to what I did for the `metadata_identifier`: basically we `re.search` for a block `<mmd:related_dataset relation_type="parent">[namespace]:[uuid]</mmd:related_dataset>` and, if there is one, we replace the namespace inside it, whatever that is, with the same namespace plus .dev or .staging if the environment is dev or staging. But I am not sure how to go about checking that the parent is actually present in the database... |
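A minimal sketch of the namespace rewrite described in this comment, not the actual DMCI code. The function name `add_env_to_parent_namespace`, the pattern `PARENT_RE`, and the `env` argument are hypothetical names chosen for illustration; the `.dev`/`.staging` suffix follows the wording of the comment.

```python
import re

# Hypothetical pattern: captures the opening tag, the namespace, the uuid
# and the closing tag of a parent reference of the form namespace:uuid.
PARENT_RE = re.compile(
    r'(<mmd:related_dataset relation_type="parent">)'
    r'([^:<]+):([^<]+)'
    r'(</mmd:related_dataset>)'
)


def add_env_to_parent_namespace(xml_text: str, env: str) -> str:
    """Append ".<env>" to the namespace of a parent reference, if any.

    Only "dev" and "staging" trigger a rewrite; any other value of
    `env` (assumed to mean production) returns the text unchanged.
    """
    if env not in ("dev", "staging"):
        return xml_text
    suffix = "." + env

    def _repl(match: re.Match) -> str:
        open_tag, namespace, uuid, close_tag = match.groups()
        # Do not add the suffix twice if the file was already rewritten.
        if not namespace.endswith(suffix):
            namespace += suffix
        return f"{open_tag}{namespace}:{uuid}{close_tag}"

    return PARENT_RE.sub(_repl, xml_text)
```

Files without a parent reference pass through untouched, and the `endswith` guard keeps the rewrite idempotent if the same file is processed twice.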
yes, the last one is a bit tricky - especially since we have three metadata stores (although solr is not yet in place). I guess we'll need to search all three (two now)? The different stores also must be searched in different ways, so we should implement a search function in each distributor. And we need to ensure that each distributor has a search function. This could probably be handled like for the run function in
|
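One way to make sure every distributor has a search function is to declare it abstract on a shared base class, as sketched below. This is an illustration under assumptions, not the DMCI class hierarchy: `Distributor`, `InMemoryDistributor`, and `search_all` are hypothetical names, and the in-memory store stands in for the real metadata stores, which each need their own query logic.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class Distributor(ABC):
    """Hypothetical base class: subclasses must provide run() and search()."""

    @abstractmethod
    def run(self) -> bool:
        """Distribute a dataset to the store (details omitted here)."""

    @abstractmethod
    def search(self, metadata_id: str) -> bool:
        """Return True if a dataset with this identifier is in the store."""


class InMemoryDistributor(Distributor):
    """Stand-in store used only to illustrate the interface."""

    def __init__(self, known_ids: Iterable[str]) -> None:
        self._known = set(known_ids)

    def run(self) -> bool:
        return True

    def search(self, metadata_id: str) -> bool:
        return metadata_id in self._known


def search_all(distributors: Dict[str, Distributor], metadata_id: str) -> Dict[str, bool]:
    """Ask every distributor and report the result per store name."""
    return {name: dist.search(metadata_id) for name, dist in distributors.items()}
```

A subclass that omits `search` cannot be instantiated, so the missing method is caught early. Whether a parent must be found in all stores or only in one is left to the caller.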
@charlienegri. This sounds like a good implementation. Yes, the querying of the database can be done using the examples that Shamly wrote for the docs. |
ah, this means that the check will not be part of
is meant to be working. I guess we can discuss it next week |
In the below code from
Is it like that, @mortenwh, @charlienegri? |
but we want to run the search before we validate or, better, validation is dependent on the search being successful (IF there is a |
The search function must be implemented in |
Sorry - I didn't read the full thread..
Yes, if there is that relation type, we need to check that the parent exists. If it doesn't, validation should fail with an appropriate error message to be returned to the user. |
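The rule in this comment could be sketched as follows. This is only an illustration: `validate_parent` and the `parent_exists` callback are hypothetical names, and the callback stands in for whatever search the metadata stores end up providing.

```python
import re
from typing import Callable, Tuple

# Hypothetical pattern: captures the full namespace:uuid of a parent reference.
PARENT_RE = re.compile(
    r'<mmd:related_dataset relation_type="parent">\s*([^<\s]+)\s*</mmd:related_dataset>'
)


def validate_parent(xml_text: str, parent_exists: Callable[[str], bool]) -> Tuple[bool, str]:
    """Check that a referenced parent dataset exists.

    Returns (True, "") when there is no parent reference or the parent
    is found, and (False, message) when the parent is missing.
    """
    match = PARENT_RE.search(xml_text)
    if match is None:
        # No parent relation in the file, so there is nothing to check.
        return True, ""
    parent_id = match.group(1)
    if parent_exists(parent_id):
        return True, ""
    return False, f"Parent dataset '{parent_id}' was not found in this environment"
```

Passing the lookup in as a callback keeps the validation step independent of how each store is queried.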
Hmm, why implement the search in distributor.py at that point, then?
) |
When it is implemented in I agree that the environment customization can be done in |
Also, the search is basically a dataset search, so it is enough to call the function
and I now see that it maybe should be implemented in the distributor |
ok, I see, but if we define a
in
we need a simpler call here but I guess it's inevitable to have some kind of replication with
would be enough at that point |
I have a draft, which is probably still very buggy and not pretty, but I would appreciate some feedback on the logic: https://github.com/metno/discovery-metadata-catalog-ingestor/tree/fix-issue-152 |
Did some tests in pycsw locally, and my assumption that the database in pycsw has some foreign-key relation to primary_key for child/parent datasets seems to be wrong. So pycsw does not care if I ingest a child dataset when the parent is missing from the catalog. Let's discuss this next week. |
The `metadata_identifier` element in the MMD files has identifiers of the form `naming_authority:uuid`.
The parent-child relationships defined in child datasets must be updated to the correct `naming_authority:uuid` of the parent.

Because of the way the discovery metadata databases are designed, both parent and child datasets should be in the same environment. Therefore we need to update the `naming_authority` parts of the IDs pointing to parents, both for existing datasets and those coming into DMCI. I.e., DMCI must update the `naming_authority` in the same way as it does for the `metadata_id`, by adding "-dev" or "-staging".

Since the parents need to be present in the same environment, we also need to check that the referenced parent exists, and reject the MMD file if it does not.