pycsw_dist fails when using /v1/update endpoint. #187

Closed
magnarem opened this issue Jun 2, 2023 · 3 comments

magnarem commented Jun 2, 2023

While trying to update a parent dataset in DMCI, pycsw_dist fails with an Internal Server Error from pycsw.

To reproduce:

curl --data-binary "@mmd-xml-dev/arch_b/arch_1/arch_3/c7f8731b-5cfe-4cb5-ac57-168a19a2957b.xml" https://dmci.s-enda-dev.k8s.met.no/v1/update

output:

The following distributors failed: pycsw
 - pycsw: <html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>
    
  </body>
</html>

DEBUG log from DMCI says:

[2023-06-02 10:39:34,164]              dmci.api.worker:304  INFO     XML file metadata_identifier: no.met.dev:c7f8731b-5cfe-4cb5-ac57-168a19a2957b
[2023-06-02 10:39:34,164]              dmci.api.worker:306  DEBUG    XML file metadata_identifier namespace: no.met.dev
[2023-06-02 10:39:34,164]              dmci.api.worker:308  DEBUG    XML file metadata_identifier UUID: c7f8731b-5cfe-4cb5-ac57-168a19a2957b
[2023-06-02 10:39:34,164]              dmci.api.worker:321  DEBUG    File UUID: c7f8731b-5cfe-4cb5-ac57-168a19a2957b
[2023-06-02 10:39:34,164]              dmci.api.worker:225  INFO     Performing in depth checking.
[2023-06-02 10:39:34,195]         dmci.tools.check_mmd:305  DEBUG    Checking element(s) containing URL ...
[2023-06-02 10:39:34,195]         dmci.tools.check_mmd:313  DEBUG    Checking element geographic_extent/rectangle ...
[2023-06-02 10:39:34,196]              dmci.api.worker:103  DEBUG    Identifier namespace: no.met.dev
[2023-06-02 10:39:34,196]              dmci.api.worker:104  DEBUG    Environment customization: dev
[2023-06-02 10:39:34,197]  dmci.distributors.file_dist:92   INFO     Created folder: /archive/arch_b/arch_1/arch_3
[2023-06-02 10:39:34,197]  dmci.distributors.file_dist:107  INFO     Replaced file: c7f8731b-5cfe-4cb5-ac57-168a19a2957b.xml
[2023-06-02 10:39:34,641] dmci.distributors.pycsw_dist:202  ERROR    <html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>

  </body>
</html>

[2023-06-02 10:39:34,641] dmci.distributors.pycsw_dist:139  DEBUG    Update status: False. With response: <html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>

  </body>
</html>

[2023-06-02 10:39:34,664]  dmci.distributors.solr_dist:125  DEBUG    Parent/Level-1 - dataset - no.met.dev:c7f8731b-5cfe-4cb5-ac57-168a19a2957b
[2023-06-02 10:39:34,690]  dmci.distributors.solr_dist:138  INFO     Indexed document no.met.dev:c7f8731b-5cfe-4cb5-ac57-168a19a2957b in SolR

From this we can see that both file_dist and solr_dist successfully update the file, but pycsw_dist fails.

Here is the log from pycsw when DMCI fails with the log above:

pycsw /
pycsw 172.16.15.7 - - [02/Jun/2023:10:45:30 +0000] "POST / HTTP/1.1" 500 0 "-" "-"
pycsw Fri, 02 Jun 2023 10:45:30] [DEBUG] file=/home/pycsw/pycsw/pycsw/ogc/csw/csw2.py line=1641 module=csw2 function=parse_postdata Request operation Transaction specified.
pycsw [2023-06-02 10:45:30 +0000] [11] [ERROR] Error handling request /
pycsw Traceback (most recent call last):
pycsw   File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/sync.py", line 134, in handle
pycsw     self.handle_request(listener, req, client, addr)
pycsw   File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/sync.py", line 175, in handle_request
pycsw     respiter = self.wsgi(environ, resp.start_response)
pycsw   File "/home/pycsw/pycsw/pycsw/wsgi.py", line 80, in application
pycsw     status, contents = csw.dispatch_wsgi()
pycsw   File "/home/pycsw/pycsw/pycsw/server.py", line 259, in dispatch_wsgi
pycsw     return self.dispatch()
pycsw   File "/home/pycsw/pycsw/pycsw/server.py", line 430, in dispatch
pycsw     self.kvp = self.iface.parse_postdata(self.request)
pycsw   File "/home/pycsw/pycsw/pycsw/ogc/csw/csw2.py", line 1843, in parse_postdata
pycsw     rpname = recprop.find(util.nspath_eval('csw:Name',
pycsw AttributeError: 'NoneType' object has no attribute 'text'

@magnarem magnarem added the bug Something isn't working label Jun 2, 2023

magnarem commented Jun 7, 2023

See https://gitlab.met.no/s-enda/pycsw/-/blob/master-met/pycsw/core/repository.py#L312

It looks like the update has two modes:

  1. Update only one property in the document. (I think this is the logic that gets executed, based on how the _update function in the DMCI pycsw_dist is implemented.)
  2. A full update, used when no property is given in the request. This is the logic in pycsw that we want to execute when updating from the DMCI pycsw_dist.
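
To make the two modes concrete, here is a rough Python sketch of how the two request shapes differ. It is an illustration only, not pycsw's or DMCI's actual code: the function name `update_mode`, the two sample documents and the placeholder record namespace are made up for this example. Only the CSW 2.0.2 namespace and the element names (`csw:Transaction`, `csw:Update`, `csw:RecordProperty`, `csw:Name`, `csw:Value`) come from the CSW standard.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
NS = {"csw": CSW_NS}


def update_mode(transaction_xml: str) -> str:
    """Classify a csw:Transaction update as 'property' or 'full' (hypothetical helper)."""
    root = ET.fromstring(transaction_xml)
    update = root.find("csw:Update", NS)
    if update is None:
        raise ValueError("request contains no csw:Update element")
    # A csw:RecordProperty child means "change only this property".
    if update.find("csw:RecordProperty", NS) is not None:
        return "property"
    # Otherwise csw:Update carries a whole replacement record.
    return "full"


# Mode 1: property update. A real request would also carry a csw:Constraint
# selecting the records to change; it is left out here for brevity.
PROPERTY_UPDATE = f"""<csw:Transaction xmlns:csw="{CSW_NS}" service="CSW" version="2.0.2">
  <csw:Update>
    <csw:RecordProperty>
      <csw:Name>apiso:Title</csw:Name>
      <csw:Value>New title</csw:Value>
    </csw:RecordProperty>
  </csw:Update>
</csw:Transaction>"""

# Mode 2: full update. The record element below is a placeholder, not real MMD/ISO.
FULL_UPDATE = f"""<csw:Transaction xmlns:csw="{CSW_NS}" service="CSW" version="2.0.2">
  <csw:Update>
    <record xmlns="urn:example:placeholder"><title>New title</title></record>
  </csw:Update>
</csw:Transaction>"""

print(update_mode(PROPERTY_UPDATE))
print(update_mode(FULL_UPDATE))
```

The traceback above fails while reading `csw:Name` from a record property, which would be consistent with the request being handled as mode 1 rather than mode 2.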

@magnarem

The way to solve this is to remove the <csw:constraint> XML tag from the request at

Here is an example of a full update from the pycsw testsuite: https://gitlab.met.no/s-enda/pycsw/-/blob/master-met/tests/functionaltests/suites/manager/post/Transaction-dc-02-update-full.xml

So I think the update function in csw_distributor should be more or less the same as the insert function, but with the <csw:update> tag instead of the <csw:insert> tag.
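
A minimal sketch of that idea, assuming the insert and update requests can share one builder that differs only in the wrapping element. The function name `build_transaction` and the sample record are hypothetical and are not taken from DMCI; the actual fix is in the linked MR/PR. Note that the CSW 2.0.2 element names are capitalised (`csw:Insert`, `csw:Update`).

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
# Serialise the CSW namespace with the conventional "csw" prefix.
ET.register_namespace("csw", CSW_NS)


def build_transaction(action: str, record_xml: str) -> str:
    """Wrap a full record in a csw:Transaction (hypothetical helper).

    action is "Insert" or "Update". No csw:Constraint and no
    csw:RecordProperty are added, so an "Update" is a full update.
    """
    if action not in ("Insert", "Update"):
        raise ValueError(f"unsupported action: {action}")
    root = ET.Element(
        f"{{{CSW_NS}}}Transaction", {"service": "CSW", "version": "2.0.2"}
    )
    wrapper = ET.SubElement(root, f"{{{CSW_NS}}}{action}")
    # The complete record becomes the only child of csw:Insert / csw:Update.
    wrapper.append(ET.fromstring(record_xml))
    return ET.tostring(root, encoding="unicode")


# Placeholder record; a real call would pass the translated metadata document.
sample_record = "<mmd xmlns='urn:example:mmd'><id>x</id></mmd>"
print(build_transaction("Update", sample_record))
```

With this shape the update path cannot drift from the insert path, since the only difference between the two requests is the name of the wrapping element.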

@magnarem

Fixed and created MR/PR
#189

magnarem added a commit that referenced this issue Jun 21, 2023
Fixed the update-function for pycsw update (#187)