support files without default namespace #267

Merged
merged 1 commit into from
Sep 8, 2024


hhaensel
Contributor

@hhaensel hhaensel commented Sep 4, 2024

I have an xlsx file that is generated by a website and can be read by both Excel and openpyxl.
However, I get a "no node name" error when trying to parse the file with XLSX.
Unfortunately, I am not able to share the file at the moment,
but I did manage to track down the root cause.

The file uses a non-default namespace, so the node names are prefixed with the namespace ("x:row" instead of "row"). Furthermore, the worksheet styles included no default namespace, only the namespace "x"=>"http://schemas.openxmlformats.org/spreadsheetml/2006/main".

This PR splits off the namespace prefix for node name checks and accepts a single namespace in the assertion.
However, it does not check that the namespace is the correct one. Since only one namespace is present, handling it this way is probably OK.
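To illustrate the idea of splitting off the prefix, here is a minimal sketch in plain Julia. The helper name `strip_namespace_prefix` is hypothetical and is not the function added in this PR; it only shows how a prefixed name such as "x:row" can be reduced to its local part before comparison.

```julia
# Hypothetical helper for illustration only; not taken from XLSX.jl.
# Drops an optional namespace prefix from a qualified XML node name.
function strip_namespace_prefix(qualified_name::AbstractString)
    # Split at the first ':' only; a name without a prefix yields one part.
    parts = split(qualified_name, ':'; limit=2)
    return String(last(parts))
end

strip_namespace_prefix("x:row")  # "row"
strip_namespace_prefix("row")    # "row"
```

With a helper like this, a check written against "row" accepts both the prefixed and the unprefixed form, which is also why it cannot tell whether the prefix maps to the expected namespace.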

@hhaensel
Contributor Author

hhaensel commented Sep 5, 2024

For debugging I used the following lines. They could be a good starting point if we want to match the namespace in the future.

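import XLSX
# Assumption: the internal stream reader is an EzXML.StreamReader
using EzXML: nodedepth, nodename

# `filename` is the path to the affected xlsx file
xf = XLSX.openxlsx(filename, enable_cache=false)

name_sheet1 = XLSX.sheetnames(xf)[1]
sheet = xf[name_sheet1]
itr = XLSX.eachrow(sheet)

ws = XLSX.get_worksheet(itr)
wb = XLSX.get_workbook(ws)

# locate the worksheet's XML file inside the archive and open it as a stream
target_file = XLSX.get_relationship_target_by_id("xl", wb, ws.relationship_id)
zip_io, reader = XLSX.open_internal_file_stream(XLSX.get_xlsxfile(ws), target_file)

# repeat execution of the following block for debugging
begin
    r = iterate(reader)  # advance the reader to the next node
    nodedepth(reader), nodename(reader)
end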
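If the namespace were to be matched in the future, one possible shape for that check is sketched below in plain Julia. Everything here is an assumption for illustration: the helper `is_spreadsheetml_node`, the constant `SPREADSHEETML_URI`, and the representation of the declared namespaces as a `Dict` from prefix to URI (with "" standing for the default namespace) are hypothetical and do not reflect how XLSX or the XML reader expose namespaces.

```julia
# Namespace URI quoted in the PR description.
const SPREADSHEETML_URI = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Hypothetical helper: true if `qualified_name` has the expected local name
# and its prefix resolves to the spreadsheetml namespace.
# `namespaces` maps prefix => URI; the key "" stands for the default namespace.
function is_spreadsheetml_node(qualified_name::AbstractString,
                               local_name::AbstractString,
                               namespaces::AbstractDict)
    parts = split(qualified_name, ':'; limit=2)
    prefix = length(parts) == 2 ? String(parts[1]) : ""
    name = last(parts)
    return name == local_name && get(namespaces, prefix, nothing) == SPREADSHEETML_URI
end

ns = Dict("x" => SPREADSHEETML_URI)
is_spreadsheetml_node("x:row", "row", ns)  # true
is_spreadsheetml_node("row", "row", ns)    # false, no default namespace declared
```

Compared with simply dropping the prefix, this rejects nodes whose prefix is undeclared or bound to a different URI, at the cost of having to collect the namespace declarations first.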

@felipenoris felipenoris merged commit dc7d747 into felipenoris:master Sep 8, 2024
22 checks passed