Dev #573
VikParuchuri
commented
Feb 21, 2025
- Support XLSX, DOCX, PPTX, HTML, EPUB
- Improve inline math detection
- Add Claude service
Inline math
Add Support for DOCX, PPTX, XLSX, HTML and Epub
Fix character encoding issues when loading configuration files with non-ASCII characters.
Fix character encoding issues when loading JSON configuration files
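The diff for the encoding fix is not shown in this conversation, so the following is only a minimal sketch of the usual remedy, assuming the problem came from opening files with the platform default encoding. The helper name `load_json_config` is hypothetical and is not taken from the PR.

```python
import json


def load_json_config(path):
    """Load a JSON config file, decoding it explicitly as UTF-8.

    Hypothetical helper: without an explicit encoding, open() falls back
    to the locale's preferred encoding, which can mis-decode non-ASCII
    characters on some platforms.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Passing `encoding="utf-8"` makes the result independent of the machine's locale settings.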
api_key=self.claude_api_key,
)

def __call__(
This should update the block metadata, as the other services do.
marker/schema/text/line.py
Outdated
if self.formats is None:
    self.formats = other.formats
elif other.formats is not None:
    self.formats.extend(other.formats)
Minor: this might produce repeated formats, though that probably does not affect anything downstream currently.
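One way to avoid the repeats, sketched as a standalone function rather than as the method under review. The name `merge_formats` and the use of plain string lists are assumptions for illustration; the actual type of `formats` in `marker/schema/text/line.py` is not shown here.

```python
from typing import List, Optional


def merge_formats(
    current: Optional[List[str]], other: Optional[List[str]]
) -> Optional[List[str]]:
    """Merge two optional format lists, keeping order and skipping repeats.

    Hypothetical helper mirroring the None handling in the hunk above.
    """
    if other is None:
        return current
    if current is None:
        # Copy so the two owners do not share one list object.
        return list(other)
    merged = list(current)
    for fmt in other:
        if fmt not in merged:
            merged.append(fmt)
    return merged
```

For example, `merge_formats(["bold"], ["bold", "italic"])` gives `["bold", "italic"]` instead of a list with `"bold"` twice. Copying in the `current is None` branch also avoids aliasing `other.formats`, which the original assignment would do.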
def merge_consecutive_math(html, tag="math"):
    if not html:
        return html
    pattern = fr'-</{tag}>(\s*)<{tag}>'
Is this hyphen required?
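To make the effect of the hyphen concrete, here is a small sketch that builds the pattern with and without it. The function name `merge_adjacent`, the `require_hyphen` flag, and the replacement string are all assumptions; the hunk only shows the pattern, not what `merge_consecutive_math` substitutes for a match.

```python
import re


def merge_adjacent(html: str, tag: str = "math", require_hyphen: bool = True) -> str:
    """Join adjacent <tag> runs, optionally only when the first ends in '-'.

    Hypothetical sketch: the replacement (keep the whitespace, drop the
    closing/opening tag pair and the hyphen) is assumed, not from the PR.
    """
    if not html:
        return html
    prefix = "-" if require_hyphen else ""
    pattern = rf"{prefix}</{tag}>(\s*)<{tag}>"
    return re.sub(pattern, r"\1", html)
```

With the hyphen in the pattern, `<math>a</math> <math>b</math>` is left untouched, because only runs whose first element ends in `-` match. Without it, the same input collapses to `<math>a b</math>`.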