Support word, powerpoint, excel, html, epub + math improvements

@VikParuchuri VikParuchuri released this 28 Feb 23:55
b985880

Support xlsx, docx, pptx, html, epub

Marker now supports additional document formats: xlsx, docx, pptx, html, and epub. To install all the dependencies these formats need, run pip install marker-pdf[full].

Improved text detection

OCR should now work better due to an improved text detection model.

Inline math improvements

  • Better inline math detection with an improved model.
  • Inline math lines are now inference.
  • New --redo-inline-math option that enables the highest-quality math detection.
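
As a rough illustration of how the new formats and the math option might be combined in a script, here is a minimal sketch. It only builds a command line and does not call Marker itself. The helper name build_marker_command is hypothetical, the marker_single entry point is an assumption about how Marker is installed, and the flag spelling is copied from the notes above, so check it against marker_single --help before relying on it.

```python
from pathlib import Path

# Formats named in these release notes, plus PDF (Marker's original input format).
SUPPORTED_SUFFIXES = {".pdf", ".xlsx", ".docx", ".pptx", ".html", ".epub"}


def build_marker_command(path, redo_inline_math=False):
    """Return an argument list for converting one file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported format: {suffix or path}")
    # Assumption: `marker_single` is the installed single-file entry point.
    command = ["marker_single", str(path)]
    if redo_inline_math:
        # Flag spelling copied from the release notes; verify with --help.
        command.append("--redo-inline-math")
    return command


if __name__ == "__main__":
    print(build_marker_command("report.docx", redo_inline_math=True))
```

The returned list can be passed to subprocess.run(command, check=True) once Marker and its optional dependencies are installed.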

Misc improvements

  • Support for the Claude model
  • Improve benchmarking scripts
  • Merge lines better with new text detection model

Full Changelog: v1.5.5...v1.6.0