New benchmarks

Overall

Benchmarked against LlamaParse, Docling, and Mathpix (see the README for how to run the benchmarks). Marker performs favorably against these alternatives in speed, LLM-as-judge scoring, and heuristic scoring.

Table

Benchmarked table conversion against Gemini Flash.

Update Gemini model

Use the new genai library
Update to Gemini 2.0 Flash

Misc bugfixes

Fix bug with OCR heuristics not being aggressive enough
Fix bug with empty tables
Ensure references get passed through in LLM processors

What's Changed

Add llm text support for references, superscripts etc by @iammosespaulr in #523
Update overall benchmark by @VikParuchuri in #515
Benchmarks by @VikParuchuri in #531

Full Changelog: v1.3.5...v1.4.0

LLM fixes; new benchmarks