Releases: VikParuchuri/marker

Pagination, bug fixes

17 Jun 17:04
fe9343c
  • Add a setting to enable output pagination
  • Enable convert.py to use MPS (though it is less memory-efficient than CPU or CUDA)
  • Fix bug with inference ram setting
  • Fix bug with pdf names with dots in them
  • Fix bug with images at the end of blocks
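
The note above does not say what the dotted-name bug was. One common cause of this class of bug is splitting a filename on its first dot instead of its last, and the sketch below illustrates that failure mode using only the standard library. Both function names are hypothetical and are not taken from marker.

```python
from pathlib import Path


def naive_basename(filename: str) -> str:
    # Hypothetical buggy approach: everything after the FIRST dot is dropped.
    return filename.split(".")[0]


def safe_basename(filename: str) -> str:
    # Path.stem removes only the final suffix, so inner dots survive.
    return Path(filename).stem


print(naive_basename("report.v2.final.pdf"))  # report
print(safe_basename("report.v2.final.pdf"))   # report.v2.final
```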

Fix convert.py bug

30 May 01:55
53125ac

Fix model device check.

Specify page range

29 May 18:09
aa8e7f0
  • Make it clearer that MPS can't be used with convert.py
  • Specify page range in convert with start_page and max_pages
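
As a rough illustration of how a start_page / max_pages pair can select a page range, here is a minimal sketch. The helper `select_pages` is hypothetical, and the 0-based indexing and clamping behavior are assumptions, not marker's documented semantics.

```python
from typing import List, Optional


def select_pages(
    total_pages: int,
    start_page: Optional[int] = None,
    max_pages: Optional[int] = None,
) -> List[int]:
    # Assumption: start_page is a 0-based index; None means "from the beginning".
    start = max(0, min(start_page or 0, total_pages))
    # Assumption: max_pages caps how many pages are taken; None means "to the end".
    if max_pages is None:
        end = total_pages
    else:
        end = min(total_pages, start + max(0, max_pages))
    return list(range(start, end))


print(select_pages(10, start_page=2, max_pages=3))  # [2, 3, 4]
```

Clamping to the document length means a request that runs past the last page returns a shorter range instead of raising an error.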

Python 3.12 compatibility

28 May 22:36
a3334ce
  • Remove ray to enable Python 3.12 compatibility
  • Removing ray frees a lot of VRAM (since we can use torch shared tensors), so with convert.py each process now takes about 3GB of VRAM on average, down from between 4.5GB and 5GB. This enables much higher throughput.
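
To make the throughput claim concrete, the arithmetic below estimates how many worker processes fit on one GPU at each per-process footprint. The 24GB card is an assumed example, and the estimate ignores any other VRAM overhead.

```python
def max_processes(total_vram_gb: float, per_process_gb: float) -> int:
    # Whole processes that fit, ignoring any other VRAM overhead.
    return int(total_vram_gb // per_process_gb)


GPU_VRAM_GB = 24  # assumed example card, not a figure from the release notes

print(max_processes(GPU_VRAM_GB, 3.0))  # 8 (after removing ray)
print(max_processes(GPU_VRAM_GB, 4.5))  # 5 (before, best case)
print(max_processes(GPU_VRAM_GB, 5.0))  # 4 (before, worst case)
```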

OCR speedups

28 May 04:34
7bf2e91
  • Pull in new surya and pdftext versions for speedups in OCR and text extraction, respectively
  • Refine heuristics to reduce OCR false positives (and true positives, unfortunately)
  • Enable float batch multipliers
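
A float multiplier lets batch sizes be scaled down as well as up. The sketch below shows one way this could work; `scaled_batch_size` is a hypothetical helper, and the truncation and floor-of-one behavior are assumptions, not marker's actual logic.

```python
def scaled_batch_size(base_batch_size: int, multiplier: float) -> int:
    # Truncate toward zero, but never go below a batch size of 1.
    return max(1, int(base_batch_size * multiplier))


print(scaled_batch_size(8, 1.5))   # 12
print(scaled_batch_size(8, 0.5))   # 4
print(scaled_batch_size(2, 0.25))  # 1
```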

Speed improvements

23 May 23:24
0d9b0db
  • Enable parallel text extraction, with worker count settings
  • Bump surya version to pull in layout/line segmentation speed improvements, and OCR bug fix
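
For readers unfamiliar with the pattern, here is a minimal sketch of parallel per-page extraction with a configurable worker count. It is not marker's implementation: `extract_page_text` is a placeholder, `workers` is a hypothetical setting name, and a thread pool is used only to keep the example short (a process pool is the more usual choice for CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def extract_page_text(page: str) -> str:
    # Placeholder for real per-page extraction: just normalize whitespace.
    return " ".join(page.split())


def extract_all(pages: List[str], workers: int = 4) -> List[str]:
    # pool.map returns results in input order, so page order is preserved.
    with ThreadPoolExecutor(max_workers=max(1, workers)) as pool:
        return list(pool.map(extract_page_text, pages))


print(extract_all(["  first   page ", "second\npage"], workers=2))
# ['first page', 'second page']
```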

Faster OCR

18 May 04:28
cc9d830
  • OCR is now ~2.5x faster, due to improvements in surya

Speed up inference

17 May 22:57
a056562
  • (from surya) Faster OCR, line detection, and layout inference
  • Unpin transformers version after testing

Should be significantly faster now, but I haven't fully benchmarked it, since I'm running low on time this week!

Fix memory leak

16 May 22:46
74adf35
  • Fix a memory leak (fixed in surya, bumped the version). This caused high CPU memory usage on long docs.
  • Improve load_all_models to take device and dtype

Marker v2

10 May 16:02
6f8b239

Basically a full rewrite!

Main features:

  • Extracts and saves images
  • Improved table formatting
  • Better markdown wrapping
  • Better reading order on complex docs
  • Improved OCR engine with more language options
  • Simple pip package install (no more required system dependencies), so it can be used easily on Windows
  • Can be used commercially (pymupdf and layoutlmv3 dependencies removed)

It takes ~2x as long to run now, but that seems like a decent tradeoff.

See the README for details.