Get PDF content #3551

Closed Answered by mdmintz
matteocacciola asked this question in Q&A
Feb 21, 2025 · 1 comments · 4 replies
There's sb.get_pdf_text(pdf):

self.get_pdf_text(
    pdf, page=None, maxpages=None, password=None,
    codec='utf-8', wrap=False, nav=False, override=False, caching=True)

Example: SeleniumBase/examples/test_get_pdf_text.py

from seleniumbase import BaseCase
BaseCase.main(__name__, __file__)  # Lets the file run with "python" as well as "pytest"

class PdfTests(BaseCase):
    def test_get_pdf_text(self):
        pdf = "https://nostarch.com/download/Automate_the_Boring_Stuff_sample_ch17.pdf"
        # Extract the text of page 1 only
        pdf_text = self.get_pdf_text(pdf, page=1)
        print("\n" + pdf_text)
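
Once you have the text back as a string, a common next step is checking it for a phrase. Extracted PDF text often has line breaks and uneven spacing in the middle of sentences, so a plain substring check can miss a match. Here is a minimal sketch using only the standard library. The helper names `normalize_pdf_text` and `pdf_text_contains` are hypothetical (not part of SeleniumBase), and `sample` is a made-up stand-in for whatever `get_pdf_text()` returns:

```python
import re


def normalize_pdf_text(text: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def pdf_text_contains(text: str, phrase: str) -> bool:
    """Return True if phrase occurs in text, ignoring case and whitespace differences."""
    return normalize_pdf_text(phrase).lower() in normalize_pdf_text(text).lower()


# Stand-in for the string returned by get_pdf_text(); not real PDF output.
sample = "Automate the Boring\nStuff   with Python\n\nChapter 17"

print(pdf_text_contains(sample, "boring stuff with python"))
```

In a test you would pass `pdf_text` from the example above in place of `sample` and wrap the call in an `assert`.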

Answer selected by mdmintz