Timeout error when scraping Capology #46

Closed
c10vis opened this issue Aug 16, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@c10vis

c10vis commented Aug 16, 2024

ScraperFC version: 3.1.0
Selenium version: 4.23.1

As usual, I import ScraperFC, initialize the Capology scraper per the documentation, and attempt to scrape EPL data from 2023-24. The result is a timeout error (see photo below). I have tried various seasons and leagues with similar results. I have been able to scrape from other modules in ScraperFC with no problems. I have also made a Capology account and logged in through my browser; this has not changed my results.

[Image: screenshot of the timeout error, "Screenshot 2024-08-16 at 15 24 15"]

@oseymour
Owner

Hey @c10vis, I just tried this locally and it worked with no error. Not sure what's going on with yours. Are you still getting the error?

@c10vis
Author

c10vis commented Aug 20, 2024

Yeah, still getting the error. I wonder if it has to do with Chrome? I'm not sure how the backend works, but it seems like the scraper is using a Chrome driver that's causing an issue. I don't know whether that's specific to my setup or just how it generally works.

@oseymour
Owner

I doubt it's a Chrome issue. You're correct: Selenium creates a chromedriver (essentially just a Chrome window), and using that avoids a lot of anti-scraping measures compared with a plain HTTP request via requests. Using a chromedriver also allows for interacting with the webpage (e.g., changing the currency on Capology).

I can't do much to debug this without having the issue myself. You could try increasing the timeout duration.

What OS are you using? Is this running on your laptop or a remote machine/server?
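
For context on the timeout suggestion above, here is a minimal sketch of what an explicit wait does: it polls a condition until it succeeds or a deadline passes. This is not ScraperFC's code; `wait_until` and `WaitTimeout` are hypothetical names for illustration. In Selenium itself the equivalent is `WebDriverWait(driver, timeout).until(...)`, whose `timeout` argument is presumably the number that would need raising.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy before the deadline."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds
    have elapsed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)
```

A larger `timeout` only gives a slow page more time to produce the element; if the element never appears at all, the wait still fails, just later.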

@c10vis
Author

c10vis commented Aug 24, 2024

I'm running macOS 14.6.1 on my personal laptop (M1 MacBook Pro).

I timed the issue and it runs for about 1 min 20s before the timeout error hits. How would I go about increasing the timeout duration?

@oseymour
Owner

oseymour commented Sep 7, 2024

Sorry for the delay. Was moving in with my girlfriend.

What you measured is (I assume) the total time until the error is raised. The timeout for finding the element that is failing is 10 seconds. To change it, you'll need to go to the directory where pip installs Python packages and increase the 10 to something larger in the relevant .py file. I don't know where that directory is on macOS, but Google should be able to tell you. Follow the error traceback you get to see which line needs to be edited.
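
One way to find that directory without searching is to ask Python where it imported the package from. The sketch below uses only the standard library; `package_dir` is a hypothetical helper name, and the import name `ScraperFC` is assumed from the usage described earlier in this thread.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory an importable package lives in, or None.

    Returns None if the package is not installed or has no file
    origin (for example, a namespace package).
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin is the path to the package's __init__.py
    # (or to the module file itself for a single-file module).
    return Path(spec.origin).parent
```

Running `print(package_dir("ScraperFC"))` in the same environment that raises the error should print the folder containing the .py files; the traceback names the exact file and line. `pip show ScraperFC` also reports the install location. Note that any manual edit there is overwritten the next time the package is reinstalled or upgraded.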

@oseymour
Owner

oseymour commented Nov 7, 2024

Hey @c10vis, I was finally able to reproduce this issue on my machine and I've got a fix coming! Running tests on it now. I'm travelling this weekend but should be able to get it committed when I get back.

@oseymour oseymour added the bug Something isn't working label Nov 15, 2024