When I ran the spider, I got a "no fingerprint" error. To solve this, I used Playwright, which lets me send requests from a real browser with real browser information. Once the browser has executed the fingerprint request and sent that information to the anti-bot system, I can make every other request quickly. However, anti-bot systems also check for consistency, which caused an "inconsistent time zone" error when I moved to the next level.
When I run the spider this time, I get a big error: no fingerprint. To understand it, I need to go back to my browser. In the browser, as you can see, there are a lot of POST requests. The browser is sending a lot of information to the anti-bot system. So I'm going to check what kind of information I am sending. I'm sending the platform type, the time zone, and the real user agent. We can no longer spoof this kind of information.
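To make those three values concrete, here is a minimal sketch of what such a fingerprint body could look like. The JavaScript expressions are standard browser APIs, but the field names and the JSON layout are assumptions for illustration; the real anti-bot payload is not shown in the talk.

```python
import json

# JavaScript expressions a page can evaluate to read each value.
# The expressions are standard browser APIs; the dictionary keys
# (field names) are hypothetical.
FINGERPRINT_SOURCES = {
    "platform": "navigator.platform",
    "timezone": "Intl.DateTimeFormat().resolvedOptions().timeZone",
    "userAgent": "navigator.userAgent",
}


def build_fingerprint_payload(platform: str, timezone: str, user_agent: str) -> str:
    """Serialize the three values into a JSON body for a POST request."""
    payload = {
        "platform": platform,
        "timezone": timezone,
        "userAgent": user_agent,
    }
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    # Example values only; a real browser reports its own.
    print(build_fingerprint_payload(
        "Linux x86_64",
        "Europe/Paris",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ))
```

A plain HTTP client can fill in any strings here, but the values come from JavaScript running in the page, which is why the next step is to run a real browser.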
I need a real browser to send my requests and execute JavaScript, a browser which can be controlled by a script. I will use Playwright. Playwright drives a real browser and can run it headless. Here we can see the browser window, which is helpful for the presentation. It can execute JavaScript, and it works with Chromium-based browsers such as Chrome and Edge, with Firefox, and with WebKit, the engine behind Safari. It is open source and maintained by Microsoft. So let's see how we can adapt our spiders.
Now I can create a Playwright script based on the previous spider. In the Playwright script, I've got the same methods. I go to the home page. I post a form and get a list of hotels. And I get the details of each hotel. I extract names, emails, and ratings. If I run this spider, you will see a browser opening and going to the home page. It executes the fingerprint request and sends all the information to the anti-bot system. And now I can do every request very quickly. As you see, we cannot see the pages because it downloads only the content without rendering. So I've got my 50 items. But of course, anti-bot systems are not only checking fingerprint information. They are checking consistency.
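The flow described above can be sketched roughly as follows with Playwright's Python sync API. This is not the script from the talk: the site URL, the `/search` endpoint, the `city` form field, and the HTML patterns are all hypothetical. It assumes that loading the home page in a real page is enough for the site's JavaScript to send the fingerprint, and that later requests made through `context.request` (which shares cookies with the browser context) are then accepted.

```python
import re
from dataclasses import dataclass

BASE_URL = "https://hotels.example"  # hypothetical target site


@dataclass
class Hotel:
    name: str
    email: str
    rating: str


def extract_hotel_links(html: str) -> list:
    """Find detail-page paths, assuming links look like href="/hotels/<id>"."""
    return re.findall(r'href="(/hotels/[^"]+)"', html)


def _first(pattern: str, html: str) -> str:
    """Return the first captured group for pattern, or an empty string."""
    match = re.search(pattern, html, re.S)
    return match.group(1).strip() if match else ""


def parse_hotel(html: str) -> Hotel:
    """Extract name, email and rating from a detail page (hypothetical markup)."""
    return Hotel(
        name=_first(r"<h1[^>]*>(.*?)</h1>", html),
        email=_first(r"([\w.+-]+@[\w-]+(?:\.[\w-]+)+)", html),
        rating=_first(r'class="rating"[^>]*>(.*?)<', html),
    )


def scrape_hotels(base_url: str = BASE_URL, city: str = "paris") -> list:
    # Imported here so the parsing helpers work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # headless=False opens a visible window, as in the demo.
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()

        # Rendering the home page lets the site's own JavaScript run
        # and post the fingerprint (assumption about the target site).
        page.goto(base_url)

        # These requests reuse the context's cookies but render nothing.
        listing = context.request.post(f"{base_url}/search", form={"city": city})
        links = extract_hotel_links(listing.text())
        hotels = [
            parse_hotel(context.request.get(f"{base_url}{link}").text())
            for link in links
        ]
        browser.close()
    return hotels
```

Only the first navigation uses a rendered page; the form post and the detail requests fetch content without rendering, which is why nothing appears in the window after the home page.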
So if I move to the next level, level 6, and I run the Playwright spider again, the spider connects to the home page and sends a fingerprint. But when I execute the other requests, I get a big error: inconsistent time zone. It is happening because we are sending the real time zone of the browser.
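To see which time zone the browser is actually reporting, a small check like the one below could help. It is only a sketch: the talk does not say what the anti-bot system compares the time zone against, so `expected_timezone` is a hypothetical value supplied by the caller. `page` is assumed to be a Playwright `Page` (anything with a compatible `evaluate` method works).

```python
from typing import Optional

# Standard browser API that returns the IANA zone name, e.g. "Europe/Paris".
BROWSER_TIMEZONE_JS = "Intl.DateTimeFormat().resolvedOptions().timeZone"


def read_browser_timezone(page) -> str:
    """Return the time zone that the page's JavaScript reports."""
    return page.evaluate(BROWSER_TIMEZONE_JS)


def timezone_mismatch(browser_timezone: str, expected_timezone: str) -> Optional[str]:
    """Return a description of the mismatch, or None if the zones agree.

    expected_timezone is whatever zone the session is supposed to be in;
    how the anti-bot system derives it is an assumption, not a known fact.
    """
    if browser_timezone == expected_timezone:
        return None
    return (
        f"browser reports {browser_timezone!r} "
        f"but {expected_timezone!r} was expected"
    )
```

Calling `timezone_mismatch(read_browser_timezone(page), "America/Chicago")` after `page.goto(...)` would show whether the browser is sending the machine's real zone instead of the expected one.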