Python automation with Selenium for downloading dynamic captchas

With a large number of registration numbers to check regularly, I decided to develop a script that could automate this process for me.

The process involves visiting the website https://www.anaf.ro/inactivi/index.jsp (the same URL the script below opens).

On this website, I need to enter a registration number in one field and solve a captcha in another. While solving the captcha is straightforward with existing Python libraries, the challenge lies in saving the image from the dynamic URL specified in the "src" attribute.

My current approach to this issue involves using web scraping techniques, as shown below:

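from captcha_solver import CaptchaSolver
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.get("https://www.anaf.ro/inactivi/index.jsp")

time.sleep(5)  # Crude fixed wait for the page to finish loading

# find_element_by_name was removed in Selenium 4; use find_element with By.NAME
elem = driver.find_element(By.NAME, 'inputCui')  # Locate the first input box
elem.send_keys('17741254')   # Enter the desired code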

If anyone has alternative suggestions or ideas on how I can tackle this problem more effectively, I am open to hearing them.

Answer №1

After some trial and error, I finally cracked it! My method captures a screenshot of the page and crops it to isolate the captcha, so the cropped image can then be decoded. For the curious, here is the script I used:

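from PIL import Image
from selenium.webdriver.common.by import By

def capture_captcha(driver, element, path):
    # Position and size of the captcha element on the page
    location = element.location
    size = element.size
    # Save a screenshot of the page, then reopen it for cropping
    driver.save_screenshot(path)
    image = Image.open(path)
    left = location['x']
    top = location['y']
    right = location['x'] + size['width']
    bottom = location['y'] + size['height']
    # Keep only the captcha region and overwrite the screenshot
    image = image.crop((left, top, right, bottom))
    image.save(path, 'png')

# find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH
img = driver.find_element(By.XPATH, "//img[1]")
capture_captcha(driver, img, "captcha.png")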
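
One caveat with the cropping step: element.location and element.size are reported in CSS pixels, while the screenshot may be saved at a higher resolution on high-DPI displays, so the crop can land in the wrong place. Below is a minimal sketch of how the crop box could be scaled to compensate. The helper name crop_box and its scale parameter are hypothetical and not part of the original answer; the assumption is that scale would hold the browser's device pixel ratio (for example, read with driver.execute_script("return window.devicePixelRatio")).

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) in screenshot pixels.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is a hypothetical device-pixel-ratio factor;
    1.0 reproduces the unscaled arithmetic of capture_captcha.
    """
    left = round(location['x'] * scale)
    top = round(location['y'] * scale)
    right = round((location['x'] + size['width']) * scale)
    bottom = round((location['y'] + size['height']) * scale)
    return (left, top, right, bottom)


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration
    loc = {'x': 10, 'y': 20}
    dims = {'width': 100, 'height': 40}
    print(crop_box(loc, dims))             # unscaled box
    print(crop_box(loc, dims, scale=2.0))  # box for a 2x display
```

The returned tuple could be passed straight to image.crop(...) in place of the hand-computed values. Selenium's WebElement also has a screenshot(filename) method that saves only the element itself, which may make the cropping step unnecessary if it works on the target page.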
