Retrieving Base64 Images in Python Selenium: Step-by-Step Guide

Trying to fetch base64 captcha images using Python Selenium has been a challenge.

The issue I'm encountering is that I can only access the HTML right before the images are loaded.

Here are the steps I've taken:

# importing necessary packages

from selenium import webdriver
from selenium.webdriver import EdgeOptions
from selenium.webdriver.edge.service import Service
from webdriver_manager.microsoft import EdgeChromiumDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# configuring a headless Edge session
options = EdgeOptions()
options.add_argument("--headless")
options.add_argument("--disable-gpu")
driver = webdriver.Edge(service=Service(EdgeChromiumDriverManager().install()), options=options)

# accessing the website
driver.get("https://boards.4channel.org/o/")
# opening a post
driver.execute_script("document.getElementsByClassName('mobilePostFormToggle mobile hidden button')[0].click()")
# clicking on this ID
driver.execute_script("document.getElementById('t-load').click()")

# Getting the initial html - before javascript
html1 = driver.page_source

# Retrieving the html after javascript execution
html2 = driver.execute_script("return document.documentElement.innerHTML;")

# Checking for base64 images
'Loading' in html1  # True
'Loading' in html2  # True

# Further verification for base64 images
'data:image/png;base64' in html1  # False
'data:image/png;base64' in html2  # False

The relevant HTML object seems to be:

<button id="t-load" type="button" data-board="o" data-tid="0" style="font-size: 11px; padding: 0px; width: 90px; box-sizing: border-box; margin: 0px 6px 0px 0px; vertical-align: middle; height: 18px;">Get Captcha</button>

Answer №1

After reviewing the question and the OP's comments, the main challenge is obtaining the base64 captcha image shown on the screen. The captcha is actually composed of two base64 images of different sizes, so retrieving, decoding, and merging them into an exact replica is fairly complex. The code below sidesteps the merge by taking a screenshot of the rendered captcha element, and it also decodes and saves the background image:

import base64  # needed for base64.b64decode below
import time as t

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from PIL import Image

import undetected_chromedriver as uc


options = uc.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument('--disable-notifications')
options.add_argument("--window-size=1280,720")
options.add_argument('--ignore-certificate-errors')
options.add_argument('--allow-running-insecure-content')
# options.add_argument('--headless')

browser = uc.Chrome(options=options)

wait = WebDriverWait(browser, 20)
url = 'https://boards.4channel.org/o/'
browser.get(url) 

wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, 'Start a New Thread'))).click()
t.sleep(1)
wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@id="t-load"]'))).click()
# wait for the captcha background element, then screenshot the captcha as rendered
captcha_img_background = wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@id="t-bg"]')))
captcha_img_background.screenshot('full_captcha_image.png')
print('got captcha!')
# the background image is embedded in the element's inline style as a base64 data URI
style = captcha_img_background.get_attribute('style')
b64img_background = style.split('url("data:image/png;base64,')[1].split('");')[0]
bgimgdata = base64.b64decode(b64img_background)
with open('bg_image.png', 'wb') as f:
    f.write(bgimgdata)
print('also saved the base64 image as bg_image.png')

This code snippet allows for capturing the complete captcha image as it appears on the screen, encompassing both background and foreground elements. This can prove useful for tasks such as ML training data creation.

UPDATE: The code has been revised to illustrate the process of decoding and storing a base64 image (specifically focusing on saving the background image that may require horizontal scrolling).
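
The split-based extraction above depends on the exact quoting and spacing of the inline style. As a minimal sketch of the same decode step, assuming the style attribute contains a CSS url(...) holding a base64 data URI, a regular expression can pull out the payload whether or not the URL is quoted. The helper name decode_style_image is hypothetical and is not part of Selenium:

```python
import base64
import re

# Matches url("data:image/<fmt>;base64,<payload>") with optional single or double quotes.
DATA_URI_RE = re.compile(
    r"""url\(\s*["']?data:image/(?P<fmt>[\w.+-]+);base64,(?P<data>[A-Za-z0-9+/=]+)["']?\s*\)"""
)


def decode_style_image(style):
    """Return (image_format, raw_bytes) for the first base64 image in a style string.

    Hypothetical helper for illustration: `style` is expected to be the value
    returned by element.get_attribute('style').
    """
    match = DATA_URI_RE.search(style or "")
    if match is None:
        raise ValueError("no base64 image found in style attribute")
    return match.group("fmt"), base64.b64decode(match.group("data"))
```

With the answer's code, this could be called as decode_style_image(captcha_img_background.get_attribute('style')), and the returned bytes written to a file opened in 'wb' mode. Raising ValueError when nothing matches makes the "image not loaded yet" case explicit instead of surfacing as an IndexError from split.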

I switched to undetected chromedriver because both Firefox and Chrome failed to render the captcha images.

For more information on undetected chromedriver, refer to the documentation here: https://github.com/ultrafunkamsterdam/undetected-chromedriver

To explore Selenium further, check out their official documentation: https://www.selenium.dev/documentation/

Answer №2

Hey there, @John Stud! I ran your code in my environment without headless mode to see what was going on. Here's what I found:

The hidden mobilePostFormToggle button was not clicking on the link "[Start a New Thread]", so I made a change to successfully click on it.

I located the link with the xpath "//div[@id='togglePostFormLink']/a[text()='Start a New Thread']" and used driver.execute_script to click it.

You set a 60-second wait time before clicking on the captcha button, but in my opinion, 2-3 seconds should be enough. After clicking the captcha button, however, it may take longer to load. I ended up setting a 40-second wait time in my browser due to an error that prevented the image from launching properly.
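
Rather than guessing a fixed delay, the wait can end as soon as the image data is actually present. The following is a sketch only, assuming the captcha background lives in the inline style of the element with id t-bg (as in Answer №1). The class name style_has_data_uri is hypothetical; it is a plain callable of the kind WebDriverWait.until accepts:

```python
class style_has_data_uri:
    """Wait condition: truthy once an element's inline style embeds a base64 PNG.

    Hypothetical helper for illustration; not part of Selenium itself.
    """

    def __init__(self, element_id, marker="data:image/png;base64,"):
        self.element_id = element_id
        self.marker = marker

    def __call__(self, driver):
        # "id" is the string value of selenium's By.ID locator strategy.
        element = driver.find_element("id", self.element_id)
        style = element.get_attribute("style") or ""
        # Return the style once the data URI is present, otherwise False so
        # that the wait keeps polling.
        return style if self.marker in style else False
```

Usage would look like style = WebDriverWait(driver, 40).until(style_has_data_uri("t-bg")). The 40 seconds then acts as an upper bound: the call returns as soon as the captcha has loaded and raises TimeoutException only if it never does.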

If you need further assistance with this issue, let me know!

https://i.stack.imgur.com/1to55.png
