Tips for using Selenium to interact with the save or print button in the Google Chrome print window, even when the HTML page contains a shadow-root

Question

My goal is to use Selenium with Python for web scraping. I am using Chrome 123.0.6312.87 with a compatible WebDriver. I want to extract data from the page "https://web.bcpa.net/BcpaClient/#/Record-Search". When I enter the address "2216 NW 6 PL FORT LAUDERDALE" on this page through Selenium, detailed property information is displayed. Clicking the Print button through Selenium opens a new page, "https://web.bcpa.net/BcpaClient/recinfoprint.html". On that page there is a dropdown with the class "md-select" that offers the option "Save As PDF", whose value is "Save as PDF/local/". Because the dropdown sits inside shadow-root elements, Selenium cannot locate the "md-select" element.

Screenshot: https://i.stack.imgur.com/NFJat.png

Here is the relevant code snippet:

(insert modified code here)

I want to select the "Save As PDF" option in the "md-select" dropdown and then click the Save button, which has the class "action-button". The shadow roots make this difficult: even when I start from the "print-preview-app" element, which sits just before the shadow root, the script runs into problems.

Execution stops as soon as the script calls the WebDriver shadow root method.

python selenium-webdriver shadow-root

Answer 1

You don't need to reach into the shadow root of the newly opened tab at all. Saving the PDF is much simpler if you configure Chrome through driver options: pass preferences that preselect "Save as PDF" as the print destination and set a default save directory, and add the --kiosk-printing flag so that the print dialog is confirmed automatically. The PDF is then saved to the designated directory whenever a print action is triggered.

import json
import sys
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Directory where the PDF should be saved; change this to a path on your machine.
SAVE_DIR = "/Users/a1/PycharmProjects/PythonProject"

# Print preview state: preselect "Save as PDF" as the destination.
print_settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": "",
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2,
    "isHeaderFooterEnabled": False,
    "isLandscapeEnabled": True,
}

prefs = {
    # The appState preference must be a JSON string, not a dict.
    "printing.print_preview_sticky_settings.appState": json.dumps(print_settings),
    "download.prompt_for_download": False,
    "profile.default_content_setting_values.automatic_downloads": 1,
    "download.directory_upgrade": True,
    "savefile.default_directory": SAVE_DIR,
    "safebrowsing.enabled": True,
}

options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", prefs)
# Confirm the print dialog automatically instead of waiting for a click.
options.add_argument("--kiosk-printing")

driver = None
try:
    # Pass service and options by keyword so the call works across Selenium 4 releases.
    driver = webdriver.Chrome(service=Service(), options=options)
    driver.maximize_window()
    wait = WebDriverWait(driver, 20)

    driver.get("https://web.bcpa.net/BcpaClient/#/Record-Search")

    # Type the address into the search box.
    wait.until(EC.visibility_of_element_located(
        (By.XPATH, '//input[@class="form-control"]')
    )).send_keys("2216 NW 6 PL FORT LAUDERDALE, FL 33311")

    # Click the search icon.
    driver.find_element(
        By.XPATH,
        '//span[@class="input-group-addon"]/span[@class="glyphicon glyphicon-search"]'
    ).click()

    # Click the Print button; with the options above the PDF is saved automatically.
    wait.until(EC.visibility_of_element_located(
        (By.XPATH, '//div[@class="col-sm-1  btn-printrecinfo"]')
    )).click()

    # Give Chrome a few seconds to write the PDF before the script ends.
    time.sleep(5)
except Exception as e:
    print(e)
    sys.exit(1)
finally:
    # Close the browser even if an error occurred.
    if driver is not None:
        driver.quit()
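
The fixed time.sleep(5) at the end of the script is a guess at how long Chrome needs to write the file. As a rough sketch only, the wait could instead poll the save directory until a new PDF appears. The helper name wait_for_new_pdf and its parameters are hypothetical and not part of Selenium, and the "size unchanged between two polls" check is an assumed heuristic for "the file is finished", not something Chrome guarantees.

```python
import time
from pathlib import Path


def wait_for_new_pdf(directory, known, timeout=30.0, poll_interval=0.5):
    """Return the first new PDF in `directory`, or None if none appears in time.

    `known` is the set of PDF paths that existed before printing started.
    Hypothetical helper: a file counts as finished once it is non-empty and
    its size has not changed since the previous poll (an assumed heuristic).
    """
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    last_sizes = {}  # path -> size seen on the previous poll
    while time.monotonic() < deadline:
        for path in directory.glob("*.pdf"):
            if path in known:
                continue  # file was already there before we printed
            size = path.stat().st_size
            if size > 0 and last_sizes.get(path) == size:
                return path
            last_sizes[path] = size
        time.sleep(poll_interval)
    return None
```

To use it, record known = set(Path(save_dir).glob("*.pdf")) before clicking the Print button, where save_dir is the directory given in "savefile.default_directory", then call wait_for_new_pdf(save_dir, known) in place of the sleep. A return value of None means no new PDF showed up within the timeout.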
