Python's List Comprehension Outperforms Selenium While Loops

Question

I've searched extensively for an answer to this problem, so far without success.

I'm trying to scrape the last four pages of last.fm artist entries tagged "Jazz Metal" (see the URL in the code below).

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)

driver.get('https://www.last.fm/tag/jazz+metal/artists?page=20')

super_list = []

wait = WebDriverWait(driver, 10)

while True:
    try:
        entries = wait.until(
            EC.presence_of_element_located((By.CLASS_NAME, 'grid-items-section'))
        )
        
        grid = driver.find_element(By.CLASS_NAME, 'grid-items-section')
        grid_children = grid.find_elements(By.TAG_NAME, 'li')
        
        super_list.append(grid_children)
        
        pagination = wait.until(
            EC.presence_of_element_located((By.CLASS_NAME, 'pagination-next'))
        )
        pagination.click()
                
    except:
        break

The issue is that super_list.append(grid_children) doesn't help much: once the while loop has ended and I try to read the .text attribute of the elements stored in super_list, all I get is a list of element representations that are hard to make sense of:

<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="11b49c8e-eec7-45f2-9e2a-e2034b93077a", element="ffe29b8e-5b65-4df3-985e-68e501e3a546")>

However, if I change super_list.append(grid_children) to

super_list.append([entry.text for entry in grid_children])

then everything falls apart. What is going wrong here? Also, if I remove super_list.append(grid_children) completely, the loop goes through every page (as written, it misses the last page entirely).

To make matters more confusing, when I include

    finally:
        driver.quit()

only the first page gets visited. Can anyone shed some light on this mysterious behavior?
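The last observation can be reproduced without a browser. A finally clause attached to a try that sits inside the loop body runs at the end of every iteration, not once when the loop ends, so driver.quit() closes the browser right after the first page. The sketch below uses a hypothetical FakeDriver stand-in (not part of Selenium) to compare the two placements.

```python
class FakeDriver:
    """Hypothetical stand-in for a WebDriver; it only records calls."""

    def __init__(self):
        self.visited = []
        self.closed = False

    def visit(self, page):
        # A real driver would also fail once the session has been quit.
        if self.closed:
            raise RuntimeError("driver already quit")
        self.visited.append(page)

    def quit(self):
        self.closed = True


def finally_inside_loop(pages):
    driver = FakeDriver()
    for page in pages:
        try:
            driver.visit(page)
        except RuntimeError:
            break
        finally:
            # Runs after EVERY iteration, so the driver is gone after page 1.
            driver.quit()
    return driver.visited


def finally_around_loop(pages):
    driver = FakeDriver()
    try:
        for page in pages:
            driver.visit(page)
    finally:
        # Runs once, after the loop has finished or failed.
        driver.quit()
    return driver.visited


inside = finally_inside_loop([20, 21, 22, 23])
around = finally_around_loop([20, 21, 22, 23])
print(inside)
print(around)
```

Wrapping the whole loop in try/finally, rather than placing finally inside it, keeps the cleanup while letting every page be visited.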

python selenium selenium-webdriver

Answer 1

After much consideration, I have decided to move away from using Selenium and return to utilizing requests-html. My apologies to anyone seeking a solution using Selenium.
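For readers who want to stay with Selenium, two things in the question's loop may be worth checking. This is a guess at the cause, not a confirmed diagnosis. A WebElement is a handle into the live page, so its text generally has to be read before the browser navigates away, and a bare except: hides whatever error actually occurred. The sketch below uses hypothetical stand-ins (FakeElement, FakePage) instead of real Selenium objects. It reads .text while each page is still current, uses extend so the result is one flat list of strings, and records why the loop stopped.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement; exposes only .text."""

    def __init__(self, text):
        self.text = text


class FakePage:
    """Hypothetical stand-in for one paginated result page."""

    def __init__(self, names, has_next):
        self.items = [FakeElement(name) for name in names]
        self.has_next = has_next


def collect_names(pages):
    names = []
    stop_reason = None
    for page in pages:
        # Read the text immediately, while this page is still loaded,
        # and extend (not append) to keep a flat list of strings.
        names.extend(item.text for item in page.items)
        # Only after scraping do we look for a next page, so the
        # last page is included before the loop ends.
        if not page.has_next:
            stop_reason = "no next-page link"
            break
    return names, stop_reason


pages = [
    FakePage(["artist A", "artist B"], has_next=True),
    FakePage(["artist C"], has_next=False),
]
names, stop_reason = collect_names(pages)
print(names, stop_reason)
```

In real Selenium code the same idea would mean catching a specific exception around the pagination step and logging it, instead of a bare except: followed by break.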
