Python List Comprehension Breaks a Selenium While Loop

I've searched all over for an answer to this one, but so far without success.

I'm trying to scrape the last four pages of last.fm artist entries for the "Jazz Metal" tag (the URL is in the code below).

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)

driver.get('https://www.last.fm/tag/jazz+metal/artists?page=20')

super_list = []

wait = WebDriverWait(driver, 10)

while True:
    try:
        entries = wait.until(
            EC.presence_of_element_located((By.CLASS_NAME, 'grid-items-section'))
        )
        
        grid = driver.find_element(By.CLASS_NAME, 'grid-items-section')
        grid_children = grid.find_elements(By.TAG_NAME, 'li')
        
        super_list.append(grid_children)
        
        pagination = wait.until(
            EC.presence_of_element_located((By.CLASS_NAME, 'pagination-next'))
        )
        pagination.click()
                
    except:
        break
        

The issue is that super_list.append(grid_children) doesn't help much: once the while loop ends and I try to read the .text attribute of the entries in super_list, all I get is a list of opaque element objects like this one:

<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="11b49c8e-eec7-45f2-9e2a-e2034b93077a", element="ffe29b8e-5b65-4df3-985e-68e501e3a546")>

However, if I change super_list.append(grid_children) to

super_list.append([entry.text for entry in grid_children])

everything falls apart. What's going wrong here? Also, if I remove super_list.append(grid_children) completely, the loop goes through every page (and yes, as written it misses the last page entirely)!

To make matters more confusing, when I include

    finally:
        driver.quit()

only the first page gets visited. Can anyone shed some light on this mysterious behavior?
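
On the finally puzzle: assuming the clause was attached to the try inside the while loop, this is ordinary Python behaviour rather than a Selenium quirk. A finally block runs every time its try statement finishes, so driver.quit() fires at the end of the first iteration, the next iteration fails on a dead driver, and the bare except: swallows that error and breaks. A small sketch with a hypothetical FakeDriver stand-in (no Selenium needed) shows the difference between placing finally inside and outside the loop:

```python
class FakeDriver:
    """Hypothetical stand-in for a WebDriver; it only tracks whether quit() was called."""

    def __init__(self):
        self.alive = True
        self.visited = []

    def visit(self, page):
        # A real driver also raises once its session has been closed.
        if not self.alive:
            raise RuntimeError("driver already quit")
        self.visited.append(page)

    def quit(self):
        self.alive = False


def finally_inside_loop(pages):
    """Mirrors the question: try/except/finally sits inside the loop."""
    driver = FakeDriver()
    for page in pages:
        try:
            driver.visit(page)
        except RuntimeError:
            break
        finally:
            # Runs at the end of EVERY iteration, including the first one.
            driver.quit()
    return driver.visited


def finally_outside_loop(pages):
    """The whole loop sits inside one try, so quit() runs exactly once at the end."""
    driver = FakeDriver()
    try:
        for page in pages:
            driver.visit(page)
    finally:
        driver.quit()
    return driver.visited


print(finally_inside_loop([20, 21, 22, 23]))
print(finally_outside_loop([20, 21, 22, 23]))
```

With finally inside the loop only the first page is recorded; with it wrapped around the loop, all four are.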

Answer №1

After much consideration, I decided to drop Selenium and go back to requests-html. My apologies to anyone looking for a Selenium solution.
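
For anyone who would rather stay with Selenium, here is a minimal sketch, not a confirmed fix. It assumes the likely cause is stale elements: a WebElement is only valid while its page is loaded, so reading .text from a page that has been (or is being) replaced raises StaleElementReferenceException, and the bare except: in the question turns that into a silent break. The sketch reads the text before navigating, waits for the old grid to go stale after each click, and quits the driver once, outside the loop. The names collect_pages, read_page, go_next and scrape_with_selenium are made up for illustration, the selectors are copied from the question, and the code has not been run against last.fm.

```python
def collect_pages(read_page, go_next, max_pages=50):
    """Collect one list of strings per page.

    read_page() returns the current page's texts; go_next() returns False
    when there is no further page. max_pages is a safety cap (an assumption).
    """
    pages = []
    for _ in range(max_pages):
        # Store plain strings now, while the elements are still attached.
        pages.append(read_page())
        if not go_next():
            break
    return pages


def scrape_with_selenium(start_url):
    # Local imports so collect_pages() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")  # assumption: run Firefox without a window
    driver = webdriver.Firefox(options=options)
    wait = WebDriverWait(driver, 10)

    def read_page():
        items = wait.until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, ".grid-items-section li")
            )
        )
        return [item.text for item in items]

    def go_next():
        # find_elements returns an empty list instead of raising when absent.
        links = driver.find_elements(By.CLASS_NAME, "pagination-next")
        if not links:
            return False
        old_grid = driver.find_element(By.CLASS_NAME, "grid-items-section")
        links[0].click()
        # Wait until the old grid is detached, i.e. the page was replaced.
        wait.until(EC.staleness_of(old_grid))
        return True

    try:
        driver.get(start_url)
        return collect_pages(read_page, go_next)
    finally:
        driver.quit()  # runs once, after the whole loop has finished
```

Calling scrape_with_selenium('https://www.last.fm/tag/jazz+metal/artists?page=20') should return one list of strings per page. Because there is no bare except:, a timeout or stale-element error surfaces as a traceback instead of quietly ending the loop.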
