Clicking the "Show More" button on an infinitely scrolling page is a common challenge when using Selenium with Python

I am scraping Mobile Legends comment data from the site. I need my bot to scroll down automatically and load as many comments as possible before scraping all of them.

However, the bot fails when it tries to click the "Show More" button after the infinite scrolling. It seems that after the second click on the button, an error is logged ([7200:8128:0903/172837.024:ERROR:gpu_init.cc(441)] Passthrough is not supported, GL is disabled) and the loop breaks.

 
            // Python code for web scraping
        

This is a snippet of the terminal output:


            WebGL Activated
            [5296:3760:0903/174720.883:ERROR:device_event_log_impl.cc(214)] [17:47:20.883] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
            [5296:3760:0903/174720.889:ERROR:device_event_log_impl.cc(214)] [17:47:20.889] Bluetooth: bluetooth_adapter_winrt.cc:713 GetBluetoothAdapterStaticsActivationFactory failed: Class not registered (0x80040154)
            Click Showmore 0
            [9052:7864:0903/174913.082:ERROR:gpu_init.cc(441)] Passthrough is not supported, GL is disabled
            ------Scroll Finish-------
            Click ShowMore Counts = 1
        

Answer №1

My approach sends the END key eight times to scroll down, then clicks "Show More" to load additional comments, and repeats until the button can no longer be found. There are a lot of comments, so let me know if you need help scraping them.

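from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager
import time

url = "https://play.google.com/store/apps/details?id=com.mobile.legends&showAllReviews=true"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(url)

while True:
    # Press END eight times so the page loads more comments
    for _ in range(8):
        ActionChains(driver).send_keys(Keys.END).perform()
        time.sleep(1)

    try:
        # "CwaK9" is the class name used here for the "Show More" button;
        # find_element(By...) replaces the removed find_element_by_class_name
        driver.find_element(By.CLASS_NAME, "CwaK9").click()
    except NoSuchElementException:
        # No button found: stop loading
        break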
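
A variation on the loop above, offered as a minimal sketch and not as a definitive fix: it caps the number of rounds so the loop cannot run forever, and it waits explicitly for the button to become clickable instead of failing on the first lookup. The helper names (load_all_comments, make_selenium_callbacks), the max_rounds cap and the use of the "Show More" XPath from the second answer are assumptions made for illustration. The loop is kept separate from Selenium so it can be tried with stand-in callables.

```python
import time


def load_all_comments(scroll_to_end, click_show_more, scrolls_per_round=8,
                      max_rounds=50, pause=1.0):
    """Scroll and click "Show More" until the button stops appearing.

    scroll_to_end:   callable that scrolls the page to the bottom once.
    click_show_more: callable that clicks the button and returns True,
                     or returns False when no button is available.
    Returns the number of successful clicks.
    """
    clicks = 0
    for _ in range(max_rounds):  # hypothetical cap so the loop always ends
        for _ in range(scrolls_per_round):
            scroll_to_end()
            time.sleep(pause)
        if not click_show_more():
            break
        clicks += 1
    return clicks


def make_selenium_callbacks(driver, timeout=10):
    """Build the two callables for a live Selenium 4 driver."""
    # Imported here so the loop above can be exercised without Selenium.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def scroll_to_end():
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    def click_show_more():
        try:
            # Assumed locator: the visible button text used in the second answer
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, "//span[text()='Show More']"))
            )
        except TimeoutException:
            return False
        # A JavaScript click sidesteps overlays that can intercept a normal click
        driver.execute_script("arguments[0].click();", button)
        return True

    return scroll_to_end, click_show_more
```

With a live driver this could be wired up as scroll, click = make_selenium_callbacks(driver) followed by load_all_comments(scroll, click).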

Answer №2

This code is quite lengthy, but it keeps scrolling through the reviews and clicking "Show More" until the showmore_count variable exceeds its limit (two clicks here).

I have also included relevant comments to explain the code.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome(service=Service("path")) # "path" is a placeholder for the chromedriver location
driver.maximize_window()
driver.implicitly_wait(10)
driver.get("https://play.google.com/store/apps/details?id=com.ucool.heroesarena&showAllReviews=true") # I had issues with the provided URL so I used this one instead.
time.sleep(5) # Wait for the page to fully load
j = 0 # Index of the next review to print
showmore_count = 1 # Caps the number of "Show More" clicks so the script does not scroll forever
try:
    while True:
        reviews = driver.find_elements(By.XPATH, "//div[@jsname='fk8dgd']/div")
        # reviews[j] raises IndexError once every loaded review has been visited
        driver.execute_script("arguments[0].scrollIntoView(true);", reviews[j])
        driver.execute_script("window.scrollBy(0,-50)")
        print("{}: {}".format(j+1, reviews[j].find_element(By.XPATH, ".//span[@class='X43Kjb']").text)) # Display the reviewer's name
        j += 1
except IndexError:
    # All loaded reviews are printed: click "Show More" (at most twice) to load the next batch
    while driver.find_element(By.XPATH, "//span[text()='Show More']").is_displayed() and showmore_count <= 2:
        driver.find_element(By.XPATH, "//span[text()='Show More']").click()
        print("Clicked Show more {} time".format(showmore_count))
        showmore_count += 1
        time.sleep(5)
        try:
            while True:
                reviews = driver.find_elements(By.XPATH, "//div[@jsname='fk8dgd']/div")
                # Continue from index j so only the newly loaded reviews are printed
                driver.execute_script("arguments[0].scrollIntoView(true);", reviews[j])
                driver.execute_script("window.scrollBy(0,-50)")
                print("{}: {}".format(j+1, reviews[j].find_element(By.XPATH, ".//span[@class='X43Kjb']").text))
                j += 1
        except Exception: # Typically IndexError: the new batch has been printed
            pass
except Exception as e:
    print(e)

driver.quit()

The output of this script will look something like the following:

1: Reviewer name1
2: Reviewer name2
3: Reviewer name3
...
520: Reviewer name520
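
The index-tracking idea in the script above (j remembers how many reviews have been printed, so each pass handles only the newly loaded ones) can be isolated from the browser calls. The following is a minimal sketch under stated assumptions: process_new_items and its parameters are hypothetical names, and unlike the script it fetches the list once per pass instead of once per review.

```python
def process_new_items(fetch_items, start, handle):
    """Handle items from index `start` onward and return the next start index.

    fetch_items: callable returning the full, current list of items
                 (with Selenium this would wrap a find_elements call).
    handle:      callable invoked as handle(position, item), position is 1-based.
    """
    items = fetch_items()
    for index in range(start, len(items)):
        handle(index + 1, items[index])
    # Never move backwards, even if the page briefly reports fewer items
    return max(start, len(items))


# Stand-in data instead of a live page: two items first, one more after a "click"
loaded = ["Reviewer name1", "Reviewer name2"]
printed = []

next_index = process_new_items(lambda: loaded, 0,
                               lambda pos, name: printed.append("{}: {}".format(pos, name)))
loaded.append("Reviewer name3")  # simulates a "Show More" click loading another review
next_index = process_new_items(lambda: loaded, next_index,
                               lambda pos, name: printed.append("{}: {}".format(pos, name)))
```

Because the function returns the next start index, calling it again after each "Show More" click never prints the same reviewer twice.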
