Scanning through the correct sequence of WhatsApp web chat list using the Selenium WebDriver

Is there a way to programmatically retrieve all chat divs in WhatsApp Web in the order they are displayed? Currently, using

driver.find_elements_by_class_name('_210SC')
seems to only fetch the first 20 or so chats in no particular sequence. It appears that the chats are generated dynamically.

When selecting chats by index, the results are inconsistent: chats[0].click() may open the 1st chat while chats[1].click() opens the 46th, and the order changes after scrolling and re-running the query.

Is there a method to retrieve the chats exactly as they appear on the screen, ensuring that chats[0] corresponds to Mike and chats[1] to George, for instance? What is the underlying reason for this behavior?

Answer №1

WhatsApp Web is a lazy-loaded React app. It currently renders 21 chat elements at a time, although this number may vary with screen size. On screen, from top to bottom, the most recent chat comes first, followed by the other 20 elements in reverse index order, meaning

chats[0] > chats[20] > chats[19] ... chats[1]

To walk through all of the chats, I would recommend fetching the first 21 elements, scrolling down to the last one on screen (which should be at chats[1]), fetching again, and repeating until no new divs appear. It also helps to keep track of the chats you have already fetched, for example by using the XPath

//*[@id="pane-side"]//div[@class='_210SC']//div[@class='_3dtfX']//span[@class='_3ko75 _5h6Y_ _3Whw5']
to retrieve their names.
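
The fetch, scroll, and repeat loop described above could be sketched roughly as follows. This is a minimal sketch, not a tested WhatsApp Web scraper: the names collect_chats_in_order and make_selenium_callbacks are hypothetical, the selectors are passed in by the caller because the site's class names change often, and sorting by each row's on-screen y position is an assumed way to recover the visual order.

```python
import time


def collect_chats_in_order(fetch_visible, scroll_down, max_rounds=500):
    """Collect chat names from top to bottom of a lazily rendered list.

    fetch_visible() returns (y, name) pairs for the rows currently rendered.
    scroll_down() scrolls the list so that lower rows get rendered.
    """
    ordered = []
    seen = set()
    for _ in range(max_rounds):
        # DOM order is not the visual order, so sort by vertical position.
        batch = sorted(fetch_visible(), key=lambda pair: pair[0])
        added = False
        for _, name in batch:
            if name not in seen:  # assumes chat names are unique
                seen.add(name)
                ordered.append(name)
                added = True
        if not added:
            break  # nothing new was rendered: bottom of the list reached
        scroll_down()
    return ordered


def make_selenium_callbacks(driver, row_class, name_xpath, pause=1.0):
    """Build the two callbacks for a real driver (selectors are assumptions)."""
    from selenium.webdriver.common.by import By  # only needed with a real driver

    def fetch_visible():
        rows = driver.find_elements(By.CLASS_NAME, row_class)
        return [(row.location["y"], row.find_element(By.XPATH, name_xpath).text)
                for row in rows]

    def scroll_down():
        rows = driver.find_elements(By.CLASS_NAME, row_class)
        if rows:
            lowest = max(rows, key=lambda row: row.location["y"])
            driver.execute_script("arguments[0].scrollIntoView(true);", lowest)
            time.sleep(pause)  # give the app time to render the next rows

    return fetch_visible, scroll_down
```

With a live session, the callbacks would be built with something like make_selenium_callbacks(driver, '_210SC', name_xpath), where name_xpath is a row-relative XPath for the name span, and then passed to collect_chats_in_order. Deduplicating by name will merge two chats that share the same display name, so a more robust key may be needed.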

Answer №2

I managed to figure out a technique for saving the entire contact list. While I acknowledge that there may be more efficient methods available, this approach seems to get the job done:

import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager

# navigate to WhatsApp Web and scan the QR code
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://web.whatsapp.com/')
time.sleep(15)

# click on the search bar
search_field = driver.find_element(By.XPATH, '//div[contains(@class,"copyable-text selectable-text")]')
search_field.click()
time.sleep(3)

# press the down arrow to move focus into the contact list
search_field.send_keys(Keys.ARROW_DOWN)
time.sleep(3)

# retrieve elements by class, then keep scrolling down
while True:
    contacts = []
    contact_title = driver.find_elements(By.CLASS_NAME, '_3Dr46')
    selected_contact = driver.find_element(By.XPATH, '//div[@aria-selected="true" and @role="row"]')
    for i in contact_title:
        contacts.append(i.text)
    # press the down arrow 20 times, pausing after each press so the list can load
    for _ in range(20):
        selected_contact.send_keys(Keys.ARROW_DOWN)
        time.sleep(1)

I called send_keys 20 times instead of using ActionChains (from selenium.webdriver.common.action_chains import ActionChains), because ActionChains ran too quickly and did not leave enough time for the desired data to load.
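
One possible way around the speed problem is the pause() method that ActionChains provides, which inserts a delay between queued actions. The helper below is only a sketch: the name press_with_pauses and its default values are hypothetical, and it has not been tried against WhatsApp Web.

```python
def press_with_pauses(actions, key, presses=20, pause_seconds=1.0):
    """Queue `presses` key presses, each followed by a pause, then run them.

    `actions` is expected to be a selenium ActionChains instance, whose
    send_keys() and pause() methods return the chain itself.
    """
    for _ in range(presses):
        actions.send_keys(key).pause(pause_seconds)
    actions.perform()

# Hypothetical usage with a live driver:
#   from selenium.webdriver.common.action_chains import ActionChains
#   from selenium.webdriver.common.keys import Keys
#   press_with_pauses(ActionChains(driver), Keys.ARROW_DOWN)
```

A fixed pause is still a guess at the loading time, so it may need tuning per machine and connection.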

Subsequently, I employed print(len(contacts)) and print(contacts), yielding the following results:

16
['num1', 'num2', 'num3', 'num4', ...]
16
['num1', 'num2', 'num4', 'num5', ...]
16
['num2', 'num3', 'num4', 'num5', ...]

This pattern continues until reaching the end of the scroll bar. I will share further updates, as my next objective is to compile this information into a list containing approximately 200 unique contacts.
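
Since consecutive batches overlap, one way to build that unique list is to merge the batches while keeping the first occurrence of each name. This is a small sketch under the assumption that each loop iteration's contacts list is saved into an outer list; merge_batches is a hypothetical helper name.

```python
def merge_batches(batches):
    """Merge overlapping batches into one list of unique names.

    dict.fromkeys keeps the first occurrence of each key and preserves
    insertion order, so names stay in the order they were first seen.
    """
    return list(dict.fromkeys(name for batch in batches for name in batch))


batches = [
    ['num1', 'num2', 'num3', 'num4'],
    ['num1', 'num2', 'num4', 'num5'],
    ['num2', 'num3', 'num4', 'num5'],
]
print(merge_batches(batches))
```

In the loop, this would mean appending each contacts list to an outer list instead of discarding it on the next iteration.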

I trust that this information proves beneficial and welcome any suggestions for optimizing this process.
