Expand all tree nodes and extract their text with Selenium

Is it possible to click all the + buttons and retrieve the text values using Selenium (in Python) from this page, which lists codes for literature genres in Russian? The options need to be fully expanded across multiple levels.

Answer №1

The script takes a while to run, so you may want to brew a cup of tea or catch up on an episode of your favorite Netflix series while it works ^^ Here's the code that should do the trick:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service
import time

# Selenium 4 style: the driver path goes through Service, and elements are
# located with find_elements(By..., ...) instead of find_elements_by_*
d = webdriver.Firefox(service=Service(executable_path="PATH TO GECKODRIVER"))
d.get("http://bbk.rsl.ru/external/bbk?block=ETALON")
time.sleep(1)

# Expand every collapsed node that is not already in found_elements
# (that is, the nodes that appeared after the last click), then recurse
def search_sub_nodes(found_elements):
    sub_elements = d.find_elements(By.CLASS_NAME, "tree_node_status_collapsed")
    for sub_element in sub_elements:
        if sub_element not in found_elements:
            sub_element.click()
            found_elements.append(sub_element)
            # time.sleep(0.3)  # uncomment if child nodes need time to load
            search_sub_nodes(sub_elements.copy())

# Main function: expands the top-level nodes and calls search_sub_nodes
# to reach their child nodes
def open_all_tree_nodes():
    elements = d.find_elements(By.CLASS_NAME, "tree_node_status_collapsed")
    for element in elements:
        element.click()
        # time.sleep(0.3)  # uncomment if child nodes need time to load
        search_sub_nodes(elements.copy())

open_all_tree_nodes()
# open_all_tree_nodes()  # run a second time to ensure all nodes are opened

titles = []
# Locate all tr elements under class 'node_own_area' (these contain the titles)
elements = d.find_elements(By.CSS_SELECTOR, ".node_own_area tr")
for element in elements:
    # Read the title property and add it to the titles list
    titles.append(str(element.get_property("title")))

# print(titles)  # debug
# Write one title per line; UTF-8 keeps the Russian text intact
with open("title_file.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(titles))

This code could probably be optimized further, but I'm no expert in Python ^^
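
One possible direction for that optimization, as a rough sketch rather than a tested replacement: drop the recursion and simply re-query the collapsed nodes after every click until none are left. To keep the example runnable without a browser, the Selenium parts are passed in as two callables, and a small in-memory tree stands in for the page. The names `expand_all`, `FakeNode`, `visible_nodes`, and `max_passes` are made up for illustration, and the sketch assumes that clicking a node removes it from the collapsed set.

```python
def expand_all(find_collapsed, click, max_passes=1000):
    """Click collapsed nodes until none remain; return the number of clicks.

    find_collapsed: callable returning the currently collapsed nodes
    click: callable that expands one node
    max_passes: safety cap so a node that never expands cannot loop forever
    """
    clicks = 0
    for _ in range(max_passes):
        collapsed = find_collapsed()
        if not collapsed:
            break
        # Re-query after every click: expanding a node can reveal new children
        click(collapsed[0])
        clicks += 1
    return clicks


# Hypothetical stand-in for the page, so the sketch runs without a browser
class FakeNode:
    def __init__(self, title, children=()):
        self.title = title
        self.children = list(children)
        self.expanded = False


def visible_nodes(roots):
    """Yield nodes in display order; children show only under expanded parents."""
    for node in roots:
        yield node
        if node.expanded:
            yield from visible_nodes(node.children)


tree = [
    FakeNode("A", [FakeNode("A1", [FakeNode("A1a")]), FakeNode("A2")]),
    FakeNode("B"),
]


def find_collapsed():
    # In this stand-in, only nodes with children have a + button
    return [n for n in visible_nodes(tree) if n.children and not n.expanded]


def click(node):
    node.expanded = True


clicks = expand_all(find_collapsed, click)
titles = [n.title for n in visible_nodes(tree)]
print(clicks, titles)
```

With Selenium, `find_collapsed` would wrap the driver's class-name lookup for `tree_node_status_collapsed`, and `click` would call the element's `click()`. The `max_passes` cap is there because I can't be sure every node on the real page actually changes state when clicked.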
