BeautifulSoup parses only the first page when looping over several URLs

I ran into a problem where my page-parsing code processes the same page every time, even though it runs alongside Selenium. Selenium opens each new link without trouble, but the parsing only ever reflects the first page.

The frustrating part is that the same logic works correctly in my other scripts against a different website.

from bs4 import BeautifulSoup
from selenium import webdriver


keys_list = []

def start_browser(link):
    profile = 'C:\\Users\\Crazy_MoT\\AppData\\Local\\Google\\Chrome\\User Data\\Default'
    options = webdriver.ChromeOptions()
    try:
        options.add_argument(f"user-data-dir={profile}")
        browser = webdriver.Chrome(options=options)
    except Exception:
        # Fall back to a fresh profile if Chrome cannot start with the saved one.
        print("Connect to profile... Error\n Opening new profile")
        browser = webdriver.Chrome()
        
    browser.get(link)
    html = browser.page_source
    browser.quit()  # close this browser so the next link starts a clean session
    soup = BeautifulSoup(html, 'html.parser')

    author = soup.find("a", attrs={'data-qa': 'FileViewAuthorBox'}, href=True)
    print(author["href"])

    keywords = soup.find_all("span", class_="_oX66p")
    for keys in keywords:
        keys_list.append(keys.text)
    print(keys_list)


def start(links):
    for link in links:
        start_browser(link)


links = ["https://ru.depositphotos.com/26182475/stock-photo-happy-birthday.html",
         "https://ru.depositphotos.com/39273619/stock-photo-label-with-happy-birthday.html"]
         
start(links)

I am trying to collect data from several pages of the site, but only the first page's data is retrieved, and it is then repeated for the following links.

Answer №1

After thorough testing, I found that the code works correctly. Running it on several different devices produced no errors.
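
One thing that may explain the "repeated" output, offered as an assumption rather than a confirmed diagnosis: keys_list in the question is defined at module level and never cleared, so every print(keys_list) still begins with the first page's keywords. The sketch below reproduces that effect on two hypothetical page sources and compares it with building a fresh list per page. To keep it runnable without a browser or third-party packages, it swaps in the standard library's HTMLParser for BeautifulSoup; KeywordParser, extract_keywords, page_one and page_two are illustrative names, not part of the original code.

```python
from html.parser import HTMLParser

KEYWORD_CLASS = "_oX66p"  # class name taken from the question's code


class KeywordParser(HTMLParser):
    """Stand-in for soup.find_all("span", class_=...): collects span text."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and KEYWORD_CLASS in classes:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside and data.strip():
            self.keywords.append(data.strip())


def extract_keywords(html):
    """Return a new list holding only this page's keywords."""
    parser = KeywordParser()
    parser.feed(html)
    parser.close()
    return parser.keywords


# Hypothetical page sources; in the question they come from browser.page_source.
page_one = '<span class="_oX66p">birthday</span><span class="_oX66p">cake</span>'
page_two = '<span class="_oX66p">label</span>'

# Shared list, as in the question: page two's output still begins with page one's keywords.
shared = []
for html in (page_one, page_two):
    shared.extend(extract_keywords(html))
    print("shared:  ", shared)

# Fresh list per page: each output contains only that page's keywords.
for html in (page_one, page_two):
    print("per page:", extract_keywords(html))
```

In the original function, the equivalent change would be to create the list inside start_browser (or return it) instead of appending to the global keys_list.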
