Retrying requests at specified intervals

I'm currently learning from this insightful guide

Using session.mount, I can tell requests to retry a failed request several times. However, I can't find a way to control the delay between attempts.

For now, I'm using a manual loop like this:

    import logging
    import time
    import requests

    data = None  # stays None until a request succeeds
    for retry in range(1, 5):
        logging.warning('[fetch] try=%d, url=%s', retry, url)
        try:
            resp = requests.get(url, timeout=3)
            data = resp.text
        except Exception as e:
            logging.warning('[try=%d] fetch_list_html: %s', retry, e)

        if data is None:
            # Request failed: wait a little longer before each new attempt
            time.sleep(retry * 2 + 1)
        else:
            break

Is there a better way to do this?

Answer №1

According to the source code of urllib3.util.retry, you can set backoff_factor to control the delay between retry attempts:

:param float backoff_factor:
    A backoff factor to apply between attempts after the second try
    (most errors are resolved immediately by a second try without a
    delay). urllib3 will sleep for::
        {backoff factor} * (2 ^ ({number of total retries} - 1))
    seconds. If the backoff_factor is set to 0.1, then :func:`.sleep` will rest
    for [0.0s, 0.2s, 0.4s, ...] between retries. The maximum length of the delay
    is defined as :attr:`Retry.BACKOFF_MAX`.
    By default, backoff is turned off (set to 0).
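
To make the quoted formula concrete, here is a minimal sketch that reproduces the documented delay schedule in plain Python. It is only a model of the docstring, not urllib3's own code; the function name backoff_delays and the 120-second default cap are assumptions made for illustration.

```python
def backoff_delays(backoff_factor, retries, backoff_max=120):
    """Model of the documented schedule: delay before each retry, in seconds."""
    delays = []
    for n in range(1, retries + 1):
        if n == 1:
            delay = 0.0  # the first retry is made without any delay
        else:
            delay = float(backoff_factor) * (2 ** (n - 1))
        delays.append(min(float(backoff_max), delay))  # never exceed the cap
    return delays


print(backoff_delays(0.1, 3))  # [0.0, 0.2, 0.4]
print(backoff_delays(1, 4))    # [0.0, 2.0, 4.0, 8.0]
```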

In the linked guide, backoff_factor is set to 0.3, which may be too short for your needs. Try setting it to 1 instead. urllib3 will then sleep for [0s, 2s, 4s, ...] between retries, and never for longer than 120 seconds (Retry.BACKOFF_MAX).
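
As a rough sketch of how this could be wired up with session.mount: the helper name make_retrying_session, the retry count of 5, and the status_forcelist values are assumptions for illustration, not taken from the guide.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total=5, backoff_factor=1.0):
    """Return a Session whose HTTP(S) requests are retried with backoff."""
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        # Assumed choice: also retry when the server answers with these statuses
        status_forcelist=(500, 502, 503, 504),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Use the retrying adapter for both schemes
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = make_retrying_session()
# resp = session.get(url, timeout=3)  # `url` as in the question
```

With this in place the manual loop and time.sleep calls are no longer needed, since the adapter sleeps between attempts on its own.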

Answer №2

A timeout raises its own exception type (requests.exceptions.Timeout), so you can handle it separately from other errors:

http://docs.python-requests.org/en/master/user/quickstart/#timeouts

In short, you can retry only on timeouts like this:

    import logging
    import time

    import requests
    from requests.exceptions import Timeout

    data = None  # stays None if no attempt succeeds
    for retry in range(1, 5):
        logging.warning('[fetch] try=%d, url=%s', retry, url)
        retry_because_of_timeout = False
        try:
            resp = requests.get(url, timeout=3)
            data = resp.text
        except Timeout as e:
            # Only a timeout triggers another attempt
            logging.warning('[try=%d] fetch_list_html: %s', retry, e)
            retry_because_of_timeout = True
        except Exception as e:
            # Any other error is logged and ends the loop without a retry
            logging.warning('[try=%d] fetch_list_html: %s', retry, e)

        if retry_because_of_timeout:
            time.sleep(retry * 2 + 1)
        else:
            break

Consider refactoring this so that the retry logic lives in its own function and exceptions, rather than flag variables and conditionals, drive the control flow.
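
One possible shape for such a refactoring, as a sketch only: the helper name fetch_with_retry and its parameters are hypothetical, and the built-in TimeoutError stands in for requests.exceptions.Timeout so the example runs without a network.

```python
import logging
import time


def fetch_with_retry(fetch, attempts=4, retry_on=(TimeoutError,), sleep=time.sleep):
    """Call fetch() until it succeeds, retrying only on `retry_on` exceptions."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()  # success ends the loop immediately
        except retry_on as exc:
            logging.warning('[try=%d] retrying after: %s', attempt, exc)
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(attempt * 2 + 1)  # same delay schedule as the loop above
        # Any exception not listed in retry_on propagates without a retry
```

With requests, this could be called as fetch_with_retry(lambda: requests.get(url, timeout=3).text, retry_on=(requests.exceptions.Timeout,)).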

[Edit - Thanks to Adam for catching a typo in the code]
