Replace the values in a DataFrame with their current positive streak count

I have managed to replace every number in my dataframe with its current positive streak count. However, my code is messy: I process one column at a time and hard-code the column name each time. Can anyone suggest a more compact way to achieve this in just a few lines of code?

If you have any ideas on how to simplify my code, please share them!

import numpy as np
import pandas as pd

df = pd.DataFrame([[9, 5, 2], [-2, 6, -4], [-5, 1, -1], [9, 6, -5], [7, -1, -3], [6, -4, 1],
                   [2, -9, 3]],
                  columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5, 6, 7])

def streaks(df, col):
    sign = np.sign(df[col])
    s = sign.groupby((sign!=sign.shift()).cumsum()).cumsum()
    return df.assign(A=s.where(s>0, 0.0).abs())
df = streaks(df, 'A')

def streaks(df, col):
    sign = np.sign(df[col])
    s = sign.groupby((sign!=sign.shift()).cumsum()).cumsum()
    return df.assign(B=s.where(s>0, 0.0).abs())
df = streaks(df, 'B')

def streaks(df, col):
    sign = np.sign(df[col])
    s = sign.groupby((sign!=sign.shift()).cumsum()).cumsum()
    return df.assign(C=s.where(s>0, 0.0).abs())
df = streaks(df, 'C')

Answer №1

One way to achieve this is to write a single function that handles one column and run it on every column with the apply method:

def calculate_streaks(cols):
    signs = np.sign(cols)
    streak_count = signs.groupby((signs!=signs.shift()).cumsum()).cumsum()
    return streak_count.where(streak_count>0, 0.0).abs()

df = df.apply(calculate_streaks)
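
As a quick check, here is a minimal self-contained sketch that runs the same function on the sample frame from the question. The names `run_id` and `result` are only illustrative. For this data, column A should come out as 1, 0, 0, 1, 2, 3, 4 (whether the values display as integers or floats may depend on your pandas version).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[9, 5, 2], [-2, 6, -4], [-5, 1, -1], [9, 6, -5],
     [7, -1, -3], [6, -4, 1], [2, -9, 3]],
    columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5, 6, 7])

def calculate_streaks(col):
    signs = np.sign(col)
    # A new run starts wherever the sign differs from the previous row.
    run_id = (signs != signs.shift()).cumsum()
    # Summing the signs within each run counts its length
    # (the count is negative for runs of negative values).
    streak_count = signs.groupby(run_id).cumsum()
    # Keep only positive streaks; everything else becomes 0.
    return streak_count.where(streak_count > 0, 0.0).abs()

# apply passes each column to the function as a Series.
result = df.apply(calculate_streaks)
print(result)
```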

Another option is to tweak your original approach so that the column name is passed to assign dynamically instead of being hard-coded:

def calculate_streaks(dataframe, column):
    signs = np.sign(dataframe[column])
    streak_count = signs.groupby((signs!=signs.shift()).cumsum()).cumsum()
    return dataframe.assign(**{column: streak_count.where(streak_count>0, 0.0).abs()})

df = calculate_streaks(df, 'A')
df = calculate_streaks(df, 'B')
df = calculate_streaks(df, 'C')
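
The three separate calls can also be collapsed into a loop over `df.columns`, so no column name has to be typed by hand. This is a minimal sketch that reuses the function above on the question's sample frame:

```python
import numpy as np
import pandas as pd

def calculate_streaks(dataframe, column):
    signs = np.sign(dataframe[column])
    streak_count = signs.groupby((signs != signs.shift()).cumsum()).cumsum()
    # **{column: ...} lets the keyword name passed to assign come from a variable.
    return dataframe.assign(**{column: streak_count.where(streak_count > 0, 0.0).abs()})

df = pd.DataFrame(
    [[9, 5, 2], [-2, 6, -4], [-5, 1, -1], [9, 6, -5],
     [7, -1, -3], [6, -4, 1], [2, -9, 3]],
    columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5, 6, 7])

# Process every column in turn; each call returns a new DataFrame.
for column in df.columns:
    df = calculate_streaks(df, column)

print(df)
```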

You can also modify the DataFrame in place:

def calculate_streaks(dataframe, column):
    signs = np.sign(dataframe[column])
    streak_count = signs.groupby((signs!=signs.shift()).cumsum()).cumsum()
    dataframe[column] = streak_count.where(streak_count>0, 0.0).abs()

calculate_streaks(df, 'A')
calculate_streaks(df, 'B')
calculate_streaks(df, 'C')
print(df)
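
If you would rather avoid per-column calls entirely, a different technique (not the groupby approach used above) works on the whole frame at once: take a running count of positive values and subtract the count reached at the most recent non-positive row. The sketch below assumes the question's sample frame; the names `pos`, `running`, `baseline` and `streaks` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[9, 5, 2], [-2, 6, -4], [-5, 1, -1], [9, 6, -5],
     [7, -1, -3], [6, -4, 1], [2, -9, 3]],
    columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5, 6, 7])

# True where the value is strictly positive.
pos = df > 0
# Running count of positive values down each column.
running = pos.cumsum()
# The running count as it stood at the most recent non-positive row
# (0 if no such row has been seen yet).
baseline = running.where(~pos).ffill().fillna(0)
# The difference is the length of the current positive streak.
streaks = running - baseline

print(streaks)
```

For this sample data it gives the same streak counts as the groupby version; zeros in the input are treated as breaking a streak in both.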
