Removing rows based on conditions in Python

I am facing a multilabel classification challenge.

My goal is to remove rows that have a value of 0 in all 35 columns of the data frame, except for the ['Doc'] column.

Below is an example of the dataframe:

Doc   Big    Small    Int    Bor   Drama
j2     0       0        0      0     0
i9     1       0        1      1     0
ui8    0       0        0      1     0
po4    0       1        0      0     0
po9    0       0        0      0     0

The expected outcome should look like this:

Doc   Big    Small    Int    Bor   Drama
i9     1       0        1      1     0
ui8    0       0        0      1     0
po4    0       1        0      0     0

The following are the rows that need to be deleted:

 j2     0       0        0      0     0
 po9    0       0        0      0     0

I identified and counted these rows with the following code:

rowSums = df.iloc[:,2:].sum(axis=1)
no_labelled = (rowSums==0).sum(axis=0)
print("Number of docs with no label =", no_labelled)

Number of docs with no label = 60

How can I delete these 60 rows from the dataframe?

Thank you

Answer №1

You don't need to delete anything explicitly. Select the rows you want to keep (those whose label columns sum to more than 0) and assign the result back to the same variable:

df = df.loc[df.iloc[:, 1:].sum(axis=1) > 0, :]
print(df)
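
To make the selection above easy to try, here is a minimal, self-contained sketch that rebuilds the five-row sample from the question and applies the same filter. The names label_sums and filtered are only illustrative, and the sketch assumes the label columns hold 0/1 values, so a row sum of 0 means "no label".

```python
import pandas as pd

# Sample data copied from the question: 'Doc' is an identifier,
# the remaining columns are 0/1 labels.
df = pd.DataFrame({
    "Doc": ["j2", "i9", "ui8", "po4", "po9"],
    "Big": [0, 1, 0, 0, 0],
    "Small": [0, 0, 0, 1, 0],
    "Int": [0, 1, 0, 0, 0],
    "Bor": [0, 1, 1, 0, 0],
    "Drama": [0, 0, 0, 0, 0],
})

# Sum every column except the first ('Doc') for each row.
label_sums = df.iloc[:, 1:].sum(axis=1)

# Keep only rows with at least one label, then renumber the index.
filtered = df.loc[label_sums > 0].reset_index(drop=True)
print(filtered)
```

The reset_index(drop=True) call is optional; it only renumbers the remaining rows from 0.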

Answer №2

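To drop rows where the sum of all columns except the first is 0, pass their index labels to drop. Note that drop returns a new dataframe rather than modifying df in place, so assign the result back:

df = df.drop(df[df.iloc[:, 1:].sum(axis=1) == 0].index)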
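
Both answers rely on the row sum, which works for 0/1 labels but would misfire if a column could hold negative values that cancel out. As a hedged alternative, the sketch below tests each cell against 0 directly and selects the label columns by name instead of by position. The helper drop_unlabelled and its id_col parameter are hypothetical names introduced here, not part of pandas.

```python
import pandas as pd

def drop_unlabelled(df, id_col="Doc"):
    """Return the rows that have at least one non-zero value outside id_col."""
    labels = df.drop(columns=id_col)        # every column except the identifier
    has_label = labels.ne(0).any(axis=1)    # True where any label is non-zero
    return df[has_label]

# Small illustrative frame (a subset of the question's sample).
df = pd.DataFrame({
    "Doc": ["j2", "i9", "po9"],
    "Big": [0, 1, 0],
    "Small": [0, 0, 0],
})

result = drop_unlabelled(df)
print(result)
```

Selecting by name also avoids off-by-one mistakes with iloc when the identifier column is not the first one.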
