Questions tagged [dataframe]

A data frame is a two-dimensional tabular data structure. Rows represent observations and columns represent variables, and unlike in an array or matrix, each column can hold a different type. R, Apache Spark, Deedle, Maple, Python's pandas library, and Julia's DataFrames library call this structure a "data frame" or "dataframe"; MATLAB and SQL call it a "table".


What is the best way to replace a segment of a pandas dataframe with another?

Suppose I have two DataFrames, df_a and df_b. I am looking to swap lines 42 through 51 of df_a with the corresponding rows from df_b (same number of rows, but more columns than df_a). The code I am currently using is df_a.loc[45:52,:] = df_b.loc[45:52," ...
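A minimal sketch of the kind of row replacement the excerpt describes, not the asker's actual code. It assumes both frames share a default integer index and that df_b contains every column of df_a; the sample data and the `extra` column are hypothetical. The row range 45:52 is taken from the code in the excerpt.

```python
import pandas as pd

# Hypothetical frames: df_b has the same index as df_a plus one extra column.
df_a = pd.DataFrame({"x": range(60), "y": range(60)})
df_b = pd.DataFrame(
    {"x": range(100, 160), "y": range(200, 260), "extra": range(60)}
)

# Label-based row range; .loc slices include both endpoints.
rows = slice(45, 52)

# Select only the columns df_a has, so the right-hand side lines up
# with the target block on both index and column labels.
df_a.loc[rows, :] = df_b.loc[rows, df_a.columns]
```

Restricting the right-hand side to `df_a.columns` is one way to deal with df_b having more columns than df_a. Rows outside the range are left untouched.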


Learning to extract information from space-delimited data with varying row types and numerous missing values

Convert all characters to lowercase in JSON

Converting serial data into columns within a pandas dataframe through row transformation

Creating a dataframe with fixed intervals - Python

Is there a way to calculate the percentile of a column in a dataframe by only taking into account the values that came before

Create a new boolean column by comparing values in two different columns from separate DataFrames

The chunk size is not initiating from the initial row in the CSV file

What is the procedure for swapping out a value/phrase in a column within a data frame?

Generate sections to tally the different genres present in films

Generate a dataframe by combining several arrays through an iterative process using either a for loop or a nested loop in

Converting a flattened column of type series or list to a dataframe

Combining three PySpark columns into a single struct

Finding corresponding elements between a list and dataframe in Python

What is the best way to iterate through a series of dataframes and perform operations using a for loop?

Looking to iterate through a dataframe and transform each row into a JSON object?

How can JSON data that is disorganized and undefined be properly transformed into a DataFrame?

Updating missing values in a DataFrame row by replacing them with values from different rows that match a specific column value

Performing mathematical operations with Pandas on specific columns based on conditions set by other columns in the dataset

What is the best way to compress or combine a pandas dataframe vertically?

Steps to create a histogram using a dataframe

Utilize designated columns to create a new column by associating it with JSON data

Pandas encountered a ValueError while attempting to add a new column, as it cannot reindex from a duplicate axis

Ways to merge two dataframes of varying lengths when both have datetime indexes

Calculate the sum of each column in a pandas dataframe using user-defined functions in Python

Tips for implementing an IF statement within a FOR loop to iterate through a DataFrame efficiently in Python

What is the method to add a value based on two specific cells in a row of a Dataframe?

Navigating through monthly and yearly data in an extensive Python Pandas dataframe

Examining data within individual groups in a Python dataframe and making comparisons

Counting and labeling cumulative totals with Pandas

Combine or blend two pandas dataframes into one

Transforming Dictionary Data in JSON DataFrame into Individual Columns Using PySpark

Manipulate DataFrame in Python using masks to generate a fresh DataFrame

Generating a dataframe with cells that have a combination of regular and italicized text using the sjPlot package functions

Is there a way to fetch values from a dataframe one day later, which are located one row below the largest values in a row of another dataframe of identical shape?

Analyzing data distribution using Python

Saving a Python pandas dataframe to a CSV file with line breaks represented as text

Is there a way for me to generate a tally of the frequency of each number's occurrence in column B?

Transform a JSON file containing multiple keys into a single pandas DataFrame

Guide to establishing a connection to CloudantDB through spark-scala and extracting JSON documents as a dataframe

The function DataFrame.reset_index() fails to operate properly following a concat operation

Decrease the index's level

When attempting to sort, the rows in my pandas dataframe suddenly become rearranged

I am having trouble assigning a value to a specific position in a dataframe

What is the best way to rearrange a lengthy string of dates and timestamps that have been combined with commas using Python?

What is the best way to extract all labels from a column that has been one hot encoded?

Creating a new column in a Pandas dataframe that contains a list of values based on the repetition of rows in another

Tips for organizing a dataframe with numerous NaN values and combining all rows that do not begin with NaN

What is the best method for sorting through data entries using only designated time parameters?

A Fresh Approach to Altering Dictionary Organization

Displaying dataframes with Pandas

What is the best way to transform a series of probabilities into binary values of 0 and 1?

Struggling to open a CSV file using pandas

Combining DataFrames in Pandas with custom weights

Converting JSON arrays into structured arrays using Spark

Steps for creating a new dataset while excluding specific columns

Combine pandas rows by their values and missing data cells

Utilizing re.sub for modifying a pandas dataframe column while incorporating predefined restrictions

Generating a new column by applying a condition to existing column values

Transform a folding process into a vectorized operation within a dataset

Improving the efficiency of cosine similarity calculations among rows in a dataframe

A guide on setting custom boundaries for the age column in a Python dataframe

What is the best way to sort a loop and keep the data for future use?

Compare three columns in Pandas and display the outcome if the count exceeds one

What is causing the dtype to be "object" even though all columns consist of float64 and int64 data

Showing the names of the columns in a Pandas dataframe for individual rows based on a specific condition

Issue with unnamed column in Pandas dataframe prevents insertion into MySQL from JSON data

Substitute values in a dataframe using specific index positions from a separate list

The Pandas DataFrame is displaying cells as strings, but encountered an error when attempting to split the cells

What is the most efficient way to iterate through a list of URLs containing JSON data, transform each one into a dataframe, and then store them in individual CSV files?

Combine dataframes while excluding any overlapping values from the second dataframe

How can I calculate the difference between the values in two columns in pandas using python and store the results in a new column?

Combine a column in pandas when all rows are identical

Creating a Pandas DataFrame from Scraped Code with bs4/selenium in Python: A Step-by-Step Guide

Guide to obtaining the total number of a category depending on a different category and visualizing the outcome
