Is there a way to decode and convert the nested JSON structure into a readable table format using Python?

Currently, I have a dataframe with a column named test_col that contains JSON structures. The data within the lineItemPromotions object can be quite complex, with nested JSONs and varying numbers of items. My goal is to unnest these structures in order to create new rows for each ID found under lineItemPromotions. How can I achieve this correctly?

{'provider': 'ABC',
 'discountCodes_out': [],
 'discounts_out': [],
 'lineItemPromotions': [{'id': '1',
   'discountCodes': [],
   'discounts': [{'rule': 'Bundle Discount',
     'name': 'Bundle Discount',
     'ruleId': '',
     'campaignId': '419f9a2f-0342-41c0-ac79-419d1023aaa9',
     'centAmount': 1733550}],
   'perUnitPromotionsShares': [1733550]},
  {'id': '2',
   'discountCodes': [],
   'discounts': [{'rule': 'Bundle Discount',
     'name': 'Bundle Discount',
     'ruleId': '',
     'campaignId': '419f9a2f-0342-41c0-ac79-419d1023aaa9',
     'centAmount': 119438}],
   'perUnitPromotionsShares': [119438, 119438]}]}

I've attempted some code to unnest these structures, but it's not yielding the desired results. It's creating nested items that require further unnesting. Apologies for having to use an image to demonstrate the issue.

https://i.stack.imgur.com/oNvCB.png

Answer №1

It is verbose, but one approach is to normalize each level explicitly and concatenate the results column-wise:

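import pandas as pd

# data is assumed to be the dictionary shown in the question
pd.concat(
   [
      # top-level fields, repeated once per line item
      pd.json_normalize(data).explode("lineItemPromotions")
        .drop(columns="lineItemPromotions").reset_index(drop=True),
      # one row per line item, without the nested discounts
      pd.json_normalize(data, record_path=["lineItemPromotions"])
         .drop(columns="discounts"),
      # one row per discount
      pd.json_normalize(data, record_path=["lineItemPromotions", "discounts"])
   ],
   axis=1
)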
  provider discountCodes_out discounts_out id discountCodes perUnitPromotionsShares             rule             name ruleId                            campaignId  centAmount
0      ABC                []            []  1            []               [1733550]  Bundle Discount  Bundle Discount         419f9a2f-0342-41c0-ac79-419d1023aaa9     1733550
1      ABC                []            []  2            []        [119438, 119438]  Bundle Discount  Bundle Discount         419f9a2f-0342-41c0-ac79-419d1023aaa9      119438

If necessary, an additional step is to chain .explode("perUnitPromotionsShares") onto the result, which gives one row per share.
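
The column-wise concat above lines up only when every line item has exactly one discount, as in the sample data. If that cannot be guaranteed, a plain loop that builds one record per (line item, discount) pair is a possible alternative. This is a minimal sketch rather than part of the original answer: the helper name flatten_promotions is hypothetical, and the sample dictionary is a trimmed copy of the one in the question.

```python
# Trimmed copy of the dictionary from the question.
data = {
    "provider": "ABC",
    "lineItemPromotions": [
        {"id": "1",
         "discounts": [{"rule": "Bundle Discount", "centAmount": 1733550}],
         "perUnitPromotionsShares": [1733550]},
        {"id": "2",
         "discounts": [{"rule": "Bundle Discount", "centAmount": 119438}],
         "perUnitPromotionsShares": [119438, 119438]},
    ],
}


def flatten_promotions(record):
    """Hypothetical helper: one flat dict per (line item, discount) pair."""
    rows = []
    for item in record.get("lineItemPromotions", []):
        # Fields copied from the parent levels onto every output row.
        base = {
            "provider": record.get("provider"),
            "id": item.get("id"),
            "perUnitPromotionsShares": item.get("perUnitPromotionsShares"),
        }
        # A line item without discounts still yields one row.
        for discount in item.get("discounts") or [{}]:
            rows.append({**base, **discount})
    return rows


rows = flatten_promotions(data)
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame(rows) turns the list into a table, and .explode("perUnitPromotionsShares") can still be chained afterwards if one row per share is wanted.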

Answer №2

First normalize the line items and explode the list-valued columns, then concatenate the result with a new DataFrame built from the discounts column:

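import pandas as pd
# updated_dict is assumed to be the dictionary from the question
df = pd.json_normalize(updated_dict, meta='provider', record_path='lineItemPromotions')
# explode every list-valued column; empty lists become NaN
df = df.apply(pd.Series.explode)
# expand each discount dict into its own columns
pd.concat([df.drop(columns='discounts').reset_index(drop=True),
           pd.DataFrame(df['discounts'].values.tolist())], axis=1)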

Result:

  id discountCodes perUnitPromotionsShares provider             rule             name ruleId                            campaignId  centAmount
0  1           NaN                 1733550      ABC  Bundle Discount  Bundle Discount         419f9a2f-0342-41c0-ac79-419d1023aaa9     1733550
1  2           NaN                  119438      ABC  Bundle Discount  Bundle Discount         419f9a2f-0342-41c0-ac79-419d1023aaa9      119438
2  2           NaN                  119438      ABC  Bundle Discount  Bundle Discount         419f9a2f-0342-41c0-ac79-419d1023aaa9      119438
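
Both answers assemble the table from several frames. As a possible alternative, json_normalize can descend to the discounts in a single call and carry parent fields along through its meta parameter. The sketch below is not from either answer; it assumes a trimmed copy of the question's dictionary and keeps only provider and the line-item id as parent fields.

```python
import pandas as pd

# Trimmed copy of the dictionary from the question.
data = {
    "provider": "ABC",
    "lineItemPromotions": [
        {"id": "1",
         "discounts": [{"rule": "Bundle Discount", "centAmount": 1733550}]},
        {"id": "2",
         "discounts": [{"rule": "Bundle Discount", "centAmount": 119438}]},
    ],
}

flat = pd.json_normalize(
    data,
    # Walk down to the innermost list: one output row per discount.
    record_path=["lineItemPromotions", "discounts"],
    # Parent fields to repeat on each row; a nested path is given as a list.
    meta=["provider", ["lineItemPromotions", "id"]],
)

# The nested meta column is named "lineItemPromotions.id" by default.
flat = flat.rename(columns={"lineItemPromotions.id": "id"})
print(flat)
```

To process the whole test_col column rather than a single dictionary, df["test_col"].tolist() can be passed in place of data, assuming each cell already holds a parsed dict and not a JSON string.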
