pandas 'read_json' does not load my JSON file as expected

I am having trouble loading a JSON file with pandas. I have checked several Stack Overflow answers, but none of them covers my case. The structure of the file is given under "Structure of JSON File" below.

Code used to load the file:

import pandas as pd
df = pd.read_json("BrowserHistory.json")
print(df)

Actual output:

A DataFrame with a single 'Browser History' column, where each row holds one whole JSON element.

However, instead of one column holding each JSON element, I want six columns, namely 'favicon_url', 'page_transition', 'title', 'url', 'client_id' and 'time_usec', as in the JSON structure shown below, each populated with the corresponding values from every element.

Desired result format:

favicon_url   page_transition   title   url   client_id   time_usec
    .                .            .      .        .           .
    .                .            .      .        .           .
    .                .            .      .        .           .
    .                .            .      .        .           .

Structure of JSON File:

{
    "Browser History": [
        {
            "favicon_url": "https://www.google.com/favicon.ico",
            "page_transition": "LINK",
            "title": "Google Takeout",
            "url": "https://takeout.google.com/",
            "client_id": "cliendid",
            "time_usec": 1620386529857946
        },
        {
            "favicon_url": "https://www.google.com/favicon.ico",
            "page_transition": "LINK",
            "title": "Google Takeout",
            "url": "https://takeout.google.com/",
            "client_id": "cliendid",
            "time_usec": 1620386514845201
        },
        {
            "favicon_url": "https://www.google.com/favicon.ico",
            "page_transition": "LINK",
            "title": "Google Takeout",
            "url": "https://takeout.google.com/",
            "client_id": "cliendid",
            "time_usec": 1620386499014063
        },
        {
            "favicon_url": "https://ssl.gstatic.com/ui/v1/icons/mail/rfr/gmail.ico",
            "page_transition": "LINK",
            "title": "Gmail",
            "url": "https://mail.google.com/mail/u/0/#inbox",
            "client_id": "cliendid",
            "time_usec": 1620386492788783
        }
    ]
}

Answer №1

The top level of your file is a JSON object (`{}`) with a single key, `Browser History`. pandas interprets the top-level keys as columns, so you end up with one column named `Browser History` whose cells are the individual records. To get one column per field, load the file with `json` and build the DataFrame from the list stored under that key:

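import json
import pandas as pd
# Build the DataFrame from the list stored under 'Browser History'.
# JSON files are normally UTF-8; change the encoding if yours differs.
with open('BrowserHistory.json', encoding='utf-8') as f:
    df = pd.DataFrame(json.load(f)['Browser History'])
print(df)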
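
To see the difference without touching the real file, here is a minimal sketch that runs both approaches on an in-memory sample. The `sample` string is a hypothetical two-record stand-in with the same layout as BrowserHistory.json, and the names `wrapped`, `records` and `flat` are only illustrative.

```python
import io
import json

import pandas as pd

# Hypothetical two-record sample mirroring the layout of BrowserHistory.json.
sample = json.dumps({
    "Browser History": [
        {
            "favicon_url": "https://www.google.com/favicon.ico",
            "page_transition": "LINK",
            "title": "Google Takeout",
            "url": "https://takeout.google.com/",
            "client_id": "cliendid",
            "time_usec": 1620386529857946,
        },
        {
            "favicon_url": "https://ssl.gstatic.com/ui/v1/icons/mail/rfr/gmail.ico",
            "page_transition": "LINK",
            "title": "Gmail",
            "url": "https://mail.google.com/mail/u/0/#inbox",
            "client_id": "cliendid",
            "time_usec": 1620386492788783,
        },
    ]
})

# read_json treats the single top-level key as the only column;
# each cell then holds one whole record as a dict.
wrapped = pd.read_json(io.StringIO(sample))
print(list(wrapped.columns))

# Parsing first and passing the inner list yields one column per field.
records = json.loads(sample)["Browser History"]
flat = pd.DataFrame(records)
print(list(flat.columns))
```

The same pattern applies to the file on disk: parse it with `json`, pick the list under 'Browser History', and hand that list to `pd.DataFrame`.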

Answer №2

The records in your JSON file sit in a list one level down, under the 'Browser History' key, so read_json does not expand them into columns on its own. To work around this, first load the JSON into a variable and then create a DataFrame from the list:

import pandas as pd
import json

with open("BrowserHistory.json") as f:  # the with block closes the file afterwards
    js = json.load(f)
df = pd.DataFrame(js['Browser History'])
df
#                                          favicon_url page_transition  ... client_id         time_usec
# 0                 https://www.google.com/favicon.ico            LINK  ...  cliendid  1620386529857946
# 1                 https://www.google.com/favicon.ico            LINK  ...  cliendid  1620386514845201
# 2                 https://www.google.com/favicon.ico            LINK  ...  cliendid  1620386499014063
# 3  https://ssl.gstatic.com/ui/v1/icons/mail/rfr/g...            LINK  ...  cliendid  1620386492788783

Keep in mind that you might have to specify the file encoding when using the open function like so:

f = open("BrowserHistory.json", encoding="utf8")
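
As an alternative that neither answer above uses, pandas also provides `pd.json_normalize`, whose `record_path` argument names the key that holds the list of records. The sketch below is a possible variant, not the answerers' method; the `data` dict is a hypothetical in-memory stand-in for the parsed file, and `df_norm` is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for json.load(f) on BrowserHistory.json.
data = {
    "Browser History": [
        {
            "favicon_url": "https://www.google.com/favicon.ico",
            "page_transition": "LINK",
            "title": "Google Takeout",
            "url": "https://takeout.google.com/",
            "client_id": "cliendid",
            "time_usec": 1620386529857946,
        },
        {
            "favicon_url": "https://ssl.gstatic.com/ui/v1/icons/mail/rfr/gmail.ico",
            "page_transition": "LINK",
            "title": "Gmail",
            "url": "https://mail.google.com/mail/u/0/#inbox",
            "client_id": "cliendid",
            "time_usec": 1620386492788783,
        },
    ]
}

# record_path points json_normalize at the list of records to turn into rows.
df_norm = pd.json_normalize(data, record_path="Browser History")
print(df_norm.columns.tolist())
```

For a flat list of records like this one, the result should match `pd.DataFrame(js['Browser History'])`; `json_normalize` mainly pays off when the records themselves contain nested objects.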
