Converting a text file to JSON in Python with element stripping and reordering techniques

I have a file with data separated by spaces like this:

2017-05-16 00:44:36.151724381 +43.8187 -104.7669 -004.4 00.6 00.2 00.2 090 C
2017-05-16 00:44:36.246672534 +41.6321 -104.7834 +004.3 00.6 00.3 00.2 130 C
2017-05-16 00:44:36.356132768 +46.4559 -104.5989 -004.2 01.1 00.4 00.2 034 C

and I want to convert it into JSON. This is the code I have so far:

import json
import sys
def convert(filename):
    dataDict = {}
    txtFile = filename[0]
    print "Opening TXT file: ",txtFile
    infile = open(txtFile, "r")
    for line in infile:
        lineStrip = line.strip()
        parts = [p.strip() for p in lineStrip.split()]
        date = parts[0].strip("-") 
        time = parts[1].strip(":") 
        dataDict.update({"dataset":"Lightning"})
        dataDict.update({"observation_date": date + time})
        dataDict.update({"location": {"type":"point", "coordinates": [parts[2], parts[3]]}})
        json_filename = txtFile.split(".")[0]+".json"
        jsonf = open(json_filename,'a')
        data = json.dumps(dataDict)
        jsonf.write(data + "\n")
        print dataDict
    infile.close()
    jsonf.close()   
if __name__=="__main__":
    convert(sys.argv[1:])

I'm unsure how to remove the "-", ".", and ":" characters as well as add the "dataset":"lightning" element at the beginning.

Answer №1

Using str.replace instead of str.strip appears to work, because strip only removes characters from the two ends of a string:

date = parts[0].replace("-", '')  # remove "-"
time = parts[1].replace(":", '').replace(".", '')  # remove ":" and "."
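As a minimal sketch of the difference, assuming Python 3 and the two timestamp fields from the first sample line in the question (the variable names date_field and time_field are only illustrative):

```python
# Sample fields taken from the first line of the question's data.
date_field = "2017-05-16"
time_field = "00:44:36.151724381"

# strip() only trims the given characters from both ends,
# so the hyphens inside the date are left untouched.
stripped = date_field.strip("-")

# replace() removes every occurrence, wherever it appears.
date = date_field.replace("-", "")
time = time_field.replace(":", "").replace(".", "")

print(stripped)  # 2017-05-16
print(date)      # 20170516
print(time)      # 004436151724381
```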

Answer №2

Here are the recommended steps:

Replace the hyphens in the date with an empty string, and remove the colons and the decimal point from the time:

date = parts[0].replace('-', '')
time = parts[1].replace(':', '').replace('.', '')

To get "dataset" at the beginning of each JSON object, one option is to sort the keys when serializing. This works here only because "dataset" sorts alphabetically before "location" and "observation_date":

data = json.dumps(dataDict, sort_keys=True)
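A small sketch of the effect, assuming Python 3 and a hypothetical record built from the first sample line (the dictionary is deliberately written with "dataset" last to show that the output order comes from sorting, not from insertion):

```python
import json

# Hypothetical record for the first sample line; "dataset" is added last on purpose.
record = {
    "observation_date": "20170516004436151724381",
    "location": {"type": "point", "coordinates": ["+43.8187", "-104.7669"]},
    "dataset": "Lightning",
}

# sort_keys=True orders keys alphabetically at every nesting level.
line = json.dumps(record, sort_keys=True)
print(line)
```

Note that the sorting also applies to nested objects, so inside "location" the key "coordinates" is written before "type".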

It is also simpler to assign the key directly instead of calling .update with a one-item dictionary:

dataDict["dataset"] = "Lightning"

Answer №3

Keep in mind that plain Python dictionaries are unordered in Python 2 (and their insertion order is only guaranteed from Python 3.7 onward), so you cannot rely on "dataset": "Lightning" appearing first. To guarantee a specific order, use a collections.OrderedDict or sort the keys when dumping the JSON, as suggested in the other answer.
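The OrderedDict approach could look roughly like this. It is a sketch assuming Python 3 and a hypothetical record for the first sample line; json.dumps writes the keys of an OrderedDict in the order they were inserted:

```python
import json
from collections import OrderedDict

# Insert the keys in the order they should appear in the output.
record = OrderedDict()
record["dataset"] = "Lightning"
record["observation_date"] = "20170516004436151724381"
record["location"] = {"type": "point", "coordinates": ["+43.8187", "-104.7669"]}

# No sort_keys here: the insertion order is kept as-is.
line = json.dumps(record)
print(line)
```

Unlike sort_keys=True, this keeps working if a key that should come first does not happen to sort first alphabetically.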

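To format the timestamp reliably, parse it into a datetime object. Note that the dates in the file are year-month-day, and that %f accepts at most six fractional digits, so the nine-digit fraction has to be trimmed to microseconds first:

from datetime import datetime

# Keep "HH:MM:SS." plus six fractional digits (15 characters).
date_string = parts[0] + " " + parts[1][:15]
fmt = "%Y-%m-%d %H:%M:%S.%f"
dt = datetime.strptime(date_string, fmt)
new_date_string = dt.strftime("%Y%m%d%H%M%S")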

Using a datetime object not only validates and formats the timestamp, but also makes it easier to hand the data to libraries such as pandas and numpy for further processing. datetime objects also support arithmetic through timedelta and can carry time zone information if required.
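As an illustration of the arithmetic and time zone points, here is a hedged sketch in Python 3 using the first sample line. Treating the timestamps as UTC is an assumption made only for this example, since the question does not say which zone the file uses:

```python
from datetime import datetime, timedelta, timezone

sample = "2017-05-16 00:44:36.151724381 +43.8187 -104.7669 -004.4 00.6 00.2 00.2 090 C"
parts = sample.split()

# %f accepts at most six fractional digits, so keep microseconds only.
dt = datetime.strptime(parts[0] + " " + parts[1][:15], "%Y-%m-%d %H:%M:%S.%f")

# Arithmetic: shift the observation by one hour.
later = dt + timedelta(hours=1)

# ASSUMPTION: the file's timestamps are UTC; the question does not state this.
aware = dt.replace(tzinfo=timezone.utc)

print(dt.strftime("%Y%m%d%H%M%S"))  # 20170516004436
print(later.isoformat())            # 2017-05-16T01:44:36.151724
print(aware.isoformat())            # 2017-05-16T00:44:36.151724+00:00
```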
