Having difficulty parsing, filtering, or extracting a json.dumps object within a loop

I am looking to extract the first element starting after [{ using the code provided below.

[
  {
    "Bkav": {
      "category": "harmless",
      "result": "clean",
      "method": "blacklist",
      "engine_name": "Bkav"
    },
    "CMC Threat Intelligence": {
      "category": "harmless",
      "result": "clean",
      "method": "blacklist",
      "engine_name": "CMC Threat Intelligence"
    }
  }
]

The code is functional and outputs are stored in variable y.

import json
import re
from http.client import responses

import vt
import requests

with open('/home/asad/Downloads/ssh-log-parser/ok', 'r') as file:
    file = file.read()

pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
ips = pattern.findall(file)
unique_ips = list(set(ips))
headers = {
    "accept": "application/json",
    "x-apikey": "###"
}
i = 0
url = "https://www.virustotal.com/api/v3/ip_addresses/"
messages = []
while i < len(unique_ips):
    furl = url + str(unique_ips[i])
    response = requests.get(furl, headers=headers)
    data_ = response.json()
    i += 1
    messages = [data_['data']['attributes']['last_analysis_results']]
    y = json.dumps(messages)

    labels = [{"value": i} for i in unique_ips]

    out_json = {
        "indicators": {
            "value": labels,
            "type": 'ip'

        },

    }

When I try to index into y (for example y[0]), I do not get the element I expect. Here is the error:

Traceback (most recent call last):
  File "/home/asad/Downloads/ssh-log-parser/auth_log_parser.py", line 35, in <module>
    print(ii, ":", y[ii])
TypeError: string indices must be integers

The objective is to extract the top-level keys (the engine names, such as Bkav and CMC Threat Intelligence):

[{"Bkav": {"category": "harmless", "result": "clean", "method": "blacklist", "engine_name": "Bkav"}, "CMC Threat Intelligence": {"category": "harmless", "result": "clean", "method": "blacklist", "engine_name": "CMC Threat Intelligence"}}]

Answer №1

Have you considered using pandas for this task? (The example .json data is stored in 'Test.json')

import pandas as pd

df = pd.read_json('Test.json')
print(df.values)

Output:

[[{'category': 'harmless', 'result': 'clean', 'method': 'blacklist', 'engine_name': 'Bkav'}
  {'category': 'harmless', 'result': 'clean', 'method': 'blacklist', 'engine_name': 'CMC Threat Intelligence'}
  {'category': 'harmless', 'result': 'clean', 'method': 'blacklist', 'engine_name': 'Snort IP sample list'} ...]]
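To try this on data that is still in memory, the parsed results first have to be written to a file. A minimal sketch, assuming the file should hold a one-element list like the sample in the question; save_results, sample and the temporary path are hypothetical names used only for illustration:

```python
import json
import os
import tempfile


def save_results(results, path):
    """Write results as a one-element JSON list, mirroring the question's sample."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump([results], fh, indent=2)


# Hypothetical sample with the same shape as last_analysis_results.
sample = {
    "Bkav": {"category": "harmless", "result": "clean",
             "method": "blacklist", "engine_name": "Bkav"},
}

# A temporary directory keeps the sketch from overwriting a real Test.json.
out_path = os.path.join(tempfile.mkdtemp(), "Test.json")
save_results(sample, out_path)

# Read the file back to confirm it round-trips.
with open(out_path, encoding="utf-8") as fh:
    loaded = json.load(fh)
print(list(loaded[0]))
```

In the question's loop, the value passed to save_results would be data_['data']['attributes']['last_analysis_results'].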

To print the content without the extra square brackets, pass orient='index' to pd.read_json:

import pandas as pd

df = pd.read_json('Test.json', orient='index')
print(df)

Output:

                         category result     method              engine_name
Bkav                     harmless  clean  blacklist                     Bkav
CMC Threat Intelligence  harmless  clean  blacklist  CMC Threat Intelligence
Snort IP sample list     harmless  clean  blacklist     Snort IP sample list
0xSI_f33d                harmless  clean  blacklist                0xSI_f33d
ViriBack                 harmless  clean  blacklist                 ViriBack
Comodo Valkyrie Verdict  harmless  clean  blacklist  Comodo Valkyrie Verdict
PhishLabs                harmless  clean  blacklist                PhishLabs
K7AntiVirus              harmless  clean  blacklist              K7AntiVirus
CINS Army                harmless  clean  blacklist                CINS Army
Quttera                  harmless  clean  blacklist                  Quttera
PrecisionSec             harmless  clean  blacklist             PrecisionSec
OpenPhish                harmless  clean  blacklist                OpenPhish
VX Vault                 harmless  clean  blacklist                 VX Vault
Web Security Guard       harmless  clean  blacklist       Web Security Guard
Scantitan                harmless  clean  blacklist                Scantitan
AlienVault               harmless  clean  blacklist               AlienVault
Sophos                   harmless  clean  blacklist                   Sophos
Phishtank                harmless  clean  blacklist                Phishtank
Cyan                     harmless  clean  blacklist                     Cyan
Spam404                  harmless  clean  blacklist                  Spam404
SecureBrain              harmless  clean  blacklist              SecureBrain

To find the first element in the dataframe:

print("First element",df.first_valid_index())

which will give you: First element **Bkav**
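The same key is also reachable without pandas. json.dumps returns a str, so y[0] is only its first character, and indexing y with a non-integer raises the TypeError shown in the question. A minimal sketch, assuming the results have the shape of the question's sample; sample_results and first_engine are illustrative names, not part of the VirusTotal API:

```python
import json

# Hypothetical stand-in for data_['data']['attributes']['last_analysis_results'];
# the shape is copied from the sample in the question.
sample_results = {
    "Bkav": {"category": "harmless", "result": "clean",
             "method": "blacklist", "engine_name": "Bkav"},
    "CMC Threat Intelligence": {"category": "harmless", "result": "clean",
                                "method": "blacklist",
                                "engine_name": "CMC Threat Intelligence"},
}


def first_engine(results):
    """Return (name, details) for the first entry, or None if results is empty."""
    return next(iter(results.items()), None)


# json.dumps produces text, so indexing it yields single characters.
y = json.dumps([sample_results])
print(type(y).__name__, repr(y[0]))

# Index the dict itself (or json.loads(y)[0]) to reach the data.
name, details = first_engine(sample_results)
print(name, details["category"])

# All top-level keys, in the order they appear in the response.
print(list(sample_results))
```

Dicts keep insertion order in Python 3.7 and later, so the first key here is the first one in the parsed response. Inside the loop, keeping the parsed dict for each IP (for example in a dict keyed by the IP) avoids having to parse the string again later.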
