How can I use Presto/Athena to find the frequency of JSON attributes in a query?

I created a Hive table with a single column that stores JSON data:

CREATE EXTERNAL TABLE IF NOT EXISTS my.rawdata (
  json string
)
PARTITIONED BY (dt string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = '\n',
   'quoteChar' = '\0',
   'escapeChar' = '\r'
)
STORED AS TEXTFILE
LOCATION 's3://mydata/';

Can someone help me with a Presto/Athena query that can identify all field names present in the JSON along with their frequencies within the table?

Answer №1

Use the JSON functions to parse each JSON string and cast it to a map. Then extract the map keys, unnest them into rows, and finish with an ordinary SQL aggregation:

SELECT key, count(*)
FROM (
  SELECT map_keys(cast(json_parse(json) AS map(varchar, json))) AS keys
  FROM rawdata
)
CROSS JOIN UNNEST (keys) AS t (key)
GROUP BY key
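
To try the query above without touching the real table, here is a minimal sketch that runs the same logic against a few inline rows. The sample rows, the `sample_rows` name, and the `frequency` alias are made up for illustration and are not from the question.

```sql
-- Hypothetical sample rows standing in for rawdata (assumption, not real data).
WITH sample_rows AS (
  SELECT *
  FROM (
    VALUES
      ('{"id": 1, "name": "a"}'),
      ('{"id": 2, "tags": ["x", "y"]}'),
      ('{"id": 3, "name": "c", "meta": {"nested": true}}')
  ) AS v (json)
)
SELECT key, count(*) AS frequency
FROM (
  -- Parse each string and keep only the top-level keys of the object.
  SELECT map_keys(cast(json_parse(json) AS map(varchar, json))) AS keys
  FROM sample_rows
)
CROSS JOIN UNNEST (keys) AS t (key)   -- one output row per key
GROUP BY key
ORDER BY frequency DESC, key
```

With these rows the result should list id with 3, name with 2, and meta and tags with 1 each. The inner key `nested` does not appear, because only top-level keys are counted. Note that json_parse raises an error on malformed input, and the cast assumes every row is a JSON object, so dirty data may need to be filtered out or wrapped in try() first.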

Answer №2

An alternative that extracts the keys with a regular expression:

  • Works on documents with multiple levels of structure
  • Does not count keys inside nested elements

select    attribute
         ,count(*)
from      rawdata t cross join 
          unnest (regexp_extract_all(json,'"([^"]+)"\s*:\s*("[^"]+"|[^,{}]+)',1)) u (attribute)
group by  attribute
;
