Pandas Json Normalize

January 22, 2023August 5, 2023 | by Arround The Web | No comments

JSON or JavaScript Object Notation is a widely utilized format for storing and exchanging/sharing data. It is very difficult to work/deal with JSON data as it is often nested and complex. In order to make it a simple, flattened, or more manageable format, the “pandas.json_normalize()” method of the “Pandas” module is used in Python.

This blog presents a thorough tutorial on the “pandas.json_normalzie()” method using numerous examples and via the following content:

What is the “pandas.json_normalize()” Method in Python?

The “pandas.json_normalzie()” method is used to normalize semi-structured JSON (JavaScript object notation) data into a flat table. The syntax of “pandas.json_normalize()” method is shown below:

pandas.json_normalize(data, max_level=None, record_path=None, record_prefix=None, meta=None, errors='raise', meta_prefix=None, sep='.')

In the above syntax:

The “data” parameter specifies the data that can be a dictionary, a nested dictionary, or a list of dictionaries.
The “max_level” parameter is used to indicate the maximum level or depth of the dictionary to normalize. By default, it normalizes all the levels.
The “record_path” parameter is used to indicate the path to the record of the JavaScript Object Notation, which should be flattened.
The other parameters are optional and can perform certain operations when passed to the function.

Example 1: Normalizing JSON Data Using “pandas.json_normalize()” Method

The below code is used to normalize the nested dictionary JSON data:

import pandas

input_json = [{"name": "Joseph", "id_no": "1234"},{"name": "Lily", "id_no": "1453"},{"name": "Anna", "id_no": "1932"},

{"name": "Henry", "id_no": "1567"},{"name": "David", "id_no": "1354"}]

print(input_json, '\n')

print(pandas.json_normalize(input_json))

In the above code:

The “pandas” module is imported, the JSON data is created, and stored in the variable “input_json”.
The “pandas.json_normalize()” method takes the JSON data as a parameter and normalizes it.

Output

The JSON data has been normalized to the maximum level.

Example 2: Normalizing JSON Data Using “pandas.json_normalize()” Method With “max_level” Parameter

The following code is used to normalize the JSON data based on the level of the normalization:

import pandas

input_json = [

{

"id_no": 18012,

"name": "Joseph",

"grades": {"english": 22, "sports": 30},

{"name": "Henry", "grades": {"english": 38, "sports": 60}},

{

"id_no": 18043,

"name": "Anna",

"grades": {"english": 31, "sports": 190},

]

print(input_json, '\n')

print(pandas.json_normalize(input_json, max_level=0), '\n')

print(pandas.json_normalize(input_json, max_level=1))

In the above code:

The “pandas.json_normalize()” method takes the JSON data and the “max_level=0” parameter value as an argument to normalize the JSON data at level “0”.
The “pandas.json_normalize()” method is used again to normalize the JSON data at level “1” by assigning the parameter value “max_level=1”.

Output

The JSON data has been normalized according to the specified levels, such as “0” and “1”.

Conclusion

In Python, the “pandas.json_normalize()” method of the “pandas” module is utilized to normalize semi-structured JSON (JavaScript Object Notation) data into a flat table. This method is used with various parameters such as “max_level” to normalize the data based on the passed level value. This write-up has delivered a detailed guide on Panda’s JSON normalize method using numerous examples.

Source: linuxhint.com