JSON

Lightweight text-based format for storing and exchanging structured data with nested hierarchies

Quick Info
  • Category: Open Science & Data Sharing
  • Level: Essential
  • Type: Core Tool

Why We Recommend JSON

JSON is the standard format for structured data exchange on the web and in APIs. Its human-readable syntax and support for nested structures make it well suited to configuration files, metadata, and hierarchical data that doesn't fit neatly into a flat format like CSV.

Common Use Cases

  • Configuration files and settings
  • API data exchange
  • Hierarchical metadata storage
  • Nested data structures

Getting Started

JSON (JavaScript Object Notation) is a lightweight, text-based format for storing and exchanging structured data. Despite its JavaScript origins, JSON is language-independent and widely used across all programming languages.

Why JSON?

  • Human-Readable: Easy to read and write
  • Nested Structures: Supports hierarchical data
  • Language-Independent: Works with any programming language
  • Web Standard: Native format for web APIs
  • Self-Describing: Structure is clear from content
  • Lightweight: Minimal syntax overhead

Structure

Basic Syntax

{
  "subject_id": "S01",
  "age": 25,
  "active": true,
  "scores": [95, 87, 92],
  "metadata": {
    "experiment": "visual_discrimination",
    "date": "2024-01-15"
  }
}

Data Types

  • String: "text"
  • Number: 42, 3.14
  • Boolean: true, false
  • Null: null
  • Array: [1, 2, 3]
  • Object: {"key": "value"}
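
When parsed in Python, these JSON types map directly onto built-in Python types. A minimal sketch of the mapping:

import json

# Parse a string containing every JSON type and inspect the resulting Python types
raw = '{"s": "text", "n": 42, "x": 3.14, "b": true, "missing": null, "arr": [1, 2, 3], "obj": {"key": "value"}}'
parsed = json.loads(raw)

print(type(parsed['s']))      # <class 'str'>
print(type(parsed['n']))      # <class 'int'>
print(type(parsed['x']))      # <class 'float'>
print(type(parsed['b']))      # <class 'bool'>
print(parsed['missing'])      # None
print(type(parsed['arr']))    # <class 'list'>
print(type(parsed['obj']))    # <class 'dict'>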

Working with JSON in Python

Reading JSON

import json

# From file
with open('data.json', 'r') as f:
    data = json.load(f)

# From string
json_string = '{"name": "Alice", "score": 95}'
data = json.loads(json_string)

print(data['name'])  # 'Alice'
print(data['score']) # 95

Writing JSON

import json

# Data structure
data = {
    "subject_id": "S01",
    "trials": [
        {"trial": 1, "response": "left", "rt": 0.45},
        {"trial": 2, "response": "right", "rt": 0.38}
    ],
    "metadata": {
        "date": "2024-01-15",
        "experiment": "visual_task"
    }
}

# To file
with open('results.json', 'w') as f:
    json.dump(data, f, indent=2)

# To string
json_string = json.dumps(data, indent=2)

Pretty Printing

import json

# Pretty print with indentation
print(json.dumps(data, indent=2))

# Even more readable
print(json.dumps(data, indent=2, sort_keys=True))

Using Pandas with JSON

Read JSON to DataFrame

import pandas as pd

# From file
df = pd.read_json('data.json')

# From string
df = pd.read_json(json_string)

# Specify orientation
df = pd.read_json('data.json', orient='records')
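
Note: in newer pandas releases (around 2.1 onwards), passing a literal JSON string to read_json is deprecated; wrapping the string in StringIO avoids the warning:

from io import StringIO
import pandas as pd

# Newer pandas versions prefer a file-like object over a raw JSON string
json_string = '[{"name": "Alice", "score": 95}, {"name": "Bob", "score": 87}]'
df = pd.read_json(StringIO(json_string))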

Different JSON Orientations

# records: [{col: val}, {col: val}]
df.to_json('data.json', orient='records', indent=2)

# columns: {col: {index: val}}
df.to_json('data.json', orient='columns')

# index: {index: {col: val}}
df.to_json('data.json', orient='index')

# split: {index: [], columns: [], data: []}
df.to_json('data.json', orient='split')

# values: [[val, val], [val, val]]
df.to_json('data.json', orient='values')

Neuroscience Examples

Experiment Metadata

{
  "experiment": {
    "name": "Visual Discrimination",
    "date": "2024-01-15",
    "pi": "Dr. Smith",
    "protocol": "IRB-2024-001"
  },
  "subjects": [
    {
      "id": "S01",
      "age": 25,
      "gender": "F",
      "sessions": [
        {
          "date": "2024-01-15",
          "trials": 100,
          "accuracy": 0.95
        }
      ]
    }
  ],
  "parameters": {
    "stimulus_duration_ms": 500,
    "iti_ms": 1000,
    "contrast_levels": [0.25, 0.5, 0.75, 1.0]
  }
}

ROI Metadata

{
  "recording": {
    "date": "2024-01-15",
    "microscope": "2P_Bruker",
    "objective": "20x",
    "frame_rate_hz": 30
  },
  "rois": [
    {
      "id": 1,
      "type": "soma",
      "coordinates": {"x": 120, "y": 85, "z": 15},
      "area_pixels": 245,
      "mean_intensity": 128.5
    },
    {
      "id": 2,
      "type": "soma",
      "coordinates": {"x": 145, "y": 92, "z": 15},
      "area_pixels": 189,
      "mean_intensity": 95.2
    }
  ]
}
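
Nested records like the ROIs above flatten cleanly into a table with pandas. A sketch, assuming the metadata is saved as roi_metadata.json (a hypothetical filename):

import json
import pandas as pd

# Load the recording metadata and flatten the ROI records into a DataFrame;
# the nested 'coordinates' fields become dot-separated columns (coordinates.x, ...)
with open('roi_metadata.json', 'r') as f:
    meta = json.load(f)

roi_df = pd.json_normalize(meta['rois'])
print(roi_df[['id', 'coordinates.x', 'coordinates.y', 'area_pixels']])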

Configuration Files

{
  "analysis": {
    "preprocessing": {
      "gaussian_sigma": 1.5,
      "threshold_method": "otsu",
      "min_roi_size": 20
    },
    "detection": {
      "algorithm": "watershed",
      "min_distance": 5,
      "sensitivity": 0.8
    },
    "output": {
      "save_images": true,
      "save_traces": true,
      "format": "hdf5"
    }
  }
}

Advanced Usage

Nested Data Access

import json

with open('experiment.json', 'r') as f:
    data = json.load(f)

# Access nested data
pi_name = data['experiment']['pi']
first_subject = data['subjects'][0]
first_session = data['subjects'][0]['sessions'][0]
contrast_levels = data['parameters']['contrast_levels']

Handling Custom Objects

import json
from datetime import datetime

class DateEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

# Use custom encoder
data = {"date": datetime.now(), "value": 42}
json_str = json.dumps(data, cls=DateEncoder)
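
For quick, one-off dumps, the default hook is a lighter-weight alternative to subclassing the encoder:

# Fall back to str() for anything json cannot serialize natively
json_str = json.dumps(data, default=str)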

Merging JSON Files

import json
from pathlib import Path

# Read multiple JSON files
json_files = Path('data').glob('*.json')
all_data = []

for file in json_files:
    with open(file, 'r') as f:
        all_data.append(json.load(f))

# Save combined
with open('combined.json', 'w') as f:
    json.dump(all_data, f, indent=2)

Validation

Check JSON Validity

import json

def is_valid_json(filename):
    try:
        with open(filename, 'r') as f:
            json.load(f)
        return True
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        return False
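
Example usage, assuming a results.json file in the working directory:

if is_valid_json('results.json'):
    with open('results.json', 'r') as f:
        data = json.load(f)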

Schema Validation

import json
import jsonschema

# Define schema
schema = {
    "type": "object",
    "properties": {
        "subject_id": {"type": "string"},
        "age": {"type": "integer", "minimum": 18},
        "scores": {"type": "array", "items": {"type": "number"}}
    },
    "required": ["subject_id", "age"]
}

# Validate data
data = {"subject_id": "S01", "age": 25, "scores": [95, 87]}
jsonschema.validate(instance=data, schema=schema)
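
When the data does not match the schema, validate raises a ValidationError, which can be caught to report the problem:

from jsonschema import ValidationError

# Age below the schema's minimum triggers a validation error
bad_data = {"subject_id": "S01", "age": 16}
try:
    jsonschema.validate(instance=bad_data, schema=schema)
except ValidationError as e:
    print(f"Validation failed: {e.message}")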

JSON vs CSV

Use JSON When:

  • Data has nested structure
  • Different records have different fields
  • Need to store arrays within records
  • Configuration files
  • API data exchange

Use CSV When:

  • Flat tabular data
  • All records have same fields
  • Need to open in Excel
  • Simpler data structure
  • Human editing required
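
The two formats also convert between each other easily. A sketch that flattens the nested trial records from the writing example above (results.json) into a CSV:

import json
import pandas as pd

# Flatten the 'trials' list from results.json into a flat table and export it
with open('results.json', 'r') as f:
    data = json.load(f)

trials_df = pd.json_normalize(data['trials'])
trials_df.to_csv('trials.csv', index=False)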

Common Patterns

Loading Configuration

import json
from pathlib import Path

def load_config(config_file='config.json'):
    """Load configuration from JSON file."""
    config_path = Path(config_file)

    if not config_path.exists():
        # Return defaults
        return {
            "threshold": 0.5,
            "output_dir": "results"
        }

    with open(config_path, 'r') as f:
        return json.load(f)

config = load_config()
threshold = config['threshold']

Saving Results

import json
from datetime import datetime

def save_analysis_results(results, output_file):
    """Save analysis results as JSON."""
    output = {
        "timestamp": datetime.now().isoformat(),
        "parameters": results['parameters'],
        "metrics": results['metrics'],
        "summary": {
            "n_rois": len(results['rois']),
            "mean_response": results['metrics']['mean_response']
        }
    }

    with open(output_file, 'w') as f:
        json.dump(output, f, indent=2)

Tools for Working with JSON

Command Line

# Pretty print (requires jq)
cat data.json | jq .

# Extract specific field
cat data.json | jq '.subjects[0].age'

# Python pretty print
python -m json.tool data.json

Python Pretty Print

import json

# Read and pretty print
with open('messy.json', 'r') as f:
    data = json.load(f)

with open('clean.json', 'w') as f:
    json.dump(data, f, indent=2, sort_keys=True)

Best Practices

  • Use 2 or 4 spaces for indentation
  • Keep files human-readable with indentation
  • Use meaningful key names
  • Include metadata (date, version, etc.)
  • Validate JSON structure before saving
  • Use .json file extension
  • Handle exceptions when loading
  • Consider YAML for complex configs
  • Use schema validation for critical data
  • Document structure in README

Common Errors

Trailing Commas

{
  "key": "value",  // ❌ Comma after last item
}

{
  "key": "value"   // ✅ No trailing comma
}

Single Quotes

{'key': 'value'}    // ❌ Single quotes not allowed

{"key": "value"}    // ✅ Use double quotes

Comments

{
  // This is a comment  // ❌ Comments not allowed in JSON
  "key": "value"
}
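
Each of these mistakes raises json.JSONDecodeError when parsed, which makes them easy to catch programmatically:

import json

try:
    json.loads('{"key": "value",}')  # trailing comma
except json.JSONDecodeError as e:
    print(f"Parse error: {e}")  # e.g. "Expecting property name enclosed in double quotes: ..."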

Installation

JSON support is built into Python’s standard library - no installation needed!

import json  # Built-in module

For schema validation:

pip install jsonschema

Summary

JSON is essential for:

  • Configuration files
  • API data exchange
  • Hierarchical metadata
  • Nested data structures

While CSV is great for flat tabular data, JSON excels at representing complex, nested structures common in modern research workflows.
