JSON (JavaScript Object Notation) is a lightweight, text-based format for storing and exchanging structured data. Despite its JavaScript origins, JSON is language-independent and widely used across all programming languages.
Why JSON?
- Human-Readable: Easy to read and write
- Nested Structures: Supports hierarchical data
- Language-Independent: Works with any programming language
- Web Standard: Native format for web APIs
- Self-Describing: Structure is clear from content
- Lightweight: Minimal syntax overhead
Structure
Basic Syntax
```json
{
  "subject_id": "S01",
  "age": 25,
  "active": true,
  "scores": [95, 87, 92],
  "metadata": {
    "experiment": "visual_discrimination",
    "date": "2024-01-15"
  }
}
```
Data Types
- String: `"text"`
- Number: `42`, `3.14`
- Boolean: `true`, `false`
- Null: `null`
- Array: `[1, 2, 3]`
- Object: `{"key": "value"}`
Working with JSON in Python
Reading JSON
```python
import json

# From file
with open('data.json', 'r') as f:
    data = json.load(f)

# From string
json_string = '{"name": "Alice", "score": 95}'
data = json.loads(json_string)

print(data['name'])   # 'Alice'
print(data['score'])  # 95
```
Writing JSON
```python
import json

# Data structure
data = {
    "subject_id": "S01",
    "trials": [
        {"trial": 1, "response": "left", "rt": 0.45},
        {"trial": 2, "response": "right", "rt": 0.38}
    ],
    "metadata": {
        "date": "2024-01-15",
        "experiment": "visual_task"
    }
}

# To file
with open('results.json', 'w') as f:
    json.dump(data, f, indent=2)

# To string
json_string = json.dumps(data, indent=2)
```
Pretty Printing
```python
import json

# Pretty print with indentation
print(json.dumps(data, indent=2))

# Even more readable: also sort keys alphabetically
print(json.dumps(data, indent=2, sort_keys=True))
```
Using Pandas with JSON
Read JSON to DataFrame
```python
import pandas as pd
from io import StringIO

# From file
df = pd.read_json('data.json')

# From a string (newer pandas versions expect a file-like object)
df = pd.read_json(StringIO(json_string))

# Specify orientation
df = pd.read_json('data.json', orient='records')
```
Different JSON Orientations
```python
# records: [{col: val}, {col: val}]
df.to_json('data.json', orient='records', indent=2)

# columns: {col: {index: val}}
df.to_json('data.json', orient='columns')

# index: {index: {col: val}}
df.to_json('data.json', orient='index')

# split: {index: [], columns: [], data: []}
df.to_json('data.json', orient='split')

# values: [[val, val], [val, val]]
df.to_json('data.json', orient='values')
```
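As a sanity check, the `records` orientation round-trips a small hypothetical DataFrame back to an equivalent one:

```python
import io
import json
import pandas as pd

df = pd.DataFrame({"trial": [1, 2], "rt": [0.45, 0.38]})

# 'records' produces one JSON object per row
records_json = df.to_json(orient="records")
assert json.loads(records_json) == [
    {"trial": 1, "rt": 0.45},
    {"trial": 2, "rt": 0.38},
]

# Round-trip back to an equivalent DataFrame
df2 = pd.read_json(io.StringIO(records_json), orient="records")
assert df.equals(df2)
```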
Neuroscience Examples
Experiment Metadata
```json
{
  "experiment": {
    "name": "Visual Discrimination",
    "date": "2024-01-15",
    "pi": "Dr. Smith",
    "protocol": "IRB-2024-001"
  },
  "subjects": [
    {
      "id": "S01",
      "age": 25,
      "gender": "F",
      "sessions": [
        {
          "date": "2024-01-15",
          "trials": 100,
          "accuracy": 0.95
        }
      ]
    }
  ],
  "parameters": {
    "stimulus_duration_ms": 500,
    "iti_ms": 1000,
    "contrast_levels": [0.25, 0.5, 0.75, 1.0]
  }
}
```
ROI Metadata
```json
{
  "recording": {
    "date": "2024-01-15",
    "microscope": "2P_Bruker",
    "objective": "20x",
    "frame_rate_hz": 30
  },
  "rois": [
    {
      "id": 1,
      "type": "soma",
      "coordinates": {"x": 120, "y": 85, "z": 15},
      "area_pixels": 245,
      "mean_intensity": 128.5
    },
    {
      "id": 2,
      "type": "soma",
      "coordinates": {"x": 145, "y": 92, "z": 15},
      "area_pixels": 189,
      "mean_intensity": 95.2
    }
  ]
}
```
Configuration Files
```json
{
  "analysis": {
    "preprocessing": {
      "gaussian_sigma": 1.5,
      "threshold_method": "otsu",
      "min_roi_size": 20
    },
    "detection": {
      "algorithm": "watershed",
      "min_distance": 5,
      "sensitivity": 0.8
    },
    "output": {
      "save_images": true,
      "save_traces": true,
      "format": "hdf5"
    }
  }
}
```
Advanced Usage
Nested Data Access
```python
import json

with open('experiment.json', 'r') as f:
    data = json.load(f)

# Access nested data
pi_name = data['experiment']['pi']
first_subject = data['subjects'][0]
first_session = data['subjects'][0]['sessions'][0]
contrast_levels = data['parameters']['contrast_levels']
```
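When a key may be absent (for example, not every record stores a protocol), `dict.get` avoids a `KeyError`; a small sketch on a plain dictionary:

```python
data = {"experiment": {"pi": "Dr. Smith"}}

# .get() returns a default instead of raising KeyError
pi = data["experiment"].get("pi", "unknown")
protocol = data["experiment"].get("protocol", "unknown")
assert pi == "Dr. Smith"
assert protocol == "unknown"
```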
Handling Custom Objects
```python
import json
from datetime import datetime

class DateEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

# Use custom encoder
data = {"date": datetime.now(), "value": 42}
json_str = json.dumps(data, cls=DateEncoder)
```
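For one-off cases, the `default` keyword argument of `json.dumps` is a lighter alternative to subclassing `JSONEncoder`; a minimal sketch with a fixed date:

```python
import json
from datetime import datetime

data = {"date": datetime(2024, 1, 15, 9, 30), "value": 42}

# default= is called for any object json cannot serialize natively
json_str = json.dumps(data, default=lambda o: o.isoformat())
print(json_str)  # {"date": "2024-01-15T09:30:00", "value": 42}
```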
Merging JSON Files
```python
import json
from pathlib import Path

# Read multiple JSON files
json_files = Path('data').glob('*.json')
all_data = []
for file in json_files:
    with open(file, 'r') as f:
        all_data.append(json.load(f))

# Save combined
with open('combined.json', 'w') as f:
    json.dump(all_data, f, indent=2)
```
Validation
Check JSON Validity
```python
import json

def is_valid_json(filename):
    try:
        with open(filename, 'r') as f:
            json.load(f)
        return True
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        return False
```
Schema Validation
```python
import json
import jsonschema

# Define schema
schema = {
    "type": "object",
    "properties": {
        "subject_id": {"type": "string"},
        "age": {"type": "integer", "minimum": 18},
        "scores": {"type": "array", "items": {"type": "number"}}
    },
    "required": ["subject_id", "age"]
}

# Validate data (passes silently when the data matches)
data = {"subject_id": "S01", "age": 25, "scores": [95, 87]}
jsonschema.validate(instance=data, schema=schema)
```
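When the data violates the schema, `validate` raises `jsonschema.ValidationError`; a sketch of catching it, using a smaller schema and deliberately invalid data:

```python
import jsonschema

schema = {
    "type": "object",
    "properties": {"age": {"type": "integer", "minimum": 18}},
    "required": ["age"],
}

bad_data = {"age": 16}  # violates the minimum

try:
    jsonschema.validate(instance=bad_data, schema=schema)
    valid = True
except jsonschema.ValidationError as e:
    valid = False
    print(f"Validation failed: {e.message}")
```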
JSON vs CSV
Use JSON When:
- Data has nested structure
- Different records have different fields
- Need to store arrays within records
- Configuration files
- API data exchange
Use CSV When:
- Flat tabular data
- All records have same fields
- Need to open in Excel
- Simpler data structure
- Human editing required
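When nested JSON ultimately needs to become a flat table, pandas can bridge the two formats; a minimal sketch using `pd.json_normalize` (the records shown are hypothetical):

```python
import pandas as pd

records = [
    {"id": "S01", "scores": {"accuracy": 0.95, "rt": 0.45}},
    {"id": "S02", "scores": {"accuracy": 0.88, "rt": 0.52}},
]

# Nested objects become dotted column names (e.g. "scores.accuracy")
df = pd.json_normalize(records)
print(df.columns.tolist())
```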
Common Patterns
Loading Configuration
```python
import json
from pathlib import Path

def load_config(config_file='config.json'):
    """Load configuration from a JSON file, falling back to defaults."""
    config_path = Path(config_file)
    if not config_path.exists():
        # Return defaults
        return {
            "threshold": 0.5,
            "output_dir": "results"
        }
    with open(config_path, 'r') as f:
        return json.load(f)

config = load_config()
threshold = config['threshold']
```
Saving Results
```python
import json
from datetime import datetime

def save_analysis_results(results, output_file):
    """Save analysis results as JSON."""
    output = {
        "timestamp": datetime.now().isoformat(),
        "parameters": results['parameters'],
        "metrics": results['metrics'],
        "summary": {
            "n_rois": len(results['rois']),
            "mean_response": results['metrics']['mean_response']
        }
    }
    with open(output_file, 'w') as f:
        json.dump(output, f, indent=2)
```
Tools for Working with JSON
Command Line
```bash
# Pretty print (requires jq)
cat data.json | jq .

# Extract a specific field
cat data.json | jq '.subjects[0].age'

# Pretty print with Python's standard library
python -m json.tool data.json
```
Python Pretty Print
```python
import json

# Read a messy file and write it back neatly formatted
with open('messy.json', 'r') as f:
    data = json.load(f)

with open('clean.json', 'w') as f:
    json.dump(data, f, indent=2, sort_keys=True)
```
Best Practices
- Use 2 or 4 spaces for indentation
- Keep files human-readable with indentation
- Use meaningful key names
- Include metadata (date, version, etc.)
- Validate JSON structure before saving
- Use the `.json` file extension
- Handle exceptions when loading
- Consider YAML for complex configs
- Use schema validation for critical data
- Document structure in README
Common Errors
Trailing Commas
```
{
  "key": "value",   // ❌ Comma after last item
}

{
  "key": "value"    // ✅ No trailing comma
}
```
Single Quotes
```
{'key': 'value'}   // ❌ Single quotes not allowed
{"key": "value"}   // ✅ Use double quotes
```
Comments
```
{
  // This is a comment   ❌ Comments are not allowed in JSON
  "key": "value"
}
```
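These mistakes surface as `json.JSONDecodeError`, which reports where parsing failed; a quick sketch with a trailing comma:

```python
import json

invalid = '{"key": "value",}'  # trailing comma

try:
    json.loads(invalid)
    error = None
except json.JSONDecodeError as e:
    error = e
    print(f"{e.msg} (line {e.lineno}, column {e.colno})")
```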
Installation
JSON support is built into Python’s standard library - no installation needed!
```python
import json  # Built-in module
```
For schema validation:
```bash
pip install jsonschema
```
Summary
JSON is essential for:
- Configuration files
- API data exchange
- Hierarchical metadata
- Nested data structures
While CSV is great for flat tabular data, JSON excels at representing complex, nested structures common in modern research workflows.