JSON (JavaScript Object Notation) is a lightweight, text-based format for storing and exchanging structured data. Despite its JavaScript origins, JSON is language-independent and widely used across all programming languages.
Why JSON?
- Human-Readable: Easy to read and write
- Nested Structures: Supports hierarchical data
- Language-Independent: Works with any programming language
- Web Standard: Native format for web APIs
- Self-Describing: Structure is clear from content
- Lightweight: Minimal syntax overhead
Structure
Basic Syntax
```json
{
  "subject_id": "S01",
  "age": 25,
  "active": true,
  "scores": [95, 87, 92],
  "metadata": {
    "experiment": "visual_discrimination",
    "date": "2024-01-15"
  }
}
```
Data Types
- String: `"text"`
- Number: `42`, `3.14`
- Boolean: `true`, `false`
- Null: `null`
- Array: `[1, 2, 3]`
- Object: `{"key": "value"}`
Working with JSON in Python
Reading JSON
```python
import json

# From file
with open('data.json', 'r') as f:
    data = json.load(f)

# From string
json_string = '{"name": "Alice", "score": 95}'
data = json.loads(json_string)

print(data['name'])   # 'Alice'
print(data['score'])  # 95
```
Writing JSON
```python
import json

# Data structure
data = {
    "subject_id": "S01",
    "trials": [
        {"trial": 1, "response": "left", "rt": 0.45},
        {"trial": 2, "response": "right", "rt": 0.38}
    ],
    "metadata": {
        "date": "2024-01-15",
        "experiment": "visual_task"
    }
}

# To file
with open('results.json', 'w') as f:
    json.dump(data, f, indent=2)

# To string
json_string = json.dumps(data, indent=2)
```
Pretty Printing
```python
import json

# Pretty print with indentation
print(json.dumps(data, indent=2))

# Even more readable: also sort keys alphabetically
print(json.dumps(data, indent=2, sort_keys=True))
```
Using Pandas with JSON
Read JSON to DataFrame
```python
import pandas as pd
from io import StringIO

# From file
df = pd.read_json('data.json')

# From a string (newer pandas versions expect a file-like object)
df = pd.read_json(StringIO(json_string))

# Specify orientation
df = pd.read_json('data.json', orient='records')
```
Different JSON Orientations
```python
# records: [{col: val}, {col: val}]
df.to_json('data.json', orient='records', indent=2)

# columns: {col: {index: val}}
df.to_json('data.json', orient='columns')

# index: {index: {col: val}}
df.to_json('data.json', orient='index')

# split: {index: [], columns: [], data: []}
df.to_json('data.json', orient='split')

# values: [[val, val], [val, val]]
df.to_json('data.json', orient='values')
```
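As a sanity check, the `records` orientation round-trips a small hypothetical DataFrame back to an equivalent one:

```python
import io
import json
import pandas as pd

df = pd.DataFrame({"trial": [1, 2], "rt": [0.45, 0.38]})

# 'records' produces one JSON object per row
records_json = df.to_json(orient="records")
assert json.loads(records_json) == [
    {"trial": 1, "rt": 0.45},
    {"trial": 2, "rt": 0.38},
]

# Round-trip back to an equivalent DataFrame
df2 = pd.read_json(io.StringIO(records_json), orient="records")
assert df.equals(df2)
```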
Neuroscience Examples
Experiment Metadata
```json
{
  "experiment": {
    "name": "Visual Discrimination",
    "date": "2024-01-15",
    "pi": "Dr. Smith",
    "protocol": "IRB-2024-001"
  },
  "subjects": [
    {
      "id": "S01",
      "age": 25,
      "gender": "F",
      "sessions": [
        {
          "date": "2024-01-15",
          "trials": 100,
          "accuracy": 0.95
        }
      ]
    }
  ],
  "parameters": {
    "stimulus_duration_ms": 500,
    "iti_ms": 1000,
    "contrast_levels": [0.25, 0.5, 0.75, 1.0]
  }
}
```
ROI Metadata
```json
{
  "recording": {
    "date": "2024-01-15",
    "microscope": "2P_Bruker",
    "objective": "20x",
    "frame_rate_hz": 30
  },
  "rois": [
    {
      "id": 1,
      "type": "soma",
      "coordinates": {"x": 120, "y": 85, "z": 15},
      "area_pixels": 245,
      "mean_intensity": 128.5
    },
    {
      "id": 2,
      "type": "soma",
      "coordinates": {"x": 145, "y": 92, "z": 15},
      "area_pixels": 189,
      "mean_intensity": 95.2
    }
  ]
}
```
Configuration Files
```json
{
  "analysis": {
    "preprocessing": {
      "gaussian_sigma": 1.5,
      "threshold_method": "otsu",
      "min_roi_size": 20
    },
    "detection": {
      "algorithm": "watershed",
      "min_distance": 5,
      "sensitivity": 0.8
    },
    "output": {
      "save_images": true,
      "save_traces": true,
      "format": "hdf5"
    }
  }
}
```
Advanced Usage
Nested Data Access
```python
import json

with open('experiment.json', 'r') as f:
    data = json.load(f)

# Access nested data
pi_name = data['experiment']['pi']
first_subject = data['subjects'][0]
first_session = data['subjects'][0]['sessions'][0]
contrast_levels = data['parameters']['contrast_levels']
```
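When a key may be absent (for example, not every record stores a protocol), `dict.get` avoids a `KeyError`; a small sketch on a plain dictionary:

```python
data = {"experiment": {"pi": "Dr. Smith"}}

# .get() returns a default instead of raising KeyError
pi = data["experiment"].get("pi", "unknown")
protocol = data["experiment"].get("protocol", "unknown")
assert pi == "Dr. Smith"
assert protocol == "unknown"
```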
Handling Custom Objects
```python
import json
from datetime import datetime

class DateEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

# Use custom encoder
data = {"date": datetime.now(), "value": 42}
json_str = json.dumps(data, cls=DateEncoder)
```
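For one-off cases, the `default` keyword argument of `json.dumps` is a lighter alternative to subclassing `JSONEncoder`; a minimal sketch with a fixed date:

```python
import json
from datetime import datetime

data = {"date": datetime(2024, 1, 15, 9, 30), "value": 42}

# default= is called for any object json cannot serialize natively
json_str = json.dumps(data, default=lambda o: o.isoformat())
print(json_str)  # {"date": "2024-01-15T09:30:00", "value": 42}
```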
Merging JSON Files
```python
import json
from pathlib import Path

# Read multiple JSON files
json_files = Path('data').glob('*.json')
all_data = []
for file in json_files:
    with open(file, 'r') as f:
        all_data.append(json.load(f))

# Save combined
with open('combined.json', 'w') as f:
    json.dump(all_data, f, indent=2)
```
Validation
Check JSON Validity
```python
import json

def is_valid_json(filename):
    try:
        with open(filename, 'r') as f:
            json.load(f)
        return True
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        return False
```
Schema Validation
```python
import json
import jsonschema

# Define schema
schema = {
    "type": "object",
    "properties": {
        "subject_id": {"type": "string"},
        "age": {"type": "integer", "minimum": 18},
        "scores": {"type": "array", "items": {"type": "number"}}
    },
    "required": ["subject_id", "age"]
}

# Validate data (passes silently when the data matches)
data = {"subject_id": "S01", "age": 25, "scores": [95, 87]}
jsonschema.validate(instance=data, schema=schema)
```
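When the data violates the schema, `validate` raises `jsonschema.ValidationError`; a sketch of catching it, using a smaller schema and deliberately invalid data:

```python
import jsonschema

schema = {
    "type": "object",
    "properties": {"age": {"type": "integer", "minimum": 18}},
    "required": ["age"],
}

bad_data = {"age": 16}  # violates the minimum

try:
    jsonschema.validate(instance=bad_data, schema=schema)
    valid = True
except jsonschema.ValidationError as e:
    valid = False
    print(f"Validation failed: {e.message}")
```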
JSON vs CSV
Use JSON When:
- Data has nested structure
- Different records have different fields
- Need to store arrays within records
- Configuration files
- API data exchange
Use CSV When:
- Flat tabular data
- All records have same fields
- Need to open in Excel
- Simpler data structure
- Human editing required
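When nested JSON ultimately needs to become a flat table, pandas can bridge the two formats; a minimal sketch using `pd.json_normalize` (the records shown are hypothetical):

```python
import pandas as pd

records = [
    {"id": "S01", "scores": {"accuracy": 0.95, "rt": 0.45}},
    {"id": "S02", "scores": {"accuracy": 0.88, "rt": 0.52}},
]

# Nested objects become dotted column names (e.g. "scores.accuracy")
df = pd.json_normalize(records)
print(df.columns.tolist())
```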
Common Patterns
Loading Configuration
```python
import json
from pathlib import Path

def load_config(config_file='config.json'):
    """Load configuration from a JSON file, falling back to defaults."""
    config_path = Path(config_file)
    if not config_path.exists():
        # Return defaults
        return {
            "threshold": 0.5,
            "output_dir": "results"
        }
    with open(config_path, 'r') as f:
        return json.load(f)

config = load_config()
threshold = config['threshold']
```
Saving Results
```python
import json
from datetime import datetime

def save_analysis_results(results, output_file):
    """Save analysis results as JSON."""
    output = {
        "timestamp": datetime.now().isoformat(),
        "parameters": results['parameters'],
        "metrics": results['metrics'],
        "summary": {
            "n_rois": len(results['rois']),
            "mean_response": results['metrics']['mean_response']
        }
    }
    with open(output_file, 'w') as f:
        json.dump(output, f, indent=2)
```
Tools for Working with JSON
Command Line
```bash
# Pretty print (requires jq)
cat data.json | jq .

# Extract a specific field
cat data.json | jq '.subjects[0].age'

# Pretty print with Python's standard library
python -m json.tool data.json
```
Python Pretty Print
```python
import json

# Read a messy file and write it back neatly formatted
with open('messy.json', 'r') as f:
    data = json.load(f)

with open('clean.json', 'w') as f:
    json.dump(data, f, indent=2, sort_keys=True)
```
Best Practices
- Use 2 or 4 spaces for indentation
- Keep files human-readable with indentation
- Use meaningful key names
- Include metadata (date, version, etc.)
- Validate JSON structure before saving
- Use the `.json` file extension
- Handle exceptions when loading
- Consider YAML for complex configs
- Use schema validation for critical data
- Document structure in README
Common Errors
Trailing Commas
```
{
  "key": "value",   // ❌ Comma after last item
}

{
  "key": "value"    // ✅ No trailing comma
}
```
Single Quotes
```
{'key': 'value'}   // ❌ Single quotes not allowed
{"key": "value"}   // ✅ Use double quotes
```
Comments
```
{
  // This is a comment   ❌ Comments are not allowed in JSON
  "key": "value"
}
```
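These mistakes surface as `json.JSONDecodeError`, which reports where parsing failed; a quick sketch with a trailing comma:

```python
import json

invalid = '{"key": "value",}'  # trailing comma

try:
    json.loads(invalid)
    error = None
except json.JSONDecodeError as e:
    error = e
    print(f"{e.msg} (line {e.lineno}, column {e.colno})")
```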
Installation
JSON support is built into Python’s standard library - no installation needed!
```python
import json  # Built-in module
```
For schema validation:
```bash
pip install jsonschema
```
Summary
JSON is essential for:
- Configuration files
- API data exchange
- Hierarchical metadata
- Nested data structures
While CSV is great for flat tabular data, JSON excels at representing complex, nested structures common in modern research workflows.