NetCDF

Network Common Data Form for array-oriented scientific data

Quick Info
  • Category: Open Science & Data Sharing
  • Level: Intermediate
  • Type: Data Format

Why We Recommend NetCDF

NetCDF provides self-describing, portable storage for multi-dimensional arrays with labeled dimensions. xarray's data model is based on NetCDF, so datasets round-trip to disk with their dimensions, coordinates, and attributes intact. Together with standardized metadata conventions such as CF, this makes it a good fit for time-series data with spatial structure.

Common Use Cases

  • Time-series data with spatial dimensions
  • Multi-dimensional experimental recordings
  • Data with complex coordinate systems
  • Integration with xarray workflows

Getting Started

NetCDF (Network Common Data Form) is a set of software libraries and self-describing, machine-independent data formats for array-oriented scientific data. Originally developed at Unidata for atmospheric and climate science, it is now widely used across scientific domains.

Key Features

  • Self-describing: Variables include metadata describing dimensions, units, and conventions
  • Portable: Platform-independent binary format
  • Efficient access: Read subsets of large arrays without loading the whole file
  • CF Conventions: Standardized metadata conventions for geophysical data
  • Multiple formats: The classic NetCDF-3 format, or NetCDF-4, which uses HDF5 as its storage layer
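
The self-describing and subset-access features can be seen in a small round trip. The following is a minimal sketch, assuming xarray and at least one NetCDF backend (netCDF4, h5netcdf, or scipy) are installed; the variable name `temperature`, the file name, and all values are made up for illustration.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Hypothetical 2-D variable: 1000 time steps x 8 sensors
values = np.arange(8000, dtype="float64").reshape(1000, 8)
ds = xr.Dataset(
    {"temperature": (["time", "sensor"], values)},
    coords={"time": np.arange(1000), "sensor": np.arange(8)},
    attrs={"title": "illustrative example"},
)
ds["temperature"].attrs["units"] = "kelvin"

# Write to a temporary directory so the sketch leaves no files behind in the working directory
path = os.path.join(tempfile.mkdtemp(), "example.nc")
ds.to_netcdf(path)

# Reopen: dimensions, coordinates, and attributes come back from the file itself
with xr.open_dataset(path) as reopened:
    units = reopened["temperature"].attrs["units"]
    dims = reopened["temperature"].dims
    # Data is loaded lazily; only the requested slice is read on .values
    subset = reopened["temperature"].isel(time=slice(0, 10), sensor=0).values

print(units, dims, subset.shape)
```

Nothing outside the file is needed to interpret it: the reader recovers the dimension names and units without knowing how the file was written.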

Scientific Applications

NetCDF is commonly used for:

  • Time-series data with spatial dimensions
  • Multi-dimensional experimental recordings
  • Model outputs and simulations
  • Data with complex coordinate systems

Python Integration

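import numpy as np
import xarray as xr

# Placeholder data so the example runs; substitute your own recordings
n_time, n_channel, n_trial = 10_000, 64, 20  # 10 s at 1 kHz
rng = np.random.default_rng(0)
activity_data = rng.standard_normal((n_time, n_channel, n_trial), dtype=np.float32)
stimulus_data = np.zeros((n_time, n_trial), dtype=np.float32)

# Create dataset
ds = xr.Dataset(
    {
        'neural_activity': (['time', 'channel', 'trial'], activity_data),
        'stimulus': (['time', 'trial'], stimulus_data),
    },
    coords={
        'time': np.arange(n_time) / 1000.0,  # seconds; 10 s at 1 kHz
        'channel': np.arange(n_channel),
        'trial': np.arange(n_trial),
    },
    attrs={
        'experiment': 'visual_response',
        'subject_id': 'M01',
        'recording_date': '2024-01-15',
    }
)

# Add metadata
ds['neural_activity'].attrs['units'] = 'microvolts'
ds['neural_activity'].attrs['description'] = 'LFP recordings'

# Save to NetCDF (requires a backend such as netCDF4, h5netcdf, or scipy)
ds.to_netcdf('experiment.nc')

# Load and work with subsets (data is read lazily, on access)
ds = xr.open_dataset('experiment.nc')
trial_5 = ds.sel(trial=5)
# Label-based slices include both endpoints: channels 10 through 20
channels_10_20 = ds.sel(channel=slice(10, 20))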
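
The selections above are label-based, which behaves differently from positional indexing. A small in-memory sketch of the difference between `sel` and `isel` follows; the toy dataset and its names are made up for illustration.

```python
import numpy as np
import xarray as xr

# Toy dataset: 5 time points x 32 channels (values are arbitrary)
toy = xr.Dataset(
    {"signal": (["time", "channel"], np.zeros((5, 32)))},
    coords={"time": [0.0, 0.001, 0.002, 0.003, 0.004], "channel": np.arange(32)},
)

# .sel uses coordinate labels; both slice endpoints are included
by_label = toy.sel(channel=slice(10, 20))

# .isel uses integer positions; the stop index is excluded, as in NumPy
by_position = toy.isel(channel=slice(10, 20))

# method="nearest" picks the coordinate value closest to the request
nearest = toy.sel(time=0.0024, method="nearest")

print(by_label.sizes["channel"], by_position.sizes["channel"], float(nearest["time"]))
# → 11 10 0.002
```

Here the channel labels happen to equal their positions, so the only visible difference is the inclusive endpoint. With non-integer or non-contiguous labels the two methods can select entirely different data.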

When to Use NetCDF

Best for:

  • Multi-dimensional arrays with labeled dimensions
  • Time-series data with spatial/channel structure
  • Data sharing with standardized metadata
  • Integration with xarray workflows
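
Standardized metadata in practice mostly means consistent attribute names. Below is a hedged sketch of CF-style attributes on an xarray dataset. `units`, `long_name`, and the global `Conventions` attribute are names defined by the CF conventions; the variable `lfp`, the version string, and the values are illustrative assumptions, and the result is not a validated CF file.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"lfp": (["time", "channel"], np.zeros((100, 4), dtype="float32"))},
    coords={
        # (dims, data, attrs) tuple attaches attributes at construction
        "time": ("time", np.arange(100) / 1000.0,
                 {"units": "s", "long_name": "time since recording start"}),
        "channel": np.arange(4),
    },
)

# Per-variable attributes describe what each array holds
ds["lfp"].attrs.update({"units": "uV", "long_name": "local field potential"})

# Global attributes describe the file as a whole
ds.attrs.update({"Conventions": "CF-1.8", "title": "Illustrative LFP recording"})

# Simple pre-sharing check: every data variable should declare its units
missing_units = [name for name, var in ds.data_vars.items() if "units" not in var.attrs]
print(missing_units)
```

A check like `missing_units` is only a convenience; dedicated CF compliance checkers exist for stricter validation.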

Consider alternatives for:

  • Simple tabular data (use Parquet or CSV)
  • Deeply hierarchical/nested structures (use HDF5 directly; NetCDF-4 offers groups but is built around arrays)
  • Specialized neuroscience formats (use NWB)