PyTorch is an open-source machine learning framework that accelerates the path from research prototyping to production deployment. Developed by Meta AI, it’s become one of the most popular deep learning frameworks, especially in research.
## Why PyTorch?
- Intuitive: Feels like NumPy with GPU support
- Dynamic: Computational graphs are built on the fly, making debugging easier (see the autograd sketch after this list)
- Flexible: Easy to customize and experiment
- Fast: Optimized C++ backend with GPU acceleration
- Research-Friendly: Widely used in academic papers
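A quick taste of the dynamic graph: gradients are recorded as operations run, so you can inspect them immediately (the numbers here are arbitrary):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # the graph is built as this line executes
y.backward()         # backpropagate through it
print(x.grad)        # tensor([2., 4., 6.]), i.e., dy/dx = 2x
```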
## Key Concepts

### Tensors
PyTorch’s fundamental data structure:
```python
import torch

# Create tensors
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.randn(3, 4)  # Random 3x4 matrix

# GPU acceleration
if torch.cuda.is_available():
    x = x.cuda()  # Move to GPU (equivalently: x = x.to('cuda'))
```
### Neural Networks
```python
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

model = SimpleNet(784, 128, 10)  # e.g., flattened 28x28 images -> 10 classes
```
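As a quick sanity check, a forward pass on a dummy batch (batch size 32 is arbitrary; 784 matches a flattened 28x28 image):

```python
dummy = torch.randn(32, 784)
logits = model(dummy)
print(logits.shape)  # torch.Size([32, 10])
```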
### Training Loop
```python
import torch.optim as optim

# Setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training
for epoch in range(num_epochs):
    for batch_x, batch_y in dataloader:
        # Forward pass
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f'Epoch {epoch}, Loss: {loss.item():.4f}')
```
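The loop assumes `dataloader` and `num_epochs` are already defined; a minimal sketch using `TensorDataset` (sizes are illustrative):

```python
from torch.utils.data import DataLoader, TensorDataset

inputs = torch.randn(1000, 784)          # illustrative inputs
labels = torch.randint(0, 10, (1000,))   # illustrative class labels
dataloader = DataLoader(TensorDataset(inputs, labels), batch_size=64, shuffle=True)
num_epochs = 10
```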
## Common Architectures

### Feedforward Network
```python
model = nn.Sequential(
    nn.Linear(input_dim, 512),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, output_dim)
)
```
### Convolutional Network
```python
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.fc1 = nn.Linear(64 * 6 * 6, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = x.view(x.size(0), -1)  # flatten to (batch, 64 * 6 * 6)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
```
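The `64 * 6 * 6` flatten size corresponds to 32x32 single-channel inputs (32 → 30 after `conv1`, 15 after pooling, 13 after `conv2`, 6 after pooling):

```python
x = torch.randn(16, 1, 32, 32)  # (batch, channels, height, width)
logits = CNN()(x)               # shape: (16, 10)
```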
### Recurrent Network
```python
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)          # out: (batch, time, hidden_size)
        out = self.fc(out[:, -1, :])  # predict from the last time step
        return out
```
## Neuroscience Applications

### Predicting Neural Activity
Model brain responses from stimulus features:

```python
model = nn.Sequential(
    nn.Linear(stimulus_dim, 256),  # stimulus_dim: number of stimulus features
    nn.ReLU(),
    nn.Linear(256, n_neurons)      # n_neurons: number of recorded neurons
)
```
### Spike Sorting

Classify spike waveforms, for example with CNNs applied to short spike snippets, as sketched below.
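A minimal sketch of that idea, assuming 4-channel snippets of 48 samples sorted into 3 candidate units (all sizes illustrative):

```python
spike_classifier = nn.Sequential(
    nn.Conv1d(4, 16, kernel_size=5),   # 4 recording channels -> 16 feature maps
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),           # pool over time -> (batch, 32, 1)
    nn.Flatten(),
    nn.Linear(32, 3),                  # scores for 3 candidate units
)
scores = spike_classifier(torch.randn(8, 4, 48))  # (batch, channels, samples)
```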
### Decoding Behavior

Predict behavior from neural population activity; RNNs capture the temporal dependencies, as sketched below.
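A minimal sketch reusing the `RNN` class defined above; the population size (50 neurons), window length (100 time bins), and 2-D behavioral output are all illustrative:

```python
decoder = RNN(input_size=50, hidden_size=128, output_size=2)
spikes = torch.randn(32, 100, 50)  # (batch, time bins, neurons)
behavior = decoder(spikes)         # (32, 2): e.g., cursor x/y at the last bin
```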
### Synthetic Data Generation

Use VAEs or GANs to generate realistic surrogate neural data.
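As one option, a minimal VAE sketch for vectors of population activity; the layer sizes and latent dimension are illustrative, not a recommended architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, n_neurons=100, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_neurons, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, latent_dim)
        self.to_logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, n_neurons)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    recon_err = F.mse_loss(recon, x, reduction='sum')             # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL to N(0, I)
    return recon_err + kl
```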
## Getting Started
Install PyTorch:
```bash
# CPU version
conda install pytorch torchvision torchaudio cpuonly -c pytorch

# GPU version (CUDA 11.8)
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
```
Basic example:
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Create synthetic data
X = torch.randn(100, 10)
y = torch.randint(0, 2, (100,))

# Define model
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 2)
)

# Train
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    outputs = model(X)
    loss = criterion(outputs, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
## Tips

- Always move the model and its data to the same device (CPU/GPU)
- Use `model.train()` and `model.eval()` modes (see the sketch after this list)
- Save checkpoints during long training runs
- Use `torch.no_grad()` during evaluation
- Profile code to find bottlenecks
- Use `DataLoader` for efficient batching
- Consider PyTorch Lightning for cleaner training code
- Start with small models and scale up
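A short sketch tying several of these tips together, reusing `model`, `X`, `y`, and `optimizer` from the basic example above (the checkpoint path and dictionary keys are illustrative):

```python
# Save a checkpoint (path and keys are illustrative)
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, 'checkpoint.pt')

# Evaluate without tracking gradients
model.eval()
with torch.no_grad():
    preds = model(X).argmax(dim=1)
    accuracy = (preds == y).float().mean().item()
model.train()  # switch back before any further training
```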