Jupyter (formerly IPython Notebook) is an open-source web application for creating and sharing documents that contain live code, equations, visualizations, and narrative text. It has become a standard tool for interactive data science and research computing.
Why Jupyter?
- Interactive: Write code, run it, see results immediately
- Literate Programming: Mix code with markdown explanations
- Visualization: Display plots and figures inline
- Reproducible: Share notebooks with all code and outputs
- Multi-Language: Supports Python, R, Julia, and more
Key Features
Code Cells
Execute Python code interactively:
import numpy as np
import matplotlib.pyplot as plt
# Generate data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Plot inline
plt.plot(x, y)
plt.show()
Markdown Cells
Document your analysis with rich text:
- Headers, lists, bold, italic
- LaTeX equations: $E = mc^2$
- Links and images
- Code blocks with syntax highlighting
Rich Output
Display diverse content types:
- Plots and figures
- DataFrames (pretty HTML tables)
- Images and videos
- Interactive widgets
- HTML and JavaScript
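Rich output is not limited to library objects such as DataFrames. Jupyter looks for special methods like `_repr_html_` on the value a cell returns, so your own classes can render as HTML too. The sketch below is a minimal illustration: the `ConditionSummary` class and the numbers in it are hypothetical and made up for the example.

```python
from html import escape


class ConditionSummary:
    """Hypothetical container for per-condition mean reaction times."""

    def __init__(self, rows):
        # rows: list of (condition_name, mean_reaction_time_ms) pairs
        self.rows = rows

    def _repr_html_(self):
        # Jupyter calls this method (if present) to render the object as HTML.
        body = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{mean:.1f}</td></tr>"
            for name, mean in self.rows
        )
        header = "<tr><th>condition</th><th>mean RT (ms)</th></tr>"
        return f"<table>{header}{body}</table>"

    def __repr__(self):
        # Plain-text fallback used in terminals and logs.
        return f"ConditionSummary({self.rows!r})"


# Illustrative values only, not real data.
summary = ConditionSummary([("control", 512.3), ("treatment", 478.9)])
summary  # as the last expression of a notebook cell, this renders as a table
```

Outside a notebook the same object falls back to its plain `__repr__`, so the class stays usable in scripts.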
Installation & Usage
Install Jupyter:
conda install jupyter
# or
pip install jupyter
Launch notebook server:
jupyter notebook
# or for JupyterLab (modern interface; requires the jupyterlab package)
jupyter lab
JupyterLab vs Classic Notebook
JupyterLab (recommended):
- Modern interface with multiple panels
- Integrated file browser and terminal
- Extensions and themes
- Better for complex projects
Classic Notebook:
- Simpler, single-document interface
- Lighter weight
- Good for quick analyses
Common Workflows in Research
Exploratory Analysis
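# Load data
import pandas as pd
from IPython.display import display
data = pd.read_csv('experiment.csv')
# Quick exploration: a cell only shows its last expression automatically,
# so display() is used to show both tables from the same cell
display(data.head())
display(data.describe())
# Visualize: overlaid reaction-time histograms, one per condition
data.groupby('condition')['reaction_time'].hist(alpha=0.5)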
Parameterized Notebooks
Use tools like Papermill to run notebooks with different parameters:
papermill analysis.ipynb output.ipynb -p subject_id S01
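The same run can be scripted from Python when there are many parameter values. The sketch below assumes Papermill is installed and that `analysis.ipynb` has a cell tagged `parameters` defining `subject_id`. The helper names `papermill_jobs` and `run_all`, the `runs` output directory, and the subject IDs are hypothetical.

```python
from pathlib import Path


def papermill_jobs(template, subject_ids, out_dir="runs"):
    """Build one (input, output, parameters) job per subject."""
    stem = Path(template).stem
    jobs = []
    for subject_id in subject_ids:
        output = Path(out_dir) / f"{stem}_{subject_id}.ipynb"
        jobs.append((template, str(output), {"subject_id": subject_id}))
    return jobs


def run_all(jobs):
    """Execute every job with Papermill's Python API."""
    import papermill as pm  # imported lazily so job planning works without it

    for input_path, output_path, parameters in jobs:
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        pm.execute_notebook(input_path, output_path, parameters=parameters)


# Hypothetical subject IDs; run_all(jobs) would execute the notebooks.
jobs = papermill_jobs("analysis.ipynb", ["S01", "S02", "S03"])
```

Each executed copy keeps its own outputs, so a failed subject can be inspected without rerunning the others.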
Publishing Results
Convert notebooks to:
- HTML reports
- PDF documents
- Slideshows
- Blog posts (with Hugo, Jekyll)
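These conversions are typically done with nbconvert. The sketch below maps the targets listed above to nbconvert's `--to` formats and builds the corresponding command. The helper names and the dictionary keys are hypothetical. Exporting Markdown for a static-site generator is one possible route for blog posts, not the only one, and PDF export additionally needs a LaTeX installation.

```python
import subprocess

# Target formats from the list above, mapped to nbconvert's --to values.
NBCONVERT_FORMATS = {
    "html report": "html",
    "pdf document": "pdf",      # requires a LaTeX toolchain
    "slideshow": "slides",
    "blog post": "markdown",    # Markdown can be fed to Hugo or Jekyll
}


def nbconvert_command(notebook, target):
    """Return the argv list for converting `notebook` to `target`."""
    return ["jupyter", "nbconvert", "--to", NBCONVERT_FORMATS[target], notebook]


def convert(notebook, target):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_command(notebook, target), check=True)


command = nbconvert_command("analysis.ipynb", "slideshow")
```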
Best Practices
- Restart & Run All: Confirm the notebook runs top to bottom from a fresh kernel
- Clear Outputs: Before committing to version control
- Short Cells: Keep individual cells focused
- Document: Explain what and why, not just how
- Version Control: Commit .ipynb files to Git
- Virtual Environments: Use separate environments per project
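Clearing outputs before a commit can be automated. A notebook file is JSON, so a minimal sketch, assuming the nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`), needs only the standard library. The function names are hypothetical.

```python
import json


def clear_outputs(notebook):
    """Return a copy of a notebook dict with all code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def clear_file(path):
    """Strip outputs from the .ipynb file at `path`, rewriting it in place."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(clear_outputs(notebook), handle, indent=1)
        handle.write("\n")


# Tiny in-memory notebook used to demonstrate the behaviour.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["1 + 1"], "execution_count": 3,
         "outputs": [{"output_type": "execute_result"}]},
    ],
    "nbformat": 4,
}
stripped = clear_outputs(example)
```

In practice, `jupyter nbconvert --clear-output --inplace` does the same job from the command line. The sketch mainly shows what "clearing" changes in the file.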
Tips
- Use keyboard shortcuts (Esc for command mode, Enter for edit mode)
- Learn magic commands: %matplotlib inline, %timeit, %%time
- Use tab completion and ? for documentation
- Split complex analyses across multiple notebooks
- Use %load_ext autoreload for developing packages
- Consider nbconvert for generating reports