GIN (G-Node Infrastructure) is a free data management service for versioned, reproducible handling of scientific data. It is built on Git and git-annex, is tailored to neuroscience research, and works for any field with large datasets.
Why GIN?
- Version Control for Data: Git-like workflow for datasets
- Large File Support: Tracks large files through git-annex, which versions a small pointer in Git and stores the file content separately
- Free Storage: Generous storage quotas for researchers
- Neuroscience Focus: Designed with neuroscience workflows in mind
- DOI Integration: Publish datasets with permanent identifiers
Key Features
- Web interface for browsing and managing data
- Git integration for command-line workflows
- Support for large files through git-annex
- Issue tracking and wikis for collaboration
- Public and private repositories
Getting Started
- Create an account at gin.g-node.org
- Install the GIN client or use Git with git-annex
- Create a repository for your dataset
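- Upload data with the GIN client (gin upload) or with Git and git-annex; a plain git push sends only the Git history, not the annexed file content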
- Share with collaborators or make public
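The steps above can be sketched as a small Python helper that lists the GIN client commands for a first upload from an existing dataset directory. This is a sketch under assumptions, not an official workflow: it assumes the gin client is installed and that its login, create --here, and upload subcommands behave as described in the client's own help output (run gin help to confirm). The function names and the example repository name are hypothetical.

```python
import shlex
import subprocess


def plan_first_upload(repo_name, description, paths=(".",)):
    """Return the gin client commands for a first upload, in order.

    Assumes the commands are run from inside the dataset directory.
    """
    if not repo_name:
        raise ValueError("repo_name must not be empty")
    return [
        ["gin", "login"],                                     # prompts for credentials
        ["gin", "create", "--here", repo_name, description],  # assumed flag: use the current directory
        ["gin", "upload", *paths],                            # commit and send file content
    ]


def format_plan(commands):
    """Render each command as a shell-quoted string for review."""
    return [shlex.join(cmd) for cmd in commands]


def run_plan(commands, cwd=".", dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical repository name and description.
    run_plan(plan_first_upload("pilot-recordings", "Pilot ephys recordings"))
```

The dry-run default prints the commands instead of executing them, so the plan can be reviewed before anything is sent to the server.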
GIN vs. GitHub
- GIN: Optimized for large data files, neuroscience community
- GitHub: Optimized for code, broader software community
- Use Both: Store code on GitHub, data on GIN, link them together
Best Practices
- Use clear naming conventions for data files
- Document data structure in README files
- Use Git tags to mark dataset versions
- Archive final versions with DOIs before publication
- Link GIN datasets to GitHub code repositories
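Marking a dataset version with a Git tag takes two commands: create an annotated tag, then push it to the remote. The sketch below builds both in Python. The vMAJOR.MINOR.PATCH naming rule and the function name tag_commands are assumptions for illustration, and origin is assumed to be the GIN remote.

```python
import re

# Hypothetical convention: version tags look like v1.0.0
VERSION_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def tag_commands(version, message, remote="origin"):
    """Return the git commands that create and publish an annotated tag."""
    if not VERSION_PATTERN.fullmatch(version):
        raise ValueError(f"expected a tag like v1.0.0, got {version!r}")
    return [
        ["git", "tag", "-a", version, "-m", message],  # annotated tag on the current commit
        ["git", "push", remote, version],              # publish only this tag
    ]


if __name__ == "__main__":
    for cmd in tag_commands("v1.0.0", "Dataset as submitted"):
        print(" ".join(cmd))
```

A tag marks a commit in the Git history; the annexed file content for that commit still has to be uploaded separately.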
Tips
- Use the GIN client for easier large file management
- Add metadata files to describe your datasets
- Make repositories public after paper acceptance
- Use organizations for lab or project group data
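A metadata file can be as simple as a JSON document in the repository root. The sketch below shows one possible layout, not a GIN requirement: the file name dataset_metadata.json, every field name, and the example URL are hypothetical. The optional code_repository field is one way to record the link from a dataset to its analysis code.

```python
import json
from pathlib import Path


def build_metadata(title, authors, description, license_name, code_repository=None):
    """Assemble a metadata dictionary (hypothetical field names)."""
    if not title or not authors:
        raise ValueError("title and at least one author are required")
    metadata = {
        "title": title,
        "authors": list(authors),
        "description": description,
        "license": license_name,
    }
    if code_repository:
        # Link from the dataset to the code that analyses it.
        metadata["code_repository"] = code_repository
    return metadata


def write_metadata(directory, metadata, filename="dataset_metadata.json"):
    """Write the metadata as indented JSON and return the file path."""
    path = Path(directory) / filename
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    example = build_metadata(
        title="Pilot ephys recordings",
        authors=["A. Researcher"],
        description="Raw recordings and session notes.",
        license_name="CC-BY-4.0",
        code_repository="https://github.com/example-lab/analysis-code",  # placeholder URL
    )
    print(json.dumps(example, indent=2))
```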