Lab Facilities & Resources
Access high-performance computing resources and best practices for research.
The Cyber Innovation Compute Cluster
The Cyber Innovation Cluster is optimized for Distributed Data Parallel (DDP) PyTorch training, managed via SLURM.
- Architecture: 1 login node, 4 compute nodes
- GPU Hardware: NVIDIA RTX 5070 Ti (single GPU per node)
- CPU Hardware: ~20 cores per compute node
- Orchestration: SLURM Workload Manager
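As a rough illustration of how a DDP training script might discover its place in a multi-node job, the sketch below reads SLURM's standard per-task environment variables (`SLURM_PROCID`, `SLURM_NTASKS`, `SLURM_LOCALID`). The names `DDPLayout` and `ddp_layout_from_slurm` are hypothetical, and the sketch assumes the job is launched with `srun` using one task per GPU; none of this is taken from the cluster's actual configuration.

```python
import os
from typing import Mapping, NamedTuple, Optional


class DDPLayout(NamedTuple):
    rank: int        # global rank of this process across all nodes
    world_size: int  # total number of processes in the job
    local_rank: int  # index of this process on its own node


def ddp_layout_from_slurm(env: Optional[Mapping[str, str]] = None) -> DDPLayout:
    """Derive the DDP process layout from SLURM's per-task variables.

    Falls back to a single-process layout when the variables are absent,
    for example when the script is run outside of srun.
    """
    env = os.environ if env is None else env
    return DDPLayout(
        rank=int(env.get("SLURM_PROCID", 0)),
        world_size=int(env.get("SLURM_NTASKS", 1)),
        local_rank=int(env.get("SLURM_LOCALID", 0)),
    )


if __name__ == "__main__":
    layout = ddp_layout_from_slurm()
    print(layout)
    # In a training script these values would typically be passed on, e.g.:
    #   torch.distributed.init_process_group(
    #       backend="nccl", rank=layout.rank, world_size=layout.world_size)
    #   torch.cuda.set_device(layout.local_rank)
```

With one GPU per node, `local_rank` would be 0 on every node and `world_size` would equal the number of nodes requested. PyTorch's default `env://` rendezvous also needs `MASTER_ADDR` and `MASTER_PORT` to be set, which this sketch leaves to the batch script.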
SLURM Quick Reference
sbatch <script.sh>    Submit a job to the cluster queue.
squeue -u $USER       List your currently active/queued jobs.
scancel <job_id>      Cancel a specific job.
sinfo -Nel            View node information and status.
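To make the shape of a script passed to `sbatch` concrete, here is a minimal sketch that renders one as text. The helper `render_sbatch_script` and its defaults are hypothetical. The `#SBATCH` options are standard SLURM flags, but `--gres=gpu:1` assumes the GPUs are registered under the generic resource name `gpu`, which should be checked against the cluster's configuration.

```python
def render_sbatch_script(
    job_name: str,
    command: str,
    nodes: int = 1,
    cpus_per_task: int = 4,
    time_limit: str = "01:00:00",
) -> str:
    """Return the text of a SLURM batch script requesting one GPU per node."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=1",  # one process per node (one GPU each)
        "#SBATCH --gres=gpu:1",         # assumption: GPUs exposed as the 'gpu' GRES
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=%x-%j.out",   # %x = job name, %j = job ID
        "",
        f"srun {command}",              # srun starts one task per allocated node
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # 'train.py' is a placeholder for your own training entry point.
    print(render_sbatch_script("ddp-demo", "python train.py", nodes=2))
```

Saving the printed text as, say, `job.sh` and running `sbatch job.sh` would queue it. The defaults are deliberately small so that a request grows only when an experiment needs it.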
Data Management & Storage Hygiene
# Upload datasets with progress tracking
rsync -avhP /local/path/ user@login:~/datasets/folder/
# Purge Pip & Conda caches
pip cache purge && conda clean --all -y
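Before purging, it can help to see how much space the caches actually occupy. The sketch below sums file sizes under a few cache directories. The paths listed are the usual Linux defaults for pip, Hugging Face, and Torch and are assumptions here; they may be relocated on this cluster through variables such as `XDG_CACHE_HOME`, `HF_HOME`, or `TORCH_HOME`. The function name `dir_size_bytes` is hypothetical.

```python
import os
from pathlib import Path

# Assumed default cache locations on Linux; adjust if they were relocated.
ASSUMED_CACHE_DIRS = ["~/.cache/pip", "~/.cache/huggingface", "~/.cache/torch"]


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under root (0 if root does not exist).

    Symlinks are skipped so that linked files are not counted twice.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.is_file() and not path.is_symlink():
                    total += path.stat().st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                continue
    return total


if __name__ == "__main__":
    for entry in ASSUMED_CACHE_DIRS:
        cache_dir = Path(entry).expanduser()
        size_mib = dir_size_bytes(cache_dir) / (1024 * 1024)
        print(f"{cache_dir}: {size_mib:.1f} MiB")
```

This only reports sizes and deletes nothing, so it is safe to run before deciding which caches to purge.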
Cluster Etiquette & Policies
- Never run training on the login node. Use it only for compiling code and submitting jobs.
- Request only what you need. Release nodes promptly after experiments.
- Maintain reproducible environments (Conda/requirements.txt).
- Clean up caches on shared storage (Hugging Face, Torch, pip).
- Use physical desk nodes (c1, c2) for interactive work.
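One way to guard against accidentally training on the login node is a check at the top of the training script. The sketch below relies on `SLURM_JOB_ID`, which SLURM sets inside jobs started through `sbatch`, `srun`, or `salloc`. The function names are hypothetical, and a direct login on a desk node would not set this variable, so the guard is meant only for batch training scripts.

```python
import os
import sys
from typing import Mapping, Optional


def inside_slurm_job(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when SLURM_JOB_ID is set, i.e. we run inside an allocation."""
    env = os.environ if env is None else env
    return bool(env.get("SLURM_JOB_ID"))


def require_slurm_job(env: Optional[Mapping[str, str]] = None) -> None:
    """Exit with an error message unless running inside a SLURM job."""
    if not inside_slurm_job(env):
        sys.exit("Refusing to train outside a SLURM job; submit with sbatch instead.")


# Typical use: call require_slurm_job() as the first statement of the
# training script's entry point, before any data or model is loaded.
```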
