Welcome to the Swarm Cluster Documentation.
Swarm is an NSF MRI grant-funded cluster.
The Swarm hardware cluster consists of:
- 60 compute nodes, each with 8 cores (Xeon 5355, 2.66 GHz), 16 GB of RAM, and 250 GB of local disk.
- A 70 TB cluster file system, which includes:
  - 2 metadata servers, each with 8 cores and 16 GB of RAM, sharing ~1 TB of attached disk for active/passive failover.
  - 8 I/O nodes, each with 8 cores, 16 GB of RAM, and 9 TB of dedicated attached disk.
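As a quick sanity check, the aggregate figures implied by the hardware list can be worked out as in the short Python sketch below. The constant and function names are illustrative only and are not part of any Swarm tooling. The raw I/O disk total comes to 72 TB, which is close to the 70 TB quoted for the file system; the difference is presumably rounding or file system overhead, which this page does not state.

```python
# Per-node figures copied from the hardware list above.
COMPUTE_NODES = 60
CORES_PER_NODE = 8
RAM_GB_PER_NODE = 16
LOCAL_DISK_GB_PER_NODE = 250

IO_NODES = 8
DISK_TB_PER_IO_NODE = 9


def cluster_totals():
    """Return aggregate capacity implied by the per-node figures."""
    return {
        "compute_cores": COMPUTE_NODES * CORES_PER_NODE,
        "compute_ram_gb": COMPUTE_NODES * RAM_GB_PER_NODE,
        "local_disk_gb": COMPUTE_NODES * LOCAL_DISK_GB_PER_NODE,
        "io_raw_disk_tb": IO_NODES * DISK_TB_PER_IO_NODE,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

These totals cover the compute and I/O nodes only; the two metadata servers are left out because they do not contribute user-visible compute or storage capacity in the list above.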
The software consists of:
- Rocks cluster software (maintenance, package distribution, and OS distribution)
- CentOS Linux (OS)
- Lustre (shared cluster file system)
- Sun Grid Engine (job scheduling and resource management)
Before using the cluster, please read the following documentation. It may save you time.
- Policy Documentation: This document explains how the resources are shared. Some policies are enforced by software, so please read it to make sure that your jobs don't die.
- User Documentation: This document is a quick reference on how to get started using Swarm.
- Library: This has links to useful software and ideas on using clusters that may make your work easier.
NOTE: All papers resulting from use of the Swarm cluster need to acknowledge the NSF MRI grant with the following text:
"This work is supported in part by the National Science Foundation under NSF grant #CNS-0619337. Any opinions, findings, conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect those of the sponsors."
A copy of the paper in PDF format should be emailed to Kate Moruzzi (kate).
We also need a brief write-up that we can include in the final report, due June 2008.