Instructions for Users
Kindly do not run any job on the main/head node.
Please put all your data files in the directory /mnt/oss/your_user_name, and run your programs from there.
Please take regular backups of your data. If you need help with this, please e-mail Ipsita Satpathy. We do not have the resources to take backups for all user data, and accidents do happen.
Use a qsub script to run every job.
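A minimal sketch of such a script is given here for illustration. It writes a job script to myjob.sh, which you would then submit with qsub. The job name myjob, the program ./my_program, the parallel environment name "mpi", the h_rt wall-time request and the mpirun launcher are all assumptions, not settings confirmed for this cluster; run "qconf -spl" to see which parallel environments are actually defined, and adjust the rest to your own job.

```shell
# Run this from /mnt/oss/your_user_name so the job starts in your data directory.
cat > myjob.sh <<'EOF'
#!/bin/bash
# Lines beginning with "#$" are read by SGE as qsub options.
#$ -S /bin/bash
# Job name (placeholder).
#$ -N myjob
# Start the job in the directory it was submitted from.
#$ -cwd
# Merge stderr into stdout.
#$ -j y
# Queue for parallel jobs below the largejob.q thresholds.
#$ -q all.q
# Parallel environment and core count; "mpi" is an assumed PE name.
# The core count is a multiple of 16.
#$ -pe mpi 32
# Wall-time request (assumes the site uses the h_rt resource).
#$ -l h_rt=72:00:00

# NSLOTS is set by SGE to the number of slots granted to the job.
mpirun -np "$NSLOTS" ./my_program
EOF

# Submit with:
#   qsub myjob.sh
```

For a trial run, the same script can be pointed at test.q by changing the "-q" line before submitting.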
For jobs that are in the "qw" state and need only a change in the number of cores and/or the queue, use qalter instead of killing the job and resubmitting it.
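As an illustration of the qalter route, the sketch below wraps the command in a small helper. The function name move_pending_job, the job ID 12345 and the parallel environment name "mpi" are hypothetical; only the qalter options -q and -pe are standard SGE. The helper also refuses core counts that are not a multiple of 16.

```shell
# move_pending_job JOB_ID QUEUE PE_NAME CORES
# Changes the queue and core count of a job that is still waiting ("qw").
move_pending_job() {
    job_id=$1
    queue=$2
    pe_name=$3
    cores=$4

    # MPI jobs on this cluster must use multiples of 16 cores.
    if [ "$cores" -lt 16 ] || [ $((cores % 16)) -ne 0 ]; then
        echo "error: core count must be a multiple of 16, got $cores" >&2
        return 1
    fi

    # qalter accepts the same -q and -pe options as qsub.
    qalter -q "$queue" -pe "$pe_name" "$cores" "$job_id"
}

# Example (job ID and PE name are placeholders):
#   move_pending_job 12345 all.q mpi 32
```

Use qstat to find the job ID and to confirm the job is still in the "qw" state before altering it.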
Please use ONLY multiples of 16 cores for all your MPI jobs. This
isolates the nodes you are using from nodes other users are using, and
protects you from nodes that crash or hang because of errors in running
programs. It also protects you when the admin starts debugging or
removing other users' jobs; accidents can happen.
All parallel jobs requiring fewer than 48 cores and a wall time of less than 4 days should be put on the all.q queue.
All serial jobs (single-core jobs) should be run on serial.q.
If you need access to more than 48 cores per job, or more than 4 days of wall clock time to run a job, please e-mail Sumit Mookerjee to get access to largejob.q.
Please use test.q to run test jobs, to check they run as you expect before you place them on the production queues.
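The queue rules above can be sketched as a small helper that takes a core count and a wall time in hours and prints the matching production queue. The function name pick_queue and the "ask-admin" answer are hypothetical. The rules as written do not say where a job of exactly 48 cores or exactly 4 days belongs, so the sketch reports that case rather than guessing. It does not cover test.q, which is for trial runs of any kind.

```shell
# pick_queue CORES WALL_HOURS
# Prints serial.q, all.q, largejob.q, or ask-admin for boundary cases.
pick_queue() {
    cores=$1
    hours=$2
    limit_hours=$((4 * 24))   # 4 days of wall time

    if [ "$cores" -lt 1 ]; then
        echo "error: core count must be at least 1" >&2
        return 1
    fi

    # Single-core jobs always go to serial.q.
    if [ "$cores" -eq 1 ]; then
        echo "serial.q"
        return 0
    fi

    # Parallel jobs must use multiples of 16 cores.
    if [ $((cores % 16)) -ne 0 ]; then
        echo "error: core count must be a multiple of 16, got $cores" >&2
        return 1
    fi

    if [ "$cores" -gt 48 ] || [ "$hours" -gt "$limit_hours" ]; then
        # Needs access granted by the admin first.
        echo "largejob.q"
    elif [ "$cores" -lt 48 ] && [ "$hours" -lt "$limit_hours" ]; then
        echo "all.q"
    else
        # Exactly 48 cores or exactly 4 days: not covered by the stated rules.
        echo "ask-admin"
    fi
}

# Example:
#   pick_queue 32 24
```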
NOTE that jobs under the following conditions will be TERMINATED:
1. Any job on largejob.q with fewer than 48 cores.
2. Any job on all.q with more than 48 cores.
3. Any serial job on all.q or largejob.q.
4. Any job running on the head node.
5. Jobs not under the control of SGE.