What is the procedure to get access to the supercomputing facility at IUAC?
To access the supercomputing facility at IUAC, you have to request an account, including a short (~ 1 page) description of the proposed work, the software you require, and the resources you need (number of cores, amount of RAM, disk space, time) for a typical run.
You may send this information to sumit@iuac.res.in. If your request is approved, we will get back to you with details of how you can access the system.
What are KALKI and K2?
KALKI and K2 are the two systems that currently comprise the IUAC supercomputing facility.
Both systems are MPI clusters. KALKI has 96 compute nodes, 768 compute cores, and an 8-node, 6 TB PVFS2 file system, with a Linpack Rmax of 6.5 teraflops. K2 has 200 compute nodes, 3200 compute cores, and a 55 TB Lustre parallel file system, with a sustained Linpack rating of 62 teraflops. Distributed-memory, CPU-intensive jobs parallelized using MPI should run well on both clusters.
How do I connect to KALKI and K2?
Kindly contact the system administrators for access to the HPC facility. For new accounts, contact Sumit Mookerjee; for access issues with existing accounts, contact Ipsita Satpathy.
What is the hardware and OS configuration of the supercomputing facility at IUAC?
KALKI has 96 compute nodes with dual quad-core Xeon CPUs (768 cores in total at 3.0 GHz, with 16 GB RAM and 500 GB disk per node), a 20 Gb/s InfiniBand interconnect, and 6 TB of additional storage on a PVFS2 cluster. The KALKI head node is also a dual quad-core Xeon system, with 32 GB of RAM. The cluster is built using Rocks 5.1 and CentOS. The system supports both the GNU suite (GCC, GSL, and OpenMPI) and the Intel suite (ICC/ifort, MKL, and IMPI).
K2 has 200 compute nodes with dual octa-core Xeon CPUs (3200 cores in total at 2.4 GHz, with 64 GB RAM and 500 GB disk per node), a 40 Gb/s InfiniBand interconnect, and 55 TB of additional storage on a Lustre parallel file system. The K2 head node is also a dual octa-core Xeon system, with 128 GB of RAM. The cluster is built using Rocks 6.1 and CentOS 6.3, with Intel compilers, Intel MKL, and Intel MPI.
How do I submit jobs on KALKI or K2?
To submit jobs on KALKI or K2, use only the qsub command; jobs are submitted through the Sun Grid Engine (SGE) resource manager. The sample qsub script for KALKI should also work on K2 for most applications. For example, if your script is named "sample_script.txt", submit it with the following command:
$ qsub sample_script.txt
For a sample qsub script, look here (41.5 KB).
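If the linked sample is not at hand, the outline below shows what such a script typically contains. It is only a minimal sketch: the job name, executable name, and core count are placeholders, and depending on the MPI installation you may also need to pass a machine file to mpirun.

#!/bin/bash
#$ -N my_job                # job name (placeholder)
#$ -cwd                     # run the job in the current working directory
#$ -q all.q                 # queue to use (see the queue policy below)
#$ -pe mpich 16             # parallel environment and number of cores
# launch the MPI executable on the allocated slots; $NSLOTS is set by SGE
mpirun -np $NSLOTS ./my_application

Save these lines in a file (for example sample_script.txt) and submit it with qsub as shown above.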
How do I change the queue and parallel environment in the KALKI qsub script?
To decide which queue you should use, check the queue management policy here. Then add (or edit) lines like the following in your qsub script:
- #$ -q all.q
- #$ -cwd
- #$ -pe mpich 16
- The first line specifies the use of the "all.q" queue for calculations.
- The second line tells the scheduler to run the job in the current working directory (CWD), and the third line requests the MPICH parallel environment with 16 cores.
- On KALKI, the other available parallel environment is OpenMPI ORTE (-pe orte 8). Since KALKI compute nodes have 8 cores each, requesting 8 cores also means using one whole node for the calculation. On K2, whose nodes have 16 cores each, please request the core count in multiples of 16 (see the sketch below).
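For illustration, here is what the header lines might look like in the two cases; the core counts are examples only, and the queue name shown for K2 is an assumption (you can list the queues actually defined on a system with qconf -sql).

On KALKI, one full node with OpenMPI:

#$ -q all.q
#$ -cwd
#$ -pe orte 8               # ORTE environment: one whole 8-core KALKI node

On K2, two full nodes:

#$ -q all.q                 # assumed queue name; check qconf -sql on K2
#$ -cwd
#$ -pe mpich 32             # a multiple of 16, i.e. two whole 16-core K2 nodes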
How do I find out how many cores are free on KALKI and K2?
The following command tells you how many cores are available in the various queues:
qstat -g c
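The output looks roughly like the snippet below (the numbers are invented for illustration only). The AVAIL column shows the number of free cores (slots) in each queue, and TOTAL shows the queue's total core count:

CLUSTER QUEUE     CQLOAD   USED    RES  AVAIL  TOTAL aoACDS  cdsuE
--------------------------------------------------------------------
all.q               0.52    440      0    328    768      0      0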
What are the limits on number of cores and time for various queues?
For K2, please see this page.
For KALKI, the time and core limits for the various queues are tabulated below:

Queue | Core Limit | Time Limit | Usage
---|---|---|---
all.q | 8 to 40 | 96 hours (4 days) | All except WIEN2K jobs
largejob.q | 48 to 192 | 336 hours (14 days) | All except WIEN2K jobs
wien.q | 8 to 64 | No time limit | WIEN2K jobs only
test.q | 1 to 32 | 6 hours | Non-WIEN2K jobs, for testing purposes only
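As an example of reading this table: a non-WIEN2K MPI run on KALKI that needs 96 cores and up to two weeks of wall time belongs in largejob.q. The corresponding header lines would look something like this (96 cores is only an illustration; any request within the 48 to 192 core limit of that queue is acceptable):

#$ -q largejob.q
#$ -cwd
#$ -pe mpich 96             # must stay within largejob.q's 48 to 192 core limit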