HPC

Last modified by Administrator on Wed, 03/25/2020, 12:55 PM

The COARE's HPC service consists of a cluster of compute and storage servers that supports high-speed, resource-intensive computation and the processing of large datasets.

The system architecture for the COARE HPC service is detailed below:

[Figure: COARE HPC system architecture]

The HPC service uses Slurm as its batch scheduler. The cluster is divided into three partitions: Batch, Debug, and GPU. Below are the specifications per partition:

Batch (46 Nodes)

  • 24 cores, 48 threads
  • 250 GB RAM

Debug (2 Nodes)

  • 24 cores, 48 threads
  • 250 GB RAM

GPU (1 Node)

  • 12 cores, 24 threads
  • 250 GB RAM
  • 2 NVIDIA Tesla K80 GPUs
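
The partition figures above can be combined into cluster-wide totals. The following is a minimal Python sketch of that arithmetic, not an official COARE tool. It assumes that the core, thread, RAM, and GPU figures listed for each partition are per-node values; the `Partition` class and `totals` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Partition:
    """One partition; all figures are ASSUMED to be per node."""
    name: str
    nodes: int
    cores_per_node: int
    threads_per_node: int
    ram_gb_per_node: int
    gpu_cards_per_node: int = 0


# Values copied from the partition list above.
PARTITIONS: List[Partition] = [
    Partition("Batch", 46, 24, 48, 250),
    Partition("Debug", 2, 24, 48, 250),
    Partition("GPU", 1, 12, 24, 250, gpu_cards_per_node=2),
]


def totals(partitions: List[Partition]) -> Dict[str, int]:
    """Sum per-node figures over every node of every partition."""
    return {
        "nodes": sum(p.nodes for p in partitions),
        "cores": sum(p.nodes * p.cores_per_node for p in partitions),
        "threads": sum(p.nodes * p.threads_per_node for p in partitions),
        "ram_gb": sum(p.nodes * p.ram_gb_per_node for p in partitions),
        "gpu_cards": sum(p.nodes * p.gpu_cards_per_node for p in partitions),
    }


if __name__ == "__main__":
    for key, value in totals(PARTITIONS).items():
        print(f"{key}: {value}")
```

Under the per-node assumption, this gives 49 nodes, 1164 cores, 2328 threads, 12250 GB of RAM, and 2 GPU cards across the three partitions.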

The home directory (/home) is the COARE's network filesystem, built on GlusterFS, and serves as each user's home directory. Users' scripts and input data are stored here.

The scratch directories (/scratch1 and /scratch2) are the COARE's parallel filesystems, built on Lustre. They are designed to handle users' I/O-heavy workloads. The output of running jobs, including intermediate files, is stored here.
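
The split between /home and the scratch directories can be sketched in Python as a small path helper. This is only an illustration: the per-user and per-job subdirectory layout shown here is an assumption, not a documented COARE convention, and `job_paths` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from typing import Dict

HOME_ROOT = PurePosixPath("/home")
SCRATCH_ROOTS = (PurePosixPath("/scratch1"), PurePosixPath("/scratch2"))


def job_paths(user: str, job_name: str, scratch_index: int = 0) -> Dict[str, PurePosixPath]:
    """Return where a job's files would live under the split described above.

    ASSUMPTION: each user has a <root>/<user>/<job_name> subdirectory on both
    filesystems; the actual layout on the cluster may differ.
    """
    scratch = SCRATCH_ROOTS[scratch_index]
    return {
        # Scripts and input data belong in the home directory.
        "scripts_and_input": HOME_ROOT / user / job_name,
        # Job output and intermediate files belong on a scratch filesystem.
        "output": scratch / user / job_name,
    }


if __name__ == "__main__":
    for purpose, path in job_paths("juan", "sim01").items():
        print(f"{purpose}: {path}")
```

Keeping I/O-heavy output on scratch and only scripts and input data in /home matches the roles the two filesystems are described as having.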

NOTE: As part of its efforts to upgrade the current infrastructure, the COARE Team has started to implement the saliksik cluster, which comprises the COARE's next generation of HPC CPUs and GPUs. For more information on saliksik, visit this Wiki page.

Default Allocation for HPC service

The COARE provides a default allocation for each user to ensure the fair and equitable use of the Facility. For more information, please read the COARE's Acceptable Use Policy (AUP).

The table below summarizes the default allocation provided for each COARE HPC user:

  • CPU: 240 logical cores for 7 days
  • Storage (/home): 100 GB usable for home directory
  • Scratch directories (/scratch1 and /scratch2): 5 TB for each scratch directory
  • GPU: 2 GPUs for 7 days
  • Maximum running jobs: 30 jobs
  • Maximum submitted jobs: 45 jobs
  • Job queueing time: No guarantee; depends on the status of the queue and the availability of the requested resource/s

Any requests for allocation increase will be subject to the COARE Team's evaluation and approval.
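
Before submitting work, a user could compare a planned request against the default allocation. The Python sketch below shows one way to do that. It is not how the COARE or Slurm enforces limits; it assumes that "240 logical cores for 7 days" and "2 GPUs for 7 days" mean a cap on concurrently used resources with a 7-day maximum run time. The `Allocation`, `Request`, and `violations` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Allocation:
    """Default per-user limits, copied from the allocation table above."""
    logical_cores: int = 240
    gpus: int = 2
    max_days: float = 7
    max_running_jobs: int = 30
    max_submitted_jobs: int = 45


@dataclass(frozen=True)
class Request:
    """What a user plans to use at one time (illustrative structure)."""
    logical_cores: int
    gpus: int
    days: float
    running_jobs: int
    submitted_jobs: int


def violations(req: Request, alloc: Allocation = Allocation()) -> List[str]:
    """List every limit the request exceeds; an empty list means it fits."""
    checks = [
        ("logical cores", req.logical_cores, alloc.logical_cores),
        ("GPUs", req.gpus, alloc.gpus),
        ("days", req.days, alloc.max_days),
        ("running jobs", req.running_jobs, alloc.max_running_jobs),
        ("submitted jobs", req.submitted_jobs, alloc.max_submitted_jobs),
    ]
    return [
        f"{label}: requested {wanted}, limit {limit}"
        for label, wanted, limit in checks
        if wanted > limit
    ]


if __name__ == "__main__":
    plan = Request(logical_cores=480, gpus=0, days=2, running_jobs=5, submitted_jobs=10)
    for problem in violations(plan) or ["request fits the default allocation"]:
        print(problem)
```

A request that returns a non-empty list would need an allocation increase, which is subject to the COARE Team's evaluation and approval.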
