Expertise

Research Computing at CU Boulder consists of a small group of computational scientists, high-performance computing specialists, and system administrators with the mission to provide leadership in developing, deploying, and operating an integrated cyberinfrastructure. This cyberinfrastructure consists of high-performance computing, on-premises and commercial cloud computing, storage, and high-speed networking that supports research, collaboration, and discovery. Research Computing contributes to the educational mission of the university by providing training workshops and consultation services for cyberinfrastructure-related topics.

Compute
  • Research Computing’s third-generation supercomputer, Alpine, is funded by contributions from the University of Colorado Boulder, Colorado State University, and the University of Colorado Anschutz (award 2201538). As of 2023, the Alpine supercomputer has 22,692 cores. Hardware includes 347 general compute nodes, each with 32-64 cores on AMD Milan CPUs, 256 GB of RAM, and a local SSD; 12 GPU nodes, each with 3x NVIDIA A100 GPUs; 8 GPU nodes, each with 3x AMD MI100 GPUs; and 22 high-memory nodes, each with 48-64 cores and 1 TB of RAM. The general compute nodes are connected through a high-performance network based on Mellanox InfiniBand with a bandwidth of 200 Gb/s. A 2 PB high-performance IBM GPFS file system is provided.
  • Research Computing offers a condo computing service that enables researchers to purchase and own compute nodes that are operated as part of a cluster, named “Blanca.” The aggregate cluster is made available to all condo partners while maintaining priority for the owner of each node.
  • Research Computing hosts a free-to-use on-premises cloud service, called CUmulus, which supports use cases not well-suited for HPC, such as web servers, databases, and long-running services. The CUmulus service includes access to a Virtual Private Cloud (VPC), which provides users with a logically isolated section of the cloud with a small number of externally routable floating IP addresses. Within this VPC, customers are given an allocation of CPU cores, memory, and storage, which can be used to host virtual machines and volumes for their workloads. The CU Research Computing group would like to acknowledge support of this project from the National Science Foundation (OAC-1925766). In addition to the CUmulus service, Research Computing can facilitate researcher access to offsite commercial cloud resources at negotiated rates.

Networking

The current CU Boulder network is a 40 Gbps fiber core with Cat 5 or higher wiring throughout campus. Research Computing has created an 80 Gbps Science-DMZ to connect the Alpine supercomputer to storage and to bring individual dedicated 10 Gbps circuits to various locations as needed. CU Boulder participates in I2 (the Internet2 higher education, government, and vendor research computing consortium) and is an active member of other networks. Research Computing has begun to provide campus researchers with a leading-edge network that meets their needs and facilitates collaboration, high-performance data exchange, access to co-location facilities, remote mounts to storage, and real-time communications.

File Transfer

Research Computing has several data transfer nodes dedicated to moving large volumes of data.

The CU Office of Information Technology also offers a file transfer service with a web interface, which provides an ideal way to transfer large files to collaborators. Files are uploaded to a server and a link to download the file is emailed to an on- or off-campus user.

Storage

Each Research Computing user has a 2 GB home directory and a 250 GB projects directory, which are backed up regularly. Alpine users also have a 10 TB scratch directory.

PetaLibrary Storage Services

The PetaLibrary is a CU Research Computing service supporting the storage, archival, and sharing of research data. It is available at a subsidized cost to any researcher affiliated with the University of Colorado Boulder. The two main categories of PetaLibrary storage are Active and Archive, for data requiring frequent access or infrequent access, respectively. Both tiers are stored on spinning disk and are accessible to researchers on resources managed by Research Computing. The cost for CU researchers is $45/TB/year for Active, $20/TB/year for Archive, and $65/TB/year for Active+Archive.
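As a rough budgeting aid, the rates above can be turned into a small cost estimate. The sketch below is illustrative only: the function name `petalibrary_cost` is hypothetical, and it assumes simple linear pricing (rate × terabytes × years) with no minimum allocation, billing increments, or rate changes over time, none of which are stated here.

```python
# Rates in USD per TB per year, as quoted in the paragraph above.
RATES_USD_PER_TB_YEAR = {
    "active": 45,
    "archive": 20,
    "active+archive": 65,
}


def petalibrary_cost(tier: str, terabytes: float, years: float = 1.0) -> float:
    """Estimate PetaLibrary cost for a CU researcher.

    Hypothetical helper: assumes cost scales linearly with capacity and
    duration, which is an assumption and not a documented billing rule.
    """
    key = tier.strip().lower()
    if key not in RATES_USD_PER_TB_YEAR:
        raise ValueError(f"unknown tier: {tier!r}")
    if terabytes < 0 or years < 0:
        raise ValueError("terabytes and years must be non-negative")
    return RATES_USD_PER_TB_YEAR[key] * terabytes * years


if __name__ == "__main__":
    # Example: 10 TB of Active storage budgeted over a 3-year project.
    print(petalibrary_cost("Active", 10, 3))
```

Under these assumptions, the combined Active+Archive rate equals the sum of the two individual rates, so a copy held in both tiers costs the same as purchasing each tier separately.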

Through a collaboration with the CU Libraries, the PetaLibrary can also host the publication and long-term archival of large datasets. The datasets are assigned unique digital object identifiers (DOIs) that are searchable and accessible via the “CU Scholar” institutional repository.

OnDemand

Open OnDemand is a browser-based, integrated, single-access portal for CURC high-performance computing (HPC) resources. It provides a graphical interface to view, edit, download, and upload files; manage and create job templates for CURC’s computing clusters; and access CURC interactive applications (visualization nodes, MATLAB, RStudio, and Jupyter Notebooks), all via a web browser.

Center for Research Data and Digital Scholarship (CRDDS)

The Center for Research Data & Digital Scholarship (CRDDS) is a collaboration between Research Computing and University Libraries, offering a full range of data services for both university and community members. The aim of CRDDS is to provide support to community members in areas related to data-intensive research. CRDDS fulfills this mission by providing education and support on such issues as data discovery, reuse, access, publication, storage, visualization, curation, cleaning, and preservation, as well as digital scholarship. CRDDS is located in Norlin Library on Main Campus at CU Boulder.

CRDDS offers many opportunities to students working with data. The expert staff work hand-in-hand with researchers via weekly office hours, one-on-one consultations, and group trainings in programming, data visualization and more. CRDDS serves as a resource for data management, manipulation and publication for trainees working through undergraduate and graduate coursework.

Examples of workshops/trainings CRDDS has offered include:

  • High performance computing
  • Programming in R
  • Programming in Python
  • Containerization
  • Data mining

Facilities Statement (PDF, Updated 8/14/23)