2025 Midwest RCD Annual Meeting has ended
Pro-tips!
  • Click on SCHEDULE (next to SPEAKERS, COMMITTEE, etc) and then choose EXPANDED view to see the abstracts of each talk and the speaker’s name.
  • Make your profile active to others -> SETTINGS -> Under Privacy & Email, turn the switch for MAKE MY PROFILE & SCHEDULE PUBLIC to green. Make sure to SAVE your changes. There are over 60 attendees - only ~30 have made their profiles active as of 4/25.
Midwest RCD is funded, in part, by the National Science Foundation, see the sponsors tab.

PRESENTATION 1: Stephen Deems, Pittsburgh Supercomputing Center (12 mins)
Building On-Ramps to the NSF ACCESS Ecosystem
The NSF ACCESS program connects your institution’s researchers and educators to a range of national-scale computing systems at no cost to you or them. This talk will introduce campus IT staff to the ACCESS program and the services it offers. We will also describe ACCESS On-Ramps, a new feature developed by the ACCESS Allocations team that lets campus IT staff share ACCESS resource information with their campus community. Our goal is to help researchers find out about ACCESS in the first place they go for IT information: their campus IT websites.

PRESENTATION 2: Geoffrey Lentner, Purdue University (12 mins)
HyperShell: A user-facing tool for high-throughput scheduling of small jobs
We’ve been crafting a user-facing tool for high-throughput computing scenarios for the past 5 years. The HyperShell software is a command-line tool that fits somewhere between GNU Parallel and HTCondor, with HPC user ergonomics and features in mind. It allows users to run millions of small jobs, automate "pilot job" workflows, or make better use of large GPUs by packing smaller jobs onto them in parallel. Currently, we are focused on user education, workshops, and outreach. The goal of this presentation is to share the essential concepts and functionality of HyperShell with other center staff who may be in a position to assist researchers who might benefit from this capability.
https://hypershell.org
https://github.com/hypershell/hypershell
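To make the "many small jobs" idea concrete, here is a minimal Python sketch of the general pattern: a fixed pool of workers pulls shell commands from a task list so that many short tasks share one allocation. This is not HyperShell's interface or implementation; the function names `run_task` and `run_tasks` and the `workers` parameter are hypothetical and use only the Python standard library.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_task(command: str) -> tuple[str, int, str]:
    """Run one shell command and return (command, exit code, captured stdout)."""
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return command, completed.returncode, completed.stdout


def run_tasks(commands: list[str], workers: int = 4) -> list[tuple[str, int, str]]:
    """Run the commands with at most `workers` in flight at once.

    Results are returned in the same order as the input list, so failed
    tasks (non-zero exit codes) can be matched back to their commands.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, commands))


if __name__ == "__main__":
    # Hypothetical task list: in practice this might be one line per input file.
    for command, code, _ in run_tasks(["echo one", "echo two", "exit 3"], workers=2):
        print(f"{code}\t{command}")
```

A dedicated tool adds what this sketch leaves out, such as distributing tasks across nodes, persisting task state, and retrying failures.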

PRESENTATION 3: Joseph Tang, Ohio Supercomputer Center (12 mins)
Backup Validation at the Ohio Supercomputer Center: A Belt and Suspenders Approach
We present our current large-scale data restoration and backup validation scheme at the Ohio Supercomputer Center (OSC). Approximately 12 PiB of data and ~3 billion files are stored in the Project (ESS) and NetApp storage systems at OSC, so it is critical to restore data efficiently and to validate the fidelity of backups. Doing so is essential for maintaining business continuity, especially in disaster recovery scenarios. In line with the NIST 800-53 compliance framework, we apply acceptance sampling concepts and statistical models to estimate sample sizes, as it is not practical to restore 10+ PB of data with billions of files at once. Projects / filesets are divided into collocation groups so that we can parallelize across a group of nodes and reduce the serialization that results when a large data volume is migrated from disk to tape through a single node. We use “no query restore” from IBM Storage Protect with multiple sessions and restore data from filesets in the same collocation group, so that the data can be restored efficiently. Restored data are validated by comparing checksums of the original files in snapshots with checksums of the restored files, using scripts developed in-house. A typical restore test covers 110–140 TiB of data. Finally, we discuss current challenges and future directions for data restoration.
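The sampling and checksum steps described above can be sketched in Python under stated assumptions. The abstract does not say which statistical model or hash function OSC uses, so this example assumes a zero-failure acceptance sampling plan (valid when the file population is very large relative to the sample) and SHA-256. All function names and parameters here are hypothetical and are not taken from OSC's in-house scripts.

```python
import hashlib
import math
from pathlib import Path


def zero_failure_sample_size(max_defect_rate: float, confidence: float) -> int:
    """Smallest n such that finding zero bad files in n random samples gives
    `confidence` that the true defect rate is below `max_defect_rate`.

    Solves (1 - max_defect_rate) ** n <= 1 - confidence for n.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_defect_rate))


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(snapshot_root: Path, restore_root: Path, relative_paths) -> list:
    """Return the sampled paths whose restored copy is missing or differs."""
    mismatches = []
    for rel in relative_paths:
        original = snapshot_root / rel
        restored = restore_root / rel
        if not restored.is_file() or sha256_of(original) != sha256_of(restored):
            mismatches.append(rel)
    return mismatches


if __name__ == "__main__":
    # Files to sample for 95% confidence that fewer than 1% of files are bad.
    print(zero_failure_sample_size(max_defect_rate=0.01, confidence=0.95))
```

Under these assumptions the required sample size depends only on the tolerated defect rate and the confidence level, not on the total number of files, which is what makes sampling practical at the multi-petabyte scale.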

PRESENTATION 4: Preston Smith, Purdue University (12 mins)
Updates on statistical models for presenting the value of RCD investments
This presentation will discuss new research in modeling RCD value: specifically, production function models across multiple institutions, and a work-in-progress effort on institutional benchmarking as a function of a university’s research output. Participants will learn best practices for carrying their value statement back to administration, and tools for benchmarking against peer institutions.


Q/A Session (~10 Minutes)





Speakers

Stephen Deems

Director of Strategic Initiatives, Pittsburgh Supercomputing Center

Geoffrey Lentner

Lead Research Data Scientist, Purdue University
Lead Data Scientist. Astrophysicist. Research Software Engineer. Expert in high-performance computing (HPC), advanced data processing, mathematics and statistics. I lead campus-facing research facilitation, support, sponsored projects, and innovation. I specialize in data, workflow...

Preston Smith

Executive Director, Purdue University

Joseph Tang

HPC Storage Engineer, Ohio Supercomputer Center