New York, United States
Senior HPC / Grid Engineer, New York, Permanent Direct Hire role
A top hedge fund is hiring in its New York City office for a Senior HPC/Grid Engineer with a deep understanding of grid workloads, scheduling, and data services at scale. You will work with the business to architect, build, and operate high-throughput computational grids, using a combination of off-the-shelf software, vendor storage appliances, and internally developed automation to power the revenue-generating workloads of hundreds of quantitative researchers. In this role you will evaluate new hardware and software to meet business requirements; build automation tools to monitor, manage, and provision compute nodes and grid workloads; build automation to manage, provision, and report on grid resources, job status, performance, and failures; and provide escalation support to business users and other grid consumers.
This role requires at least 4 years of experience running and deploying HPC or Grid services for large-scale environments, expertise in the challenges of scale-out workloads, knowledge of how to compose reliable services out of unreliable components, experience with HPC file systems such as Lustre, GPFS, or Quobyte, and strong fluency in Python for building automation tools and services. The role also requires knowledge of the appropriate compute, network, and storage offerings for different use cases; deep knowledge of job schedulers such as Slurm or Symphony; an understanding of Unix, NFS, and networking; a commitment to practicing Infrastructure as Code, with constant evaluation and pursuit of opportunities to automate tasks; and experience using configuration management tools such as Salt, Ansible, and Puppet.
Candidates should ideally have experience using public cloud services such as AWS, Azure, and GCP to architect hybrid storage environments with S3, GCS, or Glacier; experience managing and automating enterprise storage products from vendors such as EMC and NetApp; experience with monitoring tools such as Prometheus and InfluxDB; and experience with version control systems such as Git or Perforce.
If this sounds like a fit for you or someone you know, get in touch! I can be reached at Matt.Kerwin@HarringtonStarr.com or at 646 809 4742.