DevOps Engineer

Role Overview

We're seeking an exceptional DevOps Engineer to take ownership of the cloud infrastructure and deployment systems that power our mission in applied atmospheric science. In this role, you'll be the technical foundation our scientists, ML engineers, and software engineers build on — designing and operating the cloud environments, deployment pipelines, and infrastructure-as-code that keep our systems reliable, scalable, and fast.

What You'll Do

Own and operate our cloud infrastructure across AWS, GCP, and other compute providers, ensuring high availability, cost efficiency, and security
Lead the development and maintenance of Terraform-based infrastructure-as-code across all environments
Design and implement CI/CD pipelines that enable rapid, reliable delivery of our modeling products
Guide architectural decisions for cloud deployments, establishing best practices and patterns for the engineering team
Manage containerized workloads, including Docker image builds, container registries, and orchestration via AWS
Build and maintain scalable compute environments for HPC and GPU workloads, including cluster management and job scheduling
Implement observability across our infrastructure stack
Partner with software engineers and scientists to optimize data pipeline infrastructure for processing massive meteorological datasets
Drive security, compliance, and cost governance across our cloud footprint

Requirements

Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience)
Strong hands-on experience with AWS and GCP, including core services (compute, networking, storage, IAM)
Deep proficiency with Terraform for infrastructure-as-code at production scale
Experience designing and operating CI/CD systems (GitHub Actions, GitLab CI, or similar)
Solid understanding of containerization and orchestration (Docker, Kubernetes, or equivalent)
Experience managing networking, security groups, VPCs, and IAM policies in multi-cloud environments
Familiarity with Linux system administration and shell scripting

Nice to Have

Experience with HPC environments and job schedulers (SLURM, AWS ParallelCluster, or similar)
Familiarity with GPU compute infrastructure (provisioning, scheduling, cost management)
Background working with scientific or data-intensive workloads
Experience with container runtimes for HPC (Apptainer/Singularity, Pyxis/enroot)
Knowledge of serverless and event-driven architectures (AWS Lambda, GCP Cloud Run, Modal)
Familiarity with workflow orchestration platforms (Dagster, Airflow, Prefect)
Prior work in a research-oriented or scientific computing environment
Experience with cost optimization and FinOps practices in cloud environments

Compensation & Benefits

$160,000–$200,000 base
Meaningful equity
Full health, dental, and vision benefits
In-person five days a week in San Francisco, CA
