Technical Lead for high-performance cloud file-based (POSIX) storage solutions / Remote
B2+Lead
Time Zone: The candidate should be available to work until 5 pm PST
Duration: 3-6 months, with possible extension
Job Description:
We are seeking a highly skilled and motivated Cloud Storage Technical Lead to join our dynamic team, focusing on the development of scalable file storage solutions for cloud-based High Performance Computing (HPC) platforms.
The ideal candidate will have a strong background in both traditional parallel filesystems and modern cloud-native storage solutions such as S3, ElastiCache and File Cache. You should also have extensive experience with AWS, infrastructure as code, and continuous integration/continuous deployment (CI/CD) pipelines.
As a Cloud Storage Technical Lead, you will play a crucial role in designing, building, and maintaining the storage infrastructure that supports our cutting-edge life sciences research initiatives. You will collaborate with computational scientists and other engineers to ensure that the platform is robust, scalable, and capable of handling complex computational workloads.
You will also ensure our solutions are implemented securely, with appropriate controls to allow safe storage of sensitive data. You will collaborate with additional cloud platform technical and product leads to ensure your solutions align with other emerging infrastructure capabilities being developed concurrently for the R&D organization.
Key Responsibilities:
- Design, implement, and maintain scalable and high-performance file storage environments on AWS.
- Develop and manage infrastructure as code using tools such as Terraform and Ansible.
- Automate deployment pipelines and improve CI/CD processes using GitLab CI/CD.
- Collaborate with cross-functional teams to understand the computational needs of scientists and translate them into effective platform solutions.
- Monitor and optimize platform performance, ensuring reliability and scalability.
- Troubleshoot and resolve issues related to infrastructure, deployment, and application performance.
- Provide technical guidance and mentorship to junior team members.
- Identify and advance collaboration opportunities with other product teams, such as integration with existing data movement and data catalog solutions.
Required Skills:
- AWS: Deep understanding of AWS services and best practices for building scalable, secure, and cost-effective cloud environments.
- DevOps: Proven experience with DevOps practices, including infrastructure as code (Terraform, Ansible), continuous integration, and continuous deployment (GitLab CI/CD).
- IAM: Prior experience integrating storage with common identity and access management solutions such as Active Directory and IAM Identity Center.
- Version Control: Proficiency with Git and experience managing code repositories.
- Expert-level proficiency with POSIX file system semantics.
- Proficiency with POSIX I/O profiling for high-performance / high-throughput workloads.
- Expert-level proficiency in at least one high-performance / parallel filesystem technology such as Weka, Lustre, GPFS, Ceph, or JuiceFS.
- High proficiency with Amazon S3 object storage.
- High proficiency with Network File System (NFS) semantics and solutions.
- Knowledge of security best practices in cloud environments and experience implementing them.
- Excellent communicator, able to clearly share architecture plans, designs, risks, and implementation details with a variety of stakeholders.
Desired Qualifications:
- Experience: 7+ years working in engineering, solution architecture, or DevOps, with a track record of successfully delivering complex projects.
- Problem Solving: Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues in distributed systems.
- Communication: Excellent communication skills, with the ability to convey technical concepts to both technical and non-technical stakeholders.
- Team Player: Ability to work effectively in a collaborative team environment, as well as independently when required.
Preferred Skills (Nice to have):
- Prior experience with AWS managed services for file storage, such as EFS, FSx for Lustre, or FSx for OpenZFS.
- Prior experience with at least one POSIX interface solution for S3 object storage, such as Mountpoint for Amazon S3, cunoFS, or goofys.
- Prior experience with cloud data caching solutions such as Amazon ElastiCache or Amazon File Cache.