TraceLink, Inc. Sr. Director, Cloud Engineering in North Reading, Massachusetts
TraceLink is seeking a Senior Manager of Cloud Engineering to lead our Site Reliability Engineering team. This team will manage the development of tools and processes used to deploy and maintain the core infrastructure components supporting the TraceLink software offerings across the globe. In this role, you will lead a team focused on building infrastructure as code and work closely with product engineering teams building platform and user applications.
Manage a distributed team of Cloud Engineers employing agile processes to build and release infrastructure tools coordinated with the overall product release timelines.
Manage team deliverables and backlog, interact with other product teams, measure progress across sprints and releases, and communicate progress and risks to relevant stakeholders. Set standards and provide requirements to engineering teams to deliver ops-ready software.
Work closely with key members of senior leadership as well as architecture and security teams to align the team's technical direction with the overall TraceLink technical direction. Build upon TraceLink's goals of operational excellence, systems/infrastructure/security best practices and technical leadership.
In developing infrastructure as code, build and extend CI/CD infrastructure pipelines to provide metrics and visibility that help reduce deployment errors, increase testing coverage, improve resource usage, reduce cost, and confirm scale and resiliency. Incorporate strong security practices throughout.
Work closely with the on-call Cloud Operations team to develop methods that improve operational deployment and incident handling and reduce toil.
Be an SRE/DevOps evangelist and subject matter expert for the broader TraceLink technical/developer community.
Bachelor's degree in an engineering field
3+ years of experience as a Site Reliability Engineer or DevOps Engineer
5+ years of experience managing DevOps/SRE teams working to build infrastructure as code
Has built and managed geographically distributed teams working on a large-scale SaaS platform
Skilled in DevOps/SRE practices and build/release pipelines
Skilled with AWS services both from technology and cost perspectives
Strong understanding of cloud deployment and management practices
Strong knowledge of Linux and open source tools and their application in large-scale distributed systems
Hands-on experience with Terraform/Helm/Docker/Kubernetes/Prometheus/ELK/Bash/Python
Clear understanding of security best practices and how best to incorporate them
Excellent communication skills, written and verbal
Strong analytical and problem-solving skills
Advanced degree in an engineering field (preferred)
Knowledge of compliance and security audit requirements and how to incorporate them into current and future practices (preferred)
External Company Name: TraceLink, Inc.
External Company URL: http://www.tracelink.com