IBM Sr. Site Reliability Engineer - Infrastructure in Boston, Massachusetts
Rise to meet the challenge of solving the world’s problems. Software Engineers at IBM get to see their work in the real world, from improving the current state of disaster preparedness to providing more accurate medical solutions around the globe. You will be challenged to think outside the box, work across organizations, and engineer creative solutions that scale to the demands of our ever-growing customer base. Take ownership and be actively engaged in the entire product lifecycle, from quick hits to full feature development.
Your Role and Responsibilities
This position will be located onsite in Austin, TX; RTP, NC; or Southbury, CT.
Site Reliability Engineering (SRE) professionals are engineers who specialize in reliability and resilience, with the right mix of knowledge and skills in software and systems. They are responsible for analyzing business needs, determining problems, and advising on, designing, building, testing, deploying, and maintaining well-engineered information systems and ecosystems.
As a Site Reliability Engineer (SRE), you will ensure that the designed solution meets non-functional requirements such as availability, performance, security, and maintainability. You will also work with release engineers to ensure that the software delivery pipeline is as efficient as possible.
You will bring a strong engineering focus to operations, putting your energy into incident prevention, automation frameworks, self-service infrastructure, logging and metrics, and operational scorecards.
You will be expected to use tools and practices that include logging, monitoring, event management, notification, runbook automation, ChatOps, and root cause analysis.
Keeping your assigned site or service up and running or getting it back up and running quickly when failure occurs
Working closely with internal partners and teams to ensure that our infrastructure meets security, SLA, and performance requirements
Writing, updating, and using documentation, including runbooks/playbooks
Automating work including infrastructure needs, testing, failover solutions, failure mitigation, and much more
Debugging complex problems across an entire stack and creating solid solutions
Developing CI/CD processes to improve cadence
Using Chaos Engineering to test what you build under real-world conditions
Sponsoring healthy software development practices, including complying with the chosen software development methodology (Agile or alternatives) and building standards for code reviews, work packaging, etc.
Persistently testing application and infrastructure resiliency under a variety of error conditions
Partnering with security engineers and developing plans and automation to aggressively and safely respond to new risks and vulnerabilities
Developing, communicating, and monitoring standard processes to promote the long-term health and sustainability of operational development tasks
Required Technical and Professional Expertise
3+ years of experience with software engineering, software development, or system operations
Knows their way around a Unix/Linux shell, can write shell scripts, and understands Linux internals
Experience debugging complex problems
Experience designing, building, and operating large-scale production systems
Knows Python, Go, Rust, or similar
Understands networking and messaging, especially between services
Has hands-on experience using source control (Git, GitHub) and feature branching strategies
Preferred Technical and Professional Expertise
Experience with DevOps engineering or SRE
Experience with containers, such as Docker, Kubernetes, and OpenShift
Experience with monitoring and observability tools such as New Relic, Nagios, Icinga, or Sysdig
Experience automating infrastructure, configuration management, testing, and deployments using tools like Ansible or Chef, and the ability to explain the Infrastructure as Code paradigm
A strong understanding of diverse infrastructure platforms and infrastructure concepts is required
Familiarity with SCAP/OVAL/XCCDF and implementing NIST standards
Familiarity with managing and securing a distributed Windows Server and/or Linux environment
About Business Unit
The Office of the Chief Information Officer (CIO) owns IBM’s IT strategy and provides the tools, workstations, devices, and infrastructure that IBMers use to do their jobs every day. Put simply, our mission is to create a productive environment for IBM's 365,000 worldwide employees. Join us as we lead with design to drive simplicity and ease of use, engineer the systems that run the business, and innovate to transform the business.
Your Life @ IBM
What matters to you when you’re looking for your next career challenge?
Maybe you want to get involved in work that really changes the world? What about somewhere with incredible and diverse career and development opportunities – where you can truly discover your passion? Are you looking for a culture of openness, collaboration and trust – where everyone has a voice? What about all of these? If so, then IBM could be your next career challenge. Join us, not to do something better, but to attempt things you never thought possible.
Impact. Inclusion. Infinite Experiences. Do your best work ever.
IBM’s greatest invention is the IBMer. We believe that progress is made through progressive thinking, progressive leadership, progressive policy and progressive action. IBMers believe that the application of intelligence, reason and science can improve business, society and the human condition. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 380,000 IBMers serving clients in 170 countries.
IBM will not be providing visa sponsorship for this position now or in the future. Therefore, in order to be considered for this position, you must have the ability to work without a need for current or future visa sponsorship.
Being You @ IBM
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.