Job Information
Houghton Mifflin Staff DevOps Engineer in United States
Staff DevOps Engineer
Date: May 14, 2024
Location: Pune, MH, IN
Company: Houghton Mifflin Harcourt
HMH is a learning technology company committed to delivering connected solutions that engage learners, empower educators, and improve student outcomes. As a leading provider of K–12 core curriculum, supplemental and intervention solutions and professional learning services, HMH partners with educators and school districts to uncover solutions that unlock students' potential and extend teachers' capabilities. HMH serves more than 50 million students and 4 million educators in 150 countries.
Our technical infrastructure
AWS EC2, Terraform Enterprise, Docker, Aurora, Mesos, Kubernetes, ELK (Elasticsearch, Logstash & Kibana).
Grafana, Prometheus, Datadog, Telegraf, Runscope, Apollo, GraphQL.
Microservices architecture, Spring, Java & Node.js, React, Koa, Express.js.
Amazon RDS, DynamoDB, Postgres, Oracle, MySQL, InfluxDB, Linux, Jenkins, GitHub.
You can read more on our Engineering Blog: https://hmh.engineering/
About the role:
You will constantly be asking: what are the most important infrastructure problems we need to solve today to increase the reliability and performance of our applications and infrastructure?
You will apply your deep technical knowledge, taking a broad look at our technology infrastructure
You’ll help us identify common and systemic issues, validate them, and prioritize which to address first
We value collaboration, so you will partner with our SRE/DevOps team, discussing and refining your ideas and preparing proofs of concept
You’ll present and validate these across technology teams, figuring out the best solution
And you’ll be given ownership to engineer and implement your solutions
There are lots of interesting technology problems for you to solve, so you will constantly be applying the latest thinking. These include implementing canary deployments, designing a new automated pipeline solution, extending Kubernetes capabilities, applying machine learning to build load testing, ensuring mutability of containerization, etc.
You’ll get to evaluate existing technologies and design the future state, without being afraid to challenge the status quo. And you’ll regularly review existing infrastructure, looking for opportunities to improve, e.g. service improvement, cost reduction, security, and performance.
You’ll also get to automate everything necessary, combining reliability with a pragmatic approach: doing it right the first time.
We’re continuing our journey of making our code and configuration deployments self-serve for our development teams.
You’ll help us build and maintain the right tooling
And you’ll have ownership to design and implement the infrastructure needed.
You’ll also be involved in the daily management of our AWS infrastructure
This means working with our Agile development teams to troubleshoot server, application, and performance issues
Skills & Experience:
5 to 8 years hands-on SRE/DevOps experience in an Agile environment
You’ll be able to collaborate effectively with both engineers and operations and be comfortable recommending best practices
You bring substantial experience using AWS in a production environment
You have the expertise and skills to navigate the AWS ecosystem and will know when and where to recommend the most appropriate service, and/or usage pattern
You have experience resolving outages, can quickly diagnose issues, and have been instrumental in restoring normal service levels
You’ll also have significant experience, and/or an interest in the following:
Experience managing cloud infrastructure as code
Application container management
Expertise with an RDBMS. You’ll know how to tune and scale it, and how performance and reliability are achieved
Experience working with Linux
Experience managing message queues and event-driven systems
Experience working with firewalls, network and application load balancing, and secret management
Experience with CI/CD tools
Experience with scripting languages
A strong and informed point of view with respect to monitoring tools and how best to use them