Manager, Enterprise Monitoring and Observability

Pilot Flying J

Knoxville, TN, USA
Posted on Wednesday, September 11, 2024

Job Description

Company Description

Pilot Company is an industry-leading network of travel centers with more than 30,000 team members and over 750 retail and fueling locations in 44 states and six Canadian provinces. Our energy and logistics division serves as a top supplier of fuel, employing one of the largest tanker fleets and providing critical services to oil operations in our nation's busiest basins. Pilot Company supports a growing portfolio of brands with expertise in supply chain and retail operations, logistics and transportation, technology and digital innovation, construction, maintenance, human resources, finance, sales and marketing.

Pilot Company was founded in 1958 by Jim A. Haslam II and is currently led by CEO Adam Wright. Our founding values, people-first culture, and commitment to giving back remain true to us today. Whether we are serving guests, a fellow team member, or a trucking company, we are dedicated to fueling people and keeping North America moving.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status or any other characteristic protected under applicable federal, state or local law.

Military encouraged to apply.

Job Description

The purpose of this position is to lead the Monitoring and Observability practice across the Pilot Company enterprise. The role will establish monitoring and observability, proactive solutions, alerting, automation, and site reliability for business-critical systems and platforms.

1. Oversee, lead, and set priorities for the Monitoring and Observability team specifically focused on monitoring and observability, proactive solutions, alerting, automation, and site reliability

2. Coach, train and develop direct reports (includes appraising job performance and conducting performance reviews)

3. Lead a team of site reliability engineers (SREs) to develop enterprise logging, metrics, and traces for business-critical systems, as well as dashboards (visibility) for different levels of support

4. Work with infrastructure, product, and support teams to define tools and strategy to ensure full observability, alerting, and proactive monitoring of business-critical systems

5. Integrate full observability and proactive monitoring practice with ITSM Office to ensure tracking and timely communication of incidents, outages, and issues

6. Collaborate with Business and IT stakeholders to define thresholds, SLAs, and runbooks, help proactively identify issues, and drive down recurring incidents

7. Lead oversight of third-party vendors’ work to ensure vendors fulfill contractual commitments and statements of work

8. Assist with monitoring events (e.g., warnings and exceptions) and identify routine activities and resolutions that can be automated to improve system efficiencies

9. Serve as a subject matter expert and maintain knowledge of current industry trends and developing technologies

10. Model behaviors that support the company’s common purpose; ensure guests and team members are supported at the highest level

11. Ensure all activities are in compliance with rules, regulations, policies, and procedures

12. Complete other duties as assigned

Qualifications
  • Bachelor’s degree or associate degree required; field of study in technology preferred
  • Minimum seven years’ experience in technology or related field required
  • Minimum one year’s experience managing people preferred

Specialized Knowledge

  • Intermediate knowledge of ITSM/ITIL
  • Intermediate knowledge of Splunk/ITSI, AWS CloudWatch, APM (AppDynamics), SolarWinds, Grafana, Prometheus, or similar tools
  • Working knowledge of service-oriented architecture (SOA), microservices, and/or API network design paradigm
  • Working knowledge of network protocols/technology, databases, and application servers and their roles in service delivery
  • Experience using cloud native technologies (Kubernetes, OpenTelemetry, GitHub) in a production environment

Additional Information

  • Nation-wide Medical Plan/Dental/Vision
  • 401(k) and Flexible Spending Accounts
  • Employee Fuel Discount
  • Adoption Assistance
  • Tuition Reimbursement
  • Weekly Pay

All your information will be kept confidential according to EEO guidelines.

7071

Application Instructions

Please click on the link below to apply for this position. A new window will open and direct you to apply at our corporate careers page. We look forward to hearing from you!

Apply Online

Posted: 9/10/2024

Job Status: Full Time

Job Reference #: eaeb3266-b13b-4910-a83b-fd9263c59970