Job Details

CVS Health
  • Position Number: 5209277
  • Location: Remote
  • Position Type: Business - Management

As CVS Health continues to grow, we are looking for experienced leaders to ensure our systems remain stable and performant as we scale. The Lead Director of Observability Operations will play a crucial leadership role within Solutions Architecture and Infrastructure Platforms, leading and managing the team responsible for operating and maintaining the Observability platforms at CVS Health. The Lead Director will act as the product owner for several key observability platforms, providing technical guidance, support, and leadership. This individual will also occasionally work hands-on across all areas of observability platform administration and operations, such as managing customer issues and requests, observability solution deployments, platform management, release management, upgrades, patching, integrations, and incident troubleshooting. They will also plan and act continually on platform performance so that the platforms support growing scale and complexity.


Platform Monitoring and Health Checks: Continuously monitor the health and performance of observability systems to ensure functionality and reliability. This includes tracking system availability and responsiveness, and identifying potential issues before they become critical.


Incident Management and Problem Resolution: Address platform-related incidents and problems by troubleshooting issues, coordinating responses, and resolving problems that affect the operation of our observability systems and the observability of our enterprise platforms.


Reporting and Analytics: Handle requests for audit and compliance information from our logging platforms where assistance is needed. Help configure sensors and analytics so that our partners can gain deeper insight into their operating environments.


Platform Upgrades, Patch, Change and Release Management: Collaborate with business leads to plan and execute platform and agent upgrades, and manage patches to ensure platforms remain up-to-date and secure. Manage changes, releases, and deployments by planning, testing, and overseeing new feature rollouts.


Team Leadership and Collaboration: Engage with executives, department heads, and IT teams to provide status updates on activities and operational metrics. This includes guiding and supervising the observability operations team, coordinating with other IT and business units, establishing observability services for new systems and those transitioning platforms, establishing user training, and ensuring effective communication within the organization.

Required Qualifications
1. Minimum 8 years of IT industry experience with 5+ years of experience leading teams in the IT operations, DevOps, SRE, or observability space within a large enterprise (Fortune 100) environment.
2. 5 years of experience developing a high-performing team and interfacing with all levels of the organization; ability to participate in the development of resource plans and structures and to influence organizational priorities.
3. 5 years of demonstrated verbal/written communication, collaboration, analytical, and presentation skills to lead an environment driven by customer service and teamwork; must be able to set goals and participate in strategic initiatives for a team.
4. 5 years of experience implementing observability solutions for component and application performance monitoring, real user monitoring, log management, incident management and triage, data analysis, and events management.
5. 5 years of experience in ITSM, change management, and IT operations; experience with ServiceNow ITOM and/or other AIOps platforms is a plus.
6. 5 years of experience developing automation solutions and workflows for deployment and incident remediation; experience with automation tools and related languages (e.g., Ansible, Jenkins, Bash, PowerShell, Python).
7. 8 years of experience in on-premises infrastructure, cloud infrastructure, and application architectures; experience supporting major incident triage, operational readiness, and change management processes leveraging observability platforms.
8. 8 years of experience with a variety of observability tools and standards (e.g., AppDynamics, Prometheus, Grafana stack, ELK stack, Splunk, Dynatrace, OpenTelemetry).
9. 8 years of demonstrated analytical and problem-solving skills.
10. 8 years of demonstrated project and program management skills, with a proven track record of managing and prioritizing multiple simultaneous initiatives to successful completion.
11. 8 years of experience owning problems end-to-end, with a willingness to acquire any knowledge required to get the job done.

Preferred Qualifications
Knowledge of the healthcare industry's regulatory landscape, particularly in the context of data security and privacy, is desirable.


Education


Bachelor's degree from an accredited university or equivalent work experience (HS diploma + 4 years of relevant experience)


