The Architect - Observability & Monitoring Solutions is responsible for designing, implementing, and maintaining enterprise-level monitoring and observability platforms to ensure system reliability, performance, and operational excellence. This role drives innovation through automation and AI-based solutions, collaborates across teams, and serves as a subject matter expert for monitoring tools and processes.
Primary Responsibilities:
Platform Management & Optimization
Lead administration and lifecycle management of Grafana Enterprise, including onboarding, upgrades, and troubleshooting
Develop and maintain dashboards for system health and performance monitoring
Support Dynatrace adoption and enhance dashboarding capabilities for cloud observability
Configuration & Integration
Create and update Telegraf configuration files for vendor partners
Migrate dashboards from legacy platforms to next-generation environments
Resolve event ingestion and integration issues across monitoring tools
Operational Excellence
Ensure platform stability, availability, and compliance through proactive vulnerability management and lifecycle maintenance
Drive process improvements for monitoring workflows and incident management
Innovation & AI Enablement
Explore and implement AI-driven solutions for observability and automation
Share knowledge and best practices through internal guides and community contributions (e.g., AI reference materials, Copilot usage)
Collaboration & Leadership
Partner with cross-functional teams (Telemetry, OptumServe, VOC, OptumRX) to align monitoring strategies with business needs
Act as a subject matter expert (SME) in ITSM discussions and change management processes
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so.
Qualifications - External
Required Qualifications:
Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience)
Proven experience with monitoring tools such as Grafana Enterprise, Dynatrace, and Telegraf
Solid knowledge of observability best practices and ITSM processes
Familiarity with AI-based automation and productivity tools (e.g., M365 Copilot)
Proven problem-solving, communication, and collaboration skills
Preferred Qualifications:
Experience in cloud environments and infrastructure monitoring
Experience in enterprise-scale technology environments
Knowledge of scripting and automation frameworks
Core Competencies
Technical Expertise
Analytical Thinking
Innovation & Continuous Improvement
Collaboration & Influence
Customer Focus