Key Responsibilities:
· Design and implement Grafana-based monitoring solutions tailored to various infrastructure components (servers, databases, applications, containers, etc.).
· Integrate Grafana with diverse data sources.
· Build and maintain custom dashboards and visualizations to meet operational and business requirements.
· Develop alerting rules and notification systems to ensure timely detection and resolution of issues.
· Produce analytical reports and usage summaries for stakeholders to support decision-making and performance tuning.
· Continuously optimize monitoring architecture for scalability, reliability, and cost-efficiency.
· Provide guidance and training to internal teams on Grafana usage and best practices.
Required Qualifications:
· Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
· 3+ years of hands-on experience with Grafana in production environments.
· Proficiency in writing queries in PromQL, InfluxQL, or Elasticsearch Query DSL.
· Experience with alerting tools such as Alertmanager, PagerDuty, Opsgenie, or similar.
· Familiarity with infrastructure monitoring (CPU, memory, disk, network) and application performance monitoring (APM).
· Experience in scripting and automation (e.g., Bash, Python, or PowerShell).
· Knowledge of CI/CD pipelines and containerized environments (Docker, Kubernetes).
Preferred Skills:
· Experience with the Terraform Grafana provider or similar IaC integrations.
· Knowledge of Grafana Enterprise features and licensing models.
· Grafana certification or relevant credentials (if available).
Soft Skills:
· Strong analytical and problem-solving skills.
· Excellent communication and documentation abilities.
· Ability to work independently and collaboratively in a fast-paced environment.