Open Libra Observability upgrade: Community Grafana

---

# **Feature Request: Upgrade Observability Infrastructure (Grafana + Prometheus)**

## **Is your feature request related to a problem? Please describe.**  
The current Open Libra observability infrastructure lacks:  
- Publicly accessible metrics/dashboards for community transparency  
- Scalable node monitoring as the network grows  
- Modernized dashboards based on 2022 templates and Aptos-inspired designs  
- Secure, dynamic target management for Prometheus (currently IP-based)  

This makes it difficult for node operators and the community to monitor network health effectively.

---

## **Describe the solution you'd like**  
### **Grafana Implementation**  
1. **Public Access**:  
   - Hosted Grafana instance with view-only permissions for the public  
   - Anonymous access to selected dashboards, served over HTTPS with certificates obtained via Certbot  

2. **Dashboards**:  
   - Use the existing 2022 dashboards as the foundation and modernize them  
   - Incorporate design patterns from Aptos dashboards (reference implementation)  
   - Modular panels for node health, network performance, and consensus metrics  

3. **Admin Team**:  
   - Restricted admin access for:  
     - @Sasuke  
     - @sirouk  
     - @Hemulin  
     - @David  
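
As a rough illustration of the public-access item above, the sketch below renders the Grafana settings that usually govern anonymous read-only access. The section and option names (`auth.anonymous`, `org_role`, `allow_sign_up`) are standard Grafana configuration keys; the organization name and the idea of generating the file from Python are assumptions made for this example. Which dashboards anonymous viewers can see is not handled here.

```python
import configparser
import io


def render_grafana_ini(org_name: str = "Main Org.") -> str:
    """Render a grafana.ini fragment enabling anonymous, view-only access.

    `org_name` is an assumption: it must match an organization that
    actually exists in the Grafana instance.
    """
    cfg = configparser.ConfigParser()
    # Anonymous visitors are mapped to the Viewer role (read-only).
    cfg["auth.anonymous"] = {
        "enabled": "true",
        "org_name": org_name,
        "org_role": "Viewer",
    }
    # Keep self-registration off so only the admin team manages accounts.
    cfg["users"] = {"allow_sign_up": "false"}

    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_grafana_ini())
```

The admin accounts listed above would still sign in through the normal login form; this fragment only affects visitors who are not logged in.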

### **Prometheus Implementation**  
1. **Node Monitoring**:  
   - Start with cooperative validators/VFNs  
   - Adopt a UUID system (based on node public keys) to replace IP addresses as node identifiers  
   - Dynamic target updates via peer list (inspired by [David Boreham’s design](https://github.com/b-n-space/0l-monitoring/blob/main/prometheus/prometheus.example.yml))  

2. **Scalability**:  
   - Borrow from [0L seed-peers](https://github.com/0LNetworkCommunity/seed-peers) for peer discovery  
   - Scheduled scraping jobs with secure authentication  
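
One possible way to combine the UUID and dynamic-target ideas above is Prometheus file-based service discovery (`file_sd_configs`), which re-reads a JSON target file when it changes. The sketch below is only an assumption about how that could look: the UUID derivation (a `uuid5` over the public key), the namespace, the peer record fields, and the `node_id` and `role` label names are all hypothetical and do not come from the referenced repositories.

```python
import json
import os
import tempfile
import uuid

# Hypothetical namespace; any fixed UUID works as long as everyone uses the same one.
NODE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "openlibra.example")


def node_uuid(public_key_hex: str) -> str:
    """Derive a stable UUID from a node public key (assumed hex-encoded)."""
    normalized = public_key_hex.lower()
    if normalized.startswith("0x"):
        normalized = normalized[2:]
    return str(uuid.uuid5(NODE_NAMESPACE, normalized))


def build_file_sd(peers):
    """Turn a peer list into Prometheus file_sd target groups."""
    groups = []
    for peer in peers:
        groups.append({
            # Prometheus still needs a reachable address to scrape.
            "targets": [f"{peer['host']}:{peer['metrics_port']}"],
            # The UUID travels as a label so dashboards can key on it.
            "labels": {
                "node_id": node_uuid(peer["public_key"]),
                "role": peer["role"],
            },
        })
    return groups


def write_file_sd(peers, path):
    """Write the target file atomically so Prometheus never reads a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(build_file_sd(peers), fh, indent=2)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Made-up peer record for illustration only.
    example_peers = [
        {"host": "10.0.0.1", "metrics_port": 9101,
         "public_key": "0xABCDEF", "role": "validator"},
    ]
    print(json.dumps(build_file_sd(example_peers), indent=2))
```

A scheduled job could regenerate this file from the peer list. Note that Prometheus sets the `instance` label from the target address by default, so keeping IPs out of public dashboards would also need relabeling, which this sketch does not cover.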

---

## **Describe alternatives you've considered**  
1. **Third-Party SaaS (e.g., Datadog, New Relic)**:  
   - Rejected due to cost and desire for community-controlled infrastructure  

2. **IP-Based Prometheus Targets**:  
   - Current system is brittle and exposes node IPs; UUIDs are more secure  

3. **Static Dashboards**:  
   - Considered, but modular designs were chosen to accommodate future metrics  

---

## **Additional Context**  
### **References**  
- Example Prometheus config: [0l-monitoring](https://github.com/b-n-space/0l-monitoring/blob/main/prometheus/prometheus.example.yml)  
- Peer list management: [seed-peers](https://github.com/0LNetworkCommunity/seed-peers)  

### **Screenshots (Mockups)**  
*(Note: Attach dashboard mockups or Aptos reference screenshots here in the GitHub issue)*  

### **Implementation Phases**  
1. **Phase 1**: Grafana setup + basic dashboards  
2. **Phase 2**: UUID-based Prometheus integration  
3. **Phase 3**: Alerting + scaling optimizations  

--- 


Open Libra Observability upgrade: Community Grafana #403