feat: Prometheus metrics #860
base: master
…al-time metrics tracking Signed-off-by: matanbaruch <[email protected]>
… metrics collector code formatting Signed-off-by: matanbaruch <[email protected]>
Signed-off-by: matanbaruch <[email protected]>
Pull Request Overview
This PR adds a Prometheus-based metrics system to STF, including real-time hooks, periodic collection, and a new `/metrics` endpoint.
- Integrates `prom-client` with custom gauges and helper functions (`lib/util/metrics.js`)
- Adds `MetricsCollector` service for periodic DB metric collection and lifecycle integration (`lib/util/metrics-collector.js`, `lib/units/api/index.js`)
- Implements real-time update hooks (`lib/util/metrics-hooks.js`), frontend reporting (`group-list-controller.js`), and exposes `/metrics` via a controller and Swagger docs
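As a rough illustration of the "periodic collection with lifecycle integration" idea summarized above, here is a minimal sketch. It is not the PR's `MetricsCollector`: the class name `PeriodicCollector`, the `fetchCounts` callback, and the shape of the returned counts are all assumptions made for this example.

```javascript
// Hypothetical sketch; names and data shapes are illustrative, not taken from the PR.
class PeriodicCollector {
  // fetchCounts: async function resolving to {devices, users, groups} (assumed shape)
  constructor(fetchCounts, intervalMs = 30000) {
    this.fetchCounts = fetchCounts
    this.intervalMs = intervalMs
    this.timer = null
    this.latest = {devices: 0, users: 0, groups: 0}
  }

  // Run one collection pass; keep the previous values if the query fails.
  async collectOnce() {
    try {
      this.latest = await this.fetchCounts()
    }
    catch (err) {
      console.error('metrics collection failed:', err.message)
    }
    return this.latest
  }

  // Begin periodic collection; calling start() twice does not create a second timer.
  start() {
    if (this.timer) {
      return
    }
    this.timer = setInterval(() => this.collectOnce(), this.intervalMs)
    // Do not keep the process alive only because of the metrics timer.
    this.timer.unref()
  }

  // Stop periodic collection, e.g. when the hosting unit shuts down.
  stop() {
    clearInterval(this.timer)
    this.timer = null
  }
}
```

A hosting unit would call `start()` once its dependencies are ready and `stop()` during shutdown. Injecting `fetchCounts` keeps the sketch independent of any particular database API.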
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
res/app/group-list/group-list-controller.js | Sends group stats from the UI to backend for metrics |
package.json | Adds prom-client dependency |
lib/util/metrics.js | Defines Prometheus metrics and update helpers |
lib/util/metrics-hooks.js | Implements real-time hooks for entity changes |
lib/util/metrics-collector.js | Collects metrics periodically from DB |
lib/units/api/swagger/api_v1.yaml | Documents the /metrics endpoint |
lib/units/api/index.js | Integrates MetricsCollector into the API lifecycle |
lib/units/api/controllers/metrics.js | Serves Prometheus metrics via /metrics |
lib/db/api.js | Exports getDevices() for metrics collection |
.eslintrc | Updates ESLint config to ES6/2017 |
I need to review this PR seriously (I saw some problems, e.g. you can't export …
…cs collector Signed-off-by: matanbaruch <[email protected]>
…improved performance Signed-off-by: matanbaruch <[email protected]>
…etter compatibility Signed-off-by: matanbaruch <[email protected]>
Signed-off-by: matanbaruch <[email protected]>
You are right. Fixed.
Pull Request Overview
This PR integrates Prometheus-based metrics into the STF platform, adding real-time hooks, a periodic collector service, and an exposed `/metrics` endpoint.
- Adds `prom-client` and defines custom gauges in `lib/util/metrics.js`
- Implements `MetricsCollector` for scheduled metric aggregation and `MetricsHooks` for real-time updates
- Exposes a new `/metrics` API endpoint and updates frontend to emit group metrics
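The split between real-time hooks and scheduled aggregation described above can be sketched as follows. This is only an illustration under assumptions: the event names (`device:added`, `device:removed`), the `SimpleGauge` stand-in, and the `reconcile` helper are hypothetical and do not come from the PR or from `prom-client`.

```javascript
const EventEmitter = require('events')

// Hypothetical in-memory gauge standing in for a real Prometheus gauge.
class SimpleGauge {
  constructor() {
    this.value = 0
  }
  inc() {
    this.value += 1
  }
  dec() {
    // Never report a negative count, even if events arrive out of order.
    this.value = Math.max(0, this.value - 1)
  }
  set(value) {
    this.value = value
  }
}

const deviceGauge = new SimpleGauge()
const entityEvents = new EventEmitter()

// Real-time hooks: adjust the gauge as soon as an entity changes.
entityEvents.on('device:added', () => deviceGauge.inc())
entityEvents.on('device:removed', () => deviceGauge.dec())

// Scheduled aggregation: overwrite the gauge with the authoritative count
// so that any drift from missed events is corrected on the next pass.
function reconcile(countFromDb) {
  deviceGauge.set(countFromDb)
}
```

The hooks keep the value fresh between collection passes, while the periodic overwrite bounds how long a missed event can skew the metric.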
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
res/app/group-list/group-list-controller.js | Sends group counts to backend for metrics |
package.json | Adds prom-client dependency |
lib/util/metrics.js | Defines Prometheus registry and custom gauges |
lib/util/metrics-hooks.js | Updates metrics on entity events in real time |
lib/util/metrics-collector.js | Periodically gathers and updates metrics from DB |
lib/units/api/swagger/api_v1.yaml | Defines /metrics path in Swagger spec |
lib/units/api/index.js | Starts/stops MetricsCollector in app lifecycle |
lib/units/api/controllers/metrics.js | Implements the /metrics controller |
lib/db/api.js | Adds getDeviceMetrics aggregation function |
.eslintrc | Updates parser options for ES6 |
Comments suppressed due to low confidence (2)
lib/util/metrics-collector.js:13
- There are no tests covering the `MetricsCollector` class or its `collectMetrics` and lifecycle behavior. Adding unit tests for these methods will help ensure reliability of the new metrics feature.
class MetricsCollector {
lib/db/api.js:1705
- The `log` variable is not defined in this scope. Please `require` the logger and create a `log` instance (e.g., `const logger = require('../util/logger'); const log = logger.createLogger('dbapi');`) at the top of the file.
log.error('Error getting device metrics:', error)
Signed-off-by: matanbaruch <[email protected]>
@matanbaruch, your PR raises some concerns, and I would need more time to respond, which I currently lack; I should have more time later this summer, or after that. To quickly summarize, certain elements make it unacceptable in its current state. Here are some comments in bulk:
If you agree with all my comments, I think it would be better to close this PR and propose a more mature one later, taking all my observations into account?
Thank you for the detailed feedback. I appreciate you taking the time despite your limited availability. Below are my responses to your comments.
I see your point regarding
Understood. I’ll refactor it to reside in a more appropriate location as per your suggestion.
You're right. I will clean up the debug logs accordingly.
I understand. I'll remove the unused code and ensure the PR reflects only production-ready functionality.
Thanks for the note. While I typically rely on squash-and-merge to keep history clean, I understand the importance of using dedicated branches for clarity. I’ll adopt that approach going forward.
That makes sense. I’ll include a
The goal here is to expose basic device-level metrics in a standardized way using Prometheus, which is a common tool in many monitoring stacks. It provides a strong starting point, and being open-source and extensible, it can accommodate further integrations if needed.
Understood, I’ll update the copyright accordingly. (Since it's open source, I didn't know that contributors' names are kept inside the code.)
Prometheus has a specific plain text exposition format designed for scraping, as outlined in their [data model documentation]. That said, I’ll double-check to ensure the output conforms strictly to their expected format.
Got it. To clarify, the Metrics API is not meant for the STF UI at all—it's purely backend-facing and intended to be scraped by Prometheus. I’ll ensure there are no UI hooks or unrelated changes.
I agree with most of the feedback and will incorporate all necessary changes. However, I’m not sure how closing and reopening the PR would help in this case, especially as the ongoing discussion and review context would be lost. I’d prefer to revise this PR directly, clean up the commit history, and continue reviewing here. Once the work is complete, we can squash and merge as needed.
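For readers unfamiliar with the plain-text exposition format mentioned in the reply above, here is a minimal hand-rolled sketch of what a scrape response for one gauge looks like. The function names `renderGauge` and `escapeLabelValue` and the metric name in the usage note are hypothetical. A real integration would normally let `prom-client` produce this output rather than build it by hand.

```javascript
// Content type commonly used for the Prometheus text exposition format.
const PROMETHEUS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8'

// Escape backslashes, double quotes and newlines inside a label value.
function escapeLabelValue(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
}

// Render one gauge family: a HELP line, a TYPE line, then one line per sample.
// Note: this sketch does not escape the help text itself.
function renderGauge(name, help, samples) {
  const lines = [
    `# HELP ${name} ${help}`,
    `# TYPE ${name} gauge`
  ]
  for (const {labels = {}, value} of samples) {
    const pairs = Object.entries(labels)
      .map(([key, labelValue]) => `${key}="${escapeLabelValue(labelValue)}"`)
      .join(',')
    lines.push(pairs ? `${name}{${pairs}} ${value}` : `${name} ${value}`)
  }
  // The format is line-oriented; the last line also ends with a newline.
  return lines.join('\n') + '\n'
}
```

For example, `renderGauge('stf_devices_total', 'Number of devices', [{labels: {status: 'online'}, value: 3}])` yields a `# HELP` line, a `# TYPE` line, and the sample line `stf_devices_total{status="online"} 3`. An HTTP handler would send that string with `PROMETHEUS_CONTENT_TYPE` as the `Content-Type` header.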
@matanbaruch, if you’d like, feel free to open the same PR on our fork of OpenSTF: https://github.com/VKCOM/devicehub.
This pull request introduces a comprehensive metrics collection and monitoring system for the STF (Smartphone Test Farm) application. The changes include adding Prometheus metrics support, implementing a metrics collector, and integrating real-time hooks for updating metrics. Additionally, the ESLint configuration has been updated for modern JavaScript support.
Metrics Collection and Monitoring:
- Prometheus Metrics Integration:
  - `prom-client` library for metrics collection (`package.json`, package.jsonR87).
  - `metrics.js` to define and manage custom metrics, including devices, users, and groups (`lib/util/metrics.js`, lib/util/metrics.jsR1-R161).
  - `/metrics` endpoint to serve Prometheus-compatible metrics (`lib/units/api/controllers/metrics.js`, [1]; `lib/units/api/swagger/api_v1.yaml`, [2]).
- Metrics Collection Service:
  - `MetricsCollector` class to periodically gather metrics from the database and update Prometheus metrics (`lib/util/metrics-collector.js`, lib/util/metrics-collector.jsR1-R148).
  - … (`lib/units/api/index.js`, [1] [2]).
- Real-Time Metrics Hooks:
  - `MetricsHooks` to update metrics in response to changes in devices, users, and groups (`lib/util/metrics-hooks.js`, lib/util/metrics-hooks.jsR1-R115).
  - … (`res/app/group-list/group-list-controller.js`, res/app/group-list/group-list-controller.jsR67-R75).

Codebase Enhancements:
- ESLint Configuration Update:
  - `.eslintrc` to support ES6+ features by enabling the `es6` environment and setting `ecmaVersion` to 2017 (`.eslintrc`, .eslintrcL4-R8).
- Database API Enhancements:
  - `getDeviceMetrics` function to securely aggregate device statistics without exposing sensitive data (`lib/db/api.js`, lib/db/api.jsR1664-R1715).

These changes collectively enhance the observability and maintainability of the STF system by providing detailed metrics for monitoring system health and usage.
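To make the idea of aggregating device statistics "without exposing sensitive data" concrete, here is a small sketch. It is not the PR's `getDeviceMetrics`: the function name `summarizeDevices` and the record fields `present`, `owner`, and `serial` are assumptions chosen for illustration.

```javascript
// Hypothetical sketch: reduce device records to plain counts so that
// identifiers such as serials or owner details never reach the metrics output.
function summarizeDevices(devices) {
  const summary = {total: 0, present: 0, inUse: 0}
  for (const device of devices) {
    summary.total += 1
    if (device.present) {
      summary.present += 1
    }
    if (device.owner) {
      summary.inUse += 1
    }
  }
  return summary
}
```

Because only the three counters are returned, a metrics endpoint built on top of this cannot leak per-device fields, whatever the underlying records contain.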