Releases: m-cmp/mc-observability
v0.5.0
Release Notes v0.5.0
What's Changed
Major Features & Enhancements
🚨 Alert & Trigger System
- feat: Add direct alert functionality by @suahlingo in #256
- feat: Add trigger module with email, SMS, and Slack integration by @suahlingo in #201, #205, #170
- feat: Add RabbitMQ integration for alert message queuing by @suahlingo
- docs: Add comprehensive RabbitMQ guide by @suahlingo
🤖 AI/LLM Integration
- feat: Add MCP (Model Context Protocol) core integration by @GreenScreen410 in #164
- feat: Add external MCP servers (InfluxDB, MariaDB) by @GreenScreen410 in #163
- feat: Add log analysis API with LLM support by @GreenScreen410 in #166
- feat: Add conversation summary features with context retention by @kyuengmanKim in #176, #154
- feat: Add LangGraph workflow & state management by @kyuengmanKim
- feat: Add prompt management features by @kyuengmanKim
- feat: Integrate LLM provider APIs and DB tables by @inhun in #175
- feat: Add anomaly detection feature aligned with o11y manager API by @inhun
- feat: Add prediction feature aligned with o11y manager API by @inhun
📊 Logging & Tracing
- feat: Add logging & tracing configuration with OpenTelemetry by @inhun in #190
- feat: Add m-cmp system log collector by @kyuengmanKim
- feat: Add Tempo configuration for distributed tracing by @kyuengmanKim
- feat: Add Loki integration for log aggregation by @suahlingo
- refactor: Refactor tracing and logging to use OpenTelemetry Java agent by @suahlingo
🏗️ Infrastructure & Architecture
- refactor: Major project structure reorganization by @kyuengmanKim in #254, #249, #247, #246, #244, #241, #240, #237
- refactor: Updates docker compose & config directory structure by @kyuengmanKim
- feat: Add second InfluxDB instance (mcp-influxdb) by @ish-hcc
- feat: Add MinIO for object storage by @ish-hcc
- feat: Add Semaphore integration for deployment automation by @ish-hcc
- feat: Add comprehensive GitHub Actions workflows for CI/CD by @ish-hcc
- feat: Add Docker multi-stage builds for reduced image sizes by @ish-hcc
🔧 Monitoring Agent Enhancements
- feat: Implement per metric configuration feature by @ish-hcc in #141
- feat: Implement plugin feature for monitoring agents by @ish-hcc
- feat: Add agent configuration update via Ansible by @ish-hcc
- feat: Add Fluent-Bit agent for log collection by @ish-hcc
- feat: Add Telegraf agent improvements with procstat top10 CPU/memory by @ish-hcc
- feat: Use private key for SSH connections by @ish-hcc
Breaking Changes & Migrations
Java Module Restructuring
- refactor: Rename java-module to java by @kyuengmanKim
- refactor: Split old mc-o11y-agent and mc-o11y-manager into new mc-o11y-manager structure
- refactor: Upgrade to JDK 17 and Spring Boot 3.2.12 by @ish-hcc
- refactor: Upgrade Gradle to 9.1.0 by @ish-hcc
- refactor: Rename "target" to "vm" throughout codebase by @ish-hcc
Configuration Changes
- refactor: Externalize sensitive configs via environment variables by @suahlingo
- feat: Use Grafana environment file by @ish-hcc
- fix: Update InfluxDB database name handling by @ish-hcc
API Improvements
New Endpoints
- feat: Add readyz API for health checks by @ish-hcc in #189
- feat: Add OpenAI API key management endpoints by @GreenScreen410 in #143
- feat: Add log API improvements (start/end fields optional) by @ish-hcc
- feat: Add WebSocket support for real-time VM status updates by @ish-hcc
- feat: Add insight module APIs by @kyuengmanKim
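A readiness endpoint such as the readyz API is typically polled by deployment scripts before traffic or tests are sent to a service. The following is a minimal sketch of such a poll, not the project's own tooling. It assumes the endpoint answers HTTP 200 once the service is ready; the host, port, and path in the usage note are hypothetical, so check the Swagger documentation for the real route.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_ready(
    check: Callable[[], int],
    attempts: int = 10,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if check() == 200:  # assumption: 200 means "ready"
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A call could look like `wait_until_ready(lambda: http_status("http://localhost:8080/readyz"))`, where the URL is a placeholder. Passing the check as a callable keeps the retry loop independent of the transport, so it can be exercised without a running service.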
API Refactoring
- refactor: Split LLM analysis API into feature-specific functions by @kyuengmanKim
- refactor: Update Insight controller & swagger definitions by @kyuengmanKim
- refactor: Split controllers by feature by @ish-hcc
- refactor: Improve error responses and exception handling by @suahlingo
Documentation & Developer Experience
Documentation
- docs: Add RABBITMQ_GUIDE.md by @suahlingo
- docs: Update Slack user guide with Channel ID instructions by @suahlingo
- docs: Add PROJECT_MANAGEMENT.md by @kyuengmanKim
- docs: Relocate and organize documentation by @kyuengmanKim
API Documentation
- refactor: Comprehensive Swagger/OpenAPI updates by @suahlingo, @kyuengmanKim
- refactor: Add detailed API descriptions and examples by @kyuengmanKim
- refactor: Translate Korean comments to English by @suahlingo, @kyuengmanKim
Developer Tools
- feat: Add Makefile with lint, test, and build commands by @ish-hcc
- feat: Add Spotless for code formatting by @suahlingo
- chore: Update .gitignore for better project hygiene by @GreenScreen410
Bug Fixes
Critical Fixes
- fix: Resolve agent installation permission issues by @kyuengmanKim
- fix: Fix agent status tracking and state management by @suahlingo
- fix: Fix WebSocket issues when agent status changes by @ish-hcc
- fix: Apply locks to prevent race conditions in host/VM operations by @ish-hcc
- fix: Fix InfluxDB connection and database name issues by @ish-hcc
- fix: Call agent uninstall when deleting VMs by @ish-hcc
Configuration Fixes
- fix: Add missed InfluxDB2 provisioning file by @ish-hcc
- fix: Add missed Ansible files for config updates by @ish-hcc
- fix: Fix Grafana datasource configurations by @ish-hcc
- fix: Fix RabbitMQ dependency and configuration issues by @ish-hcc
Monitoring Fixes
- fix: Remove NVIDIA SMI input (fields not provided) by @ish-hcc
- fix: Remove Fluent-Bit HTTP service port for security by @ish-hcc
- fix: Optimize Telegraf configuration by @ish-hcc
- fix: Fix Loki error messages and Log Explorer errors by @최낙수
Scheduler & Timezone
- fix: Insight scheduler timezone bug by @kyuengmanKim
- fix: Improve healthcheck configurations by @ish-hcc
Dependency Updates
Infrastructure Components
- docker-compose: Update cb-tumblebug to v0.11.16 by @ish-hcc
- go: Update cb-spider to v0.11.16 by @ish-hcc
- Previous updates include cb-tumblebug v0.11.13, v0.11.9, v0.11.8, v0.11.6
- Previous updates include cb-spider v0.11.13, v0.11.5, v0.11.4, v0.11.3, v0.11.1
Python Dependencies
- chore: Optimize packaging with uv to reduce image size by @inhun in #203, #218
- feat: Reduce Python image size by @ish-hcc
Java Dependencies
- feat: Upgrade to Gradle 9.1.0 by @ish-hcc
- feat: Upgrade to JDK 17 and Spring Boot 3.2.12 by @ish-hcc
Performance Improvements
- feat: Reduce image size of mc-observability-manager by @ish-hcc
- feat: Reduce image size of mc-observability-grafana by @ish-hcc
- feat: Optimize Telegraf configuration for better performance by @ish-hcc
- feat: Implement procstat collection of only top10 CPU/memory by @ish-hcc
- chore: Optimize Python packaging to reduce image size by @inhun
Contributors
Special thanks to all contributors:
- @suahlingo
- @ish-hcc (@임수현)
- @kyuengmanKim (@kyeongman Kim, @kkm)
- @inhun
- @GreenScreen410
- @최낙수
Full Changelog
Full Changelog: v0.4.0...v0.5.0
Swagger: https://m-cmp.github.io/mc-observability/java/swagger/index.html
Migration Guide
For Users Upgrading from v0.4.0
- Directory Structure Changes:
  - java-module has been renamed to java
  - Configuration files moved to config/ directory
- Environment Variables:
  - Review and update your environment variables
  - Check config/manager/.env for new configuration options
- Java Version:
  - JDK 17 is now required (upgraded from JDK 11)
- API Changes:
  - "target" endpoints renamed to "vm"
  - Check Swagger documentation for updated API endpoints
- New Features:
  - Configure RabbitMQ for alert notifications
  - Set up Loki for log aggregation
  - Configure Tempo for distributed tracing
  - Set up LLM integration if needed
For detailed migration instructions, please refer to the project documentation.
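One way to approach the environment variable review is to compare the keys in an existing environment file with those in the new config/manager/.env. The sketch below is an illustration only. It assumes both files use plain KEY=VALUE lines (optionally prefixed with export), and the helper names are hypothetical rather than part of the project.

```python
def parse_env_keys(text: str) -> set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(old_env: str, new_env: str) -> list[str]:
    """Return keys present in the new file but absent from the old one."""
    return sorted(parse_env_keys(new_env) - parse_env_keys(old_env))
```

To use it, read the pre-upgrade file and the new config/manager/.env as strings (for example with `pathlib.Path(...).read_text()`) and pass them to `missing_keys`. Each returned name is an option that did not exist in the old configuration and may need a value.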
v0.4.4
v0.4.3
Notice
This release serves as a pre-release ahead of the upcoming v0.5.0.
API
- Swagger UI URL: https://m-cmp.github.io/mc-observability/java/swagger/index.html
Major changes
- Refactored overall project structure for better maintainability.
- Improved Java module build system and cleaned Docker-related files.
- Refactored Loki setup and enhanced Swagger API documentation.
- Added Grafana configurations and environment variables for observability.
- Fixed timezone issue in Insight scheduler and InfluxDB database name errors.
What's Changed
- refactor : refactor agent status and swagger by @suahlingo in #233
- fix: java-module: Fix Gradle structure by @ish-hcc in #235
- Refactor/java module by @suahlingo in #234
- fix: java-module: Fix container dependencies and fix tar command by @ish-hcc in #236
- Refactor/project structure by @kyuengmanKim in #237
- fix: java-module: Clean-up remained Docker related files by @ish-hcc in #238
- refactor : refactor loki setup and apply swagger and update make lint… by @suahlingo in #239
- Refactor/project structure by @kyuengmanKim in #240
- fix: add grafana conf by @kyuengmanKim in #241
- docker-compose.yaml: Make RabbitMQ version fixed by @ish-hcc in #242
- fix: java: Add missed config-update playbook.yaml by @ish-hcc in #243
- Refactor/project structure by @kyuengmanKim in #244
- chore: add environment variables for mcp-grafana by @inhun in #245
- Refactor/project structure by @kyuengmanKim in #246
- Refactor/project structure by @kyuengmanKim in #247
- fix: insight scheduler timezone bug by @kyuengmanKim in #249
- refactor : add info update channel noti by @suahlingo in #251
- fix: Fix InfluxDB database name issue by @ish-hcc in #252
- fix: java: Fix InfluxDB database name issue and clean-up sources by @ish-hcc in #253
- Refactor/project structure by @kyuengmanKim in #254
Full Changelog: v0.4.2...v0.4.3
v0.4.2
API
Swagger UI URL: https://m-cmp.github.io/mc-observability/java-module/swagger/index.html
API Detail Usage Scenario (v0.4.2)
mc-observability Agent support metric list: (link)
mc-observability v0.4.2 monitoring & logging API usage scenarios: (link)
mc-observability v0.4.2 Trigger/Event Handler API usage scenarios: (link)
mc-observability v0.4.2 Insight API usage scenarios: (link)
Major changes
- Refactored Java module and overall codebase
- Enhanced Insight API integration with Manager
- Fixed RabbitMQ and configuration dependency issues
- Improved observability container stability and healthchecks
- Upgraded build and workflow system
What's Changed
- Refactor/trigger by @suahlingo in #205
- Refactor/code updates by @kyuengmanKim in #206
- Refactor/java module by @suahlingo in #207
- Refactor/java module by @suahlingo in #208
- Refactor/code updates by @kyuengmanKim in #209
- Refactor/java module by @suahlingo in #210
- Update Gradle & Add readyz API by @ish-hcc in #211
- fix: java-module: Fix configuration variables by @ish-hcc in #212
- Refactor/log code updates by @kyuengmanKim in #213
- fix: java-module: Need to set InfluxDB's external IP to manager by @ish-hcc in #214
- fix: java-module: Set external IPs for all needed variables by @ish-hcc in #216
- Refactor/java module by @suahlingo in #215
- fix: resolve Tempo container startup issue by @inhun in #217
- chore: add healthcheck for Tempo container by @inhun in #218
- Refactor/java module by @suahlingo in #219
- Refactor/code updates by @kyuengmanKim in #220
- fix: workflows: Change build command by @ish-hcc in #221
- Refactor/java module by @suahlingo in #222
- refactor : Format Java sources using make lint by @suahlingo in #223
- fix: java-module: Fix RabbitMQ dependency issue by @ish-hcc in #225
- feat: Trigger test code updates by @kyuengmanKim in #224
- feat: update insight feature to align with o11y manager api changes by @inhun in #226
- Refactor/java module by @suahlingo in #227
- Refactor/java module by @suahlingo in #228
- fix: Manager env update by @kyuengmanKim in #229
- feat: Update Insight Controller & swagger.yaml by @kyuengmanKim in #230
- refactor : refactor swagger content by @suahlingo in #231
Full Changelog: v0.4.1...v0.4.2
v0.4.1
Notice
This release focuses on feature integration updates, with verification limited to program startup and container health checks.
Detailed feature validation and guidance will be provided in the upcoming v0.4.2 release (mid-October).
API
Swagger UI URL: https://m-cmp.github.io/mc-observability/java-module/swagger/index.html
Major changes
- Refactored backend architecture for integration and scalability.
- Migrated to JDK 17 and Spring Boot 3.2.
- Introduced LLM and log analysis APIs.
- Added MCP integration.
- Upgraded cb-tumblebug and cb-spider to v0.11.x.
What's Changed
- feat : add tumblebug, spider, insight client, refactor target model by @suahlingo in #108
- feat: Upgrade cb-tumblebug to v0.10.10 and fix minio by @ish-hcc in #109
- fix: java-module: Add missed classes by @ish-hcc in #112
- feat/refactor: java-module: Updates docker service & container names by @ish-hcc in #113
- fix: java-module: Fix merge error by @ish-hcc in #114
- feat/refactor: Fix merge error in application.yaml by @ish-hcc in #115
- fix: Apply lock when changing hosts by @ish-hcc in #116
- fix : yaml and query by @suahlingo in #117
- feat/refactor: Remove application-local.yaml by @ish-hcc in #118
- fix : add ACTIVE enum and vmId change to TargetId by @suahlingo in #119
- feat: go: Upgrade cb-spider base to v0.11.0 by @ish-hcc in #122
- add insight-scheduler depends on mariadb by @kyuengmanKim in #123
- Upgrade cb-tumblebug and cb-spider to v0.11.0 by @ish-hcc in #124
- feat/refactor: Implementing semaphore working with targets by @ish-hcc in #126
- feat: add delete all sessions by @GreenScreen410 in #127
- feat/refactor: Implement semaphore working with targets by @ish-hcc in #129
- feat/refactor: Use JDK 17, Spring Boot 3.2.12 by @ish-hcc in #131
- Refactor/manager by @suahlingo in #132
- Refactor/manager by @suahlingo in #133
- ansible, service: Use private key for SSH connections by @ish-hcc in #134
- refactor : remove git method, refactor update agent task status by @suahlingo in #135
- feat/refactor: Add missed telegraf binary and fix swagger by @ish-hcc in #136
- refactor : get single target, fix query param, add agent service status by @suahlingo in #137
- feat/refactor: Improve SSH connection by @ish-hcc in #138
- feat/refactor: Implement plugin feature by @ish-hcc in #139
- feat/refactor: Implement per metric configuration feature by @ish-hcc in #141
- Refactor/manager by @suahlingo in #140
- Refactor/manager by @suahlingo in #142
- feat: OpenAI key API refactor by @GreenScreen410 in #143
- Refactor/manager by @suahlingo in #145
- feat/refactor: Optimize Telegraf config by @ish-hcc in #146
- Refactor Spring Log Format by @suahlingo in #148
- chore: add .env to .gitignore by @GreenScreen410 in #147
- Refactor/manager by @suahlingo in #151
- Refactor/manager by @suahlingo in #153
- Develop/log analysis by @kyuengmanKim in #154
- fix: add method to delete all chat sessions by @GreenScreen410 in #156
- Refactor/manager by @suahlingo in #157
- fix : fix select influxdb query and logic by @suahlingo in #158
- Refactor/manager by @suahlingo in #160
- feat/refactor: Upgrade cb-tumblebug to v0.11.8 and cb-spider to v0.11.4 by @ish-hcc in #161
- feat/refactor: go: Update patch file for cb-spider v0.11.4 by @ish-hcc in #162
- Feat/trigger by @suahlingo in #170
- java-module: Rename target to vm + some fixes by @ish-hcc in #171
- java-module: Update cb-tumblebug to v0.11.9 and cb-spider to v0.11.5 by @ish-hcc in #172
- feat: add external MCP servers (InfluxDB, MariaDB) by @GreenScreen410 in #163
- feat: mcp core integration by @GreenScreen410 in #164
- feat: llm config by @GreenScreen410 in #165
- feat: log analysis api by @GreenScreen410 in #166
- feat: change infra files (db, docker) by @GreenScreen410 in #167
- Hotfix/mcp bug fix by @kyuengmanKim in #173
- Refactor/llm feature updates by @kyuengmanKim in #174
- refactor: relocate mcp from external by @inhun in #175
- Develop/graph by @kyuengmanKim in #176
- refactor: Add LLM API descriptions & Update res req models examples by @kyuengmanKim in #177
- refactor: Integrate LLM provider API & DB Table by @inhun in #178
- Feat/swagger by @suahlingo in #179
- feat: Add second InfluxDB by @ish-hcc in #182
- go: Fix build error by @ish-hcc in #183
- java-module: Add default key for mc-o11y-manager by @ish-hcc in #184
- Add container CD files by @ish-hcc in #185
- fix: Use deployed images and fix Docker build by @ish-hcc in #186
- Fix/python config by @kyuengmanKim in #187
- feat: python: Reduce image size by @ish-hcc in #188
- feat: add insight module readyz api by @kyuengmanKim in #189
- feat: add logging & tracing configuration by @inhun in #190
- java-module: Make Docker containers work by @ish-hcc in #191
- java-module: Use Grafana environment file by @ish-hcc in #192
- fix: Update git workflow yaml by @kyuengmanKim in #193
- java-module: Fix ansible variables by @ish-hcc in #195
- fix: Update python config.yaml by @kyuengmanKim in #194
- fix: java-module: Fix lint warning by @ish-hcc in #196
- refactor: update swagger.yaml by @kyuengmanKim in #197
- java-module: Update cb-tumblebug to v0.11.13 and cb-spider to v0.11.13 by @ish-hcc in #198
- java-module: Reduce image size of mc-observability-manager by @ish-hcc in #199
- feat: java-module: Reduce image size of mc-observability-grafana by @ish-hcc in #200
- refactor: update insight dockerfile by @kyuengmanKim in #202
- chore: optimize packaging with uv to reduce container image size by @inhun in #203
- Refactor/trigger by @suahlingo in #201
- fix: java-module: Fix lint warning by @ish-hcc in #204
New Contributors
- @suahlingo made their first contribution in #108
- @GreenScreen410 made their first contribution in #127
Full Changelog: v0.4.0...v0.4.1
v0.4.0
API
Swagger UI URL: https://m-cmp.github.io/mc-observability/java-module/swagger/index.html
Major changes
- LLM Log Analysis feature development in progress (some WIP APIs added).
- CB-Tumblebug v0.11.0 integration test complete.
- CB-Spider v0.11.0 integration test complete.
- Minor bug fixes.
What's Changed
- feat: add chronograf guide by @kyuengmanKim in #92
- Develop log analysis api by @inhun in #99
- feat: add openai key management api PR by @kyuengmanKim in #100
- fix: Resolve Python-Module Build Error by @kyuengmanKim in #103
- feat: Add Insight Log Analysis API by @kyuengmanKim in #104
- feat: Add Insight Log Analysis DB Table by @kyuengmanKim in #105
- feat: Upgrade cb-tumblebug to 0.10.10 by @ish-hcc in #106
- feat: insight docker compose, env, mcp server updates by @kyuengmanKim in #107
- fix: java-module: Fix credential file mount path by @ish-hcc in #110
- chore: Docker compose updates by @kyuengmanKim in #111
- fix: java-module: Increase cb-tumblebug read timeout by @ish-hcc in #120
- feat: go: Upgrade cb-spider base to v0.11.0 by @ish-hcc in #121
- feat: go: Upgrade cb-spider base to v0.11.0 by @ish-hcc in #122
- add insight-scheduler depends on mariadb by @kyuengmanKim in #123
Full Changelog: v0.3.1...v0.4.0
v0.3.1
What's Changed
- Update Python docker by @kyuengmanKim in #89
- java-module: Fix init.sh running command
- Develop get monitoring items api by @inhun in #90
- Change scheduler DB connection url by @inhun in #91
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- refactor: update slack user token guide by @kyuengmanKim in #79
- fix trigger date format UTC by @eunbikim-inno in #80
- fix: update anomaly timezone kst to utc by @kyuengmanKim in #81
- fix: anomaly time range bug by @kyuengmanKim in #82
- Fix/downsampling bug fix by @kyuengmanKim in #83
- Update downsampling code by @kyuengmanKim in #84
- Update downsampling code by @kyuengmanKim in #85
- Update trigger and anomaly by @kyuengmanKim in #86
- Update manager conf by @kyuengmanKim in #87
- Update anomaly model by @kyuengmanKim in #88
Major changes
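- Fixed an issue where required data was missing from the database when the o11y manager was first run in a clean environment.
- Fixed an intermittent issue where the agent failed to collect monitoring data on its first run after a Monitoring Target was registered.
- Fixed an issue where InfluxDB container data was not persisted.
- Enabled log collection for the o11y manager to support log collection in the M-CMP environment.
- Separated the log query APIs for the o11y manager and for VMs.
- Applied Tumblebug v0.9.22.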
Tested CSPs
- Azure
- AWS
- GCP
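Notes
- Monitoring data is collected every 1 minute.
- You can verify that monitoring data is being collected correctly once at least 1 minute has passed after registering a Monitoring Target.
- The CSP-based monitoring API works only with VMs deployed on Azure. (README.md: Check VM's monitoring data from CSP)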
Full Changelog: v0.2.3...v0.3.0
v0.2.3
What's Changed
- python utils bug fix & swagger update by @kyuengmanKim in #76
- refactor: README update by @kyuengmanKim in #77
- feat: add manager logging conf by @kyuengmanKim in #78
Fixed
- Fix log search issue
- Fix Tumblebug model mismatch issue (Tested with CB-TB v0.9.18)
- Fix docker compose default configuration issue
Added
- Add o11y-manager monitoring configuration
How to Use
Alarm & Trigger
- Scenario: #39
- Slack Bot Setup Guide: https://github.com/m-cmp/mc-observability/blob/main/java-module/slack_user_guide.md
Insight Howto
Full Changelog: v0.2.2...v0.2.3
v0.2.2
What's Changed
- swagger: Define SpiderMonitoringInfo.Data by @ish-hcc in #54
- Add api link by @hyeon-inno in #55
- Python swagger update by @kyuengmanKim in #56
- java-module: README.md: Add more monitoring setup guide by @ish-hcc in #58
- service url yaml update by @kyuengmanKim in #59
- python Readme update by @kyuengmanKim in #60
- refactor: predict response model update by @kyuengmanKim in #62
- fix:trigger bug fix by @eunbikim-inno in #63
- python: Add scheduler dag for downsampling by @inhun in #64
- fix:trigger make script bug fix by @eunbikim-inno in #65
- Refactor: python model refactor by @kyuengmanKim in #66
- python: Fix create downsampling database bug by @inhun in #67
- java-module: Use snake case for request body and response body by @ish-hcc in #68
- swagger & insight bug fix by @kyuengmanKim in #69
- Refactor: python model refactor & swagger.yaml update by @kyuengmanKim in #70
- fix: python client update by @kyuengmanKim in #71
- trigger-event-handler-service by @eunbikim-inno in #72
- Update DTO when triggering policy by @eunbikim-inno in #73
- Python update(metric post api) by @kyuengmanKim in #74
- opensearch query key update by @kyuengmanKim in #75
New Contributors
- @eunbikim-inno made their first contribution in #63
- @inhun made their first contribution in #64
Full Changelog: v0.2.1...v0.2.2