Using check_interval < 1s crashes Icinga DB #882

Open
yhabteab opened this issue Jan 21, 2025 · 2 comments
Labels
area/schema bug Something isn't working

@yhabteab
Member

Describe the bug

Any check interval shorter than 1s becomes a fractional floating-point number when expressed in seconds. Such values can be encoded into and decoded from the corresponding struct field without problems, but Icinga DB crashes when it tries to insert 0.001 as the check_interval value, because that column is of type bigint. There was a similar issue in Icinga DB Web (Icinga/icingadb-web#910), but apparently MySQL/MariaDB simply round these numbers down to 0, which causes a division-by-zero error there, instead of rejecting them like PostgreSQL does. We last observed this during a debugging session with @nilmerg, where such a check interval caused a mysterious crash.

2025-01-21T15:39:18.592+0100	WARN	config-sync	Aborted config sync after 10.086498s
2025-01-21T15:39:19.539+0100	FATAL	main	pq: invalid input syntax for type bigint: "0.001"
can't perform "INSERT INTO \"host\" (\"check_timeperiod_id\", \"command_endpoint_id\", \"eventcommand_id\", \"display_name\", \"passive_checks_enabled\", \"address6\", \"flapping_enabled\", \"flapping_threshold_high\", \"icon_image_alt\", \"active_checks_enabled\", \"zone_name\", \"is_volatile\", \"flapping_threshold_low\", \"icon_image_id\", \"max_check_attempts\", \"name_ci\", \"notifications_enabled\", \"check_retry_interval\", \"eventcommand_name\", \"address\", \"checkcommand_name\", \"properties_checksum\", \"name_checksum\", \"check_interval\", \"command_endpoint_name\", \"event_handler_enabled\", \"zone_id\", \"name\", \"notes\", \"environment_id\", \"id\", \"notes_url_id\", \"affected_children\", \"check_timeout\", \"checkcommand_id\", \"action_url_id\", \"perfdata_enabled\", \"address_bin\", \"address6_bin\", \"check_timeperiod_name\") VALUES (:check_timeperiod_id, :command_endpoint_id, :eventcommand_id, :display_name, :passive_checks_enabled, :address6, :flapping_enabled, :flapping_threshold_high, :icon_image_alt, :active_checks_enabled, :zone_name, :is_volatile, :flapping_threshold_low, :icon_image_id, :max_check_attempts, :name_ci, :notifications_enabled, :check_retry_interval, :eventcommand_name, :address, :checkcommand_name, :properties_checksum, :name_checksum, :check_interval, :command_endpoint_name, :event_handler_enabled, :zone_id, :name, :notes, :environment_id, :id, :notes_url_id, :affected_children, :check_timeout, :checkcommand_id, :action_url_id, :perfdata_enabled, :address_bin, :address6_bin, :check_timeperiod_name) ON CONFLICT ON CONSTRAINT pk_host DO UPDATE SET \"check_timeperiod_id\" = EXCLUDED.\"check_timeperiod_id\", \"command_endpoint_id\" = EXCLUDED.\"command_endpoint_id\", \"eventcommand_id\" = EXCLUDED.\"eventcommand_id\", \"display_name\" = EXCLUDED.\"display_name\", \"passive_checks_enabled\" = EXCLUDED.\"passive_checks_enabled\", \"address6\" = EXCLUDED.\"address6\", \"flapping_enabled\" = EXCLUDED.\"flapping_enabled\", 
\"flapping_threshold_high\" = EXCLUDED.\"flapping_threshold_high\", \"icon_image_alt\" = EXCLUDED.\"icon_image_alt\", \"active_checks_enabled\" = EXCLUDED.\"active_checks_enabled\", \"zone_name\" = EXCLUDED.\"zone_name\", \"is_volatile\" = EXCLUDED.\"is_volatile\", \"flapping_threshold_low\" = EXCLUDED.\"flapping_threshold_low\", \"icon_image_id\" = EXCLUDED.\"icon_image_id\", \"max_check_attempts\" = EXCLUDED.\"max_check_attempts\", \"name_ci\" = EXCLUDED.\"name_ci\", \"notifications_enabled\" = EXCLUDED.\"notifications_enabled\", \"check_retry_interval\" = EXCLUDED.\"check_retry_interval\", \"eventcommand_name\" = EXCLUDED.\"eventcommand_name\", \"address\" = EXCLUDED.\"address\", \"checkcommand_name\" = EXCLUDED.\"checkcommand_name\", \"properties_checksum\" = EXCLUDED.\"properties_checksum\", \"name_checksum\" = EXCLUDED.\"name_checksum\", \"check_interval\" = EXCLUDED.\"check_interval\", \"command_endpoint_name\" = EXCLUDED.\"command_endpoint_name\", \"event_handler_enabled\" = EXCLUDED.\"event_handler_enabled\", \"zone_id\" = EXCLUDED.\"zone_id\", \"name\" = EXCLUDED.\"name\", \"notes\" = EXCLUDED.\"notes\", \"environment_id\" = EXCLUDED.\"environment_id\", \"id\" = EXCLUDED.\"id\", \"notes_url_id\" = EXCLUDED.\"notes_url_id\", \"affected_children\" = EXCLUDED.\"affected_children\", \"check_timeout\" = EXCLUDED.\"check_timeout\", \"checkcommand_id\" = EXCLUDED.\"checkcommand_id\", \"action_url_id\" = EXCLUDED.\"action_url_id\", \"perfdata_enabled\" = EXCLUDED.\"perfdata_enabled\", \"address_bin\" = EXCLUDED.\"address_bin\", \"address6_bin\" = EXCLUDED.\"address6_bin\", \"check_timeperiod_name\" = EXCLUDED.\"check_timeperiod_name\""
github.com/icinga/icinga-go-library/database.CantPerformQuery
	/Users/yhabteab/Workspace/go/icinga-go-library/database/utils.go:16
github.com/icinga/icinga-go-library/database.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2.1
	/Users/yhabteab/Workspace/go/icinga-go-library/database/db.go:431
github.com/icinga/icinga-go-library/retry.WithBackoff
	/Users/yhabteab/Workspace/go/icinga-go-library/retry/retry.go:65
github.com/icinga/icinga-go-library/database.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
	/Users/yhabteab/Workspace/go/icinga-go-library/database/db.go:426
golang.org/x/sync/errgroup.(*Group).Go.func1
	/Users/yhabteab/Workspace/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78
runtime.goexit
	/opt/homebrew/Cellar/go/1.23.4/libexec/src/runtime/asm_arm64.s:1223
retry deadline exceeded
github.com/icinga/icinga-go-library/retry.WithBackoff
	/Users/yhabteab/Workspace/go/icinga-go-library/retry/retry.go:100
github.com/icinga/icinga-go-library/database.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
	/Users/yhabteab/Workspace/go/icinga-go-library/database/db.go:426
golang.org/x/sync/errgroup.(*Group).Go.func1
	/Users/yhabteab/Workspace/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78
runtime.goexit
	/opt/homebrew/Cellar/go/1.23.4/libexec/src/runtime/asm_arm64.s:1223
exit status 1
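
The failure mode in the log above can be illustrated in isolation. The following is only a sketch, not code from Icinga DB or the database driver: `renderFloat` and `acceptsAsBigint` are hypothetical helpers, and it assumes the float is sent as plain decimal text and that the integer column parses its input strictly, as PostgreSQL's error message suggests.

```go
package main

import (
	"fmt"
	"strconv"
)

// renderFloat formats a float64 as plain decimal text (assumption: this is
// roughly how a text-based SQL parameter would look on the wire).
func renderFloat(v float64) string {
	return strconv.FormatFloat(v, 'f', -1, 64)
}

// acceptsAsBigint reports whether s parses as a 64-bit integer. It stands in
// for a strict bigint input parser and is not the real PostgreSQL logic.
func acceptsAsBigint(s string) bool {
	_, err := strconv.ParseInt(s, 10, 64)
	return err == nil
}

func main() {
	for _, interval := range []float64{0.001, 0.5, 1, 60} {
		text := renderFloat(interval)
		// int64(interval) mimics a lenient database that silently drops the
		// fractional part, turning sub-second intervals into 0.
		fmt.Printf("%v -> %q strict=%t lenient=%d\n",
			interval, text, acceptsAsBigint(text), int64(interval))
	}
}
```

Under these assumptions, sub-second values are either rejected outright (strict parsing) or collapse to 0 (lenient truncation), which matches the two behaviours described in this issue.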
@yhabteab yhabteab added area/schema bug Something isn't working and removed area/schema labels Jan 21, 2025
@yhabteab
Member Author

Note that this is not limited to the check_interval column; other columns such as check_timeout and check_retry_interval suffer from the same problem.
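
Since several float64 fields are affected, one conceivable approach is a shared wrapper type that converts the value to an integer before it reaches the database. The sketch below is an assumption for illustration only: `WholeSeconds` is a hypothetical type, not part of Icinga DB, and rounding up to whole seconds is just one possible policy, not the fix chosen by the maintainers.

```go
package main

import (
	"database/sql/driver"
	"fmt"
	"math"
)

// WholeSeconds is a hypothetical wrapper for durations given in seconds that
// must end up in an integer column.
type WholeSeconds float64

// Value implements driver.Valuer. It rejects values that cannot be stored
// and rounds fractional seconds up, so that a positive sub-second interval
// becomes 1 instead of 0.
func (s WholeSeconds) Value() (driver.Value, error) {
	f := float64(s)
	if math.IsNaN(f) || f < 0 || f >= math.MaxInt64 {
		return nil, fmt.Errorf("invalid duration in seconds: %v", f)
	}
	return int64(math.Ceil(f)), nil
}

func main() {
	for _, s := range []WholeSeconds{0.001, 1, 59.2} {
		v, err := s.Value()
		fmt.Println(float64(s), "->", v, err)
	}
}
```

Rounding up loses the sub-second precision, so it only makes sense if the column is meant to hold whole seconds. Storing a finer unit such as milliseconds would be an alternative, but that depends on how the schema and its consumers interpret the column.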

@oxzi
Member

oxzi commented Mar 6, 2025

Good catch. I have taken another look at the struct types and annotated some similar findings.

diff --git a/pkg/icingadb/v1/checkable.go b/pkg/icingadb/v1/checkable.go
index 78d75f5..8d9680c 100644
--- a/pkg/icingadb/v1/checkable.go
+++ b/pkg/icingadb/v1/checkable.go
@@ -11,11 +11,11 @@ type Checkable struct {
        NameCiMeta            `json:",inline"`
        ActionUrlId           types.Binary `json:"action_url_id"`
        ActiveChecksEnabled   types.Bool   `json:"active_checks_enabled"`
-       CheckInterval         float64      `json:"check_interval"`
+       CheckInterval         float64      `json:"check_interval"` // mysql: int unsigned, pgsql: uint
        CheckTimeperiodName   string       `json:"check_timeperiod_name"`
        CheckTimeperiodId     types.Binary `json:"check_timeperiod_id"`
-       CheckRetryInterval    float64      `json:"check_retry_interval"`
-       CheckTimeout          float64      `json:"check_timeout"`
+       CheckRetryInterval    float64      `json:"check_retry_interval"` // mysql: int unsigned, pgsql: uint
+       CheckTimeout          float64      `json:"check_timeout"`        // mysql: int unsigned, pgsql: uint
        CheckcommandName      string       `json:"checkcommand_name"`
        CheckcommandId        types.Binary `json:"checkcommand_id"`
        CommandEndpointName   string       `json:"command_endpoint_name"`
diff --git a/pkg/icingadb/v1/command.go b/pkg/icingadb/v1/command.go
index 74ec555..575b0d9 100644
--- a/pkg/icingadb/v1/command.go
+++ b/pkg/icingadb/v1/command.go
@@ -12,7 +12,7 @@ type Command struct {
        NameCiMeta         `json:",inline"`
        ZoneId             types.Binary `json:"zone_id"`
        Command            string       `json:"command"`
-       Timeout            uint32       `json:"timeout"`
+       Timeout            uint32       `json:"timeout"` // mysql: int unsigned AND smallint unsigned, pgsql: uint AND smalluint -- differs in implementing structs below
 }

 type CommandArgument struct {
diff --git a/pkg/icingadb/v1/state.go b/pkg/icingadb/v1/state.go
index d2e48e8..ee895c8 100644
--- a/pkg/icingadb/v1/state.go
+++ b/pkg/icingadb/v1/state.go
@@ -14,7 +14,7 @@ type State struct {
        CheckCommandline          types.String                       `json:"check_commandline"`
        CheckSource               types.String                       `json:"check_source"`
        SchedulingSource          types.String                       `json:"scheduling_source"`
-       ExecutionTime             float64                            `json:"execution_time"`
+       ExecutionTime             float64                            `json:"execution_time"` // mysql: int unsigned, pgsql: uint
        HardState                 uint8                              `json:"hard_state"`
        InDowntime                types.Bool                         `json:"in_downtime"`
        IsAcknowledged            icingadbTypes.AcknowledgementState `json:"is_acknowledged"`
@@ -24,7 +24,7 @@ type State struct {
        IsReachable               types.Bool                         `json:"is_reachable"`
        LastStateChange           types.UnixMilli                    `json:"last_state_change"`
        LastUpdate                types.UnixMilli                    `json:"last_update"`
-       Latency                   float64                            `json:"latency"`
+       Latency                   float64                            `json:"latency"` // mysql: int unsigned, pgsql: uint
        LongOutput                types.String                       `json:"long_output"`
        NextCheck                 types.UnixMilli                    `json:"next_check"`
        NextUpdate                types.UnixMilli                    `json:"next_update"`
@@ -36,5 +36,5 @@ type State struct {
        Severity                  uint16                             `json:"severity"`
        SoftState                 uint8                              `json:"soft_state"`
        StateType                 icingadbTypes.StateType            `json:"state_type"`
-       CheckTimeout              float64                            `json:"check_timeout"`
+       CheckTimeout              float64                            `json:"check_timeout"` // mysql: int unsigned, pgsql: uint
 }

In addition, some types.Int fields (types.Int being a nullable int64) are represented as unsigned integers. However, they seem fine.

@oxzi oxzi added this to the 1.3.0 milestone Mar 6, 2025