Enable waiting for postgresql instance #856

@devops-42

Proposal

Use case

Zalando's postgres-operator supports configuring sidecar containers, e.g. postgres_exporter, to enable observability. After a Postgresql resource is instantiated, the operator injects the sidecar into the Pod spec, and both containers start simultaneously.

When looking at the metrics endpoint provided by the exporter (http://localhost:9187/metrics), we've seen that, in some cases, metrics are missing. In our setup, pg_stat_database_xact_commit was affected (along with many other pg_stat_database_* metrics).
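A quick way to confirm which metrics are absent is to compare the endpoint's output against a list of expected names. The following is only a minimal sketch: the helper names `metric_names` and `missing_metrics` are hypothetical, and the parsing assumes the plain Prometheus text exposition format (comment lines starting with `#`, sample lines starting with the metric name).

```python
def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line starts with the name, optionally followed by {labels}.
        name = line.split("{", 1)[0].split(None, 1)[0]
        names.add(name)
    return names


def missing_metrics(exposition_text, expected):
    """Return the expected metric names that are absent, in sorted order."""
    return sorted(set(expected) - metric_names(exposition_text))
```

Passing the body fetched from http://localhost:9187/metrics together with `["pg_stat_database_xact_commit"]` would then return the names that did not show up, if any.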

While trying to narrow down the root cause, we found that simply restarting the exporter by issuing

/ $ kill 1

in the sidecar container's shell fixed the problem. After the restart, all metrics were visible.

This leads us to assume that the exporter started while the postgres instance was still starting up and therefore could not prepare/return all metrics.

We'd appreciate a feature that delays the start of postgres_exporter until the postgres instance is ready. This could perhaps be provided by setting an environment variable DATA_SOURCE_WAIT_UNTIL_READY and, additionally, a timeout setting in case the database doesn't come up in time: DATA_SOURCE_WAIT_UNTIL_READY_TIMEOUT.
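The requested behaviour (wait until ready, give up after a timeout) could look roughly like the sketch below. It is an illustration only, not something the exporter provides: `wait_until_ready` and its parameters are hypothetical names, and the check is a plain TCP probe. An accepted TCP connection does not guarantee that postgres is ready to answer queries, so a stricter check (for example `pg_isready` or a real query) may be preferable in practice.

```python
import socket
import time


def wait_until_ready(host, port, timeout_s, interval_s=1.0):
    """Poll a TCP port until it accepts a connection or timeout_s elapses.

    Returns True once a connection succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt so a hanging connect cannot overrun the deadline.
            with socket.create_connection((host, port), timeout=min(interval_s, remaining)):
                return True
        except OSError:
            # Not accepting connections yet: pause briefly, then retry.
            time.sleep(min(interval_s, max(0.0, deadline - time.monotonic())))
```

A wrapper entrypoint could call `wait_until_ready("127.0.0.1", 5432, timeout_s=60)` and only then start the exporter process, exiting with an error if the call returns False.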

Thanks for comments/help/criticism :)

Cheers!

Activity

bsv798 commented on Aug 11, 2023

I'm experiencing exactly the same problem now.
I had to modify the exporter Docker image to include a script that waits until postgres is ready to accept connections.

kyleli666 commented on Aug 16, 2023

Same issue here. Had to manually run kill 1.

YannickDevos commented on Sep 8, 2023

Same issue here, relying on an extraContainers entry running CloudSQL_proxy to access a managed DB instance.

Looking at the prometheus-postgres-exporter container logs, I can find only this single error line (no other errors appear):
ts=2023-09-08T08:26:04.059Z caller=main.go:142 level=warn msg="Failed to create PostgresCollector" err="dial tcp 127.0.0.1:5432: connect: connection refused"

My assumption is that the cloudsql-proxy container is not yet ready to accept connections when the exporter starts.

Manually running kill 1 solves the issue.

sysadmind commented on Sep 8, 2023

Contributor

This should be resolved by #882. We have changed the logic to connect to the database during metrics collection. This means that each scrape is a new attempt to connect to the database. Of note, #902 fixes a connection leak that #882 introduced. If this doesn't solve your use case, feel free to re-open with a description of the use case so we can discuss.

Hope this helps!
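The idea described in the comment above (connect on every scrape rather than once at startup, and always release the connection afterwards) can be illustrated with a small sketch. This is not the exporter's actual Go code: `PerScrapeCollector`, the injected `connect` and `query` callables, and the `up` key are all hypothetical names chosen for illustration.

```python
class PerScrapeCollector:
    """Open a fresh connection for every scrape instead of once at startup."""

    def __init__(self, connect, query):
        self._connect = connect  # hypothetical factory: () -> connection object
        self._query = query      # hypothetical callable: (connection) -> dict of metrics

    def collect(self):
        try:
            conn = self._connect()
        except OSError:
            # Database not reachable right now: report down, retry on next scrape.
            return {"up": 0}
        try:
            metrics = dict(self._query(conn))
            metrics["up"] = 1
            return metrics
        finally:
            # Always release the connection so repeated scrapes do not leak.
            conn.close()
```

With this shape, a database that was unavailable when the process started is picked up on a later scrape without a restart, because no state from the failed attempt is kept.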

Enable waiting for postgresql instance · Issue #856 · prometheus-community/postgres_exporter