Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Severe performance degradation when using Django 5.1 connection pool #445

Open
LuisGMM opened this issue Nov 18, 2024 · 1 comment
Open

Comments

@LuisGMM
Copy link

LuisGMM commented Nov 18, 2024

The new Django 5.1 version adds connection pools for postgresql.

I noticed a very interesting, and terrifying, problem. The number of requests/second dramatically decreases up to a point Django explodes when using django_prometheus and the new connection pooling.

I tested with this database settings:

DATABASES = {
    'default': {
        'ENGINE': 'django_prometheus.db.backends.postgresql',
        # 'ENGINE': 'django.db.backends.postgresql',
        'NAME': env('DB_NAME'),
        'USER': env('DB_USER'),
        'PASSWORD': env('DB_PASSWORD'),
        'HOST': env('DB_HOST'),
        'PORT': env('DB_PORT'),
        'DISABLE_SERVER_SIDE_CURSORS': True,
        'OPTIONS': {
            'pool': True,
        },
    }
}

And these requirements:

# Infrastructure
Django==5.1.3
adrf==0.1.8
gunicorn==20.1.0
uvicorn==0.22.0

# Metrics, Monitoring && logging
django-prometheus==2.3.1
prometheus-client==0.21.0

# db
psycopg==3.2.3
psycopg-pool==3.2.4

I noticed the problem using this command: hey -c 1 -n 100 https://127.0.0.1/api/v1/test/

With this view:

class TestView(APIView):
    permission_classes = ()

    async def get(self, request) -> Response:  # type: ignore
        del request
        origins = await Origin.objects.alast()

        if not origins:
            return Response(status=status.HTTP_204_NO_CONTENT)

        return Response(OriginGetResponseSerializer({'origins': [origins]}).data, status=status.HTTP_200_OK)

Where Origin is a simple model:

from uuid import uuid4

from django.db import models


class Origin(models.Model):
    id = models.UUIDField(
        primary_key=True,
        default=uuid4,
        editable=False,
        unique=True,
        blank=False,
        null=False,
    )
    name = models.CharField(
        unique=True,
        max_length=32,
    )

    active_android = models.BooleanField(default=False)
    active_ios = models.BooleanField(default=False)
    active_web = models.BooleanField(default=False)

    active_new_item = models.BooleanField(default=False)

    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    objects = models.Manager()  # To comply with mypy

Note that using django.db.backends.postgresql works as expected, steadily. Also, removing the OPTIONS key also avoids the issue.

When running the benchmark with hey -c 1 -n 100 https://127.0.0.1/api/v1/test/ I get a report like

Summary:
  Total:	0.7236 secs
  Slowest:	0.0113 secs
  Fastest:	0.0063 secs
  Average:	0.0072 secs
  Requests/sec:	138.2019
  
  Total data:	24400 bytes
  Size/request:	244 bytes

Response time histogram:
  0.006 [1]	|■
  0.007 [25]	|■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.007 [39]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.008 [21]	|■■■■■■■■■■■■■■■■■■■■■■
  0.008 [9]	|■■■■■■■■■
  0.009 [4]	|■■■■
  0.009 [0]	|
  0.010 [0]	|
  0.010 [0]	|
  0.011 [0]	|
  0.011 [1]	|■


Latency distribution:
  10% in 0.0066 secs
  25% in 0.0068 secs
  50% in 0.0071 secs
  75% in 0.0076 secs
  90% in 0.0081 secs
  95% in 0.0084 secs
  99% in 0.0113 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0063 secs, 0.0113 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0001 secs
  resp wait:	0.0071 secs, 0.0062 secs, 0.0086 secs
  resp read:	0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200]	100 responses

Then, the Requests/sec start to decrease in subsequent runs:

Run 1:
  Requests/sec:	53.0024

Run 2:
  Requests/sec:	138.1491

Run 3:
  Requests/sec:	128.2628

Run 4:
  Requests/sec:	110.5167

Run 5:
  Requests/sec:	114.9347

Run 6:
  Requests/sec:	108.7823

Run 7:
  Requests/sec:	103.1321

Run 8:
  Requests/sec:	90.7509

Run 9:
  Requests/sec:	91.4321

Run 10:
  Requests/sec:	83.5613

Run 11:
  Requests/sec:	86.0125

Run 12:
  Requests/sec:	76.2282

Run 13:
  Requests/sec:	78.4934

Run 14:
  Requests/sec:	70.1448

Run 15:
  Requests/sec:	68.1487

Note I used this cute command to run the test: for i in {1..15}; do echo "Run $i:"; hey -c 1 -n 100 https://127.0.0.1/api/v1/test/ | awk '/Requests\/sec/'; echo ""; done

I am not sure why this happens, however, it has to do with connections and cursors (in this file of the repo?).

@LuisGMM
Copy link
Author

LuisGMM commented Nov 26, 2024

I have found this Django ticket where they mention that connections must be closed when not ran in django e.g. in other services like celery, etc. Could it be possible that the Prometheus logging is not closing the connections?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant