auto_pdb_load_data.sh fails with MySQL Error Data too long for column 'logo' #129

@momorientes

On a clean server set up with uv run peeringdb server --setup, auto_pdb_load_data.sh fails as follows:

» ./auto_pdb_load_data.sh log
2025-11-30 10:50:59 [info     ] Starting pdb_load_data daemon...
/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/coreapi/utils.py:5: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
System check identified some issues:

WARNINGS:
?: (debug_toolbar.W001) debug_toolbar.middleware.DebugToolbarMiddleware is missing from MIDDLEWARE.
        HINT: Add debug_toolbar.middleware.DebugToolbarMiddleware to MIDDLEWARE.
including /srv/www.peeringdb.com/mainsite/settings/dev.py dev
SECRET_KEY not set, generating an ephemeral one
loaded additional settings file '/srv/www.peeringdb.com/mainsite/settings/dev.py'
Release env is 'dev'
Checking if Redis is available for negative
Was not able to ping Redis for negative, falling back to LocMemCache
Checking if Redis is available for session
Was not able to ping Redis for session, falling back to DatabaseCache
loaded settings for PeeringDB 2.74.1 (DEBUG: True)
Using API key for sync: prefix <REDACTED>
Syncing data from https://www.peeringdb.com/api/
[org] Fetching from remote cache
[org] https://public.peeringdb.com/org-0.json
[org] Processing 32393 objects
Traceback (most recent call last):
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 76, in execute
    return self.cursor.execute(query, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/cursors.py", line 179, in execute
    res = self._query(mogrified_query)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/cursors.py", line 330, in _query
    db.query(q)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
MySQLdb.DataError: (1406, "Data too long for column 'logo' at row 14489")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/srv/www.peeringdb.com/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/core/management/base.py", line 416, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/core/management/base.py", line 460, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/peeringdb_server/management/commands/pdb_load_data.py", line 168, in handle
    client.updater.update_all(resource.all_resources(), since=None)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/peeringdb/_update.py", line 256, in update_all
    self._handle_initial_sync(entries, res)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/peeringdb/_update.py", line 185, in _handle_initial_sync
    self.backend.get_concrete(res).objects.bulk_create(objs)
  File "/srv/www.peeringdb.com/peeringdb_server/managers.py", line 6, in bulk_create
    instance = super().bulk_create(objs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/models/query.py", line 808, in bulk_create
    returned_columns = self._batched_insert(
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/models/query.py", line 1912, in _batched_insert
    self._insert(
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/models/query.py", line 1873, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/models/sql/compiler.py", line 1882, in execute_sql
    cursor.execute(sql, params)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 122, in execute
    return super().execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 79, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 92, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 100, in _execute
    with self.db.wrap_database_errors:
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 76, in execute
    return self.cursor.execute(query, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/cursors.py", line 179, in execute
    res = self._query(mogrified_query)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/cursors.py", line 330, in _query
    db.query(q)
  File "/srv/www.peeringdb.com/venv/lib/python3.12/site-packages/MySQLdb/connections.py", line 280, in query
    _mysql.connection.query(self, query)
django.db.utils.DataError: (1406, "Data too long for column 'logo' at row 14489")
2025-11-30 10:52:35 [info     ] Data loaded successfully.
2025-11-30 10:52:35 [info     ] Sleeping for 1546 seconds...

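The cause appears to be that the logo column of peeringdb_organization is of type varchar(100), which is too short for at least one value in the synced org data (row 14489 of the insert):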

mysql> describe peeringdb_organization;
+------------------------+--------------+------+-----+----------------+-------------------+
| Field                  | Type         | Null | Key | Default        | Extra             |
+------------------------+--------------+------+-----+----------------+-------------------+
| id                     | int          | NO   | PRI | NULL           | auto_increment    |
| status                 | varchar(255) | NO   | MUL | NULL           |                   |
| created                | datetime(6)  | NO   |     | NULL           |                   |
| updated                | datetime(6)  | NO   |     | NULL           |                   |
| version                | int          | NO   |     | NULL           |                   |
| address1               | varchar(255) | NO   |     | NULL           |                   |
| address2               | varchar(255) | NO   |     | NULL           |                   |
| city                   | varchar(255) | NO   |     | NULL           |                   |
| state                  | varchar(255) | NO   |     | NULL           |                   |
| zipcode                | varchar(48)  | NO   |     | NULL           |                   |
| country                | varchar(2)   | NO   |     | NULL           |                   |
| name                   | varchar(255) | NO   | UNI | NULL           |                   |
| website                | varchar(255) | NO   |     | NULL           |                   |
| notes                  | longtext     | NO   |     | NULL           |                   |
| logo                   | varchar(100) | YES  |     | NULL           |                   |
| latitude               | decimal(9,6) | YES  |     | NULL           |                   |
| longitude              | decimal(9,6) | YES  |     | NULL           |                   |
| floor                  | varchar(255) | NO   |     | NULL           |                   |
| suite                  | varchar(255) | NO   |     | NULL           |                   |
| geocode_date           | datetime(6)  | YES  |     | NULL           |                   |
| geocode_status         | tinyint(1)   | NO   |     | NULL           |                   |
| aka                    | varchar(255) | NO   |     | NULL           |                   |
| name_long              | varchar(255) | NO   |     | NULL           |                   |
| flagged                | tinyint(1)   | YES  |     | NULL           |                   |
| flagged_date           | datetime(6)  | YES  |     | NULL           |                   |
| email_domains          | longtext     | YES  |     | NULL           |                   |
| restrict_user_emails   | tinyint(1)   | NO   |     | NULL           |                   |
| periodic_reauth        | tinyint(1)   | NO   |     | NULL           |                   |
| periodic_reauth_period | varchar(255) | YES  |     | NULL           |                   |
| social_media           | json         | NO   |     | _utf8mb4\'{}\' | DEFAULT_GENERATED |
| require_2fa            | tinyint(1)   | NO   |     | NULL           |                   |
| last_notified          | datetime(6)  | YES  |     | NULL           |                   |
+------------------------+--------------+------+-----+----------------+-------------------+
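
One way to confirm this would be to scan the fetched org objects for logo values longer than 100 characters before they reach the database. The following is a minimal sketch, assuming the objects are available as a list of dicts; find_oversized and the sample records are hypothetical and not part of PeeringDB.

```python
from typing import Any, Iterable

# Column width taken from the `describe peeringdb_organization` output above.
LOGO_MAX_LENGTH = 100


def find_oversized(
    records: Iterable[dict[str, Any]],
    field: str = "logo",
    max_length: int = LOGO_MAX_LENGTH,
) -> list[tuple[int, Any, int]]:
    """Return (position, id, length) for every record whose `field` is too long.

    Hypothetical helper: `records` is assumed to be a sequence of dicts,
    one per org object, each optionally carrying an "id" key.
    """
    oversized = []
    for position, record in enumerate(records):
        value = record.get(field)
        # NULL / missing values are fine; only strings can overflow the column.
        if isinstance(value, str) and len(value) > max_length:
            oversized.append((position, record.get("id"), len(value)))
    return oversized


# Made-up sample data, only to show the shape of the result.
sample = [
    {"id": 1, "logo": None},
    {"id": 2, "logo": "x" * 101},
    {"id": 3, "logo": "short.png"},
]
print(find_oversized(sample))  # prints [(1, 2, 101)]
```

Running something like this against the real org data should point at the offending object, which in turn would show how wide the column needs to be.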

Additionally, as the output above shows, auto_pdb_load_data.py lacks proper error handling: it logs "Data loaded successfully." immediately after the DataError traceback, so it evidently assumes that run_pdb_load_data() succeeded.
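
A possible shape for the fix is to check the outcome of the load before logging success. The following is only a sketch, assuming the load runs as a subprocess whose exit code reflects failure. I have not checked how run_pdb_load_data() is actually implemented, and run_command and load_once are hypothetical names.

```python
import logging
import subprocess
import sys

log = logging.getLogger("auto_pdb_load_data")


def run_command(argv: list[str]) -> bool:
    """Run `argv` and return True only if it exits with status 0.

    Hypothetical stand-in for run_pdb_load_data(); assumes the load is
    started as a child process.
    """
    result = subprocess.run(argv, check=False)
    if result.returncode != 0:
        log.error("command exited with status %d: %s", result.returncode, argv)
        return False
    return True


def load_once(argv: list[str]) -> bool:
    """Log success only when the load really succeeded."""
    if run_command(argv):
        log.info("Data loaded successfully.")
        return True
    log.error("Data load failed; not reporting success.")
    return False
```

With this, a failing load would produce an error line instead of "Data loaded successfully.", and the daemon could decide whether to retry sooner or exit.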
