
load performance slower than stdlib json, double that of orjson #260

@davetapley


Things to check first

  • I have searched the existing issues and didn't find my bug already reported there

  • I have checked that my bug is still present in the latest release

cbor2 version

5.6.5

Python version

3.12.2

What happened?

Per this I assumed CBOR would be faster than JSON, but my testing seems to indicate otherwise.

For a 5MB JSON file (total seconds for 100 loads each, as reported by timeit):

json 2.7858687019997888
orjson 1.316059184999176
cbor 3.216009937999843
cbor_open 3.6368825289991946
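
Since each figure above is a total over 100 calls, it can help to convert them to per-load times and ratios. A small sketch using only the numbers reported above (the variable names are just for illustration):

```python
# Totals in seconds for 100 loads each, copied from the results above.
totals = {
    "json": 2.7858687019997888,
    "orjson": 1.316059184999176,
    "cbor": 3.216009937999843,
    "cbor_open": 3.6368825289991946,
}
RUNS = 100  # matches number=100 in the timeit calls

# Average milliseconds per single load.
per_load_ms = {name: total / RUNS * 1000 for name, total in totals.items()}

# Slowdown of each loader relative to orjson.
vs_orjson = {name: total / totals["orjson"] for name, total in totals.items()}

for name in totals:
    print(f"{name:10s} {per_load_ms[name]:6.2f} ms/load  {vs_orjson[name]:.2f}x orjson")
```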

The cbor_open case uses the file-object load shown below (from the docs); the others use read_bytes.

cbor2/docs/usage.rst

Lines 16 to 18 in 071a165

# Efficiently deserialize from a file
with open('input.cbor', 'rb') as fp:
    obj = load(fp)

The file sizes are also closer than I'd imagine:

ll -h 5MB-min.*
-rw-rw-rw- 1 vscode vscode 4.2M Aug 21 20:19 5MB-min.cbor
-rw-rw-rw- 1 vscode vscode 4.5M Aug  6 10:49 5MB-min.json

Am I missing something?

I can run from _cbor2 import * without an ImportError, so I assume I am using the optimized C version?

cbor2/cbor2/__init__.py

Lines 22 to 27 in 071a165

try:
    from _cbor2 import *  # noqa: F403
except ImportError:
    # Couldn't import the optimized C version; ignore the failure and leave the
    # pure Python implementations in place.
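
A successful import of _cbor2 does not by itself show which implementation the public functions resolve to. One rough way to check is to inspect the callable directly. The helper below is a hypothetical sketch (describe_callable is not part of cbor2) and is demonstrated on stdlib callables only, so it runs without cbor2 installed:

```python
import inspect
import json


def describe_callable(func):
    """Return (defining module name, True if the callable is C-implemented)."""
    module = getattr(func, "__module__", None)
    return module, inspect.isbuiltin(func)


# json.loads is a pure-Python function; len is implemented in C.
print(describe_callable(json.loads))
print(describe_callable(len))
```

With cbor2 installed, calling describe_callable(cbor2.loads) would be the check of interest. My assumption, not confirmed here, is that it reports a C-implemented callable from the _cbor2 module when the extension is active, and a plain Python function otherwise.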

How can we reproduce the bug?

from timeit import timeit
import json
import orjson
import cbor2
from pathlib import Path


data = orjson.loads(Path('5MB-min.json').read_bytes())
Path('5MB-min.cbor').write_bytes(cbor2.dumps(data))


def load_json():
    path = Path('5MB-min.json')
    json.loads(path.read_bytes())


def load_orjson():
    path = Path('5MB-min.json')
    orjson.loads(path.read_bytes())


def load_cbor():
    path = Path('5MB-min.cbor')
    cbor2.loads(path.read_bytes())


def load_cbor_open():
    with open('5MB-min.cbor', 'rb') as fp:
        cbor2.load(fp)


print('json', timeit(load_json, number=100))
print('orjson', timeit(load_orjson, number=100))
print('cbor', timeit(load_cbor, number=100))
print('cbor_open', timeit(load_cbor_open, number=100))
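
Note that each function above also times the file read, so decode cost and I/O are mixed together. A minimal sketch of timing only the decode step, with the bytes prepared once up front, might look like the following. The time_decode helper and the sample data are hypothetical stand-ins (the 5MB file is not reproduced here), and only the stdlib json decoder is exercised:

```python
import json
from timeit import repeat


def time_decode(decode, payload, number=20, repeats=5):
    """Best-of-`repeats` average seconds per call of decode(payload)."""
    totals = repeat(lambda: decode(payload), number=number, repeat=repeats)
    return min(totals) / number


# Hypothetical stand-in data, built in memory so no file I/O is timed.
sample = [
    {"id": i, "name": f"item-{i}", "tags": ["a", "b"], "score": i / 7}
    for i in range(2000)
]
json_payload = json.dumps(sample).encode()

print("json", time_decode(json.loads, json_payload))
```

The same helper could be called with cbor2.loads or orjson.loads and the corresponding pre-read bytes (for example the result of Path('5MB-min.cbor').read_bytes()), which would make the comparison about decoding alone.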
