
fix(document_loaders): ChatGPTLoader misses last message with default num_logs #513

Open
leoneperdigao wants to merge 2 commits into langchain-ai:main from leoneperdigao:lc/465-bug-chatgptloader-misses-the-last-message-when

Conversation

@leoneperdigao commented Jan 26, 2026

Summary

Related Issue

Fixes #465

Issue: Bug: ChatGPTLoader misses the last message when loading default logs
URL: #465

Problem

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain Community rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain Community.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

```python
from langchain_core.document_loaders import BaseLoader


class ChatGPTLoader(BaseLoader):
    """Load conversations from exported ChatGPT data."""

    def __init__(self, log_file: str, num_logs: int = -1):
        """Initialize a class object.

        Args:
            log_file: Path to the log file
            num_logs: Number of logs to load. If 0, load all logs.
        """
        self.log_file = log_file
        # The -1 default is truthy, so load() ends up slicing with [:-1].
        self.num_logs = num_logs
```

Error Message and Stack Trace (if applicable)

No response

Description


I found a bug in ChatGPTLoader. The default value of `num_logs` in `__init__` is `-1`.
In the `load` method, the slice `[: self.num_logs]` is applied whenever `num_logs` is truthy, that is, whenever it is not 0.
Because `-1` is truthy, the loader evaluates `data[:-1]`, which drops the last conversation in the export.
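
The behavior described above can be reproduced without the loader. The following is a minimal sketch, not the library's code: `buggy_select` is a hypothetical helper that mirrors the truthiness check, and the string items stand in for parsed conversations.

```python
def buggy_select(data, num_logs=-1):
    # Hypothetical stand-in for the check in load():
    # slice whenever num_logs is truthy, otherwise return everything.
    return data[:num_logs] if num_logs else data


conversations = ["conv-a", "conv-b", "conv-c"]

print(bool(-1))                                 # True
print(buggy_select(conversations))              # ['conv-a', 'conv-b']
print(buggy_select(["only-conversation"]))      # []
print(buggy_select(conversations, num_logs=0))  # ['conv-a', 'conv-b', 'conv-c']
```

With a single exported conversation, the default call returns an empty list, which matches the reproduction steps.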

Reproduction Steps

  1. Export a single conversation from ChatGPT.
  2. Load it using ChatGPTLoader(file_path).
  3. The returned document list is empty, because `data[:-1]` on a one-element list is `[]`.

Proposed Fix

Change the default value of num_logs to 0 (which means load all), or handle -1 explicitly.
I have prepared a fix and unit tests for this.

Solution

Tests

Checklist

  • Code follows project style guidelines
  • Tests added/updated and passing
  • Documentation updated (if applicable)
  • Self-review completed

Additional Notes

Fixes langchain-ai#465

The bug was that `num_logs` defaulted to `-1`, which caused
`data[:-1]` slicing (excluding the last conversation) because
`-1` is truthy in Python.

Changes:
- Changed default `num_logs` from `-1` to `0` (load all)
- Made slicing logic explicit: only slice when `num_logs > 0`
- Updated docstring to clarify the default behavior
- Added comprehensive unit tests for ChatGPTLoader

The fix ensures that:
- `ChatGPTLoader(file)` loads ALL conversations (default)
- `ChatGPTLoader(file, num_logs=0)` loads ALL conversations
- `ChatGPTLoader(file, num_logs=N)` loads first N conversations
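
The selection rule listed above can be sketched as a standalone function. This is an illustration only, assuming the rule "slice only when `num_logs > 0`"; `select_logs` is a hypothetical name and is not taken from the PR diff.

```python
def select_logs(data, num_logs=0):
    """Return the first num_logs items, or all items when num_logs <= 0."""
    # Hypothetical helper: slice only for a positive count,
    # so 0 (the new default) never trims the list.
    if num_logs > 0:
        return data[:num_logs]
    return data


conversations = ["conv-a", "conv-b", "conv-c"]

print(select_logs(conversations))              # ['conv-a', 'conv-b', 'conv-c']
print(select_logs(conversations, num_logs=0))  # ['conv-a', 'conv-b', 'conv-c']
print(select_logs(conversations, num_logs=2))  # ['conv-a', 'conv-b']
```

Under this rule a caller who still passes `num_logs=-1` explicitly would also get every conversation, since the comparison no longer depends on truthiness.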
@github-actions github-actions bot added fix and removed fix labels Jan 26, 2026
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
