@duskfallcrew

Open source edits should be shared regardless :) These are the changes I worked on with LLM support back when I was relying solely on vendoring SDPR's code. There are other support changes I've made in Dataset-tools, but I'll have to see if I can clone the repo again locally and then add those changes here.

Mostly those are just added vendored code for TensorArt and Moescape and a few other systems that SDPR doesn't support.

Yes, I realize I'm making Dataset-tools and everything, but it's not meant to rival the thing that inspired it. So I'm sharing the content that may help refresh SDPR :)

Adding Civitai Mojibake decoding
feat(parser): Add CivitaiComfyUIFormat for Civitai JPEG metadata

Introduces a new parser, `CivitaiComfyUIFormat`, to handle the specific
UserComment format found in JPEGs generated by Civitai's ComfyUI-based
service.

This parser correctly:
- Decodes UserComment strings that may include a "charset=Unicode"
  prefix and potentially mojibake UTF-16LE encoded JSON (if passed
  through by piexif).
- Parses the main ComfyUI workflow JSON.
- Extracts A1111-style generation parameters (prompt, negative prompt,
  steps, CFG scale, seed, sampler, width, height) from the nested
  "extraMetadata" JSON string found within the workflow.
- Populates standard BaseFormat attributes (positive, negative, parameter,
  width, height, raw, etc.).
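
The decoding step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes the mojibake comes from UTF-16LE bytes being read with the opposite byte order, so swapping the byte order back recovers the JSON. The function name `decode_user_comment` and the constant `CHARSET_PREFIX` are hypothetical.

```python
import json
from typing import Optional

# Assumption: the prefix appears literally at the start of the UserComment string.
CHARSET_PREFIX = "charset=Unicode"
# Strip only ASCII padding; byte-swapped spaces (U+2000) must not be stripped.
PADDING = " \t\r\n\x00"


def decode_user_comment(user_comment: str) -> Optional[str]:
    """Return the workflow JSON string, or None if it cannot be recovered."""
    text = user_comment.strip(PADDING)
    if text.startswith(CHARSET_PREFIX):
        text = text[len(CHARSET_PREFIX):].lstrip(PADDING)

    if text.startswith("{"):
        # Already plain JSON (the common case when piexif passes clean text).
        candidate = text
    else:
        # Assumed mojibake: each UTF-16LE code unit was read as big-endian,
        # so '{' (7B 00) shows up as U+7B00. Swapping the bytes undoes it.
        try:
            swapped = text.encode("utf-16-be", errors="surrogatepass")
            candidate = swapped.decode("utf-16-le")
        except UnicodeError:
            return None

    try:
        json.loads(candidate)
    except json.JSONDecodeError:
        return None
    return candidate
```

Returning `None` for anything that does not resolve to JSON leaves the caller free to fall back to other parsers.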

ImageDataReader has been updated to detect these Civitai JPEGs (based on
UserComment content patterns and/or Software EXIF tag) and dispatch
to this new parser, improving metadata extraction for these images.

This is based on code I've been working on for Dataset-tools, which is already inspired by and borrows from SD Prompt Reader.

Logging functions
Refactor(CivitaiComfyUIFormat): Improve decoding logic and add logging

Refactors the `CivitaiComfyUIFormat` parser for better robustness and integrates standard logging using the project's Logger.

Key changes to `CivitaiComfyUIFormat`:
- Enhanced `_decode_civitai_user_comment` method:
    - More clearly differentiates between Civitai mojibake JSON,
      plain JSON (potentially after a 'charset=Unicode' prefix),
      and other non-JSON UserComment content.
    - Returns `None` if the input string cannot be resolved to one
      of the expected Civitai JSON formats.
- Improved `parse` method:
    - Initializes and uses `self._logger` for detailed logging of
      parsing steps, warnings, and errors, consistent with the
      `stable-diffusion-prompt-reader` logging system.
    - More explicit error handling and status setting
      (e.g., `BaseFormat.Status.FORMAT_ERROR` with `self._error` messages)
      if decoding or JSON parsing (main workflow or `extraMetadata`) fails.
    - Clarified population of `BaseFormat` attributes like `self._positive`,
      `self._negative`, `self._parameter`, `self._width`, `self._height`,
      and `self.raw`.
    - Handles cases where `extraMetadata` might be missing or malformed.
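
As a hedged sketch of the error handling listed above, the function below parses the workflow JSON and the nested `extraMetadata` string and reports a status plus an error message. The `Status` enum is a stand-in for `BaseFormat.Status`, the key names in `PARAM_KEYS` are assumptions about Civitai's payload, and treating a missing `extraMetadata` as a success with no parameters is one possible design choice rather than the parser's confirmed behavior.

```python
import json
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class Status(Enum):
    # Stand-in for BaseFormat.Status; values are illustrative only.
    READ_SUCCESS = "read_success"
    FORMAT_ERROR = "format_error"


# Assumed key names inside extraMetadata (not confirmed by the source).
PARAM_KEYS = ("prompt", "negativePrompt", "steps", "cfgScale",
              "seed", "sampler", "width", "height")

Result = Tuple[Status, Dict[str, Any], Optional[str]]


def parse_workflow(workflow_json: str) -> Result:
    """Return (status, parameters, error message)."""
    try:
        workflow = json.loads(workflow_json)
    except json.JSONDecodeError as exc:
        return Status.FORMAT_ERROR, {}, f"Invalid workflow JSON: {exc}"
    if not isinstance(workflow, dict):
        return Status.FORMAT_ERROR, {}, "Workflow JSON is not an object"

    extra_raw = workflow.get("extraMetadata")
    if extra_raw is None:
        # Missing extraMetadata: the workflow is valid, there are just no parameters.
        return Status.READ_SUCCESS, {}, None

    try:
        # extraMetadata is expected to be a JSON document stored as a string.
        extra = json.loads(extra_raw) if isinstance(extra_raw, str) else extra_raw
    except json.JSONDecodeError as exc:
        return Status.FORMAT_ERROR, {}, f"Invalid extraMetadata JSON: {exc}"
    if not isinstance(extra, dict):
        return Status.FORMAT_ERROR, {}, "extraMetadata is not an object"

    params = {key: extra[key] for key in PARAM_KEYS if key in extra}
    return Status.READ_SUCCESS, params, None
```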

This commit makes the `CivitaiComfyUIFormat` parser more debuggable,
its internal logic clearer, and prepares it for better integration
into the main ImageDataReader by providing more informative status
and error feedback.

Fix(CivitaiComfyUIFormat): Correct status assignment and enhance logging

This commit addresses an AttributeError in the CivitaiComfyUIFormat parser
and improves its internal logging for better diagnostics.

Changes:
- Corrected status assignment from `self.status = ...` to `self._status = ...`
  throughout the `parse()` method to prevent AttributeError, aligning with
  BaseFormat's likely intended usage for subclasses.
- Ensured the `parse()` method consistently returns `self._status`.
- Integrated logger instance (`self._logger`) from the parent package's
  Logger class, replacing print statements with standard logging calls
  (debug, info, warn, error) for better traceability within the
  sd-prompt-reader ecosystem.
- Refined error messages and status setting for various parsing scenarios,
  including failures in decoding UserComment, parsing main workflow JSON,
  or parsing nested 'extraMetadata'.
- Clarified that `self.raw` will store the cleaned ComfyUI workflow JSON string.
- Added fallback for width/height from ImageDataReader's initial values if
  not present in 'extraMetadata'.

These changes make the CivitaiComfyUIFormat parser more robust, correctly
report its status, and provide more informative logs.
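
The AttributeError is easy to reproduce in isolation. The sketch below assumes that `BaseFormat` exposes `status` and `raw` as properties without setters, which the error suggests but the commit message does not show. Both class names are hypothetical stand-ins.

```python
class BaseFormatSketch:
    """Stand-in for BaseFormat: state is exposed through read-only properties."""

    def __init__(self) -> None:
        self._status = "unread"
        self._raw = ""

    @property
    def status(self) -> str:
        return self._status

    @property
    def raw(self) -> str:
        return self._raw


class CivitaiSketch(BaseFormatSketch):
    def parse_wrong(self) -> None:
        # Assigning to a property that has no setter raises AttributeError.
        self.status = "read_success"

    def parse_right(self, cleaned_workflow_json_str: str) -> str:
        # Subclasses write to the underscore attributes instead.
        self._raw = cleaned_workflow_json_str
        self._status = "read_success"
        return self._status
```

Calling `parse_wrong()` raises `AttributeError`, while `parse_right()` updates the values that the public `status` and `raw` properties report.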

Original last line before `return self._status`:
`self.raw = cleaned_workflow_json_str`
Corrected last line before `return self._status`:
`self._raw = cleaned_workflow_json_str`

Refine(CivitaiComfyUIFormat): Add re-parse guard, width/height handling, and tool name

Further refines the `CivitaiComfyUIFormat` parser for better integration
and robustness within the stable-diffusion-prompt-reader framework.

Key changes to `CivitaiComfyUIFormat`:
- Implemented a guard in `parse()` to prevent re-parsing if the status
  is already `READ_SUCCESS`, avoiding redundant operations and logging.
- Modified `__init__` to accept `width` and `height` parameters, allowing
  ImageDataReader to pass image dimensions. These are used as fallbacks
  if not found in `extraMetadata`.
- Set `self.tool = "Civitai ComfyUI"` within the parser's `__init__`
  for consistent tool identification.
- Renamed internal decoding method to `_decode_user_comment_for_civitai`
  for better clarity.
- Ensured `self._raw` is consistently set to the cleaned workflow JSON string
  after successful decoding of the UserComment.
- Minor improvements to logging messages for clarity.

These changes enhance the parser's efficiency, ensure correct dimension
handling, and improve its overall integration behavior.
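
A minimal sketch of the re-parse guard and the dimension fallback, under stated assumptions: the class name, the string status values, the `parse_calls` counter, and the `width`/`height` keys in `extraMetadata` are illustrative and not taken from the PR.

```python
import json


class CivitaiParserSketch:
    def __init__(self, raw: str = "", width: int = 0, height: int = 0) -> None:
        self._raw = raw
        # Dimensions passed in by the reader act as fallbacks.
        self._width = width
        self._height = height
        self._status = "unread"
        self.tool = "Civitai ComfyUI"
        self.parse_calls = 0  # counts real parsing work, for illustration only

    def parse(self) -> str:
        # Guard: skip the work if a previous call already succeeded.
        if self._status == "read_success":
            return self._status
        self.parse_calls += 1

        try:
            workflow = json.loads(self._raw)
            extra = json.loads(workflow.get("extraMetadata", "{}"))
        except (ValueError, AttributeError, TypeError):
            self._status = "format_error"
            return self._status
        if not isinstance(extra, dict):
            self._status = "format_error"
            return self._status

        # Prefer values from extraMetadata, otherwise keep the fallbacks.
        self._width = extra.get("width", self._width)
        self._height = extra.get("height", self._height)
        self._status = "read_success"
        return self._status
```

A second `parse()` call on a successfully parsed instance returns immediately, so nothing is decoded or logged twice.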

Create civitai.py

Merge 'mojibake' branch into 'master'.

This branch introduces the `CivitaiComfyUIFormat` parser, designed to handle
UserComment metadata from Civitai's ComfyUI-based JPEG image generations.

Key features implemented in `CivitaiComfyUIFormat` (format/civitai.py):
- Decodes "charset=Unicode" prefixed UserComment strings.
- Reverses UTF-16LE mojibake if present in the string provided by piexif.
  (Note: piexif often provides clean JSON, making this a robustness measure).
- Parses the main ComfyUI workflow JSON.
- Extracts A1111-style parameters (prompt, negative, steps, CFG, seed, etc.)
  from the nested "extraMetadata" JSON string.
- Populates BaseFormat attributes for consumption by ImageDataReader.
- Includes logging via the project's Logger.
- Prevents re-parsing if already successful.
- Handles width/height passed from ImageDataReader or found in extraMetadata.

Experimental modifications to `ImageDataReader.py` (on this branch) were
made to test the detection and invocation of `CivitaiComfyUIFormat`.
These ImageDataReader changes demonstrate proof-of-concept integration
and include:
- Detection heuristics for Civitai JPEGs (software tag, UserComment patterns).
- Dispatch logic to call `CivitaiComfyUIFormat`.
- Fallback to existing parsers if Civitai format is not detected or parsing fails.
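
The detect, dispatch, and fall back flow listed above could look roughly like this. Everything here is hypothetical: the heuristics in `looks_like_civitai` (including the guess that the Software tag mentions "civitai"), the `read_metadata` function, and the parser callables are illustrative and do not reflect the actual `ImageDataReader` changes.

```python
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, str]
# A parser takes the UserComment text and returns parameters, or None on failure.
Parser = Callable[[str], Optional[Params]]


def looks_like_civitai(user_comment: str, software: str = "") -> bool:
    # Assumed heuristics; the real detection patterns are not given in the source.
    return (
        user_comment.startswith("charset=Unicode")
        or '"extraMetadata"' in user_comment
        or "civitai" in software.lower()
    )


def read_metadata(
    user_comment: str,
    software: str,
    civitai_parser: Parser,
    fallback_parsers: List[Tuple[str, Parser]],
) -> Optional[Tuple[str, Params]]:
    """Return (tool name, parameters) from the first parser that succeeds."""
    if looks_like_civitai(user_comment, software):
        result = civitai_parser(user_comment)
        if result is not None:
            return "Civitai ComfyUI", result
    # Not detected as Civitai, or the Civitai parser failed: try the existing parsers.
    for name, parser in fallback_parsers:
        result = parser(user_comment)
        if result is not None:
            return name, result
    return None
```

Keeping the Civitai check ahead of the existing parsers while still falling through on failure means that a false positive in the heuristics costs one extra parse attempt rather than losing the metadata.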

This merge consolidates the development of the Civitai parser and its
initial integration testing within this fork. Next steps would involve
refining the ImageDataReader modifications for a potential upstream PR
and more extensive testing.