feat: auto-detect image aspect ratio from prompt by TriTue2011 · Pull Request #125 · basketikun/chatgpt2api

TriTue2011 · 2026-05-06T23:40:22Z

Allows the image aspect ratio (for example 16:9, 1:1, 4:3) to be detected automatically from the prompt text and used in place of the default configuration, which matters when Home Assistant or other clients always send a default size. In effect, keywords found in the user's original request override the `size` parameter.
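A minimal sketch of the idea described above, not the PR's actual code. The function name `detect_size`, the regex, and the ratio-to-size table are all assumptions made for illustration; the real keyword list and size values are not shown on this page.

```python
import re

# Hypothetical ratio -> size table; the values used by the PR are not shown here.
RATIO_TO_SIZE = {
    "1:1": "1024x1024",
    "16:9": "1792x1024",
    "9:16": "1024x1792",
    "4:3": "1024x768",
}

# Matches "16:9", "16 : 9", etc., but not longer digit runs such as "116:9".
_RATIO_RE = re.compile(r"(?<!\d)(\d{1,2})\s*:\s*(\d{1,2})(?!\d)")


def detect_size(prompt: str, default_size: str) -> str:
    """Return the size for the first known ratio in the prompt, else the default."""
    for match in _RATIO_RE.finditer(prompt):
        ratio = f"{int(match.group(1))}:{int(match.group(2))}"
        size = RATIO_TO_SIZE.get(ratio)
        if size is not None:
            return size
    # Unknown patterns (for example a clock time like "10:30") fall through.
    return default_size
```

Looking the ratio up in a fixed table, rather than computing a size from arbitrary numbers, keeps unrelated `N:M` text in the prompt from changing the output size.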

TriTue2011 · 2026-05-07T00:36:22Z

I have pushed two additional commits to this PR:

Fix: Changed `_build_tool_prompt` to check for `properties` instead of `required`. This fixes a critical bug where tools with only optional arguments (like Home Assistant's `HassTurnOn`) were forced to be called with `{}` because they lacked required parameters. Now the model can see and use optional arguments correctly.
Update: Translated the hardcoded image size hints from Chinese to Vietnamese to better support Vietnamese text-to-speech and users. (Note: Feel free to ask if you'd prefer this part to remain in Chinese or be changed to English for internationalization!)
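To illustrate the first fix, here is a hedged sketch of a tool-prompt builder that keys off `properties` rather than `required`. The function below is hypothetical (the real `_build_tool_prompt` is not shown on this page) and assumes OpenAI-style tool definitions with a JSON Schema `parameters` object.

```python
def build_tool_prompt(tools: list[dict]) -> str:
    """Describe each tool and every declared argument, required or optional."""
    lines = []
    for tool in tools:
        fn = tool.get("function", tool)
        params = fn.get("parameters") or {}
        properties = params.get("properties") or {}
        required = set(params.get("required") or [])
        lines.append(f"{fn['name']}: {fn.get('description', '')}".rstrip())
        # Decide on `properties`, not `required`: a tool whose arguments are
        # all optional still has arguments the model should know about.
        if not properties:
            lines.append("  arguments: {}")
            continue
        for name, schema in properties.items():
            flag = "required" if name in required else "optional"
            lines.append(f"  - {name} ({schema.get('type', 'any')}, {flag})")
    return "\n".join(lines)
```

With this shape, a tool that declares `name` and `area` but no `required` list is rendered with two optional arguments instead of an empty argument object.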

…"text",...}]) Critical bug: HA messages use list content format, but inject_search_results only handled string content. Search results were never actually injected. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
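The bug described above can be sketched as follows. This is an illustrative version only: the real `inject_search_results` signature and the wording of the injected text are not shown here, so the parameters and the `[Search results]` marker are assumptions.

```python
def inject_search_results(messages: list[dict], results: str) -> list[dict]:
    """Append search results to the last user message (str or list content)."""
    out = [dict(m) for m in messages]  # shallow copies; the input is not mutated
    note = f"\n\n[Search results]\n{results}"
    for msg in reversed(out):
        if msg.get("role") != "user":
            continue
        content = msg.get("content")
        if isinstance(content, str):
            msg["content"] = content + note
        elif isinstance(content, list):
            # List-format content: add one more text part instead of
            # silently skipping the message.
            msg["content"] = list(content) + [{"type": "text", "text": note.strip()}]
        break
    return out
```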

fix: auto model now respects user config + list models uses intersection across all tokens

- gemini_free/auto now reads model from config.data.providers.gemini_free.model
- backend_router resolves auto by checking user config before hardcoded defaults
- /v1/models fetches models from all active ChatGPT tokens in parallel
- with multi-account, only shows models common to all tokens (intersection)
- falls back to anon if no tokens available

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
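The model-listing behaviour in this commit could look roughly like the sketch below. The helper name `common_models` and its parameters are hypothetical; `fetch` stands in for whatever call retrieves the model list for one token.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def common_models(
    tokens: list[str],
    fetch: Callable[[str], Iterable[str]],
    anon_models: Iterable[str],
) -> list[str]:
    """Return the models available to every token, or the anon list if no tokens."""
    if not tokens:
        return sorted(anon_models)
    # Query all tokens in parallel; pool.map keeps results in token order.
    with ThreadPoolExecutor(max_workers=min(8, len(tokens))) as pool:
        per_token = [set(models) for models in pool.map(fetch, tokens)]
    # Only models present for every account are safe to advertise.
    return sorted(set.intersection(*per_token))
```

Using the intersection means any advertised model can be served by whichever token the router picks next.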

- Both Add New and Edit mode pickers now 420px wide
- Larger row height (py-2.5) and bigger font (13px)
- Better visual separation with borders between rows
- z-50 elevation for proper layering

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Fixed centered modal 720px, max-height 80vh
- Click outside does NOT close (no backdrop handler)
- Close button (X) + Done button to dismiss
- Selected models show green checkmark + highlight
- Button shows count of selected models
- Edit mode also uses same modal style

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Unused providers shown in gray (#9ca3af) with no connection line
- Only providers with actual usage get colored nodes + animated edges
- Provider status determined by usage log, not config
- Falls back to showing all (without edges) if no usage data yet

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Changed prefix matching from exact "geminiapi/" to startsWith("geminiapi"). Now geminiapi1, geminiapi2, geminiapi3, geminiapi4 all get:

- image (image generation)
- vision (image analysis)
- video (video analysis)

capability labels in model list and combo picker.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

_is_model_enabled now checks: if the model's provider has no explicit filtering configured, treat all its models as enabled. Fixes issue where enabling model filter for chatgpt would hide all custom provider models. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…hars

- Now detects relative paths like /media/img_xxx.png?token=yyy
- Prepends base_url for relative image URLs
- Fixed regex to exclude ) ] from URL capture
- Converts relative URLs to absolute before downloading

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
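The URL handling listed above can be sketched like this. The regex and the helper name `find_image_urls` are assumptions, not the project's actual pattern. The character class excludes `)` and `]` so that surrounding markdown punctuation is not captured, and `urllib.parse.urljoin` resolves relative paths against the base URL.

```python
import re
from urllib.parse import urljoin

# Absolute (http/https) or root-relative image URLs with an optional query string.
# [^\s)\]] stops the capture at whitespace, ")" or "]".
_IMG_URL_RE = re.compile(
    r"(?:https?://|/)[^\s)\]]+\.(?:png|jpe?g|webp|gif)(?:\?[^\s)\]]*)?",
    re.IGNORECASE,
)


def find_image_urls(text: str, base_url: str) -> list[str]:
    """Return absolute image URLs found in text, resolving relative ones."""
    # urljoin leaves absolute URLs unchanged and prepends base_url otherwise.
    return [urljoin(base_url, m.group(0)) for m in _IMG_URL_RE.finditer(text)]
```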

stream_image_outputs_with_pool now checks if model is a combo and resolves to the first route with an IMAGE_MODELS entry. Fixes "unsupported image model" error for image combos. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
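A rough sketch of the combo resolution described above. The layout of `combos` (combo name mapped to an ordered list of route model names) and the contents of `IMAGE_MODELS` are assumptions made for illustration; only the "first route with an `IMAGE_MODELS` entry" rule comes from the commit message.

```python
# Hypothetical registry contents; the real IMAGE_MODELS entries are not shown here.
IMAGE_MODELS = {"gpt-image-2"}


def resolve_image_model(model: str, combos: dict[str, list[str]]) -> str:
    """Resolve a combo name to its first image-capable route."""
    routes = combos.get(model)
    if routes is None:
        return model  # not a combo: use the requested model as-is
    for route in routes:
        if route in IMAGE_MODELS:
            return route
    raise ValueError(f"unsupported image model: {model}")
```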

Gemini API Server returns images as markdown links in chat text. Adapter now extracts URL from ![alt](url) syntax + downloads to base64. Verified: image access works from within Docker container (200 OK). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
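As an illustration of the extraction step, the sketch below pulls URLs out of markdown image syntax and base64-encodes downloaded bytes. Both helper names are hypothetical, and the download itself is left out so the example stays self-contained.

```python
import base64
import re

# Captures the URL part of ![alt](url), stopping at whitespace or ")".
_MD_IMAGE_RE = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)")


def extract_markdown_image_urls(text: str) -> list[str]:
    """Return the URLs of all markdown images in text, in order."""
    return _MD_IMAGE_RE.findall(text)


def to_b64(data: bytes) -> str:
    """Encode downloaded image bytes as a base64 string."""
    return base64.b64encode(data).decode("ascii")
```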

- build_body now detects image data from edit endpoint
- Edit mode: sends original image as base64 + editing instruction
- Generate mode: existing behavior unchanged
- Adapted to OpenAI vision format (image_url type)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
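In edit mode the message content might be assembled roughly as below. This is a sketch that assumes the OpenAI-style `image_url` content part with a base64 data URL. The helper names and the default MIME type are illustrative, not taken from `build_body`.

```python
import base64


def image_part(data: bytes, mime: str = "image/png") -> dict:
    """Wrap raw image bytes as an image_url content part with a data URL."""
    b64 = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}


def build_edit_content(instruction: str, images: list[bytes]) -> list[dict]:
    """Editing instruction first, then each original image as base64."""
    return [{"type": "text", "text": instruction}, *(image_part(img) for img in images)]
```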

- Added imageModel state (default gpt-image-2)
- Fetch image-capable models from /api/v1/models-with-capabilities
- Pass model + imageModels + onModelChange to ImageComposer
- Use selected model instead of hardcoded "gpt-image-2"

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Gemini returns content as array when prompt uses multi-part format. parse_response now converts array content to string before parsing. Also handles image_url type in content array. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
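A minimal sketch of normalising array content to a string before parsing. How the real `parse_response` treats `image_url` parts is not shown on this page; rendering them as markdown image links is an assumption made here.

```python
def content_to_text(content) -> str:
    """Flatten str or multi-part list content into a single string."""
    if isinstance(content, str):
        return content
    if not isinstance(content, list):
        return ""
    parts = []
    for part in content:
        if isinstance(part, str):
            parts.append(part)
        elif isinstance(part, dict):
            if part.get("type") == "text":
                parts.append(part.get("text", ""))
            elif part.get("type") == "image_url":
                # Assumption: keep the image reachable as a markdown link.
                url = (part.get("image_url") or {}).get("url", "")
                if url:
                    parts.append(f"![image]({url})")
    return "\n".join(parts)
```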

- Removed "Hạn mức còn lại" (remaining quota) display (frees space)
- Removed "Tải lên" (upload) button (frees space)
- Textarea: 16px font-semibold, min-h 120px, no absolute overlay
- Controls bar: flat layout below textarea (not floating on top)
- Removed sm:absolute positioning that caused text compression

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Button "Thư viện" (library) opens modal with all generated images
- Grid view, click to select and add as reference image
- Downloads image, converts to File, adds via handleReferenceImageChange
- Modal closes on backdrop click or after selection

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

TriTue2011 force-pushed the main branch from 3a4540c to 9aa7d7d on May 7, 2026 12:06

TriTue2011 and others added 28 commits May 12, 2026 11:08

fix: Gemini stream - separate role and content deltas

5d50af4

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: restore Format 2 regex for ToolName\n{JSON} anywhere in text

bee1e98

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat: Gemini multi-key support for chat + auto retry on 429

a09b5dd

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat: Gemini config card back in Settings page

02acd79

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: retry next codex token on 429 rate limit

e5ca056

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: add toolConfig.functionCallingConfig.mode=AUTO for Gemini

22f8c09

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: entity_ids→domain conversion to prevent HA tool call loop

5dbc39e

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: extract text from list-format message content for search query

a7f1cc0

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: remove duplicate headers line in Gemini search + search enabled

48e5bb9

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat: full Gemini model list from AI Studio + default gemini-3-flash

eb957a7

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: cooldown Gemini rate limit log (1 log per key per 60s)

4a9ac95

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: Gemini search — fallback to 2.5-flash + key in URL + error body log

4516081

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat: separate search_model config for Gemini search

dcc4485

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: search injection format — stronger prompt for model to use results

f5b3798

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: syntax error - remove duplicate closing bracket

a5530e8

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs: HA addon + comprehensive README

f439cdb

- HA addon config.yaml, Dockerfile, run.sh
- Full Vietnamese README in homeassistant-addon/
- Updated root README.md with quick start

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: restructure addon to HA-compatible root folder

2106801

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs: HA addon install instructions in README

d15c734

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: simplify addon config.yaml for HA Supervisor compatibility

49c4552

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: update addon config.yaml and repository.yaml per HA docs

2f7a75a

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: force LF line endings for YAML files (HA Supervisor requirement)

12a5947

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs: comprehensive README with addon + Docker install

75d2027

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

cleanup: remove addon files from chatgpt2api repo (now in has-addons)

115307c

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs: comprehensive README with all install methods

7f78d57

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: skip search for AI task / image analysis prompts

6f2289d

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix: pass images to Codex API - handle image_url + input_image

7a96875

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

TriTue2011 and others added 30 commits May 17, 2026 00:09

Use tiktoken (encoding_for_model) for accurate prompt token counting

53e695d

Already present in conversation.py. Fallback to len/4 if import fails. Completion tokens use stream content length / 4 estimate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
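The counting strategy described in this commit can be sketched as follows. The function names are hypothetical. `tiktoken.encoding_for_model` and `tiktoken.get_encoding` are real tiktoken calls, but catching every exception (not only a failed import) and defaulting to `cl100k_base` for unknown models are assumptions of this sketch.

```python
def fallback_count(text: str) -> int:
    """Rough estimate: about four characters per token."""
    return len(text) // 4


def count_prompt_tokens(text: str, model: str) -> int:
    """Count tokens with tiktoken when available, else use the len/4 estimate."""
    try:
        import tiktoken

        try:
            enc = tiktoken.encoding_for_model(model)
        except KeyError:
            # Model name unknown to tiktoken: assume a general-purpose encoding.
            enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        # tiktoken missing or unusable: fall back to the estimate.
        return fallback_count(text)
```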

Model picker: full screen popup 640px + search filter

495574c

- Popup width: 640px, max-height: 70vh
- Search input to filter models by name
- Backdrop overlay (click to close)
- ModelSearch state for text filtering

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Model modal: close on backdrop click

63aa570

Add Gemini Custom 1-4 presets to quick-select list

9f6861a

Pre-configured for ports 8000-8003 with geminiapi1-4 prefixes. Appears at top of provider presets for easy access. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Custom Providers: Pencil edit icon instead of Save icon

f670e19

Revert: custom provider models follow same filter rules

5da8093

Image composer: add model selector dropdown (partial)

72142ee

ImageComposer now accepts model + imageModels props. Parent page.tsx needs wiring to fetch image models and pass down. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Image composer: visible border + white bg for textarea

78789ca

Fix textarea visibility: thick border + light gray bg

21a6f9c

Fix textarea: indigo border + light indigo bg for clear visibility

cbf5dfe

Fix "Incorrect padding": strip data: prefix before b64decode

e988409

format_image_result now strips data:image/...;base64, prefix before calling base64.b64decode to save image bytes. Fixes Gemini custom provider image generation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
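A small sketch of the fix, using a hypothetical helper name. The `data:image/...;base64,` prefix is not part of the payload, and leaving it in can give `base64.b64decode` an input of invalid length, which is where "Incorrect padding" comes from.

```python
import base64


def decode_image_b64(value: str) -> bytes:
    """Decode a base64 image string, with or without a data: URL prefix."""
    if value.startswith("data:"):
        # Keep only what follows the first comma: the actual base64 payload.
        _, _, value = value.partition(",")
    return base64.b64decode(value)
```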

Add back upload button to image composer

6a5b498

Image composer: add 'Thư viện' button + onPickLibraryImage prop

b75295e

Button opens library image picker for selecting reference images from previously generated images. Parent page.tsx wiring pending. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Wire up library button: opens /image-manager in new tab

4486cfb

Fix: add X icon import to image page.tsx

923c7a1

Fix image edit: handle raw bytes upload (images tuple from edit endpoint)

409dfd2

build_body now accepts 'images' list from _handle_adapter_edit, encodes raw bytes to base64 for Gemini chat API. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Fix library picker: use data.items instead of data.images

dd7a42c

TriTue2011 wants to merge 384 commits into basketikun:main from TriTue2011:main
