Merge pull request #71 from kotaro-kinoshita/feature/export-figure-with-json-and-csv

feature export figure with csv and json
kotaro-kinoshita authored Dec 31, 2024
2 parents 0d37294 + 76d7c30 commit 7979ead
Showing 9 changed files with 78 additions and 5 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -70,7 +70,7 @@ yomitoku ${path_data} -f md -o results -v --figure --lite
 - `-d`, `--device` Specifies the device for running the model. If a GPU is unavailable, inference runs on the CPU. (Default: cuda)
 - `--ignore_line_break` Ignores line-break positions in the image and returns the text within a paragraph concatenated. (Default: lines break at the same positions as in the image.)
 - `--figure_letter` Also exports the characters contained in detected figures and tables to the output file.
-- `--figure` Exports detected figures and images to the output file. (html and markdown only)
+- `--figure` Exports detected figures and images to the output file.
 - `--encoding` Specifies the character encoding of the exported output file. If unsupported characters are included, they are ignored. (utf-8, utf-8-sig, shift-jis, enc-jp, cp932)

 For other options, see the help.
2 changes: 1 addition & 1 deletion README_EN.md
@@ -71,7 +71,7 @@ yomitoku ${path_data} -f md -o results -v --figure
 - `-d`, `--device`: Specify the device for running the model. If a GPU is unavailable, inference will be executed on the CPU. (Default: cuda)
 - `--ignore_line_break`: Ignores line breaks in the image and concatenates sentences within a paragraph. (Default: respects line breaks as they appear in the image.)
 - `--figure_letter`: Exports characters contained within detected figures and tables to the output file.
-- `--figure`: Exports detected figures and images to the output file (supported only for html and markdown).
+- `--figure`: Exports detected figures and images to the output file
 - `--encoding` Specifies the character encoding for the output file to be exported. If unsupported characters are included, they will be ignored. (utf-8, utf-8-sig, shift-jis, enc-jp, cp932)


2 changes: 1 addition & 1 deletion docs/usage.en.md
@@ -16,7 +16,7 @@ yomitoku ${path_data} -f md -o results -v
 - `-d`, `--device`: Specify the device for running the model. If a GPU is unavailable, inference will be executed on the CPU. (Default: cuda)
 - `--ignore_line_break`: Ignores line breaks in the image and concatenates sentences within a paragraph. (Default: respects line breaks as they appear in the image.)
 - `--figure_letter`: Exports characters contained within detected figures and tables to the output file.
-- `--figure`: Exports detected figures and images to the output file (supported only for html and markdown).
+- `--figure`: Exports detected figures and images to the output file
 - `--encoding` Specifies the character encoding for the output file to be exported. If unsupported characters are included, they will be ignored. (utf-8, utf-8-sig, shift-jis, enc-jp, cp932)


2 changes: 1 addition & 1 deletion docs/usage.ja.md
@@ -16,7 +16,7 @@ yomitoku ${path_data} -f md -o results -v
 - `-d`, `--device` Specifies the device for running the model. If a GPU is unavailable, inference runs on the CPU. (Default: cuda)
 - `--ignore_line_break` Ignores line-break positions in the image and returns the text within a paragraph concatenated. (Default: lines break at the same positions as in the image.)
 - `--figure_letter` Also exports the characters contained in detected figures and tables to the output file.
-- `--figure` Exports detected figures and images to the output file. (html and markdown only)
+- `--figure` Exports detected figures and images to the output file.
 - `--encoding` Specifies the character encoding of the exported output file. If unsupported characters are included, they are ignored. (utf-8, utf-8-sig, shift-jis, enc-jp, cp932)

 For other options, see the help.
6 changes: 6 additions & 0 deletions src/yomitoku/cli/main.py
@@ -59,12 +59,18 @@ def process_single_file(args, analyzer, path, format):
                 out_path,
                 ignore_line_break=args.ignore_line_break,
                 encoding=args.encoding,
+                img=img,
+                export_figure=args.figure,
+                figure_dir=args.figure_dir,
             )
         elif format == "csv":
             results.to_csv(
                 out_path,
                 ignore_line_break=args.ignore_line_break,
                 encoding=args.encoding,
+                img=img,
+                export_figure=args.figure,
+                figure_dir=args.figure_dir,
             )
         elif format == "html":
             results.to_html(
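
The hunk above forwards three values to the JSON and CSV exporters: the page image, the `--figure` flag, and the figure directory. As a rough sketch of that wiring, assuming a parser where `--figure` is a boolean switch and `--figure_dir` defaults to `"figures"` (the parser setup, the `-f` default, and the helper names `build_parser` and `export_kwargs` are illustrative assumptions, not the project's actual CLI code):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real CLI parser; defaults are assumptions.
    parser = argparse.ArgumentParser(prog="yomitoku-sketch")
    parser.add_argument("-f", "--format", default="json")
    parser.add_argument("--figure", action="store_true")
    parser.add_argument("--figure_dir", default="figures")
    return parser


def export_kwargs(args: argparse.Namespace, img) -> dict:
    # Collect the keyword arguments that the diff passes to to_json / to_csv.
    return {
        "img": img,
        "export_figure": args.figure,
        "figure_dir": args.figure_dir,
    }


if __name__ == "__main__":
    args = build_parser().parse_args(["-f", "csv", "--figure"])
    print(export_kwargs(args, img=None))
```

Building the keyword arguments in one place would keep the four format branches from drifting apart, though the commit itself repeats them inline.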
32 changes: 32 additions & 0 deletions src/yomitoku/export/export_csv.py
@@ -1,4 +1,6 @@
import csv
+import cv2
+import os


def table_to_csv(table, ignore_line_break):
@@ -33,11 +35,33 @@ def paragraph_to_csv(paragraph, ignore_line_break):
     return contents


+def save_figure(
+    figures,
+    img,
+    out_path,
+    figure_dir="figures",
+):
+    for i, figure in enumerate(figures):
+        x1, y1, x2, y2 = map(int, figure.box)
+        figure_img = img[y1:y2, x1:x2, :]
+        save_dir = os.path.dirname(out_path)
+        save_dir = os.path.join(save_dir, figure_dir)
+        os.makedirs(save_dir, exist_ok=True)
+
+        filename = os.path.splitext(os.path.basename(out_path))[0]
+        figure_name = f"{filename}_figure_{i}.png"
+        figure_path = os.path.join(save_dir, figure_name)
+        cv2.imwrite(figure_path, figure_img)
+
+
 def export_csv(
     inputs,
     out_path: str,
     ignore_line_break: bool = False,
     encoding: str = "utf-8",
+    img=None,
+    export_figure: bool = True,
+    figure_dir="figures",
 ):
     elements = []
     for table in inputs.tables:
@@ -63,6 +87,14 @@ def export_csv(
             }
         )

+    if export_figure:
+        save_figure(
+            inputs.figures,
+            img,
+            out_path,
+            figure_dir=figure_dir,
+        )
+
     elements = sorted(elements, key=lambda x: x["order"])

     with open(out_path, "w", newline="", encoding=encoding, errors="ignore") as f:
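
To make the naming scheme in `save_figure` easier to check, here is a small sketch that reproduces only its path arithmetic, without OpenCV or any file I/O. The function name `figure_output_path` is hypothetical and does not appear in the commit; the logic mirrors the `os.path` calls shown in the hunk above.

```python
import os


def figure_output_path(out_path: str, index: int, figure_dir: str = "figures") -> str:
    """Return the PNG path that figure number `index` would be written to."""
    # Figures go into a subdirectory next to the exported file.
    save_dir = os.path.join(os.path.dirname(out_path), figure_dir)
    # The exported file's name without its extension prefixes every figure name.
    stem = os.path.splitext(os.path.basename(out_path))[0]
    return os.path.join(save_dir, f"{stem}_figure_{index}.png")


if __name__ == "__main__":
    print(figure_output_path(os.path.join("results", "page_0.csv"), 0))
```

Because the figure name is derived from the output file's stem, a CSV and a JSON export written to the same directory with the same stem would target the same PNG paths.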
33 changes: 33 additions & 0 deletions src/yomitoku/export/export_json.py
@@ -1,5 +1,8 @@
import json

+import cv2
+import os


 def paragraph_to_json(paragraph, ignore_line_break):
     if ignore_line_break:
@@ -12,11 +15,33 @@ def table_to_json(table, ignore_line_break):
             cell.contents = cell.contents.replace("\n", "")


+def save_figure(
+    figures,
+    img,
+    out_path,
+    figure_dir="figures",
+):
+    for i, figure in enumerate(figures):
+        x1, y1, x2, y2 = map(int, figure.box)
+        figure_img = img[y1:y2, x1:x2, :]
+        save_dir = os.path.dirname(out_path)
+        save_dir = os.path.join(save_dir, figure_dir)
+        os.makedirs(save_dir, exist_ok=True)
+
+        filename = os.path.splitext(os.path.basename(out_path))[0]
+        figure_name = f"{filename}_figure_{i}.png"
+        figure_path = os.path.join(save_dir, figure_name)
+        cv2.imwrite(figure_path, figure_img)
+
+
 def export_json(
     inputs,
     out_path,
     ignore_line_break=False,
     encoding: str = "utf-8",
+    img=None,
+    export_figure=False,
+    figure_dir="figures",
 ):
     from yomitoku.document_analyzer import DocumentAnalyzerSchema

@@ -28,6 +53,14 @@ def export_json(
     for paragraph in inputs.paragraphs:
         paragraph_to_json(paragraph, ignore_line_break)

+    if export_figure:
+        save_figure(
+            inputs.figures,
+            img,
+            out_path,
+            figure_dir=figure_dir,
+        )
+
     with open(out_path, "w", encoding=encoding, errors="ignore") as f:
         json.dump(
             inputs.model_dump(),
2 changes: 2 additions & 0 deletions tests/test_cli.py
@@ -141,6 +141,7 @@ def test_run_tiff_csv(monkeypatch, tmp_path):
             "--tsr_cfg",
             "tests/yaml/table_structure_recognizer.yaml",
             "--lite",
+            "--figure",
         ],
     )
     main.main()
@@ -193,6 +194,7 @@ def test_run_dir_json(monkeypatch, tmp_path):
             str(tmp_path),
             "-f",
             "json",
+            "--figure",
         ],
     )
     main.main()
2 changes: 1 addition & 1 deletion tests/test_export.py
@@ -583,7 +583,7 @@ def test_export(tmp_path):
     with open(out_path, "r") as f:
         assert json.load(f) == document_analyzer.model_dump()

-    document_analyzer.to_csv(tmp_path / "document_analyzer.csv")
+    document_analyzer.to_csv(tmp_path / "document_analyzer.csv", img=img)
     document_analyzer.to_html(tmp_path / "document_analyzer.html", img=img)
     document_analyzer.to_markdown(tmp_path / "document_analyzer.md", img=img)
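
The test change above passes `img=img` to `to_csv` because `export_csv` in this commit defaults to `export_figure=True` with `img=None`, and `save_figure` indexes `img` directly, so a call without an image would fail as soon as one figure is detected. The following is a hedged sketch of two guards that could sit in front of the crop. Both helper names (`should_export_figures`, `clamp_box`) are hypothetical and not part of the commit; clamping is suggested only because negative indices in a NumPy slice count from the end of the axis rather than raising.

```python
def should_export_figures(img, export_figure: bool) -> bool:
    """Return True when figures should be saved; fail early if the image is missing."""
    if not export_figure:
        return False
    if img is None:
        # Clearer than the TypeError raised by slicing None later on.
        raise ValueError("export_figure=True requires the page image via img=...")
    return True


def clamp_box(box, width: int, height: int):
    """Clamp an (x1, y1, x2, y2) box to the image bounds; return None if it is empty."""
    x1, y1, x2, y2 = map(int, box)
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


if __name__ == "__main__":
    print(should_export_figures(None, False))
    print(clamp_box((-5.2, 3.9, 50, 40), width=30, height=20))
```

A caller would skip any figure for which `clamp_box` returns `None` instead of handing an empty crop to `cv2.imwrite`.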
