</div>

### Recent Updates

- **2024.11.12**
  - Extracted the core thresholds for model recognition and processing, making them easier to fine-tune for specific scenarios. See [Core Parameters](#core-parameters).
- **2024.11.16**
  - Added a document distortion correction solution, [RapidUnWrap](https://github.com/Joker1212/RapidUnWrap), which can be used as a pre-processing step.
- **2024.11.22**
  - Added single-character recognition (Char Rec) support; requires RapidOCR >= 1.4.0.

### Introduction

💖 This repository serves as an inference library for structured recognition of tables within documents. It includes wired and wireless table recognition models from Alibaba DulaLight, a wired table model from llaipython (WeChat), and a built-in table classification model from NetEase Qanything.

[Quick Start](#installation) [Model Evaluation](#evaluation-results) [Char Rec](#single-character-ocr-matching) [Usage Recommendations](#usage-recommendations) [Document Distortion Correction](https://github.com/Joker1212/RapidUnWrap) [Table Rotation & Perspective Correction](#table-rotation-and-perspective-correction) [Input Parameters](#core-parameters) [Frequently Asked Questions](#faq) [Update Plan](#update-plan)

#### Features

⚡ **Fast:** Uses ONNXRuntime as the inference engine, achieving 1-7 seconds per image on CPU.
### Usage Recommendations

wired_table_rec_v2 (highest precision for wired tables): general wired-table scenes (papers, magazines, journals, receipts, invoices, bills)

paddlex-SLANet-plus (highest overall precision): document-scene tables (tables in papers, magazines, and journals)

### Installation

print(f"elasp: {elasp}")
# Visualize OCR recognition boxes
# plot_rec_box(img_path, f"{output_dir}/ocr_box.jpg", ocr_res)
```

#### Single Character OCR Matching

```python
# Convert single-character boxes into the same structure as line-level recognition results
from rapidocr_onnxruntime import RapidOCR
from wired_table_rec.utils_table_recover import trans_char_ocr_res

img_path = "tests/test_files/wired/table4.jpg"
ocr_engine = RapidOCR()
ocr_res, _ = ocr_engine(img_path, return_word_box=True)
ocr_res = trans_char_ocr_res(ocr_res)
```

#### Table Rotation and Perspective Correction

##### 1. Simple Background, Small Angle Scene
```python
wired_table_rec = WiredTableRecognition()
html, elasp, polygons, logic_points, ocr_res = wired_table_rec(
    img,  # image: Union[str, np.ndarray, bytes, Path, PIL.Image.Image]
    ocr_result,  # optional RapidOCR result; the built-in rapidocr model is used when not provided
    version="v2",  # v2 line model by default; switch to the AliDamo model with "v1"
    enhance_box_line=True,  # enhance box-line detection (off: avoids over-cutting; on: reduces missed cuts), default is True
    need_ocr=True,  # whether to perform OCR recognition, default is True
    rec_again=True,  # whether to crop and re-recognize table cells with no detected text, default is True
)
lineless_table_rec = LinelessTableRecognition()
html, elasp, polygons, logic_points, ocr_res = lineless_table_rec(
    img,  # image: Union[str, np.ndarray, bytes, Path, PIL.Image.Image]
    ocr_result,  # optional RapidOCR result; the built-in rapidocr model is used when not provided
    need_ocr=True,  # whether to perform OCR recognition, default is True
    rec_again=True,  # whether to crop and re-recognize table cells with no detected text, default is True
)
```