"Corrupted image" error during training: bytes objects written to LMDB are not the same when read back. Please help, this is urgent. #75

Open
luyun760324 opened this issue Oct 23, 2019 · 7 comments

Comments

@luyun760324

Corrupted image for 1
Traceback (most recent call last):
  File "/home/wdj/mycode3/Lets_OCR/recognizer/crnn/lib/dataset.py", line 132, in __getitem__
    img = Image.open(buf).convert('L')
  File "/root/anaconda3/envs/Pytorch_CRNN3/lib/python3.6/site-packages/PIL/Image.py", line 2821, in open
    raise IOError("cannot identify image file %r" % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f6110132308>

@ShangLe0607

I ran into this error before as well. Are your labels digits or Chinese?

@luyun760324
Author

Did you get your problem solved? My labels are digits.

@huitang

huitang commented Nov 26, 2019

I am hitting this problem too. My labels are Chinese.

@Ryansanity

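Did you get your problem solved? My labels are digits.

Hello, my labels are the digits that correspond to Chinese characters, but I still get the error above. Does anyone know what causes it?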

@paohaijiao

Has this been solved?

@oweiii

oweiii commented May 15, 2020

Has this been solved? I have tried both Chinese and English labels.

@htyquq

htyquq commented Oct 26, 2023

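My guess is that you read the image like this:
with open(imagePath, 'rb') as f:  # the image must be opened in 'rb' mode
    imageBin = f.read()
and then used the following code when creating the dataset:
with env.begin(write=True) as txn:
    for k, v in cache.items():
        txn.put(str(k).encode(), str(v).encode())
This mistakenly encodes the bytes data a second time. A bytes object cannot be encoded directly (it has no encode method), so str(v) first turns it into the text of its repr, such as "b'\x89PNG...'", and that text is what gets encoded and stored. When printed, the decoded content looks the same as the original, but one is a str and the other is bytes.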
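On the reading side:
buf = six.BytesIO()  # create an in-memory buffer
buf.write(imgbuf)  # write the image's binary data into it
buf.seek(0)  # offset 0 from the start; seek's optional second argument (0 = start, 1 = current position, 2 = end) defaults to 0
# after writing to an empty buffer, seek(0) moves the pointer back to the start so the data can be read
try:
    img = Image.open(buf).convert('L')
Here imgbuf must be the raw image bytes (type bytes) for the image to open.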
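The diagnosis above can be checked without LMDB or PIL. The following is a minimal sketch, assuming Python 3; `image_bin` is a hypothetical stand-in for the result of `f.read()` and holds only the 8-byte PNG file signature, not a complete image.

```python
# Hypothetical stand-in for the bytes returned by f.read():
# these 8 bytes are the PNG file signature, not a full image.
image_bin = b"\x89PNG\r\n\x1a\n"

# What the dataset-creation loop does to every value, image bytes included.
stored = str(image_bin).encode()

# The stored value is the UTF-8 text of the repr, not the original bytes.
print(stored == image_bin)   # False
print(stored[:2])            # b"b'"
print(len(image_bin), len(stored))
```

Because the stored value begins with the characters `b'` rather than the file's own signature bytes, an image decoder has nothing it can recognize, which is consistent with the "cannot identify image file" error in the traceback.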
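Here is my modified code; the fix is simply not to encode the image data:
for k, v in cache.items():
    if 'image' in k:
        txn.put(str(k).encode(), v)  # image values are already bytes; store them as-is
    else:
        txn.put(str(k).encode(), str(v).encode())  # stored as encoded text; decode it (or otherwise parse it) when reading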
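A variation on the fix above is to decide by the value's type instead of by the key name, so bytes are never passed through `str()` whatever the key is called. This is only a sketch: `to_lmdb_value` is a hypothetical helper, the keys in `cache` are made-up examples, and a plain dict stands in for the LMDB transaction so the snippet runs without the `lmdb` package.

```python
def to_lmdb_value(value):
    """Return bytes for storage: bytes pass through unchanged, anything else is stored as UTF-8 text."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    return str(value).encode()


# Hypothetical cache; the key names are illustrative assumptions.
cache = {
    "image-000000001": b"\x89PNG\r\n\x1a\n",  # stand-in for real image bytes
    "label-000000001": "1234",
    "num-samples": 1,
}

# A dict stands in for the LMDB transaction here. With a real transaction,
# the equivalent call would be txn.put(str(k).encode(), to_lmdb_value(v)).
store = {str(k).encode(): to_lmdb_value(v) for k, v in cache.items()}

# Image bytes come back unchanged; text values need decode() when read.
print(store[b"image-000000001"] == cache["image-000000001"])
print(store[b"label-000000001"].decode())
print(int(store[b"num-samples"].decode()))
```

Checking the type rather than the key avoids silently corrupting image data stored under a key that does not contain 'image'.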
