"Corrupted image" error during training: the bytes object written to lmdb differs from what lmdb reads back. Could an expert please help? Urgent. #75

luyun760324 · 2019-10-23T13:49:50Z

Corrupted image for 1
Traceback (most recent call last):
  File "/home/wdj/mycode3/Lets_OCR/recognizer/crnn/lib/dataset.py", line 132, in __getitem__
    img = Image.open(buf).convert('L')
  File "/root/anaconda3/envs/Pytorch_CRNN3/lib/python3.6/site-packages/PIL/Image.py", line 2821, in open
    raise IOError("cannot identify image file %r" % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f6110132308>

ShangLe0607 · 2019-10-25T07:01:44Z

I ran into this error before as well. Are your labels digits or Chinese characters?

luyun760324 · 2019-10-28T03:57:57Z

Have you solved your problem? My labels are digits.

huitang · 2019-11-26T00:03:39Z

I'm hitting this problem too. My labels are Chinese characters.

Ryansanity · 2020-01-15T03:55:08Z

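> Have you solved your problem? My labels are digits.

Hello, my labels are the digits that the Chinese characters map to, but the error above still occurs. Does anyone know the cause?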

paohaijiao · 2020-01-20T22:12:23Z

Has this been solved?

oweiii · 2020-05-15T09:07:18Z

Has this been solved? I have tried both Chinese and English labels.

htyquq · 2023-10-26T13:47:36Z

My guess is that you read the image like this:
    with open(imagePath, 'rb') as f:  # the image must be opened in 'rb' mode
        imageBin = f.read()
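and that you used the following code when creating the dataset:
    with env.begin(write=True) as txn:
        for k, v in cache.items():
            txn.put(str(k).encode(), str(v).encode())
This wrongly encodes the image bytes a second time. A bytes object has no encode method, so the code first calls str(v), which returns the repr text "b'...'" rather than the original data, and that text is what gets stored. After decoding, the content looks the same when printed, but one value is a str and the other is bytes.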
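To make the difference concrete, here is a minimal sketch using only the standard library. The 8-byte PNG signature stands in for real image data, and the variable names are illustrative rather than taken from the project.

```python
# Stand-in for the bytes read from an image file: the 8-byte PNG signature.
image_bin = b"\x89PNG\r\n\x1a\n"

# str() on a bytes object returns its repr text, not the data itself.
as_text = str(image_bin)

# Encoding that text stores the characters of the repr, including the b'...' wrapper.
stored = as_text.encode()

print(stored == image_bin)  # False
print(stored[:2])           # b"b'"
```

The stored value begins with the two characters b and ', so it no longer starts with an image header and PIL cannot identify the format.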
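The reading code is:
    buf = six.BytesIO()  # create an in-memory buffer
    buf.write(imgbuf)    # write the image's binary data into it
    buf.seek(0)          # move the pointer back to the start (whence: 0 = start, 1 = current position, 2 = end)
    # after writing to an empty buffer, call seek(0) so that the next read starts from the beginning
    try:
        img = Image.open(buf).convert('L')
    except IOError:
        print('Corrupted image for %d' % index)  # the message shown in the log above
Here imgbuf must be of type bytes holding the raw image data, otherwise the image cannot be opened.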
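One way to confirm this diagnosis on an existing database is to look at the first bytes of a value read back from it. The helper below is a hypothetical sketch, not part of the project; it checks only the PNG and JPEG signatures and the b' prefix that str(v).encode() leaves behind.

```python
def describe_payload(imgbuf):
    """Return a short diagnosis of a value read back from the database."""
    if not isinstance(imgbuf, (bytes, bytearray)):
        return "not bytes: %s" % type(imgbuf).__name__
    if imgbuf.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if imgbuf.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # A repr of bytes starts with b' or b" depending on the quotes it contains.
    if imgbuf.startswith((b"b'", b'b"')):
        return "repr of bytes (str(v).encode() was stored)"
    return "unknown"


print(describe_payload(b"\xff\xd8\xff\xe0 rest of file"))
print(describe_payload(str(b"\xff\xd8\xff\xe0").encode()))
```

If the second diagnosis appears for your image keys, the dataset was written with the double encoding described above.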
My modified code is below; simply do not encode the image data:
    for k, v in cache.items():
        if 'image' in k:
            txn.put(str(k).encode(), v)  # image bytes are stored unchanged
        else:
            txn.put(str(k).encode(), str(v).encode())  # stored as encoded text; decode it when reading
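A variant of the same fix is to branch on the value's type instead of the key name, and a dataset that was already written with the double encoding can usually be repaired rather than rebuilt. The sketch below assumes the bad values are exactly str(v).encode() of the original bytes; to_db_value, recover_bytes, and the cache keys are hypothetical names used for illustration.

```python
import ast


def to_db_value(value):
    """Return bytes suitable for txn.put: bytes pass through, other values are text-encoded."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    return str(value).encode()


def recover_bytes(stored):
    """Undo an accidental str(v).encode() applied to a bytes value."""
    value = ast.literal_eval(stored.decode())
    if not isinstance(value, bytes):
        raise ValueError("stored value is not the repr of a bytes object")
    return value


# Hypothetical cache contents; the PNG signature stands in for real image data.
cache = {"image-1": b"\x89PNG\r\n\x1a\n", "label-1": "123", "num-samples": 1}
encoded = {str(k).encode(): to_db_value(v) for k, v in cache.items()}
print(encoded[b"image-1"] == cache["image-1"])  # True

damaged = str(cache["image-1"]).encode()
print(recover_bytes(damaged) == cache["image-1"])  # True
```

In the writing loop, txn.put(str(k).encode(), to_db_value(v)) would replace the if/else on the key name. ast.literal_eval only parses Python literals, so it is safer than eval for reading the stored repr back.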
