
Commit c4e3ab7

WarcParser: Improve compatibility with ARC variants
This makes it so we can read the warcio example.arc file and the example in the ARC file format reference.

* Ignore up to 3 spurious linefeeds at the start of ARC records.
* Accept ARC records with the trailing linefeed missing.
* Accept (but currently ignore) the extra URL-record-v2 fields.
* Accept "0" in the ARC IP address field.

Fixes #82
1 parent efcb28f commit c4e3ab7
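
To illustrate the first item in the commit message, here is a minimal sketch of tolerating a bounded number of stray linefeeds before an ARC record. It is not the parser change from this commit (that code is not shown here); the class name `ArcLeadingLinefeeds`, the method `skipLeadingLinefeeds`, and the constant `MAX_SPURIOUS_LINEFEEDS` are hypothetical names chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ArcLeadingLinefeeds {
    // Assumption taken from the commit message: at most 3 stray linefeeds are tolerated.
    static final int MAX_SPURIOUS_LINEFEEDS = 3;

    // Advances the buffer past up to MAX_SPURIOUS_LINEFEEDS leading '\n' bytes
    // and returns how many were skipped. Any further linefeeds are left in place
    // so that the caller can reject the record as malformed.
    static int skipLeadingLinefeeds(ByteBuffer buffer) {
        int skipped = 0;
        while (skipped < MAX_SPURIOUS_LINEFEEDS
                && buffer.hasRemaining()
                && buffer.get(buffer.position()) == '\n') {
            buffer.position(buffer.position() + 1);
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) {
        byte[] data = "\n\nhttp://example.org/ 0 20240101000000 text/html 0\n"
                .getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buffer = ByteBuffer.wrap(data);
        System.out.println("skipped " + skipLeadingLinefeeds(buffer) + " linefeeds");
    }
}
```

Capping the count keeps the parser strict about badly damaged input while still accepting the off-by-a-few separators that some ARC writers produce.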

5 files changed: 411 additions, 172 deletions

src/org/netpreserve/jwarc/LengthedBody.java

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ public synchronized void consume() throws IOException {
             position += buffer.remaining();
             buffer.clear();
             if (channel.read(buffer) < 0) {
-                throw new EOFException();
+                throw new EOFException("Expected to read " + (size - position) + " more bytes");
             }
             buffer.flip();
         }
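
The effect of the new exception message can be seen in a small standalone sketch. This is not the `LengthedBody` class itself: `ConsumeSketch` and its `consume(channel, size)` method are hypothetical stand-ins that only mirror the idea in the diff, namely reporting how many bytes of a declared length were still missing when the channel hit end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

class ConsumeSketch {
    // Hypothetical stand-in for a length-delimited body: drains exactly `size`
    // bytes from the channel and returns the number of bytes consumed.
    static long consume(ReadableByteChannel channel, long size) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long position = 0;
        while (position < size) {
            buffer.clear();
            // Never read past the declared end of the body.
            buffer.limit((int) Math.min(buffer.capacity(), size - position));
            int n = channel.read(buffer);
            if (n < 0) {
                // Same message shape as in the diff: say how much was still owed.
                throw new EOFException("Expected to read " + (size - position) + " more bytes");
            }
            position += n;
        }
        return position;
    }

    public static void main(String[] args) throws IOException {
        // The body claims 10 bytes, but the underlying stream holds only 4.
        ReadableByteChannel truncated =
                Channels.newChannel(new ByteArrayInputStream(new byte[4]));
        try {
            consume(truncated, 10);
        } catch (EOFException e) {
            System.out.println(e.getMessage()); // prints "Expected to read 6 more bytes"
        }
    }
}
```

With a bare `EOFException`, a truncated record and an empty file look the same in a stack trace; including the remaining byte count makes it clear that the declared length and the actual data disagree, and by how much.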
