Commit c4e3ab7
WarcParser: Improve compatibility with ARC variants
This lets the parser read the warcio example.arc file and the example in the ARC file format reference.
* Ignore up to 3 spurious linefeeds at the start of ARC records.
* Accept ARC records with the trailing linefeed missing.
* Accept (but currently ignore) the extra URL-record-v2 fields.
* Accept "0" in the ARC IP address field.
Fixes #821
Parent: efcb28f
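The leniency rules listed in the commit message can be illustrated with a minimal sketch. This is not the jwarc implementation: the class `ArcHeaderSketch`, its `Header` type and the `parseHeaderLine` method are hypothetical names invented for illustration. It assumes space-separated header fields, a 5-field v1 layout and a 10-field v2 layout with the archive length as the last field, and IPv4-only addresses. Record bodies are not modelled, so the missing-trailing-linefeed case is covered only in the sense that zero leading linefeeds are accepted.

```java
// Hypothetical sketch of lenient ARC header parsing; not the jwarc code.
final class ArcHeaderSketch {
    /** Assumed limit, taken from the "up to 3 spurious linefeeds" rule. */
    public static final int MAX_LEADING_LINEFEEDS = 3;

    /** Hypothetical holder for the fields this sketch keeps. */
    public static final class Header {
        public final String url;
        public final String ip;
        public final String date;
        public final String contentType;
        public final long length;

        Header(String url, String ip, String date, String contentType, long length) {
            this.url = url;
            this.ip = ip;
            this.date = date;
            this.contentType = contentType;
            this.length = length;
        }
    }

    public static Header parseHeaderLine(String input) {
        // Skip at most MAX_LEADING_LINEFEEDS stray linefeeds before the header.
        int start = 0;
        while (start < input.length() && start < MAX_LEADING_LINEFEEDS
                && input.charAt(start) == '\n') {
            start++;
        }
        // The header ends at the next linefeed, or at end of input.
        int end = input.indexOf('\n', start);
        String line = end < 0 ? input.substring(start) : input.substring(start, end);

        String[] fields = line.split(" ");
        // Assumption: v1 headers have 5 fields, v2 headers have 10.
        if (fields.length != 5 && fields.length != 10) {
            throw new IllegalArgumentException("unexpected ARC field count: " + fields.length);
        }
        String ip = fields[1];
        // Accept a dotted quad, or the placeholder "0" some writers emit.
        if (!ip.equals("0") && !ip.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            throw new IllegalArgumentException("invalid IP address field: " + ip);
        }
        // The extra v2 fields (indexes 4 to 8) are accepted but ignored;
        // the length is the last field in both layouts.
        long length = Long.parseLong(fields[fields.length - 1]);
        return new Header(fields[0], ip, fields[2], fields[3], length);
    }

    public static void main(String[] args) {
        Header h = parseHeaderLine("\n\nhttp://example.org/ 0 20240101000000 text/html 42\n");
        System.out.println(h.url + " " + h.ip + " " + h.length);
    }
}
```

Taking the length from the last field keeps one code path for both layouts. A fourth leading linefeed leaves an empty header line, which the field-count check rejects.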
File tree
5 files changed, +411 -172 lines changed
- src/org/netpreserve/jwarc
- test/org/netpreserve/jwarc/apitests