Commit 132bd52

Update CHANGELOG.md for 0.32.0 release
1 parent 065a299 commit 132bd52

1 file changed: +24 -0 lines changed

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Changelog

## 0.32.0

### Added

- HeaderValidator with WARC/1.1 standard ruleset
- ExtractTool: can now extract sequential concurrent records (`--concurrent` option)
- DedupeTool
  - In-memory cache for cross-URL digest-based deduplication (`--cache-size` option)
  - Now prints deduplication statistics (`--dry-run` and `--quiet` options)
  - Multi-threaded deduplication (`--threads` option)
- ValidateTool
  - Multi-threaded validation (`--threads` option)
- ParsingException message is now annotated with the source filename and record offset when available
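
For readers unfamiliar with cross-URL digest-based deduplication, the following is a minimal Python sketch of how a bounded in-memory digest cache could behave. It is an illustration only, not the tool's implementation: the names `DigestCache`, `FirstSeen`, and `check` are hypothetical, and least-recently-used eviction is an assumption, since the changelog does not say how the `--cache-size` limit is enforced.

```python
from collections import OrderedDict
from typing import NamedTuple, Optional


class FirstSeen(NamedTuple):
    """Hypothetical record of where a payload digest was first captured."""
    url: str
    date: str


class DigestCache:
    """Bounded map from payload digest to the first record seen with it."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, FirstSeen]" = OrderedDict()

    def check(self, digest: str, url: str, date: str) -> Optional[FirstSeen]:
        """Return the earlier record on a cache hit, else remember this one."""
        hit = self._entries.get(digest)
        if hit is not None:
            # Same payload seen before, possibly under a different URL.
            self._entries.move_to_end(digest)  # mark as recently used
            return hit
        self._entries[digest] = FirstSeen(url, date)
        if len(self._entries) > self.max_entries:
            # Assumed policy: evict the least recently used digest.
            self._entries.popitem(last=False)
        return None


cache = DigestCache(max_entries=2)
records = [
    ("sha1:AAA", "http://example.org/a.png", "2024-01-01T00:00:00Z"),
    ("sha1:BBB", "http://example.org/b.png", "2024-01-01T00:00:01Z"),
    ("sha1:AAA", "http://example.org/copy-of-a.png", "2024-01-01T00:00:02Z"),
]
results = [cache.check(digest, url, date) for digest, url, date in records]
```

In this sketch the third record is reported as a duplicate of the first even though its URL differs, because the lookup key is the payload digest alone. A smaller cache trades missed duplicates for lower memory use.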

### Fixed

- RFC5952 canonical form is now used for IPv6 addresses in WARC-IP-Address
- HttpParser in lenient mode now:
  - accepts responses missing version number
  - ignores header lines missing `:`
  - ignores folded status lines
- WarcParser: treats `alexa/dat` ARC records as not HTTP type
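
As a rough illustration of what the RFC 5952 canonical form looks like, here is a short Python sketch using the standard library's `ipaddress` module. It is not the library's own code, and the helper name `canonical_ipv6` is hypothetical. For the plain addresses shown, `str()` on an `IPv6Address` yields lowercase hex digits with leading zeros dropped and the longest run of zero fields collapsed to `::`.

```python
import ipaddress


def canonical_ipv6(text: str) -> str:
    """Return the compressed, lowercase text form of an IPv6 address."""
    return str(ipaddress.IPv6Address(text))


examples = [
    "2001:0DB8:0000:0000:0000:0000:0000:0001",  # uppercase, leading zeros
    "2001:db8:0:0:1:0:0:1",                     # two zero runs of equal length
    "2001:db8:0:1:1:1:1:1",                     # a single zero field
]
canonical = [canonical_ipv6(address) for address in examples]
# canonical == ["2001:db8::1", "2001:db8::1:0:0:1", "2001:db8:0:1:1:1:1:1"]
```

When two zero runs tie in length, only the first is shortened, and a single zero field is left as `0` rather than replaced by `::`.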
