Releases: q-m/scrapy-webarchive

0.5.2

13 Mar 19:14

  • Filter out revisits
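
For context, a WARC "revisit" record notes that a URL was captured again with an unchanged payload, so it carries no response body of its own. The sketch below shows one way such entries could be skipped when reading a CDXJ index. It assumes the common CDXJ line layout (`urlkey timestamp {json}`) and the convention of marking revisits with `"mime": "warc/revisit"`; the helper names `parse_cdxj_line` and `drop_revisits` are hypothetical and are not taken from scrapy-webarchive.

```python
import json


def parse_cdxj_line(line):
    """Split one CDXJ line into (urlkey, timestamp, fields).

    Assumes the layout "urlkey timestamp {json}".
    """
    urlkey, timestamp, payload = line.strip().split(" ", 2)
    return urlkey, timestamp, json.loads(payload)


def drop_revisits(lines):
    """Yield only the index lines that are not revisit entries."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, _, fields = parse_cdxj_line(line)
        # Assumption: revisit entries are marked with this MIME value.
        if fields.get("mime") == "warc/revisit":
            continue
        yield line


sample_index = [
    'com,example)/ 20240313190000 {"url": "https://example.com/", "mime": "text/html", "status": "200"}',
    'com,example)/ 20240313191400 {"url": "https://example.com/", "mime": "warc/revisit", "status": "200"}',
]
kept = list(drop_revisits(sample_index))
```

With the two sample lines above, only the first entry (the `text/html` capture) is kept.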

Full Changelog: 0.5.1...0.5.2

0.5.1

12 Mar 16:07

  • Fix mismatch between crawling from local storage vs. S3

Full Changelog: 0.5.0...0.5.1

0.5.0

11 Mar 11:35

  • Add archive_regexp, archive_blacklist_regexp; remove archive_disallow_regexp (#39) - possibly breaking change
  • Ignore unrecognized index entries (#38)
  • Fix reading compressed index files (#42)
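
To illustrate the first item, here is a minimal sketch of how an allow pattern and a blacklist pattern could be combined to decide whether a URL is served from the archive. Only the names `archive_regexp` and `archive_blacklist_regexp` come from the note above; the function `url_allowed` is hypothetical, and the exact semantics (allow first, then blacklist, matching with `re.search`) are an assumption, not a description of the library's implementation.

```python
import re


def url_allowed(url, archive_regexp=None, archive_blacklist_regexp=None):
    """Return True if the URL passes both filters.

    Assumed semantics: a URL must match archive_regexp (when set)
    and must not match archive_blacklist_regexp (when set).
    """
    if archive_regexp is not None and not re.search(archive_regexp, url):
        return False
    if archive_blacklist_regexp is not None and re.search(archive_blacklist_regexp, url):
        return False
    return True


# Example patterns, chosen only for illustration.
allow = r"/products/"
blacklist = r"[?&]page="

in_scope = url_allowed("https://example.com/products/1", allow, blacklist)
paginated = url_allowed("https://example.com/products/?page=2", allow, blacklist)
off_scope = url_allowed("https://example.com/about", allow, blacklist)
```

Under these assumptions, `in_scope` is `True`, while `paginated` and `off_scope` are `False`. Code that still sets `archive_disallow_regexp` would need to move to the new names after upgrading, which is why the change is flagged as possibly breaking.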

Full Changelog: 0.4.1...0.5.0

0.4.1

17 Nov 07:28

What's Changed

  • Fix for getting spider name in different scrapy versions

Full Changelog: 0.4.0...0.4.1

0.4.0

28 Feb 13:55

What's Changed

Full Changelog: 0.3.0...0.4.0

0.3.0

13 Jan 09:31

  • Change _check_configuration_prerequisites logic in WaczExporter

Full Changelog: 0.2.0...0.3.0

v0.2.0 - Hotfixes

10 Jan 10:40

Initial Release

10 Jan 09:32

  • Save web crawls in WACZ format (multiple storages supported; local and cloud).
  • Crawl against WACZ format archives.