Add getzim.sh script #40
Seems related to #58?
Thank you @mkg20001!
Quick feedback:
- Add or switch to non-interactive mode. We need it for Automate snapshot updates #58, something like:
  - `getzim --list` should return a list of available wikis
  - `getzim --wiki <id> --latest` would return the name of the latest snapshot of a specified wiki
  - `getzim --wiki <id> --get <snapshot>` would download the specified snapshot
- Do you know why I can't see `wikipedia` on the list?
- Instead of grepping Wiki HTML, try grepping directory listings from https://download.kiwix.org/zim/ (and https://download.kiwix.org/zim/wikipedia/ etc)
  - Should be enough for now. In the future we will consider switching to https://wiki.kiwix.org/wiki/OPDS (as suggested in Automate snapshot updates #58 (comment)), but for now the API is not stable enough
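The directory-listing approach suggested above could be sketched roughly as follows. This is not the code in getzim.sh: `list_zims` and `latest_zim` are hypothetical helper names, and the sketch assumes the listing is plain HTML with `href="<name>.zim"` links and that snapshot names end in `_YYYY-MM.zim`, as in the `wikipedia_tr_all_maxi_2019-10.zim` example from this thread.

```shell
#!/usr/bin/env bash
# Sketch only: list_zims and latest_zim are hypothetical helpers, not part of getzim.sh.

# Read an HTML directory listing on stdin and print the unique .zim file names it links to.
# Assumes links look like href="name.zim" (no path separators in the name).
list_zims() {
  grep -oE 'href="[^"/]+\.zim"' | sed -E 's/^href="//; s/"$//' | LC_ALL=C sort -u
}

# Print the newest snapshot for a given name prefix, e.g. wikipedia_tr_all_maxi.
# Assumes names end in _YYYY-MM.zim, so a plain lexical sort orders them by date.
# The prefix is used inside a regex, so it should not contain regex metacharacters.
latest_zim() {
  local prefix="$1"
  list_zims | grep -E "^${prefix}_[0-9]{4}-[0-9]{2}\.zim$" | LC_ALL=C sort | tail -n 1
}

# Example (needs network access):
#   curl -sL https://download.kiwix.org/zim/wikipedia/ | latest_zim wikipedia_tr_all_maxi
```

Reading the listing from stdin keeps the parsing separate from the download, so the same helpers work on a cached copy of the listing.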
getzim.sh (outdated)
    if [ ! -e ".content" ]; then
      echo "Downloading content list..."
      curl -s http://wiki.kiwix.org/wiki/Content_in_all_languages > .content
It did not work for me unless I enabled following the redirect to HTTPS:
    - curl -s http://wiki.kiwix.org/wiki/Content_in_all_languages > .content
    + curl -sL http://wiki.kiwix.org/wiki/Content_in_all_languages > .content
Maybe it's better to just use the HTTPS version: https://wiki.kiwix.org/wiki/Content_in_all_languages
I've changed the script to work with the download.kiwix.org site directly.
There's a command […] One of these for example is […]
Note that currently SHA256 validation is broken and needs to be fixed.
@mkg20001 Apologies for missing the notification on this! I did a test run with `bash getzim.sh download wikipedia wikipedia tr all maxi any` and the download worked as expected 👍
Let me know if you have time to make two changes before we merge this:
- Rename `any` to `latest` (I believe you already sort to ensure it returns the latest entry, so it is a better name anyway)
- Add a `url` command to print the URL behind `latest` without downloading it, something like:
      $ bash getzim.sh url wikipedia wikipedia tr all maxi latest
      https://download.kiwix.org/zim/wikipedia/wikipedia_tr_all_maxi_2019-10.zim
  The idea is to use this in automation (Automate snapshot updates #58) to tell if `latest` is newer than the last successful snapshot build.
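For the automation case in #58, the "is `latest` newer than the last successful snapshot" check could look something like the sketch below. `snapshot_date` and `is_newer` are hypothetical names, and it assumes every snapshot file name ends in `_YYYY-MM.zim` like the example URL above.

```shell
#!/usr/bin/env bash
# Sketch only: snapshot_date and is_newer are hypothetical helpers, not part of getzim.sh.

# Print the YYYY-MM suffix of a ZIM URL or file name (assumes names end in _YYYY-MM.zim).
snapshot_date() {
  basename "$1" .zim | grep -oE '[0-9]{4}-[0-9]{2}$'
}

# is_newer LATEST LAST: succeed only if LATEST has a strictly newer date than LAST.
is_newer() {
  local latest last
  latest="$(snapshot_date "$1")"
  last="$(snapshot_date "$2")"
  [ -n "$latest" ] && [ -n "$last" ] || return 1
  # Turn 2019-10 into 201910 so the dates can be compared as integers.
  [ "${latest%-*}${latest#*-}" -gt "${last%-*}${last#*-}" ]
}

# Example (the url command needs network access):
#   latest="$(bash getzim.sh url wikipedia wikipedia tr all maxi latest)"
#   is_newer "$latest" "$last_built" && echo "new snapshot available"
```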
The url command did already exist ([…]).
`any` is now `latest`.
Now the url command also returns JSON and prints the log to stderr.
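Sending the log to stderr matters because callers can then capture stdout and get only the machine-readable result. A minimal illustration of that split is below; the `log` and `print_url_json` names and the `url` field are assumptions made for this sketch, since the actual JSON shape is not shown in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: the function names and the JSON field name "url" are assumptions,
# not the actual output format of getzim.sh.

# Write progress messages to stderr so they never mix with the result on stdout.
log() {
  printf '%s\n' "$*" >&2
}

# Print a one-field JSON object for the given URL on stdout.
# No JSON escaping is done; this assumes the URL contains no quotes or backslashes.
print_url_json() {
  log "Resolving snapshot URL..."
  printf '{"url":"%s"}\n' "$1"
}

# A caller sees the log on the terminal but captures only the JSON:
#   result="$(print_url_json "$some_url")"
```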
Thank you!
Works as expected, can be refined later if needed, let's merge this.
@mkg20001 are you able to apply the fix below? (I can't commit to your repo)
Co-Authored-By: Marcin Rataj <[email protected]>
The cache being filled without cache_update was sorta an unintended feature, but I think we can just make this a proper feature.
@mkg20001 is there anything you want to add, or is it OK to merge? (It can be a separate PR.)
Not really. Don't fix it if it ain't broken :P
Thanks again!
The script allows the user to pick a source and download it from the CLI without having to visit http://wiki.kiwix.org/wiki/Content_in_all_languages
It also automatically resumes the download and verifies the md5sum.
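
As a rough idea of how resume plus checksum verification can be done with standard tools, here is a minimal sketch. It is not taken from getzim.sh: `fetch_resumable` and `verify_md5` are hypothetical names, and it assumes `curl` and GNU `md5sum` are available (macOS ships `md5` instead).

```shell
#!/usr/bin/env bash
# Sketch only: fetch_resumable and verify_md5 are hypothetical helpers, not part of getzim.sh.

# fetch_resumable URL FILE: download URL to FILE, continuing a partial FILE if one exists.
# -C - asks curl to work out the resume offset from the existing file,
# -L follows redirects, --fail turns HTTP errors into a non-zero exit status.
fetch_resumable() {
  curl -L --fail -C - -o "$2" "$1"
}

# verify_md5 FILE EXPECTED_HEX: succeed only if FILE's MD5 digest equals EXPECTED_HEX.
verify_md5() {
  local actual
  actual="$(md5sum "$1" | cut -d ' ' -f 1)"
  [ "$actual" = "$2" ]
}

# Example (needs network access; $url and $expected_md5 come from the caller):
#   fetch_resumable "$url" snapshot.zim && verify_md5 snapshot.zim "$expected_md5"
```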