Rendering pages from XML sources instead of kiwix HTML dumps #9
Comments
@flyingzumwalt this might become more important if we are not able to get an updated Turkish dump and fix the Arabic dump.
If I remember this right, the Zim files are created from the public MediaWiki/Parsoid API. This means you may be able to skip the existing dumps entirely and get the content closer to the source, dropping the "create a zim file, send it around the internet" step in the middle of the current process. There is some documentation about this here: https://wikitech.wikimedia.org/wiki/Nova_Resource_Talk:Mwoffliner In another project ( https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#runUpdate.sh ) the public MediaWiki changes API is polled, so a mirror is created that usually lags by only seconds. I don't know if Mwoffliner already supports it, but these two approaches could be combined.
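To make the polling idea concrete, here is a minimal sketch of one polling step against the MediaWiki Action API (`action=query&list=recentchanges`). It is not how Mwoffliner or the Wikidata query service updater works; the function names, the example endpoint, and the User-Agent string are assumptions made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint for illustration; any MediaWiki api.php URL would do.
EXAMPLE_API = "https://tr.wikipedia.org/w/api.php"


def build_changes_url(api, since, limit=50):
    """Build a recentchanges query for edits at or after `since` (ISO 8601 UTC)."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|timestamp|ids",
        "rcdir": "newer",   # oldest first, starting at rcstart
        "rcstart": since,
        "rclimit": limit,
        "format": "json",
    }
    return api + "?" + urlencode(params)


def changed_titles(response, since):
    """Return (unique titles in first-seen order, newest timestamp seen).

    If the response contains no changes, `since` is returned unchanged.
    """
    changes = response.get("query", {}).get("recentchanges", [])
    titles = []
    seen = set()
    newest = since
    for change in changes:
        title = change["title"]
        if title not in seen:
            seen.add(title)
            titles.append(title)
        # Timestamps share one fixed ISO 8601 format, so string order is time order.
        if change["timestamp"] > newest:
            newest = change["timestamp"]
    return titles, newest


def poll_once(api, since):
    """Fetch one batch of changes; the caller re-renders the returned titles."""
    # Placeholder User-Agent (an assumption); use one that identifies your tool.
    request = Request(build_changes_url(api, since),
                      headers={"User-Agent": "mirror-sketch/0.1 (example)"})
    with urlopen(request) as resp:
        return changed_titles(json.load(resp), since)
```

A mirror would call `poll_once` in a loop, pass the returned timestamp back in as the next `since`, and re-fetch only the pages whose titles were returned. A real updater would also need to follow the API's `continue` tokens and handle deletions and page moves, which this sketch leaves out.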
@lidel @JanZerebecki @Kubuxu Anything we can do to help here? I'm not aware of any ticket still open on the Kiwix/openZIM side related to those "old" complaints.
See also https://github.com/derhuerst/wikipedia-edits-stream
I believe this is not feasible yet; we superseded the idea with #42.
Zim files are a nice source, but they are a bit limiting. The layout is fixed (and it doesn't look like Wikipedia), and they are updated much more rarely than the XML dumps.
Rendering XML ourselves would be quite an effort, so I think it is a long-term goal.
Zim files are also a good source of MediaWiki assets (as no per-language dumps of assets are available).