Rendering pages from XML sources instead of kiwix HTML dumps #9

Closed
Kubuxu opened this issue May 1, 2017 · 5 comments

Comments

@Kubuxu
Member

Kubuxu commented May 1, 2017

Zim files are a nice source, but they are a bit limiting. The layout is fixed (and it isn't Wikipedia-like), and they are updated much more rarely than XML dumps.

Rendering XML ourselves would be quite an effort, so I think it is a long-term goal.

They are also a good source of MediaWiki assets (as no per-language dumps of assets are available).

@flyingzumwalt changed the title from "Rendering XMLs" to "Rendering pages from XML sources instead of kiwix HTML dumps" on May 12, 2017
@Kubuxu
Member Author

Kubuxu commented May 12, 2017

@flyingzumwalt this might become more important if we are not able to get an updated Turkish dump and fix the Arabic dump.

@JanZerebecki

If I remember correctly, the Zim files are created from the public MediaWiki/Parsoid API. This means you may be able to skip the existing dumps entirely and fetch the content closer to the source, dropping the "create a zim file, send it around the internet" step in the middle of the current process. There is some documentation about this here: https://wikitech.wikimedia.org/wiki/Nova_Resource_Talk:Mwoffliner

In another project ( https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#runUpdate.sh ), the public MediaWiki changes API is polled, so a mirror is created that usually lags by only seconds.

I don't know if Mwoffliner already supports it, but these approaches could be combined.
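
To make the polling idea concrete, here is a rough Python sketch of one polling step against the MediaWiki Action API's `list=recentchanges` query. It is not how runUpdate.sh or Mwoffliner work internally; the endpoint choice, the helper names (`build_recentchanges_url`, `fetch_json`, `poll_once`) and the User-Agent string are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the standard Action API endpoint of English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_recentchanges_url(since, limit=50, api_url=API_URL):
    """Build a list=recentchanges query for changes at or after `since`."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|ids|timestamp",
        "rcdir": "newer",   # oldest first, so a cursor can move forward
        "rcstart": since,   # ISO 8601 timestamp, e.g. "2017-05-01T00:00:00Z"
        "rclimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    # Hypothetical User-Agent; a real client should identify itself properly.
    request = Request(url, headers={"User-Agent": "mirror-sketch/0.1 (example)"})
    with urlopen(request) as response:
        return json.load(response)


def poll_once(since, fetch=fetch_json):
    """Return (titles changed since `since`, new cursor timestamp)."""
    data = fetch(build_recentchanges_url(since))
    changes = data.get("query", {}).get("recentchanges", [])
    titles = []
    for change in changes:
        # Keep each title once, in order of first appearance.
        if change["title"] not in titles:
            titles.append(change["title"])
    # Advance the cursor to the newest change seen, or keep it if none.
    cursor = changes[-1]["timestamp"] if changes else since
    return titles, cursor
```

A mirror would call `poll_once` in a loop, re-render the returned titles, and pass the returned cursor into the next call. Result continuation and rate limiting are deliberately left out of this sketch.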

@kelson42

kelson42 commented Sep 9, 2019

@lidel @JanZerebecki @Kubuxu Anything we can do to help here? I'm not aware of any ticket still open on the Kiwix/openZIM side related to these "old" complaints!

@derhuerst

> In another project ( https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#runUpdate.sh ), the public MediaWiki changes API is polled, so a mirror is created that usually lags by only seconds.

See also https://github.com/derhuerst/wikipedia-edits-stream .

@lidel
Member

lidel commented Feb 15, 2021

I believe this is not feasible yet; we superseded the idea with #42.

@lidel closed this as completed on Feb 15, 2021