Hello, I am a tourist to your project. I think there is some low-hanging fruit / best practices standardization you can follow to help the world better consume your project.
Bless some official directories
I already opened a ticket on this issue. It probably should have been a discussion instead; my apologies. A few people have already chimed in to say they would find it beneficial. I think the questions you have to answer are:
Do you want to bless an official directory for storing the tldr-pages for system-wide/packager installation? What would it be on Windows?
Is the tldr.zip stored there or are decompressed pages stored there? Or do you want to do it like man and store individually compressed text files?
My own two cents: the single zip file is good. It'll be fast to sync, you don't need to do all the filesystem ops to decompress, and clients will likely find it faster to read pages directly from the zip than from individual files on the filesystem.
Is the tldr directory intended to be extended by 3rd parties dropping their own pages in there?
This is only relevant if you distribute it as a zip file, because then you either have to expect them to modify the zip (bad!) or extend the spec to instruct clients to look for additional zips. I suspect you don't want to support this anyway, because it looks like you've been very welcoming of new PRs.
Do you want to bless a per-user directory like clients have been using for their caches?
Localization
If you banish the localized pages from tldr.zip you can ship something like the tldr-pages.LANG.zip that you currently build alongside it. This lets users download only the data they're interested in, reduces the size of the archive the clients have to dig through, lets packagers reduce the size of what they're shipping and lets packagers automatically install the correct language support for a localized user.
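As a rough sketch of how a client or packager might choose which per-language archive to fetch: the `tldr-pages.LANG.zip` pattern comes from the text above, but the fallback order (full locale, then bare language, then English) and the function name are my own assumptions, not anything the project specifies.

```python
def archive_candidates(lang_env: str, default: str = "en") -> list[str]:
    """Return archive names to try, most specific first (hypothetical helper)."""
    # Strip the encoding and modifier: "pt_BR.UTF-8" -> "pt_BR", "de_DE@euro" -> "de_DE".
    code = lang_env.split(".")[0].split("@")[0]
    names: list[str] = []
    if code and code not in ("C", "POSIX"):
        names.append(code)
        base = code.split("_")[0]
        if base != code:
            names.append(base)
    # Assumed fallback: always end with the default language.
    if default not in names:
        names.append(default)
    return [f"tldr-pages.{name}.zip" for name in names]
```

A client would typically pass in the value of the `LANG` environment variable and try each name in order until one is available.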
Versioning on the filesystem
You can use versioned filenames and symlinks to support multiple versions of tldr pages installed on the same system. e.g. symlink tldr.zip and tldr-2.zip to tldr-2.3.zip.
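A minimal sketch of what an installer could do on a POSIX system to maintain those links. The function names are hypothetical, and the rename-over-the-old-link step is my own suggestion for keeping the update atomic.

```python
import os


def point_alias(directory: str, alias: str, target: str) -> None:
    """Make `alias` a symlink to `target`, replacing any existing alias in one step."""
    alias_path = os.path.join(directory, alias)
    tmp_path = alias_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    # Relative link target, so the links stay valid if the directory is moved.
    os.symlink(target, tmp_path)
    # Renaming over the old link means readers never see a missing tldr.zip.
    os.replace(tmp_path, alias_path)


def install_release(directory: str, version: str) -> None:
    """Point tldr.zip and tldr-<major>.zip at tldr-<version>.zip."""
    major = version.split(".")[0]
    versioned = f"tldr-{version}.zip"
    point_alias(directory, f"tldr-{major}.zip", versioned)
    point_alias(directory, "tldr.zip", versioned)
```

Calling `install_release(directory, "2.3")` yields the layout from the example: `tldr.zip` and `tldr-2.zip` both resolve to `tldr-2.3.zip`. Windows would need a different mechanism, since creating symlinks there usually requires extra privileges.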
Don't change published artifacts!
I see there has been some iteration on this already (e.g. #12048). You've learned the hard way that it's bad to check zip files into a git repository. I want to add that it is considered improper/impolite/inappropriate to change an artifact after you've released it. Right now you're clobbering the same tldr.zip filename with every release, multiple times a day, and don't seem to provide a versioned history of the file. It is a better practice to instead:
come up with a unique name for each release. A couple of ideas here are the git commit ID, an incrementing serial number, or a YYYYMMDDXX timestamp (where XX is a serial number for the releases done on that day).
publish each new release with a unique filename e.g. tldr-v2.3-GITHASH.zip
provide tldr.zip (or tldr-current.zip) as a symlink/redirect to the latest version and update that on release. Clients interested in fetching the latest can still get it by hitting a single URL; clients who need to verify hashes can fetch a specific versioned file.
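The naming and verification steps above can be sketched roughly as follows. The 12-character commit prefix and the choice of SHA-256 are assumptions for illustration; the `tldr-v2.3-GITHASH.zip` shape is the one suggested above.

```python
import hashlib
import re


def release_filename(version: str, commit: str) -> str:
    """Build a name like tldr-v2.3-<short commit>.zip that is never reused."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        raise ValueError("expected a hexadecimal git commit ID")
    # Assumption: a 12-character prefix is unique enough for this repository.
    return f"tldr-v{version}-{commit[:12]}.zip"


def sha256_hex(data: bytes) -> str:
    """Digest the release process would publish alongside each versioned file."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_sha256: str) -> bool:
    """True if the downloaded bytes match the digest recorded for that release."""
    return sha256_hex(data) == expected_sha256.lower()
```

This only works if the versioned file is immutable: a packager records the digest once, and any later download of the same filename must produce the same bytes.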
Host the zips yourself
I suspect that the reason you're using your current release process is that you want to offload the costs of hosting onto GitHub Pages. I also suspect you're clobbering the same zipfile over and over in GitHub releases because you don't want to clutter the releases feed on GitHub. All of this and the previous section are code smells (release engineering smells?) downstream from avoiding the responsibility of hosting your own source files. I want to encourage you all not to be afraid: you are all capable of serving source files cheaply and reliably!
You don't have to pay cloud prices for compute or bandwidth. You can get away with rented dedicated servers or VPSes. There are lots of providers.
You don't need to have a single point of failure. Round-robin DNS will get you load balancing and it's not uncommon to publish a list of active mirrors that a client could consume.
You might not need to pay anything at all: there are several businesses who offer free mirroring for open source projects.
Running the server yourself is not challenging. It is a standard Linux machine with nginx/apache installed. Updating your mirror from the master copy is running rsync in a cronjob.
If you are unable to find any volunteers willing to mirror the tldr zips I will throw my hat in and offer mirroring if it helps break the ice jam.
Rsync / binary diff
Once you have your own server you can provide rsync updates as part of your spec by running an rsync daemon alongside the HTTP server. This is a win-win-win for users: you can release tldr.zip multiple times a day, clients can update themselves multiple times a day, and users get fast updates with minimal data transfers. rsync is standard for this but perhaps there are other tools out there now. Clients that don't want a dependency on rsync can keep using HTTP. Clients that do want to use rsync can choose to use either it or HTTP to bootstrap the initial copy of the file.
Although rsync will gladly sync a whole directory tree of individual tldr pages, you'll likely find it faster to have it sync a single zipfile. If you are shipping a compressed zipfile, you want to disable rsync's transfer compression (rsync --no-compress). It can also be beneficial to ship a zipfile with zero compression (zip -0) and enable rsync transfer compression (rsync -z): rsync can find better diffs in uncompressed data, and the client gets a smaller download from the rsync transfer compression of the diff.
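To make the flag choice concrete, here is a small sketch of how a client might build its rsync invocation and fall back to HTTP when rsync is not installed. The mirror URL in the usage note is a placeholder and both function names are hypothetical.

```python
import shutil


def rsync_command(source: str, dest: str, archive_is_compressed: bool) -> list[str]:
    """Build the rsync argument list for updating a local copy of the zip."""
    # Compressed zip: skip transfer compression. Stored (zip -0) zip: enable it.
    flag = "--no-compress" if archive_is_compressed else "-z"
    return ["rsync", "--times", flag, source, dest]


def choose_transport() -> str:
    """Prefer rsync when the binary is on PATH, otherwise keep using HTTP."""
    return "rsync" if shutil.which("rsync") else "http"
```

For example, `rsync_command("rsync://mirror.example.org/tldr/tldr.zip", "tldr.zip", archive_is_compressed=False)` produces a command suitable for `subprocess.run`; the host name is made up.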
If you ship uncompressed zip files, HTTP clients can still benefit from transfer compression via the Accept-Encoding header. Typically it's turned off for mirrors, but you can certainly enable it or leave it turned on by default. nginx even has optimizations if you don't want to re-encode the file for each request: see the gzip_static setting.
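A quick way to check the size trade-off for yourself is to build the same archive stored and deflated, then gzip the stored one as an HTTP server would for transfer. This sketch uses synthetic page content that I made up, so the numbers it prints only illustrate the comparison and say nothing about the real tldr.zip.

```python
import gzip
import io
import zipfile


def build_zip(pages: dict[str, str], compression: int) -> bytes:
    """Write all pages into an in-memory zip using the given compression method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name in sorted(pages):
            zf.writestr(name, pages[name])
    return buf.getvalue()


# Synthetic stand-ins for tldr pages (not real project data).
pages = {
    f"pages/common/cmd{i}.md": f"# cmd{i}\n\n> Example page {i}.\n\n- Run it:\n\n`cmd{i} --help`\n"
    for i in range(200)
}

stored = build_zip(pages, zipfile.ZIP_STORED)      # equivalent of zip -0
deflated = build_zip(pages, zipfile.ZIP_DEFLATED)  # ordinary compressed zip
stored_gz = gzip.compress(stored)                  # what transfer compression would send

print("stored:", len(stored), "deflated:", len(deflated), "stored+gzip:", len(stored_gz))
```

Compressing the whole stored archive in one stream lets the compressor exploit redundancy across pages, which per-file deflate inside the zip cannot do.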
Reproducible builds and zip files
See https://reproducible-builds.org/ for what I'm referring to here, especially https://reproducible-builds.org/docs/archives/. You likely want to do something like sort the archive contents alphabetically and set timestamps to a constant value. I am not sure what zip tooling is available to do this.
This is a good hygienic release engineering practice in general. However, I'm dropping it in here because it also creates smaller deltas and faster transfers when updating your zipfile via rsync.
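One option that needs no extra tooling is Python's standard `zipfile` module. The sketch below is only an illustration under my own assumptions: it sorts entry names, pins every timestamp to 1980-01-01 (the earliest date the zip format can store), and fixes the permission bits and the "created on" field so the host platform does not leak into the archive. Byte-identical output also assumes the same zlib version on every build machine.

```python
import io
import zipfile

FIXED_TIME = (1980, 1, 1, 0, 0, 0)  # constant timestamp for every entry


def build_reproducible_zip(pages: dict[str, bytes]) -> bytes:
    """Return zip bytes that depend only on the page names and contents."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(pages):               # stable, alphabetical entry order
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3               # always record "Unix", whatever the build host
            info.external_attr = 0o644 << 16     # fixed file permissions
            zf.writestr(info, pages[name])
    return buf.getvalue()
```

Two builds from the same pages then produce the same bytes regardless of the order in which the files were read, which is what keeps rsync deltas small between releases.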