-
-
Notifications
You must be signed in to change notification settings - Fork 10
/
Copy patharchiver.1.man
269 lines (257 loc) · 9.85 KB
/
archiver.1.man
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "KAWIPIKO-ARCHIVER" "1" "2023-03-05" "volution.ro" "kawipiko"
.SH NAME
kawipiko -- blazingly fast static HTTP server \- kawipiko-archiver
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
>> kawipiko\-archiver \-\-help
>> kawipiko\-archiver \-\-man
.ft P
.fi
.UNINDENT
.UNINDENT
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
\-\-sources <path>
\-\-archive <path>
\-\-compress <gzip | zopfli | brotli | identity>
\-\-compress\-level <number>
\-\-compress\-cache <path>
\-\-sources\-cache <path>
\-\-exclude\-index
\-\-exclude\-strip
\-\-exclude\-cache
\-\-include\-etag
\-\-exclude\-slash\-redirects
\-\-include\-folder\-listing
\-\-exclude\-paths\-index
\-\-progress \-\-debug
\-\-version
\-\-help (show this short help)
\-\-man (show the full manual)
\-\-sources\-md5 (dump an \(ga\(gamd5sum\(ga\(ga of the sources)
\-\-sources\-cpio (dump a \(ga\(gacpio.gz\(ga\(ga of the sources)
\-\-sbom \-\-sbom\-text \-\-sbom\-json
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
.ce
----
.ce 0
.sp
.SH FLAGS
.sp
\fB\-\-sources\fP
.INDENT 0.0
.INDENT 3.5
The path to the source folder that is the root of the static website content.
.UNINDENT
.UNINDENT
.sp
\fB\-\-archive\fP
.INDENT 0.0
.INDENT 3.5
The path to the target CDB file that contains the archived static content.
.UNINDENT
.UNINDENT
.sp
\fB\-\-compress\fP, and \fB\-\-compress\-level\fP
.INDENT 0.0
.INDENT 3.5
Each individual file (and consequently of the corresponding HTTP response body) is compressed with either \fBgzip\fP, \fBzopfli\fP or \fBbrotli\fP; by default (or alternatively with \fBidentity\fP) no compression is used.
.sp
Even if compression is explicitly requested, if the compression ratio is bellow a certain threshold (depending on the uncompressed size), the file is stored without any compression.
(It\(aqs senseless to force the client to spend time and decompress the response body if that time is not recovered during network transmission.)
.sp
The compression level can be chosen, the value depending on the algorithm:
.INDENT 0.0
.IP \(bu 2
\fBgzip\fP \-\- \fB\-1\fP for algorithm default, \fB\-2\fP for Huffman only, \fB0\fP to \fB9\fP for fast to slow;
.IP \(bu 2
\fBzopfli\fP \-\- \fB\-1\fP for algorithm default, \fB0\fP to \fB30\fP iterations for fast to slow;
.IP \(bu 2
\fBbrotli\fP \-\- \fB\-1\fP for algorithm default, \fB0\fP to \fB9\fP for fast to slow, \fB\-2\fP for extreme;
.IP \(bu 2
(by \(dqalgorithm default\(dq, it is meant \(dqwhat that algorithm considers the recommended default compression level\(dq;)
.IP \(bu 2
\fBkawipiko\fP by default uses the maximum compression level for each algorithm; (i.e. \fB9\fP for \fBgzip\fP, \fB30\fP for \fBzopfli\fP, and \fB\-2\fP for \fBbrotli\fP;)
.UNINDENT
.UNINDENT
.UNINDENT
.sp
\fB\-\-compress\-cache <path>\fP, and \fB\-\-sources\-cache <path>\fP
.INDENT 0.0
.INDENT 3.5
At the given path a single file is created (that is an BBolt database), that will be used to cache the following information:
.INDENT 0.0
.IP \(bu 2
in case of \fB\-\-sources\-cache\fP, the fingerprint of each file contents is stored, so that if the file was not changed, re\-reading it shouldn\(aqt be attempted unless it is absolutely necessary; also if the file is small enough, its contents is stored in this database (deduplicated by its fingerprint);
.IP \(bu 2
in case of \fB\-\-compress\-cache\fP the compression outcome of each file contents is stored (deduplicated by its fingerprint), so that compression is done only once over multiple runs;
.UNINDENT
.sp
Each of these caches can be safely reused between multiple related archives, especially when they have many files in common.
Each of these caches can be independently used (or shared).
.sp
Using these caches allows one to very quickly rebuild an archive when only a couple of files have been changed, without even touching the file\-system for the unchanged ones.
.UNINDENT
.UNINDENT
.sp
\fB\-\-exclude\-index\fP
.INDENT 0.0
.INDENT 3.5
Disables using \fB_index.*\fP and \fBindex.*\fP files (where \fB\&.*\fP is one of \fB\&.html\fP, \fB\&.htm\fP, \fB\&.xhtml\fP, \fB\&.xht\fP, \fB\&.txt\fP, \fB\&.json\fP, and \fB\&.xml\fP) to respond to a request whose URL path ends in \fB/\fP (corresponding to the folder wherein \fB_index.*\fP or \fBindex.*\fP file is located).
(This can be used to implement \(dqslash\(dq blog style URL\(aqs like \fB/blog/whatever/\fP which maps to \fB/blog/whatever/index.html\fP\&.)
.UNINDENT
.UNINDENT
.sp
\fB\-\-exclude\-strip\fP
.INDENT 0.0
.INDENT 3.5
Disables using a file with the suffix \fB\&.html\fP, \fB\&.htm\fP, \fB\&.xhtml\fP, \fB\&.xht\fP, and \fB\&.txt\fP to respond to a request whose URL does not exactly match an existing file.
(This can be used to implement \(dqsuffix\-less\(dq blog style URL\(aqs like \fB/blog/whatever\fP which maps to \fB/blog/whatever.html\fP\&.)
.UNINDENT
.UNINDENT
.sp
\fB\-\-exclude\-cache\fP
.INDENT 0.0
.INDENT 3.5
Disables adding an \fBCache\-Control: public, immutable, max\-age=3600\fP header that forces the browser (and other intermediary proxies) to cache the response for an hour (the \fBpublic\fP and \fBmax\-age=3600\fP arguments), and furthermore not request it even on reloads (the \fBimmutable\fP argument).
.UNINDENT
.UNINDENT
.sp
\fB\-\-include\-etag\fP
.INDENT 0.0
.INDENT 3.5
Enables adding an \fBETag\fP response header that contains the SHA256 of the response body.
.sp
By not including the \fBETag\fP header (i.e. the default), and because identical headers are stored only one, if one has many files of the same type (that in turn without \fBETag\fP generates the same headers), this can lead to significant reduction in stored headers blocks, including reducing RAM usage.
(At this moment it does not support HTTP conditional requests, i.e. the \fBIf\-None\-Match\fP, \fBIf\-Modified\-Since\fP and their counterparts; however this \fBETag\fP header might be used in conjuction with \fBHEAD\fP requests to see if the resource has changed.)
.UNINDENT
.UNINDENT
.sp
\fB\-\-exclude\-slash\-redirects\fP
.INDENT 0.0
.INDENT 3.5
Disables adding redirects to/from paths with/without \fI/\fP
(For example, by default, if \fI/file\fP exists, then there is also a \fI/file/\fP redirect towards \fI/file\fP; and vice\-versa from \fI/folder\fP towards \fI/folder/\fP\&.)
.UNINDENT
.UNINDENT
.sp
\fB\-\-include\-folder\-listing\fP
.INDENT 0.0
.INDENT 3.5
Enables the creation of an internal list of folders.
.UNINDENT
.UNINDENT
.sp
\fB\-\-exclude\-paths\-index\fP
.INDENT 0.0
.INDENT 3.5
Disables the creation of an internal list of references that can be used in conjunction with the \fB\-\-index\-all\fP flag of the \fBkawipiko\-server\fP\&.
.UNINDENT
.UNINDENT
.sp
\fB\-\-progress\fP
.INDENT 0.0
.INDENT 3.5
Enables periodic reporting of various metrics.
.UNINDENT
.UNINDENT
.sp
\fB\-\-debug\fP
.INDENT 0.0
.INDENT 3.5
Enables verbose logging.
It will log various information about the archived files (including compression statistics).
.UNINDENT
.UNINDENT
.SH IGNORED FILES
.INDENT 0.0
.IP \(bu 2
any file with the following prefixes: \fB\&.\fP, \fB#\fP;
.IP \(bu 2
any file with the following suffixes: \fB~\fP, \fB#\fP, \fB\&.log\fP, \fB\&.tmp\fP, \fB\&.temp\fP, \fB\&.lock\fP;
.IP \(bu 2
any file that contains the following: \fB#\fP;
.IP \(bu 2
any file that exactly matches the following: \fBThumbs.db\fP, \fB\&.DS_Store\fP;
.IP \(bu 2
(at the moment these rules are not configurable through flags;)
.UNINDENT
.SH WILDCARD FILES
.sp
By placing a file whose name matches \fB_wildcard.*\fP (i.e. with the prefix \fB_wildcard.\fP and any other suffix), it will be used to respond to any request whose URL fails to find a \(dqbetter\(dq match.
.sp
These wildcard files respect the folder hierarchy, in that wildcard files in (direct or transitive) subfolders override the wildcard file in their parents (direct or transitive).
.sp
In addition to \fB_wildcard.*\fP, there is also support for \fB_200.html\fP (or just \fB200.html\fP), plus \fB_404.html\fP (or just \fB404.html\fP).
.SH REDIRECT FILES
.sp
By placing a file whose name is \fB_redirects\fP (or \fB_redirects.txt\fP), it instructs the archiver to create redirect responses.
.sp
The syntax is quite simple:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# This is a comment.
# NOTE: Absolute paths are allowed only at the top of the sources folder.
/some\-path https://example.com/ 301
# NOTE: Relative paths are always, and are reinterpreted as relative to the containing folder.
\&./some\-path https://example.com/ 302
# NOTE: Redirects only for a specific domain. (The protocol is irelevant.)
# (Allowed only at the top of the sources folder.)
://example.com/some\-path https://example.com/ 303
http://example.com/some\-path https://example.com/ 307
https://example.com/some\-path https://example.com/ 308
.ft P
.fi
.UNINDENT
.UNINDENT
.SH SYMLINKS, HARDLINKS, LOOPS, AND DUPLICATED FILES
.sp
You freely use symlinks (including pointing outside of the content root) and they will be crawled during archival respecting the \(dqlogical\(dq hierarchy they introduce.
(Any loop that you introduce into the hierarchy will be ignored and a warning will be issued.)
.sp
You can safely symlink or hardlink the same file (or folder) in multiple places (within the content hierarchy), and its data will be stored only once.
(The same applies to duplicated files that have exactly the same data.)
.\" Generated by docutils manpage writer.
.