|
1 |
| -# `bmap-tools` |
| 1 | +The code at this location is no longer maintained and will |
| 2 | +likely be removed in the future. |
2 | 3 |
|
3 |
| -> The better `dd` for embedded projects, based on block maps. |
4 |
| -
|
5 |
| -## Introduction |
6 |
| - |
7 |
| -`bmaptool` is a generic tool for creating the block map (bmap) for a file and |
8 |
| -copying files using the block map. The idea is that large files, like raw |
9 |
| -system image files, can be copied or flashed a lot faster and more reliably |
10 |
| -with `bmaptool` than with traditional tools, like `dd` or `cp`. |
11 |
| - |
12 |
| -`bmaptool` was originally created for the "Tizen IVI" project and it was used for |
13 |
| -flashing system images to USB sticks and other block devices. `bmaptool` can also |
14 |
| -be used for general image flashing purposes, for example, flashing Fedora Linux |
15 |
| -OS distribution images to USB sticks. |
16 |
| - |
17 |
| -Originally Tizen IVI images had been flashed using the `dd` tool, but bmaptool |
18 |
| -brought a number of advantages. |
19 |
| - |
20 |
| -* Faster. Depending on various factors, like write speed, image size, how full |
21 |
| - the image is, and so on, `bmaptool` was 5-7 times faster than `dd` in the Tizen |
22 |
| - IVI project. |
23 |
| -* Integrity. `bmaptool` verifies data integrity while flashing, which means that |
24 |
| - possible data corruptions will be noticed immediately. |
25 |
| -* Usability. `bmaptool` can read images directly from the remote server, so users |
26 |
| - do not have to download images and save them locally. |
27 |
| -* Protects user's data. Unlike `dd`, if you make a mistake and specify a wrong |
28 |
| - block device name, `bmaptool` is less likely to destroy your data because it has |
29 |
| - protection mechanisms which, for example, prevent `bmaptool` from writing to a |
30 |
| - mounted block device. |
31 |
| - |
32 |
| -## Usage |
33 |
| - |
34 |
| -`bmaptool` supports 2 subcommands: |
35 |
| -* `copy` - copy a file to another file using bmap or flash an image to a block |
36 |
| - device |
37 |
| -* `create` - create a bmap for a file |
38 |
| - |
39 |
| -You can get usage reference for `bmaptool` and all the supported commands using |
40 |
| -the `-h` or `--help` options: |
41 |
| - |
42 |
| -```bash |
43 |
| -$ bmaptool -h # General bmaptool help |
44 |
| -$ bmaptool <cmd> -h # Help on the <cmd> sub-command |
45 |
| -``` |
46 |
| - |
47 |
| -You can also refer to the `bmaptool` manual page: |
48 |
| -```bash |
49 |
| -$ man bmaptool |
50 |
| -``` |
51 |
| - |
52 |
| -## Concept |
53 |
| - |
54 |
| -This section provides general information about the block map (bmap) necessary |
55 |
| -for understanding how `bmaptool` works. The structure of the section is: |
56 |
| - |
57 |
| -* "Sparse files" - the bmap ideas are based on sparse files, so it is important |
58 |
| - to understand what sparse files are. |
59 |
| -* "The block map" - explains what bmap is. |
60 |
| -* "Raw images" - the main usage scenario for `bmaptool` is flashing raw images, |
61 |
| - which this section discusses. |
62 |
| -* "Usage scenarios" - describes various possible bmap and `bmaptool` usage |
63 |
| - scenarios. |
64 |
| - |
65 |
| -### Sparse files |
66 |
| - |
67 |
| -One of the main roles of a filesystem, generally speaking, is to map blocks of |
68 |
| -file data to disk sectors. Different file-systems do this mapping differently, |
69 |
| -and filesystem performance largely depends on how well the filesystem can do |
70 |
| -the mapping. The filesystem block size is usually 4KiB, but may also be 8KiB or |
71 |
| -larger. |
72 |
| - |
73 |
| -Obviously, to implement the mapping, the file-system has to maintain some kind |
74 |
| -of on-disk index. For any file on the file-system, and any offset within the |
75 |
| -file, the index allows you to find the corresponding disk sector, which stores |
76 |
| -the file's data. Whenever we write to a file, the filesystem looks up the index |
77 |
| -and writes to the corresponding disk sectors. Sometimes the filesystem has to |
78 |
| -allocate new disk sectors and update the index (such as when appending data to |
79 |
| -the file). The filesystem index is sometimes referred to as the "filesystem |
80 |
| -metadata". |
81 |
| - |
82 |
| -What happens if a file area is not mapped to any disk sectors? Is this |
83 |
| -possible? The answer is yes. It is possible and these unmapped areas are often |
84 |
| -called "holes". And those files which have holes are often called "sparse |
85 |
| -files". |
86 |
| - |
87 |
| -All reasonable file-systems like Linux ext[234], btrfs, XFS, or Solaris ZFS, |
88 |
| -and even Windows' NTFS, support sparse files. Old and less reasonable |
89 |
| -filesystems, like FAT, do not support holes. |
90 |
| - |
91 |
| -Reading holes returns zeroes. Writing to a hole causes the filesystem to |
92 |
| -allocate disk sectors for the corresponding blocks. Here is how you can create |
93 |
| -a 4GiB file with all blocks unmapped, which means that the file consists of a |
94 |
| -huge 4GiB hole: |
95 |
| - |
96 |
| -```bash |
97 |
| -$ truncate -s 4G image.raw |
98 |
| -$ stat image.raw |
99 |
| - File: image.raw |
100 |
| - Size: 4294967296 Blocks: 0 IO Block: 4096 regular file |
101 |
| -``` |
102 |
| - |
103 |
| -Notice that `image.raw` is a 4GiB file, which occupies 0 blocks on the disk! |
104 |
| -So, the entire file's contents are not mapped anywhere. Reading this file would |
105 |
| -result in reading 4GiB of zeroes. If you write to the middle of the image.raw |
106 |
| -file, you'll end up with 2 holes and a mapped area in the middle. |
107 |
| - |
108 |
| -Therefore: |
109 |
| -* Sparse files are files with holes. |
110 |
| -* Sparse files help save disk space, because, roughly speaking, holes do not |
111 |
| - occupy disk space. |
112 |
| -* A hole is an unmapped area of a file, meaning that it is not mapped anywhere |
113 |
| - on the disk. |
114 |
| -* Reading data from a hole returns zeroes. |
115 |
| -* Writing data to a hole destroys it by forcing the filesystem to map |
116 |
| - corresponding file areas to disk sectors. |
117 |
| -* Filesystems usually operate with blocks, so sizes and offsets of holes are |
118 |
| - aligned to the block boundary. |
119 |
| - |
120 |
| -It is also useful to know that you should work with sparse files carefully. It |
121 |
| -is easy to accidentally expand a sparse file, that is, to map all holes to |
122 |
| -zero-filled disk areas. For example, `scp` always expands sparse files, the |
123 |
| -`tar` and `rsync` tools do the same, by default, unless you use the `--sparse` |
124 |
| -option. Compressing and then decompressing a sparse file usually expands it. |
125 |
| - |
126 |
| -There are 2 ioctl's in Linux which allow you to find mapped and unmapped areas: |
127 |
| -`FIBMAP` and `FIEMAP`. The former is very old and is probably supported by all |
128 |
| -Linux systems, but it is rather limited and requires root privileges. The |
129 |
| -latter is a lot more advanced and does not require root privileges, but it is |
130 |
| -relatively new (added in Linux kernel, version 2.6.28). |
131 |
| - |
132 |
| -Recent versions of the Linux kernel (starting from 3.1) also support the |
133 |
| -`SEEK_HOLE` and `SEEK_DATA` values for the `whence` argument of the standard |
134 |
| -`lseek()` system call. They allow positioning to the next hole and the next |
135 |
| -mapped area of the file. |
136 |
| - |
137 |
| -Advanced Linux filesystems, in modern kernels, also allow "punching holes", |
138 |
| -meaning that it is possible to unmap any aligned area and turn it into a hole. |
139 |
| -This is implemented using the `FALLOC_FL_PUNCH_HOLE` `mode` of the |
140 |
| -`fallocate()` system call. |
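The `SEEK_DATA` and `SEEK_HOLE` mechanism described above can be sketched in Python as follows. This is only an illustration, not how `bmaptool` itself is implemented; the helper names `make_sparse_demo` and `data_extents` are hypothetical, and the sketch assumes a Linux system where `os.SEEK_DATA` and `os.SEEK_HOLE` are available. On a filesystem that does not report holes, the whole file is simply returned as one data extent.

```python
import errno
import os


def make_sparse_demo(path, size=4 * 1024 * 1024, data_offset=1024 * 1024):
    """Create a file of `size` bytes with a single 4 KiB data area (hypothetical helper)."""
    with open(path, "wb") as f:
        f.truncate(size)        # the whole file starts out as one hole
        f.seek(data_offset)
        f.write(b"x" * 4096)    # writing maps this area to disk sectors


def data_extents(path):
    """Return (start, end) byte ranges that the filesystem reports as data."""
    extents = []
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as err:
                if err.errno == errno.ENXIO:  # no more data after `offset`
                    break
                raise
            # The end of the file counts as a hole, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    return extents
```

Calling `data_extents()` on a file created by `make_sparse_demo()` typically reports one short extent around the 1 MiB offset, although the exact boundaries depend on the filesystem block size.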
141 |
| - |
142 |
| -### The bmap |
143 |
| - |
144 |
| -The bmap is an XML file, which contains a list of mapped areas, plus some |
145 |
| -additional information about the file it was created for, for example: |
146 |
| -* SHA256 checksum of the bmap file itself |
147 |
| -* SHA256 checksum of the mapped areas |
148 |
| -* the original file size |
149 |
| -* amount of mapped data |
150 |
| - |
151 |
| -The bmap file is designed to be both easily machine-readable and |
152 |
| -human-readable. All the machine-readable information is provided by XML tags. |
153 |
| -The human-oriented information is in XML comments, which explain the meaning of |
154 |
| -XML tags and provide useful information like amount of mapped data in percent |
155 |
| -and in MiB or GiB. |
156 |
| - |
157 |
| -So, the best way to understand bmap is just to read it. Here is an |
158 |
| -[example of a bmap file](tests/test-data/test.image.bmap.v2.0). |
159 |
| - |
160 |
| -### Raw images |
161 |
| - |
162 |
| -Raw images are the simplest type of system images which may be flashed to the |
163 |
| -target block device, block-by-block, without any further processing. Raw images |
164 |
| -just "mirror" the target block device: they usually start with the MBR sector. |
165 |
| -There is a partition table at the beginning of the image and one or more |
166 |
| -partitions containing filesystems, like ext4. Usually, no special tools are |
167 |
| -required to flash a raw image to the target block device. The standard `dd` |
168 |
| -command can do the job: |
169 |
| - |
170 |
| -```bash |
171 |
| -$ dd if=tizen-ivi-image.raw of=/dev/usb_stick |
172 |
| -``` |
173 |
| - |
174 |
| -At first glance, raw images do not look very appealing because they are large |
175 |
| -and it takes a lot of time to flash them. However, with bmap, raw images become |
176 |
| -a much more attractive type of image. We will demonstrate this, using Tizen IVI |
177 |
| -as an example. |
178 |
| - |
179 |
| -The Tizen IVI project uses raw images which take 3.7GiB in Tizen IVI 2.0 alpha. |
180 |
| -The images are created by the MIC tool. Here is a brief description of how MIC |
181 |
| -creates them: |
182 |
| - |
183 |
| -* create a 3.7GiB sparse file, which will become the Tizen IVI image in the end |
184 |
| -* partition the file using the `parted` tool |
185 |
| -* format the partitions using the `mkfs.ext4` tool |
186 |
| -* loop-back mount all the partitions |
187 |
| -* install all the required packages to the partitions: copy all the needed |
188 |
| - files and do all the tweaks |
189 |
| -* unmount all loop-back-mounted image partitions, the image is ready |
190 |
| -* generate the block map file for the image |
191 |
| -* compress the image using `bzip2`, turning it into a small file, around |
192 |
| - 300MiB |
193 |
| - |
194 |
| -The Tizen IVI raw images are initially sparse files. All the mapped blocks |
195 |
| -represent useful data and all the holes represent unused regions, which |
196 |
| -"contain" zeroes and do not have to be copied when flashing the image. Although |
197 |
| -information about holes is lost once the image gets compressed, the bmap file |
198 |
| -still has it and it can be used to reconstruct the uncompressed image or to |
199 |
| -flash the image quickly, by copying only the mapped regions. |
200 |
| - |
201 |
| -Raw images compress extremely well because the holes are essentially zeroes, |
202 |
| -which compress perfectly. This is why 3.7GiB Tizen IVI raw images, which |
203 |
| -contain about 1.1GiB of mapped blocks, take only 300MiB in a compressed form. |
204 |
| -And the important point is that you need to decompress them only while |
205 |
| -flashing. The `bmaptool` does this "on-the-fly". |
206 |
| - |
207 |
| -Therefore: |
208 |
| -* raw images are distributed in a compressed form, and they are almost as small |
209 |
| - as a tarball (that includes all the data the image would take) |
210 |
| -* the bmap file and the `bmaptool` make it possible to quickly flash the |
211 |
| - compressed raw image to the target block device |
212 |
| -* optionally, the `bmaptool` can reconstruct the original uncompressed sparse raw |
213 |
| - image file |
214 |
| - |
215 |
| -And, what is even more important, is that flashing raw images is extremely fast |
216 |
| -because you write directly to the block device, and write sequentially. |
217 |
| - |
218 |
| -Another great thing about raw images is that they may be 100% ready-to-go and |
219 |
| -all you need to do is to put the image on your device "as-is". You do not have |
220 |
| -to know the image format, which partitions and filesystems it contains, etc. |
221 |
| -This is simple and robust. |
222 |
| - |
223 |
| -### Usage scenarios |
224 |
| - |
225 |
| -Flashing or copying large images is the main `bmaptool` use case. The idea is |
226 |
| -that if you have a raw image file and its bmap, you can flash it to a device by |
227 |
| -writing only the mapped blocks and skipping the unmapped blocks. |
228 |
| - |
229 |
| -What this basically means is that with bmap it is not necessary to try to |
230 |
| -minimize the raw image size by making the partitions small, which would require |
231 |
| -resizing them. The image can contain huge multi-gigabyte partitions, just like |
232 |
| -the target device requires. The image will then be a huge sparse file, with |
233 |
| -little mapped data. And because unmapped areas "contain" zeroes, the huge image |
234 |
| -will compress extremely well, so the huge image will be very small in |
235 |
| -compressed form. It can then be distributed in compressed form, and flashed |
236 |
| -very quickly with `bmaptool` and the bmap file, because `bmaptool` will decompress |
237 |
| -the image on-the-fly and write only mapped areas. |
238 |
| - |
239 |
| -The additional benefit of using bmap for flashing is the checksum verification. |
240 |
| -Indeed, the `bmaptool create` command generates SHA256 checksums for all mapped |
241 |
| -block ranges, and the `bmaptool copy` command verifies the checksums while |
242 |
| -writing. Integrity of the bmap file itself is also protected by a SHA256 |
243 |
| -checksum and `bmaptool` verifies it before starting flashing. |
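The idea of writing only the mapped ranges and verifying each one against a SHA256 checksum can be sketched in memory as below. This is a simplified illustration under stated assumptions, not the code of `bmaptool copy`: the function names `make_ranges` and `copy_mapped` are hypothetical, ranges are given as inclusive `(first_block, last_block)` pairs, and plain byte buffers stand in for the image and the target device.

```python
import hashlib


def make_ranges(image, ranges, block_size):
    """Attach a SHA256 checksum to each inclusive (first, last) block range."""
    result = []
    for first, last in ranges:
        chunk = image[first * block_size:(last + 1) * block_size]
        result.append((first, last, hashlib.sha256(chunk).hexdigest()))
    return result


def copy_mapped(image, target, checksummed_ranges, block_size):
    """Copy only the mapped ranges into `target`, verifying each checksum first."""
    for first, last, expected in checksummed_ranges:
        start, end = first * block_size, (last + 1) * block_size
        chunk = image[start:end]
        if hashlib.sha256(chunk).hexdigest() != expected:
            raise ValueError(f"checksum mismatch in blocks {first}-{last}")
        target[start:end] = chunk  # unmapped blocks are never touched
    return target


# Usage: blocks 0 and 2-3 are mapped, block 1 is a hole and is skipped.
image = bytes(range(16))
ranges = make_ranges(image, [(0, 0), (2, 3)], block_size=4)
target = copy_mapped(image, bytearray(16), ranges, block_size=4)
```

In this sketch a corrupted image is detected before the damaged range is written, which mirrors the "verify while writing" behaviour described above.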
244 |
| - |
245 |
| -On top of this, the bmap file can be signed using OpenPGP (gpg) and bmaptool |
246 |
| -automatically verifies the signature if it is present. This allows for |
247 |
| -verifying the bmap file integrity and authoring. And since the bmap file |
248 |
| -contains SHA256 checksums for all the mapped image data, the bmap file |
249 |
| -signature verification should be enough to guarantee integrity and authoring of |
250 |
| -the image file. |
251 |
| - |
252 |
| -The second usage scenario is reconstructing sparse files. Generally speaking, if |
253 |
| -you had a sparse file but then expanded it, there is no way to reconstruct it. |
254 |
| -In some cases, something like |
255 |
| - |
256 |
| -```bash |
257 |
| -$ cp --sparse=always expanded.file reconstructed.file |
258 |
| -``` |
259 |
| - |
260 |
| -would be enough. However, a file reconstructed this way will not necessarily be |
261 |
| -the same as the original sparse file. The original sparse file could have |
262 |
| -contained mapped blocks filled with all zeroes (not holes), and, in the |
263 |
| -reconstructed file, these blocks will become holes. In some cases, this does |
264 |
| -not matter. For example, if you just want to save disk space. However, for raw |
265 |
| -images, flashing it does matter, because it is essential to write zero-filled |
266 |
| -blocks and not skip them. Indeed, if you do not write the zero-filled block to |
267 |
| -corresponding disk sectors which, presumably, contain garbage, you end up with |
268 |
| -garbage in those blocks. In other words, when we are talking about flashing raw |
269 |
| -images, the difference between zero-filled blocks and holes in the original |
270 |
| -image is essential because zero-filled blocks are the required blocks which are |
271 |
| -expected to contain zeroes, while holes are just unneeded blocks with no |
272 |
| -expectations regarding the contents. |
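The difference between zero-filled blocks and holes can be made concrete with a small in-memory sketch. It is an illustration only: the tiny block size, the `flash` and `nonzero_blocks` helpers, and the `0xff` "garbage" pattern are all assumptions chosen for readability, not anything taken from `bmaptool`.

```python
BLOCK = 4  # unrealistically small block size, for illustration only


def flash(image, target, mapped_blocks, block=BLOCK):
    """Write only the listed blocks of `image` onto `target` (a bytearray)."""
    for b in mapped_blocks:
        target[b * block:(b + 1) * block] = image[b * block:(b + 1) * block]
    return target


def nonzero_blocks(image, block=BLOCK):
    """Guess the mapped blocks by treating every all-zero block as a hole."""
    count = len(image) // block
    return [i for i in range(count) if any(image[i * block:(i + 1) * block])]


# Block 0 holds data, block 1 is a required zero-filled block, block 2 is a hole.
image = b"DATA" + b"\x00" * 4 + b"\x00" * 4
true_map = [0, 1]                    # what the original block map recorded
guessed_map = nonzero_blocks(image)  # what zero detection reconstructs: [0]

garbage = b"\xff" * 12               # stale contents of the target device
with_bmap = flash(image, bytearray(garbage), true_map)
with_guess = flash(image, bytearray(garbage), guessed_map)
```

With the recorded map, block 1 of the target is overwritten with zeroes. With the guessed map, block 1 keeps its stale `0xff` bytes, which is exactly the garbage problem described above.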
273 |
| - |
274 |
| -`bmaptool` may be helpful for reconstructing sparse files properly. Before the |
275 |
| -sparse file is expanded, you should generate its bmap (for example, by using |
276 |
| -the `bmaptool create` command). Then you may compress your file or, otherwise, |
277 |
| -expand it. Later on, you may reconstruct it using the `bmaptool copy` command. |
278 |
| - |
279 |
| -## Project structure |
280 |
| - |
281 |
| -```bash |
282 |
| ------------------------------------------------------------------------------------- |
283 |
| -| - bmaptool | A tool to create bmap and copy with bmap. Based | |
284 |
| -| | on the 'BmapCreate.py' and 'BmapCopy.py' modules. | |
285 |
| -| - setup.py | A script to turn the entire bmap-tools project | |
286 |
| -| | into a python egg. | |
287 |
| -| - setup.cfg | contains a piece of nose tests configuration | |
288 |
| -| - .coveragerc | lists files to include into test coverage report | |
289 |
| -| - TODO | Just a list of things to be done for the project. | |
290 |
| -| - make_a_release.sh | Most people may ignore this script. It is used by | |
291 |
| -| | maintainer when creating a new release. | |
292 |
| -| - tests/ | Contains the project unit-tests. | |
293 |
| -| | - test_api_base.py | Tests the base API modules: 'BmapCreate.py' and | |
294 |
| -| | | 'BmapCopy.py'. | |
295 |
| -| | - test_filemap.py | Tests the 'Filemap.py' module. | |
296 |
| -| | - test_compat.py | Tests that new BmapCopy implementations support old | |
297 |
| -| | | bmap formats, and old BmapCopy implementations | |
298 |
| -| | | support new compatible bmap formats. |
299 |
| -| | - test_bmap_helpers.py | Tests the 'BmapHelpers.py' module. | |
300 |
| -| | - helpers.py | Helper functions shared between the unit-tests. | |
301 |
| -| | - test-data/ | Data files for the unit-tests | |
302 |
| -| | - oldcodebase/ | Copies of old BmapCopy implementations for bmap | |
303 |
| -| | | format forward-compatibility verification. | |
304 |
| -| - bmaptools/ | The API modules which implement all the bmap | |
305 |
| -| | | functionality. | |
306 |
| -| | - BmapCreate.py | Creates a bmap for a given file. | |
307 |
| -| | - BmapCopy.py | Implements copying of an image using its bmap. | |
308 |
| -| | - Filemap.py | Allows for reading files' block map. | |
309 |
| -| | - BmapHelpers.py | Just helper functions used all over the project. | |
310 |
| -| | - TransRead.py | Provides a transparent way to read various kinds of |
311 |
| -| | | files (compressed, etc) | |
312 |
| -| - debian/* | Debian packaging for the project. | |
313 |
| -| - doc/* | Project documentation. | |
314 |
| -| - packaging/* | RPM packaging (Fedora & OpenSuse) for the project. | |
315 |
| -| - contrib/* | Various contributions that may be useful, but | |
316 |
| -| | project maintainers do not really test or maintain. | |
317 |
| ------------------------------------------------------------------------------------- |
318 |
| -``` |
319 |
| -
|
320 |
| -## How to run unit tests |
321 |
| -
|
322 |
| -Just install the `nose` python test framework and run the `nosetests` command in |
323 |
| -the project root directory. If you want to see tests coverage report, run |
324 |
| -`nosetests --with-coverage`. |
325 |
| -
|
326 |
| -## Known Issues |
327 |
| -
|
328 |
| -### ZFS File System |
329 |
| -
|
330 |
| -If running on the ZFS file system, the Linux ZFS kernel driver parameters |
331 |
| -configuration can cause the finding of mapped and unmapped areas to fail. |
332 |
| -This can be fixed temporarily by doing the following: |
333 |
| -
|
334 |
| -```bash |
335 |
| -$ echo 1 | sudo tee -a /sys/module/zfs/parameters/zfs_dmu_offset_next_sync |
336 |
| -``` |
337 |
| -
|
338 |
| -However, if a permanent solution is required then perform the following: |
339 |
| -
|
340 |
| -```bash |
341 |
| -$ echo "options zfs zfs_dmu_offset_next_sync=1" | sudo tee -a /etc/modprobe.d/zfs.conf |
342 |
| -``` |
343 |
| -
|
344 |
| -Depending upon your Linux distro, you may also need to do the following to |
345 |
| -ensure that the permanent change is updated in all your initramfs images: |
346 |
| -
|
347 |
| -```bash |
348 |
| -$ sudo update-initramfs -u -k all |
349 |
| -``` |
350 |
| -
|
351 |
| -To verify the temporary or permanent change has worked you can use the following |
352 |
| -which should return `1`: |
353 |
| -
|
354 |
| -```bash |
355 |
| -$ cat /sys/module/zfs/parameters/zfs_dmu_offset_next_sync |
356 |
| -``` |
357 |
| -
|
358 |
| -More details can be found [in the OpenZFS documentation](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html). |
359 |
| -
|
360 |
| -## Project and maintainer |
361 |
| -
|
362 |
| -The bmap-tools project implements bmap-related tools and API modules. The |
363 |
| -entire project is written in python and supports python 2.7 and python 3.x. |
364 |
| -
|
365 |
| -The project author is Artem Bityutskiy ([email protected]). Artem is looking |
366 |
| -for a new maintainer for the project. Anyone actively contributing may become a |
367 |
| -maintainer. Please, let Artem know if you volunteer to be one. |
368 |
| -
|
369 |
| -Project git repository is here: |
370 |
| -https://github.com/intel/bmap-tools.git |
371 |
| -
|
372 |
| -## Credits |
373 |
| -
|
374 |
| -* Ed Bartosh ([email protected]) for helping me with learning python |
375 |
| - (this is my first python project) and working with the Tizen IVI |
376 |
| - infrastructure. Ed also implemented the packaging. |
377 |
| -* Alexander Kanevskiy ([email protected]) and |
378 |
| - Kevin Wang ([email protected]) for helping with integrating this stuff |
379 |
| - to the Tizen IVI infrastructure. |
380 |
| -* Simon McVittie ([email protected]) for improving Debian |
381 |
| - packaging and fixing bmaptool. |
| 4 | +This project has moved to [https://github.com/yoctoproject/bmaptool](https://github.com/yoctoproject/bmaptool) |