This repository was archived by the owner on Oct 16, 2024. It is now read-only.

Commit 03f5dd0

twoerner authored and dedekind committed
README.md: update status
Signed-off-by: Trevor Woerner <[email protected]>
1 parent 7ca37b1 commit 03f5dd0

1 file changed

+3
-380
lines changed

README.md

Lines changed: 3 additions & 380 deletions
@@ -1,381 +1,4 @@
1-
# `bmap-tools`
1+
The code at this location is no longer maintained and will
2+
likely be removed in the future.
23

3-
> The better `dd` for embedded projects, based on block maps.
4-
5-
## Introduction
6-
7-
`bmaptool` is a generic tool for creating the block map (bmap) for a file and
8-
copying files using the block map. The idea is that large files, like raw
9-
system image files, can be copied or flashed a lot faster and more reliably
10-
with `bmaptool` than with traditional tools, like `dd` or `cp`.
11-
12-
`bmaptool` was originally created for the "Tizen IVI" project and it was used for
13-
flashing system images to USB sticks and other block devices. `bmaptool` can also
14-
be used for general image flashing purposes, for example, flashing Fedora Linux
15-
OS distribution images to USB sticks.
16-
17-
Originally Tizen IVI images had been flashed using the `dd` tool, but bmaptool
18-
brought a number of advantages.
19-
20-
* Faster. Depending on various factors, like write speed, image size, how full
21-
the image is, and so on, `bmaptool` was 5-7 times faster than `dd` in the Tizen
22-
IVI project.
23-
* Integrity. `bmaptool` verifies data integrity while flashing, which means that
24-
possible data corruptions will be noticed immediately.
25-
* Usability. `bmaptool` can read images directly from the remote server, so users
26-
do not have to download images and save them locally.
27-
* Protects user's data. Unlike `dd`, if you make a mistake and specify a wrong
28-
block device name, `bmaptool` is less likely to destroy your data because it has
29-
protection mechanisms which, for example, prevent `bmaptool` from writing to a
30-
mounted block device.
31-
32-
## Usage
33-
34-
`bmaptool` supports 2 subcommands:
35-
* `copy` - copy a file to another file using bmap or flash an image to a block
36-
device
37-
* `create` - create a bmap for a file
38-
39-
You can get a usage reference for `bmaptool` and all the supported commands using
40-
the `-h` or `--help` options:
41-
42-
```bash
43-
$ bmaptool -h # General bmaptool help
44-
$ bmaptool <cmd> -h # Help on the <cmd> sub-command
45-
```
46-
47-
You can also refer to the `bmaptool` manual page:
48-
```bash
49-
$ man bmaptool
50-
```
51-
52-
## Concept
53-
54-
This section provides general information about the block map (bmap) necessary
55-
for understanding how `bmaptool` works. The structure of the section is:
56-
57-
* "Sparse files" - the bmap ideas are based on sparse files, so it is important
58-
to understand what sparse files are.
59-
* "The block map" - explains what bmap is.
60-
* "Raw images" - the main usage scenario for `bmaptool` is flashing raw images,
61-
which this section discusses.
62-
* "Usage scenarios" - describes various possible bmap and `bmaptool` usage
63-
scenarios.
64-
65-
### Sparse files
66-
67-
One of the main roles of a filesystem, generally speaking, is to map blocks of
68-
file data to disk sectors. Different file-systems do this mapping differently,
69-
and filesystem performance largely depends on how well the filesystem can do
70-
the mapping. The filesystem block size is usually 4KiB, but may also be 8KiB or
71-
larger.
72-
73-
Obviously, to implement the mapping, the file-system has to maintain some kind
74-
of on-disk index. For any file on the file-system, and any offset within the
75-
file, the index allows you to find the corresponding disk sector, which stores
76-
the file's data. Whenever we write to a file, the filesystem looks up the index
77-
and writes to the corresponding disk sectors. Sometimes the filesystem has to
78-
allocate new disk sectors and update the index (such as when appending data to
79-
the file). The filesystem index is sometimes referred to as the "filesystem
80-
metadata".
81-
82-
What happens if a file area is not mapped to any disk sectors? Is this
83-
possible? The answer is yes. It is possible and these unmapped areas are often
84-
called "holes". And those files which have holes are often called "sparse
85-
files".
86-
87-
All reasonable file-systems like Linux ext[234], btrfs, XFS, or Solaris ZFS,
88-
and even Windows' NTFS, support sparse files. Old and less reasonable
89-
filesystems, like FAT, do not support holes.
90-
91-
Reading holes returns zeroes. Writing to a hole causes the filesystem to
92-
allocate disk sectors for the corresponding blocks. Here is how you can create
93-
a 4GiB file with all blocks unmapped, which means that the file consists of a
94-
huge 4GiB hole:
95-
96-
```bash
97-
$ truncate -s 4G image.raw
98-
$ stat image.raw
99-
File: image.raw
100-
Size: 4294967296 Blocks: 0 IO Block: 4096 regular file
101-
```
102-
103-
Notice that `image.raw` is a 4GiB file, which occupies 0 blocks on the disk!
104-
So, the entire file's contents are not mapped anywhere. Reading this file would
105-
result in reading 4GiB of zeroes. If you write to the middle of the image.raw
106-
file, you'll end up with 2 holes and a mapped area in the middle.
107-
108-
Therefore:
109-
* Sparse files are files with holes.
110-
* Sparse files help save disk space, because, roughly speaking, holes do not
111-
occupy disk space.
112-
* A hole is an unmapped area of a file, meaning that it is not mapped anywhere
113-
on the disk.
114-
* Reading data from a hole returns zeroes.
115-
* Writing data to a hole destroys it by forcing the filesystem to map
116-
corresponding file areas to disk sectors.
117-
* Filesystems usually operate with blocks, so sizes and offsets of holes are
118-
aligned to the block boundary.
119-
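As a rough illustration of the points above, here is a minimal Python sketch. It is not part of `bmaptool`, and the helper names `create_hole_file` and `sparse_report` are made up for this example. It creates a file that consists of a single hole and compares the file's apparent size with the space allocated for it. The sketch assumes a POSIX system where `st_blocks` is counted in 512-byte units, and a filesystem that supports sparse files.

```python
import os
import tempfile


def create_hole_file(path, size):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        # Extending a file with truncate() leaves the new area unmapped
        # on filesystems that support holes.
        f.truncate(size)


def sparse_report(path):
    """Return (apparent_size, allocated_bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks is expressed in 512-byte units (true on Linux).
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        image = os.path.join(tmpdir, "image.raw")
        create_hole_file(image, 4 * 1024 * 1024)
        apparent, allocated = sparse_report(image)
        print("apparent size:", apparent, "allocated:", allocated)
```

On a filesystem with sparse file support, the allocated size should be far smaller than the apparent size. On a filesystem without such support, the two values will be close to each other.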
120-
It is also useful to know that you should work with sparse files carefully. It
121-
is easy to accidentally expand a sparse file, that is, to map all holes to
122-
zero-filled disk areas. For example, `scp` always expands sparse files; the
123-
`tar` and `rsync` tools do the same, by default, unless you use the `--sparse`
124-
option. Compressing and then decompressing a sparse file usually expands it.
125-
126-
There are 2 ioctl's in Linux which allow you to find mapped and unmapped areas:
127-
`FIBMAP` and `FIEMAP`. The former is very old and is probably supported by all
128-
Linux systems, but it is rather limited and requires root privileges. The
129-
latter is a lot more advanced and does not require root privileges, but it is
130-
relatively new (added in Linux kernel version 2.6.28).
131-
132-
Recent versions of the Linux kernel (starting from 3.1) also support the
133-
`SEEK_HOLE` and `SEEK_DATA` values for the `whence` argument of the standard
134-
`lseek()` system call. They allow positioning to the next hole and the next
135-
mapped area of the file.
136-
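The `lseek()` approach can be sketched in Python, which exposes the same `whence` values as `os.SEEK_DATA` and `os.SEEK_HOLE`. This is only an illustration and not the way `bmaptool` itself is implemented. The function name `mapped_ranges` is hypothetical, and the sketch assumes a platform and filesystem where these `whence` values are available.

```python
import errno
import os


def mapped_ranges(path):
    """Yield (start, end) byte offsets of the areas reported as data."""
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Position to the next mapped area at or after `offset`.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as err:
                if err.errno == errno.ENXIO:
                    # Only a hole remains between `offset` and end of file.
                    break
                raise
            # Position to the next hole; the end of file counts as a hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield start, end
            offset = end
```

Calling `list(mapped_ranges("image.raw"))` returns the mapped areas as byte ranges. On a filesystem that does not track holes, the whole file is typically reported as a single mapped range.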
137-
Advanced Linux filesystems, in modern kernels, also allow "punching holes",
138-
meaning that it is possible to unmap any aligned area and turn it into a hole.
139-
This is implemented using the `FALLOC_FL_PUNCH_HOLE` `mode` of the
140-
`fallocate()` system call.
141-
142-
### The bmap
143-
144-
The bmap is an XML file, which contains a list of mapped areas, plus some
145-
additional information about the file it was created for, for example:
146-
* SHA256 checksum of the bmap file itself
147-
* SHA256 checksum of the mapped areas
148-
* the original file size
149-
* amount of mapped data
150-
151-
The bmap file is designed to be both easily machine-readable and
152-
human-readable. All the machine-readable information is provided by XML tags.
153-
The human-oriented information is in XML comments, which explain the meaning of
154-
XML tags and provide useful information like amount of mapped data in percent
155-
and in MiB or GiB.
156-
157-
So, the best way to understand bmap is just to read it. Here is an
158-
[example of a bmap file](tests/test-data/test.image.bmap.v2.0).
159-
160-
### Raw images
161-
162-
Raw images are the simplest type of system images which may be flashed to the
163-
target block device, block-by-block, without any further processing. Raw images
164-
just "mirror" the target block device: they usually start with the MBR sector.
165-
There is a partition table at the beginning of the image and one or more
166-
partitions containing filesystems, like ext4. Usually, no special tools are
167-
required to flash a raw image to the target block device. The standard `dd`
168-
command can do the job:
169-
170-
```bash
171-
$ dd if=tizen-ivi-image.raw of=/dev/usb_stick
172-
```
173-
174-
At first glance, raw images do not look very appealing because they are large
175-
and it takes a lot of time to flash them. However, with bmap, raw images become
176-
a much more attractive type of image. We will demonstrate this, using Tizen IVI
177-
as an example.
178-
179-
The Tizen IVI project uses raw images which take 3.7GiB in Tizen IVI 2.0 alpha.
180-
The images are created by the MIC tool. Here is a brief description of how MIC
181-
creates them:
182-
183-
* create a 3.7GiB sparse file, which will become the Tizen IVI image in the end
184-
* partition the file using the `parted` tool
185-
* format the partitions using the `mkfs.ext4` tool
186-
* loop-back mount all the partitions
187-
* install all the required packages to the partitions: copy all the needed
188-
files and do all the tweaks
189-
* unmount all loop-back-mounted image partitions; the image is now ready
190-
* generate the block map file for the image
191-
* compress the image using `bzip2`, turning it into a small file, around
192-
300MiB
193-
194-
The Tizen IVI raw images are initially sparse files. All the mapped blocks
195-
represent useful data and all the holes represent unused regions, which
196-
"contain" zeroes and do not have to be copied when flashing the image. Although
197-
information about holes is lost once the image gets compressed, the bmap file
198-
still has it and it can be used to reconstruct the uncompressed image or to
199-
flash the image quickly, by copying only the mapped regions.
200-
201-
Raw images compress extremely well because the holes are essentially zeroes,
202-
which compress perfectly. This is why 3.7GiB Tizen IVI raw images, which
203-
contain about 1.1GiB of mapped blocks, take only 300MiB in a compressed form.
204-
And the important point is that you need to decompress them only while
205-
flashing. The `bmaptool` does this "on-the-fly".
206-
207-
Therefore:
208-
* raw images are distributed in a compressed form, and they are almost as small
209-
as a tarball (that includes all the data the image would take)
210-
* the bmap file and the `bmaptool` make it possible to quickly flash the
211-
compressed raw image to the target block device
212-
* optionally, the `bmaptool` can reconstruct the original uncompressed sparse raw
213-
image file
214-
215-
And, even more importantly, flashing raw images is extremely fast
216-
because you write directly to the block device, and write sequentially.
217-
218-
Another great thing about raw images is that they may be 100% ready-to-go and
219-
all you need to do is to put the image on your device "as-is". You do not have
220-
to know the image format, which partitions and filesystems it contains, etc.
221-
This is simple and robust.
222-
223-
### Usage scenarios
224-
225-
Flashing or copying large images is the main `bmaptool` use case. The idea is
226-
that if you have a raw image file and its bmap, you can flash it to a device by
227-
writing only the mapped blocks and skipping the unmapped blocks.
228-
229-
What this basically means is that with bmap it is not necessary to try to
230-
minimize the raw image size by making the partitions small, which would require
231-
resizing them. The image can contain huge multi-gigabyte partitions, just like
232-
the target device requires. The image will then be a huge sparse file, with
233-
little mapped data. And because unmapped areas "contain" zeroes, the huge image
234-
will compress extremely well, so the huge image will be very small in
235-
compressed form. It can then be distributed in compressed form, and flashed
236-
very quickly with `bmaptool` and the bmap file, because `bmaptool` will decompress
237-
the image on-the-fly and write only mapped areas.
238-
239-
The additional benefit of using bmap for flashing is the checksum verification.
240-
Indeed, the `bmaptool create` command generates SHA256 checksums for all mapped
241-
block ranges, and the `bmaptool copy` command verifies the checksums while
242-
writing. Integrity of the bmap file itself is also protected by a SHA256
243-
checksum and `bmaptool` verifies it before starting flashing.
244-
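The idea of per-range checksum verification can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual `bmaptool copy` code: the function names `range_sha256` and `verify_range`, the inclusive block-range convention, and the default 4096-byte block size are all choices made for this example.

```python
import hashlib


def range_sha256(path, first_block, last_block, block_size=4096):
    """Return the SHA256 hex digest of blocks first_block..last_block (inclusive)."""
    digest = hashlib.sha256()
    remaining = (last_block - first_block + 1) * block_size
    with open(path, "rb") as f:
        f.seek(first_block * block_size)
        while remaining > 0:
            chunk = f.read(min(remaining, 1024 * 1024))
            if not chunk:
                # End of file reached; the last block of a file may be partial.
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def verify_range(path, first_block, last_block, expected_hex, block_size=4096):
    """Return True if the range's checksum matches the expected hex digest."""
    actual = range_sha256(path, first_block, last_block, block_size)
    return actual == expected_hex.lower()
```

A copy tool could call something like `verify_range()` for each mapped range as it writes the data, and abort as soon as one range does not match.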
245-
On top of this, the bmap file can be signed using OpenPGP (gpg) and bmaptool
246-
automatically verifies the signature if it is present. This allows for
247-
verifying the bmap file's integrity and authorship. And since the bmap file
248-
contains SHA256 checksums for all the mapped image data, the bmap file
249-
signature verification should be enough to guarantee integrity and authorship of
250-
the image file.
251-
252-
The second usage scenario is reconstructing sparse files. Generally speaking, if
253-
you had a sparse file but then expanded it, there is no way to reconstruct it.
254-
In some cases, something like
255-
256-
```bash
257-
$ cp --sparse=always expanded.file reconstructed.file
258-
```
259-
260-
would be enough. However, a file reconstructed this way will not necessarily be
261-
the same as the original sparse file. The original sparse file could have
262-
contained mapped blocks filled with all zeroes (not holes), and, in the
263-
reconstructed file, these blocks will become holes. In some cases, this does
264-
not matter, for example, if you just want to save disk space. However, for raw
265-
image flashing, it does matter, because it is essential to write zero-filled
266-
blocks and not skip them. Indeed, if you do not write the zero-filled block to
267-
corresponding disk sectors which, presumably, contain garbage, you end up with
268-
garbage in those blocks. In other words, when we are talking about flashing raw
269-
images, the difference between zero-filled blocks and holes in the original
270-
image is essential because zero-filled blocks are the required blocks which are
271-
expected to contain zeroes, while holes are just unneeded blocks with no
272-
expectations regarding the contents.
273-
274-
`bmaptool` may be helpful for reconstructing sparse files properly. Before the
275-
sparse file is expanded, you should generate its bmap (for example, by using
276-
the `bmaptool create` command). Then you may compress your file or, otherwise,
277-
expand it. Later on, you may reconstruct it using the `bmaptool copy` command.
278-
279-
## Project structure
280-
281-
```bash
282-
------------------------------------------------------------------------------------
283-
| - bmaptool | A tool to create bmap and copy with bmap. Based |
284-
| | on the 'BmapCreate.py' and 'BmapCopy.py' modules. |
285-
| - setup.py | A script to turn the entire bmap-tools project |
286-
| | into a python egg. |
287-
| - setup.cfg | contains a piece of nose tests configuration |
288-
| - .coveragerc | lists files to include into test coverage report |
289-
| - TODO | Just a list of things to be done for the project. |
290-
| - make_a_release.sh | Most people may ignore this script. It is used by |
291-
| | maintainer when creating a new release. |
292-
| - tests/ | Contains the project unit-tests. |
293-
| | - test_api_base.py | Tests the base API modules: 'BmapCreate.py' and |
294-
| | | 'BmapCopy.py'. |
295-
| | - test_filemap.py | Tests the 'Filemap.py' module. |
296-
| | - test_compat.py | Tests that new BmapCopy implementations support old |
297-
| | | bmap formats, and old BmapCopy implementations |
298-
| | | support new compatible bmap formats. |
299-
| | - test_bmap_helpers.py | Tests the 'BmapHelpers.py' module. |
300-
| | - helpers.py | Helper functions shared between the unit-tests. |
301-
| | - test-data/ | Data files for the unit-tests |
302-
| | - oldcodebase/ | Copies of old BmapCopy implementations for bmap |
303-
| | | format forward-compatibility verification. |
304-
| - bmaptools/ | The API modules which implement all the bmap |
305-
| | | functionality. |
306-
| | - BmapCreate.py | Creates a bmap for a given file. |
307-
| | - BmapCopy.py | Implements copying of an image using its bmap. |
308-
| | - Filemap.py | Allows for reading files' block map. |
309-
| | - BmapHelpers.py | Just helper functions used all over the project. |
310-
| | - TransRead.py | Provides a transparent way to read various kinds of |
311-
| | | files (compressed, etc) |
312-
| - debian/* | Debian packaging for the project. |
313-
| - doc/* | Project documentation. |
314-
| - packaging/* | RPM packaging (Fedora & OpenSuse) for the project. |
315-
| - contrib/* | Various contributions that may be useful, but |
316-
| | project maintainers do not really test or maintain. |
317-
------------------------------------------------------------------------------------
318-
```
319-
320-
## How to run unit tests
321-
322-
Just install the `nose` python test framework and run the `nosetests` command in
323-
the project root directory. If you want to see tests coverage report, run
324-
`nosetests --with-coverage`.
325-
326-
## Known Issues
327-
328-
### ZFS File System
329-
330-
If running on the ZFS file system, the Linux ZFS kernel driver parameters
331-
configuration can cause the finding of mapped and unmapped areas to fail.
332-
This can be fixed temporarily by doing the following:
333-
334-
```bash
335-
$ echo 1 | sudo tee -a /sys/module/zfs/parameters/zfs_dmu_offset_next_sync
336-
```
337-
338-
However, if a permanent solution is required then perform the following:
339-
340-
```bash
341-
$ echo "options zfs zfs_dmu_offset_next_sync=1" | sudo tee -a /etc/modprobe.d/zfs.conf
342-
```
343-
344-
Depending upon your Linux distro, you may also need to do the following to
345-
ensure that the permanent change is updated in all your initramfs images:
346-
347-
```bash
348-
$ sudo update-initramfs -u -k all
349-
```
350-
351-
To verify the temporary or permanent change has worked you can use the following
352-
which should return `1`:
353-
354-
```bash
355-
$ cat /sys/module/zfs/parameters/zfs_dmu_offset_next_sync
356-
```
357-
358-
More details can be found [in the OpenZFS documentation](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html).
359-
360-
## Project and maintainer
361-
362-
The bmap-tools project implements bmap-related tools and API modules. The
363-
entire project is written in python and supports python 2.7 and python 3.x.
364-
365-
The project author is Artem Bityutskiy ([email protected]). Artem is looking
366-
for a new maintainer for the project. Anyone actively contributing may become a
367-
maintainer. Please, let Artem know if you volunteer to be one.
368-
369-
Project git repository is here:
370-
https://github.com/intel/bmap-tools.git
371-
372-
## Credits
373-
374-
* Ed Bartosh ([email protected]) for helping me with learning python
375-
(this is my first python project) and working with the Tizen IVI
376-
infrastructure. Ed also implemented the packaging.
377-
* Alexander Kanevskiy ([email protected]) and
378-
Kevin Wang ([email protected]) for helping with integrating this stuff
379-
to the Tizen IVI infrastructure.
380-
* Simon McVittie ([email protected]) for improving Debian
381-
packaging and fixing bmaptool.
4+
This project has moved to [https://github.com/yoctoproject/bmaptool](https://github.com/yoctoproject/bmaptool)
