Commit 5145776

AArch64: Fix AArch64 disassembler mapping symbol search
My previous patch for AArch64 was not enough to catch all the cases where disassembling an out-of-order section could go wrong. It missed the case where DATA sections could be incorrectly disassembled as TEXT.

Out of order here refers to an object file whose sections are not listed in monotonically increasing VMA order.

The ELF ABI for AArch64 [1] specifies the following for mapping symbols:

  1) A text section must always have a corresponding mapping symbol at its start.
  2) Data sections do not require any mapping symbols.
  3) The range of a mapping symbol extends from the address it starts on up to the next mapping symbol (exclusive) or section end (inclusive).

However, there is no defined order between a symbol and its corresponding mapping symbol in the symbol table. This means that while in general we look up for a corresponding mapping symbol, we have to make at least one check of the symbol below the address being disassembled.

When disassembling different PCs within the same section, the search for the mapping symbol can be cached somewhat. We know that the mapping symbol corresponding to the current PC is either the previous one used, or one at the same address as the current PC.

However, this optimization and the mapping symbol search must stop as soon as we reach the end or start of the section. Furthermore, if we're only disassembling a part of a section, the search is allowed to search further than the current chunk, but is not allowed to search past it (the mapping symbol, if there, must be at the same address, so in practice we usually stop at PC+4).

Lastly, since only data sections don't require a mapping symbol, the default mapping type should be DATA and not INSN as previously defined. However, if the binary has had all its symbols stripped then this isn't very useful. To fix this we determine the default based on the section flags. This will allow the disassembler to be more useful on stripped binaries. If there is no section then we assume you are disassembling INSN.

[1] https://developer.arm.com/docs/ihi0056/latest/elf-for-the-arm-64-bit-architecture-aarch64-abi-2018q4#aaelf64-section4-5-4

binutils/ChangeLog:

  * testsuite/binutils-all/aarch64/in-order.d: New test.
  * testsuite/binutils-all/aarch64/out-of-order.d: Disassemble data as well.

opcodes/ChangeLog:

  * aarch64-dis.c (print_insn_aarch64): Update the mapping symbol search order.
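
The default-type rule from the last paragraph of the commit message can be sketched on its own, roughly as below. This is a minimal illustration and not the patched code: `sketch_section`, `SKETCH_SEC_CODE` and `default_map_type` are hypothetical stand-ins for BFD's section structure, its `SEC_CODE` flag and the inline logic in `print_insn_aarch64`.

```c
#include <stddef.h>

/* Mirrors the two mapping types named in the commit message. */
enum map_type { MAP_INSN, MAP_DATA };

/* Assumed flag value, for illustration only; the real code tests SEC_CODE. */
#define SKETCH_SEC_CODE 0x1u

/* Hypothetical stand-in for the section descriptor. */
struct sketch_section
{
  unsigned int flags;
};

/* Pick the mapping type to fall back on when no mapping symbol is found. */
static enum map_type
default_map_type (const struct sketch_section *section)
{
  /* No section at all: assume raw instruction bytes. */
  if (section == NULL)
    return MAP_INSN;

  /* Code sections default to INSN, everything else to DATA. */
  return (section->flags & SKETCH_SEC_CODE) ? MAP_INSN : MAP_DATA;
}
```

With this fallback a stripped text section still disassembles as instructions, while a data section without mapping symbols is printed as `.word` values.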
1 parent 53b2f36 commit 5145776

5 files changed

+93
-7
lines changed

binutils/ChangeLog

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+2019-03-25 Tamar Christina <[email protected]>
+
+* testsuite/binutils-all/aarch64/in-order.d: New test.
+* testsuite/binutils-all/aarch64/out-of-order.d: Disassemble data as
+well.
+
 2019-03-25 Tamar Christina <[email protected]>
 
 * objdump.c (disassemble_bytes): Pass stop_offset.

binutils/testsuite/binutils-all/aarch64/in-order.d

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+#PROG: objcopy
+#source: out-of-order.s
+#ld: -e v1 -Ttext-segment=0x400000
+#objdump: -d
+#name: Check if disassembler can handle sections in default order
+
+.*: +file format .*aarch64.*
+
+Disassembly of section \.func1:
+
+0000000000400000 <v1>:
+400000: 8b010000 add x0, x0, x1
+400004: 00000000 \.word 0x00000000
+
+Disassembly of section .func2:
+
+0000000000400008 <\.func2>:
+400008: 8b010000 add x0, x0, x1
+
+Disassembly of section \.func3:
+
+000000000040000c <\.func3>:
+40000c: 8b010000 add x0, x0, x1
+400010: 8b010000 add x0, x0, x1
+400014: 8b010000 add x0, x0, x1
+400018: 8b010000 add x0, x0, x1
+40001c: 8b010000 add x0, x0, x1
+400020: 00000000 \.word 0x00000000

binutils/testsuite/binutils-all/aarch64/out-of-order.d

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,20 @@
11
#PROG: objcopy
22
#ld: -T out-of-order.T
3-
#objdump: -d
3+
#objdump: -D
44
#name: Check if disassembler can handle sections in different order than header
55

66
.*: +file format .*aarch64.*
77

8+
Disassembly of section \.global:
9+
10+
00000000ffe00000 <\.global>:
11+
ffe00000: 00000001 \.word 0x00000001
12+
ffe00004: 00000000 \.word 0x00000000
13+
ffe00008: 00000001 \.word 0x00000001
14+
ffe0000c: 00000000 \.word 0x00000000
15+
ffe00010: 00000001 \.word 0x00000001
16+
ffe00014: 00000000 \.word 0x00000000
17+
818
Disassembly of section \.func2:
919

1020
0000000004018280 <\.func2>:
@@ -25,3 +35,8 @@ Disassembly of section \.func3:
2535
401500c: 8b010000 add x0, x0, x1
2636
4015010: 8b010000 add x0, x0, x1
2737
4015014: 00000000 \.word 0x00000000
38+
39+
Disassembly of section \.rodata:
40+
41+
0000000004015018 <\.rodata>:
42+
4015018: 00000004 \.word 0x00000004

opcodes/ChangeLog

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+2019-03-25 Tamar Christina <[email protected]>
+
+* aarch64-dis.c (print_insn_aarch64): Update the mapping symbol search
+order.
+
 2019-03-25 Tamar Christina <[email protected]>
 
 * aarch64-dis.c (last_stop_offset): New.

opcodes/aarch64-dis.c

Lines changed: 38 additions & 6 deletions
@@ -3318,14 +3318,26 @@ print_insn_aarch64 (bfd_vma pc,
   /* Aarch64 instructions are always little-endian */
   info->endian_code = BFD_ENDIAN_LITTLE;
 
+  /* Default to DATA. A text section is required by the ABI to contain an
+     INSN mapping symbol at the start. A data section has no such
+     requirement, hence if no mapping symbol is found the section must
+     contain only data. This however isn't very useful if the user has
+     fully stripped the binaries. If this is the case use the section
+     attributes to determine the default. If we have no section default to
+     INSN as well, as we may be disassembling some raw bytes on a baremetal
+     HEX file or similar. */
+  enum map_type type = MAP_DATA;
+  if ((info->section && info->section->flags & SEC_CODE) || !info->section)
+    type = MAP_INSN;
+
   /* First check the full symtab for a mapping symbol, even if there
      are no usable non-mapping symbols for this address. */
   if (info->symtab_size != 0
       && bfd_asymbol_flavour (*info->symtab) == bfd_target_elf_flavour)
     {
-      enum map_type type = MAP_INSN;
       int last_sym = -1;
-      bfd_vma addr;
+      bfd_vma addr, section_vma = 0;
+      bfd_boolean can_use_search_opt_p;
       int n;
 
       if (pc <= last_mapping_addr)
@@ -3334,13 +3346,20 @@ print_insn_aarch64 (bfd_vma pc,
       /* Start scanning at the start of the function, or wherever
          we finished last time. */
       n = info->symtab_pos + 1;
+
       /* If the last stop offset is different from the current one it means we
          are disassembling a different glob of bytes. As such the optimization
          would not be safe and we should start over. */
-      if (n < last_mapping_sym && info->stop_offset == last_stop_offset)
+      can_use_search_opt_p = last_mapping_sym >= 0
+                             && info->stop_offset == last_stop_offset;
+
+      if (n >= last_mapping_sym && can_use_search_opt_p)
         n = last_mapping_sym;
 
-      /* Scan up to the location being disassembled. */
+      /* Look down while we haven't passed the location being disassembled.
+         The reason for this is that there's no defined order between a symbol
+         and an mapping symbol that may be at the same address. We may have to
+         look at least one position ahead. */
       for (; n < info->symtab_size; n++)
         {
           addr = bfd_asymbol_value (info->symtab[n]);
@@ -3356,13 +3375,24 @@ print_insn_aarch64 (bfd_vma pc,
       if (!found)
         {
           n = info->symtab_pos;
-          if (n < last_mapping_sym)
+          if (n >= last_mapping_sym && can_use_search_opt_p)
             n = last_mapping_sym;
 
           /* No mapping symbol found at this address. Look backwards
-             for a preceeding one. */
+             for a preceeding one, but don't go pass the section start
+             otherwise a data section with no mapping symbol can pick up
+             a text mapping symbol of a preceeding section. The documentation
+             says section can be NULL, in which case we will seek up all the
+             way to the top. */
+          if (info->section)
+            section_vma = info->section->vma;
+
           for (; n >= 0; n--)
             {
+              addr = bfd_asymbol_value (info->symtab[n]);
+              if (addr < section_vma)
+                break;
+
               if (get_sym_code_type (info, n, &type))
                 {
                   last_sym = n;
@@ -3400,6 +3430,8 @@ print_insn_aarch64 (bfd_vma pc,
             size = (pc & 1) ? 1 : 2;
         }
     }
+  else
+    last_type = type;
 
   if (last_type == MAP_DATA)
     {
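
The bounded backward search added in the third hunk can be illustrated in isolation, roughly as follows. This is a simplified sketch under stated assumptions, not the BFD code: `sketch_sym`, `sketch_sym_type` and `find_mapping_backwards` are hypothetical names, the symbol table is assumed to be sorted by address, and mapping symbols are recognised only by the `$x` / `$d` names (optionally followed by a `.suffix`) that the AArch64 ELF ABI defines.

```c
#include <stddef.h>

enum map_type { MAP_INSN, MAP_DATA };

/* Hypothetical simplified symbol: just an address and a name. */
struct sketch_sym
{
  unsigned long addr;
  const char *name;
};

/* Return 1 and set *TYPE if NAME is a mapping symbol ($x or $d,
   optionally followed by ".suffix"); otherwise leave *TYPE untouched.  */
static int
sketch_sym_type (const char *name, enum map_type *type)
{
  if (name[0] != '$' || (name[1] != 'x' && name[1] != 'd')
      || (name[2] != '\0' && name[2] != '.'))
    return 0;

  *type = name[1] == 'x' ? MAP_INSN : MAP_DATA;
  return 1;
}

/* Walk backwards from index START and stop as soon as a symbol lies
   below SECTION_VMA, so a mapping symbol from a preceding section is
   never picked up.  Returns the index of the mapping symbol found,
   or -1 if there is none within the section.  */
static int
find_mapping_backwards (const struct sketch_sym *syms, int start,
                        unsigned long section_vma, enum map_type *type)
{
  for (int n = start; n >= 0; n--)
    {
      if (syms[n].addr < section_vma)
        break;

      if (sketch_sym_type (syms[n].name, type))
        return n;
    }

  return -1;
}
```

When the section has no mapping symbol the caller's default type is left unchanged, which is what lets a data section placed after a text section stay DATA instead of inheriting the earlier `$x`.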
