Changelog for v3.3 release:
- fixed regression regarding deduplication of consecutive data lines
(added in v3.2) messing up disassembly split into separate files
(i.e. reconstructed source files) (fixes issue #16)
- prevent very long lines when deduplicating consecutive data lines by
truncating hex output/display + appending '..'
- added support for regions with multiple access sizes when generating/
outputting possible hints for code objects
- extended pretty printer (modules/module_pretty_print.py) to produce
hex dumps of bytes and other bytes-like objects
- extended file writer (module_miscellaneous.py) to create folders for
destination path if missing
- fixed regex strings in 're.match' and 're.search' calls producing
'SyntaxWarning's with Python 3.12+ due to invalid escape sequences
(https://stackoverflow.com/a/52335971/1976617)
- applied various minor changes (console output, code formatting,
comments, etc.)
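
The regex fix listed above can be illustrated with a minimal sketch. The pattern and helper names below are hypothetical and not taken from wcdatool's source. The sketch only shows why an ordinary string such as `"\s+"` triggers a warning at compile time (a `SyntaxWarning` on Python 3.12+), while a raw string or doubled backslashes do not:

```python
import re
import warnings

# Hypothetical pattern for an objdump-style address prefix; not taken from wcdatool.
ADDRESS_RAW = r"^\s*([0-9a-f]+):"        # raw string: backslashes reach 're' untouched
ADDRESS_ESCAPED = "^\\s*([0-9a-f]+):"    # equivalent, with each backslash doubled


def compiles_without_warning(source: str) -> bool:
    """Compile a Python expression and report whether no warning was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return not caught


def parse_address(line: str):
    """Return the hexadecimal address at the start of a line, or None."""
    match = re.match(ADDRESS_RAW, line)
    return match.group(1) if match else None
```

Both spellings yield the same pattern string; the raw form is usually preferred for regular expressions because it stays readable.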
File: CHANGELOG.md (14 additions, 0 deletions)
@@ -1,3 +1,13 @@
1
+
## Changelog for v3.3 release
2
+
3
+
- fixed regression regarding *deduplication of consecutive data lines* (added in v3.2) messing up disassembly split into separate files (i.e. reconstructed source files) (fixes issue #16)
4
+
- prevent very long lines when deduplicating consecutive data lines by truncating hex output/display + appending '..'
5
+
- added support for regions with multiple access sizes when generating/outputting possible hints for code objects
6
+
- extended pretty printer (`modules/module_pretty_print.py`) to produce hex dumps of bytes and other bytes-like objects
7
+
- extended file writer (`module_miscellaneous.py`) to create folders for destination path if missing
8
+
- fixed regex strings in `re.match` and `re.search` calls producing `SyntaxWarning`s with Python 3.12+ due to invalid escape sequences (https://stackoverflow.com/a/52335971/1976617)
9
+
- applied various minor changes (console output, code formatting, comments, etc.)
10
+
1
11
## Changelog for v3.2 release
2
12
3
13
- added algorithm to *deduplicate consecutive data lines* in formatted disassembly (*greatly* reduces disassembly size for data objects)
@@ -38,3 +48,7 @@
38
48
- initial release
39
49
- monolithic (everything in one single source file)
40
50
- originally named 'wcdctool' (*Watcom Decompilation Tool*)
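
The v3.3 entry about truncating hex output and appending '..' could look roughly like the sketch below. This is an assumption-laden illustration, not wcdatool's actual code: the function name, the `max_bytes` limit, and the space before the '..' marker are all hypothetical.

```python
def format_hex(data: bytes, max_bytes: int = 16) -> str:
    """Render data as space-separated hex, truncated with a '..' marker.

    Hypothetical helper: the limit and exact marker placement are assumptions.
    """
    shown = data[:max_bytes].hex(" ")  # bytes.hex(sep) requires Python 3.8+
    if len(data) > max_bytes:
        return shown + " .."
    return shown
```

For instance, `format_hex(bytes(range(10)), max_bytes=4)` keeps only the first four bytes and marks the remainder as omitted.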
File: README.md (11 additions, 10 deletions)
@@ -24,13 +24,13 @@ Thus, I began writing my own tool. What originally started out as *mkdecomptool*
24
24
25
25
Note that while wcdatool performs the tasks it is designed for quite well, it is not intended to compete with or replace high-end tools like *IDA Pro* or *Ghidra*.
26
26
27
-
## Current state and future development
27
+
## Current state / future development
28
28
29
-
Wcdatool is *work in progress*. You can tell from looking at the source code - there's tons of TODO, TESTING, FIXME, etc. flying around. Also, it is relatively slow as performance has not been the main focus ([Cython](https://cython.org/) might be utilized in the future to increase performance).
29
+
Wcdatool works quite well in its current state - you'll get a well-readable, reasonably structured disassembly output (*objdump* format, *Intel* syntax). Check out issues [#9](https://github.com/fonic/wcdatool/issues/9) and [#11](https://github.com/fonic/wcdatool/issues/11) for games other than *Mortal Kombat* that wcdatool worked nicely for thus far. **Please note that wcdatool works best when used on executables that contain debug symbols.** If you come across other *unstripped* *Watcom*-based DOS applications that may be used for further testing and development, please let me know.
30
30
31
-
Nevertheless, it works quite well in its current state - you'll get a well-readable, reasonably structured disassembly output (*objdump* format, *Intel* syntax). Check out issues [#9](https://github.com/fonic/wcdatool/issues/9) and [#11](https://github.com/fonic/wcdatool/issues/11) for games other than *Mortal Kombat* that wcdatool worked nicely for thus far. Please note that wcdatool works best when used on executables that contain debug symbols. If you come across other *unstripped* *Watcom*-based DOS applications that may be used for further testing and development, please let me know.
31
+
**However, the current approach has reached its EOL.** There is no point in advancing it any further (aside from fixing bugs), as there are limits inherent to the fundamental design that cannot be overcome easily. Thus, the next major goal is to cleanly *rewrite the disassembler module* and transition from *static code disassembly* to *execution flow tracing* (e.g. the *Mortal Kombat 2* executable contains code within its data object, which is neither discovered nor analyzed with the current approach). Also, instead of treating objects separately, a *linear unified address space* containing all object data shall be implemented. This will make it possible to *apply fixups on a binary level*, which should simplify dealing with references that cross object boundaries and with placeholders (stubs) that are replaced via fixups at run time.
32
32
33
-
The *next major goal* is to cleanly rewrite the disassembler module and transition from *static code disassembly* to *execution flow tracing* (e.g. *Mortal Kombat 2* executable contains code within its data object, which is neither discovered nor processed with the current approach).
33
+
Last but not least, wcdatool in its current state is relatively slow, as performance has not been the main focus during development. [Cython](https://cython.org/) might be utilized in the future to increase performance.
34
34
35
35
## Output sample
36
36
@@ -97,17 +97,18 @@ There are multiple ways to use *wcdatool*, but the following instructions should
97
97
98
98
7. Have a look at the results in `wcdatool/Output`:
99
99
- File `<name-of-executable>_zzz_log.txt` contains *log messages* (same as console output, but without coloring/formatting)
- Folder `<name-of-executable>_modules` contains *formatted disassembly split into separate files* (this attempts to reconstruct the application's original source files if corresponding debug information is available)
- Files `<name-of-executable>_disasm_object_x_disassembly_formatted.asm` contain *formatted disassembly* (this is arguably the most interesting/useful output)
102
+
- Files `<name-of-executable>_disasm_object_x_disassembly_formatted_deduplicated.asm` contain *formatted deduplicated disassembly* (same as above, but with data portions being compressed for increased readability where applicable)
103
+
- Folder `<name-of-executable>_modules` contains *formatted disassembly split into separate files* (same as above, additionally attempts to reconstruct an application's original source files if corresponding debug information is available)
103
104
104
105
**NOTE:** if you are new to assembler/assembly language, check out this [x86 Assembly Guide](https://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
105
106
106
107
8. Refine the output by analyzing the disassembly, updating the object hints and re-running *wcdatool* (i.e. loop steps 5-8):
107
-
- Identify and add hints for regions in code objects that are actually data (look for `; misplaced item` comments, `(bad)` assembly instructions and labels with `; access size` comments)
108
+
- Identify and add hints for regions in code objects that are actually data (look for `; misplaced item` comments, `(bad)` assembly instructions and labels with trailing `; access size` comments)
108
109
- Identify and add hints for regions in data objects that are actually code (look for `call`/`jmp` instructions in code objects with fixup targets pointing to data objects)
109
110
- Check section `Possible object hints` of *wcdatool*'s console output / log file for suggestions (not guaranteed to be correct, but likely a good starting point)
110
-
- *The ultimate goal here is to eliminate all (or at least most) warnings issued by wcdatool*. Each warning points out a region of the disassembly that does currently seem flawed and therefore requires further attention/investigation. Note that there is a *cascading effect* at work (e.g. a region of data that is falsely interpreted as code may produce bogus branches, leading to further warnings), thus warnings should be tackled one (or few) at a time from first to last with *wcdatool* re-runs in between
111
+
- *The ultimate goal is to eliminate all (or at least most) warnings issued by wcdatool*. Each warning points out a region of the disassembly that does currently seem flawed and therefore requires further attention/investigation. Note that there is a *cascading effect* at work (e.g. a region of data that is falsely interpreted as code may produce bogus branches, leading to further warnings), thus warnings should be tackled one (or few) at a time from first to last with *wcdatool* re-runs in between
111
112
112
113
**NOTE:** this is by far the most time-consuming part, but *crucial* to achieve good and clean results (!)
113
114
@@ -153,4 +154,4 @@ If you want to get in touch with me, give feedback, ask questions or simply need