Can't mark data as code #164

FungusN0 · 2025-02-16T02:14:47Z

I'm disassembling some code that has a jump table at the start. SourceGen only recognizes the first JMP, and the routine it points to, as valid code. Everything after that is marked as data, and I can't find a function such as "analyze code" or "mark as code".

fadden · 2025-02-16T04:56:32Z

Select the next piece that starts with $4c, right-click, and select "tag address as code start point" (or hit Ctrl+H Ctrl+C). Repeat.

The "repeat" part is sub-optimal. This was actually requested a while back (#22) as a fully-automated recognition feature. It might be better as a special behavior of the tag feature, where you select the block of JMP instructions and a single "tag" operation properly handles the whole block.
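
To make the block-tagging idea concrete, here is a minimal Python sketch of what a single "tag" operation over a selected block of JMP instructions would need to compute. This is not SourceGen code; the function name, the base address, and the sample bytes are all made up for illustration, and it assumes the selection is a run of back-to-back 3-byte `JMP abs` instructions ($4C).

```python
JMP_ABS = 0x4C  # 6502 opcode for JMP absolute (3 bytes: opcode, low, high)

def jump_table_entries(data, base_addr, start, count):
    """Return (entry_address, target_address) for `count` consecutive JMPs.

    Hypothetical helper: each entry address is a spot that would be tagged
    as a code start point; each target is where tracing would continue.
    """
    entries = []
    for i in range(count):
        off = start + 3 * i
        if data[off] != JMP_ABS:
            raise ValueError(f"no JMP at offset {off}")
        target = data[off + 1] | (data[off + 2] << 8)  # little-endian operand
        entries.append((base_addr + off, target))
    return entries

# Made-up example: three JMPs at $1000 followed by three one-byte routines.
table = bytes([0x4C, 0x09, 0x10,
               0x4C, 0x0A, 0x10,
               0x4C, 0x0B, 0x10,
               0x60, 0x60, 0x60])
entries = jump_table_entries(table, 0x1000, 0, 3)
for addr, target in entries:
    print(f"${addr:04X}: JMP ${target:04X}")
```

The point is that every third byte in the selection becomes its own entry point, which is exactly the manual "repeat" step described above.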

FungusN0 · 2025-02-16T06:34:30Z

I agree that is suboptimal. Should be able to select blocks of anything and mark them as code, text, data, vectors, inline, and other common things in 6502. That would greatly speed up reversing things.

Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

BacchusFLT · 2025-02-16T10:33:56Z

Scott, Glad I lured you over to 6502 bench :) You can do tables as addresses with a -1, so that function is already there.
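
For readers unfamiliar with the "-1" convention, the sketch below assumes it refers to dispatch tables meant to be pushed on the stack and reached via RTS, where each 16-bit entry holds the target address minus one. The helper name and the sample bytes are hypothetical; this only shows the arithmetic, not how SourceGen formats such a table.

```python
def decode_rts_table(data, count):
    """Decode `count` little-endian words stored as (target - 1)."""
    targets = []
    for i in range(count):
        word = data[2 * i] | (data[2 * i + 1] << 8)
        targets.append((word + 1) & 0xFFFF)  # RTS adds 1 to the pulled address
    return targets

# Made-up table: entries $0FFF and $102F dispatch to $1000 and $1030.
rts_table = bytes([0xFF, 0x0F, 0x2F, 0x10])
targets = decode_rts_table(rts_table, 2)
print([f"${t:04X}" for t in targets])
```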

fadden · 2025-02-16T15:31:19Z

> I agree that is suboptimal. Should be able to select blocks of anything and mark them as code, text, data, vectors, inline, and other common things in 6502. That would greatly speed up reversing things.

Marking ranges as various types of data (e.g. 16-bit address vectors), and as inline data, is supported. Code is more tricky than the others because you don't want to mark the entire block as code, but rather mark all code entry points. This is because SourceGen uses code tracing to find code areas, rather than simply "color coding" ranges. Also, sometimes you actually do need to mark multiple bytes of a single instruction as entry points, e.g. when a BIT instruction is used to wrap an immediate load.
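
To illustrate the BIT case, here is a minimal Python sketch of a linear trace over a made-up 8-byte routine. The opcode table covers only the four instructions used, and `trace` is a hypothetical helper, not SourceGen's tracer. Entering at offset 0 decodes the `$2C` byte as `BIT $01A9`, while entering at offset 3 decodes the same bytes as `LDA #$01`, so both offsets are legitimate entry points.

```python
# Tiny subset of the 6502 opcode map: opcode -> (format string, length in bytes)
OPCODES = {
    0xA9: ("LDA #${0:02X}", 2),
    0x2C: ("BIT ${0:04X}", 3),
    0x85: ("STA ${0:02X}", 2),
    0x60: ("RTS", 1),
}

def trace(code, entry):
    """Decode instructions from `entry` until RTS; returns (offset, text) pairs."""
    out = []
    pc = entry
    while pc < len(code):
        fmt, length = OPCODES[code[pc]]
        operand = int.from_bytes(code[pc + 1:pc + length], "little")
        out.append((pc, fmt.format(operand)))
        if code[pc] == 0x60:  # RTS ends this path
            break
        pc += length
    return out

# LDA #$00 / .byte $2C / LDA #$01 / STA $10 / RTS
code = bytes([0xA9, 0x00, 0x2C, 0xA9, 0x01, 0x85, 0x10, 0x60])
from_start = trace(code, 0)   # falls through the BIT, skipping the second LDA
from_middle = trace(code, 3)  # a branch landing inside the BIT operand
print(from_start)
print(from_middle)
```

Marking the whole range as "code" cannot express this overlap; marking entry points can.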

> Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.
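
As a rough illustration of the simple case, the Python sketch below handles a JSR followed by a null-terminated ASCII string. It is not the extension script API (those scripts are a separate mechanism); the function name, the subroutine address, and the sample bytes are assumptions. JSR pushes the address of its own last byte, so the data begins at return address + 1, which is the byte right after the 3-byte instruction.

```python
JSR = 0x20  # 6502 opcode for JSR absolute

def inline_string_after_jsr(code, jsr_offset):
    """Return (text, resume_offset) for a null-terminated string after a JSR."""
    if code[jsr_offset] != JSR:
        raise ValueError("not a JSR")
    start = jsr_offset + 3            # return address + 1
    end = code.index(0x00, start)     # terminator; raises ValueError if absent
    text = code[start:end].decode("ascii")
    return text, end + 1              # code resumes after the terminator

# Made-up example: JSR $2000, "HI", $00, RTS
sample = bytes([0x20, 0x00, 0x20]) + b"HI" + bytes([0x00, 0x60])
text, resume = inline_string_after_jsr(sample, 0)
print(text, resume)
```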

FungusN0 · 2025-02-16T20:07:16Z

> Glad I lured you over to 6502 bench :)

Pontus,

Actually Grue did ;)

> Marking ranges as various types of data (e.g. 16-bit address vectors), and as inline data, is supported. Code is more tricky than the others because you don't want to mark the entire block as code, but rather mark all code entry points. This is because SourceGen uses code tracing to find code areas, rather than simply "color coding" ranges. Also, sometimes you actually do need to mark multiple bytes of a single instruction as entry points, e.g. when a BIT instruction is used to wrap an immediate load.

OK, maybe the heuristics could be made a little smarter?

> Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

> Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.

That should maybe be part of the tool, since it's a very common thing to do, as are the others I mentioned. I view needing scripts as something for special cases, like decryption or obfuscation, that are program dependent.

fadden · 2025-02-17T00:42:10Z

> Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.

> That should maybe be part of the tool, since it's a very common thing to do, as are the others I mentioned. I view needing scripts as something for special cases, like decryption or obfuscation, that are program dependent.

My experience has been that inline data following a JSR is often fairly custom. The built-in script handles a lot of situations, but it's not uncommon to follow the JSR with a structure, like a two-byte text position before the string data. Apple ProDOS system calls are JSRs followed by a command code and an address, and it's useful to follow the address to format the parameter block there as well. I didn't want to have one mechanism for "simple" things and another for "complex" things.
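
The ProDOS case can be sketched the same way. The Python below assumes the standard MLI calling sequence (a JSR to the MLI entry point at `$BF00`, then one command byte, then a 16-bit little-endian pointer to the parameter block). The function name and the sample bytes are made up, and it stops short of formatting the parameter block itself, which depends on the command.

```python
MLI_ENTRY = 0xBF00  # ProDOS MLI entry point
JSR = 0x20

def parse_mli_call(code, offset):
    """Return (command, param_addr, resume_offset) for a JSR $BF00 call site."""
    target = int.from_bytes(code[offset + 1:offset + 3], "little")
    if code[offset] != JSR or target != MLI_ENTRY:
        raise ValueError("not an MLI call")
    command = code[offset + 3]                                   # command code
    param_addr = int.from_bytes(code[offset + 4:offset + 6], "little")
    return command, param_addr, offset + 6                       # code resumes here

# Made-up call site: JSR $BF00 / .byte $C8 / .word $0300 / RTS
call = bytes([0x20, 0x00, 0xBF, 0xC8, 0x00, 0x03, 0x60])
command, param_addr, resume = parse_mli_call(call, 0)
print(f"cmd=${command:02X} params=${param_addr:04X} resume={resume}")
```

A three-byte inline structure like this is already past what a plain "string after JSR" rule can describe, which is the argument for handling everything through one mechanism.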

fadden added the "enhancement" (New feature or request) label Feb 16, 2025
