Can't mark data as code #164

Open
FungusN0 opened this issue Feb 16, 2025 · 6 comments
Labels
enhancement New feature or request

Comments

@FungusN0

I'm disassembling some code that uses a jump table at the start of the code. It only recognizes the first jmp as valid code and the routine it points to. Everything after that is marked as data and there is no function such as "analyze code" or "mark as code".

@fadden fadden added the enhancement New feature or request label Feb 16, 2025
@fadden
Owner

fadden commented Feb 16, 2025

Select the next piece that starts with $4c, right-click, and select "tag address as code start point" (or hit Ctrl+H Ctrl+C). Repeat.

The "repeat" part is sub-optimal. This was actually requested a while back (#22) as a fully-automated recognition feature. It might be better as a special behavior of the tag feature, where you select the block of JMP instructions and a single "tag" operation properly handles the whole block.
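To make the "single tag operation handles the whole block" idea concrete, here is a minimal Python sketch. It is not SourceGen code: the function name `jump_table_entries` and the sample bytes are made up for illustration, and it assumes the table is an unbroken run of 3-byte `JMP` absolute instructions (opcode `$4c`).

```python
JMP_ABS = 0x4C  # 6502 opcode for JMP absolute: opcode, address low, address high


def jump_table_entries(data, base_addr, start=0):
    """Return (entry_address, target_address) pairs for a run of JMP abs instructions.

    Hypothetical helper: stops at the first byte that is not $4C or when
    fewer than three bytes remain.
    """
    entries = []
    offset = start
    while offset + 2 < len(data) and data[offset] == JMP_ABS:
        # Operand is a little-endian 16-bit address.
        target = data[offset + 1] | (data[offset + 2] << 8)
        entries.append((base_addr + offset, target))
        offset += 3
    return entries


# Made-up example: three JMPs followed by an RTS, loaded at $1000.
table = bytes([0x4C, 0x09, 0x10, 0x4C, 0x20, 0x10, 0x4C, 0x30, 0x10, 0x60])
for entry, target in jump_table_entries(table, 0x1000):
    print(f"tag ${entry:04X} as code start point (JMP ${target:04X})")
```

Each reported entry address is what would otherwise have to be tagged by hand, one at a time.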

@FungusN0
Author

I agree that is suboptimal. Should be able to select blocks of anything and mark them as code, text, data, vectors, inline, and other common things in 6502. That would greatly speed up reversing things.

Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

@BacchusFLT

BacchusFLT commented Feb 16, 2025 via email

@fadden
Owner

fadden commented Feb 16, 2025

> I agree that is suboptimal. Should be able to select blocks of anything and mark them as code, text, data, vectors, inline, and other common things in 6502. That would greatly speed up reversing things.

Marking ranges as various types of data (e.g. 16-bit address vectors), and as inline data, is supported. Code is more tricky than the others because you don't want to mark the entire block as code, but rather mark all code entry points. This is because SourceGen uses code tracing to find code areas, rather than simply "color coding" ranges. Also, sometimes you actually do need to mark multiple bytes of a single instruction as entry points, e.g. when a BIT instruction is used to wrap an immediate load.
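As an illustration of why entry points matter more than ranges, here is a small Python sketch of the BIT trick. It is a toy, not SourceGen's tracer: the opcode table covers only four instructions, the `trace` function is hypothetical, and it follows straight-line code only (no branches or jumps).

```python
# opcode -> (mnemonic, instruction length in bytes); only what this example needs
OPCODES = {
    0xA9: ("LDA #imm", 2),
    0x2C: ("BIT abs", 3),
    0x85: ("STA zp", 2),
    0x60: ("RTS", 1),
}


def trace(data, entry):
    """Follow straight-line code from `entry`; return offsets where instructions start."""
    starts = []
    offset = entry
    while offset < len(data):
        mnemonic, length = OPCODES[data[offset]]
        starts.append(offset)
        if mnemonic == "RTS":
            break  # control flow leaves this routine
        offset += length
    return starts


# LDA #$00 / BIT $01A9 / STA $10 / RTS, where the BIT operand hides LDA #$01
code = bytes([0xA9, 0x00, 0x2C, 0xA9, 0x01, 0x85, 0x10, 0x60])
print(trace(code, 0))  # enters at the top: the BIT swallows the second LDA
print(trace(code, 3))  # enters inside the BIT operand: LDA #$01 executes
```

Offsets 2 and 3 are both valid instruction starts, one nested inside the other, so a plain "this range is code" marker could not describe both paths.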

> Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.
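For a sense of what such formatting involves, here is a rough standalone Python sketch of the simplest case, a null-terminated string after the `JSR`. It does not use the SourceGen extension script API; the function name `inline_strings` and the routine address `$2000` are assumptions, and it scans raw bytes rather than tracing code, which a real disassembler would not do.

```python
JSR_ABS = 0x20  # 6502 opcode for JSR absolute


def inline_strings(data, base_addr, print_routine):
    """Find JSRs to `print_routine` and the null-terminated string after each.

    Hypothetical helper. Returns a list of (string_address, text) pairs.
    """
    found = []
    offset = 0
    while offset + 2 < len(data):
        target = data[offset + 1] | (data[offset + 2] << 8)
        if data[offset] == JSR_ABS and target == print_routine:
            start = offset + 3            # return address + 1 points here
            end = data.find(0x00, start)  # terminator
            if end == -1:
                break                     # unterminated string: give up
            text = data[start:end].decode("ascii", errors="replace")
            found.append((base_addr + start, text))
            offset = end + 1              # execution resumes after the terminator
        else:
            offset += 1
    return found


# Made-up example: JSR $2000, "HI", $00, RTS, loaded at $1000.
sample = bytes([0x20, 0x00, 0x20]) + b"HI\x00" + bytes([0x60])
print(inline_strings(sample, 0x1000, 0x2000))
```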

@FungusN0
Author

> Glad I lured you over to 6502 bench :)

Pontus,

Actually Grue did ;)

> Marking ranges as various types of data (e.g. 16-bit address vectors), and as inline data, is supported. Code is more tricky than the others because you don't want to mark the entire block as code, but rather mark all code entry points. This is because SourceGen uses code tracing to find code areas, rather than simply "color coding" ranges. Also, sometimes you actually do need to mark multiple bytes of a single instruction as entry points, e.g. when a BIT instruction is used to wrap an immediate load.

OK, maybe the heuristics could be made a little smarter?

> Some smaller intelligences could be made too, in terms of code that uses inline data after a JSR to a function that uses the return address+1 as a pointer to that data. This is tangential however.

> Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.

That should maybe be part of the tool, since it's a very, very common thing to do, as are the others I mentioned. I view needing scripts as something for special cases like decryption or obfuscation that are program-dependent.

@fadden
Owner

fadden commented Feb 17, 2025

> Inline data that follows a JSR can be formatted with extension scripts (https://6502bench.com/sgtutorial/extension-scripts.html). Some basic ones for addresses and strings are provided.

> That should maybe be part of the tool, since it's a very, very common thing to do, as are the others I mentioned. I view needing scripts as something for special cases like decryption or obfuscation that are program-dependent.

My experience has been that inline data following a JSR is often fairly custom. The built-in script handles a lot of situations, but it's not uncommon to follow the JSR with a structure, like a two-byte text position before the string data. Apple ProDOS system calls are JSRs followed by a command code and an address, and it's useful to follow the address to format the parameter block there as well. I didn't want to have one mechanism for "simple" things and another for "complex" things.
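The ProDOS case can be sketched as follows. This is a standalone Python illustration, not the script shipped with SourceGen: `decode_mli_call` is a hypothetical name, and it assumes the call goes through the standard MLI entry point at `$BF00`, followed by a one-byte command code and a two-byte parameter block address.

```python
JSR_ABS = 0x20      # 6502 opcode for JSR absolute
MLI_ENTRY = 0xBF00  # ProDOS machine language interface entry point


def decode_mli_call(data, offset):
    """Decode `JSR $BF00 / .byte command / .word params` starting at `offset`.

    Hypothetical helper. Returns (command, param_block_address, next_offset),
    or None if the bytes at `offset` are not an MLI call.
    """
    if offset + 5 >= len(data) or data[offset] != JSR_ABS:
        return None
    target = data[offset + 1] | (data[offset + 2] << 8)
    if target != MLI_ENTRY:
        return None
    command = data[offset + 3]
    params = data[offset + 4] | (data[offset + 5] << 8)
    return command, params, offset + 6  # code resumes after the inline data


# Made-up example: JSR $BF00, command $C8, parameter block at $2010, then RTS.
call = bytes([0x20, 0x00, 0xBF, 0xC8, 0x10, 0x20, 0x60])
command, params, resume = decode_mli_call(call, 0)
print(f"MLI command ${command:02X}, parameter block at ${params:04X}")
```

The returned parameter block address is the part a "simple" inline-string handler would miss: it is what lets a formatter go on to lay out the structure stored there.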

3 participants