Skip to content

Conversation

Zunawe
Copy link
Collaborator

@Zunawe Zunawe commented Oct 5, 2025

What is this fixing or adding?

Emerald patch files including the base patch is redundant. The only benefit it provides is the ability to patch a game regardless of whether there's a mismatch between the apworld version used to generate the patch file and the one used to patch it. But this is rendered useless anyway because if the base patch changes, the client will reject it anyway and prevent you from connecting.

So instead we can just check that the base patch used to generate is the same file as the one in our currently installed apworld and use that one. Reduces patch file size by about 85%.

Also a couple typing/attribute things that popped out while I was in __init__.

How was this tested?

Patched an old patch file on this branch to make sure it still worked. Generated and patched on this branch. And then artificially modified the base patch file between generation and patching to see the error.

@github-actions github-actions bot added the waiting-on: peer-review Issue/PR has not been reviewed by enough people yet. label Oct 5, 2025

patch = PokemonEmeraldProcedurePatch(player=self.player, player_name=self.player_name)
patch.write_file("base_patch.bsdiff4", pkgutil.get_data(__name__, "data/base_patch.bsdiff4"))
patch.base_patch_hash = hashlib.sha256(pkgutil.get_data(__name__, "data/base_patch.bsdiff4")).hexdigest()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of this base_patch_hash attribute. During generation, it gets set here, and it is used to populate the manifest. When patching, it gets populated by the manifest. Messing with the constructor of PokemonEmeraldProcedurePatch seemed messier than I expected. I considered putting it as an arg in the base patch step of PokemonEmeraldProcedurePatch.procedure, but that would mean loading and hashing the base patch file on import. I looked into a decorator that could lazy load the procedure the first time it gets accessed, but it looked hacky and only got one use.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could cache the hash on a property? It'd still get loaded on first import, but not on every import.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were just thrown into the procedure attribute it would only load the one time anyway, right?. But that's primarily what I'm trying to avoid, since iirc Emerald is already somewhat slow to import among core games.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanna say a property would make it more lazy, but I should probably test that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok yeah that was completely off. This seems fine enough. Any particular reason you don't just put the current basepatch hash as a variable? BTW I don't think Emerald is in the top 10 of slowest imports anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do realize now actually that this should go on the world as a static attribute rather than recalculating it every time. I'll do that tomorrow.

I tried passing the hash in as an arg to the constructor, and it caused some weirdness that I couldn't quite figure out regarding calling the superclass' constructor. Trying to pass the kwargs back up looked odd and I got an error during patching if my constructor had a positional arg. I agree that it probably makes more sense in the constructor though, conceptually.

@Zunawe Zunawe added the is: enhancement Issues requesting new features or pull requests implementing new features. label Oct 5, 2025
@black-sliver
Copy link
Member

black-sliver commented Oct 19, 2025

Just noting that this moves us one step further from having an alternative implementation of appp.

More importantly it makes it impossible to run a game rolled on Archipelago version N-1 with a client running on Archipelago version N, which is something we explicitly want to support.

I'm happy for clarification on those two points from another core maintainer.

@Zunawe
Copy link
Collaborator Author

Zunawe commented Oct 19, 2025

When the base patch changes, the difference is generally dramatic, and there's not much that can be done in order to keep an apworld client compatible with a previous version, short of including the old data in the new apworld and adding a bunch of if statements to the client. If the base patch changes, so does Emerald's major version number. Emerald v2 clients were not compatible with Emerald v1 patches, but they would detect it and explain as much to the user. So if we want to make an effort to keep Emerald backwards compatible when the base patch changes, it would be a pretty simple extra step to include the old base patch as well as the old data. The addition of this PR doesn't change the feasibility of keeping things backward compatible.

And this PR alone doesn't violate backwards compatibility either.

If we're trying to replace APPP and this interferes with that, I wasn't aware.

The motivation was to cut down on patch file size to help save memory on the webhost primarily. It's nice to have smaller downloads, but I don't think any users are really agonizing over the size of Emerald patch files. If that boon isn't worth the tradeoffs, that's fine.

@black-sliver
Copy link
Member

When the base patch changes, the difference is generally dramatic

Can't there be a base patch change that just fixes, say, a crash but leave the major version (that is tested by the client) in tact? However in such a case, using the newer basepatch could actually improve things for the player, but you verify the hash so this would not work with the current system either. But, like, if we don't go all the way, there is probably absolutely no point in keeping the basepatch unless we want to support alternative APPP patcher implementations (read below).

To allow BHClient to support multiple versions of emerald, one solution would be separating the generation code from the client code (or having the client definitions just be json files), which would allow having 2 versions of the client code at any time.

If we're trying to replace APPP and this interferes with that, I wasn't aware.

I may not have been clear about this. When APPP was implemented/merged, I mentioned that what was proposed has the following problem:
The way it's structured, patching requires the patcher to know the game involved because the procedures are not properly namespaced.
That is Pokemon R/B could have a different apply_base_patch than Pokemon Emerald, but the code can not know that apply_tokens and apply_bsdiff4 is the same between Pokemon R/B and Pokemon Emerald without having the actual APWorlds loaded. If all your APPP does is apply_bsdiff4 + apply_tokens, you actually do not need to have the world loaded to patch the game.

Namespacing would mean something like putting "pokemon_emerald:apply_base_patch" into the APPP instead of just "apply_base_patch".

An alternative patcher would be relevant if you run the client and the game on two different machines.
And namespacing might even be relevant for "regular" AP setups if we take lazy loading to an extreme.

This is not something you can fix in your world, but I am yet to hear back from anyone else on the core team if we really want to keep the current system, and just wanted to bring it up in case this leads somewhere.

--

Assuming we "fix" neither of the two things, I don't see anything wrong with your change fwiw, but I would not want to merge this for 0.6.4 but at a later point.

@Zunawe
Copy link
Collaborator Author

Zunawe commented Oct 20, 2025

Can't there be a base patch change that just fixes, say, a crash but leave the major version (that is tested by the client) in tact?

Changes made to the base patch version of the game happen in C source code. There's no step in the process where I know what affect a change has had on the modified binary file unless I manually inspect it. Everything is designed to draw from data exported while the base patch is being created because any given change could have wide effects. Realistically, small changes might only push a few bytes around, but keeping an eye on changes in the binary or trying to manage them is a pain. Other than doing a manual review of the difference between the exported data in the fixed and unfixed binaries to see if anything changed, the way to deal with handling two different base patches is to keep a copy of the old addresses around and swap between which addresses it's using.

For serious problems that can be fixed with small changes in scripting, data, or instructions, adding an extra few commands to the token file during generation is simpler than exporting a new base patch, and it guarantees not moving stuff around.


As for the custom APPP stuff, I see what you mean now. But the difference between needing to load the Emerald apworld and not needing to load Emerald apworld doesn't seem significant to me. It might be beneficial to separate the need to load all apworlds from the ability to patch a game, but lazy loaded worlds knocks out the vast majority of the problem anyway, right? And really the only people who need to patch an Emerald game without pretty much immediately loading the Emerald apworld to connect a client are those who are using AP as a solo randomizer. Everyone else needs the client anyway. So I don't think it's a big deal for Emerald specifically, at least.

@black-sliver
Copy link
Member

black-sliver commented Oct 20, 2025

but lazy loaded worlds knocks out the vast majority of the problem anyway, right?

We are not there yet, so it's hard to say, but the way it is written now, AutoBizHawkClientRegister can't be populated lazily and AutoPatchRegister is used by WebHost, which means it also has to be pre-populated, however this is separate from AutoPatchExtensionRegister, which also needs its own lazy loading to populate the patch extensions that will be needed (this is complicated by required_extensions).

Everyone else needs the client anyway

But it could be on 2 separate systems. For apbpatch we have a pure C implementation that is also available as WASM (a website). Another use-case would be an emulator or launcher that can directly open an APPP.

I think the way to fix this without breaking existing behaviour is to add some sort of prefix to methods in derived classes that are invalid to be used in the base class (i.e. never override) and then when such a prefix is found, look at the manifest's game to resolve those, while being able to use the Base "core" APPatchExtension for those that do not have that prefix. This still does not solve multi-level inheritance nor the required_extensions, but it would get us 90% there. (Both of those things could be emulated if we were certain there was no overriding happening.)

Another thing worth considering would be a generic APPatchExtension.version to make sure we never try to use an incompatible APPatchExtension.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

is: enhancement Issues requesting new features or pull requests implementing new features. waiting-on: peer-review Issue/PR has not been reviewed by enough people yet.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants