Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethinking the web-vault repository #137

Open
stefan0xC opened this issue Aug 14, 2023 · 13 comments
Open

Rethinking the web-vault repository #137

stefan0xC opened this issue Aug 14, 2023 · 13 comments

Comments

@stefan0xC
Copy link
Contributor

I've been working on a soft fork of the clients repository where the changes to the web-vault can be directly applied. The idea is to having a separate branch for each web-vault release, which is likely easier to manage than the way we currently do with the patch files. And we could manage the releases using that repository as a submodule in this repository like this.

While this is a proof of concept right now, I believe it would make it easier to keep track of the changes in the future because the changes can be annotated in the commits (cf. my v2023.7.1 branch).

If we want we could also add the old patch files more or less unmodified like I've done in v2023.5.1.patch.

By using something like git cherry-pick v2023.7.1~4..v2023.7.1 (which I've tested on the rc branch) it's easy to re-apply the changeset to a new release (and it's also more comfortable to use in case of merge conflicts).

Keeping the changes to the bitwarden/clients repository separate from this repository would also make it possible to just change the version in app/web/package.json so we could get rid of the vw-version.json file entirely.

The obvious drawback is that we'd have to maintain a fork but I think that it would be worth it.

@BlackDex
Copy link
Collaborator

I think this way of working has been mentioned a few times before.
I do see some issues with this. mainly that it could be harder for other people if the want to pull the bitwarden client repo them selfs and patch it for Vaultwarden.

I also think the patch files make it a bit more transparent into what is going to be patched.
If i understand the way you mention correctly, is that we just fork the client repo, and make the adjustments and to a rebase everytime needed right?

That might be easier for small changes and sometimes also for some larger, but i think it will loose a good way to see what has been patched. specifically for Vaultwarden.

It might be that i overlook something, so please let me know, or explain it better or even graphically explain it which might help others also to understand it.

@stefan0xC
Copy link
Contributor Author

stefan0xC commented Aug 15, 2023

Hm, I think you overestimate the clarity of the patch files a bit. Sure, I can see what gets patched but since the patch file is for the most part not annotated, I will not understand why a patch has been applied (and most of the time I won't care about older patches either, so the accumulating list of patch files is more confusing than helpful). And having the changeset in a single patch file, which are sorted by filename instead of being grouped logically is not the best user experience either.

By splitting the changes into more manageable chunks, I also have better context for the change. And if we approach it more like my v2023.7.1 example instead of the v2023.5.1, it would also give us a better way to differentiate between functionally essential changes, our re-branding efforts and hot fixes like this commit which could also have been cherry-picked from your PR.

I'll try to explain it graphically. Upstream has different branches, e.g.

         __ (t2)  __ (rc)
        /        /
---\---/--------/---- (master)
    \
     \__ (t1)

Where t1 and t2 is shorthand for the tag web-v2023.5.1 and web-v2023.7.1 respectively.

Our fork would have the same base but adding additional branches to keep track of our changes.

         __ (t2) ----- (b2)
        /
---\---/------------- (master)
    \
     \__ (t1) -- (b1)

The changes from t2 to b2 (which would be shorthand for the branch v2023.7.1) would be a patch set that is stored not as a patch file, but instead as commits in a git repository.

I think this would be an improvement to the current way of doing things, as when the application of a patch fails, I have to manually apply the missing changes from the .rej files, generate a new patch file, checkout and reapply the patch file and check if everything is correct, which is a bit cumbersome.

Managing the changes in the forked repository we can easily create a new release (by creating a new branch v2023.8.1 from the web-v2023.8.1 tag and then just cherrypicking the commits). No need to rebase and if there are merge conflicts, git cherry-pick lets us solve them interactively. (Granted, I still have to test what happens on bigger restructuring changes but I'm rather confident that this would still be an improvement to our current approach.)

And as demonstrated in my post you can easily view the differences between the released tag (upstream) and our changed branches, and you could also create a diff file for each branch. For example, in the v2023.5.1 branch I've first applied the v2023.5.1.patch you can see that, if you compare it to the diff file of the first commit).

If you really want to, it's also possible to show the differences between two changesets by using something like git range-diff web-v2023.5.1..v2023.5.1 web-v2023.7.1..v2023.7.1 --creation-factor 100. However, I don't think this would be that interesting to most users. as the important changes are done upstream. And it might also be easier to just generate diff files and compare them. If you think they are important, we could also keep generating and storing the patch files of course, but I don't think that they are that informative or useful...

Furhermore, by actually using the bitwarden/clients repository, it becomes easier to run the development tools and ensure better code quality for our patches.

And lastly, I think working with patch files is a bit limiting in what we can do. Not only because it's bit involved, keeping the patches up to date (e.g. when we want to introduce multiple changes concurrently). Having the changes in more manageable chunks might allows us to discuss changes separately and also in the long term to maybe have different web-vault version. Eg. for users that have push notifications or sso enabled (once it has been added).

@dionysius
Copy link

2 cents: As a maintainer of the downstream debian packaging both ways would work for me.

Currently the original git upstream is referenced and whenever theres a new release here I update that reference, copy the matching patch and let the deb build tooling apply it since it offers such mechanism.

If I can reference a maintained soft-fork, I could spare the copying of the patch and would only need to update the git reference.

In any way I have some minor steps to do - could be automated.

@Timshel
Copy link
Contributor

Timshel commented Dec 27, 2024

Hey not sure if the change is still considered or if you want additional feedback, but since I opened some PR, and I'm maintaining an equivalent distribution.

Issues I have with the current system:

  • Modifications are not trivial to make since it means patching a patch
  • Can't use Github interface to compare the modifications.
  • Using a specific patch file for each version means that
    • Any PR will need to be reworked each time a new version is released (main reason I never opened a PR for the OpenId modifications).
    • If a change is merged while the PR for the next version is in progress then it needs to be manually added to the PR or it will be discarded.

I switched to committing the patch on top of the main branch and I find the flow easier to use:

  • Working on a new version is a simple rebase
  • It is easier to split modifications in multiple commits with more information.
  • It does mean force pushing to the main branch which will brake PR, but it should be easier to fix them (alternatively a different branch could be used for each major Bitwarden release and the default branch could be changed each time).
  • Depending on how the commit are managed different issues could arise:
    • If all new modifications are always added in new commit it could mean a complicated history after some time (but commit could be periodically squashed).
    • If past commit are edited and multiple branch are used then some merged change could be missed in the case of a long-running PR (a simple merge would reveal that some change are missing).

@BlackDex
Copy link
Collaborator

Ill have to dive into this a bit more, but as mentioned before, I'm not against a change though.

So basically, we would need a fork of the bitwarden/clients repo right?
Would a sub-repo be an option too? In such a way that we can still have it a bit separated and it doesn't look like Vaultwarden is maintaining a full fork of all the clients?

That would also make it a bit more easier to add and maintain our own github build-scripts for example.
And, if I am correct, we can minimize which directories are pulled in, though not sure if that is possible with a sub-repo too. Like the spares-checkout feature.

@Timshel
Copy link
Contributor

Timshel commented Dec 28, 2024

Would a sub-repo be an option too?

I don't think so since the goal would be to commit on top of it.

it doesn't look like Vaultwarden is maintaining a full fork of all the clients ?

I think for this, the simplest is probably to have a commit which delete all unnecessary folders/files.

@BlackDex
Copy link
Collaborator

Would it help if i would create a fork in our new vaultwarden repo both of you (@stefan0xC and @Timshel) take a look at this? Since i know both of you have done some work on this already?

Ill probably also name it vw_web_builds instead of clients to also indicate it's for web-builds only.

@Timshel
Copy link
Contributor

Timshel commented Dec 28, 2024

Would it help if i would create a fork in our new vaultwarden repo

Sounds good :).

@BlackDex
Copy link
Collaborator

Repo is created and can be found here: https://github.com/vaultwarden/vw_web_builds

@stefan0xC
Copy link
Contributor Author

@BlackDex I have a slightly different workflow. Like I said I create a new branch for every web-vault release and cherry-pick all changes from the previous release/branch. E.g. for the release web-v2024.12.1, I create a new branch based from that tag called just v2024.12.1 and then I git cherry-pick web-v2024.12.0..v2024.12.0.

Once upon a time I rebase the commits to clean up the changes or add some history (cf. the changes upon web-v2023.5.1 with web-v2023.9.1 or web-v2024.12.1).

If you can create a v2024.12.1 branch in the new repo based on the web-v2024.12.1 tag, I could create a PR request to get us started. (I mean, I could also try to add the main branch but I don't think that this will work.)

As to how we want to work with this repo in the future I have not thought that far ahead. I think it might be possible to move the build scripts from this repo to the vaultwarden repository and reference the web-vault repo as a submodule.

@Timshel
Copy link
Contributor

Timshel commented Jan 2, 2025

I will remark that if not using the main branch, unless the default branch is changed with each release, it's not evident which branch is worked on.

From the merged PR:

  1. do we want to also remove the non-gpl code from this repository? Or is it enough to remove it when building?
  2. do we want to keep the code for the other clients? Or would this be too tedious?

I think it makes sense to delete it, and when rebasing if some files were modified a git rm on the deleted root directory is usually enough to resolve the conflict.

how do we want to work with this repository?

I think it would make sense to replace all the Bitwarden releaselogic with the script used here and keep only one repository.

@stefan0xC
Copy link
Contributor Author

I will remark that if not using the main branch, unless the default branch is changed with each release, it's not evident which branch is worked on.

Well, referencing it as a submodule will not point to the branch but a specific commit. Cf. https://github.com/stefan0xC/bw_web_builds where I have it pointing to clients @ ef9c943 which is currently the top commit on the v2024.12.1 branch. Updating which commit this submodule points to is part of keeping the version up to date but if we change our build process accordingly we could keep the development between releases separate (e.g. have a branch which targets the currently supported web-vault version and one which targets the next one).

I think it makes sense to delete it, and when rebasing if some files were modified a git rm on the deleted root directory is usually enough to resolve the conflict.

That's true, but it's even less work if we don't have to git rm anything. 😇
But I'm also fine with removing it too.

I think it would make sense to replace all the Bitwarden releaselogic with the script used here and keep only one repository.

So integrate everything in the main vaultwarden repository?

@Timshel
Copy link
Contributor

Timshel commented Jan 2, 2025

Well, referencing it as a submodule will not point to the branch but a specific commit.

Yes when I answered it was with the assumption to only use https://github.com/vaultwarden/vw_web_builds to build the front package. Which is what I meant too when I mentioned moving the release logic 😅 .

An alternative could be to add all the build logic to the main branch of https://github.com/vaultwarden/vw_web_builds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants