Skip to content

Conversation

Fuzzyma
Copy link
Collaborator

@Fuzzyma Fuzzyma commented Mar 19, 2025

This PR does a few things:

  • Update types to reflect the proposed codemod format (backwards compatible)
  • Change codemods that replace something to the new format (that were only 3)
  • qs had a bit more of an overhaul and uses the warnings field
  • There are 2 types of options now. The one you pass in when calling the codemod factory. This is e.g. for general behavior (stuff like: strict mode, fail on warning, stuff like that). Then there is an options parameter that is passed into the transform function itself. This is utilized to control the astlib that is used in the codemod. e.g. many codemods passed options to toSource of jsopenshift. Those control stuff like double vs singlequotes but also support sourcemap generation. This is basically the old options object.
  • I also added a new field "astlib" that can be passed to the codemod. That way a consuming lib can check which lib is used in the codemod and pass corresponding options

When rolldown is a thing, a file would also come already parsed into an ast. So I think we should also support the case when an ast is available already and reuse it. That would take away from the free choice of lib tho. So for now thats food for thought.

Example usage:

import qs from './codemods/qs';

const codemod = qs({ strict: true });

const someSource = 'import qs from "qs"' // whatever

const result = await codemod.transform({
  file: {
    source: someSource,
    filename: 'someFilename.js'
  },
  // this part wont work yet because qs doesnt have the astlib field set
  options: codemod.astlib === 'recast' || codemod.astlib === 'jscodeshift' ?  {
    qoutes: 'single'
  } : undefined
})

const newSource = typeof result === 'string' ? result : result.code

console.log(newSource); // import * as pc from "picoquery"
console.log(result);
/*
{
  code: 'import * as pq from "picoquery"',
  warnings: [],
  replacements: { qs: 'picoquery' }
}
*/

Open for suggestions!

- codemods return what they replace
- codemods can return a sourcemap
- codemods transform function gets passed an option object which contains options related to the used astlib specified in astlib
- codemods can be configure via codemod options
- unified usage of quote types
- added script to renew fixtures
@thepassle
Copy link
Collaborator

thepassle commented Jul 8, 2025

Ok this makes sense to me. One thing I'm not super sure about is the api here:

const result = await codemod.transform({
  // this part wont work yet because qs doesnt have the astlib field set
  options: codemod.astlib === 'recast' || codemod.astlib === 'jscodeshift' ?  {
    qoutes: 'single'
  } : undefined
})

This feels a bit clunky to me for some reason, maybe we can think of a different way to do this? Maybe just:

const result = await codemod.transform({
  jscodeshift: { /* etc */ },
})

const result = await codemod.transform({
  recast: { /* etc */ },
})

Also, why not add the astlib property in this PR as well?

@thepassle
Copy link
Collaborator

And would we still need replacements now that codemods have a to property?

@thepassle
Copy link
Collaborator

For codemods that just return a string, you still wont know if the codemod changed something, right?

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 8, 2025

This feels a bit clunky to me for some reason, maybe we can think of a different way to do this?

Yes you are right. It does feel clunky and I like your idea. But it would be limited to known ast libs. If a codemod uses one that is not in our list of supported options, it wont work anymore.

Also, why not add the astlib property in this PR as well?

I did add it to the types. But I didnt want to rewrite all codemods until this is somewhat approved

And would we still need replacements now that codemods have a to property?

Yes, because the "to" prop is static. Replacements is only filled if the codemod actually did the replacement. E.g. maybe the qs codemod is configured (via options) to only do the replacements if it can guarantee same behavior. However, thats not always possible because qs covers loads of cases. So in that case the codemod might find that it cannot replace qs. Replacements would be empty in that case.

Maybe explaining my usecase helps: I am building a vite plugin that runs codemod on transitive dependencies to cut out bloat from your app bundle by replacing imports. The user most probably isnt aware of every single package that is used in its dependencies so they most likely will run more codemods than actually needed. I want to be able to prompt the user to install required packages based on what the codemod actually replaced in the end. It is also useful for the cli. People might run codemods on huge codebases and they might not be aware which packages are used so they maybe run all of them just to be sure. And the cli in the end should also report which new packages they need to install now

Hence: dynamic replacements property :D.

For codemods that just return a string, you still wont know if the codemod changed something, right?

I didnt want to introduce a breaking change. Thats why I left the string-return in there. And since I also didnt want to change all codemods in this PR, I would rather deprecate the string return and slowly convert all codemods over.

@thepassle
Copy link
Collaborator

Yes you are right. It does feel clunky and I like your idea. But it would be limited to known ast libs. If a codemod uses one that is not in our list of supported options, it wont work anymore.

Well it would only be ast libs used by our codemods right? So we'd be in control of the list, and can just add if we decide to introduce more astlibs used by our codemods

I didnt want to introduce a breaking change.

as far as im concerned its fine to do a breaking change

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 8, 2025

Well it would only be ast libs used by our codemods right? So we'd be in control of the list, and can just add if we decide to introduce more astlibs used by our codemods

yes and no. Depending on how popular this whole thing becomes :D.
I am also thinking about other people contributing codemods. They might be more likely to add a codemod but less likely to change types. But I think you are right. Its a small thing and I like your syntax way more

as far as im concerned its fine to do a breaking change

Ok, then I will change this to always return proper TransformResults.

Adding to all this, I discussed this issue with @43081j a while ago and there were some talking points that I dont want to ignore:

  • should the warnings be rather a log array with different types of logs (aka info, warning, error)?
  • should it be an array or rather an injected log function? Having it return an array means, the codemod has to return to show any messages. Maybe sometimes you want to log while the codemod is run. So injecting a log function that the codemod can use for logging might be the better abstraction (?)
  • should the codemod even continue on error or abort?
  • should this PR already add sourcemaps? (they are optional anyways, so we could add them when actually needed. I want them in my plugin later. Thats why I added them)

@thepassle
Copy link
Collaborator

should the warnings be rather a log array with different types of logs (aka info, warning, error)?

That makes sense to me (having different types of logs)

should it be an array or rather an injected log function? Having it return an array means, the codemod has to return to show any messages. Maybe sometimes you want to log while the codemod is run. So injecting a log function that the codemod can use for logging might be the better abstraction (?)

Hm, I think there's pros/cons to either, I dont have a super strong opinion about this though

should the codemod even continue on error or abort?

Hopefully codemods dont error, if there are bugs we should fix that, but I feel like its up to the wrapping code to catch any errors from the codemod

should this PR already add sourcemaps? (they are optional anyways, so we could add them when actually needed. I want them in my plugin later. Thats why I added them)

Is there a reason we shouldnt? (Not being snarky, just trying to understand :) )

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 8, 2025

Hm, I think there's pros/cons to either, I dont have a super strong opinion about this though

James wanted to go with returning the logs. Its a simpler api for sure. But also less flexible

Hopefully codemods dont error, if there are bugs we should fix that, but I feel like its up to the wrapping code to catch any errors from the codemod

Yeah but then again, does it make sense for the codemod to log errors if it cant error? :D I keep going back and forth on that one

Hopefully codemods dont error, if there are bugs we should fix that, but I feel like its up to the wrapping code to catch any errors from the codemod

I am all for it. James had his reservations. He didnt want to add it until really needed. He has a point. But I can foresee the usage so I wouldnt mind adding them now

@43081j
Copy link
Contributor

43081j commented Jul 9, 2025

astlib - i don't think is quite the right pattern to use here. every codemod is individually exported, so we could just have a generically typed codemod type

something like this:

/**
 * @returns {Codemod<AstGrepOptions>}
 */

then the individual codemod knows that its options.parserOptions or something is AstGrepOptions

if you're programmatically trying to configure any given codemod though, this won't help you. but in that case, i don't think you should be trying to expose/inject the underlying parser/library options.

i think whatever it is you're finding yourself having to configure, should be exposed as our own option generically

importantly, i don't mean literally add all the various codeshift and astgrep options to our options. somewhere there's a subset of those you're trying to configure, but only because you're trying to achieve something higher level. what is that? whatever that is, is what we should expose.

warnings/errors

afaict, there's no requirement we want to fill yet which means we should change error handling.

if a codemod can't do its job (e.g. it failed to parse, you gave it gibberish), it should throw i think.

if it does no-op (e.g. you migrated chalk but didn't have chalk in the first place), it should just return the code as-is.

do we have an example of where we'd want to warn rather than error?

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 9, 2025

The things I want to configure for codemods is mostly their output. There is only one example that I can think of which is singleQuotes: true / false. Because I feel like the user should be able to produce the code closest to their code style.

I disagree that every codemod should export its own options. It makes it impossible to use those options in a script without checking each and every codemod that is used. I added the options specifically to control the output style of the ast lib and the sourcemap generation part. Since this is specific to the ast lib used, it makes sense to just expose those as is.

There are also notably 2 types of options. The one passed to the codemod factory and the ones passed to the transform. The user of the codemod probably often only has access to the factory options while scripts will probably control the transform options.

do we have an example of where we'd want to warn rather than error?

Yes, the qs codemod. I am ok with throwing. But we should agree on if we want to always return a TransformResult or also support string returns. For the consumer its obviously easier to always expect the same instead of doing a type check on the return

@43081j
Copy link
Contributor

43081j commented Jul 9, 2025

The things I want to configure for codemods is mostly their output. There is only one example that I can think of which is singleQuotes: true / false. Because I feel like the user should be able to produce the code closest to their code style

some of the codemods already do this internally and infer the quote type from the existing source

maybe just do that as best effort and leave the user to run a formatter or fix it up if its not quite right?

its a possibly deep rabbit hole if we add such an option, otherwise. since you could go on forever (quotes, curly if, import formatting, ternaries or not, etc). i think you just have to go for it and leave the user to reformat it if they need.

some of the more complex codemods for example, inject multiple imports. these won't be formatted well, but we don't care because its a lost cause trying to format them in a way everyone is happy with. we just delegate to the user

Yes, the qs codemod. I am ok with throwing. But we should agree on if we want to always return a TransformResult or also support string returns. For the consumer its obviously easier to always expect the same instead of doing a type check on the return

i think the qs warnings are useful but are basically debug level almost. i don't think we need to return these.

i think the following should happen:

  • if the codemod can't run (parse error, something catastrophic), throw
  • if it has to drop some things or can't one-to-one map some stuff, warn
  • possibly introduce a quiet: boolean option for codemods to disable these warnings

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 9, 2025

its a possibly deep rabbit hole if we add such an option, otherwise. since you could go on forever (quotes, curly if, import formatting, ternaries or not, etc). i think you just have to go for it and leave the user to reformat it if they need.

No if i can just pass in the options of the ast lib used its not a rabbithole at all. its very constrained. And it is required for recast for sourcemap generation to work properly i think

i think the qs warnings are useful but are basically debug level almost. i don't think we need to return these.

But I dont want to show the wanrings somewhere in between were they get lost. I want a proper output that i can show when I want.

I did introduce a strict flag that I passed to the codemod factory as option:

import qsCodemod from "../../test/codemods/qs";

const codemod = qsCodemod({ strict: false })

The reason being that the user would pass me the codemods he want to use but still want to supply options.
In this case, the qs codemod would error out when it cannot replicate the behavior to 100%.

So the options passed to the factory would be end-user facing options and the options passed to transform would be lower level options

@43081j
Copy link
Contributor

43081j commented Jul 9, 2025

But I dont want to show the wanrings somewhere in between were they get lost. I want a proper output that i can show when I want.

the only place we warn right now is debug level, is my point. we probably shouldn't be outputting those warnings by default, which means we certainly shouldn't be returning them. they're not intended to be consumed by anything

a codemod has two completion states: either it worked (and we don't care if it was lossy), or it couldn't run and threw

for quotes, we already detect the ones to use in various codemods. that 'best effort' is enough as we're not expecting to output code formatted the same way as the repo it is in. that's the user's job, otherwise we end up implementing all sorts of formatting logic.

on a side note, exposing any third party types (including options) isn't a great idea for maintaining a public API. we should either have our own type or not do it. in this case, it does feel like we don't need to do any of this

@Fuzzyma
Copy link
Collaborator Author

Fuzzyma commented Jul 9, 2025

for quotes, we already detect the ones to use in various codemods. that 'best effort' is enough as we're not expecting to output code formatted the same way as the repo it is in. that's the user's job, otherwise we end up implementing all sorts of formatting logic.

We are not implementing anything. We are simply passing options through. There is no work on our side and there are no downsides. This is also not only about formatting. Recast for example needs options about the sourcemap name which can only be passed by tools that utilize the codemod.

// EDIT: Also really useful for testing because our formatter prefers one type of quotes for the fixtures and the codemod needs to create the same. Otherwise we have to ignore the fixtures from formating

the only place we warn right now is debug level, is my point. we probably shouldn't be outputting those warnings by default, which means we certainly shouldn't be returning them. they're not intended to be consumed by anything

I consider these to be significant warnings, not debug-level information, as they indicate potential behavioral changes that are vital for users to understand especially when using the codemod trough a tool. I do have a usecase for the returned logs, and I do want to consume them to show them properly to the user. So it would be nice if my usecase could be taken into consideration as well

a codemod has two completion states: either it worked (and we don't care if it was lossy), or it couldn't run and threw

I disagree. Whenever another tool uses the codemod, it needs all meta data it can get about what the codemod did. Especially it it ran partially or not

@43081j
Copy link
Contributor

43081j commented Jul 9, 2025

We are not implementing anything. We are simply passing options through. There is no work on our side and there are no downsides. This is also not only about formatting. Recast for example needs options about the sourcemap name which can only be passed by tools that utilize the codemod.

it means our public interface is directly tied into the public interface of our dependencies, i.e. that there exists a property whose type is entirely decided by a dependency.

instead of that, we should figure out what exactly it is we're trying to expose, and do it at a higher level i think

maybe we need to discuss this on discord as it seems like we need more info on what exactly you're trying to do/what you're running into

I consider these to be significant warnings, not debug-level information, as they indicate potential behavioral changes that are vital for users to understand especially when using the codemod trough a tool

they are useful information but were intended to be basically debug level info. so we shouldn't be looking to expose the logs, as they were never really intended to be consumed anyway.

instead, if you want to know what a codemod has "lost" (if its even aware), we should come up with some kind of Diagnostic or Warning type unrelated to these existing logs

then we leave the logs as they are (or silence them, only showing during debug), and separately write some code to emit useful diagnostics/warnings using a new structure and content

Especially it it ran partially or not

most will run partially, we can't really know. all codemods for migrating from one dependency to another should be assumed to be lossy. since there'll certainly be things we missed, or non-one-to-one mappings, etc.

in places where we know about them, we can do the diagnostics stuff. but know that none of them will ever run "non partially". all of them will have edge cases they don't fully migrate, without knowing. or at least we should assume as much

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants