Skip to content

Conversation

@JosefAnthony
Copy link

This PR modifies the scraper to change the default ZIM output folder from dist to output. It also adds a CLI option (--output) to allow users to specify a custom output directory.

Changes:

  • Updated the default output folder path.

  • Implemented the --output CLI option for flexibility.

  • Ensured backward compatibility by handling cases where output is not specified.

@benoit74
Copy link
Contributor

benoit74 commented Apr 2, 2025

Please also:

Copy link
Contributor

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but while making the review I realize that scraper will rimraf the output folder. This is quite dangerous in general, but every more dangerous once we introduce this new output folder. Should user pass something wrong, he might loose significant data.

I propose that rather than rimraf on the output folder, we fail the scraper if output folder already exists. This is way safer.

@JosefAnthony
Copy link
Author

I'm sorry but while making the review I realize that scraper will rimraf the output folder. This is quite dangerous in general, but every more dangerous once we introduce this new output folder. Should user pass something wrong, he might loose significant data.

I propose that rather than rimraf on the output folder, we fail the scraper if output folder already exists. This is way safer.

Thanks for the suggestion, I have now made the script fail if the output directory already exists instead of deleting it.

@JosefAnthony JosefAnthony force-pushed the standardize-output-folder branch from 3ebb19d to 0830ce2 Compare April 3, 2025 09:53

const creator = new Creator()
creator.configIndexing(true, iso6393LanguageCode).configCompression(Compression.Zstd).startZimCreation(`./dist/${target.output}.zim`)
creator.configIndexing(true, iso6393LanguageCode).configCompression(Compression.Zstd).startZimCreation(`${targetDir}.zim`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You miss the file name from ${target.output} in this call. Beware that $targetDir might contain a trailing slash or not

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh my! i skipped that part...so sorry for the back and fort. Will make the change and push now

Copy link
Contributor

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please squash all your commits into one and force push them, I will merge them right after.

@JosefAnthony
Copy link
Author

Fix path handling in startZimCreation openzim#248

I have squashed the changes and force-pushed them to the standardize-output-folder branch. The commit history is now as follows:
2d5ceb4: Fix path handling in startZimCreation #248

@benoit74
Copy link
Contributor

benoit74 commented Apr 3, 2025

Sorry, I made a typo myself...

Adding issue number in Github commit does not link PR to issue properly.

@JosefAnthony JosefAnthony force-pushed the standardize-output-folder branch from 2d5ceb4 to f5701d4 Compare April 3, 2025 19:46
@JosefAnthony
Copy link
Author

Sorry, I made a typo myself...

Adding issue number in Github commit does not link PR to issue properly.

Oop! I have adjusted and push without the issue number.

@benoit74
Copy link
Contributor

benoit74 commented Apr 4, 2025

Nope, it is the first PR comment which must contain the issue number.
Plus you did not squashed your commits, there are still 5 commits in this PR instead of one.

@JosefAnthony
Copy link
Author

Nope, it is the first PR comment which must contain the issue number. Plus you did not squashed your commits, there are still 5 commits in this PR instead of one.

I’ve been running into some issues squashing the commits properly. Would you recommend that I close this PR and open a new one with a clean commit history and the necessary changes, or should I continue troubleshooting the squash and force push here?
Appreciate your guidance!

@benoit74
Copy link
Contributor

benoit74 commented Apr 4, 2025

Closing the PR will not help squashing commits.

Something like this should do the trick:

git switch standardize-output-folder
git rebase -i main
# mark all commits except the first one for fixup and close text editor
git push -f

Or if you prefer :

git switch standardize-output-folder
git reset --soft $(git merge-base main HEAD)
git commit -m "one commit on yourBranch"
git push -f

@JosefAnthony JosefAnthony force-pushed the standardize-output-folder branch from f5701d4 to 724af04 Compare April 4, 2025 09:39
@JosefAnthony
Copy link
Author

Closing the PR will not help squashing commits.

Something like this should do the trick:

git switch standardize-output-folder
git rebase -i main
# mark all commits except the first one for fixup and close text editor
git push -f

Or if you prefer :

git switch standardize-output-folder
git reset --soft $(git merge-base main HEAD)
git commit -m "one commit on yourBranch"
git push -f

okay thanks...i just try out the second option

@benoit74 benoit74 self-requested a review April 4, 2025 12:18
Copy link
Contributor

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not working in fact:

  • phet2zim --includeLanguages zh_CN --withoutLanguageVariants --output output2 is not working
  • npm run export -- --createMul --mulOnly --output output2 is not working

@JosefAnthony
Copy link
Author

This is not working in fact:

  • phet2zim --includeLanguages zh_CN --withoutLanguageVariants --output output2 is not working
  • npm run export -- --createMul --mulOnly --output output2 is not workin

Thanks for the feedback! I’ll take a closer look at why those commands aren’t working as expected. It’s possible that recent changes to path handling or CLI options affected them. I’ll reproduce the issue locally, fix it, and update the PR shortly.

@JosefAnthony
Copy link
Author

This is not working in fact:

  • phet2zim --includeLanguages zh_CN --withoutLanguageVariants --output output2 is not working
  • npm run export -- --createMul --mulOnly --output output2 is not working

I tried running the same commands — phet2zim and npm run export — using the --dist option on the main branch, but it’s still not working for me either. I'm wondering if there's something I'm missing in the setup. Could you please guide me on what might be wrong or what else I should check?

@benoit74
Copy link
Contributor

benoit74 commented Apr 7, 2025

Same commands without --output output2 on main branch are working ok.

So your code is wrong. You have to fix it. Period.

@JosefAnthony
Copy link
Author

Same commands without --output output2 on main branch are working ok.

So your code is wrong. You have to fix it. Period.

Thanks for the clarification. I'd really like to fix it, but I'm having trouble running the app even on the main branch when using those same commands with the --output or --dist options. If you could guide me through getting it to work properly on main, that would really help me better understand what’s expected and how to fix it correctly. 🙏

@benoit74
Copy link
Contributor

benoit74 commented Apr 9, 2025

Looks like you do not understand at all what you are doing, sorry but I can't help.

@JosefAnthony
Copy link
Author

Looks like you do not understand at all what you are doing, sorry but I can't help.

I understand the task and the solution, and I’m sorry if I’ve caused any frustration. I’m still learning and trying to contribute meaningfully. I’d really love to fix the issue — all I’m saying is, even after forking the project, installing the dependencies, and running the commands as described, it’s still not working for me on the main branch.

If there’s something I might be missing in the setup, I’d appreciate any pointers. Thanks for your time.

@JosefAnthony JosefAnthony force-pushed the standardize-output-folder branch from 724af04 to b59f1f1 Compare April 11, 2025 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants