Issues/Workarounds for Run 2 UL Fullsim workflow #694

Open
atownse2 opened this issue Feb 21, 2025 · 0 comments
Hi everyone, I have spent the past few weeks trying to get a Run 2 UL FullSim workflow working in Lobster, and I have run into a number of issues which I would like to document here. I have a working version right now (for 2016/2017), and below I document the steps I took to get there.

Issues and workarounds

DIGI step fails with XRD errors

CMS is currently migrating pileup files between storage systems, so not all of the files originally referenced in the DIGI step are available on disk. There doesn't seem to be a central solution at the moment, but a workaround is to use a script to regenerate the list of pileup files from the replicas that are actually on disk.

Here is a script for doing this: https://github.com/FNALLPC/lpc-scripts/blob/master/get_files_on_disk.py
An example usage is:

```
python3 get_files_on_disk.py /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX -v -o premix.txt --user [cern_username]
```

To get the Rucio dataset name from the DAS dataset name I did:

```
source /cvmfs/cms.cern.ch/rucio/setup-py3.sh
rucio list-parent-dids cms:<das_filename>
```
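The regenerated list then has to be wired back into the DIGI configuration. A minimal sketch, assuming the cmsDriver-generated config exposes the premix inputs as `process.mixData.input.fileNames` (the usual attribute path for premixing; `premix.txt` is the output of `get_files_on_disk.py`):

```python
import FWCore.ParameterSet.Config as cms

# ... the cmsDriver-generated DIGI config defines 'process' above ...

# Read the on-disk premix files produced by get_files_on_disk.py.
with open('premix.txt') as f:
    on_disk = [line.strip() for line in f if line.strip()]

# Swap the stale pileup list for the files that are actually on disk.
# The attribute path is an assumption based on the standard premixing setup.
process.mixData.input.fileNames = cms.untracked.vstring(*on_disk)
```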

Note: For some reason I still don't have the 2018 DIGI step working, even after updating the file list. I have tried adding `process.mixData.input.skipBadFiles = cms.untracked.bool(True)` to the DIGI step, but it still eventually fails.

HLT step fails with python errors

I think this issue is related to using a newer version of WMCore in Lobster. The newer version of WMCore uses the builtins module, which is part of the standard library in Python 3 but has to be installed manually (via the future package) in Python 2. In most of the UL releases the bundled Python 2 already has it, but some of the releases used for the HLT steps are old enough that they do not. To fix this I added the first two of the following lines near the end of lobster/core/data/wrapper.sh, just before the existing `$*` invocation (the last two lines shown are already in the wrapper and mark the insertion point):

```
python -m ensurepip --user
python -m pip install --user future

$*
res=$?
```
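The missing-module failure is easy to reproduce outside of WMCore. A minimal illustration (WMCore simply does `import builtins`; nothing Lobster-specific is assumed here):

```python
# 'builtins' ships with Python 3; on Python 2 the same import only works
# after 'pip install future', which is what the wrapper.sh change arranges.
try:
    import builtins
    have_builtins = True
except ImportError:  # a bare Python 2 environment lands here
    have_builtins = False

print(have_builtins)  # True on Python 3
```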

Soft failure on aod/maod/naod steps

As a starting point I used the Lobster config defined here: https://github.com/TopEFT/mgprod/blob/master/lobster_workflow/lobster_postLHE_UL_config.py

This "soft failure" results in Lobster not forming any new tasks even though there is still work to do. The only messages I get are in process_debug.log:

```
2025-02-14 12:47:36 [DEBUG] lobster.algo: workflow aod_step_2018_BkkToGRadionToGGG_M1_450_R0_1p125 has not enough units available to form new tasks
2025-02-14 12:47:36 [DEBUG] lobster.algo: workflow maod_step_2018_BkkToGRadionToGGG_M1_450_R0_1p125 has not enough units available to form new tasks
2025-02-14 12:47:36 [DEBUG] lobster.algo: workflow naod_step_2018_BkkToGRadionToGGG_M1_450_R0_1p125 has not enough units available to form new tasks
```

If I look at the number of attempted jobs per step, I see that all aod (reco) steps succeed and many were attempted, all maod steps succeed but only a few were attempted, and no naod steps were attempted at all.

I tried a number of things, including:

  • Modifying the number of cores defined in the resources for the aod/maod/naod steps.
  • Modifying the units_per_task parameter in the Workflow definition for each step
  • Modifying the merge_size parameter in the Workflow definition for each step

Unfortunately I did not try these one at a time, but by disabling the merging, setting units_per_task to 1, and giving the steps 4 cores, I was able to get something that works. Changing the resource definitions alone did not solve the issue.
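Concretely, the working combination looks roughly like this in the Lobster config. This is only a sketch mirroring the structure of the TopEFT config linked above; `units_per_task`, `merge_size`, and per-category `cores` are the parameters named in this issue, while the label and memory values are placeholders:

```python
# Sketch only: not a complete Lobster config.
from lobster.core import Category, Workflow

# Four cores per task for the aod/maod/naod steps. Changing resources
# alone did not fix the stall, but it was part of the working combination.
processing = Category(
    name='processing',
    cores=4,
    memory=4000,  # placeholder memory request (MB)
)

naod = Workflow(
    label='naod_step',        # placeholder label
    category=processing,
    units_per_task=1,         # one unit per task
    # Merging disabled; the exact mechanism is not stated in this issue,
    # so merge_size is simply left unset in this sketch.
    # pset=..., dataset=..., etc. as in the TopEFT config.
)
```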
