PEPATAC looper report Rscript /PEPATAC_summerizer.R error #310

Closed
moor2562 opened this issue Mar 13, 2025 · 13 comments · Fixed by #313

@moor2562

Rscript /users/2/moor2562/bin/pepatac/tools/PEPATAC_summarizer.R /panfs/jay/groups/2/modianoj/shared/moores/proj84/project84_canFam4Y.yaml /home/modianoj/shared/moores/proj84/pepatac_out/results_pipeline /home/modianoj/shared/moores/proj84/pepatac_out/results_pipeline 2 5 1 (461067)

Loading config file: /panfs/jay/groups/2/modianoj/shared/moores/proj84/project84_canFam4Y.yaml
Creating assets summary...
Summary (n=48): /home/modianoj/shared/moores/proj84/pepatac_out/results_pipeline/project84_ATAC_assets_summary.tsv
Creating summary plots...
Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE) : 
  Column 1 passed to [f]order is type 'list', not yet supported.
Calls:  ... lapply -> FUN -> unique -> unique.data.table -> forderv
Execution halted

I'm receiving the error above. Running looper run -c looper_84_config.yaml generated logs reporting that the jobs completed successfully. I want to generate the .html report (looper report), but running looper runp -c looper_84_config.yaml produces this error in the summary/PEPATAC_log.md file. I've checked some of the output files to make sure they are not empty, but I cannot figure out which file read by the R script is causing the error. Can someone please point me toward a way to find the empty or miswritten file?

@donaldcampbelljr
Member

Can you confirm the PEPATAC version you are using as well as the R version?

Are you using the command looper report -c looper_84_config.yaml after running looper run -c looper_84_config.yaml and then looper runp -c looper_84_config.yaml ?

@moor2562
Author

> Can you confirm the PEPATAC version you are using as well as the R version?
>
> Are you using the command looper report -c looper_84_config.yaml after running looper run -c looper_84_config.yaml and then looper runp -c looper_84_config.yaml ?

I'm running PEPATAC v0.12.3 using conda and R v4.3.1. I ran 'looper run -c looper_84_config.yaml' which generated what seem to be non-empty output files, then ran looper runp -c looper_84_config.yaml which led to the above error. Does that help?

@donaldcampbelljr
Member

Some follow-up questions:
How many samples did you run for looper run? Is it only 1 sample?
Do you have one of the sample's results files (stats.yaml) that you could share?

@moor2562
Author

48 samples. Here is an example stats.yaml for a single sample. The sample-level stats.yaml files are present and full of information for all 48 samples. The stats.yaml file in the summary directory, however, is lacking information:
PEPATAC:
  project: {}
  sample: {}
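
One quick way to confirm that a summary file is empty, rather than malformed, is to check which sections under the pipeline key hold no entries. The following is a minimal Python sketch, not code from PEPATAC or looper. The helper name `empty_sections` is hypothetical, and the dictionary mirrors the summary content shown above as a YAML parser would load it.

```python
# The summary stats.yaml shown above, written as the dict a YAML parser
# would return for it (no parser is needed for this sketch).
summary = {"PEPATAC": {"project": {}, "sample": {}}}


def empty_sections(stats, pipeline="PEPATAC"):
    """Return the names of sections under `pipeline` that hold no entries.

    `empty_sections` is a hypothetical helper, not part of PEPATAC or looper.
    """
    return [name for name, content in stats.get(pipeline, {}).items() if not content]


print(empty_sections(summary))  # ['project', 'sample']
```

Both sections come back empty here, which matches the observation that the summary-level file carries no per-sample results even though the sample-level files do.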

@donaldcampbelljr
Member

I believe I was able to recreate it on my end.

I ran looper runp locally and everything appeared to execute, but I noticed this line in the log.md output:

> `Rscript /home/drc/PEPATAC_OCT_2024//tools/pepatac/tools/PEPATAC_summarizer.R /home/drc/PEPATAC_OCT_2024/tools/pepatac/examples/tutorial/tutorial_refgenie_pep_config.yaml /home/drc/PEPATAC_OCT_2024//processed/results_pipeline /home/drc/PEPATAC_OCT_2024//processed/results_pipeline 2 5 1` (707941)
<pre>
Error: Install the "argparser" library before proceeding.
</pre>
Command completed. Elapsed time: 0:00:00. Running peak memory: 0.017GB.  
  PID: 707941;	Command: Rscript;	Return code: 0;	Memory used: 0.017GB


> `Library_complexity`	{'path': '/home/drc/PEPATAC_OCT_2024/processed/results_pipeline/summary/PEPATAC_tutorial_libComplexity.pdf', 'thumbnail_path': '/home/drc/PEPATAC_OCT_2024/processed/results_pipeline/summary/PEPATAC_tutorial_libComplexity.png', 'title': 'Library_complexity', 'annotation': 'PEPATAC'}	_RES_

Though it shows an error, the project level pipeline appears to continue without issue.

I then installed the argparser package in my R environment, and I now see the error you are receiving, along with an additional warning message:

Loading config file: /home/drc/PEPATAC_OCT_2024/tools/pepatac/examples/tutorial/tutorial_refgenie_pep_config.yaml
Creating assets summary...
Summary (n=2): /home/drc/PEPATAC_OCT_2024//processed/results_pipeline/PEPATAC_tutorial_assets_summary.tsv
Creating summary plots...
Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE) : 
  Column 1 passed to [f]order is type 'list', not yet supported.
Calls: <Anonymous> ... lapply -> FUN -> unique -> unique.data.table -> forderv
In addition: Warning message:
In as.data.table.list(yaml_file$PEPATAC$sample[[sample_name]]) :
  Item 1 has 3 rows but longest item has 4; recycled with remainder.
Execution halted

This appears to be the same issue as #291, with a potential solution attempted in #289.

@moor2562
Author

moor2562 commented Mar 19, 2025

I ran install.packages("data.table") in R in my conda env and reran looper runp and received the same error.
looper runp summary log.md https://drive.google.com/file/d/1T16nQJsJjaeDDhV_y3Pu0rQ9en_mb-y4/view?usp=sharing
single sample submission log https://drive.google.com/file/d/1T5NM7LMKBgUoLareNdLoX-jy1d-5V5Oe/view?usp=sharing

@moor2562
Author

My interpretation of the linked issues was that the fix is to update the R version, but I'm already running R 4.3.1. I ran install.packages("argparser") before rerunning looper runp and got the same error as before.

@donaldcampbelljr
Member

I believe it is happening at this line after parsing the file _stats_summary.yaml:

sample_DT <- unique(sample_DT)

Example yaml that causes issue:

PEPATAC:
  project: {}
  sample:
    tutorial1:
      meta:
        pipestat_modified_time: '2025-02-10 14:53:30'
        pipestat_created_time: '2025-02-10 14:51:29'
        history:
          Time:
            '2025-02-10 14:53:30': 0:02:02
          Success:
            '2025-02-10 14:53:30': 02-10-14:53:30
      File_mb: 27
      Read_type: paired
      Genome: hg38
      Raw_reads: '1000000'
      Fastq_reads: 1000000
      Trimmed_reads: 1000000
      Trim_loss_rate: 0.0
      FastQC report r1:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastqc/tutorial1_R1_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r1
        annotation: PEPATAC
      FastQC report r2:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastqc/tutorial1_R2_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r2
        annotation: PEPATAC
      Aligned_reads_rCRSd: 99360.0
      Alignment_rate_rCRSd: 9.94
      Mapped_reads: '900577'
      QC_filtered_reads: 3835
      Aligned_reads: '896742'
      Alignment_rate: 89.67
      Total_efficiency: 89.67
      Mitochondrial_reads: 18
      NRF: 1.0
      PBC1: 1.0
      PBC2: 448366.0
      Unmapped_reads: 63
      Duplicate_reads: '0'
      Dedup_aligned_reads: 896742.0
      Dedup_alignment_rate: 89.67
      Dedup_total_efficiency: 89.67
      NFR_frac: 0.3593
      mono_frac: 0.2362
      di_frac: 0.0647
      tri_frac: 0.0014
      poly_frac: 0.0013
      Read_length: 42
      Genome_size: 3099922541
      Library complexity:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_preseq_plot.pdf
        thumbnail_path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_preseq_plot.png
        title: Library complexity
        annotation: PEPATAC
      Frac_exp_unique_at_10M: 0.9586
      TSS_score: 14.2
      TSS enrichment:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_TSS_enrichment.pdf
        thumbnail_path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_TSS_enrichment.png
        title: TSS enrichment
        annotation: PEPATAC
      Fragment distribution:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_fragLenDistribution.pdf
        thumbnail_path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/QC_hg38/tutorial1_fragLenDistribution.png
        title: Fragment distribution
        annotation: PEPATAC
      Time: 0:02:02
      Success: 02-10-14:53:30
    tutorial2:
      meta:
        pipestat_modified_time: '2025-02-10 14:55:31'
        pipestat_created_time: '2025-02-10 14:53:31'
        history:
          Time:
            '2025-02-10 14:55:31': 0:01:59
          Success:
            '2025-02-10 14:55:31': 02-10-14:55:31
      File_mb: 27
      Read_type: paired
      Genome: hg38
      Raw_reads: '1000000'
      Fastq_reads: 1000000
      Trimmed_reads: 1000000
      Trim_loss_rate: 0.0
      FastQC report r1:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/fastqc/tutorial2_R1_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r1
        annotation: PEPATAC
      FastQC report r2:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/fastqc/tutorial2_R2_trim_fastqc.html
        thumbnail_path: null
        title: FastQC report r2
        annotation: PEPATAC
      Aligned_reads_rCRSd: 100556.0
      Alignment_rate_rCRSd: 10.06
      Mapped_reads: '899373'
      QC_filtered_reads: 4021
      Aligned_reads: '895352'
      Alignment_rate: 89.54
      Total_efficiency: 89.54
      Mitochondrial_reads: 30
      NRF: 1.0
      PBC1: 1.0
      PBC2: 447669.0
      Unmapped_reads: 71
      Duplicate_reads: '0'
      Dedup_aligned_reads: 895352.0
      Dedup_alignment_rate: 89.54
      Dedup_total_efficiency: 89.54
      NFR_frac: 0.3602
      mono_frac: 0.2354
      di_frac: 0.0643
      tri_frac: 0.0015
      poly_frac: 0.0014
      Read_length: 42
      Genome_size: 3099922541
      TSS_score: 12.8
      TSS enrichment:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/QC_hg38/tutorial2_TSS_enrichment.pdf
        thumbnail_path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/QC_hg38/tutorial2_TSS_enrichment.png
        title: TSS enrichment
        annotation: PEPATAC
      Fragment distribution:
        path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/QC_hg38/tutorial2_fragLenDistribution.pdf
        thumbnail_path: /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial2/QC_hg38/tutorial2_fragLenDistribution.png
        title: Fragment distribution
        annotation: PEPATAC
      Time: 0:01:59
      Success: 02-10-14:55:31
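
The nested entries in this example can be listed mechanically. The sketch below is in Python, not the R used by PEPATAC_summarizer.R, and works on a trimmed copy of the tutorial1 record written as a plain dict. `nested_entries` is a hypothetical helper name. It reports every key whose value is itself a mapping, together with its number of sub-keys.

```python
# Trimmed copy of the tutorial1 record from the YAML above, written as a
# plain dict so that no YAML parser is required.
sample = {
    "meta": {
        "pipestat_modified_time": "2025-02-10 14:53:30",
        "pipestat_created_time": "2025-02-10 14:51:29",
        "history": {
            "Time": {"2025-02-10 14:53:30": "0:02:02"},
            "Success": {"2025-02-10 14:53:30": "02-10-14:53:30"},
        },
    },
    "File_mb": 27,
    "Read_type": "paired",
    "Genome": "hg38",
    "FastQC report r1": {
        "path": "/home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastqc/tutorial1_R1_trim_fastqc.html",
        "thumbnail_path": None,
        "title": "FastQC report r1",
        "annotation": "PEPATAC",
    },
}


def nested_entries(record):
    """Map each key whose value is a mapping to its number of sub-keys.

    `nested_entries` is a hypothetical helper used only for illustration.
    """
    return {key: len(value) for key, value in record.items() if isinstance(value, dict)}


print(nested_entries(sample))  # {'meta': 3, 'FastQC report r1': 4}
```

Here `meta` has 3 sub-keys and the file-type result has 4, which seems consistent with the warning "Item 1 has 3 rows but longest item has 4". That link is an inference from this example, not something confirmed against the script.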

@ljmills

ljmills commented Mar 21, 2025

So the pipeline is creating sample-level stats.yaml files that the R script can't read? It seems you ran the tutorial example to get this error, so the problem isn't our dataset but an issue with the R script?

@donaldcampbelljr
Member

Yes, I do not believe it is your dataset. It appears that the absence of the argparser R package in my install was hiding this issue for the project level pipeline. You could remove that package and see if the project level pipeline executes for you without issue (it did for me during testing for the new release, unsure why).

However, we are currently working on an actual solution.

@sanghoonio
Member

The yamlToDT() function does not remove all columns of type list before filtering for unique rows with unique(). This happens when the YAML file contains additional levels of subkeys below the sample level. We can test some changes and have a patch out shortly.
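
The idea of dropping nested columns before de-duplicating can be illustrated outside R. The sketch below is a Python analogue under stated assumptions, not the yamlToDT() patch itself. `scalar_columns` and `unique_rows` are hypothetical helpers, and the two rows are invented stand-ins for parsed sample records. De-duplication fails while a nested value is present and succeeds once only scalar values remain.

```python
def scalar_columns(record):
    """Keep only entries whose value is not a container (dict or list)."""
    return {k: v for k, v in record.items() if not isinstance(v, (dict, list))}


def unique_rows(rows):
    """De-duplicate rows, preserving first-seen order.

    Each row is turned into a hashable key, which only works when every
    value in the row is itself hashable (i.e. a scalar).
    """
    seen = set()
    out = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


# Invented stand-in rows: two identical records, each with a nested "meta".
rows = [
    {"Genome": "hg38", "File_mb": 27, "meta": {"history": {}}},
    {"Genome": "hg38", "File_mb": 27, "meta": {"history": {}}},
]

try:
    unique_rows(rows)
except TypeError as err:
    # Nested values cannot be hashed, so the de-duplication step fails.
    print("de-duplication failed:", err)

flat = unique_rows([scalar_columns(r) for r in rows])
print(flat)  # [{'Genome': 'hg38', 'File_mb': 27}]
```

Filtering out container-valued entries first is one design option. Flattening them into separate scalar columns would be another, at the cost of a wider table.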

@donaldcampbelljr
Member

The dev branch now has a potential fix in commit 7a19e00.

I did need to run install.packages("ragg") in my R environment to solve a new error ("Graphics API version mismatch") popping up later in the script. Solution from here.

@donaldcampbelljr
Member

This should be fixed with 0.12.4.
