import bigwig #5

Open
Ollipolli1909 opened this issue Dec 12, 2024 · 5 comments

@Ollipolli1909

Hi,
I am trying to import a bigwig file into prepareSashimi, but get this:
getGRcoverageFromBw():import_or_null() error:BigWig file not accessible

Not sure why it won't/can't read the file.
Love the plots, just wish it would work for me :) - I'm working with a non-model organism, so I have to start from scratch, with 2 days of error weeding already (gff formats ...)

@jmw86069
Owner

Hi, thanks for the report and the feedback! And sorry for the issues, both with the BigWig and what I gather were some frustrating moments with the GFF formats... I can be more explicit with the assumptions used for GTF/GFF file importing. Hopefully you have that sorted.

Recently, we experienced what we thought were rare errors reading BigWig files. In our case, they were caused by the web server hosting the files. Something about the web server certificate.
We ultimately found a workaround by storing the bigwig files locally.

Where is the bigwig file? Is it on the local machine, or web-accessible? Is there a firewall or VPN involved?
Can you send or describe the filesDF, to make sure the URL for the file is as expected?

Just to cover some basics: since you're working with a non-standard organism, do the chromosome names begin with "chr" (e.g. "chr1"), or are they encoded as "1" or in some other format? I don't think splicejam is specific to "chr". Mostly the GFF chromosome names just need to match the bigwig chromosome names.
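
The name-matching point can be checked with a few lines of base R. This is only a rough sketch: `compare_seqnames()` is a hypothetical helper (not part of splicejam), and the example names are made up. With real data, the two inputs could plausibly come from `GenomeInfoDb::seqlevels()` applied to the imported GFF and bigwig GRanges objects.

```r
# Hypothetical helper (not part of splicejam): report which sequence names
# are shared between the annotation and the coverage file.
compare_seqnames <- function(gff_names, bw_names) {
   gff_names <- unique(as.character(gff_names))
   bw_names <- unique(as.character(bw_names))
   list(
      shared = intersect(gff_names, bw_names),
      gff_only = setdiff(gff_names, bw_names),
      bw_only = setdiff(bw_names, gff_names))
}

# Made-up example: the GFF uses "1", "2" while the bigwig uses "chr1", "chr2"
res <- compare_seqnames(
   gff_names = c("1", "2", "MT"),
   bw_names = c("chr1", "chr2", "MT"))

# Names present in both files, and names found only in the GFF
res$shared
res$gff_only
```

If `shared` comes back empty or much shorter than expected, the naming convention is the first thing to reconcile.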

It may be helpful to use verbose=TRUE in some splicejam functions,
or try:
options("warn"=1) (prints warnings as they occur, which might help show what's happening)
then:
options("warn"=2) (as a follow-up, turns any warning into an error; then you can run traceback() and may get some clue as to where it occurs)
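
As a minimal sketch of that warning workflow in base R: `risky_step()` below is a hypothetical stand-in for whichever call emits the warning, and `tryCatch()` is used only so the example runs to completion. In an interactive session you would instead let the error stop execution and then call `traceback()`.

```r
# Print each warning as it occurs; keep the previous setting for later
old_opts <- options(warn = 1)

# Promote warnings to errors so the failing call can be located
options(warn = 2)

# Hypothetical stand-in for a function that emits a warning
risky_step <- function() {
   warning("file may not be accessible")
   "finished"
}

# With warn=2 the warning is converted to an error, captured here as text
result <- tryCatch(risky_step(), error = function(e) conditionMessage(e))
print(result)

# Restore the previous warning behavior
options(old_opts)
```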

@Ollipolli1909
Author

Hi James,

thanks for your reply. The gff import is working now, I think. At least the GRanges object looks like it works.
Regarding the bw file, I played around a little more and this seems to be working:
test = rtracklayer::import("C:/Users/oberkowitz/OneDrive - LA TROBE UNIVERSITY/NGS_PROJECTS/FI-rep1.chr_sorted.bw")

test
GRanges object with 1337351 ranges and 1 metadata column:
                          seqnames        ranges strand |     score
                             <Rle>     <IRanges>  <Rle> | <numeric>
        [1]                   chr1       1-31450      * |         0
        [2]                   chr1   31451-31550      * |         3
        [3]                   chr1   31551-31600      * |         2
        [4]                   chr1   31601-33050      * |         0
        [5]                   chr1   33051-33200      * |         1
        ...                    ...           ...    ... .       ...
  [1337347] NC_029855.1|mitochon.. 415401-415450      * |        81
  [1337348] NC_029855.1|mitochon.. 415451-415500      * |        54
  [1337349] NC_029855.1|mitochon.. 415501-415550      * |        37
  [1337350] NC_029855.1|mitochon.. 415551-415600      * |         9
  [1337351] NC_029855.1|mitochon.. 415601-415602      * |         0
  -------
  seqinfo: 221 sequences from an unspecified genome

So I'm not sure why getGRcoverageFromBw() is not working. Is there any workaround to get it into the right format? Sorry, but I'm not familiar with GRanges, and learning with a non-model organism doesn't make it easier :(

@jmw86069
Owner

Okay I'll try my best to help troubleshoot, and thanks for your patience (so far)!

I ran through some troubleshooting and noticed (and fixed) a bug in sashimiDataConstants(). I'm not sure how it slipped in, I'll add tests to cover in future. Make sure to update the splicejam package before testing.

Thanks for showing the rtracklayer::import() worked for you, that's a good sign.
I'm a bit at a loss as to where to start, but here is a good starting point:
attach the output of sessionInfo(), so that the versions of R and the relevant packages are included.

I suggest testing the data by running sashimiDataConstants(), something like this:

test_env <- new.env();
test_env <- sashimiDataConstants(envir=test_env, gtf=GTFFILE, verbose=TRUE)

(It should spit out more verbose info than you ever need. Usually it spits out an error when something isn't found - this might be helpful. For example, if the GTF isn't found, or tx2geneDF step failed, etc. Maybe you already checked this part, that's okay too.)

If that succeeds, it should populate test_env with a bunch of R objects used for sashimi plots. You may check a few to make sure they're not empty and are populated with your data. (It's designed to fall back to farrisdata in some cases.)

The head(test_env$tx2geneDF) should have columns: gene_id, gene_name, transcript_id. Make sure the gene you want to use is found in the "gene_name" column.
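
The tx2geneDF check can be sketched as follows. The data.frame here is a made-up stand-in for `test_env$tx2geneDF`, and the gene and transcript names are invented; only the three column names come from the comment above.

```r
# Made-up stand-in for test_env$tx2geneDF; all values are invented
tx2geneDF <- data.frame(
   gene_id = c("g001", "g001", "g002"),
   gene_name = c("GeneA", "GeneA", "GeneB"),
   transcript_id = c("g001.t1", "g001.t2", "g002.t1"),
   stringsAsFactors = FALSE)

# Columns expected according to the comment above
required_cols <- c("gene_id", "gene_name", "transcript_id")
missing_cols <- setdiff(required_cols, colnames(tx2geneDF))

# Hypothetical gene of interest: confirm it appears in gene_name
my_gene <- "GeneA"
gene_found <- my_gene %in% tx2geneDF$gene_name

# Transcripts annotated for that gene
gene_tx <- tx2geneDF$transcript_id[tx2geneDF$gene_name %in% my_gene]
```

An empty `missing_cols` and a `TRUE` for `gene_found` suggest the annotation step produced what the plotting functions expect.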

After that, the filesDF configuration is next to review. Make sure the URL points to your local bigwig file. You can confirm with file.exists(filesDF$url).
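
A small sketch of that file check, under stated assumptions: only the `url` column is mentioned in this thread, so the `sample_id` and `type` columns, the sample names, and the paths below are all illustrative. A temporary empty file stands in for a real bigwig file.

```r
# Empty placeholder standing in for a real local bigwig file
tmp_bw <- tempfile(fileext = ".bw")
file.create(tmp_bw)

# Hypothetical filesDF; column names other than url are assumptions
filesDF <- data.frame(
   sample_id = c("sampleA", "sampleB"),
   type = c("bw", "bw"),
   url = c(tmp_bw, file.path(tempdir(), "missing_file.bw")),
   stringsAsFactors = FALSE)

# Flag local paths that do not exist
filesDF$found <- file.exists(filesDF$url)

# Rows whose file could not be found
filesDF[!filesDF$found, c("sample_id", "url")]
```

Note that `file.exists()` only applies to local paths; a web-hosted URL would need a different check.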

If these steps don't point to the cause, I suggest pasting the minimal commands that reproduce the error.
For me, I usually define gtf, filesDF, then run launchSashimiApp(). Sometimes for custom data, I might manually create tx2geneDF, cdsByTx, exonsByTx, the whole bit. I did that for a hybrid genome that included a transgene on a custom chromosome.

Thank you for your help, I'll do what I can to make it easier to use!

@jmw86069 jmw86069 self-assigned this Dec 17, 2024
@Ollipolli1909
Author

Ollipolli1909 commented Dec 19, 2024 via email

@Ollipolli1909
Author

Ollipolli1909 commented Dec 19, 2024

Just an update: being desperate, I tried launchSashimiApp() - and by some magic everything seems to work, despite all the errors in base R. No idea what is going on and how it pulls the data together when the inputs apparently are not quite what it expects ... One thing missing is the splice junction file, for which I am again not quite sure what the expected format is... Sorry for being stupid ;)

2 participants