-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
import bigwig #5
Comments
Hi, thanks for the report and the feedback! And sorry for the issues, both with the BigWig and what I gather were some frustrating moments with the GFF formats... I can be more explicit with the assumptions used for GTF/GFF file importing. Hopefully you have that sorted. Recently, we experienced what we thought were rare errors reading BigWig files. In our case, they were caused by the web server hosting the files. Something about the web server certificate. Where is the bigwig file? Is it on the local machine, or web-accessible? Is there a firewall or VPN involved? Just to cover some basics: Since working with non-standard organism, do chromosomes begin "chr1" or are they encoded as "1" or some other format? I don't think splicejam is specific to "chr" - Mostly the GFF chromosome name just needs to match the bigwig chromosome name. It may be helpful to use |
Hi James, thanks for your reply. The gff import is working now, I think. At least the GRanges object looks like it works.
seqinfo: 221 sequences from an unspecified genome So not sure why getGRcoverageFromBw() is not. Is there any workaround to get it into the right format ? Sorry, but not familiar with GRanges, and learning using a nonmodel organism doesn't make it easier :( |
Okay I'll try my best to help troubleshoot, and thanks for your patience (so far)! I ran through some troubleshooting and noticed (and fixed) a bug in Thanks for showing the I suggest testing the data by running test_env <- new.env();
test_env <- sashimiDataConstants(envir=test_env, gtf=GTFFILE, verbose=TRUE) (It should spit out more verbose info than you ever need. Usually it spits out an error when something isn't found - this might be helpful. For example, if the GTF isn't found, or tx2geneDF step failed, etc. Maybe you already checked this part, that's okay too.) If that succeeds, it should populate The After that, the If these steps don't point to the cause, I suggest pasting the minimal commands that reproduce the error. Thank you for your help, I'll do what I can to make it easier to use! |
Hi James,
Thanks again for quick replies. So I have played around a bit – and sorted out a few things. I can generate e.g. and ggplot of the transcripts, like this:
***@***.***
So I kinda assume the general setup is correct – but I am struggling with integration of coverage, using my bigwig file. Trying like this:
prepareSashimi(gene = "COG007_PB.811",
flatExonsByGene = flat_exons_gene,
#minJunctionScore=100,
sample_id = c("test"),
filesDF = filesDF);
seems to lead to a bunch of strange outputs that go over my head (pasted to the end).
And this:
cov = getGRcoverageFromBw(flat_exons_gene,
bwUrls = "C:/Users/oberkowitz/OneDrive - LA TROBE UNIVERSITY/NGS_PROJECTS/Cannabis/PacBio/paper_analyses/highCBD-FI-rep1.chr_sorted.bw",
ddGaps = FALSE,
#gap_feature_type = "gap",
default_feature_type = "exon",
feature_type_colname = "feature_type",
use_memoise = FALSE,
memoise_coverage_path = "coverage_memoise",
do_shiny_progress = FALSE,
verbose = TRUE)
leads to this warning, but no coverage added:
Warning message:
In .local(con, format, text, ...) :
'which' contains seqnames not known to BigWig file: COG007_PB.811, COG010_PB.12775
I somehow wonder if my bigwig is somehow not what is expected ? – I calculated coverage across the chromosomes, just by position with a 50bp window using deepTools
As indicated earlier, importing the bw like this works, so it is not completely wrong.
test = rtracklayer::import("C:/Users/oberkowitz/OneDrive - LA TROBE UNIVERSITY/NGS_PROJECTS/Cannabis/PacBio/paper_analyses/highCBD-FI-rep1.chr_sorted.bw")
Thanks again for your help !
Cheers,
Oliver
Prepare_sashimi output:
$ref2c
$ref2c$transform
function (v)
.approxfun(x, y, v, method, yleft, yright, f, na.rm)
<bytecode: 0x00000197152b58f8>
<environment: 0x000001975649dea8>
$ref2c$inverse
function (v)
.approxfun(x, y, v, method, yleft, yright, f, na.rm)
<bytecode: 0x00000197152b58f8>
<environment: 0x0000019756497ee8>
$ref2c$gr
function (gr)
{
if (length(names(gr)) > 0) {
GRnames <- names(gr)
}
if (all(c("refStart", "refEnd") %in% colnames(GenomicRanges::values(GR)))) {
GenomicRanges::ranges(gr) <- IRanges::IRanges(start = GenomicRanges::values(gr)[,
"refStart"], end = GenomicRanges::values(gr)[, "refEnd"])
}
else {
GenomicRanges::values(gr)[, "refStart"] <- GenomicRanges::start(gr)
GenomicRanges::values(gr)[, "refEnd"] <- GenomicRanges::end(gr)
}
GenomicRanges::ranges(gr) <- IRanges::IRanges(start = ref2compressed(GenomicRanges::start(gr)),
end = ref2compressed(GenomicRanges::end(gr)))
if (length(GRnames) > 0) {
names(gr) <- GRnames
}
return(gr)
}
<bytecode: 0x000001977dc58380>
<environment: 0x0000019757ef1a90>
$ref2c$grl
function (grl)
{
if (length(names(grl)) == 0) {
names(grl) <- jamba::makeNames(rep("grl", length(grl)))
}
grlc1 <- ***@***.***)
grlc <- GenomicRanges::split(grlc1, factor(rep(names(grl),
S4Vectors::elementNROWS(grl)), levels = unique(names(grl))))
grlc
}
<bytecode: 0x000001977dc52b30>
<environment: 0x0000019757ef1a90>
$ref2c$scale_x_grc
function (..., trans = trans_grc)
{
ggplot2::scale_x_continuous(name = "grc", breaks = ggplot2::waiver(),
minor_breaks = ggplot2::waiver(), labels = function(...) {
scales::comma(...)
}, trans = trans, ...)
}
<bytecode: 0x000001977dc68200>
<environment: 0x0000019757ef1a90>
$ref2c$trans_grc
Transformer: compressed_gr [1, 3e+10]
attr(,"lookupCoordDF")
refCoord coord
1 1 -508601.5
16 42333545 -599.0
15 42383545 1.0
8 42384460 916.0
2 42384578 1117.0
9 42385997 2536.0
3 42386097 2737.0
10 42386318 2958.0
4 42386539 3159.0
11 42386694 3314.0
5 42389508 3515.0
12 42389956 3963.0
6 42390080 4164.0
13 42390525 4609.0
7 42390949 4810.0
14 42391290 5151.0
17 42441290 5751.0
18 30000000000 359496455.5
attr(,"gapWidth")
[1] 200
attr(,"gr")
GRanges object with 38 ranges and 3 metadata columns:
seqnames ranges strand | gene_name gene_nameExon
<Rle> <IRanges> <Rle> | <character> <character>
COG007_PB.811_exon7l chr1 42383545-42383553 - | COG007_PB.811 COG007_PB.811_exon7l
COG007_PB.811_exon7k chr1 42383554-42383577 - | COG007_PB.811 COG007_PB.811_exon7k
COG007_PB.811_exon7j chr1 42383578-42383740 - | COG007_PB.811 COG007_PB.811_exon7j
COG007_PB.811_exon7i chr1 42383741-42383742 - | COG007_PB.811 COG007_PB.811_exon7i
COG007_PB.811_exon7h chr1 42383743 - | COG007_PB.811 COG007_PB.811_exon7h
... ... ... ... . ... ...
COG007_PB.811_exon1e chr1 42391232-42391235 - | COG007_PB.811 COG007_PB.811_exon1e
COG007_PB.811_exon1d chr1 42391236-42391242 - | COG007_PB.811 COG007_PB.811_exon1d
COG007_PB.811_exon1c chr1 42391243 - | COG007_PB.811 COG007_PB.811_exon1c
COG007_PB.811_exon1b chr1 42391244-42391262 - | COG007_PB.811 COG007_PB.811_exon1b
COG007_PB.811_exon1a chr1 42391263-42391290 - | COG007_PB.811 COG007_PB.811_exon1a
feature_type
<character>
COG007_PB.811_exon7l exon
COG007_PB.811_exon7k exon
COG007_PB.811_exon7j exon
COG007_PB.811_exon7i exon
COG007_PB.811_exon7h exon
... ...
COG007_PB.811_exon1e exon
COG007_PB.811_exon1d exon
COG007_PB.811_exon1c exon
COG007_PB.811_exon1b exon
COG007_PB.811_exon1a exon
-------
seqinfo: 59 sequences from an unspecified genome; no seqlengths
$some_null
[1] FALSE
$df
x y gr sample_id type name
1 42383545 0 COG007_PB.811_exon7l test coverage COG007_PB.811_exon7l test+ test
3 42383545 7 COG007_PB.811_exon7l test coverage COG007_PB.811_exon7l test+ test
5 42383550 7 COG007_PB.811_exon7l test coverage COG007_PB.811_exon7l test+ test
Cheers,
Oliver
From: James Ward ***@***.***>
Sent: Tuesday, 17 December 2024 12:35
To: jmw86069/splicejam ***@***.***>
Cc: Oliver Berkowitz ***@***.***>; Author ***@***.***>
Subject: Re: [jmw86069/splicejam] import bigwig (Issue #5)
Okay I'll try my best to help troubleshoot, and thanks for your patience (so far)!
I ran through some troubleshooting and noticed (and fixed) a bug in sashimiDataConstants(). I'm not sure how it slipped in, I'll add tests to cover in future. Make sure to update the splicejam package before testing.
Thanks for showing the rtracklayer::import() worked for you, that's a good sign.
I'm a bit at a loss where to start and how, but a good starting point.
Attach the sessioninfo(), to make sure versions of R and whatnot are included.
I suggest testing the data by running sashimiDataConstants(), something like this:
test_env <- new.env();
test_env <- sashimiDataConstants(envir=test_env, gtf=GTFFILE, verbose=TRUE)
(It should spit out more verbose info than you ever need. Usually it spits out an error when something isn't found - this might be helpful. For example, if the GTF isn't found, or tx2geneDF step failed, etc. Maybe you already checked this part, that's okay too.)
If that succeeds, it should populate test_env with a bunch of R objects used for sashimi plots. You may check a few to make sure they're not empty, and populated with your data. (It's designed to fallback to use farrisdata in some cases.)
The head(test_env$tx2geneDF) should have columns: gene_id, gene_name, transcript_id. Make sure the gene you want to use is found in the "gene_name" column.
After that, the filesDF configuration is next to review. Make sure the URL is formatted to your local bigwig file. You can confirm with file.exists(filesDF$url).
If these steps don't point to the cause, I suggest pasting the minimal commands that reproduce the error.
For me, I usually define gtf, filesDF, then run launchSashimiApp(). Sometimes for custom data, I might manually create tx2geneDF, cdsByTx, exonsByTx, the whole bit. I did that for a hybrid genome that included a transgene on a custom chromosome.
Thank you for your help, I'll do what I can to make it easier to use!
—
Reply to this email directly, view it on GitHub<#5 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AG7DGRGPYNR47KQ5CHNEZQL2F55VTAVCNFSM6AAAAABTPDTXM6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNBXGMYTEMZSHA>.
You are receiving this because you authored the thread.Message ID: ***@***.******@***.***>>
La Trobe University | TEQSA PRV12132 - Australian University | CRICOS Provider 00115M
|
just an update: being desperate I tried launchSashimiApp() - and by some magic everything seems to work, despite all the errors in base R. No idea what is going and how it pulls the data together when the inputs apparently are not quite what it expects ... One thing missing is the splice junction file, for which I am again not quite sure what the expected format is... Sorry, for being stupid ;) |
Hi,
i am trying to import a bigwig file to prepareSashimi, but get this:
getGRcoverageFromBw():import_or_null() error:BigWig file not accessible
Not sure why i won't/can't read the file.
Love the plots, just wished it would work for me :) - working with nonmodel organism so have to start from scratch with 2 days of error weeding already (gff formats ...)
The text was updated successfully, but these errors were encountered: