Charges in raw arrays #20

Open
RienLeuvenink opened this issue Jan 10, 2025 · 5 comments

Comments

@RienLeuvenink

Hi,

I am trying to look at some ion data, but I can't seem to find the charges. I've included an example spectrum below that shows annotated charges when I open the raw file in FreeStyle. I assume this is encoded in the raw file and not something that FreeStyle makes up.
Below is how I access the file. Maybe I'm just overlooking something.

```rust
let mut ms_file =
    MZReader::open_path(r"D:\MS data\2025-01 January\ETD test\MS1_example.raw").unwrap();

dbg!(ms_file
    .get_spectrum_by_index(0)
    .unwrap()
    .raw_arrays()
    .unwrap());
```

https://filesender.surf.nl/?s=download&token=168ebd63-923e-49f3-8424-b65c3a5e942c

@mobiusklein
Owner

mobiusklein commented Jan 10, 2025 via email

@RienLeuvenink
Author

Thanks for your quick response. A collaborator of mine uses pymsfilereader for Python, and I've been told that he can access the charge states through it. I see that you were a collaborator on it too. I also see that the package uses MSFileReader instead of RawFileReader; maybe this is where the difference stems from?

Kind regards,
Rien

@mobiusklein
Owner

Yes, the old Thermo MSFileReader C++/COM library included both a peak picking and a charge deconvolution algorithm, but neither is easily configurable.

RawFileReader does seem to include a few charge state estimators. One is based upon the THRASH algorithm from Horn et al. and seems to work with the peak data stream directly, while others appear to be tied to some other GUI-facing application and don't have a clear way of interacting with the CentroidStream API.

Additionally, your file and some of my newer test files seem to have charge annotations, but they don't include isotopic pattern assignments so it would effectively say "this peak is predicted to be this charge state, but whether it is a monoisotopic peak is unknown". This is representable with mzpeaks::DeconvolutedPeak, but it's not what it's meant for, semantically. If this is still desirable, I can add the plumbing to thermorawfilereader to surface this information.
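
To make that caveat concrete, here is a minimal sketch of what a charge-only annotation does and does not tell you. The `ChargedPeak` struct and its `neutral_mass` helper are hypothetical stand-ins written for illustration, not types from mzpeaks. The conversion assumes positive-mode protonation, and the resulting mass is only the monoisotopic mass if the peak happens to be the monoisotopic one, which is exactly the piece of information the annotation lacks.

```rust
/// Proton mass in daltons (rounded).
const PROTON: f64 = 1.007276466;

/// Hypothetical stand-in for a centroid carrying a vendor charge annotation.
/// This is NOT an mzpeaks type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ChargedPeak {
    mz: f64,
    intensity: f32,
    charge: i32,
}

impl ChargedPeak {
    /// Neutral mass implied by m/z and charge, assuming protonated ions.
    /// Returns `None` when no positive charge is available.
    /// Nothing here says which isotopologue the peak is.
    fn neutral_mass(&self) -> Option<f64> {
        if self.charge > 0 {
            Some((self.mz - PROTON) * self.charge as f64)
        } else {
            None
        }
    }
}

fn main() {
    let peak = ChargedPeak { mz: 1000.5, intensity: 1.0e5, charge: 2 };
    println!(
        "m/z {} (intensity {}) -> neutral mass {:?}",
        peak.mz,
        peak.intensity,
        peak.neutral_mass()
    );
}
```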

@RienLeuvenink
Author

If it's possible from your side, I'd be keen to try it out and compare it to what FreeStyle tells me. In my experience, and for the type of proteins we're looking at, FreeStyle does quite a decent job of assigning charge states etc.

@mobiusklein
Owner

I've added support for extracting the charge array to the main branch:

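use std::io;

// Assumed import path; the Thermo reader requires mzdata's `thermo` feature.
use mzdata::io::thermo::ThermoRawReader;
use mzdata::prelude::*;

fn main() -> io::Result<()> {
    let mut reader = ThermoRawReader::open_path("./test/data/small.RAW")?;
    // Request centroided peaks and the extended per-peak data (charge, etc.)
    reader.set_centroiding(true);
    reader.set_load_extended_spectrum_data(true);

    let spec = reader.get_spectrum_by_index(0).unwrap();
    let arrays = spec.arrays.as_ref().unwrap();
    // Charge annotations as reported by RawFileReader
    let z_array = arrays.charges().unwrap();

    Ok(())
}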
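
Once the charge array is in hand, it can be walked alongside the m/z array. The following is a minimal sketch using plain slices as stand-ins for the decoded arrays; `annotated_peaks` is a hypothetical helper, and treating a charge of 0 as "no charge assigned" is an assumption about the data rather than documented behavior.

```rust
/// Pair each m/z value with its charge, keeping only peaks with a non-zero charge.
/// Assumption: a charge of 0 means no charge state was assigned to that peak.
fn annotated_peaks(mzs: &[f64], charges: &[i32]) -> Vec<(f64, i32)> {
    mzs.iter()
        .copied()
        .zip(charges.iter().copied())
        .filter(|&(_, z)| z != 0)
        .collect()
}

fn main() {
    // Made-up values standing in for the decoded m/z and charge arrays.
    let mzs = [500.25, 500.75, 812.40];
    let charges = [2, 2, 0];
    for (mz, z) in annotated_peaks(&mzs, &charges) {
        println!("{mz:.4}\t{z}");
    }
}
```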

Again, this charge array doesn't deisotope the spectrum. I think it just does a variation of the Patterson charge state determination method like what's shown in the Horn et al. paper to determine the charge state from peak spacing, and may be incorrect when you have interleaved isotopic peaks.
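
As a rough illustration of why spacing-based charge assignment can go wrong, here is a minimal sketch of a nearest-neighbor spacing heuristic. It is a deliberately simplified stand-in, not the Patterson routine and not whatever RawFileReader actually runs; `charge_from_spacing`, its tolerance, and the peak values are all made up for the example. Isotopic peaks of a charge-z ion sit roughly 1.00335/z apart, so the gap to the next peak suggests a charge, and a peak from another species landing inside the envelope changes that gap.

```rust
/// Mass difference between 13C and 12C in daltons.
const ISOTOPE_SPACING: f64 = 1.0033548;

/// Guess a charge from the gap between a peak and its next neighbor:
/// the first z in 1..=max_charge whose expected spacing matches within `tol`.
fn charge_from_spacing(mz: f64, next_mz: f64, max_charge: i32, tol: f64) -> Option<i32> {
    let gap = next_mz - mz;
    if gap <= 0.0 {
        return None;
    }
    (1..=max_charge).find(|&z| (gap - ISOTOPE_SPACING / z as f64).abs() <= tol)
}

fn main() {
    // Clean z = 2 envelope: neighbors are about 0.5017 apart.
    println!("{:?}", charge_from_spacing(500.0, 500.5017, 6, 0.005));

    // Same envelope with an unrelated peak interleaved at 500.25:
    // the gap is now 0.25, which looks like z = 4.
    println!("{:?}", charge_from_spacing(500.0, 500.25, 6, 0.005));
}
```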

The change also includes other data arrays for FT spectra including a signal-to-noise-ratio array and a raw baseline signal array. I am just pulling what's available from the GetAdvancedPacketData function at https://github.com/mobiusklein/thermorawfilereader.rs/blob/main/librawfilereader/Lib.cs#L736-L840.
