Converting SARGE to Tskit? #1879
-
@nkschaefer just published an ARG of humans, Denisovans, and Neanderthals that was inferred with SARGE (https://github.com/nkschaefer/sarge) rather than tskit. Is there any way to convert a SARGE ARG into a tskit ARG?
Replies: 3 comments 3 replies
-
Not that I know of, @swamidass; I don't think anyone has looked at this. Ideally SARGE would support converting to tskit itself, rather than us trying to reverse engineer things. It would probably be a pretty straightforward utility program using our C API. Maybe we should open an issue on their repo to ask for this feature?
-
Opening an issue with him is a good idea. I suspect open lines of communication between @nkschaefer and the tskit team would be ideal. It does seem possible to convert SARGE files to a sequence of Newick trees. From Nick regarding the SARGE codebase:
The problem, however, is that I'm not sure this will produce a tree sequence that meets all your constraints. It also seems that the distribution of mutations on the trees may be lost. As a reference point, SARGE was used to calculate the ARG in this paper: https://www.science.org/doi/10.1126/sciadv.abc0776 I am honestly curious whether tsinfer produces a comparable ARG. Using ancient DNA seems to make the analysis more complex, though, and I'm not sure whether or how tsinfer handles the higher error rates, etc. If possible, perhaps they'd be willing to share the input data they used, so tsinfer could be run on it too.
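To make the Newick route a bit more concrete, here is a minimal Python sketch of how a sequence of per-interval Newick trees could be flattened into edge rows of the form (left, right, parent, child), which are the columns tskit's edge table uses. It is not SARGE or tskit code and uses only the standard library. Several things are assumptions: that each tree comes with known genomic coordinates (what SARGE actually emits has not been checked here), that a node counts as "the same" node in adjacent trees whenever it subtends the same set of leaves, and the function names `newick_edges` and `trees_to_edge_rows`, which are made up for illustration. Node times and mutations are not handled, so this alone would not satisfy tskit's requirements.

```python
import re

# Tokens: structural characters, ":<branch length>", or a node label.
TOKEN = re.compile(r"\s*([(),;]|:[^(),;]+|[^(),;:\s]+)")
STRUCTURAL = {"(", ")", ",", ";"}


def newick_edges(newick):
    """Return (parent_clade, child_clade) pairs for one Newick tree.

    Each clade is a frozenset of leaf names. Branch lengths and internal
    node labels are skipped, so no node times are recovered.
    """
    tokens = TOKEN.findall(newick)
    pos = 0
    edges = []

    def subtree():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1  # consume "("
            children = [subtree()]
            while tokens[pos] == ",":
                pos += 1
                children.append(subtree())
            pos += 1  # consume ")"
            clade = frozenset().union(*children)
            edges.extend((clade, child) for child in children)
            # Skip an optional internal node label.
            if (pos < len(tokens) and tokens[pos] not in STRUCTURAL
                    and not tokens[pos].startswith(":")):
                pos += 1
        else:
            clade = frozenset([tokens[pos]])  # leaf name
            pos += 1
        if pos < len(tokens) and tokens[pos].startswith(":"):
            pos += 1  # skip branch length
        return clade

    subtree()
    return edges


def trees_to_edge_rows(trees):
    """Flatten [(left, right, newick), ...] (sorted, abutting) into edge rows.

    Returns (node_ids, rows): node_ids maps clade -> arbitrary integer ID,
    rows is a list of [left, right, parent_id, child_id]. An edge present
    in consecutive trees is stored once with an extended interval.
    """
    node_ids = {}
    last_row = {}  # (parent_id, child_id) -> most recent row for that edge
    rows = []
    for left, right, newick in trees:
        for parent, child in newick_edges(newick):
            p = node_ids.setdefault(parent, len(node_ids))
            c = node_ids.setdefault(child, len(node_ids))
            row = last_row.get((p, c))
            if row is not None and row[1] == left:
                row[1] = right  # edge continues from the previous tree
            else:
                row = [left, right, p, c]
                last_row[(p, c)] = row
                rows.append(row)
    return node_ids, rows


if __name__ == "__main__":
    # Hypothetical input: two adjacent local trees over four samples.
    demo = [(0, 10, "((A,B),(C,D));"), (10, 20, "(((A,B),C),D);")]
    ids, edge_rows = trees_to_edge_rows(demo)
    for r in edge_rows:
        print(r)
```

Matching nodes by leaf set is the simplest way to share nodes between trees, but it is only a heuristic: two clades with the same leaves in different trees need not be the same ancestor, which is one reason a converter written against SARGE's own data structures would likely be more faithful.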
-
It also looks like he has code to convert tskit trees into SARGE. I wonder if that gives a good enough hook to work with... https://github.com/nkschaefer/sarge/blob/master/utilities/sarge_tsinfer.py