F4-statistics from unlinked SNPs of SNP array #3

Open
ghost opened this issue Sep 19, 2021 · 4 comments

Comments

@ghost

ghost commented Sep 19, 2021

I have a question about using this tool: can it calculate F4-statistics on my data, which consist of unlinked SNPs from a SNP array? Will the simulations by fastsimcoal2 (as run by F4.py) be affected by this?

@mmatschiner
Owner

Hi,
I guess with SNP array data, D- or F4-statistics could be influenced by how the SNPs were originally selected for the array. I assume this was not done randomly, but with variability in the species/population in mind? Besides that, yes, the F4 tool can calculate the F4 statistic from such data. But the statistic itself shouldn't be any different if you calculate it with a tool like Dsuite, and the latter would be much faster. The F4 tool might only report a different p-value, because the p-value is what the simulations are used for. In your case, I would probably just run Dsuite now, and if you worry that the p-value might be affected by the jackknifing method, then you could also run the F4 tool.
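
To illustrate why the statistic itself should not depend on the tool, here is a minimal sketch of the usual allele-frequency form of f4, the mean over SNPs of (a - b) * (c - d). The function name `f4_statistic` and the example frequencies are made up for illustration; this is not code taken from F4.py or Dsuite, and either tool may differ in details such as missing-data handling.

```python
def f4_statistic(p_a, p_b, p_c, p_d):
    """Mean over SNPs of (a - b) * (c - d) for four populations A, B, C, D.

    Each argument is a list with the frequency of the same allele at each SNP.
    """
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("need four equally long, non-empty frequency lists")
    return sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


# Hypothetical allele frequencies at three SNPs (illustrative values only).
freqs_a = [0.1, 0.5, 0.9]
freqs_b = [0.3, 0.5, 0.7]
freqs_c = [0.2, 0.4, 0.8]
freqs_d = [0.6, 0.4, 0.6]

print(f4_statistic(freqs_a, freqs_b, freqs_c, freqs_d))
```

Nothing in this calculation uses SNP positions or linkage, so only the significance assessment (jackknife versus simulation) is expected to differ between approaches.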

@ghost
Author

ghost commented Sep 19, 2021

Thank you for the prompt response, even on a Sunday (I really appreciate it). Apart from the SNP array, I also have WGS SNPs, so there I could use random SNPs located far apart along the genome. And yes, indeed, I was worried about how different jackknife block sizes affect the Z-values, which is why I wanted to use this tool! My only worry was whether the simulation parameters used by fastsimcoal2 (in F4.py) are specific to RAD-seq (such as SNPs generated in blocks).
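
The block-size concern can be sketched with a simplified delete-one-block jackknife. This is an assumption-laden toy, not the procedure of Dsuite or any other specific tool: blocks contain an equal number of SNPs, trailing SNPs that do not fill a block are dropped, and the function name `jackknife_z` and the simulated values are hypothetical.

```python
import math
import random


def jackknife_z(per_snp_values, block_size):
    """Z-score of the mean of per_snp_values from a delete-one-block jackknife."""
    n_blocks = len(per_snp_values) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")
    # Simplification: SNPs beyond the last full block are ignored.
    values = per_snp_values[: n_blocks * block_size]
    total = sum(values)
    estimate = total / len(values)
    block_sums = [
        sum(values[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)
    ]
    # Re-estimate the mean with each block left out in turn.
    leave_one_out = [(total - s) / (len(values) - block_size) for s in block_sums]
    loo_mean = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - loo_mean) ** 2 for x in leave_one_out
    )
    if variance == 0:
        raise ValueError("zero jackknife variance")
    return estimate / math.sqrt(variance)


# Toy data: runs of 10 identical per-SNP values, mimicking tightly linked SNPs.
rng = random.Random(1)
linked = []
for _ in range(200):
    shared = rng.gauss(0.01, 0.05)
    linked.extend([shared] * 10)

for size in (1, 10, 50):
    print(size, jackknife_z(linked, size))
```

In this toy setup, blocks smaller than the linked runs treat correlated SNPs as independent and so yield a larger absolute Z-value than blocks that match the run length.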

@mmatschiner
Owner

With WGS SNPs, I probably would not worry too much about linkage in the calculation of D or F4. You could of course try both with and without thinning the dataset, but I wouldn't expect much difference (unless, for example, a gigantic inversion region has a large influence). But the F4 tool is definitely applicable to data other than RAD-seq data. In that sense, SNP array data should be fine.
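
For the "with and without thinning" comparison, a minimal sketch of distance-based thinning might look like the following. The function name `thin_snps`, the (chromosome, position) tuple format, and the 5000 bp threshold are assumptions made for illustration; in practice one would more likely use an established VCF tool for this step.

```python
def thin_snps(snps, min_distance):
    """Keep a SNP only if it lies at least min_distance bp after the
    most recently kept SNP on the same chromosome."""
    kept = []
    last_kept = {}  # chromosome -> position of the most recently kept SNP
    for chrom, pos in sorted(snps):
        if chrom not in last_kept or pos - last_kept[chrom] >= min_distance:
            kept.append((chrom, pos))
            last_kept[chrom] = pos
    return kept


# Hypothetical SNP coordinates (illustrative values only).
snps = [
    ("chr1", 100), ("chr1", 600), ("chr1", 5200),
    ("chr2", 50), ("chr2", 4000), ("chr2", 9100),
]
thinned = thin_snps(snps, 5000)
print(thinned)
```

The statistic could then be calculated once on `snps` and once on `thinned` to check whether the results differ noticeably.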

@ghost
Author

ghost commented Sep 19, 2021

Okay, thank you again for the prompt response! I will try this tool as well as Dsuite on both my WGS and SNP array data.
