This is a Mathematica notebook that contains the genomic sequence of SARS-CoV-2, the virus that causes COVID-19, and calculates its n-th order Markov chain transition probability matrices.
To use it, download the whole folder and open the .nb file in Mathematica. You can also export the matrices as .csv files for use elsewhere.
A first-order Markov chain assumes that the probability of the next state depends only on the current state. A higher-order Markov model builds more “memory” into the chain by conditioning on the previous n states instead. In the .nb file you can change the order by changing the value of n.
See COVID-19_1stOrder and COVID-19_2ndOrder for examples.
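For readers without Mathematica, the idea of an n-th order transition matrix can be sketched in Python. This is only an illustration under stated assumptions, not the notebook's implementation: the function name `transition_matrix`, the `alphabet` parameter, and the `toy` sequence are all hypothetical, and the sequence is assumed to be written with the letters A, C, G, T.

```python
from collections import Counter, defaultdict
from itertools import product


def transition_matrix(sequence, n=1, alphabet="ACGT"):
    """Estimate n-th order Markov chain transition probabilities.

    Returns a dict mapping each n-symbol state to a list of probabilities,
    one per next symbol, in the order given by `alphabet`.
    (Hypothetical helper for illustration; not taken from the notebook.)
    """
    allowed = set(alphabet)
    counts = defaultdict(Counter)

    # Slide a window of n + 1 symbols over the sequence:
    # the first n symbols are the state, the last one is the next symbol.
    for i in range(len(sequence) - n):
        state = sequence[i:i + n]
        nxt = sequence[i + n]
        # Skip windows with symbols outside the alphabet (e.g. ambiguity codes).
        if set(state) <= allowed and nxt in allowed:
            counts[state][nxt] += 1

    matrix = {}
    for state in map("".join, product(alphabet, repeat=n)):
        total = sum(counts[state].values())
        # States that never occur get an all-zero row instead of a guess.
        matrix[state] = [counts[state][s] / total if total else 0.0
                         for s in alphabet]
    return matrix


toy = "ACGTACGGTACA"  # made-up sequence, NOT the viral genome
first = transition_matrix(toy, n=1)   # 4 states: A, C, G, T
second = transition_matrix(toy, n=2)  # 16 states: AA, AC, ..., TT

print(first["A"])    # probabilities of A, C, G, T following A
print(second["AC"])  # probabilities of A, C, G, T following AC
```

Raising n by one multiplies the number of rows by four, so higher orders need a long sequence before every row is based on enough observations.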
General Info: Coronaviruses are members of the family Coronaviridae and have a single-stranded, positive-sense RNA genome packaged in a helical nucleocapsid inside an envelope with a corona-like appearance. Approximately 100 sequences of the SARS-CoV-2 genome have been published, and these suggest there are two types, Type I and Type II: Type II reportedly came from the Huanan market in China, while Type I came from an unknown location (Zhang 2020). The reference genome (NC_045512.2) is 29,903 nucleotides long and shares about 80% sequence identity with SARS viruses (NCBI 2020, Fisher 2020).