A Python program for deconvolution of spatial transcriptomics data and inference of spatial communication
NODE is a Python package for spatial transcriptomics data. It combines an optimization search model with a non-negative least squares problem, using scRNA-Seq data to deconvolute spatial transcriptomics data and infer spatial communication. In deconvolution, NODE infers the number and type of cells in spatial transcriptomics data by referring to single-cell data. In inference of spatial communication, NODE infers the information flow in space.
NODE requires users to provide four types of data for deconvolution and inference of spatial communication. First, download NODE and unzip it locally. You can then use sys (a Python standard library module) to locate the NODE directory and import it. The specific form of the data is shown below:
sc_data must carry the gene names and cell names as part of the matrix; the (0, 0) entry can hold any name, such as 'gene' or 'name'. Rows of sc_data represent genes and columns represent cells, so each column holds the gene expression of one cell. The gene names of sc_data must correspond to the gene names of st_data.
sc_data is shown below:
name cell1 cell2
gene1 5.0 3.0
gene2 2.0 0.0
st_data must carry the gene names and spot names as part of the matrix; the (0, 0) entry can hold any name, such as 'spot' or 'name'. Rows of st_data represent genes and columns represent spots, so each column holds the gene expression of one spot. The gene names of st_data must correspond to the gene names of sc_data.
st_data is shown below:
spot spot1 spot2
gene1 4.0 8.0
gene2 0.0 2.0
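Because the gene names of sc_data and st_data must correspond, it can help to restrict both tables to their shared genes before converting them for NODE. The following is a minimal sketch using pandas, not part of NODE itself: align_genes is a hypothetical helper, the toy values extend the examples above, and whether NODE also requires the genes to appear in the same order is an assumption made here for safety.

```python
import pandas as pd

# Toy tables: genes as rows, cells or spots as columns.
sc_df = pd.DataFrame(
    {"cell1": [5.0, 2.0, 1.0], "cell2": [3.0, 0.0, 4.0]},
    index=["gene1", "gene2", "gene3"],
)
st_df = pd.DataFrame(
    {"spot1": [0.0, 4.0], "spot2": [2.0, 8.0]},
    index=["gene2", "gene1"],  # deliberately in a different order
)


def align_genes(sc, st):
    """Keep only genes present in both tables, in the order of `sc`.

    Hypothetical helper for illustration; not a NODE function.
    """
    shared = sc.index[sc.index.isin(st.index)]
    if len(shared) == 0:
        raise ValueError("sc_data and st_data share no gene names")
    return sc.loc[shared], st.loc[shared]


sc_aligned, st_aligned = align_genes(sc_df, st_df)
print(list(sc_aligned.index))
print(list(st_aligned.index))
```

After alignment both tables list the same genes in the same order, so the gene column of one can be compared directly with the other.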
cell_type must carry each cell's name and its cell type. The first line of cell_type is a title row containing the name and cell type headings. Each following row represents one cell. The name column of cell_type must correspond to the first row of sc_data, with identical cell names.
cell_type is shown below:
name celltype
cell1 type1
cell2 type2
st_coordinate must carry each spot's name and its coordinates. The first line of st_coordinate is a title row containing 'spot', 'x', and 'y'. Each following row represents one spot. The spot column of st_coordinate must correspond to the first row of st_data, with identical spot names.
st_coordinate is shown below:
spot x y
spot1 8 20
spot2 10 12
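The correspondence rules described for the four inputs can be checked before running the deconvolution. The sketch below is not part of NODE: check_inputs is a hypothetical helper, the arrays reuse the toy values shown above in the title-carrying layout, and requiring identical name order is a stricter assumption than the text states.

```python
import numpy as np

# Toy inputs; the first row and first column carry the titles and names.
sc_data = np.array([["name", "cell1", "cell2"],
                    ["gene1", 5.0, 3.0],
                    ["gene2", 2.0, 0.0]], dtype=object)
st_data = np.array([["spot", "spot1", "spot2"],
                    ["gene1", 4.0, 8.0],
                    ["gene2", 0.0, 2.0]], dtype=object)
cell_type = np.array([["name", "celltype"],
                      ["cell1", "type1"],
                      ["cell2", "type2"]], dtype=object)
st_coordinate = np.array([["spot", "x", "y"],
                          ["spot1", 8, 20],
                          ["spot2", 10, 12]], dtype=object)


def check_inputs(sc_data, st_data, cell_type, st_coordinate):
    """Return a list of naming problems (empty if none were found).

    Hypothetical helper for illustration; not a NODE function.
    """
    problems = []
    # Gene names: first column of both expression matrices, title cell skipped.
    if list(sc_data[1:, 0]) != list(st_data[1:, 0]):
        problems.append("gene names differ between sc_data and st_data")
    # Cell names: name column of cell_type vs. first row of sc_data.
    if list(cell_type[1:, 0]) != list(sc_data[0, 1:]):
        problems.append("cell names differ between cell_type and sc_data")
    # Spot names: spot column of st_coordinate vs. first row of st_data.
    if list(st_coordinate[1:, 0]) != list(st_data[0, 1:]):
        problems.append("spot names differ between st_coordinate and st_data")
    return problems


print(check_inputs(sc_data, st_data, cell_type, st_coordinate))
```

Returning a list of messages rather than raising on the first mismatch lets all naming issues be reported in one pass.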
Here, we provide example.py for users to test and refer to. Note that the data it uses is generated by NODE and is test data with no real meaning.
pip install NODE-deconvolution
Once the user has prepared the above four sets of data, the deconvolution can be performed with the get_deconvolution function. The method_optimize parameter has two options, 1 and 2, with method 1 being faster and method 2 having tighter control over conditions.
import NODE_deconvolution as nd
from NODE_deconvolution.NODE import get_test_data
st_data, sc_data, cell_type, st_pixel = get_test_data(10, 2000, 800, 10)
result_data, result_data_normalized, W_interaction = nd.get_deconvolution(
    st_data=st_data,
    sc_data=sc_data,
    cell_type=cell_type,
    st_coordinate=st_pixel,
    method_optimize=1,
    prossecing_reserve=False,
    file_path='',
    Number_of_iterations=500)
# If necessary, users can print the returned data to view it.
# get_deconvolution returns the deconvolution result and the spatial communications.
In analysed, we provide the code for our algorithms. The data can be obtained from data_available.txt or https://node-deconvolution.sourceforge.io. You can clone the repository locally, download the data, change the paths, and run the algorithms to validate them.
Here we provide our results analysis files, where you can view the intermediate files generated during our analysis, as well as the final graphic files.
# pip install NODE-deconvolution
import NODE_deconvolution as nd
import numpy as np
import pandas as pd
# Make sure to replace the placeholder file paths in the provided scripts with the correct paths to your local data files.
# Read the four space-separated text files.
st_data = pd.read_csv('.../example_data/test_data/st_data.txt', sep=' ')
sc_data = pd.read_csv('.../example_data/test_data/sc_data.txt', sep=' ')
st_coordinate = pd.read_csv('.../example_data/test_data/st_pixel.txt', sep=' ')
cell_type = pd.read_csv('.../example_data/test_data/cell_type.txt', sep=' ')
# Convert each DataFrame to a NumPy array whose first row holds the column titles.
st_data = np.vstack((np.array(st_data.columns).reshape(1, -1), st_data.values))
sc_data = np.vstack((np.array(sc_data.columns).reshape(1, -1), sc_data.values))
st_coordinate = np.vstack((np.array(st_coordinate.columns).reshape(1, -1), st_coordinate.values))
cell_type = np.vstack((np.array(cell_type.columns).reshape(1, -1), cell_type.values))
result_data, result_data_normalized, W_interaction = nd.get_deconvolution(
    st_data=st_data,
    sc_data=sc_data,
    cell_type=cell_type,
    st_coordinate=st_coordinate,
    method_optimize=1,
    prossecing_reserve=False,
    file_path='',
    Number_of_iterations=500)
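The four np.vstack conversion lines above repeat the same pattern, so they can be wrapped in one small function. This is only a sketch: with_header_row is a hypothetical helper, not part of NODE, and the in-memory text below stands in for a real file so the example runs without the example data.

```python
import io

import numpy as np
import pandas as pd


def with_header_row(df):
    """Stack the column titles on top of the values as one object array.

    Hypothetical helper for illustration; not a NODE function.
    """
    header = np.array(df.columns, dtype=object).reshape(1, -1)
    return np.vstack((header, df.values))


# Stand-in for a space-separated st_coordinate file.
text = "spot x y\nspot1 8 20\nspot2 10 12\n"
coordinate_df = pd.read_csv(io.StringIO(text), sep=" ")

st_coordinate = with_header_row(coordinate_df)
print(st_coordinate.shape)
print(st_coordinate[0].tolist())
```

The same call can then be applied to each of the four DataFrames in place of the repeated np.vstack expressions.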
For more details, please refer to the provided example data and code:
Test Data: See example_data.zip for the sample dataset.
Example Code: Refer to example.ipynb for a step-by-step guide on how to use the provided scripts and analyze the data.