Previously known as GrapeTree Extended.
- Description
- Use SPREAD
- Instruction to run it locally
- Server
- Docker
- Videos
- Documentation
- Credits
- License
- Citation
The project is based on GrapeTree, a fully interactive, tree visualization program within EnteroBase, which supports facile manipulations of both tree layout and metadata. Please see more info here.
The idea was to extend the original project in order to conducts spatio-temporal analyses with an integrated geographic information system (GIS) as well as a data visualization system across time. The web application allows users to upload geographic coordinates and temporal data related to each sample and allows to display them reflecting the selection in the tree on the map and vice versa, and reproducing a timelapse visualization both in the map and in the tree.
You can download the project and easly running it locally, but we provide also a ready to use version freely available following link below:
Please note, you can load sensitive data, your dataset is visualised client-side in the browser. No data is transmitted, and no tracking cookies are used. The only data downloaded from the internet are the visualisation (JavaScript) code, fonts and map tiles.
To run the project locally you need to set a web server, node packages http-server or live-server works pretty fine.
- Node js. Download the latest stable release of NodeJS from https://nodejs.org and install it using all the default options.
Now you will be able to install live-server
globally on your machine using the node package manager npm
command line tool, this will allow you to run a web server from anywhere on your computer.
npm install live-server -g
To start a web server, in terminal open the directory containing your static web files and start the server with the following command:
live-server
Download the .zip of the code from this repository, then unzip on a directory DIR_SPREAD
(for example: /tmp/spread
).
Source files in place
Produce your newick (tree.nwk
) and metadata (meta.tsv
) files and copy them inside DIR_SPREAD
, then on terminal:
live-server DIR_SPREAD
Now you will be able to load data by url in your web browser, for example:
http://localhost:8080?tree=tree.nwk&metadata=metadata.tsv
In the project you will find a datasets
folder with some example data to load, you can use url to see them as well:
http://localhost:8080?tree=/datasets/test/tree.nwk&metadata=/datasets/test/metadata.tsv&geo=/datasets/test/points.geojson
The dashboard is able to recognize .geojson
file passed as geo
parameter by query string:
&geo=points.geojson
Alternatively you can integrate in metadata.tsv
longitute and latitude values and pass them as query string parameters in the place of geoJSON. For example, if in the .tsv
you define x
for longitude and y
for latitude, just add &longitude=x&latitude=y
in the url in this way:
&metadata=metadata.tsv&longitude=x&latitude=y
Using our data example:
http://localhost:8080?tree=/datasets/test/tree.nwk&metadata=/datasets/test/metadata.tsv&longitude=x&latitude=y
Important
If you use
longitude
andlatitude
to name coordinates in the.tsv
file, there is no need to pass them as query string parameters, so simply use:&metadata=metadata.tsv
Please note that for the longitude and latitude values, decimal points should be used as separators instead of commas.
You can also directly upload files by dragging them over the tree, over the initial droppable area or using load buttons provided in the UI. In this case you don't need to user parameters in the url. You can drag or load a .nwk
file followed by a .tsv
file containing metadata and optionally a .geojson
file containing geo-saptial information.
Important
.nwk file should be loaded before loading metadata or geoJson files.
Dashboard allows you to download a complete JSON file including metadata and configurations. Generated JSON file can be loaded with the same drag or load functionalities seen previously, this is very useful if you want to save your work or share it.
SPREAD can be used to visualize results generated with ReporTree. Very briefly, ReporTree allows the generation of a series of zooms on clusters by specifying samples of interest and parameters such as thresholds and distances.
In this context, it's possible to indicate to SPREAD the presence of available zooms using the methods described below.
Please note. Here, we describe how to set up and place the needed newick and metadata files for zooms. However, if you produce these files directly with ReporTree, the structure, composition, and positions are already correct.
Here the minimum metadata mandatory schema:
ID | category | MST-21x1.0 | MST-15x1.0 | MST-7x1.0 | MST-4x1.0 |
---|---|---|---|---|---|
00.00000.00.00 | sample of interest | cluster_4 | cluster_11 | cluster_22 | cluster_32 |
Insert the folders containing the zooms (a newick and optionally a related .tsv metadata file) at the same level as the general tree. Following the output generated by ReporTree, the names of folders and files should adhere to the following conventions:
prefix_threshold-definition_cluster-name
: for example,Reportree_MST-7x1.0_cluster_334
prefix_sample-id_closest-name
: for example,Reportree_sample-01_closest5
For a concrete example, the structure should follow this convention:
Reportree.nwk
Reportree_metadata_w_partitions.tsv
Reportree_MST-7x1.0_cluster_335
|__ MST-7x1.0_cluster_335.nwk
|__ MST-7x1.0_cluster_335_metadata_w_partitions.tsv
Reportree_MST-7x1.0_cluster_887
|__ MST-7x1.0_cluster_887.nwk
|__ MST-7x1.0_cluster_887_metadata_w_partitions.tsv
...
Once the zooms folders are moved as shown, it is necessary to specify some parameters, and this can be done in two ways.
There are 2 parameters to specify: zooms
, which is mandatory, and zooms_prefix
, which is optional.
zooms
parameter can have one of the following syntax:zooms=category,sample%20of%20interest@5,7
: the code will find in the general metadata file the clusters for the samples marked as sample of interest for the indicated thresholds.zooms=category,sample%20of%20interest@all
: the code will find in the general metadata file the clusters for the samples marked as sample of interest for all thresholds.zooms=@7,21
: The code will usecategory
andsample of interest
by default to find clusters in the general metadata file for the indicated thresholds.zooms=all
: The code will usecategory
andsample of interest
by default to find clusters in the general metadata file for all thresholds.
Please note. As permitted by ReporTree, different thresholds can be separated by commas (e.g., @5,8,16), but you can also define ranges by specifying them with a hyphen to separate the minimum and maximum (e.g., @5,8,10-20).
zooms_prefix
should define a prefix used for files and folders. This is optional, by default application take it from the.nwk
file name.zooms_prefix=zoom
indicates that folder names and file names start withzoom_
.
At the same level as the zoom folders, alternatively, it's possible to insert a .txt file (for example zooms.txt
) containing a simple list of prefix_threshold-name_cluster-name
for each zoom folder in the full tree root. By using this index, it is possible to cover a second use case, that of closest zooms, and it is also possible to indicate elements in the list composed as follows: prefix_sample-name_closest-name
. To give a more concrete example:
Reportree_MST-7x1.0_cluster_335
Reportree_MST-7x1.0_cluster_510
Reportree_MST-7x1.0_cluster_519
Reportree_MST-15x1.0_cluster_556
Reportree_sample-01_closest5
Reportree_sample-01_closest10
Reportree_sample-02_closest5
Reportree_sample-02_closest10
then add in the URL one of the following parameters:
zooms_list=category,sample%20of%[email protected]
: the code will find in the general metadata file the clusters for the samples marked as sample of interest for zooms indicated byzooms.txt
file.[email protected]
orzooms_list=zooms.txt
: the code will usecategory
andsample of interest
by default to find clusters in the general metadata file for zooms indicated byzooms.txt
file.
Please note. The
prefix
should match the name of the.nwk
file.
You can use SPREAD with the provided server.js
by running the following command:
npm start
This server acts as a proxy, allowing you to download resources using the withProxy
parameter in the query string. For example:
http://localhost:8080/?tree=https://example.com/tree.nwk&withProxy
You can set custom variables to control the proxy:
SERVER_JSON_LIMIT=10mb
SERVER_URLENCODED_LIMIT=5mb
SERVER_MAX_DOWNLOAD_SIZE_KB=1024
SERVER_ALLOWED_DOMAINS_FOR_DOWNLOAD=domain1.com,domain2.com
Example:
Windows:
set SERVER_JSON_LIMIT=2mb && set SERVER_URLENCODED_LIMIT=2mb && set SERVER_MAX_DOWNLOAD_SIZE_KB=1024 && node server.js
Linux:
SERVER_JSON_LIMIT=2mb SERVER_URLENCODED_LIMIT=2mb SERVER_MAX_DOWNLOAD_SIZE_KB=1024 node server.js
You can also specify your proxy:
http://localhost:8080/?tree=https://example.com/tree.nwk&withProxy=https://myproxy.com
SPREAD will use the following format for the proxy URL: https://myproxy.com/download?url={url}
docker pull ghcr.io/genpat-it/spread:latest
docker run -d -p 3000:8080 --name spread ghcr.io/genpat-it/spread:latest
You can surf the spread instance by visiting http://<IP_ADDRESS_OR_HOSTNAME>:3000
in your web browser.
To build the Docker image from the source code, follow these steps:
- Clone the repository:
git clone https://github.com/genpat-it/spread
- Navigate to the cloned directory:
cd spread
- Build the Docker image using the provided Dockerfile. You can also specify the port using the
PORT
build argument:
docker build . -t spread --build-arg PORT=3000
- Once the image is built, you can run the Docker container:
docker run -d spread
This will expose the application running inside the container on port 3000 of your host machine.
-
Video Settings: Watch Video - This video demonstrates the settings configuration in SPREAD.
-
Video Upload: Watch Video - This video shows how to upload data in SPREAD.
- https://cohesive.izs.it/wiki/user/Dashboards/GrapeTree/GrapeTree_extended.html
- https://enterobase.readthedocs.io/en/latest/grapetree/grapetree-about.html
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
If you run SPREAD, please cite publications:
de Ruvo, A., De Luca, A., Bucciacchio, A., Castelli, P., Di Lorenzo, A., Radomski, N., Di Pasquale, A. SPREAD: Spatiotemporal Pathogen Relationships and Epidemiological Analysis Dashboard. Veterinaria Italiana, Vol. 60 No. 4 (2024): Special Issue GeoVet23. https://doi.org/10.12834/VetIt.3476.23846.1
De Luca, A., de Ruvo, A., Di Lorenzo, A., Bucciacchio, A., Di Pasquale, A. GrapeTree integration with spatio-temporal data visualization: a holistic understanding of diseases and the transmission pathways. GeoVet23 (2023). Poster
Also, SPREAD relies on the work of other developers. So, there are other tools that you must cite: