
# fmriFlow

fMRI Data Analysis Workflows with Apache Spark and Thunder.

## User Guide

This section explains, step by step, how to install the dependencies and run an fmriFlow application.

### Apache Spark Installation

1. Download Spark from the official website: http://spark.apache.org/downloads.html
2. Extract the archive: `tar -zxvf spark-1.6.0-bin-hadoop2.6.tgz`
3. Set the environment variables:
   1. Open `~/.bashrc` with your favorite editor, e.g. `nano ~/.bashrc`
   2. Append the following two lines:

      ```
      export SPARK_HOME=<the path where Spark was extracted in step 2>
      export PATH=$PATH:$SPARK_HOME/bin
      ```
### Thunder Installation

1. `git clone [email protected]:thunder-project/thunder.git`
2. `cd thunder`
3. `python setup.py install`
4. `cd .. && rm -rf thunder`

### fmriFlow Installation

Just clone this repository: `git clone [email protected]:gsvic/fmriFlow.git`

### Run the provided example

To run an application, you just need to define the workflow in a Python file and submit it to Spark. To run the provided `test.py`, type `spark-submit test.py`. This example uses sample input data from the Thunder project.

### Define and execute a Workflow

A new workflow can be defined in a Python script, as in the example above. In detail:

1. Define the workflow by providing a Spark context and an input path (a `.nii` file): `flow1 = Workflow(datapath, sc)`
2. Add some operators: `flow1 = Workflow(datapath, sc).extract().clustering(k=5).visualize()`
3. Execute the workflow: `flow1.execute()`
4. Or print the execution plan: `print flow1.explain()`

Currently the available operators are:

- `extract()`: extracts features into time series
- `clustering(k)`: K-Means clustering
- `visualizeBrain()`: visualizes a specific slice of the brain
- `visualize(nsamples)`: visualizes `nsamples` data points
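The chaining style above follows a lazy builder pattern: each operator call appends a step to a plan, and nothing runs until `execute()`. A minimal self-contained sketch of that pattern (a toy stand-in for illustration, not the actual fmriFlow `Workflow` class):

```python
# Toy illustration of fmriFlow's chaining style: operators are recorded
# lazily and only run when execute() is called. This is NOT the real
# fmriFlow API, just a sketch of the builder pattern it uses.

class Workflow:
    def __init__(self, datapath, sc=None):
        self.datapath = datapath
        self.sc = sc          # Spark context (unused in this toy version)
        self.plan = []        # ordered list of (operator, kwargs) steps

    def _add(self, name, **kwargs):
        self.plan.append((name, kwargs))
        return self           # returning self is what enables chaining

    def extract(self):
        return self._add("extract")

    def clustering(self, k):
        return self._add("clustering", k=k)

    def visualize(self, nsamples=None):
        return self._add("visualize", nsamples=nsamples)

    def explain(self):
        # Render the recorded plan as a readable pipeline string.
        return " -> ".join(name for name, _ in self.plan)

    def execute(self):
        # A real implementation would run each step on Spark;
        # here we just return the recorded plan.
        return self.plan


flow1 = Workflow("bold_dico.nii").extract().clustering(k=5).visualize()
print(flow1.explain())  # extract -> clustering -> visualize
```

Returning `self` from each operator is the design choice that lets steps be composed in one expression while keeping execution deferred.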

### Bash Commands

It is also possible to execute operations from bash using the scripts in the `/scripts` folder with the following parameters:

`run.sh`

- `--path`: the input path
- `--operator`: the operator to run
- `--model`: a serialized model from a previous execution
- `--vector`: a neuron vector given as input to the model above in order to compute its corresponding cluster
#### Examples

- Train and save a model: `sbin/run.sh --path ../bold_dico.nii --operator ts` runs K-Means clustering on the input dataset and serializes the model to disk
- Load a trained model: `sbin/run.sh --operator pr --model model --vector "[...]"` predicts the cluster center of the input vector using the given model
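The predict step above amounts to a nearest-centroid lookup: load the serialized K-Means centers and assign the input vector to the closest one. A self-contained NumPy sketch of that idea (the function name and toy data are illustrative, not the actual fmriFlow script internals):

```python
import numpy as np

def predict_cluster(centers, vector):
    """Return the index and center of the K-Means centroid nearest to vector."""
    centers = np.asarray(centers, dtype=float)
    vector = np.asarray(vector, dtype=float)
    dists = np.linalg.norm(centers - vector, axis=1)  # Euclidean distance to each center
    idx = int(np.argmin(dists))
    return idx, centers[idx]

# Toy model: three 2-D cluster centers (a real model would hold
# one center per cluster in time-series space).
centers = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
idx, center = predict_cluster(centers, [9.0, 11.0])
print(idx)  # 1
```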
#### Other Scripts

- `visualizeBrain.sh $INPUT`
- `visualizeData.sh $INPUT $NSAMPLES`
- `visualizeClusters.sh $INPUT $K`

## Additional Info

### Understanding fMRI Data

http://www.biostat.jhsph.edu/~mlindqui/Papers/STS282.pdf

### Datasets

http://psydata.ovgu.de/forrest_gump/

### Links

- http://studyforrest.org/7tmusicdata.html
- https://github.com/hanke/gumpdata
- http://klab.smpp.northwestern.edu/wiki/images/9/9b/Big_data_klab.pdf

## Neuroimaging Background

### NifTi Data Format

- Neuroimaging Informatics Technology Initiative: http://nifti.nimh.nih.gov
- NiBabel: http://nipy.org/nibabel
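A NIfTI fMRI scan is essentially a 4-D array with shape `(x, y, z, time)`, and feature extraction means pulling out one time series per voxel. A minimal NumPy sketch of that reshaping (synthetic data stands in for a real `.nii` volume, which would normally be loaded with NiBabel via `nibabel.load(...)`):

```python
import numpy as np

# Synthetic stand-in for a 4-D fMRI volume: 4 x 4 x 3 voxels, 10 time points.
# A real volume would come from nibabel.load("scan.nii").get_fdata().
rng = np.random.default_rng(0)
volume = rng.standard_normal((4, 4, 3, 10))

# Reshape into one row per voxel: (n_voxels, n_timepoints).
# This is the kind of time-series matrix an extract step would feed
# into K-Means clustering.
n_timepoints = volume.shape[-1]
series = volume.reshape(-1, n_timepoints)

print(series.shape)  # (48, 10)

# The row for the voxel at (x, y, z) = (2, 1, 0); C-order flattening
# puts it at index x*(4*3) + y*3 + z = 27.
voxel_ts = volume[2, 1, 0, :]
assert np.array_equal(series[27], voxel_ts)
```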

## Acknowledgments

This project was developed for the Digital Image Processing (HY620) course of the Dept. of Informatics at Ionian University.

Course page: http://di.ionio.gr/en/component/content/article/19-modules/semester-5/58-digital-image-processing.html