Skip to content

HCP file compatibility and parallelization #16

Open
neurabenn wants to merge 28 commits intoNeuroanatomyAndConnectivity:masterfrom
neurabenn:master
Open

HCP file compatibility and parallelization #16
neurabenn wants to merge 28 commits intoNeuroanatomyAndConnectivity:masterfrom
neurabenn:master

Conversation

@neurabenn
Copy link
Contributor

Major overhaul of surfdist. Numba is removed in favour of multiprocessing for distance matrix calculation. Compatibility with HCP gifti and cifti file formats is also implemented.

Additionally several bugs were fixed in the working repo.

Copy link

@rscgh rscgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite good, I added some comments directly on the code.

"""
Calculate exact geodesic distance along cortical surface from set of source nodes.
"dist_type" specifies whether to calculate "min", "mean", "median", or "max" distance values
from a region-of-interest. If running only on single node, defaults to "min".
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be good to describe what "surf" and "cortex" are expected to be:

surf : Tuple containing two numpy arrays describing either mesh vertices and triangles (array shapes: (n_verts,3) and (n_triangs, 3) respectively). 
       Each row of the first array specifies the x, y, z coordinates of a single vertex of the surface mesh. 
       Each row of the second array specifies the indices (over the first array) of the three vertices building one triangle of the surface mesh.
       (e.g. the output from nibabel.freesurfer.io.read_geometry)

cortex : Array with indices of vertices included in within the cortex (indexing over the vertex array).
       (e.g. the output from nibabel.freesurfer.io.read_label)

Does cortex need to be a numpy array or can it be also a mere list?


if gifti !=False:
gii=nib.load(surf)
surf=(gii.darrays[0].data,gii.darrays[1].data)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fact that you are either expecting the already loaded freesurfer files or alternatively a path to a gifti-file might be a bit confusing. Id suggest allowing both to be given either as file path or tuples, or just stick with giving only the array tuples.

This also prevents the potential issue hardcoding the decomposition of darrays into vertices and triangles (not sure if that is fixed in the gifti-standard)



def dist_calc_matrix(surf, cortex, labels, exceptions = ['Unknown', 'Medial_wall'], verbose = True):
def dist_calc_matrixFS(surf, cortex, labels, exceptions = ['Unknown', 'Medial_wall'], verbose = True):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest refractoring surface and label data loading from distance calculation. This way you could have only a single calc_dist_matrix function, and also avoid having a mix of both arrays and filepaths as input params.

The surface and label loading function could also automatically recognize if the given file is a cifti/gifti or freesurfer image (similiar as to what nibabel.load does internally), e.g. based on file-ending or nibabel class; or alternatively take a parameter (i.e. surface_type = "gifti")


###calculate distance from each region to all nodes:
dist_roi = []
with ProcessPool(processes=cpus-1) as pool:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some more comments here would be nice:

  • i.e. what what the labels you load (array of indices across the [remaining?] cortex vertices), and maybe give an example:

# e.g. rois = {"A1": [1,3,8,3], "S1": ...}

  • and also a description of how your multiprocessing works, i.e.

this calls the roiDistance function multiple times in parallel, with starmap ensuring that the resulting array has the same order as the ROI list.

Also what is a region here? Here nodes refer to ROIs (and all parcels from the parcellation are taken as ROI), but previously in dist_calc, nodes were referring to vertex indices.

#### multiprocessing is used in cifti and gifti distance matrix calculations
from multiprocessing.pool import Pool as ProcessPool
import multiprocessing
cpus=multiprocessing.cpu_count()-1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n_cpus might be more clear

cifti_data = cifti.get_fdata(dtype=np.float32)
cifti_hdr = cifti.header
nifti_hdr = cifti.nifti_header
axes = [cifti_hdr.get_axis(i) for i in range(cifti.ndim)]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally check which axis is the brain model axis, instead of assuming it to be axis 1.

else:
print('labels are not split by hemisphere. function will use')
print('hcp indices') #### these indices are specified for cortex in the HCP output ciftis
lcort=slice(0,29696)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here information directly from the BrainModelAxis might be better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants