Proper use of flax.scan #2059

ozencgungor · 2022-04-20T05:54:38Z

ozencgungor
Apr 20, 2022

Hi there,

I am trying to implement a MLP-Mixer type model on nearest neighbor graphs (HEALPix to be specific, my "images" are on a sphere) using chebyshev convolutions. For clarity of discussion, suppose my input is of shape (N, M, F)where N is the batch dimension, M is the number of pixels and F is the channel dimension. After performing a chebyshev transformation of order K, I now have an array of shape (N, K, M, F). In code this looks like:

import jax.numpy as jnp
from jax.experimental import sparse as jaxsparse
import flax.linen as nn

def dot(x, y):
    return jnp.dot(x, y)
sp_dot = jaxsparse.sparsify(dot)
v_sp_dot = jax.vmap(sp_dot, in_axes=(None, 0))

def init_array(K, x):
    M, Fin = x.shape[-2], x.shape[-1]
    xout = jnp.empty((K, M, Fin))
    xout = xout.at[0].add(x)
    return xout                 #shape (K, M, Fin) where xout[0] = input.
        
v_init_array = jax.vmap(init_array, (None, 0), 0)

@jax.jit
def chebyshev_transform(K, L, x):
    """
    Recursively calculates [T_0(L)@x, T_1(L)@x, ..., T_{K-1}(L)@x]
    ---------
    :param K: int>=1, number of chebyshev polynomials T_k(L) to use. 
    :param L: the graph laplacian, jax.sparse BCOO of shape (M, M)
    :param x: input maps of shape (K, M, Fin) where x[0] = input
    ---------
    Returns [P_0(L)@x, P_1(L)@x, ..., P_K(L)@x], shape (K, M, Fin)
    """

    def K1branch(x):
        """
        The case when K=1
        """
        return x

    def Kbranch(x):
        """
        K>1 branch. 
        """
        def recursion(carry, _):
            """
            Chebyshev polynomials recursion rule. "_" is just a dummy variable.
            """
            y0, y1, = carry #set y0 = x[k-2], y1 = x[k-1] and use x[k] = 2*L@x[k-1] - x[k-2]
            y2 = 2*sp_dot(L,y1) - y0
            return (y1, y2), y2
        x = x.at[1].add(sp_dot(L,x[0])) #add L@input at K=1
        (_, _), xrest = jax.lax.scan(recursion, (x[0], x[1]), x[2:]) #calculate the rest
        x = x.at[2:].add(xrest)
        return x
            
    x = jax.lax.cond(K>1, lambda x: Kbranch(x), lambda x: K1branch(x), x) #(K, M, Fin)
    return x

v_chebyshev_transform = jax.vmap(chebyshev_transform, (None, None, 0), 0)

After this I would simply act on the output of the vectorized chebyshev transform with a usual nn.Conv(kernel_size=(K, 1)) and squeeze the leftover dimension out to get back an output of shape (N, M, F)

My images are masked, so to not act on the masked pixels, I carry around an array of pixel indices of shape (M,) telling me which pixels are valid so I mask before I act with the nn.Conv.

class MaskedConv(nn.Module):
    indices: Array
    .....
        
    def setup(self):
        self.filter = nn.Conv(....)
    
    def __call__(self, inputs: Array) -> Array:
        unmasked_indices = indices[indices >= 0]
        x = jnp.take(inputs, unmasked_indices, axis=-2)
        x = self.filter(x)  #shape (N, (1), M, F)
        x = jnp.squeeze(x, axis=-3) #shape (N, M, F)
        
        output = jnp.empty_like(inputs)                                     #need to "unmask"
        output = output.at[...,indices[indices>=0],:].set(x)      #shape (N, M, F)
        return output

Now suppose I break up my images into N_s patches to get a shape (N, N_s, K, M/N_s, F) and I do the same on the array of indices to get an array of indices of shape (N_s, M/N_s). What I would like to do is that I would have a flax.scaned MaskedConv module that would effectively do a for i loop over the N_sdimension.

output = []
for i in range(N_s):
    out = MaskedConv(indices=indices[i])(x[...,i,:,:,:])
    output.append(out)

But I cannot for the love of me understand how nn.scanworks. How would one go about implementing the for i loopdescribed above using nn.scanso that one would have the option of not sharing parameters between different patches? If there was no masking, I could imagine reshaping to (N*N_s, K, M/N_s, F) and acting with an nn.Conv but the masking forces me to implement some kind of looping over the patch dimension.

I'm relatively new to jax/flaxso any help would be greatly appreciated and thanks in advance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proper use of flax.scan #2059

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 0 comments

Select a reply

Proper use of flax.scan #2059

ozencgungor Apr 20, 2022

Replies: 0 comments

ozencgungor
Apr 20, 2022