
I’m using jblas for all of my research, meaning that I’ve added features as I needed them. Nevertheless, I want jblas to be as multi-purpose as possible in the long run. Here is a list of features I have in my head. Hopefully I’ll find time to implement them at some stage… :)

So here is the list, roughly in decreasing order of priority.

Complete Coverage of LAPACK Methods, Including Decomposition

I’ll go for the high-level routines first, that is, solving linear equations, eigenvalue problems, singular value decompositions, and least-squares problems. At some point, I also want to include more decomposition methods (called “computational routines” in LAPACK). I already have LU and Cholesky, but there are also QR, RQ, and a host of other methods. Not sure in what order I’ll add them, though.
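
To give an idea of what the existing high-level interface looks like, here is a small example using the current jblas classes (Solve, Eigen). Consider it a sketch rather than a reference:

```java
import org.jblas.DoubleMatrix;
import org.jblas.Eigen;
import org.jblas.Solve;

public class LapackDemo {
    public static void main(String[] args) {
        // Solve A x = b for a random 3x3 system.
        DoubleMatrix A = DoubleMatrix.rand(3, 3);
        DoubleMatrix b = DoubleMatrix.rand(3, 1);
        DoubleMatrix x = Solve.solve(A, b);

        // Eigenvalues of a symmetric matrix (A + A' is symmetric).
        DoubleMatrix S = A.add(A.transpose());
        DoubleMatrix lambda = Eigen.symmetricEigenvalues(S);

        System.out.println("x = " + x);
        System.out.println("lambda = " + lambda);
    }
}
```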

A More General Matrix Class

Currently, the classes exist pretty much in isolation. This has been a deliberate design choice to keep the code somewhat clean: there is already a lot of code due to all the overloads, and I didn’t want to add even more for the different classes. I’m also not sure how much speed suffers when fundamental methods like get() or put() are called through an interface. I guess I have to check.

On the other hand, at some point it would be very nice to have “general” matrix classes like Matrix, RealMatrix, ComplexMatrix, etc., which work with single- or double-precision floats and convert between the different representations. At that point you could also write truly generic algorithms which work with either precision.
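
Just to sketch the idea (none of these types exist in jblas yet; the names are hypothetical):

```java
// Hypothetical precision-agnostic matrix interface; not part of jblas.
public interface RealMatrix {
    int rows();
    int columns();
    double get(int i, int j);          // float backends convert on access --
    void put(int i, int j, double v);  // this is the interface-call overhead in question
}

class MatrixOps {
    // A truly generic algorithm: works with any backing precision.
    static double trace(RealMatrix m) {
        double t = 0.0;
        for (int i = 0; i < Math.min(m.rows(), m.columns()); i++)
            t += m.get(i, i);
        return t;
    }
}
```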

Before I dive into this, I will probably revamp my own Java macro facility to cut down on code duplication…

Sparse Matrices

For some machine learning applications, like working with text, you simply need sparse matrices; another field of application is solving differential equations (I guess). As far as I know, there is no clear choice of numerical library to build on, so I’ll probably roll my own. One thing I haven’t figured out yet is how to represent sparse matrices. I think most libraries work with a fixed pattern, meaning that once you have created a matrix, you cannot add new entries at points which were zero before. Of course, you could represent sparse matrices with hashes, but those are probably slower for computation. In any case, I’ll have to figure these things out first, I guess.
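
To make the “fixed pattern” issue concrete, here is a minimal sketch of the compressed sparse row (CSR) layout many libraries use; the class is purely hypothetical and not part of jblas:

```java
// Hypothetical CSR (compressed sparse row) matrix; the sparsity
// pattern is fixed at construction time. Not part of jblas.
public class SparseDoubleMatrix {
    private final int rows, columns;
    private final int[] rowPointers;  // start of each row in colIndices/values
    private final int[] colIndices;   // column index of each stored entry
    private final double[] values;    // the stored (non-zero) entries

    public SparseDoubleMatrix(int rows, int columns,
                              int[] rowPointers, int[] colIndices, double[] values) {
        this.rows = rows;
        this.columns = columns;
        this.rowPointers = rowPointers;
        this.colIndices = colIndices;
        this.values = values;
    }

    // get() works for any position; entries outside the pattern are zero.
    public double get(int i, int j) {
        for (int k = rowPointers[i]; k < rowPointers[i + 1]; k++)
            if (colIndices[k] == j)
                return values[k];
        return 0.0;
    }

    // put() can only update entries inside the fixed pattern -- writing
    // to a structural zero would mean reallocating all three arrays.
    public void put(int i, int j, double v) {
        for (int k = rowPointers[i]; k < rowPointers[i + 1]; k++) {
            if (colIndices[k] == j) {
                values[k] = v;
                return;
            }
        }
        throw new UnsupportedOperationException(
            "entry (" + i + "," + j + ") is not in the sparsity pattern");
    }
}
```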

Some Caching Scheme for Arrays and JNI

As it is right now, the data is always copied between “Java space” and “native space” for every single call. This is clearly overhead, in particular because matrices often don’t change, or an intermediate result is passed directly to some other routine.

Therefore, it should be possible to get large performance gains by caching arrays in native space, or by having a way of explicitly pushing arrays to native space and pulling them back later. At some point I’d like to have something like this.
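
Something along these lines, perhaps. The following sketch is entirely hypothetical, the names are made up, and it only illustrates the explicit push/pull idea:

```java
// Hypothetical sketch of explicit transfer between Java and native space.
// Nothing like this exists in jblas; all names are made up for illustration.
public class NativeArray {
    private final double[] javaData;
    private long nativePointer;   // handle to a JNI-side allocation (0 = none)
    private boolean dirtyNative;  // native copy newer than Java copy?

    public NativeArray(double[] data) {
        this.javaData = data;
    }

    // Copy the Java array into native memory once; later BLAS/LAPACK
    // calls can reuse the pointer instead of copying on every call.
    public long push() {
        if (nativePointer == 0)
            nativePointer = allocateAndCopy(javaData); // hypothetical JNI call
        return nativePointer;
    }

    // A routine that writes its result into native memory would call this.
    public void markNativeDirty() {
        dirtyNative = true;
    }

    // Copy results back only when the Java side actually needs them.
    public double[] pull() {
        if (dirtyNative) {
            copyBack(nativePointer, javaData);         // hypothetical JNI call
            dirtyNative = false;
        }
        return javaData;
    }

    private static native long allocateAndCopy(double[] data);
    private static native void copyBack(long ptr, double[] data);
}
```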