-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support for CHOLMOD_INTLONG on CHOLMOD_PATTERN matrices #350
Comments
I'm not planning on creating the CHOLMOD_INTLONG.
Adding matrix operations for boolean matrices is also not feasible.
I do plan on adding support for single precision floating point, but I'm
not able to add boolean matrices.
What operations do you need? GraphBLAS has boolean matrices, and can also
exploit "iso-valued" matrices (my terminology) in which all entries in the
pattern of the matrix have the same value.
…On Tue, Jul 4, 2023 at 5:59 AM Alois Schlögl ***@***.***> wrote:
*Is your feature request related to a problem? Please describe.*
I'm working with large sparse boolean matrices. These matrices are in the
order of 10^5 to 10^7 rows and columns, and a density of 1-10%. This
results in more than 2^32 nnz values, so CHOLMOD_INT is not sufficient.
CHOLMOD_LONG uses 8 instead of 4 bytes for column and row indices, which
requires twice as much memory than necessary.
CHOLMOD_INTLONG seem to be the proper matrix mode for my purpose, but
cholmod_core.h contains this comment:
/* The itype of all parameters for all CHOLMOD routines must match.
- FUTURE WORK: CHOLMOD_INTLONG is not yet supported.
*/
I checked this in suitesparse v5.8 and als the latest (v.7.1.0) release.
*Describe the solution you'd like*
Support for functions in cholmod_core and cholmod_matrix operations on
boolean (i.e. CHOLMOD_PATTERN) matrices.
If this is already partially supported, I'd like to see what is supported
and what is not supported.
*Describe alternatives you've considered*
Currently, I've my own single-threaded implementation for boolean sparse
matrices, I'd need to make it multi-threaded.
Using CHOLMOD_LONG would take twice as much memory which has a significant
effect on performance and the maximum problem size.
—
Reply to this email directly, view it on GitHub
<#350>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AEYIIOIGV5HEKSQOJCVPQVLXOPZQXANCNFSM6AAAAAAZ5SFLDQ>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Thanks for your reply. Concerning your questions, which operations we need I'll use Matlab-like pseudo code to describe the problem. I hope that is fine with you.
W is a sparse, booolan, non-symetric, square martrix (size N-times-N), density 0-20% W is either a random matrix, or precomputed. N up to 1e7, In order to save memory we use already:
In order to save compute time, we do already :
The memory requirement of X{t}, and X{t+1} is considered not crucial, currently we use full matrices. At this moment, I do not see how these equations can be efficiently implemented in graphblas. Moreover, (equ 3) the product of X*WJ is not "iso-valued". I expect the biggest performance gains from a multithreaded implementation of equatuions (1) and (3). Do you have an suggestion how to achieve this with the help of suitesparse/graphblas ? |
I handle nearly all of this in GraphBLAS. In particular, using the right semiring, many of my monoids are what I call "terminal". They halt early, like a short-circuit AND or OR. The first line is a masked dot product, WJ = Z'*Z using the GxB_ANY_PAIR_BOOL semiring. Eqn 3 is unclear (it needs parentheses for me to understand the order of operation). GraphBLAS uses 64 bit integers however, not 32-bit, since I have to handle matrices of very large dimension. The corr function is the Pearson's correlation? It might be computable with a user-defined semiring but I haven't worked out the details. |
Equ 3 is following the operator precedence for Matlab, i.e.
|
Adding support for 32-bit integers for matrix row/col indices would be a major change. I have to support 64-bit integers and can't replace that with 32-bit indices. I would need to support both, and also support arbitary mixing of 32 and 64 bits. |
@schloegl if you need a special build for these small matrices, and don't care about using int64_t values you could try But you limit the size of your matrices quite a bit by doing so. Is this for some sort of embedded device? |
If I understand your suggestions correctly, this is the equivalent of using CHOLMOD_INT. That would work for the row- and column indices, but in my case the number-of-nonzeros can be larger then 2^32 (~4e9). So int32_t would not be sufficient for that. |
This would require the "pointer" arrays in my sparse matrix data structures to be held in int64, with the row/col indices held in int32. I agree that would be a nice feature, but I currently don't have plans for a mixed int64/int32 version of either CHOLMOD or GraphBLAS. |
@DrTimothyAldenDavis: Yes, I understood that from our earlier discussion. I just wanted to clarify why the suggestion of @Wimmerer is not suitable for my purpose. Based on this discussion, I believe you should close this issue with "Won't fix". @Wimmerer: To answer your question, no its not about some embedded device, but about some large scale simulation, where memory and memory bandwidth is the limiting factor. |
Got it. "Won't fix" is a bit strong, since it's an issue I'm aware of and ideally could do in the future. It's something I've thought about for my recent GraphBLAS package, but it's lower on the priority list as compared to other issues. So perhaps "Won't fix in the short or medium term, but may do this in the long term". |
@schloegl for my own curiosity is this a common use case in your field? In my current work I'm making the assumption that all indices are the same size integer. But that's not strictly necessary, and it's possible in the future that (in the Julia language I work on) the sparse libraries could support mixing index types. |
@Wimmerer : each time you have a sparse, square matrix with more than 65536 rows/columns, you might want to be able of storing more the 4G non-zeros. |
Yes, that's a useful case to handle (matrices of dimension at or less than, say 2 billion, but with more than 4 billion entries). I just can't handle that case for now without a lot of work. I'll need to leave it for future work. |
Is your feature request related to a problem? Please describe.
I'm working with large sparse boolean matrices. These matrices are in the order of 10^5 to 10^7 rows and columns, and a density of 1-10%. This results in more than 2^32 nnz values, so CHOLMOD_INT is not sufficient. CHOLMOD_LONG uses 8 instead of 4 bytes for column and row indices, which requires twice as much memory than necessary.
CHOLMOD_INTLONG seem to be the proper matrix mode for my purpose, but cholmod_core.h contains this comment:
I checked this in suitesparse v5.8 and als the latest (v.7.1.0) release.
Describe the solution you'd like
Support for functions in cholmod_core and cholmod_matrix operations on boolean (i.e. CHOLMOD_PATTERN) matrices.
If this is already partially supported, I'd like to see what is supported and what is not supported.
Describe alternatives you've considered
Currently, I've my own single-threaded implementation for boolean sparse matrices, I'd need to make it multi-threaded.
Using CHOLMOD_LONG would take twice as much memory which has a significant effect on performance and the maximum problem size.
The text was updated successfully, but these errors were encountered: