-
-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Banded matrices required buffer size calculated incorrectly (GBMV, HBMV, SBMV & TBMV) #538
Comments
Thank you for reporting this issue, and apologies for the late reply. I think I understand most of what you are saying, but I'm not really sure what you mean with BTW, I'm open to reviewing fixes for this issue if you have something in mind. |
I apologize for the lack of clarity, I was referring to the Netlib documentation to which you linked. On that page, the relevant snippet is:
I will be looking for a fix and opening a pull request when I have something worth reviewing, but I cannot promise a timeframe since doing it without affecting other level 2 operations seems to be harder than appears at first look because the |
What about adding an extra argument to optionally disable the matrix A check in diff --git a/src/routines/level2/xgbmv.cpp b/src/routines/level2/xgbmv.cpp
index e80b9a9..c37ba10 100644
--- a/src/routines/level2/xgbmv.cpp
+++ b/src/routines/level2/xgbmv.cpp
@@ -42,6 +42,9 @@ void Xgbmv<T>::DoGbmv(const Layout layout, const Transpose a_transpose,
auto kl_real = (rotated) ? ku : kl;
auto ku_real = (rotated) ? kl : ku;
+ // The matrix A has different constraints compared to what is normally tested in MatVec below
+ TestMatrixBanded(n, kl, ku, a_buffer, a_offset, a_ld);
+
// Runs the generic matrix-vector multiplication, disabling the use of fast vectorized kernels.
// The specific hermitian matrix-accesses are implemented in the kernel guarded by the
// ROUTINE_GBMV define.
@@ -52,7 +55,8 @@ void Xgbmv<T>::DoGbmv(const Layout layout, const Transpose a_transpose,
x_buffer, x_offset, x_inc, beta,
y_buffer, y_offset, y_inc,
fast_kernels, fast_kernels,
- 0, false, kl_real, ku_real);
+ 0, false, kl_real, ku_real,
+ /*do_test_matrix_a=*/false);
}
// =================================================================================================
diff --git a/src/routines/level2/xgemv.cpp b/src/routines/level2/xgemv.cpp
index 63dab9f..9a5761c 100644
--- a/src/routines/level2/xgemv.cpp
+++ b/src/routines/level2/xgemv.cpp
@@ -64,7 +64,7 @@ void Xgemv<T>::MatVec(const Layout layout, const Transpose a_transpose,
const Buffer<T> &y_buffer, const size_t y_offset, const size_t y_inc,
bool fast_kernel, bool fast_kernel_rot,
const size_t parameter, const bool packed,
- const size_t kl, const size_t ku) {
+ const size_t kl, const size_t ku, const bool do_test_matrix_a) {
// Makes sure all dimensions are larger than zero
if (m == 0 || n == 0) { throw BLASError(StatusCode::kInvalidDimension); }
@@ -92,7 +92,7 @@ void Xgemv<T>::MatVec(const Layout layout, const Transpose a_transpose,
// Tests the matrix and the vectors for validity
if (packed) { TestMatrixAP(n, a_buffer, a_offset); }
- else { TestMatrixA(a_one, a_two, a_buffer, a_offset, a_ld); }
+ else if (do_test_matrix_a) { TestMatrixA(a_one, a_two, a_buffer, a_offset, a_ld); }
TestVectorX(n_real, x_buffer, x_offset, x_inc);
TestVectorY(m_real, y_buffer, y_offset, y_inc);
diff --git a/src/routines/level2/xgemv.hpp b/src/routines/level2/xgemv.hpp
index 1e1fa72..1ba5dde 100644
--- a/src/routines/level2/xgemv.hpp
+++ b/src/routines/level2/xgemv.hpp
@@ -46,7 +46,7 @@ class Xgemv: public Routine {
const Buffer<T> &y_buffer, const size_t y_offset, const size_t y_inc,
bool fast_kernel, bool fast_kernel_rot,
const size_t parameter, const bool packed,
- const size_t kl, const size_t ku);
+ const size_t kl, const size_t ku, const bool do_test_matrix_a = true);
};
// =================================================================================================
diff --git a/src/utilities/buffer_test.hpp b/src/utilities/buffer_test.hpp
index 4a2a2c9..9097a21 100644
--- a/src/utilities/buffer_test.hpp
+++ b/src/utilities/buffer_test.hpp
@@ -65,6 +65,17 @@ void TestMatrixAP(const size_t n, const Buffer<T> &buffer, const size_t offset)
} catch (const Error<std::runtime_error> &e) { throw BLASError(StatusCode::kInvalidMatrixA, e.what()); }
}
+// Tests matrix 'A' (for banded matrix-vector computations) for validity
+template <typename T>
+void TestMatrixBanded(const size_t n, const size_t kl, const size_t ku, const Buffer<T> &buffer,
+ const size_t offset, const size_t ld, const bool test_lead_dim = true) {
+ if (test_lead_dim && ld < kl + ku) { throw BLASError(StatusCode::kInvalidLeadDimA); }
+ try {
+ const auto required_size = (ld * n + offset) * sizeof(T);
+ if (buffer.GetSize() < required_size) { throw BLASError(StatusCode::kInsufficientMemoryA); }
+ } catch (const Error<std::runtime_error> &e) { throw BLASError(StatusCode::kInvalidMatrixA, e.what()); }
+}
+
// =================================================================================================
// Tests vector 'X' for validity |
Can you make a PR to this repository? And then also ask @BadgerKing7 for a review. |
CNugteren, can u give me access to publish the branch? (I asked @BadgerKing7, his waiting to review my changes after I create the PR) |
The normal flow to make a PR to a repository you don't have write permissions to (such as CLBlast) is to fork the repository, make your changes (possibly on a branch), push your changes, and then make a PR to this repository from your fork. If you need help, I'm sure there's a guide about this somewhere on GitHub. |
@BadgerKing7 can u look at this PR #559 |
GBMV, HBMV and SBMV use
MatVec
directly, passing their own buffers into the function, unlike TBMV which passes a scratch buffer for just vector X intoMatVec
. This means that for all 4 of these operationsTestMatrixA
is applied directly on the user provided buffer for the A matrix.The required buffer size of all A matrices is calculated as
(ld * (two - 1) + one + offset) * sizeof(T)
which for GBMV, HBMV, SBMV and TBMV called with row major matrices expands to(a_ld * (m - 1) + kl + ku + 1 + a_offset) * sizeof(T)
. However, the BLAS interface defines the size of matrix A for all 4 routines asa_ld * n
which means that user buffers can lead tokInsufficientMemoryA
errors even though they hold enough memory according to the interface spec.For HBMV, SBMV and TBMV this issue is somewhat hidden by passing the value of
n
intoMatVec
'sm
parameter so the error is only encountered when(kl + ku + 1) > a_ld
.The text was updated successfully, but these errors were encountered: