type: renaming Kokkos_Tools_OptimizationGoal
#221
Merged
6 tasks
masterleinad
approved these changes
Dec 1, 2023
That would be great and thanks for submitting this one!
> On Dec 1, 2023, at 3:39 AM, Tomasetti Romin wrote:
> @dalg24 @vlkale I can make the accompanying PR in Kokkos if you want.
romintomasetti
added a commit
to uliegecsm/kokkos
that referenced
this pull request
Dec 1, 2023
I checked that this pull request and kokkos/kokkos#6642 are independent. I could run a simple program with either the changes here or those in the …
dalg24
approved these changes
Dec 1, 2023
etphipp
added a commit
to sandialabs/GenTen
that referenced
this pull request
Sep 18, 2024
08ceff92b Merge pull request #7202 from ndellingwood/master-release-4.4.00 948c13463 update master_history.txt for 4.4.00 6068673cb Merge branch 'release-candidate-4.4.00' for 4.4.00 f4ef4dab4 Merge pull request #7207 from dalg24/cherry_pick_automated_releases_master 76ffeea8d Add workflow to create releases with SLSA provenance generation f15e90c9c [ci skip] update changelog for 4.4.0 (#7188) 818b82712 Merge pull request #7190 from dalg24/rc440_desul_atomics_config c8bbbe2ef Fix atomic accessor for pre-volta GPU architectures (#7189) ca0efd501 Merge pull request #7185 from ndellingwood/cherry-pick-7181-to-rc4.4.0 c8c11c2f7 Fix bogus warnings for cuda/11.4 with gcc/8.5 (#7181) 608fcbea8 Set version number to 4.4.0 94e62755d Hide `IMPL_REF_COUNT_BRANCH_UNLIKELY` option (#7175) dbb0abb67 Merge pull request #7174 from rgayatri23/ompt_lock_code_delete e933bfd5b Implement KOKKOS_ENABLE_IMPL_VIEW_OF_VIEWS_DESTRUCTOR_PRECONDITION_VIOLATION_WORKAROUND (#7168) c8c6ae9bd OpenMPTarget: Delete ununsed code. 
3bf86311e Merge pull request #7173 from dalg24/prefer_exec_space_name_in_tutorial 3a4fb03be Prefer ExecutionSpace::name() to a typeid expression in hello world 39cfac80e Merge pull request #7170 from dalg24/gcc_deprecated_declarations_warnings af6fb10d5 Disable deprecated warnings with GCC < 11.1 for Pair<T1, void> 02ed662e8 Enable deprecation warnings in the GCC 8.4 build a6beb5a56 Merge pull request #7166 from tjhei/examples_c++11 b13cef88b tutorials: do not mention requiring c++11 73b6d032b Add nvidia Grace Architecture (#7158) 5a2292743 Fix Kokkos_CoreUnitTest_DeviceAndThreads (#7159) c22d638b9 Merge pull request #7163 from kokkos/dependabot/github_actions/ossf/scorecard-action-2.4.0 a65a6ce36 Bump ossf/scorecard-action from 2.3.3 to 2.4.0 eb11070f6 Make ExecutionSpace constructors explicit (#7156) 7d0e3e818 Fix Kokkos::Array<T, 0> default initialization for icpc (#7154) 5e2f147cd Make struct "ChunkSize" constructor explicit to avoid implicit construction in RangePolicy (#7151) 277c616e3 OpenMPTarget: Update docker clang build. (#7147) bc7e02a16 Hidden friend operator== for Kokkos::Array (#7148) d306b408b SYCL: Use sycl::shift_group_[left|right] and sycl::select_from_group (#7146) 49eccc859 Merge pull request #6828 from masterleinad/sycl_use_auto_range 28a7788b7 Use sycl::ext::oneapi::experimental::auto_range 2b8380992 Drop `Experimental::RawMemoryAllocationFailure` and don't catch exceptions to rethrow them in shared alloc (#7145) 3290a3de1 Fix using View without corresponding mdspan-type (#7140) 23b8813dc Add CMake options to control compilation flags for AMD GPUs (#7127) f53b26d44 Merge pull request #7143 from dalg24/ompt_bad_alloc f33443b49 fixup! 
Throw bad alloc if omp_target_alloc() returns nullptr 84a60e56c Fix Trilinos nightly failure due to `create_mirror*` refactor (#7126) a07cbe4eb Fix gcc-14 C++26 nightly jenkins build (#7137) dca818b1a Enable test_view_allocation_error with OpenMPTarget 5154f9d0b Throw bad alloc if omp_target_alloc() returns nullptr 571fbae26 Add `likely` and `unlikely` attribute from C++20 to ref counting in views (#6730) b02c83a4c Merge pull request #7141 from masterleinad/disable_failing_nan_tests_nvhpc 0a64dfc74 Get rid of `RawMemoryAllocationFailure::AllocationMechanism` and derived backend-specific exceptions (#7139) 7247c7f4f Check for LIBCXX 10 or later for C++20 and later (#7123) 1dd782dbf no_device_stack is unknown 4dc70d811 NVHPC: Disable failing NaN tests 8e682582f Merge pull request #6987 from masterleinad/remove_nvhpc_as_device_compiler_support 7bcd7acaf [ci skip] rename jenkins build 981163a67 Merge pull request #7138 from masterleinad/fix_sycl_sstream ee50eeb17 SYCL: Add support for Graphs (#6912) d7c0be575 SYCL: Add missing include for std::stringstream 62e60260d OpenMP: Ensure kernels submitted by multiple threads to the same instance don't run concurrently (#6151) c0edea91c Merge pull request #7133 from dalg24/simplify_finalize_logic 7ff2e10b3 Disable the PushFinalizeHookTerminate test on Windows 54cd8dac4 Merge pull request #7134 from masterleinad/fix_hip_nightly 59773deec [ci skip] Fix ROCm version to 6.1.2 in nightly CI db8326fb1 Merge pull request #7132 from dalg24/get_rid_of_exception_swallowers 1fbfc7248 Simplify the logic when finalizing and calling the registered functions e215b6bb3 Drop (unused) KOKKOS_ADD_ADVANCED_TEST TriBITS function cbed0764b Let the throwing push finalize hook calls terminate test actually run 44befe5a0 Do not swallow errors when deallocating memory with CUDA 63141b5b8 Merge pull request #7128 from masterleinad/c++20_minimum_compiler_versions 41eb0b6e1 Fix using and, or, xor in desul with MSVC (#7124) 1c04d0464 Merge pull request 
#7131 from dalg24/avoid_catch_mem_alloc_failure_and_rethrow 0e0307313 Do not bother catching memory allocation failure and rethrow af933fe32 Merge pull request #7129 from dalg24/drop_cuda_uvm_allocation_count 971c88440 Drop (unused) cuda uvm allocations counter cca439ff5 Define minimum compiler versions for C++20 support 73ac2492d Merge pull request #7081 from Rombur/gcc_14 2b248b0d2 Refactor: Move logic of `create_mirror*` to `Impl::create_mirror*` (#7061) a6e7f0d1f Merge pull request #7121 from kokkos/dependabot/github_actions/actions/upload-artifact-4.3.4 669b9f279 Deprecate `RawMemoryAllocationFailure::FailureMode::MaximumCudaUVMAllocationsExceeded` (#7120) 35b3f288f Merge pull request #7122 from dalg24/update_hip_nightly 72d9d077f Update HIP nightly build base image Ubuntu 20.04 -> 22.04 a18bcb059 Bump actions/upload-artifact from 4.3.3 to 4.3.4 93e372cbb Fix and test with -fsanitize=undefined in GitHub CI (#7104) 487b310c9 Merge pull request #7119 from ldh4/simd_fix_div_by_zero 56a40db0a Fix div by zero in math ops testing 33c9b8cef Merge pull request #7118 from crtrott/update-mdspan b84125ed6 Add AtomicAccessorRelaxed (#7089) dc175068b Update mdspan to 98a12b01b51b2 ba2075b3d Merge pull request #7117 from Rombur/hip_20 2a465fa8c Merge pull request #7102 from vicentebolea/fix-relative-dir-install 7b3e7c872 Merge pull request #7112 from masterleinad/fx_sycl_ci 4ddc65077 Merge pull request #7114 from crtrott/add-concepts-include-in-test e396c8f50 Update base image for ROCm 5.6 83f975a94 Github CI: Test with C++17, C++20, and C++23 (#7082) 4a54fb34b Add missing concepts include in test 14093185a SYCL CI: Manually build oneDPL fc45a032c move view allocation related functionality to a new header (#7110) b5f51b9c8 Workaround to ice with icpc when using -no-ip (#7106) f562ca246 Merge pull request #6802 from ldh4/simd_use_larger_vec_width 9f7a92f05 Clean up KOKKOS_LIB_INCLUDE_DIRECTORIES, append include directories to associated targets in Trilinos builds (#7103) 
d9f7dfe85 Merge pull request #7108 from masterleinad/restrict_jenkins_cuda 720490e7c cmake: fix relative to find kokkos_compiler_launcher f10076cb9 Restrict jenkins CI not to run on hopper for nvcc < 11.8 b650199ca clang formating 4d1278ec2 Added a comment about is_type structs 6e167f26a Workaround for the compilation failure for rocm 5.6-6.0 1eb1abe5e Disabling simd unit tests from building for Windows+CUDA build 61de582d1 clang-formatted e02c6a351 Added for width 4 for NEON aa833570a Added for AVX512 e320e0054 Added width 8 abi for avx2 2d7715239 Fix SpaceAwareAccessor based on usage experiment in View (#7088) 93db4f783 Merge pull request #7094 from aprokop/exec_spaces 28614907c Remove FIXME_NVHPC 23.7 guards 53b320221 Cleanup KokkosP hooks in `Profiling::` (#7096) 46df6c18f Merge pull request #7093 from seyonglee/disable_mdspanerror_openacc b69cf9eab Merge pull request #7099 from ndellingwood/fix-werror-icpc a10c912db Merge pull request #7097 from kokkos/dependabot/github_actions/Jimver/cuda-toolkit-0.2.16 a4e7eab6c Couple more icpc -Werror fixes 546bb2bd4 Merge pull request #7098 from DerNils-git/develop 1771bfd90 Copy print_configuration setting in combination of kokkos settings. c2a586338 Bump Jimver/cuda-toolkit from 0.2.15 to 0.2.16 70d50fe6f Merge pull request #7091 from JBludau/remove_overwrite_of_default_space e0d99fdd7 Merge pull request #7095 from ndellingwood/fix-more-icpc e826e7fc8 Fix more icpc issues 8fc95f871 Add missing space 304ad9d3e Temporarily disable failing parts in the TestMDSpan.hpp for the OpenACC backend. 
c7acfb75b remove cmake options to change default spaces 3c30f4023 Remove support for NVHPC as CUDA device compiler 24454fa82 Resolve various bogus icpc -Werror (#7079) 2c0bd1644 Merge pull request #7080 from masterleinad/threads_safety_serial a0b8deab0 Merge pull request #7078 from crtrott/update-desul 8501d5a90 Update desul version in github workflow ea4b96f8f Update internal desul file copies to 60c1115 04f6a4f5e Merge pull request #7083 from crtrott/add-missing-include cf14f1c71 Complex needs a tuple include 362c9d724 Merge pull request #7074 from crtrott/space-aware-accessor 9ac81a88d Don't delete special member functions explicitly 1e14d047c fix refcount exception safety (#6289) 3de267cb8 Improve performance for deleting an instance. 7b962ce29 Use correct includes for spaceawareaccessor 708abe21a Move `layout_iterate_type_selector` into Impl namespace (#7076) e2d5815bb Update from GCC 13 to 14 and use C++ 26 in Jenkins nightly 3309d9332 Fix thread-safety for the Serial backend 2678194c7 Structured binding support for Kokkos::complex (#7040) 3d27bf596 Merge pull request #7077 from crtrott/fix-dynamic-extent-definition 3418084ea OpenACC: Skip exec_space_thread_safety_range_scan (#7022) 549858227 Fix using shared libraries and -fvisibility=hidden (#7065) 6c78f4b1f SpaceAwareAccessor: fix issues (no-unique-address, is_empty) a1f1255ac Fix incompatible dynamic_extent definition in Kokkos 34db5182e Address review comments a6b95e9f5 Add specialization of SpaceAwareAcc for AnonymousSpace e2d68fd2b Use SpaceAwareAccessor in View mdspan-interop b2046a40e Add basic tests for SpaceAwareAccessor afbff6c53 Add SpaceAwareAccessor 0d5cc923a Enable MDSPAN support by default (#7069) 892e13c8c Merge pull request #7062 from masterleinad/use_find_cudatoolkit b967b1012 Merge pull request #7072 from ndellingwood/issue-7071 63d8093c1 Workaround icpc "missing return statement at end of non-void function" a3e2b84a7 KOKKOS_CUDA_ERROR->DEFAULT_MSG 64406064d Fix closing brackets 8229477b4 
Move check CMake 3.20.1 with nvhpc 40cf84f91 Fix using CUDAToolkit for CMake 3.28.4 and higher d54619970 Merge pull request #7068 from masterleinad/fix_msvc_cuda 90d877036 Avoid lambda in sort_by_key_via_sort d50a87832 Workaround MSVC compiler issues in Views f96df0277 Merge pull request #7070 from masterleinad/fix_mdspan_test 363b464f5 Update to CUDA 12.4.1 in MSVC CI 1d7ccd8df Fix mdspan test 043f87304 Switch to using functors in sort_by_key_via_sort (#7059) 660136f5d Merge pull request #7021 from masterleinad/use_werror_for_cuda 00b4e7fe2 Merge pull request #7063 from masterleinad/restrict_to_array_subtest 0a5fac076 Merge pull request #7066 from Rombur/rocm_61 7c4f2b40a [ci skip] Use ROCM 6.1 in the nightly CI and disable one test 5d0983823 Restrict to_array subtest to NVCC >= 11.4.0 63f05204d Merge pull request #7058 from cedricchevalier19/bump-version-readme 013ef0cad Bump version in the readme f0a7c764a Merge pull request #7057 from kokkos/dependabot/github_actions/DoozyX/clang-format-lint-action-0.17 517f48a4b Merge pull request #7056 from kokkos/dependabot/github_actions/Jimver/cuda-toolkit-0.2.15 5269803eb Bump DoozyX/clang-format-lint-action from 0.16.2 to 0.17 e7ddeee49 Bump Jimver/cuda-toolkit from 0.2.14 to 0.2.15 8e0d4a923 Merge pull request #7055 from masterleinad/move_dependabot 4b913d3e7 Move dependabot to .github 2c3fd02aa Use -Xcudafe --diag_suppress=20208 in Makefile build 1625ec210 Try moving pragma suppress to tests 5b0d94518 Use -Xcudafe --diag_suppress=20208 for 11.6 build; nothing else seems to help 2a15c75c2 Suppress 'long double' is treated as 'double' in device code 0e88744dc Fix dangling reference 9d1842e71 Only use -Werror all-warnings with explicit nvcc_wrapper 5906cba05 Fix .jenkins whitespce 13447c433 Fix gtest d0d99bd58 Fix array size 1876867d6 Fix kokkos_swap 726a8f296 Fix quotation marks in CXX flags e89955018 Cuda: Fix nvcc warnings fad664c8f Merge pull request #7051 from tpadioleau/fix-unused-symbols-ctad-tests 69b0db4c1 Fix 
unused symbols in CTAD tests f53f905ec Merge pull request #7054 from masterleinad/update_scorecard 150f9009d Update scorecard GitHub workflow 1f602905c Add nightly CI on Frontier (#7048) 63a3cef18 Introduce `KOKKOS_DEDUCTION_GUIDE` macro to allow user-defined deduction guide in device code for clang compiler (#6954) 9f1cc4c97 Merge pull request #7046 from masterleinad/add_dependabot 669746ef8 Improve Kokkos Graphs (#7039) e011753fb Merge pull request #7047 from nliber/array-structured-binding-improvements f0704b39d Add tests to `ScopeGuard` (#7028) c8a5870c2 Merge pull request #7042 from masterleinad/fix_msvc_warnings 2a7ca1a37 Added static_asserts for out of range tuple_element and get (to match checks in complex structured bindings) 5071f2fd3 Add dependabot for GitHub Actions d65b67bbf (Rebase) Partial fix to compile time issues w/nvcc + Kokkos_ENABLE_DEBUG_BOUNDS_CHECK (#7013) 9bd74ee75 Avoid using "#if not defined" 561818bcd Merge pull request #7041 from ndellingwood/issue-7038 0f9efac16 TestArray: add intel guard to to_array implicit conversion test e06ddf6c1 Fix adjacent difference (#6922) 7472ed7ac Merge pull request #6812 from tcclevenger/unorderedmap_deepcopy 580dba58d Merge pull request #7034 from ndellingwood/issue-7031 1c60c8007 Merge pull request #7030 from nliber/ctad-teampolicy-v3 cf791bc2e Adding `Kokkos::to_array` (#6375) 7c67b020c Workaround icpc warnings 0410363d7 Refactor: Replace SFINAE by `if constexpr` for `create_mirror*` functions (#6955) a78d4ddb2 Copied the deduction guides and test cases over from branch nliber/ctad-teampolicy-crtp 24b24d075 Merge pull request #7006 from masterleinad/test_no_default_constructor_dualview 07a500982 Merge pull request #7024 from masterleinad/sycl_cuda_fix_graph_tests 6a3d918a4 Merge pull request #6834 from mhoemmen/fix-README-FENL-link a5bb0d41b Fix Kokkos README's FENL link c8e0a95cb HIP: Use builtin atomic for compare_exchange (#7000) cb27c9941 SYCL: Skip launch_six Graph test 6f176cde0 OpenMPTarget: Fix 
compiling Graph tests (#7020) 083fb014c Improve `Impl::is_zero_byte()` (#7017) 068d46882 Merge pull request #7018 from dalg24/disable_openmptarget_graph_test 42e83f165 Merge pull request #7023 from dalg24/remove_unused_cuda_api_wrappers f3bd253d3 Remove unused CudaInternal::cuda_{malloc,free}_async_wrapper 02433b625 Merge pull request #7019 from dalg24/nvhpc_suppress_deprecation_warnings bfe9aa2f1 Fixup for disabling deprecation warnings with NVC++ fa8b50102 Disable OpenMPTarget Kokkos::Graph test (does not compile) 468faaa37 Merge pull request #7015 from G-071/fix_hpx_execution_space_nvcc_compilation ce0915b5e Fix undefined behavior in is_zero_byte (#7014) f8f0cc473 Always run Graph tests (#7011) 6aa2ad7da Add a CITATION.cff file (#7008) 64fe75637 SYCL: Don't use shuffles for top-level reductions (#7009) 81b63c5c5 mdspan converting constructors (#6830) 226aecfb8 Properly guard deprecated `Kokkos_Vector.hpp` header self contained test (#7016) fc4383ab6 Fix unique_any_senders nvcc template deduction 2b7b98a1a Use parallel_for instead of parallel_reduce for check 835dbf594 Merge pull request #7012 from seyonglee/openacc_default_async_val_for_team da8be2257 This PR changes the default execution behavior of the parallel_for(team-policy) constructs in the OpenACC backend. - This PR handles a missing case not covered by the previous PR #6772 This PR also fixes the OpenACC backend error in the thread-safety test in PR #6938. df018d97f Suppress deprecated warnings via pragma push/pop in the tests (#6999) cadab6c1e Test DualView resize/realloc for types without default constructor 1d9d0df2e SYCL: Print submission command queue property (#7004) 506da184f Merge pull request #7002 from dalg24/rm_tpl_cusparse 00170ae80 Remove cuSPARSE TPL 5a5306c4e Merge pull request #6997 from masterleinad/sycl_fix_custom_parallel_for_range_deprecations a69e81a59 Merge pull request #6998 from rgayatri23/ompt_scan_lock 7cad3e7c3 OpenMPTarget: Use mutex lock for parallel scan. 
37986fde4 [ci skip] update changelog for 4.3.1 (#6995) 6ecdf605e Merge pull request #6994 from ndellingwood/master-release-4.3.01 f5b34222c SYCL: Fix deprecation in custom parallel_for RangePolicy implementation 50a862cf6 SYCL: Prepare Parallel* for Graphs (#6988) d61d75ace Fix a bug when using realloc on views of non-default constructible element types (#6993) c80cdafef update master_history.txt 262d2d6e8 Merge branch 'release-candidate-4.3.01' for 4.3.01 e4cc6862c Merge pull request #6990 from masterleinad/fix_32bit_tpl_library_path 06e4c5bdc Merge pull request #6989 from dalg24/deprecated_attribute_comparison_operators_pair_t1_void ccadc7d9b Disable failing parallel_scan_with_reducers test 28260178f Avoid duplicated definition of KOKKOS_IMPL_32BIT 7b8e3a68f Fix TPL_LIBRARY_SUFFIXES for 32-bit build 9c7920291 Fix deprecation warnings with GCC for pair<T1,void> comparison operators 69567f305 Add thread-safety tests (#6938) c6d86474a Also use is_nothrow_swappable workaround for Intel Classic Compilers (#6983) 68fabc8a2 Merge pull request #6980 from ndellingwood/update-changelog-4301 fecc96c9e Merge pull request #6978 from ndellingwood/cherrypick-6951-4.3.01 4ee802725 Merge pull request #6979 from ndellingwood/cherrypick-6877-4.3.01 49e265601 Merge pull request #6977 from ndellingwood/cherrypick-6931-4.3.01 cd34c2e8b Merge pull request #6976 from ndellingwood/cherrypick-6578-4.3.01 a75dc70d8 Merge pull request #6982 from masterleinad/fix_fedora b7bb509d8 Merge pull request #6985 from crtrott/copyright-rc 85610f455 Merge pull request #6984 from crtrott/Copyright 83498bdc6 Fix Copyright file 45a140491 Fix Copyright file ccd0126b8 Fix fedora CI builds with flang-new 9fccb6107 Update changelog for 4.3.01 cf7f87c19 Merge pull request #6951 from masterleinad/fix_serial_space_team_policy fbab8bdf0 bring back --fmad option to nvcc_wrapper (#6931) 4d7258c26 MI300 support unified memory support (#6877) 30979fb93 cuda: reduction with `RangePolicy`: fix grid dimensions to work 
for large values and avoid overflow (#6578) 6486a9d68 Merge pull request #6975 from ndellingwood/update-version-4_3_01 dbd7f583a Merge pull request #6962 from dalg24/kokkos_array_const_qualified_element_type 775023262 changelog: header for version 4.3.01 73a7a41ba update to version 4.3.01 ed4d2544f Merge pull request #6972 from dalg24/fix_kokkos_compile_language_cuda_hip_w_omp 15d13f23b Merge pull request #6882 from Rombur/hip_atomic_fetch 27b3ced35 Merge pull request #6949 from Rombur/nightly_deprecated 2574b8029 Fix OpenMP+CUDA when `Kokkos_ENABLE_COMPILE_AS_CMAKE_LANGUAGE` is `ON` f699a2c7a Fix enabling OpenMP with HIP and "compile as CMake language" 4f416f3b7 Merge pull request #6965 from dalg24/cmake_openmp_cxx 77ea52f97 Threads: Don't silently allow m_instance to be a nullptr (#6969) 4ec82963f OpenMPTarget: Update loop order in MDRange (#6925) 7e7709fdb SYCL: Avoid deprecated floating-point number abs overloads (#6959) 18642875f Merge pull request #6967 from crtrott/update-readme-kk-version 968639211 Add Linux Foundation notice and fix C++ standard 19ca9ce97 Update version d434f87e9 Do not require OpenMP support for languages other than CXX 2391f1765 Avoid introducing a 2nd definition of the Impl::swappable trait 031f6d94a Alternate definition of Impl::is_nothrow_swappable_v for NVCC version less than 11.4 ebb1cb308 Revert "Try to fix the CUDA 11.0 build" 63eef4623 Try to fix the CUDA 11.0 build 2e82fdd87 Merge pull request #6961 from dalg24/fixup_deprcated_guards_pair_void fafe861d0 Fix support for Kokkos::Array of const-qualified element type ab3cae486 Fix wrong macro guards for deprecated Kokkos::pair<T1,void> specialization cf59f3120 Merge pull request #6943 from dalg24/kokkos_swap_specialization_for_kokkos_arrays e2b7bb99e Merge pull request #6958 from masterleinad/sycl_replace_deprecated_usm_address_spaces a7827731c Kokkos::Impl::SYCLTypes:: -> Kokkos::Impl::sycl_ 5932685c9 Introduce alias based on feature macro 205fd156d Replace deprecated 
sycl::device_ptr/sycl::host_ptr cc602957c Merge pull request #6951 from masterleinad/fix_serial_space_team_policy 86f5988b3 Fix noexcept specification for kokkos_swap on zero-sized arrays 8706b68d5 kokkos_swap(Array) member friend should not be templated on some other type U 44fde213f Use Kokkos::AUTO for OpenMPTarget 34d0db2f4 Add test 04bc3d9e3 Merge pull request #6952 from nliber/changelog43 d5fd51274 Merge pull request #6947 from dalg24/deprecate_kokkos_pair_void_specialization 0859ab0af Fixed the link for P6601 (Threads backend change) e7b486ff6 Serial: Use the provided execution space instance in TeamPolicy 69c527a42 [ci skip] Enable deprecated code and deprecated warnings in nightly CI d914fe316 Fix deprecated warning from `Kokkos::Array` specialization (#6945) 906e8ce3c Merge pull request #6942 from dalg24/fix_nightlies_cxx20_requires_expression 730d8d828 Deprecate specialization of Kokkos::pair for a single element c9e21ce2a Add `kokkos_swap(Array<T, N>)` sepcialization f94e8d34d Prefer standard C++ feature testing to guard the C++20 requires expression a8115e5df Adding converting constructor in Kokkos::RandomAccessIterator (#6929) 8c7cc95f9 Merge pull request #6940 from dalg24/unused_limits_header_include_in_kokkos_array f2d37801d Remove unnecessary header include de3a2632c Merge pull request #6934 from dalg24/deprecate_kokkos_array_proxy_template_param d88e2a5b0 bring back --fmad option to nvcc_wrapper (#6931) b5ec79bc9 Merge pull request #6936 from rgayatri23/issue_6874 92e02b50c CUDA: Update nvcc_wrapper a2af4e0d4 Deprecate trailing Proxy template argument in Kokkos::Array b0c2566c8 Merge pull request #6930 from Rombur/fix_nightly 0099c10be Fix nightly CI 6ea7be76e cuda: reduction with `RangePolicy`: fix grid dimensions to work for large values and avoid overflow (#6578) 164519d7d MI300 support unified memory support (#6877) 74c81228f Merge pull request #6926 from Rombur/latest_rocm 1fe8108fb Merge pull request #6906 from 
dalg24/make_view_of_arrays_less_special 3a27cdbc2 Add ROCm 6.0 in the nightly CI 7b41536c4 Merge pull request #6924 from masterleinad/fix_sycl_workgroup_scan 8cf841076 SYCL: Fix range in subgroup scan for workgroup_scan 55c575750 Use recommended/max team size functions in Cuda ParallelFor and Reduce constructors (#6891) e52cda370 Merge pull request #6785 from Rombur/memory_test e93b168ba Merge pull request #6907 from dalg24/rm_experimental_layout_tiled 98b1a38e5 SYCL: Improve team_reduce implementation (#6562) 1256f6919 Merge pull request #6822 from CExA-project/fix-deep-copy 486cc745c Merge pull request #6908 from ndellingwood/master-release-4.3.00 4b9093099 Refactor: Uniformize `create_mirror*` parameter name for views (#6917) 077ea33c4 Remove trailing whitespace in changelog cc21a5482 Merge pull request #6919 from ndellingwood/dev-changelog-4300 497b438f1 CHANGELOG.md: 4.3.00 update a833fb00b Preparing readme for develop as the default branch (#6796) caa139c9b SYCL: Unroll shuffle loops for top-level parallel_reduce and parallel_scan (#6750) 47a50ac3c Update master_history.txt for 4.3.0 f08217a49 Accommodate users that depend on a code that define silly macros (#6909) 5cf09513c Merge pull request #6910 from tpadioleau/remove-return-functor-copy-for_each 2aecb1d24 SYCL: Fix multi-GPU support and add test (#6887) 059cd15c0 Accommodate users that depend on a code that define silly macros (#6909) e33da600f Fix merge artifact b6678539a Drop specialization of ViewMapping for Kokkos::Array 391e0408b Do not return a copy of the input functor for Kokkos::Experimental::for_each 635551058 Move `Kokkos::Array` tests to a more suitable place (#6905) 5f9214049 Merge branch 'release-candidate-4.3.00' for 4.3.0 06850bf74 [ci skip] Update changelog (#6886) 5eac0bc6f Merge pull request #6876 from masterleinad/disable_fedora_rawhide 1efeb5d76 Deprecate is_layouttiled trait 51b98e1d7 Get rid of now unnecessary use of is_layouttiled trait e2cfdec54 Drop Experimental::LayoutTiled 
class template 68c668469 Update Intel GPU architectures in Makefile (#6895) a53d30aab Merge pull request #6896 from masterleinad/fix_makefile_threads 7ddc2d39c [4.3.00] Cuda: Fix configuring with CMake 3.28.4 (#6903) 8d734b026 Cuda: Fix configuring with CMake 3.28.4 (#6898) 772e745a1 Merge pull request #6899 from ndellingwood/cherrypick-6892 0834a1281 Fix a bug in Makefile when using AMD GPU architectures (#6892) 2035e313d Fix a bug in Makefile when using AMD GPU architectures (#6892) 872dc422f Fix Makefile.kokkos for Threads 46354d25d Use builtin for atomic_fetch in the HIP backend ae4d0013d TestViewCopy_c.hpp: better handling for OpenMPTarget a2f2ba404 TestViewCopy_c.hpp: add new unit test for deep copy (ViewFill) 841b3a9f9 Fix deep copy when filling Rank-7 views 9fff1e066 Merge pull request #6881 from dalg24/bump_develop_to_4_3_99 05bd48516 [ci skip] Bump version number to 4.3.99 1c60a32b7 Set version number to 4.3.0 a34d910ac Merge pull request #6879 from ndellingwood/update-rocthrust-check-trilinos 096e72437 Scratch space fix for MultiGPU (#6866) 49bd895ae kokkos_tpls.cmake: update default option to enable rocthrust c1a800650 Don't use Fedora development version in GitHub CI 5931cbd29 Merge pull request #6871 from masterleinad/fix_link_rocthrust 5e7cab99b SYCL: Make sure to call find_dependency for oneDPL if necessary (#6870) 8062a6020 Fix linking with rothrust in downstream applications a2b64e0e8 Improve message on view out of bounds access and always abort (#6861) da77d6e14 Merge pull request #6868 from Rombur/hip_sort_by_key 128caa1df Merge pull request #6869 from masterleinad/mdrange_ctad_test_warning 3a765351c Fix unused variable warning in TestMDRangePolicyCTAD.cpp e5126e929 Add HIP specialization for sort-by-key 35ad698e0 Add support for rocThrust in sort when using HIP (#6793) 4e835e136 Merge pull request #6816 from crtrott/add-security-md 5ffcc1dcc Merge pull request #6840 from CExA-project/cmake-bench cfc260ac0 CTAD (deduction guides) for 
MDRangePolicy (#5516) 6db04b3b5 CTAD (deduction guides) for RangePolicy (#6850) c7ad79c4f Merge pull request #6862 from nmm0/update-mdspan-tpl 82b0f2a60 Merge pull request #6860 from masterleinad/fix_cstyle_cast_clang_tidy 121964a93 update mdspan tpl 9a7e7958a Split some classes from Kokkos_ViewMapping (#6859) c3c8a70d2 Update the unsafe implicit conversion error message in MDRangePolicy (#6855) 9feb104d9 Fix fallback implementation for sort_by_key (#6856) 99c7e1b1c Fix amdclang++ compilation (#6857) 97a94b60a Fix C-style cast 3d485c19d bytes_and_flops: fix a counter name 52c41e6b3 Merge pull request #6854 from dalg24/bump_google_benchmark 4dcbff2cf Benchmarks: disable 2 benchmarks for OpenMPTarget 715d6156e policy_benchmark: fix indentation 97fa76f29 fix some warnings in policy_performance benchmark 750ef211a add policy_performance benchmark to CMake 16d2edbb3 add atomic benchmark to CMake 932466f21 add gather benchmark to CMake 5c9a4aa3c bytes_and_flops fix a small bug in command line argument 277339090 bytes_and_flops with CMake e83619830 Merge pull request #6858 from masterleinad/fix_unused_variable_ctad dc524910d Avoid unused variable warning in TestRangePolicyCTAD.cpp 8b8de2cf4 Remove variadic range policy constructor (#6845) 0cdc9eb76 Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent 04a5334c6 Remove redundant RangePolicy constructor (#6841) 058c3a08e Fix scorecard workflow (#6831) c90a9c6f7 Implement sort_by_key (#6801) 549c50b9c Merge pull request #6800 from masterleinad/sycl_clean_device_selection 16a5ebe95 multi-GPU support: Add test for all policies (#6782) bb734012e Merge pull request #6837 from masterleinad/fix_unwanted_fence_parallel_scan_no_fence_test 99510b131 Merge pull request #6825 from masterleinad/cleanup_kokkos_configure_core 24f251a85 Add test for current CTAD support with RangePolicy (#6803) e2c810e1f Avoid detecting unwanted fences in the parallel_scan_no_fence test e2689abc0 Merge pull request #6829 from 
ndellingwood/update-changelog-421 9d33cb772 Clean up shift_{right, left}_team_impl (#6821) 361bdbf49 [4.2.01]: changelog update (#6656) 74b421b19 Merge pull request #6826 from masterleinad/update_github_actions e67ce088d Merge pull request #6824 from masterleinad/fix_sycl_ci 1112e07eb Update GitHub actions ot use Node 20 c3f0a2698 Cleanup KOKKOS_CONFIGURE_CORE df68761f9 SYCL CI: Avoid setvars.sh 696654a1c Only call deep_copy_view() from deep_copy(), add deprecation warning 0f3b727b1 Merge pull request #6813 from fnrizzi/fix_constness_for_views_std_algos 48588d08b Add CodeQL GitHub Action (#6818) fe6a937af Merge pull request #6815 from masterleinad/fix_sort_fence a46a5a14e Merge pull request #6806 from dalg24/rm_older_cpu_archs a1199b3df Explicity pass template params to ZeroMemset for intel icpc compilers (#6807) cdb634dba Merge pull request #6817 from dalg24/license_in_readme bd9db1562 [ci skip] Update license badge and links in the README 2a8ac6f48 Adding SECURITY.md file c95f9542f Fix fence in Kokkos::sort when using std::sort 8e8d45724 Remove unused typedef 513d8db05 fix constness for views 59e2fd08d Redeine deep_copy for UnorderMap f40d55528 Merge pull request #6810 from crtrott/update-workflow-permissions 54c2336c5 Update workflow permissions 8963927d0 Merge pull request #6808 from crtrott/ossf-scorecard 4b84ae0e6 Add OpenSSF scorecard workflow e0dc0128e Merge pull request #6770 from ndellingwood/master-release-4.2.01 3611cfef3 SYCL: Improve print_configuration (#6795) 37962b3d2 SYCL: Cleanup device selection 17d074259 Drop Intel Westmere and SSE4.2 extension 5b86415d6 Drop IBM Blue Gene/Q and POWER7 architectures 65dca527a Merge pull request #6798 from dalg24/rm_librt 54d41bdb8 Merge pull request #6797 from dalg24/intel_mm_alloc 3b515c99e Cuda multi-GPU support: Pass the correct device id to get_cuda_kernel_func_attributes (#6767) aced864ec Drop librt TPL and associated KOKKOS_ENABLE_LIBRT macro 21b110542 Drop KOKKOS_ENABLE_INTEL_MM_ALLOC macro 7ff87a5b2 
SYCL: Filter GPU devices (#6758) 8d58aadf0 Merge pull request #6790 from dalg24/impl_get_gpu_returns_optional 3e405209d Add support for RISCV and the Milk-V's Pioneer (#6773) 49f646283 Merge pull request #6786 from masterleinad/fix_cuda_occupancy 76f740fdb Merge pull request #6791 from dalg24/rm_hbw_space 3496c6fde Remove stray include header 393509470 Merge pull request #6792 from ldh4/simd_add_missing_vector_aligned_in_neon 1502379d0 Added missing copy_from() in neon for vector_aligned 95f70b3f4 Remove support for memkind 473cd5313 Remove DummyPolicy 136360bb3 Restore TestCommonPolicyConstructors.hpp e28b57976 Fixup bogous shared alloc fence labels mentioning HBWSpace f07a537c4 Drop Experimental::HBWSpace 0ed2ebfee Make ranges non-trivial 391f2d12f Fix SharedAllocationRecord to allocate using the correct execution space instance (#6789) 3db377e15 Fixup select from visible devices 1327c3779 Let Impl::get_gpu return std::optional and delegate device selection when appropriate 26060fed7 Don't try to compile the test for any backend with MSVC+Cuda 19dcd64da test_execution_policy_occupancy_and_hint might be unused 99b2e46ec Run OccupancyControlTrait on all execution spaces 91cc45e3a Split runtime checks from TestCommonPolicyConstructors into OccupancyControlTrait 442e4d42a Add checks for unsafe implicit conversions in RangePolicy (#6754) 4d29e39ab Disable test for MSVC+Cuda 31fb4761d simd: support vector_aligned_tag (#6243) 5f128d27d Merge pull request #6787 from ldh4/simd_skip_reduction_omptarget 01d5f8149 SYCL: Error out on initialization if the backend is different from ext_oneapi_* (#6784) 97997807d Temporarily disable simd_reduction test for omptarget build 20d52fb1c Fix Occupancy for Cuda b4bc40614 Reenable TestHIP_Memory_Requirements 63a1208b3 Fixup use provided execution space when copying host inaccessible reduction result (#6777) 7d2ea7212 Cuda multi-GPU support: Make some variables device-specific, update Kokkos::fence (#6753) 2d273c86a Merge pull request 
#6778 from Rombur/fix_in_parallel 379007a35 Merge pull request #6768 from dalg24/fix_device_id_test_omp_target 917baa6d6 Fix typo in deprecatation macro used in HIP cc6ecf058 Merge pull request #6772 from seyonglee/openacc_default_async_val 4c94f089b Get rid of `ZeroMemset`'s silly trailing value argument (#6769) af806fb5d Drop 2-arguments `ZeroMemset` constructor overloads (#6764) 729940c87 Attempt to fix device id test with OpenMPTarget 69fc8f851 Merge pull request #6763 from masterleinad/cuda_dont_use_singleton_wrapper_tasks b4c61a8f2 Merge pull request #6766 from dalg24/std_algo_tests_drop_print_statements 7d5fff958 Get rid of print statements in parallel algorithms unit tests 408e8be5b OpenMPTarget on Intel GPUs update (#6735) 61a07cf2a Merge pull request #6762 from masterleinad/cuda_dont_use_singleton_wrapper_space_instance bbb895a34 Remove redundant calls in rangepolicy constructors (#6765) 71b246d67 Deprecate `in_parallel` (#6032) (#6582) 7439ec9d4 Avoid calling wrapper functions with singleton in Kokkos_Cuda_Task.cpp eecd917f6 Change the default execution policy behavior of the OpenACC backend from synchronous to asynchronous executions. - Change the default OpenACC async_arg value from acc_async_sync to acc_async_noval. - Add acc_wait(async_arg) to scalar reduction operations (parallel_reduce()). 
92307a5ec Update master_history.txt for 4.2.01 221e5f7a2 Merge branch 'release-candidate-4.2.01' for 4.2.01 26ad2643c Merge pull request #6761 from dalg24/cuda_get_last_error 7b5fbd414 [4.2.01]: changelog update (#6656) a082f820d Avoid calling wrapper functions with singleton in some classes b3d8643e8 Drop CudaInternal::cuda_get_last_error_wrapper() 2ca8e73a6 Merge pull request #6751 from ndellingwood/cherrypick-6746-to-rc4201 11b58159e Merge pull request #6756 from ndellingwood/cherrypick-6742-rc-4201 d2913cb38 Add runtime function to query the number of devices and make device ID consistent with `KOKKOS_VISIBLE_DEVICES` (#6713) e2f452882 Merge pull request #6742 from masterleinad/cleanup_trilinos_cmake_cxx_flags 4621c8643 Merge pull request #6742 from masterleinad/cleanup_trilinos_cmake_cxx_flags 20150550e Merge pull request #6747 from uliegecsm/fix-remove-if 540368114 std(remove-if): fixing tmp view alloc + avoid evaluating twice the predicate during final pass d4a099599 Merge pull request #6749 from ndellingwood/cherrypick-6510-to-rc4201 6229367eb Merge pull request #6746 from tcclevenger/cuda_warp_sync_to_avoid_race_conditions d8ace9763 Merge pull request #6746 from tcclevenger/cuda_warp_sync_to_avoid_race_conditions 8845cee38 Merge pull request #6510 from ndellingwood/fix-werror-pedantic 650ac4067 Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) d560c4719 Drop support for deprecated command-line arguments and environment variables (#6744) 57126af31 add more warp sync for cuda reductions e1415f8fc Merge pull request #6630 from tcclevenger/potential_racecondition_in_cuda_reduce dcf93fc08 Merge pull request #6738 from dalg24/shared_allocation_record 2dc7cbcc9 Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) a1a6ea14c Fix TestThreadVectorMDRangeParallelReduce (#6734) c17969f33 Trilinos: Don't let Kokkos set CMAKE_CXX_FLAGS d18ad8f34 Untangle SharedAllocationRecord spaghetti code 34973c773 Merge pull request #6731 
from Rombur/hip_ci_new abd50dc36 Merge pull request #6733 from simon-schlepphorst/fix_cmake_for_cxx26 5610068c5 Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) 407e18dc8 Use team_size_max to fix "Team size too large" error in reducer test (#6725) 523d70189 Disabling failing HIP test in the CI c4e1b86c8 Reenable HIP testing 87f32846b Add KOKKOS_ENABLE_CXX26 to the configuration metadata 39a0f3d67 Add support for C++26 in generated makefiles bd3c0a552 Add C++26 standard to CMake Setup 6912b3998 Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) 8a914909d Merge pull request #6729 from dalg24/acc_allocation_error 5781d176e Disable openacc.view_allocation_error test 000fccc50 Merge pull request #6728 from Rombur/hip_ci_tmp_fix 3d33665ff Fixup using declaration f9f3c6e13 [OpenACC] throw if acc_malloc returned nullptr a3aa567af Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator 8f743cf95 Ensure view_allocation_error does not silently ignore that no exception was thrown 9eca17795 Fix Docker env variables 86f5bb7d8 Let the smart pointer manage the CUDA/HIP stream (#6721) f42a8cb03 Temporary fix to reenable HIP CI 179d2e67f Add bound checks in RangePolicy and MDRangePolicy (#6617) ea564a274 Merge pull request #6723 from ndellingwood/cherrypick-6671-rc-4.2.01 4d784fe01 CHANGELOG.md: remove stray trailing whitespaces f53b18b6c Merge pull request #6671 from rbberger/add_mi300_gfx940 95934133f Merge pull request #6722 from ndellingwood/fix-hip-missing-header 256c0ca62 Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors 35a867d37 Make initialize and finalize of the Cuda/HIP singleton less special (#6714) bed3064ef Merge pull request #6712 from dalg24/cuda_error_cleanup 1e10099a1 Merge pull request #6715 from dalg24/hip_extraneous_closing_brace 4e33b3bf9 HIP: Forgot to delete matching brace closing the namespace e6ff1a469 No need to jump through so many hoops 
to print the error message 868e42e7b Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper c75d730d2 Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) 474366af4 [4.2.01] Fix msvc cuda release (#6660) 9393b358f Don't use the compiler launcher script if the compile language is CUDA. (#6704) fa91c962a Merge pull request #6711 from dalg24/pointless_cudaexec_forward_declaration 0254c631b Drop pointless Kokkos::Impl::CudaExec forward declaration be0c796c4 Merge pull request #6658 from masterleinad/fill_random_sync cad863fca Merge pull request #6708 from fnrizzi/inplace_transform_inclusive_scan 36da6cca7 add tests for in-place `inclusive_scan` (#6682) ee5cbfc25 Fix TeamThreadMDRange parallel_reduce (#6511) 89ba3fbae Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) 0ba8c40fc Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) 673401038 add tests 96d530a24 Remove Kokkos::[b]half_t volatile overloads (#6579) 0e4a158a7 Check matching static extents in View constructor (#5190) 27286c32d Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) 23b02f064 Merge pull request #6701 from masterleinad/fix_enable_compile_as_cmake_language c9038983a Merge pull request #6681 from masterleinad/disable_bessel_sycl 5bb3ba32a Merge pull request #6695 from dalg24/cleanup_profiling_section 3523bc3e7 Enable `{transform_}exclusive_scan` in place (#6667) bec13acd0 Merge pull request #6703 from ndellingwood/issue-6702 68de5ce19 Merge pull request #6700 from dalg24/fixup_print_tolerance 716bef2a4 test_array_ctad: disable test for intel versions < 2021 3358970c2 Try linking against CUDA libararies even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE cbf1c644c Fixup cast tolerance to double before printing 9f5e38e97 SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) 06de563f9 Add CI for MSVC+Cuda (#6661) efc0c365c Kokkos::Array deduction guide (#6373) 654a51f60 
GitHub CI: Test with AddressSanitizer (#6676) 4eae6a99f Cosmetic changes to ProfilingSection f485cfa53 Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) 4078a0d8a Cuda: Allocate using the correct device (#6392) 02b46c09c #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) f02539e35 Merge pull request #6647 from dalg24/ulp_should_be_integral 79a36295d Merge pull request #6649 from dalg24/we_dont_need_no_dual_view_converting_assignment_operator 2ac06ce63 Merge pull request #6689 from dalg24/profiling_section 73c750755 Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp 5aa0ceee4 Drop unnecessary guarding for a tool library being loaded in ProfilingSection 391daefd5 fill_random without exceution space instance should fence 8de16ea35 Disable more Bessel tests for SYCL on INtel GPUs cbbe09b93 OpenMP: Use `omp_get_nested` for older gcc versions (#6685) fe06b6f36 Merge pull request #6652 from masterleinad/ompt_printf 7e73c2b47 Merge pull request #6675 from brian-kelley/DeepCopyMsg f38553cb0 Merge pull request #6361 from masterleinad/cuda_multiple_devices_constructor 79164a43a Improve handling of printf in OMPT on Intel GPUs 52e44d6cf SYCL: Force inlining of Kokkos::printf (#6650) 2092c01a4 Merge pull request #6651 from masterleinad/disable_hip_ci f71052b5f Merge pull request #6680 from Rombur/rocm_60 5df22b87b Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support 64a7774b6 Merge pull request #6671 from rbberger/add_mi300_gfx940 154a57df8 src->source, dst->destination 72bc7ed42 Add missing include sstream 838f8938e Add a unit test for new deep_copy exception msg 316ceac58 Improve "no copy mechanism" exception message 18d7d78f5 Merge pull request #6664 from dalg24/openacc_not_always_true e4a7cfc78 Per review prefer always_false<Arg>::value to is_void_v<Arg> 33db3046a Add Impl::always_false type-dendent false trait 293319c58 Add missing gfx940 62855dcf1 
Merge pull request #6662 from cwpearson/feature/cmake-stream cedbf56f6 Merge pull request #6665 from dalg24/not_accomodating_external_definition_of_kokkos_assert 1bd9ce7a5 Merge pull request #6659 from crtrott/fix-msvc-cuda-develop fb668b143 Merge pull request #6666 from masterleinad/openmp_use_omp_get_max_active_levels a996c12a0 Use omp_get_max_active_levels() when supported ae71e4002 Drop guards to accommodate external code defining KOKKOS_ASSERT 76ea3a3a9 Do not negate the dependent true traits helper 379d5db1a Add CMakeLists.txt for stream benchmark ed08974c7 Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) e524ec777 Move header for Damien because he is right c6d01e943 Fix formatting 249f8b4fb Sidestep lacking CTAD support msvc/cuda 7dcf1deba Avoid lambdas in constexpr branch for msvc/cuda 458910fbf Fix missing include on msvc/cuda fb0380b91 Fix builtin_unreachable use for MSVC/CUDA 843fca336 OpenMPTarget: clang extensions for dynamic shared memory. 
(#6380) 1abcca9d4 Merge pull request #6626 from masterleinad/cherry_pick_6608_4_2_01 d0548d658 Merge pull request #6631 from tcclevenger/cherry-pick-6630 4e4a047a2 Merge pull request #6627 from masterleinad/cherry_pick_6623_4_2_01 6e1865714 Merge pull request #6638 from dalg24/rc421_early_tools_profiling 84299466a Merge pull request #6655 from kokkos/cherry_pick_6653 232114fcd Merge pull request #6653 from masterleinad/remove_deprecation_allocation_mechanism_gcc_11_0 24b64848e Merge pull request #6653 from masterleinad/remove_deprecation_allocation_mechanism_gcc_11_0 8e16df3cf Merge pull request #6557 from dalg24/rm_logical_spaces eadc210bf Remove deprecation warning for AllocationMechanism for gcc <11.0 dcdfcac91 Diable HIP CI 9fd95ebcb Don't use rocm-docker for clang-format a35bc6890 Merge pull request #6643 from seyonglee/fix_openacc_toomuchwarning b9b63dfd8 Drop DualView converting copy assignment operator 71729af71 Fixup test math functions ulp should double -> int b877a6e9b Merge pull request #6645 from fnrizzi/fix_6644 a41dba586 SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) 07cdd7000 add missing header fix #6644 c3dde624d Merge pull request #6642 from uliegecsm/kokkos-tools-typo 685620918 This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. This PR also re-enables the OpenACC CI test. 
9041bdaf3 Merge pull request #6625 from Rombur/jenkins_multibranch 4a6a92056 Merge pull request #6634 from Rombur/ubuntu_18 ed64cea7f tools(profiling): type (related to kokkos/kokkos-tools/pull/221) 30f020777 Merge pull request #6640 from uliegecsm/unorderedmap-types 52d5c3738 nvcc wrapper: remove troubling flag to fix 6628 (#6629) 12af5769b Merge pull request #6639 from Rombur/disable_openacc e9899a5b1 unorderedmap: modernize traits 7739ca191 Disabling OpenACC in the CI because it emits too many warnings e4753753b Merge pull request #6635 from uliegecsm/kokkos-profiling-fix f8788ef2a Merge pull request #6635 from uliegecsm/kokkos-profiling-fix b00c1e068 update comment to include final() mention 54c62d15d Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format c9d7bbad1 kokkos(profiling): do not finalize in any backend 6dcd72b9b Add warp sync for Cuda parallel reduce 4d4a343e5 Add warp sync for Cuda parallel reduce c9540f51c Merge pull request #6624 from uliegecsm/kokkos-graph-hip-fix 0c617db8a Merge pull request #6608 from masterleinad/fix_numeric_traits_bfloat16 16972af28 Add jenkins multibranch pipeline options 0d3428087 Merge pull request #6614 from masterleinad/gh_workflows_icpc_fedora_to_intel a7bf142d5 Merge pull request #6604 from fnrizzi/fix_test_dev_and_th 81580ca15 Merge pull request #6615 from uliegecsm/nvcc-wrapper-missing b54105701 Merge pull request #6624 from uliegecsm/kokkos-graph-hip-fix f31436a09 graph(HIP): adding inline keyword to fix #6623 f1d466622 Merge pull request #6608 from masterleinad/fix_numeric_traits_bfloat16 a4720ce41 Add clang-format check to GitHub workflows (#6612) 0262f7405 nvcc(wrapper): adding missing `--generate-line-info` arg 71a9bcae5 Merge pull request #6613 from ndellingwood/master-release-4.2.00 ae75d3895 GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing 0a40d16b8 Merge pull request #6611 from cz4rs/fix-formatting 3dd0b8253 [ci skip] fix formatting 374064ab7 add branching 
33a1106da use reference 61842b7d1 remove comments 68e4bedc4 fix for macos 17af2f3c4 try 2779b29b5 avoid pyt package f0af4672c try fix aa2ff89fb Merge pull request #6598 from uliegecsm/kokkos-unique-fix 81a958653 Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) ff7104cee [ci skip] Update changelog on develop for 4.2.00 (#6592) 932c1fb2f Added missing operator* to NEON simd 8fd8c94aa Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) 1a145311f Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 ee655c08a Fix TestNumericTriats.hpp for SYCL with bfloat16 support c60716df4 try fix 38cbde408 Update master_history.txt for 4.2.00 abe01c88f Merge pull request #6600 from masterleinad/cherry-pick-6590 08efbb919 Merge pull request #6595 from masterleinad/cherry-pick-6543 9c37437ea Use binary wrapper for consistency in definition of half types numeric traits (#6590) 61b93ec7f kokkos(unique): fix allocation of temporary view to enfore using the provided space instance 2f5723bd6 Merge pull request #6585 from masterleinad/ompt_guard_scratch_allocations d5a480291 Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) 97a90d5dd OpenACC: add atomics support (#6446) fb73a7359 Merge pull request #6589 from kokkos/revert-6586-desul_sycl_device_global_supported 6023c1919 Merge branch 'release-candidate-4.2.00' for 4.2.00 81e308e7d Revert "Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header" 3f773d057 OpenMP: No memset in viewfill (#6573) 0a83695e5 Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) 91ee4e1e1 Merge pull request #6569 from masterleinad/cleanup_static_assert_kokkos_impl_do_not_use_printf 13c6c5783 Merge pull request #6586 from dalg24/desul_sycl_device_global_supported 605784265 [ci skip] Adding Changelog for Release 4.2.0 (#6583) c8b4fe848 Desul atomics: Trade SYCL-specific compile definition 
for a macro defintion in the configuration header 26464df04 SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) fcb0452d0 OpenMPTarget: Guard scratch memory usage in ParallelReduce d4a517f82 Set the device id explicitly for CUDA API calls in impl_initialize 80084960c simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) a453e9fc3 Simplify fence functions in the Threads backend (#6571) 8d9400e29 Add crtrott's launch_latency benchmark (#6379) 403c34f30 m_cudaDev isn't static anymore a7b16b351 OpenMPTarget: CI compiler upgrade. (#6545) c7a162342 Merge pull request #6576 from dalg24/remove_cuda_clang_workaround 1e1ed1318 Drop Clang+CUDA workaround 6fc7a4930 Merge pull request #6575 from dalg24/drop_unused_memory_fence_header cead4f559 [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header 3a285ecf1 Merge pull request #6276 from ldh4/simd_add_missing_unit_tests 21a3d6f12 Merge pull request #6570 from ldh4/simd_move_fallback_impls 3b8c449f1 Remove empty quotation marks for static_assert b76e1dcc1 fallback implementation cleanup 024d6c21b Remove unused Sandia testing files (#6568) 6da3fa7e9 Threads remove unused variables and functions (#6566) 0e5aa1503 Merge pull request #6553 from masterleinad/avoid_redundance_algorithm_unit_test_variables a07c7a2b6 Address reviewer comments 8c4fe6b06 Merge remote-tracking branch 'upstream/develop' into cuda_multiple_devices_constructor 6eb12dbc9 Rollback changes to view constructors to reduce the number of instantiations (#6564) fd80cbef4 Merge pull request #6541 from Rombur/threads_refactor_3 6d95b621e Remove logical memory spaces 54f2e7f23 Merge pull request #6548 from Rombur/hip_split 3093a0e64 Only define STDALGO_TEAM_SOURCES_* once 3edbef33d Merge pull request #6531 from masterleinad/sycl_use_ext_oneapi_device_global_feature_macro 400dd1d99 Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) 201d1dead Merge pull request #6536 from 
dalg24/view_constructor_from_label 6b4ee34ee Split files in HIP backend 13efa71ac Merge pull request #6547 from msimberg/bump-hpx-1.9.0 a41df08a7 Bump HPX version used in CI to 1.9.0 840d6b775 Reduce number of View constructor instantiations 189aaa6da Merge pull request #6542 from Rombur/fix_guard b4f27c87f Fix typo in macro guard c4d0dfe02 Fix indentation 33010ecc3 Add comments a417450bb Remove spawn function bb759df49 Remove useless forward declaration 7d31c2273 Small cleanup of ThreadsInternal::initialize 6ac5aa846 Remove Sentinel struct from Threads b875be75d Remove unused variables 09756717d SYCL: Use host-pinned memory to copy reduction/scan result (#6500) 6056c6b1f Merge pull request #6537 from Rombur/threads_refactor_2 3bcf9657f Prefer defaulted default constructor for Bitset (#6524) 1fcce6936 Remove extra constructor 9158785df Remove sleep and wake functions ae0bd54eb Added unit tests for reduction ops and few intel svml intrinsics cf5a859bf Threads: replace enum with constexpr int and enum class (#6514) e156d5859 Check that device associated with stream matches requested device 0b59a1b40 Merge pull request #6512 from eltociear/patch-1 a30b9aa78 Fixup in README (github -> GitHub) 4383d1c1e Merge pull request #6528 from ndellingwood/cherrypick-6516 a440ac9e3 Merge pull request #6527 from ndellingwood/cherrypick-6518 4c88f2569 Merge pull request #6525 from dalg24/bitset_do_not_mess_with_labels 92cc6ce95 Merge pull request #6523 from dalg24/rm_deprecated_code_3 2bc1721d7 SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables e9a540605 Merge pull request #6516 from fnrizzi/fix_6502 86e3c8db7 Merge pull request #6518 from fnrizzi/fix_6515 0120c431b Merge pull request #6516 from fnrizzi/fix_6502 ef889a7ab with_updated_label -> append_to_label 380754b91 Do not append " - blocks" to the bitset label 589ad55b0 fixup! 
[deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions 2e6765a2a [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option 7c63c32bd [deprecated code 3] remove MasterLock fb0bd5297 Get rid of FIXME_OPENMP ca49c65f6 OpenMP backend cleanup following removal of deprecated code 3 0505ce294 [deprecated code 3] remove {OpenMP,HPX}::partition_master 3172fd1b0 [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants d515a51ea [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions 35dda2ac6 [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax 57c0aa61f [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros dfd0a6d31 [deprecated code 3] remove InitArguments e6c51df7f [deprecated code 3] remove all default device init tests 3490ec1e7 Merge pull request #6520 from masterleinad/update_kokkos_version_develop 505c396dc Merge pull request #6518 from fnrizzi/fix_6515 629135a0f [ci skip] Update Kokkos version to 4.2.99 b26a1f735 avoid auto 7b86b80a9 add guards 0c0cafaba Merge pull request #6509 from Rombur/threads_team 58f53a6a2 Merge pull request #6510 from ndellingwood/fix-werror-pedantic 78c1ed885 Kokkos_SIMD_Scalar.hpp: remove extra ';' a0cacc305 Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp 654d283a6 Update version number for 4.2.00 release bd361e562 Merge pull request #6505 from Rombur/threads_instance 3beb7f191 Merge pull request #6499 from masterleinad/nvhpc_impl_only e2ad3b36b Merge pull request #6506 from masterleinad/promote_kokkos_printf_header 89bd35cc3 Merge pull request #6198 from uliegecsm/unordered-map-space 0cad570ad Merge pull request #6503 from fnrizzi/relax_guards_team_algos 04a631081 Update CI in OpenMPTarget to use llvm-17 (#6472) c586fa172 simd: add floor, ceil, round, trunc operations (#6393) 1095b640e Merge pull request #6497 from fnrizzi/openmptarget_scan_return 
5518eb99e Promote Kokkos_Printf.hpp to public include 6ff5721a6 Rename Kokkos_ThreadsExec to align with the other backends 5544c0c22 UnorderedMap(space instance): proposal for #6067 adc885184 remove guards 377b3f057 fix order 02e6bdcce ad threadvector bc83a8912 Merge pull request #6430 from aelovikov-intel/fix-red 578bc7f91 Merge branch 'develop' into openmptarget_scan_return e8687a5df Merge pull request #6495 from masterleinad/hpx_parallel_scan_team_thread_thread_vector 63d9ae201 Merge pull request #6493 from fnrizzi/fix_team_uniquecopy_copyif a856f973e Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON e4038bcd0 Merge pull request #6498 from Rombur/threads_split_files f511dca95 SIMD: Split math functions from SIMD_Common.hpp (#6487) 8420c2f00 Update to HIP TeamPolicy Block number heuristic (#6284) 4e69e4010 Merge pull request #6370 from Rombur/hip_graph 5b693fd95 address review comment fdfeaf916 add overload for TeamThreadRange d97f16f44 Merge pull request #6479 from uliegecsm/fix_hip_concurrency ebef19bdf Serial: Allow for distinct execution space instances (#6441) 23496b47e HPX: Implement TeamThread and ThreadVector parallel_scan with return value cb22b8061 Split Kokkos_Threads_Parallel files f5c0cc5a4 Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp ad9eb209f Merge pull request #6308 from thearusable/5635-sycl-parallel-scan-with-value-ThreadVectorRange a601d81f9 Merge pull request #6490 from masterleinad/fix_build_cmake_installed_different_compiler 1f4e3d5db fix impl 8659ffa0b Fix example/build_cmake_installed_different_compiler 1ebb3afc4 Merge pull request #6484 from masterleinad/fix_bessel_function ef1922d30 Merge pull request #6394 from masterleinad/simd_checks_neon 7fafc641a Merge pull request #6485 from masterleinad/fix_simd_cuda_compilations 8181d7075 Fix atomic operations bug for Min and Max (#6435) ee8d58ea3 Merge pull request #6482 from uliegecsm/iostream 9035ab2d3 Merge pull request #6486 from ajpowelsnl/fix/6451 
c5bf8705d #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange 1ccf4995b Also fix annotations for generator constructor for AVX512 and NEON e52b957db Merge pull request #6305 from thearusable/5635-threads-parallel-scan-with-value-ThreadVectorRange e40f026ef team-level std algos: part 13 (#6351) 890148e5d Fix NVCC warnings (#6483) 4d3958bec guards to ensure DBL_EPSILON return for POWER8,9 96edf73bd Fix compiling SIMD unit tests on NVIDIA 6b21fde9e cleaning: remove iostream from headers where possible (IWYU) 6ff0deb9b Fix implementation for cyl_bessel_i0 29d4ffdbf Add KOKKOS_ARCH_ARM_NEON 4ce289baa Allow detecting SIMD types based on compiler macros (#6188) 495b1ccfd Add parallel_scan overloads with value for Threads c63f125ec Add test for parallel_scan with return value for ThreadVectorRange 0bf937cdd Moving abort and assert into their own public headers (#6445) 2075ae79b core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) e04f637dc Merge pull request #6307 from thearusable/5635-sycl-parallel-scan-with-value-TeamThreadRange 567524c8f Merge pull request #6242 from thearusable/5635-hip-parallel-scan-with-value 3ad6473f6 Merge pull request #6465 from masterleinad/simd_math_functions fbdb0e04f Merge pull request #6478 from masterleinad/minimum_version_google_benchmark 872ffb770 Merge pull request #6474 from uliegecsm/dualview_compatible_copy_constructor_assignment 148e6a6c3 Merge pull request #6471 from masterleinad/fix_openmp_teamthreadrange_parallel_scan_return 60e4d1359 team-level std algos: part 12 (#6350) d8f8142ab Use call operator instead of run_me function 94c5d9ab6 Modify test so that source and destination view are of different type 744711864 Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU 39316fa8c Add test of copy constructor/assignment operator for DualView. 
e1f2cf545 Fix minimum version for Google benchmark 82044c696 Add compatible copy assignment operator to DualView b610a288b OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 bbfe63981 Use std::is_same_v 7d817b88b #5635: SYCL: Add parallel_scan overload with return value e4eb204ee #5635: Move some tests for parallel_scan to TestTeamScan df1901b1c Use std::is_same_v 6c6a26ab1 Add parallel_scan overloads with value for HIP backend 41cf2e51c Merge pull request #6303 from thearusable/5635-threads-parallel-scan-with-value-TeamThreadRange 68a97a1fb Merge pull request #6235 from thearusable/5635-cuda-parallel-scan-with-value 5150a9fad Merge pull request #6463 from masterleinad/sycl_disable_bessel_test_intel_gpus c395c0cf1 Fix formatting 61e7b262d Skip testing for non-power-of-two team sizes b8d4feb26 use shortcut 4f6ddd190 #5635: Move some tests for parallel_scan to TestTeamScan 190bfe4ab #5635: Add parallel_scan overloads with value for Threads 1675997f2 #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) 9db1ea46d team-level std algos: part 11 (#6258) 925032879 Merge pull request #6378 from cwpearson/feature/gups-permute-mode d458fdadb team-level std algos: part 10 (#6256) f6977cf43 Check for default device 6e2ca15da Merge pull request #6464 from masterleinad/restrict_avx2_workaround_rocm5_67 c195ee69a SYCL: Disable another bessel function test for Intel GPUs b85563160 SIMD: Math functions should be in namespace Kokkos e542e989a improve tests to check intra-team result (#6431) 9b3778134 fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) b813f2bb3 HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 e1c82660e Merge pull request #6462 from fnrizzi/fix_warning_random_test_windows b9fa28cfb Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) b2a1820b0 fix casting warning in Random test 988a9e6a9 Move final assignment to correct scope 2e743674a improve 
tests (#6437) 1f0183bc1 improve tests (#6432) 773e34648 team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) 5f279b02c Fix parallel_scan_with_reducers test 8e4820194 Fix race condition in functor_vec_scan_ret_val test e13f67ca6 team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) 5ed274c93 Skip bessel function tests known to fail on Intel GPUs (#6434) 3cd281376 team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) d8fa85644 std_algos: improving min, max, minmax (#6421) 6f9e50c83 Merge pull request #6301 from thearusable/5635-cuda-parallel-scan-TeamThreadRange 06c6a73ab Merge pull request #6213 from fnrizzi/team_level_p10 ecbe79507 Merge pull request #6452 from masterleinad/disable_simd_compiler_macro_check_ompt a56b433ce Merge pull request #6455 from fnrizzi/fix_6442 8239de526 Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) 36af9d6e4 Merge pull request #6456 from fnrizzi/fix_6440 fb20a482d Assign final sum in Cuda parallel_scan ThreadVectorRange 4a819b6b7 Fix Cuda parallel_scan ThreadVectorRange range 6f85f19eb Merge pull request #6454 from fnrizzi/fix_copyif_team_test_assert 7e2749632 OpenMPTarget init-join fix (#6444) 56cc35bdf re-enable unit tests for sort and random via makefile (#6422) e95075930 fix unreachable for intel bcb92a619 fix intel compile error ba9165994 add intra team check for missing test 4a266d8ee #5635: Add test for parallel_scan with return value for ThreadVectorRange 2b7eb0b0b add missing assert 96320555f #5635: Add parallel_scan with value for CUDA and ThreadVectorRange 7284cd215 OpenMPTarget: Disable check for SIMD compiler macros e743017e8 benchmark/gups: use CMake 002cce07f Clean up benchmarks/gups 6d794df99 #5635: Enable TeamThreadRange test for CUDA db8498389 remove old impl 1fb6f4a74 #5635: Add parallel_scan changes for CUDA and TeamThreadRange 47aecc6c4 Merge remote-tracking branch 'upstream/develop' 
into team_level_p10 6a95b5f3a Merge pull request #6292 from thearusable/5635-serial-parallel-scan-part-2 6494d96c1 Merge pull request #6212 from fnrizzi/team_level_p9 8bff82ff8 try fix for unique, previous impl to remove later 615fc1aef Fixes for Kokkos::Array (#6372) 7e35f1087 Merge pull request #6433 from masterleinad/cuda_fix_m_num_scratch_locks_initialization 541f67468 improve tests with intra-team result check d270e064a Merge pull request #6428 from masterleinad/enable_kokkos_isnan_for_bhalf_t c046bdba1 Merge pull request #6429 from tcclevenger/hip_potential_race_condition 6979f67ca Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy fea838822 Same for scan 89a42341c [SYCL][Reduction] Group counter should use at least memory_order::acq_rel 41253bd55 Set the device id in cuda_kernel_arch 111371f10 Merge remote-tracking branch 'upstream/develop' into cuda_multiple_devices_constructor 96bb26b0c avoid potential race condition HIP 732d39219 Fix guard for isnan test for bhalf_t 9081d366c Merge pull request #6423 from uliegecsm/viewmapping_comparison 3bbbe2ba5 improve tests to address review e3a608bb5 add comment 4389d81c4 Fix to avoid #186-D pointless comparison warning. 
ba1bd2303 Merge pull request #6418 from dalg24/uvm_warn_once cb459e741 Merge pull request #6419 from dalg24/cuda_bock_size_deduction_device_properties 035d28487 Address reviewer' comments 45646ab3b Use execution space instance argument to get device properties in block size deduction 582dfeac7 check-copyright improvements (#6399) 26a4cd43a Only warn once (at initialization) when forcing allocation in unified memory 03ba69e07 Merge pull request #6417 from dalg24/drop_check_support_unified_addressing eb8ee282a fix single as per Christian's suggestion 21f72433b Drop check whether device supports unified addressing c32f9c90b Merge pull request #6415 from cz4rs/fedora-enable-death-tests 1c0c73402 Merge pull request #6413 from dalg24/pre_kepler_arch_not_supported 1affb05d0 core/src: Add half math functions to private header (#6124) c19926fd5 Enable death tests for fedora rawhide 7d9394ddb formatting 1cb10cb27 Merge remote-tracking branch 'upstream/develop' into team_level_p10 f4d7ea559 Merge remote-tracking branch 'upstream/develop' into team_level_p9 fc213ead1 Team-level std algos: part 7 (#6211) db591e369 formatting afac7784f address comments 8a3ec1bf5 Merge remote-tracking branch 'upstream/develop' into team_level_p10 dd5c624b7 use single c4c9ed551 Merge remote-tracking branch 'upstream/develop' into team_level_p9 fd774d916 Merge pull request #6411 from dalg24/precondition_not_initalized 28061e84d Drop pre-Kepler logic in Cuda::impl_initialize 7b4d0a6f7 Merge pull request #6410 from dalg24/unused_hip_internal_data_members c692a816d !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize d8846bf84 Merge pull request #6409 from dalg24/host_exec_initialized_before_device 881801cb4 Drop unused HIPInternal::m_hipArch static data member 6aaf3736b Drop unused HIPInternal::m_maxSharedWords data member 53b2b2285 Merge pull request #6408 from dalg24/drop_cuda_internal_maximum_shared_words c6cd24794 Drop check that the host backend is initialized 
before the Cuda/HIP/SYCL one 34aebe55c Drop (unused) `Cuda::cuda_internal_maximum_shared_words` 3c97512cc OpenMP backend refactor files. (#6403) 93fe629cf address comments c53a95e0c Merge pull request #6402 from dalg24/cuda_malloc_async_on_by_default bda5326b1 team-level std algos: part 6 (#6210) 49d4048cd Merge pull request #6405 from cz4rs/fix-cmake-warning-benchmarks 31c060ab5 Merge pull request #6401 from dalg24/manage_stream_should_be_private 629128cb7 Merge pull request #6406 from cz4rs/appveyor-disable-benchmarks 201f78d6d Merge pull request #6400 from dalg24/checked_integer_ops_death_category e5a23d10e Disable performance benchmarks in AppVeyor CI 926a6420f Use archive extraction time for timestamps…
This PR is extracted from #218. It corrects a typo. A PR is probably needed in Kokkos too.