- Add some ops from spconv 1.x; see `spconv.utils` for details.
- Add debug tools so users can attach more information to issues.
- Add a voxel generator method that returns `pc_voxel_id`, the per-point voxel index commonly used in semantic segmentation.
- Fix a bug in the CUDA voxel generator when `max_voxels` is smaller than the actual number of voxels.
- Fixed a bug in Volta kernels (TITAN V, Tesla V100): backward weight kernels used f16 as the accumulator; they should use f32.
- Fixed a corner case where kernel size is 1x1 but stride != 1.
- Fixed a corner case where the input feature of max pooling is non-contiguous.
- Fixed a bug in `utils.PointToVoxel`: the CUDA stream shouldn't be queried in CPU-only code.
- Removed an incorrect assert.
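The `pc_voxel_id` mapping and the `max_voxels` corner case above can be sketched in plain NumPy. This is an illustrative toy, not the spconv API: the function name and signature here are invented, and real spconv voxelization runs on GPU with per-voxel point limits.

```python
import numpy as np

def voxelize_with_pc_voxel_id(points, voxel_size, max_voxels):
    """Toy voxelizer: returns kept voxel coords and, per point, the id of
    the voxel it fell into, or -1 if its voxel was dropped (max_voxels)."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    voxel_coords = []            # coordinates of kept voxels, in insertion order
    voxel_index = {}             # voxel coordinate -> voxel id
    pc_voxel_id = np.full(len(points), -1, dtype=np.int64)
    for i, c in enumerate(map(tuple, coords)):
        if c not in voxel_index:
            if len(voxel_coords) >= max_voxels:
                continue         # voxel budget exhausted: this point keeps -1
            voxel_index[c] = len(voxel_coords)
            voxel_coords.append(c)
        pc_voxel_id[i] = voxel_index[c]
    return np.array(voxel_coords), pc_voxel_id

points = np.array([[0.1, 0.1, 0.1],
                   [0.2, 0.1, 0.1],   # lands in the same voxel as point 0
                   [1.5, 0.1, 0.1],
                   [3.5, 3.5, 3.5]])  # dropped once max_voxels=2 is reached
voxels, pc_voxel_id = voxelize_with_pc_voxel_id(points, voxel_size=1.0, max_voxels=2)
print(pc_voxel_id)  # -> [ 0  0  1 -1]
```

The -1 entries are what made the `max_voxels` bugfix above necessary: downstream semantic-segmentation code must handle points whose voxel was never generated.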
- Add support for PyTorch 1.5.
- Fix a bug when a network contains inverse convolutions and runs inference in eval mode.
- Fix missing `-fopenmp` linker flag for CPU-only builds.
- Remove stale-comment posting in CI.
- Add a CUDA profiling tool.
- Add Python 3.6 support.
- Format all code.
- Remove an unnecessary device sync and slightly improve performance.
- Fix a bug in SparseInverseConv3d.
- Fix a bug in the CPU-only package.
- Fix a bug with Python 3.7.
- Add an implicit GEMM algorithm for all kinds of convolution with kernel volume <= 32. This algorithm is very fast with float16.
- Add a PyTorch wrapper for the voxel generator.
- Add CPU support and a CPU-only build.
- Fix a serious bug where SparseSequential did nothing with non-spconv layers.
- Fix a bug in ProxyableClassMeta.
- Change the build system from CMake to pccm.
- Move the PyTorch Python code to `spconv.pytorch`.
- Rewrite all C++ code.
- Greatly increase subm indice pair generation speed with two tricks: (1) most subm convolutions use kernel size 3, so the loops can be unrolled for a ~100% performance increase; (2) subm indice pairs satisfy `indicePairs[0, i] = indicePairs[1, kernelVolume - i - 1]`, which yields another ~100% increase.
- Add batch GEMM support: a small performance increase but more GPU memory usage. Enable it with `algo=spconv.ConvAlgo.Batch`.
- Replace most 'functor' classes with C++14 dispatch in the C++ code.
- Change gather/scatterAdd kernel parameters to support large numbers of points.
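The subm indice-pair symmetry mentioned above can be checked with a toy 2D pair builder. This is an illustrative sketch, not spconv's CUDA implementation: for a centered kernel, `offsets[k] == -offsets[K - 1 - k]`, so the pairs for offset `k` are exactly the pairs for the mirrored offset with input and output swapped, which is why half the kernel volume never needs to be searched.

```python
import itertools

# Active sites of a toy 2D submanifold conv (output sites == input sites).
coords = [(0, 0), (0, 1), (1, 1), (2, 2)]
index = {c: i for i, c in enumerate(coords)}          # coordinate -> voxel id
offsets = list(itertools.product((-1, 0, 1), repeat=2))  # 3x3 kernel, K = 9
K = len(offsets)

# pairs[k] = list of (input_id, output_id) for kernel offset k.
pairs = [[] for _ in range(K)]
for out_id, (x, y) in enumerate(coords):
    for k, (dx, dy) in enumerate(offsets):
        in_site = (x + dx, y + dy)
        if in_site in index:
            pairs[k].append((index[in_site], out_id))

# The symmetry from the changelog entry: pairs for offset k equal the pairs
# for offset K-1-k with input/output swapped, since offsets[k] == -offsets[K-1-k].
for k in range(K):
    mirrored = sorted((o, i) for (i, o) in pairs[K - 1 - k])
    assert sorted(pairs[k]) == mirrored
print("symmetry holds for all", K, "kernel offsets")
```

The center offset (index 4 here) always pairs each voxel with itself, so in practice only the offsets on one side of the center need to be generated; the other side falls out of the symmetry for free.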