Releases: intel/intel-graphics-compiler
igc-1.0.5964
Fixed Issues / Improvements
- Process legalization of 64-bit moves on VC backend side.
- Renumber subroutines after removing unreachable code.
- Add environment variable for OCL debugging options.
- urem needs non-negative operands, but srcMod could generate negative ones. To be safe, disable srcMod for urem.
- The existing I64 shift emulation produced a wrong result if the shift amount was not in [0, 31]. The inner condition was generated incorrectly.
- Passing context for code patching.
- Search for genx.output.1 intrinsics not only in return blocks (some execution paths may have callable instead).
- Fix for DumpToCustomDir flag and logic of ShaderDumpPidDisable flag on Linux.
- Fix: Wrong pattern is matched in GenSpecificPattern.
- Add support for SPV_INTEL_long_constant_composite.
- Use simple token allocation algorithm in debug mode.
- Disable stateless to stateful promotion after 32 promotion.
- Adding /Ob3 inline expansion option to 64-bit release config for aggressive inlining within IGC.
- Fix coalescing of output arguments after migration to genx.output.1 intrinsic.
- Support for SPV_EXT_shader_atomic_float_add extension.
- Guard the FC patch SWSB info generation to save compilation time
- When a byte is promoted to word, its signedness should remain unchanged.
- Temp WA to limit kernel name length
- Initialize address register for indirect addressing if shader has indirect resources accesses i.e. a0 is used in send descriptor.
- Add support for SPV_INTEL_fp_fast_math_mode in SPIRVReader.
- Address register initial support.
- Fix initialization of GenXTidyControlFlow
- Fix push constant threshold for CFL GT3.
- Fixed performance issues with subroutine inlining heuristic.
- Move block push constants threshold setting from being the default IGC flag value to CPlatform.
- Remove -hasRNEAndRenorm and its associated code.
- int64 mul does not support srcMod
- Refactored some conditions to make the source code more idiomatic and easier to read.
- Change of stateless indirect access reporting mechanism.
- Prevent unnecessary copies generation on GenXCoalescing.
- Extract code that adds Compute Shader CodeGen passes to a separate function.
- Change return type of createWrRegion and small refactoring in GenXBaling
- Hybrid RA with spill
- Maintain physical pred/succ during CFG BB insert/delete.
- RA compilation time: remove unnecessary operations when building interference with local RA.
- Workaround for sampler feedback bug.
- Optimization for signed scalar division by a constant power-of-2 integer divisor.
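The I64 shift emulation fix listed above concerns shift amounts outside [0, 31]. As a rough illustration of why that case needs its own branch, here is a minimal C++ sketch of a 64-bit left shift built from 32-bit halves. The names `U64Parts`, `emulatedShl64`, `toU64` and `fromU64` are hypothetical and this is not IGC's actual emulation code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type: a 64-bit value held as two 32-bit halves.
struct U64Parts {
    uint32_t lo;
    uint32_t hi;
};

// Emulate (value << amt) for amt in [0, 63] using only 32-bit operations.
constexpr U64Parts emulatedShl64(U64Parts v, uint32_t amt) {
    assert(amt < 64 && "shift amount must be in [0, 63]");
    if (amt == 0) {
        return v;  // avoids a 32-bit shift by 32 in the branch below
    }
    if (amt < 32) {
        // Bits shifted out of the low half carry into the high half.
        return {v.lo << amt, (v.hi << amt) | (v.lo >> (32 - amt))};
    }
    // amt in [32, 63]: the low half becomes zero and the high half
    // receives the old low half shifted by (amt - 32).
    return {0u, v.lo << (amt - 32)};
}

// Helpers used only to compare against a native 64-bit shift.
constexpr uint64_t toU64(U64Parts v) {
    return (static_cast<uint64_t>(v.hi) << 32) | v.lo;
}
constexpr U64Parts fromU64(uint64_t x) {
    return {static_cast<uint32_t>(x), static_cast<uint32_t>(x >> 32)};
}
```

If the condition separating the two non-zero branches is wrong, every shift amount of 32 or more yields a wrong result, which is the class of bug the entry describes.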
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@5032643
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5884
Fixed Issues / Improvements
- Avoid read-modify-write when spilling scalar variables.
- Open source ROCKETLAKE and ALDERLAKE_S
- Refactor spill/fill intrinsic to not rely on the execution size passed in.
- Replace a hot function with templated version for better compile time.
- Move private memory allocations to SLM.
- Add preserve CFG and WIA to AdvMemOpt to save time
- Enable ForceInlineStackCallWithImplArg by default, and -O0 no longer force inlines all function calls.
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@5032643
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5819
Fixed Issues / Improvements
- Moved FP64 math to separate bc, a second attempt,
- Added debug printouts for DebugInfo,
- Renamed createSrc to make it less verbose and aligned with createDst,
- Added processing of new masked gather intrinsics (gather4_masked_scaled2 and gather_masked_scaled2)
- Moved ModuleAllocaInfo to the header for reuse,
- Unified inlining heuristic for stackcalls and subroutines,
- Cleaned up IR debug dumps,
- Fix for argument indirection with already indirected call,
- Fixed a bug where spilled dst's size was incorrectly computed in debug mode,
- VC: Support FP64 BiF after it was supported in scalar backend,
- Added optimizations for signed division for constant power of 2,
- Cleaned up in GenXCategory,
- Fixed subregister offset for spilled destination,
- IMF LA open-sourcing: Switch back to previous FP32 atan2 implementation,
- Changed reduce implementation to remove extra barriers,
- Corrected wrappers in llvm::DIBuilder,
- Provide the ability to call one kernel from another,
- Reduced time on LiveVar update,
- Don't insert branches in loops and on a big amount of samples,
- Enabled ForceInlineStackCallWithImplArg by default, and -O0 no longer force inlines all function calls,
- Reduced RA compilation time: replaced push_back with emplace_back,
- Handle optnone builtins with subroutines instead of stackcalls,
- Other minor fixes and improvements.
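For the signed division by a constant power of 2 mentioned in the list above, the usual strength reduction replaces the divide with a biased arithmetic shift. The sketch below shows that general technique in C++; `sdivPow2` is a hypothetical name, the notes do not describe the exact sequence IGC emits, and the code assumes that right-shifting a negative signed value is an arithmetic shift (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Divide x by 2^k (1 <= k <= 30), rounding toward zero like the C++ '/' operator.
constexpr int32_t sdivPow2(int32_t x, unsigned k) {
    assert(k >= 1 && k <= 30);
    // All ones if x is negative, zero otherwise.
    uint32_t signMask = static_cast<uint32_t>(x >> 31);
    // bias is (2^k - 1) for negative x and 0 for non-negative x.
    int32_t bias = static_cast<int32_t>(signMask >> (32 - k));
    // A plain arithmetic shift rounds toward negative infinity;
    // adding the bias first makes negative inputs round toward zero.
    return (x + bias) >> k;
}
```

Without the bias, `-7 >> 1` would give -4, whereas `-7 / 2` is -3.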
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@4e83bbf
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@5032643
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5761
Fixed Issues / Improvements
- Added padding between globals when encoding,
- Added SPIRVDLL_SRC variable which takes prepared spirvdll sources,
- Added support to emit relocations in debug info,
- Improved LiveVar time by changing data-structure,
- Improvements in VC debug info,
- Increased per-thread stack size for SVM case,
- Made GenXTidyControlFlow actually preserve liveness,
- Moved splitStructPhis implementation to the proper place,
- Optimized generic pointer load for kernels not using local memory,
- Reduced the RA compilation time,
- Reduced the redundant interferences caused by function call,
- Specified type of pointer arithmetic to avoid tagging,
- Updated patch token version,
- Utilized genx.gaddr intrinsic for const/global tables,
- Other minor fixes and improvements.
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@4e83bbf
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@a08fe5b
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5723
Fixed Issues / Improvements
- Considering uniformness during register pressure estimate,
- Eliminated name length field restriction,
- Enabled spill cleanup for fp based spill/fill,
- Fixed extra option processing for CM online compilation,
- Fixed image tracking for GetBufferPtr scenario,
- Fixed spill code generation for spilled dest with non-zero subregister,
- Fixed the assignment of BTI values in the case of multiple uses,
- IMF LA open-sourcing,
- Implemented SPV_INTEL_unstructured_loop_controls extension,
- Other minor fixes and improvements.
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@4e83bbf
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@a08fe5b
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5699
Fixed Issues / Improvements
- Added IMF LA math function for FP32
- Avoid OCL kernel recompilation if there is less than 2% spill/fill
- Implement support for implicit arguments in stack call functions
- Fix bug in SPIRV reader to correctly propagate flags
- Local variables no longer optimized out in off-loaded functions
- Use ValueTracker to track the width and height parameters of media block read/write
- Add support for reading implicit arguments from stack call functions
- Fix vISA parser error for fcall/fret
- Optimize to generate mad by promoting src2 from :b to :w
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@4e83bbf
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@a08fe5b
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5585
Fixed Issues / Improvements
- Update CheckInstrTypes pass to provide more detailed statistics wrt. the global memory and storage buffer accesses.
- Add option to split evaluate messages
- AdaptorCM should provide access to DyLib wrapper
- IMF: unify trigonometry tables
- Add switch to jump table lowering
- Enable GAS resolution for arguments: when we can prove that a pointer coming from kernel function arguments is not changed inside the kernel, we assume that it points to the global address space
- Pick UD visa type for function pointers
- Add rotate pattern
- Remove split of arithmetic insts in CISACodeGen. Inst split is handled by vISA.
- Fix predicates passing thru indirect calls
- GenXBackendData refactoring
- introduce an environment variable to pass extra arguments to CM FE
- Load CMFE using driver path on windows
- Add translation of FPFastMathMode decorations in SPIRVReader. This is a port of KhronosGroup/SPIRV-LLVM-Translator@14e2c5f.
- Force inlining on stack call functions with implicit arguments.
- IGA: Fix mov with label instructions. Do not compact them.
- ZEBinWriter: fix variables init order
- Fix issue on zext with i1 src type not recognized as supported in PeepholeLegalizer
- Added opportunity to initialize GenXBackendData in a debug scenario. -vc-ocl-generic-bif-path was added to pass BiF file to backend.
- Check for int64 moves at the end of HWConformity for platforms without int64.
- Reduce the number of rd/wr region sequences generated during shufflevector lowering
- Change reduce implementation to remove extra barriers
- Pass down DebugLocation of SPIR-V builtins to LLVM-IR
- Emulation routines for pointer compare and converts
- Fix bitwidth for svm atomic instructions
- Make flushL3 parameter compile time constant to avoid crash in emitMemoryFence
- Preserve debug info for subgroup barrier with -O0.
- Pre-increment switchjmp index operand on TGL.
- Limit RA iterations before we switch to failSafe RA.
- Refactor code related to CISA alignment to avoid a potential bug
- Add ashr and fix trunc bug in peephole legalizer. We currently need to ensure we are handling ashr when it is between i1 and i64 and not a supported type
- Fix condition that turns remat off.
- Added environment variables to set API/Internal Options for VC
- Due to incorrect scoreboarding, flags registers were being stored as
- Change vISA to use a table-based scheme to manage platforms.
- Introduction of new entry in IGC constant folder for fbh_shi.
- Fixed function signature cloning for argument attributes
- Introduction of new entry in IGC constant folder for fbh.
- Add new patch token with stateless access
- Consider SIMD mask while constfolding
- Remove the indirect call variable size checking in RA
- Add method to do sampler splits SIMD8/16 sampler to odd and even subspans
- Allow various architecture register functions to be used in release driver.
- Make ternary instruction operand GRF-aligned
- Set CISA offset correctly when transforming goto/join to if/else/endif.
- update calling conv only for imported from BiF funcs in VC
- Fix computation of CFA, return address location, callee save GRF
- memset lowering improvements in VC
- Constrain name string length to be 256 as per spec.
- Fix bug on linear scan RA for stack call
- Revert of change: 242d131 Internal feature
- Disable scheduler when compiling without optimizations.
- Remove some dead functions
- Fix nested stack calls corrupting the ARG register:
  - Only apply the optimization to alias the ARG register for leaf functions, since they do not further write to ARG.
  - Non-leaf functions still require copying the ARGs to a temp register at function entry.
- Enable recursion and convert all recursive calls to stack calls
- Support function pointer arguments passing thru SVM
- fix align wrapper for unset alignment in CMPacketize
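The "Add rotate pattern" entry above refers to recognizing a rotate in the input code. The notes do not say which shapes IGC matches; as an assumption, the sketch below shows the common shift-and-or idiom that compilers typically recognize as a 32-bit rotate left. `rotl32` is a hypothetical name.

```cpp
#include <cstdint>

// Rotate x left by n bits, written as the shift/or idiom.
constexpr uint32_t rotl32(uint32_t x, uint32_t n) {
    n &= 31u;  // rotation amounts are taken modulo the bit width
    // Masking the right-shift amount keeps it in [0, 31] even when n == 0,
    // so no shift by the full width ever occurs.
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

Once matched, the two shifts and the or can be replaced by a single rotate instruction on hardware that provides one.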
Dependencies revisions
- intel/llvm-patches@d8b63ab
- intel/opencl-clang@4e83bbf
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@eabcd20
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5435
Fixed Issues / Improvements
- Added support for non-constant parameter for llvm.genx.GenISA.memoryfence and llvm.genx.GenISA.typedmemoryfence instructions,
- Added analysis of stateless memory load/store,
- Added env. variable for SPIRVDLL path,
- Added new regkey to disable DispatchAlongY for HW IDs,
- Avoiding passing large composite values when localizing globals by value,
- Compile time improvements in IGC debug info,
- Considering constant EM lowered in SIMDCFConformance,
- Convert pseudo-or/and/xor to Gen or/and/xor if flag size is smaller than inst SIMD size,
- Enabled vector-backend to take LLVM bitcode input,
- Debug info should be generated automatically when run under gdb,
- Enabled RNE by default for IEEE float-divide,
- Enabled SIMD32 subgroup shuffles,
- Fixed GRF register checking in linear scan RA,
- Fixed parsing of switchjmp for vISA assembly,
- Fixed SPIRV reader constructing reference type debug metadata,
- Initial support for interprocedural TPM pass,
- Introduction of new entry in IGC constant folder for IBFE,
- Lower genx.*mul.sat intrinsics since integer mul.sat is not supported,
- Made improvements in ranges emitted to .debug_ranges section,
- Negative modifiers are no longer baled if applied on unsigned integer,
- Open sourced IMF LA; added common header,
- Refactored Int64b support,
- Removed indirection for input arguments,
- Removed legacy inter-procedural analysis code for RA,
- Removed support for -cl-feature,
- Rewritten CM adaptor library,
- Simplifying building workspace/instruction,
- Skip CS simd32 if it is a retry to speed up compile time,
- Skip invalid warning in ShaderOverride mode,
- Unify the stack call functions for both global RA and linear scan RA,
- Vector compiler:
  - Backend should not use reserved BTI indexes for debuggable kernels,
  - i64 emulation should convert partial predicates to icmp,
  - Refactored TransformNode CMABI.
- Changes in preparation of LLVM 11 upgrade,
- Other minor improvements and fixes.
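The entry above about lowering genx.*mul.sat intrinsics says integer mul.sat is not supported directly. The notes do not describe the replacement sequence; one common way to express a saturating multiply without native support is to multiply in a wider type and clamp the result. The C++ sketch below shows that approach for signed 32-bit values, with `mulSatI32` as a hypothetical name.

```cpp
#include <cstdint>
#include <limits>

// Saturating signed 32-bit multiply: compute the exact product in 64 bits,
// then clamp it into the int32_t range.
constexpr int32_t mulSatI32(int32_t a, int32_t b) {
    // The product of two 32-bit values always fits in 64 bits.
    int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
    if (wide > std::numeric_limits<int32_t>::max()) {
        return std::numeric_limits<int32_t>::max();
    }
    if (wide < std::numeric_limits<int32_t>::min()) {
        return std::numeric_limits<int32_t>::min();
    }
    return static_cast<int32_t>(wide);
}
```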
Dependencies revisions
- intel/llvm-patches@d8b63ab
- intel/opencl-clang@55e6029
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@eabcd20
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5353
Fixed Issues / Improvements
- Implemented VISA stack call ABI in generated code; updated VISA debug info as per ABI,
- Emitting caller save ranges to debug info for stack call functions,
- Writing NewStore to MemRefs for correct isSafeToMergeLoad working,
- Removed support for -cl-feature for Linux.
Dependencies revisions
- intel/llvm-patches@cfc8005
- intel/opencl-clang@55e6029
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@eabcd20
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5349
Fixed Issues / Improvements
- Updated IGA,
- Improving i64 emulation:
  - Enabled i64 emulation on platforms which do not support it,
  - Extended i64 emulation pass to support genx_min/max and genx_trunc_sat,
  - Avoiding i64 emulation for indirect operands,
- Added CodeGenContext members for non-kernel-arg stateless memory access analysis,
- Added DeleteLegacyIntrinsicDeclarations pass,
- Added diagnostic for missing int<->double conversions,
- Added System Routine interface,
- Added a regkey to insert the discard code during IGC compilation,
- Added support for double type in erf, erfc, lgamma and tgamma functions,
- Added support for using vertex buffers to send shader draw parameters,
- Added an ExtraOCLOptions regkey to configuration flags,
- Added more debug output for VISA builder and regalloc,
- Added new string macros; applied them to IGC,
- Allowing arithmetic on generic pointers,
- Changed vISA null variable to null ARF,
- Cleaned up ld/st/atomic structured emit,
- Improved marking allocas as random (non-uniform),
- Fixed aligning method in live ranges,
- Fixed condition to clean implicit id code when -cl-kernel-debug-enable is passed,
- Fixed bug for linear Scan RA stack call,
- Fixed enabling SIMD16 for stackcalls, accounting for enabling subroutines at the same time,
- Optimized compilation time for linear scan RA,
- Refactored StatelessToStatefull pass,
- Refactored createInst() to avoid passing useless line number information,
- Removed VC-related code from dllInterfaceCompute; moved this code to igcdeps library,
- Removed LLVM 4 support in WrapperLLVM,
- TrivialLocalMem optimization is meant for entry function only. Disabled it for subroutines,
- Tuned remat cost functions,
- Setting proper addrspace when localizing globals,
- Multiple other improvements and bug fixes.
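To give a sense of what the i64 emulation mentioned in the list above has to do on platforms without native 64-bit integers, here is a minimal C++ sketch of a 64-bit addition built from 32-bit halves with an explicit carry. The names `Pair32`, `emulatedAdd64`, `join` and `split` are hypothetical and the code is not taken from the emulation pass.

```cpp
#include <cstdint>

// Hypothetical illustration type: a 64-bit value held as two 32-bit halves.
struct Pair32 {
    uint32_t lo;
    uint32_t hi;
};

// Add two 64-bit values using only 32-bit arithmetic.
constexpr Pair32 emulatedAdd64(Pair32 a, Pair32 b) {
    uint32_t lo = a.lo + b.lo;               // wraps modulo 2^32
    uint32_t carry = (lo < a.lo) ? 1u : 0u;  // the sum wrapped iff it is smaller than an operand
    uint32_t hi = a.hi + b.hi + carry;       // carry propagates into the high half
    return {lo, hi};
}

// Helpers used only to compare against native 64-bit arithmetic.
constexpr uint64_t join(Pair32 v) {
    return (static_cast<uint64_t>(v.hi) << 32) | v.lo;
}
constexpr Pair32 split(uint64_t x) {
    return {static_cast<uint32_t>(x), static_cast<uint32_t>(x >> 32)};
}
```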
Dependencies revisions
- intel/llvm-patches@cfc8005
- intel/opencl-clang@55e6029
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@eabcd20
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.