Pulse · microsoft/onnxruntime · GitHub

September 2, 2024–September 9, 2024

Overview

67 Active pull requests

27 Active issues

1 Release published by 1 person

v1.19.2 ONNX Runtime v1.19.2
published Sep 4, 2024

47 Pull requests merged by 25 people

Move Gelu and LayerNorm fusion to L1 optimization
#21332 merged Sep 9, 2024
Add parameter for flexdonwload
#22009 merged Sep 8, 2024
[WebNN EP] Remove workaround for CPU op supported list
#21962 merged Sep 7, 2024
Use output variable from InstallAppleProvisioningProfile task to set provisioning profile UUID.
#22018 merged Sep 7, 2024
[VitisAI] Add processing for sessionOptions.AppendExecutionProvider( "VitisAI", options)
#21839 merged Sep 6, 2024
NimbleEdge blog update
#22017 merged Sep 6, 2024
near-zero negative values must convert to 0 not NAN
#18473 merged Sep 6, 2024
remove unused and confusing float16 constants
#21999 merged Sep 6, 2024
Fix typo in coreml_supported_mlprogram_ops.md
#22004 merged Sep 6, 2024
Update Android NDK version to 27.0.12077973.
#21989 merged Sep 6, 2024
[TransposeOptimizer] Support Unsqueeze/Transpose of input consumed by per-axis DQ
#21821 merged Sep 6, 2024
[WebNN EP] Use identity for one input of Max/Min
#21974 merged Sep 5, 2024
Update CoreML supported ops lists
#21627 merged Sep 5, 2024
Add better native nuget package readme
#21889 merged Sep 5, 2024
[CUDA] Update Dockerfile.cuda with cuda 12.5.1 and cudnn 9
#21987 merged Sep 5, 2024
[Fuzzer] Add fuzzer support for linux
#21996 merged Sep 5, 2024
[VitisAI] remove unused header
#21890 merged Sep 5, 2024
Uncomment line in OVEP that was commented out in error
#21973 merged Sep 5, 2024
Fix DML packaging CIs
#21997 merged Sep 5, 2024
complete the GetSubGraph
#21998 merged Sep 5, 2024
Fix C# doc generation workflow
#21988 merged Sep 5, 2024
fix one build warning in MSVC
#21983 merged Sep 5, 2024
[js/webgpu] Optimize grouped conv
#21892 merged Sep 5, 2024
Add packaging version constraint.
#21814 merged Sep 4, 2024
Sets enable_windows_arm64ec_qnn to false in training CI
#21981 merged Sep 4, 2024
Update C# test projects
#21631 merged Sep 4, 2024
Update C# E2E project's test package versions
#21975 merged Sep 4, 2024
Rename ios_packaging.requirements.txt to ios_packaging/requirements.txt
#21936 merged Sep 4, 2024
[js/webgpu] Optimize transpose
#21964 merged Sep 4, 2024
Enable QNN weight sharing
#21077 merged Sep 4, 2024
Update documentation for ovep rel-5.4
#21909 merged Sep 4, 2024
[VitisAI] add registered custom op for perf test
#21336 merged Sep 4, 2024
[VitisAI] Bug fixes in model_clone
#21950 merged Sep 4, 2024
Improve stability of Android ReactNative E2E test
#21969 merged Sep 4, 2024
Update vsinpu ep cross-compiling patch
#21963 merged Sep 4, 2024
[VitisAI] Fix model path
#21911 merged Sep 4, 2024
refactor: extract shared util function ComputeBroadcastOutputShape
#21940 merged Sep 4, 2024
compare the content of WEBGPU_BUFFER, not the address
#21967 merged Sep 3, 2024
reflect change in WebGpuProviderFactoryCreator::Create signature
#21971 merged Sep 3, 2024
Remove unused find_cudnn_supported_cuda_versions
#21620 merged Sep 3, 2024
Memory Optimization for Compilation in OVEP
#21872 merged Sep 3, 2024
Add Expand op
#21933 merged Sep 3, 2024
Enable xnnpack ep works in current windows xnn ci
#21951 merged Sep 3, 2024
Fix copying ORT dylib into wheel on macOS
#21931 merged Sep 3, 2024
revert forceinline for MakeString
#21943 merged Sep 3, 2024
Fix C# warnings.
#21913 merged Sep 3, 2024
Add dependency dawn into deps.txt
#21910 merged Sep 2, 2024

20 Pull requests opened by 16 people

[Quantization] Apply workaround for crash when using histogram-based calibrators
#21972 opened Sep 3, 2024
fix more dml warnings
#21980 opened Sep 4, 2024
configure logger for onnxruntime-node package
#21982 opened Sep 4, 2024
Implementation of AVX-VNNI-INT8 dot product instructions into MLAS GEMM
#21984 opened Sep 4, 2024
adds support for Uint8ClampedArray
#21985 opened Sep 4, 2024
[webgpu-native] Add transpose op
#21986 opened Sep 4, 2024
Add FastGelu op
#21991 opened Sep 5, 2024
[ROCm] fix rocm-6.2 build issues
#21993 opened Sep 5, 2024
[JS/WebGPU] Upgrade emsdk to 3.1.61
#21994 opened Sep 5, 2024
[WIP][js/webgpu] Don't do layout conversion for InstanceNormalization
#21995 opened Sep 5, 2024
Enable Pad->Conv(no pads) fusion
#22001 opened Sep 5, 2024
Fixed many (not all) accessibility issues.
#22002 opened Sep 5, 2024
Decrease API docs artifact retention days
#22003 opened Sep 5, 2024
[Draft] Run tests on old pending allocator change
#22008 opened Sep 6, 2024
Upgrade XNNPACK to latest version
#22012 opened Sep 6, 2024
[webgpu-native] Add where op
#22014 opened Sep 6, 2024
Prevent int32 quantized bias from clipping by adjusting the weight's scale
#22020 opened Sep 7, 2024
[WebNN EP] Use opSupportLimits to dynamically check data type support
#22025 opened Sep 9, 2024
Add fusions for re-designed Phi-3 vision and Phi-3.5 vision ONNX models
#22026 opened Sep 9, 2024
Ovep release lnl 1.2.1
#22027 opened Sep 9, 2024

6 Issues closed by 5 people

[Feature Request] Move graph compilation behind higher transformers (graph optimization)
#20915 closed Sep 9, 2024
E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA
#22019 closed Sep 6, 2024
[Mobile] Test Application Reveals Multiple Failures in QnnHTPBackendTests Suite on Device
#21887 closed Sep 5, 2024
[Build] 1.19.0 still depends on CUDA 11.x
#21965 closed Sep 4, 2024
Llama2 RMS Norm: SimplifiedLayerNormalization
#21925 closed Sep 4, 2024
Llama2 RMS Norm: SimplifiedLayerNormalization
#21924 closed Sep 3, 2024

21 Issues opened by 21 people

[Web] onnxruntime-gpu(1.18.0) can not be install
#22028 opened Sep 9, 2024
[Inference & Training] My Onnxruntime isnt detecting cuda even after all paths are perfectly given with compatible softwares
#22016 opened Sep 6, 2024
[Build] onnxruntime-openvino library does not have Python 3.12 support
#22015 opened Sep 6, 2024
onnxruntime c++ logging crash
#22013 opened Sep 6, 2024
[Web] no available backend found [wasm] when importing `onnxruntime-web/wasm`
#22010 opened Sep 6, 2024
Context leak detected with CoreMLExecutionProvider
#22007 opened Sep 6, 2024
Meet error [No module named '_kernel_explorer'] when use triton kernel
#22006 opened Sep 6, 2024
[Web] __turbopack_resolve_absolute_path__ is not a function
#22005 opened Sep 6, 2024
CUDA does not load on Windows
#22000 opened Sep 5, 2024
Extra Memory Usage due to 'CreateSessionFromArray' copying the model again
#21992 opened Sep 5, 2024
[Performance] CUDAExecutionProvider without RoiAlign (opset 16 version)
#21990 opened Sep 5, 2024
[Performance] Increasing Memory Usage during INT8 Quantization with ONNX Runtime tools
#21979 opened Sep 4, 2024
[Build] compiling the WASM in Firefox takes ~10 minutes and 4GB of ram
#21978 opened Sep 4, 2024
Has there been a 1.19.2 release (tag was set)
#21977 opened Sep 4, 2024
[Build] CMake Fails to Find Non-LTS Abseil Library (Download Triggered)
#21976 opened Sep 4, 2024
[Web] Uncaught WebGPU validation error on Snapdragon SM8450 but works on SM8250
#21970 opened Sep 3, 2024
[Web] BiRefNet_T not working on webgpu
#21968 opened Sep 3, 2024
[CUDA][Performance] Inference time greatly variates during session run
#21966 opened Sep 3, 2024
[Mobile] IOS library crashes in Release configuration
#21960 opened Sep 2, 2024
Use AppendExecutionProvider_Dnnl api to add onednn EP, No success.
#21958 opened Sep 2, 2024
1.19: Clip operator with type FLOAT16 defaults to min or max value 0.0 if not explicitly given, breaking many models using FLOAT16
#21957 opened Sep 2, 2024

50 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[WebNN EP] Support GRU operator
#20405 commented on Sep 9, 2024 • 24 new comments
Refactor CoreMLExecution to C++ bridge class
#21857 commented on Sep 8, 2024 • 7 new comments
Adding ONNX Runtime C-API for WebGPU EP
#21838 commented on Sep 5, 2024 • 5 new comments
Using wostringstream only on Windows
#21938 commented on Sep 6, 2024 • 4 new comments
[WIP] WebGPU EP [skip ci]
#21904 commented on Sep 9, 2024 • 2 new comments
[Web] WebGPU and WASM Backends Unavailable within Service Worker
#20876 commented on Sep 9, 2024 • 0 new comments
Update pool to MacOS-13
#17361 commented on Sep 5, 2024 • 0 new comments
Update C++ to standard 20 for Windows
#17706 commented on Sep 6, 2024 • 0 new comments
Enable AVX NE CONVERT for FP16 to FP32 cast
#21183 commented on Sep 6, 2024 • 0 new comments
Create CMake option `onnxruntime_USE_VCPKG`
#21348 commented on Sep 6, 2024 • 0 new comments
[WIP] Out-Tree EP feature
#21450 commented on Sep 6, 2024 • 0 new comments
Add Interactive Decoding support in GQA
#21523 commented on Sep 6, 2024 • 0 new comments
ThreadPool: Spend less time busy waiting.
#21545 commented on Sep 5, 2024 • 0 new comments
Make argmin/armax support identical data types and add int64 support
#21641 commented on Sep 4, 2024 • 0 new comments
Add option for compiler flags.
#21721 commented on Sep 3, 2024 • 0 new comments
ConvTranpose using CUDNN Frontend with NHWC support
#21752 commented on Sep 9, 2024 • 0 new comments
[java] Fix for OnnxTensor creation when passing in a ByteBuffer containing elements of a different type
#21774 commented on Sep 9, 2024 • 0 new comments
Matmul_nbits kernel for mlas sqnbits to support Fp16 inputs
#21807 commented on Sep 5, 2024 • 0 new comments
Genai page addition
#21862 commented on Sep 5, 2024 • 0 new comments
Integrate onnx 1.17.0
#21897 commented on Sep 5, 2024 • 0 new comments
[VitisAI] Translate all session configs into provider options with prefix
#21907 commented on Sep 4, 2024 • 0 new comments
fmha slide window
#21926 commented on Sep 5, 2024 • 0 new comments
Set Transpose Attribute instead for manipulating MatMul Strides
#21927 commented on Sep 4, 2024 • 0 new comments
[CUDA] upgrade cutlass to 3.5.1
#21939 commented on Sep 6, 2024 • 0 new comments
[JS/WebGPU] Fixed bugs in inputs validation of Resize
#21955 commented on Sep 3, 2024 • 0 new comments
CUDA_PATH is set but CUDA wasnt able to be loaded
#21527 commented on Sep 2, 2024 • 0 new comments
New restricted asymmetric quantization mode in QDQ mode with zero_point restricted to either 128 or 0
#21398 commented on Sep 2, 2024 • 0 new comments
CreateSessionFromArray doesn't work
#21946 commented on Sep 3, 2024 • 0 new comments
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on Sep 3, 2024 • 0 new comments
Why C++ cannot modify the enable_mem_reuse option in Ort::SessionOptions...
#21942 commented on Sep 3, 2024 • 0 new comments
Importing onnxruntime on AWS Lambdas with ARM64 processor causes crash
#10038 commented on Sep 4, 2024 • 0 new comments
CPU cores and threads control
#8193 commented on Sep 4, 2024 • 0 new comments
Python ORT latest is available in 1.19.0 whereas NuGet package is 1.19.1
#21953 commented on Sep 4, 2024 • 0 new comments
[Discussion] ORT GPU binaries do not contain DML
#20638 commented on Sep 5, 2024 • 0 new comments
[Web] Memory access out of bounds / alignment fault
#21355 commented on Sep 5, 2024 • 0 new comments
[Web] How to free webgpu gpu mem in onnxruntime web
#21574 commented on Sep 5, 2024 • 0 new comments
DirectML returning empty result with ObjectDetection (Mobilinet V2 FPN Keras)
#20386 commented on Sep 5, 2024 • 0 new comments
[Mobile] Will provide supported 16KB-page-size prebuilt Android.so in the future?
#21837 commented on Sep 5, 2024 • 0 new comments
Failed to allocated memory for requested buffer of size X
#20038 commented on Sep 6, 2024 • 0 new comments
[Build] CUDA Illegal Memory Access error when using a custom Triton kernel
#20885 commented on Sep 6, 2024 • 0 new comments
Java GPU dependency of ONNX Runtime version 1.18 only support CUDA 12?
#21651 commented on Sep 6, 2024 • 0 new comments
Android build: Execution failed for task ':app:mergeExtDexDebug'.
#21494 commented on Sep 6, 2024 • 0 new comments
DirectML error: The parameter is incorrect with KBNet S
#21583 commented on Sep 7, 2024 • 0 new comments
[Mobile] [react-native] [android] Tensor.fromImage error ReferenceError: Property 'document' doesn't exist
#17752 commented on Sep 7, 2024 • 0 new comments
[cuda ep] Squeeze node fails when axes is not provided
#21661 commented on Sep 7, 2024 • 0 new comments
[Performance] Mapfile support for certain external data files is not working
#21195 commented on Sep 7, 2024 • 0 new comments
how to release gpu memory when keep onnxruntime session around.
#9509 commented on Sep 8, 2024 • 0 new comments
[Performance] Inference time discrepancy when using TorchScript vs ONNX exported model
#21689 commented on Sep 8, 2024 • 0 new comments
[Feature Request] Support for Florence-2 model family
#21118 commented on Sep 8, 2024 • 0 new comments
[Web] Wav2vec2 slower on WebGPU than WASM
#21618 commented on Sep 9, 2024 • 0 new comments