Image processing application that applies a 3x3 Gaussian blur efficiently using SIMD optimizations and multi-threading. Project focuses on high-performance execution through assembly-level optimizations and parallel processing.

antosiowsky/gaussian-filter-assembly

Gaussian Filter

Project Overview

Topic: Image filtering using a Gaussian filter

Objective: The goal of this project was to implement an image processing application using a Gaussian filter. The application ensures fast and efficient image filtering by utilizing assembly optimizations and multi-core processing.

Algorithm Description

The filtering process involves applying a 3x3 Gaussian filter matrix to each pixel in an image. The steps include:

  • Retrieving neighboring pixels within a 3x3 region.
  • Multiplying pixels by the corresponding weights in the Gaussian filter matrix.
  • Summing the results and normalizing with a normalization coefficient.
  • Writing the filtered pixel to the output image.
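The steps above can be sketched in C++ as a scalar reference implementation. This is a minimal sketch under stated assumptions, not the project's code: the function name gaussianBlur3x3 is hypothetical, the image is treated as single-channel 8-bit, the kernel is the common 1-2-1 / 2-4-2 / 1-2-1 integer approximation with a normalization coefficient of 16, and border pixels are copied unchanged. The README does not specify any of these.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: blurs a single-channel 8-bit image stored row by row.
// Kernel weights and border policy are assumptions, not taken from the project.
std::vector<std::uint8_t> gaussianBlur3x3(const std::vector<std::uint8_t>& src,
                                          int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    // Common integer approximation of a 3x3 Gaussian; the weights sum to 16.
    static const int kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    const int norm = 16;

    // Start from a copy so border pixels keep their input value.
    std::vector<std::uint8_t> dst(src);

    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            int sum = 0;
            // Retrieve the 3x3 neighborhood and apply the weights.
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    const std::size_t idx =
                        static_cast<std::size_t>(y + ky) * width + (x + kx);
                    sum += kernel[ky + 1][kx + 1] * src[idx];
                }
            }
            // Normalize and write the filtered pixel.
            dst[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>(sum / norm);
        }
    }
    return dst;
}
```

With these weights a uniform image comes back unchanged, which makes a convenient sanity check for any faster implementation.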

Optimization is achieved through the use of SIMD (Single Instruction Multiple Data) instructions and multi-threading, allowing simultaneous processing of multiple pixels.
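One way to spread the work across threads is to give each thread a contiguous band of image rows. The sketch below assumes that scheme; the project's actual partitioning is not described in this README, and splitRows and forEachBand are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: splits rows [0, height) into at most `threads`
// contiguous, non-empty bands of near-equal size.
std::vector<std::pair<int, int>> splitRows(int height, int threads) {
    assert(height >= 0 && threads > 0);
    const int bands = std::min(height, threads);
    std::vector<std::pair<int, int>> ranges;
    int begin = 0;
    for (int i = 0; i < bands; ++i) {
        // The first (height % bands) bands receive one extra row.
        const int rows = height / bands + (i < height % bands ? 1 : 0);
        ranges.emplace_back(begin, begin + rows);
        begin += rows;
    }
    return ranges;
}

// Hypothetical helper: runs processRows(begin, end) for each band
// on its own std::thread and waits for all of them to finish.
void forEachBand(int height, int threads,
                 const std::function<void(int, int)>& processRows) {
    std::vector<std::thread> workers;
    for (const auto& band : splitRows(height, threads)) {
        workers.emplace_back(processRows, band.first, band.second);
    }
    for (auto& worker : workers) {
        worker.join();
    }
}
```

If the filter reads from an input buffer and writes to a separate output buffer (an assumption here), each band writes a disjoint set of output rows, so no locking is needed between threads.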

Input Parameters

  • Input Image: BMP format image file to be processed.
  • Number of Threads: Specifies the number of threads used for processing (1, 2, 4, 8, 16, 32, 64).
  • Input Data Type: Various image types (e.g., uniform, gradient, random) for testing the algorithm.
  • Computation Library: Specifies the computational method (pure assembly vs. C++ implementation).
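The uniform, gradient, and random test inputs mentioned above could be generated along these lines. This is only a sketch: the TestPattern enum, the makeTestImage function, the single-channel 8-bit layout, and the specific pixel values are all assumptions, since the README does not say how its test images were produced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical pattern names mirroring the input data types listed above.
enum class TestPattern { Uniform, Gradient, Random };

// Hypothetical helper: builds a single-channel 8-bit test image, row by row.
std::vector<std::uint8_t> makeTestImage(TestPattern pattern, int width,
                                        int height, unsigned seed = 0) {
    assert(width > 0 && height > 0);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(width) * height);
    std::mt19937 rng(seed);  // fixed seed keeps random images reproducible
    std::uniform_int_distribution<int> dist(0, 255);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int value = 0;
            switch (pattern) {
                case TestPattern::Uniform:
                    value = 128;  // every pixel identical (assumed mid-gray)
                    break;
                case TestPattern::Gradient:
                    // Horizontal ramp from 0 (left) to 255 (right).
                    value = width > 1 ? x * 255 / (width - 1) : 0;
                    break;
                case TestPattern::Random:
                    value = dist(rng);
                    break;
            }
            pixels[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>(value);
        }
    }
    return pixels;
}
```

Uniform inputs are useful for correctness checks, while random inputs keep the measured time from depending on any regularity in the pixel data.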

Assembly Code Snippet

; Loading neighboring pixels
pinsrb xmm1, byte ptr[RCX + R11 - 3], 0
pinsrb xmm3, byte ptr[RCX + R11], 1
pinsrb xmm1, byte ptr[RCX + R11 + 3], 2
pinsrb xmm3, byte ptr[RCX - 3], 3
pinsrb xmm3, byte ptr[RCX + 3], 5

; Multiplying pixels by filter weights
pmullw xmm3, xmm4
pxor xmm2, xmm2
psadbw xmm1, xmm2
paddsw xmm1, xmm3

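This snippet gathers neighboring bytes into XMM registers with pinsrb, multiplies the values in xmm3 by the filter weights in xmm4 with pmullw, and uses psadbw against the zeroed xmm2 to add up the bytes held in xmm1. Several neighbors are therefore combined per instruction rather than one at a time, which increases processing speed. The byte offsets of -3 and +3 suggest 3 bytes per pixel (24-bit BMP), although the snippet does not state this.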

User Interface

The application provides a graphical user interface (GUI) where users can:

  • Select a BMP image file for processing.
  • Specify the number of filtering iterations.
  • Choose a processing library (C++ or Assembly).
  • Adjust the number of threads using a slider.
  • Apply the Gaussian filter and save the output image.

Performance Measurements

Testing was performed on three different image sizes: small (640x426), medium (1280x853), and large (1920x1280).

Performance comparisons were made between ASM and C++ implementations using various threading configurations (1, 2, 4, 8, 16, 32, 64 threads).

For each configuration, execution time was measured over 5 runs, with the first run excluded as a warm-up.

Sample Performance Data (Small Image, Assembly)

Threads  Run 1  Run 2  Run 3  Run 4  Run 5  Avg Time (ms)  Standard Deviation
1        57     13     12     17     16     14.5           2.38
2        18     6      9      7      8      7.5            1.29
4        24     5      8      6      5      6              1.41

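The Avg Time and Standard Deviation columns are consistent with dropping Run 1 as the warm-up and using the sample standard deviation (n - 1 denominator) over the remaining runs. A minimal sketch of that calculation follows; RunStats and summarizeRuns are hypothetical names, not part of the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct RunStats {
    double mean;
    double stddev;
};

// Hypothetical helper: drops the first (warm-up) run, then computes the mean
// and the sample standard deviation (n - 1 denominator) of the remaining runs.
RunStats summarizeRuns(const std::vector<double>& timesMs) {
    assert(timesMs.size() >= 3);  // warm-up plus at least two measured runs
    const std::size_t n = timesMs.size() - 1;

    double sum = 0.0;
    for (std::size_t i = 1; i < timesMs.size(); ++i) {
        sum += timesMs[i];
    }
    const double mean = sum / static_cast<double>(n);

    double squares = 0.0;
    for (std::size_t i = 1; i < timesMs.size(); ++i) {
        const double diff = timesMs[i] - mean;
        squares += diff * diff;
    }
    return {mean, std::sqrt(squares / static_cast<double>(n - 1))};
}
```

Feeding it the five times from the 1-thread row (57, 13, 12, 17, 16) gives a mean of 14.5 ms and a standard deviation of about 2.38 ms, matching the table.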
Conclusion

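The project demonstrates a performance boost from SIMD assembly optimization and multi-threading. In the sample data for the small image, the assembly version's average time drops from 14.5 ms with 1 thread to 6 ms with 4 threads. The assembly implementation outperforms the C++ version, particularly with higher thread counts.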

License

This project is licensed under the MIT License.
