Laplacian Operators for Scientific Computing: A Comparative Analysis of CPU and GPU Implementations

Authors

Aswini J
Marinto Richee J
M. G. Dinesh
A. Gayathri

Keywords:

Image Processing, GPU Acceleration, Performance Benchmarking

Abstract

This paper presents a comprehensive benchmarking study of a 2D Laplacian filter implemented on both CPU and GPU architectures for image processing applications. The Laplacian filter is a fundamental tool in edge detection and feature extraction, and it plays a crucial role in various computer vision tasks.
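
As a rough illustration of the kind of filter the abstract refers to, the sketch below applies a 2D Laplacian on the CPU in plain Python. It is not the authors' implementation: the 4-neighbour kernel (centre weight -4, each direct neighbour weight 1), the function name `laplacian_2d`, and the choice to leave border pixels at zero are all assumptions made for this example, since the abstract does not specify them.

```python
def laplacian_2d(image):
    """Apply a 4-neighbour Laplacian to a 2D list of numbers.

    Assumed kernel (not taken from the paper):
        0  1  0
        1 -4  1
        0  1  0
    Border pixels are left at 0, which is also an assumption.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    # Visit interior pixels only, so every neighbour index is valid.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = (
                image[y - 1][x] + image[y + 1][x]
                + image[y][x - 1] + image[y][x + 1]
                - 4 * image[y][x]
            )
    return out


# Hypothetical input: a vertical step edge between columns 1 and 2.
step = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
response = laplacian_2d(step)
```

Each output pixel depends only on the input image, so every pixel can be computed independently. That independence is what makes this filter a natural candidate for the one-thread-per-pixel mapping commonly used on GPUs.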

Published

2025-05-29

How to Cite

J A, Richee J M, Dinesh MG, Gayathri A. Laplacian Operators for Scientific Computing: A Comparative Analysis of CPU and GPU Implementations. J Neonatal Surg [Internet]. 2025 May 29 [cited 2025 Oct 8];14(29S):75-84. Available from: https://www.jneonatalsurg.com/index.php/jns/article/view/6714

Issue

Vol. 14 No. 29S (2025): Journal of Neonatal Surgery

Section

Original Article

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

You are free to:

Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material for any purpose, even commercially.

Terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
