ABSTRACT
The goal is to compare GPU usage and performance through a computationally expensive exercise, matrix multiplication, programmed either with CUDA or with OpenMP and OpenACC directives. These tools were first tried on a simple example, the computation of the number pi, in order to become familiar with them. The matrix multiplication codes were then written and progressively optimized, and their execution times were measured. Finally, the results were compared and analyzed, considering not only the performance differences between the approaches but also their development difficulty.