The code below comes from a project I am working on. It multiplies two square matrices. However, the version I parallelized with OpenMP (threads) performed better than the version that uses the SIMD directive from the same API.
What am I doing wrong? Is it the syntax?
Some information that may help identify the problem: I am using the Intel compiler through the Visual Studio IDE. Visual Studio's own OpenMP is version 2.0 (which does not support SIMD), but I believe the compiler I am using ships with version 4.0. Parallel processing is new to me, so any clarification would be much appreciated. Here is the code:
#include "stdafx.h"
#include <iostream>
#include <time.h>
#include <cstdlib> // for system()
#include <omp.h>
using namespace std;
int lin = 800, col = 800; // Number of rows and columns
int main()
{
// --------------------------------------
// Create matrix 1
int** m1 = new int*[lin];
for (int i = 0; i < lin; ++i)
m1[i] = new int[col];
// --------------------------------------
// --------------------------------------
// Create matrix 2
int** m2 = new int*[lin];
for (int i = 0; i < lin; ++i)
m2[i] = new int[col];
// --------------------------------------
// --------------------------------------
// Create the result matrix
int** res = new int*[lin];
for (int i = 0; i < lin; ++i)
res[i] = new int[col];
// --------------------------------------
cout << "criou matrizes" << endl;
// FILL m1 and m2
// ----------------------------------------------------------------------------
// PARALLEL BLOCK
#pragma omp simd collapse (2)
for (int i = 0; i < lin; ++i) {
for (int j = 0; j < lin; ++j) {
m1[i][j] = (i + 1);
}
}
// END OF PARALLEL BLOCK
// PARALLEL BLOCK
#pragma omp simd collapse (2)
for (int i = 0; i < lin; ++i) {
for (int j = 0; j < lin; ++j) {
m2[i][j] = (i + 1);
}
}
// END OF PARALLEL BLOCK
cout << "preencheu" << endl;
// ----------------------------------------------------------------------------
// make the magic happen
clock_t timer = clock(); // start timestamp
// ----------------------------------------------------------------------------
cout << "iniciou" << endl;
#pragma omp simd collapse (2)
for (int i = 0; i < lin; i++)
{
for (int j = 0; j < lin; j++)
{
res[i][j] = 0;
for (int k = 0; k < lin; k++)
res[i][j] += m1[i][k] * m2[k][j];
}
}
cout << "finalizou" << endl;
// ----------------------------------------------------------------------------
// record the end time and print the elapsed time
timer = clock() - timer;
cout << "Programa Finalizado em " << ((float)timer) / CLOCKS_PER_SEC << " Segundos" << endl;
system("Pause");
}
// This code is contributed
// by Soumik Mondal
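Since the question is unsure whether the build uses OpenMP 2.0 or 4.0, one way to check is the standard `_OPENMP` macro, which expands to the release date (yyyymm) of the specification the compiler implements. The sketch below is only an illustration: the helper names `openmp_version_from_macro` and `detected_openmp_version` are hypothetical, and it assumes OpenMP is enabled in the project settings (otherwise `_OPENMP` is not defined at all).

```cpp
#include <string>

// Hypothetical helper: maps an _OPENMP date value (yyyymm) to the
// specification version that date denotes.
std::string openmp_version_from_macro(long date) {
    if (date >= 201307) return "4.0 or newer";  // simd construct is available
    if (date >= 201107) return "3.1";
    if (date >= 200805) return "3.0";
    if (date >= 200505) return "2.5";
    if (date >= 200203) return "2.0";           // no simd construct
    if (date > 0)       return "older than 2.0";
    return "none";                              // OpenMP not enabled
}

// Hypothetical helper: reports the version seen by the current compiler.
std::string detected_openmp_version() {
#ifdef _OPENMP
    return openmp_version_from_macro(_OPENMP);
#else
    return openmp_version_from_macro(0);
#endif
}
```

Printing `detected_openmp_version()` at the start of `main` would show whether `#pragma omp simd` can be honored at all by the compiler that actually builds the program.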
Why should the results with OpenMP be worse than the results with SIMD?
– Isac
Because, in addition to parallelizing the work, SIMD performs multiple vector calculations simultaneously. So SIMD should give a better result than plain parallelism.
– Guilherme Melo
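One point relevant to the exchange above: `#pragma omp simd` on its own does not start any threads. It only asks the compiler to vectorize the loop inside the thread that is already running, while `#pragma omp parallel for` is what distributes iterations across threads. The two can be combined. The following is a minimal sketch of that combination, not the code from the question: the function name `multiply` is hypothetical, the matrices are assumed to be stored row-major in flat `std::vector<int>` buffers, and the loops are reordered to i-k-j so that the innermost loop walks contiguous memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: multiplies two n x n matrices stored row-major
// in flat vectors and returns the product in the same layout.
std::vector<int> multiply(const std::vector<int>& a,
                          const std::vector<int>& b, int n) {
    std::vector<int> c(static_cast<std::size_t>(n) * n, 0);

    // Threads: each thread works on its own share of the rows of c.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < n; ++k) {
            const int aik = a[i * n + k];
            // Vector lanes: inside one thread, consecutive values of j
            // are contiguous in memory and can be processed together.
            #pragma omp simd
            for (int j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

If the compiler ignores either pragma, the function still computes the same product serially, so any speedup has to be confirmed by timing. Note also that `clock()` may report CPU time summed over all threads on some platforms, so `omp_get_wtime()` is usually a safer way to time a threaded region.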