A vectorization report tells you whether the loops in your code were vectorized, and if not, explains why not.
Because vectorization is disabled at -O1, the compiler generates no vectorization report at that level. Recompile at -O2, the default optimization level, which applies when no -O option is given:
icc -std=c99 -DNOFUNCCALL -vec-report1 Multiply.c Driver.c -o MatVector
Record the new execution time. The reduction in time is mostly due to auto-vectorization of the inner loop at line 150 of Driver.c, which the vectorization report notes:
Driver.c(150) (col. 4): remark: LOOP WAS VECTORIZED.
Driver.c(164) (col. 2): remark: LOOP WAS VECTORIZED.
Driver.c(81) (col. 2): remark: LOOP WAS VECTORIZED.
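As a rough illustration of the kind of loop a LOOP WAS VECTORIZED remark typically refers to, the sketch below shows an innermost loop with unit-stride reads and a sum kept in a local variable. The function dot_product is hypothetical and is not taken from Driver.c, whose source is not shown here. Whether a particular compiler vectorizes it depends on the options in effect, so treat the report, not this sketch, as the authority.

```c
#include <assert.h>

/* Hypothetical example, not part of the tutorial sources.
   Returns the sum of a[i] * b[i] for i in [0, n). */
double dot_product(int n, const double *a, const double *b)
{
    assert(n >= 0);  /* sanity check on this hypothetical interface */

    double sum = 0.0;  /* local accumulator: no store through a pointer in the loop */
    for (int i = 0; i < n; i++) {
        /* Each iteration reads consecutive elements and updates only sum,
           so the loop is a simple reduction over contiguous data. */
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the loop writes only to a local variable, the compiler does not have to reason about whether a store could overlap the input arrays.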
The -vec-report2 option produces a report that also lists the loops that were not vectorized, along with the reason the compiler did not vectorize each one.
icc -std=c99 -DNOFUNCCALL -vec-report2 Multiply.c Driver.c -o MatVector
The vectorization report indicates that the loop at line 45 in Multiply.c was not vectorized because it is not the innermost loop of the loop nest. The compiler generated two versions of the innermost loop at line 55 (multiversioning) but vectorized neither: the report lists one as not vectorized because of a vector dependence and the other as skipped.
Multiply.c(45) (col. 2): remark: loop was not vectorized: not inner loop.
Multiply.c(55) (col. 3): remark: loop was not vectorized: existence of vector dependence.
Multiply.c(55) (col. 3): remark: loop skipped: multiversioned.
Driver.c(140) (col. 2): remark: loop was not vectorized: not inner loop.
Driver.c(140) (col. 2): remark: loop was not vectorized: vectorization possible but seems inefficient.
Driver.c(141) (col. 2): remark: loop was not vectorized: vectorization possible but seems inefficient.
Driver.c(145) (col. 2): remark: loop was not vectorized: not inner loop.
Driver.c(148) (col. 3): remark: loop was not vectorized: not inner loop.
Driver.c(150) (col. 4): remark: LOOP WAS VECTORIZED.
Driver.c(164) (col. 2): remark: LOOP WAS VECTORIZED.
Driver.c(81) (col. 2): remark: LOOP WAS VECTORIZED.
Driver.c(69) (col. 2): remark: loop was not vectorized: vectorization possible but seems inefficient.
Driver.c(54) (col. 2): remark: loop was not vectorized: not inner loop.
Driver.c(55) (col. 3): remark: loop was not vectorized: vectorization possible but seems inefficient.
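To make the "not inner loop" and "existence of vector dependence" remarks more concrete, here is a minimal sketch of a loop nest that could draw both. It is an assumption for illustration only: the functions matvec_may_alias and matvec_restrict, their parameters, and the row-major layout are hypothetical and do not reproduce the tutorial's Multiply.c. In the first version the compiler cannot prove that the output array b does not overlap a or x, so it may assume a dependence between iterations. The second version uses the C99 restrict qualifier and a local accumulator to state that no such overlap exists.

```c
#include <assert.h>

/* Hypothetical row-major matrix-vector product b = A * x.
   Not the tutorial's Multiply.c. */
void matvec_may_alias(int rows, int cols,
                      const double *a, const double *x, double *b)
{
    assert(rows >= 0 && cols >= 0);  /* sanity check on this hypothetical interface */

    for (int i = 0; i < rows; i++) {      /* outer loop of the nest */
        b[i] = 0.0;
        for (int j = 0; j < cols; j++) {  /* innermost loop */
            /* The store to b[i] might overlap a[] or x[] as far as the
               compiler can prove, so it may assume a vector dependence. */
            b[i] += a[i * cols + j] * x[j];
        }
    }
}

/* Same computation. restrict is the caller's promise that b, a and x
   do not overlap; the local sum keeps stores out of the inner loop. */
void matvec_restrict(int rows, int cols,
                     const double *restrict a, const double *restrict x,
                     double *restrict b)
{
    assert(rows >= 0 && cols >= 0);

    for (int i = 0; i < rows; i++) {
        double sum = 0.0;
        for (int j = 0; j < cols; j++) {
            sum += a[i * cols + j] * x[j];
        }
        b[i] = sum;
    }
}
```

Both functions compute the same result when the arrays do not overlap. Calling the restrict version with overlapping arrays is undefined behavior, so the qualifier is only appropriate when the caller can guarantee the promise holds. After a change like this, rerun the report to see whether the remark for the inner loop has changed.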
For more information on the -vec-report compiler option, see the Compiler Options section in the Compiler User and Reference Guide.
Copyright © 2010, Intel Corporation. All rights reserved.