High-level Optimization (HLO) Report

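High-level Optimization (HLO) performs specific optimizations based on the usefulness and applicability of each optimization. The HLO report can provide information on all relevant areas plus structure splitting and loop-carried scalar replacement. It can also report loop interchanges that were not performed, together with the reason. The examples in this topic cover the following reasons:

  • A function call is inside the loop nest.

  • The loop nest is imperfect.

  • Data dependencies prohibit the interchange.

  • The loop order is already proper, and interchange offers only marginal improvement.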

For example, the report can provide clues as to why the compiler was unable to apply loop interchange to a loop nest that was considered a candidate for optimization. If the reported problems (bottlenecks) can be removed by changing the source code, the report suggests the possible loop interchanges.

To enable HLO and generate the report, specify the optimization and report options that apply to your operating system.

The following examples show the general command needed to create an HLO report with the combined options.

Example command, by operating system

Linux and Mac OS X:

ifort -c -xSSE3 -O3 -opt-report 3 -opt-report-phase=hlo sample.f90

Windows:

ifort /c /QxSSE3 /O3 /Qopt-report:3 /Qopt-report-phase:hlo sample.f90

You can use -opt-report-file (Linux and Mac OS X) or /Qopt-report-file (Windows) to specify an output file to capture the report results. Specifying a file to capture the results can help to reduce the time you spend analyzing the results and can provide a baseline for future testing.

Reading the report results

The report provides information using a specific format. The report format for Windows* is different from the format on Linux* and Mac OS* X. While there are some common elements in the report output, the best way to understand what kinds of advice the report can provide is to show example code and the corresponding report output.

Example 1: This example illustrates the condition where a function call is inside a loop.

Example 1

subroutine foo (A, B, bound)
   integer i,j,n,bound
   integer A(bound), B(bound,bound)
   n = bound
   do j = 1, n
      do i = 1, n
        B(j,i) = B(j,i) + A(j)
        call bar(A,B)
      end do
   end do
   return
end subroutine foo

Regardless of the operating system, the reports list optimization results on specific functions by presenting a line above their reported action. The line format and description are included below.

The following table summarizes the common report elements and provides a general description to help interpret the results.

Report Element

Description

String listing information about the function being reported on. The string uses the following format.

<source name>;<start line>:<end line>;<optimization>;<function name>;<element type>

For example, the reports listed below report the following information:

Linux and Mac OS X:

<sample1.f90;-1:-1;hlo;foo_;0>

Windows:

<sample1.f90;-1:-1;hlo;_FOO;0>

The compact string contains the following information:

  • <source name>: Name of the source file being examined.

  • <start line>: Indicates the starting line number for the function being examined. A value of -1 means that the report applies to the entire function.

  • <end line>: Indicates the ending line number for the function being examined.

  • <optimization>: Indicates the optimization phase; for this report the indicated phase should be hlo.

  • <function name>: Name of the function being examined.

  • <element type>: Indicates the type of the report element; 0 indicates the element is a comment.

Several report elements grouped together.

QLOOPS 2/2      ENODE LOOPS 2 
unknown 0 multi_exit_do 0 do 2 
linear_do 2
LINEAR HLO EXPRESSIONS:  17 / 18

Windows only: This section of the report lists the following information:

  • QLOOPS: Indicates the number of well-formed loops found out of the loops discovered.

  • ENODE LOOPS: Indicates the number of loops generated by HLO in their preferred (canonical) form.

  • unknown: Indicates the number of loops that could not be counted.

  • multi_exit_do: Indicates the countable loops containing multiple exits.

  • do: Indicates the total number of loops with trip counts that can be counted.

  • linear_do: Indicates the number of loops with bounds that can be represented in a linear form.

  • LINEAR HLO EXPRESSIONS: Indicates how many expressions (first number), out of all expressions in the intermediate (ENODE) form (second number), can be represented in a linear form.
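The loop categories above can be easier to tell apart with concrete loop shapes. The following is a minimal sketch, not compiler output: loop_shapes and its arguments are hypothetical names, and the comments say only which category each loop appears to match under the descriptions above. How a particular compiler version actually classifies a loop is not guaranteed.

```fortran
subroutine loop_shapes (x, nxt, n, total, first_neg, hops)
   integer n, total, first_neg, hops
   integer x(n), nxt(n)
   integer i, k

   ! Counted loop with linear bounds (1..n) and a single exit;
   ! this shape matches the "do" and "linear_do" descriptions.
   total = 0
   do i = 1, n
      total = total + x(i)
   end do

   ! Counted loop with a second exit through the EXIT statement;
   ! this shape matches the "multi_exit_do" description.
   first_neg = 0
   do i = 1, n
      if (x(i) < 0) then
         first_neg = i
         exit
      end if
   end do

   ! The trip count depends on the data in nxt, so it cannot be
   ! computed before the loop starts ("unknown"). Assumes n >= 1
   ! and that the chain starting at index 1 ends in 0.
   hops = 0
   k = 1
   do while (k /= 0)
      k = nxt(k)
      hops = hops + 1
   end do
   return
end subroutine loop_shapes
```

Compiling a file like this with the report options shown earlier is one way to compare the counts in the report against loops whose shape you already know.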

The code sample listed above results in a report output similar to the following.

Operating System

Example 1 Report Output

Linux and Mac OS X

<sample1.f90;-1:-1;hlo;foo_;0>
High Level Optimizer Report (foo_)
Block, Unroll, Jam Report:
(loop line numbers, unroll factors and type of transformation)
<sample1.f90;7:7;hlo_unroll;foo_;0>
Loop at line 7 unrolled with remainder by 2

Windows

<sample1.f90;-1:-1;hlo;_FOO;0>
High Level Optimizer Report (_FOO)
QLOOPS 2/2      ENODE LOOPS 2 unknown 0 multi_exit_do 0 do 2 linear_do 2
LINEAR HLO EXPRESSIONS:  17 / 18
------------------------------------------------------------------------------
<C:\samples\sample1.f90;6:6;hlo_linear_trans;_FOO;0>
Loop Interchange not done due to: User Function Inside Loop Nest
Advice: Loop Interchange, if possible, might help Loopnest at lines: 6 7
      : Suggested Permutation: (1 2 ) --> ( 2 1 )
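The suggested permutation (1 2) --> (2 1) swaps the outer and inner loops. The sketch below shows only that permutation, applied to the array update from Example 1 with the call to bar left out. Whether the call can really be moved or removed depends on what bar does, which this topic does not show. The names update_ji and update_ij are hypothetical.

```fortran
! Loop order as written in Example 1 (call to bar omitted):
! the inner loop varies the second subscript of B.
subroutine update_ji (A, B, bound)
   integer i,j,n,bound
   integer A(bound), B(bound,bound)
   n = bound
   do j = 1, n
      do i = 1, n
        B(j,i) = B(j,i) + A(j)
      end do
   end do
   return
end subroutine update_ji

! Permutation (1 2) --> (2 1): the inner loop now varies the
! first subscript of B, which is contiguous in memory because
! Fortran stores arrays in column-major order.
subroutine update_ij (A, B, bound)
   integer i,j,n,bound
   integer A(bound), B(bound,bound)
   n = bound
   do i = 1, n
      do j = 1, n
        B(j,i) = B(j,i) + A(j)
      end do
   end do
   return
end subroutine update_ij
```

Both versions compute the same result because each element of B is updated exactly once and no iteration reads a value written by another.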

Example 2: This example illustrates the condition where the loop nesting prohibits interchange.

Example 2

subroutine foo (A, B, bound)
   integer i,j,n,bound
   integer A(bound), B(bound,bound)
   n = bound
   do j = 1, n
      A(j) = j + B(1,j)
      do i = 1, n
        B(j,i) = B(j,i) + A(j)
      end do
   end do
   return
end subroutine foo

The code sample listed above results in a report output similar to the following.

Operating System

Example 2 Report Output

Linux and Mac OS X

<sample2.f90;-1:-1;hlo;foo_;0>
High Level Optimizer Report (foo_)
Block, Unroll, Jam Report:
(loop line numbers, unroll factors and type of transformation)
<sample2.f90;8:8;hlo_unroll;foo_;0>
Loop at line 8 unrolled with remainder by 2

Windows

<sample2.f90;-1:-1;hlo;_FOO;0>
High Level Optimizer Report (_FOO)
QLOOPS 2/2      ENODE LOOPS 2 unknown 0 multi_exit_do 0 do 2 linear_do 2
LINEAR HLO EXPRESSIONS:  24 / 24
------------------------------------------------------------------------------
<C:\samples\sample2.f90;6:6;hlo_linear_trans;_FOO;0>
Loop Interchange not done due to: Imperfect Loop Nest (Either at Source or due to other Compiler Transformations)
Advice: Loop Interchange, if possible, might help Loopnest at lines: 6 8
      : Suggested Permutation: (1 2 ) --> ( 2 1 )
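In Example 2 the statement that assigns A(j) sits between the two loops, which makes the nest imperfect. One possible source change, sketched below under the assumption that only the final contents of A and B matter, is to finish row 1 of B first, compute the remaining A(j), and then run a perfect nest over rows 2 through n in interchanged order. The name foo_perfect is hypothetical, and this is not necessarily the rewrite the compiler has in mind.

```fortran
subroutine foo_perfect (A, B, bound)
   integer i,j,n,bound
   integer A(bound), B(bound,bound)
   n = bound
   if (n < 1) return

   ! Every A(j) with j > 1 reads B(1,j), and row 1 of B is
   ! updated only in the j = 1 iteration, so finish row 1 first.
   A(1) = 1 + B(1,1)
   do i = 1, n
      B(1,i) = B(1,i) + A(1)
   end do

   ! Row 1 is final now, so the remaining A(j) can be computed
   ! outside the nest.
   do j = 2, n
      A(j) = j + B(1,j)
   end do

   ! Perfect nest over rows 2..n, with the inner loop on the
   ! first subscript of B.
   do i = 1, n
      do j = 2, n
         B(j,i) = B(j,i) + A(j)
      end do
   end do
   return
end subroutine foo_perfect
```

The restructuring trades one imperfect nest for three simpler loops plus a perfect nest, so it is worth measuring before adopting it.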

Example 3: This example illustrates the condition where data dependence prohibits loop interchange.

Example 3

subroutine foo (bound)
   integer i,j,n,bound
   integer A(100,100), B(100,100), C(100,100)
   equivalence (B(2),A)
   n = bound
   do j = 1, n
      do i = 1, n
        A(j,i) = C(j,i) * 2
        B(j,i) = B(j,i) + A(j,i) * C(j,i)
      end do
   end do
   return
end subroutine foo

The code sample listed above results in a report output similar to the following.

Operating System

Example 3 Report Output

Linux and Mac OS X

<sample3.f90;-1:-1;hlo;foo_;0>
High Level Optimizer Report (foo_)
<sample3.f90;8:8;hlo_scalar_replacement;in foo_;0>
#of Array Refs Scalar Replaced in foo_ at line 8=2
Block, Unroll, Jam Report:
(loop line numbers, unroll factors and type of transformation)
<sample3.f90;8:8;hlo_unroll;foo_;0>
Loop at line 8 unrolled with remainder by 2

Windows

<sample3.f90;-1:-1;hlo;_FOO;0>
High Level Optimizer Report (_FOO)
QLOOPS 2/2      ENODE LOOPS 2 unknown 0 multi_exit_do 0 do 2 linear_do 2
LINEAR HLO EXPRESSIONS:  24 / 24
------------------------------------------------------------------------------
<C:\samples\sample3.f90;8:8;hlo_scalar_replacement;in _FOO;0>
#of Array Refs Scalar Replaced in _FOO at line 8=1
<C:\samples\3.f90;7:7;hlo_linear_trans;_FOO;0>
Loop Interchange not done due to: Data Dependencies
  Dependencies found between following statements:
    [From_Line# -> (Dependency Type) To_Line#]
    [9 ->(Flow) 10] [9 ->(Output) 10] [10 ->(Anti) 10]
    [10 ->(Anti) 9] [10 ->(Output) 9]
Advice: Loop Interchange, if possible, might help Loopnest at lines: 7 8
      : Suggested Permutation: (1 2 ) --> ( 2 1 )
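The dependencies in Example 3 come from the EQUIVALENCE statement, which makes A and B share storage. The sketch below isolates that effect. It assumes that B(2) in Example 3 denotes the second element of B in storage order, written here as B(2,1); overlap_demo and its arguments are hypothetical names.

```fortran
subroutine overlap_demo (seen_before, seen_after)
   integer seen_before, seen_after
   integer A(100,100), B(100,100)
   ! Same storage association as in Example 3: the first element
   ! of A occupies the same storage as the second element of B.
   equivalence (B(2,1), A(1,1))

   B = 0
   seen_before = B(2,1)   ! 0, set by the array assignment above
   A(1,1) = 42            ! no explicit reference to B here
   seen_after = B(2,1)    ! 42, because A(1,1) and B(2,1) overlap
   return
end subroutine overlap_demo
```

Because a store to A(j,i) can change an element of B, the compiler has to keep the two statements in the loop body in their original relative order across iterations, which is consistent with the Flow, Anti, and Output dependencies listed in the report.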

Example 4: This example illustrates the condition where the loop order is determined to be proper, but loop interchange offers only marginal relative improvement.

Example 4

subroutine foo (A, B, bound, value)
   integer i,j,n,bound,value
   integer A(bound, bound), B(bound,bound)
   n = bound
   do j = 1, n
      do i = 1, n
        A(i,j) = A(i,j) + B(j,i)
      end do
   end do
   value = A(1,1)
   return
end subroutine foo

The code sample listed above results in a report output similar to the following.

Operating System

Example 4 Report Output

Linux and Mac OS X

<sample4.f90;-1:-1;hlo;foo_;0>
High Level Optimizer Report (foo_)
Block, Unroll, Jam Report:
(loop line numbers, unroll factors and type of transformation)
<sample4.f90;7:7;hlo_unroll;foo_;0>
Loop at line 7 unrolled with remainder by 2

Windows

<sample4.f90;-1:-1;hlo;_FOO;0>
High Level Optimizer Report (_FOO)
QLOOPS 2/2      ENODE LOOPS 2 unknown 0 multi_exit_do 0 do 2 linear_do 2
LINEAR HLO EXPRESSIONS:  18 / 18

Example 5: This example illustrates the condition where the loop nesting is imperfect and the loop order is good, but loop interchange offers only marginal relative improvements.

Example 5

subroutine foo (A, B, C, bound, value)
   integer i,j,n,bound,value
   integer A(bound, bound), B(bound,bound), C(bound, bound)
   n = bound
   do j = 1, n
      value = value + A(1,1)
      do i = 1, n
        value = B(i,j) + C(j,i)
      end do
   end do
   return
end subroutine foo

The code sample listed above results in a report output similar to the following.

Operating System

Example 5 Report Output

Linux and Mac OS X

<sample5.f90;-1:-1;hlo;foo_;0>
High Level Optimizer Report (foo_)
Loopnest Preprocessing Report:
<sample5.f90;7:8;hlo;foo_;0>
Preprocess Loopnests <foo_>: Moving Out Store @Line<8> in Loop @Line<7>

Windows

<sample5.f90;-1:-1;hlo;_FOO;0>
High Level Optimizer Report (_FOO)
QLOOPS 2/2      ENODE LOOPS 2 unknown 0 multi_exit_do 0 do 2 linear_do 2
LINEAR HLO EXPRESSIONS:  20 / 25
------------------------------------------------------------------------------
Loopnest Preprocessing Report:
<C:\samples\sample5.f90;7:8;hlo;_FOO;0>
Preprocess Loopnests <_FOO>: Moving Out Store @Line<8> in Loop @Line<7>
<C:\samples\sample5.f90;5:5;hlo_linear_trans;_FOO;0>
Loop Interchange not done due to: Imperfect Loop Nest (Either at Source or due to other Compiler Transformations)
Advice: Loop Interchange, if possible, might help Loopnest at lines: 5 7
      : Suggested Permutation: (1 2 ) --> ( 2 1 )

Changing Code Based on the Report Results

The HLO report tells you which loop transformations the compiler performed and provides some advice. When a given loop transformation is missing from the report, this might indicate a transformation that the compiler could attempt once the reported obstacle is removed. Use the advice in the report to decide which transformations you might want to apply. (Manual optimization techniques, such as manual cache blocking, should be avoided or used only as a last resort.)



Copyright © 1996-2010, Intel Corporation. All rights reserved.