SHORT Single Precision MFLOP/s

EVENTSET
FIXC0 INST_RETIRED_ANY
FIXC1 ACTUAL_CPU_CLOCK
FIXC2 MAX_CPU_CLOCK
PMC0  RETIRED_SSE_AVX_FLOPS_SINGLE_FMA
PMC2  RETIRED_MMX_FP_INSTR_ALL
PMC3  RETIRED_SSE_AVX_FLOPS_SINGLE_ADD_MULT_DIV

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s]   FIXC1*inverseClock
Clock [MHz]  1.E-06*(FIXC1/FIXC2)/inverseClock
CPI   FIXC1/FIXC0
SP MFLOP/s (scalar assumed)   1.0E-06*(PMC3+(PMC2/2)+(PMC0*4))/time
SP MFLOP/s (SSE assumed)   1.0E-06*(PMC3+(PMC2*2)+(PMC0*4))/time
SP MFLOP/s (AVX assumed)   1.0E-06*(PMC3+(PMC2*4)+(PMC0*4))/time


LONG
Formulas:
CPI = INST_RETIRED_ANY/ACTUAL_CPU_CLOCK
SP MFLOP/s (scalar assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_SINGLE_ADD_MULT_DIV + (RETIRED_MMX_FP_INSTR_ALL/2)+(RETIRED_SSE_AVX_FLOPS_SINGLE_FMA*4))/time
SP MFLOP/s (SSE assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_SINGLE_ADD_MULT_DIV + (RETIRED_MMX_FP_INSTR_ALL*2)+(RETIRED_SSE_AVX_FLOPS_SINGLE_FMA*4))/time
SP MFLOP/s (AVX assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_SINGLE_ADD_MULT_DIV + (RETIRED_MMX_FP_INSTR_ALL*4)+(RETIRED_SSE_AVX_FLOPS_SINGLE_FMA*4))/time
-
Profiling group to measure single precisision FLOP rate. The Zen architecture
does not provide distinct events for SSE and AVX FLOPs. Moreover, scalar FP
instructions are counted as SSE instruction in RETIRED_MMX_FP_INSTR_ALL.
Therefore, you have to select the SP MFLOP/s metric based on the measured code.


