Commit cf36f49

[libclc] Enable clang fp reciprocal in clc_native_divide/recip/rsqrt/tan (#149269)
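The pragma adds the `arcp` (allow-reciprocal) fast-math flag to the `fdiv` instructions in these functions. The flag lets the optimizer replace a division with a multiplication by the reciprocal, which can improve performance.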
1 parent 0b6df54 commit cf36f49
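
To illustrate what the pragma does outside of libclc, here is a minimal standalone C sketch. The names `FP_RECIPROCAL_ON`, `native_divide_sketch`, `native_recip_sketch`, and `check_native_sketches` are hypothetical stand-ins, not libclc symbols. The version guard is an assumption: `#pragma clang fp reciprocal` is only accepted by fairly recent Clang releases, so the macro expands to nothing elsewhere.

```c
#include <assert.h>

/* Assumption: only recent Clang releases accept `clang fp reciprocal`.
 * The version bound below is a conservative guess; other compilers get
 * an empty macro and compile the functions as plain divisions. */
#if defined(__clang__) && (__clang_major__ >= 18)
#define FP_RECIPROCAL_ON _Pragma("clang fp reciprocal(on)")
#else
#define FP_RECIPROCAL_ON
#endif

/* Hypothetical stand-in for __clc_native_divide. The pragma must be the
 * first thing in the compound statement; it applies to the divisions in
 * this function body only. */
static float native_divide_sketch(float x, float y) {
  FP_RECIPROCAL_ON
  return x / y;
}

/* Hypothetical stand-in for __clc_native_recip. */
static float native_recip_sketch(float val) {
  FP_RECIPROCAL_ON
  return 1.0f / val;
}

/* Inputs chosen so that both x / y and x * (1 / y) are exact in binary
 * floating point, so the checks hold whether or not the optimizer uses
 * the reciprocal form. */
static void check_native_sketches(void) {
  assert(native_divide_sketch(8.0f, 2.0f) == 4.0f);
  assert(native_divide_sketch(6.0f, 4.0f) == 1.5f);
  assert(native_recip_sketch(4.0f) == 0.25f);
}
```

Calling `check_native_sketches()` from a `main` function is enough to exercise the sketch. With a Clang that supports the pragma, compiling with `-S -emit-llvm` should show the `arcp` flag on the `fdiv` instructions of the two functions.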

4 files changed: 4 additions, 0 deletions

libclc/clc/lib/generic/math/clc_native_divide.inc

Lines changed: 1 addition & 0 deletions
@@ -8,5 +8,6 @@

 _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_native_divide(__CLC_GENTYPE x,
                                                          __CLC_GENTYPE y) {
+  _Pragma("clang fp reciprocal(on)");
   return x / y;
 }

libclc/clc/lib/generic/math/clc_native_recip.inc

Lines changed: 1 addition & 0 deletions
@@ -7,5 +7,6 @@
 //===----------------------------------------------------------------------===//

 _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_native_recip(__CLC_GENTYPE val) {
+  _Pragma("clang fp reciprocal(on)");
   return 1.0f / val;
 }

libclc/clc/lib/generic/math/clc_native_rsqrt.inc

Lines changed: 1 addition & 0 deletions
@@ -7,5 +7,6 @@
 //===----------------------------------------------------------------------===//

 _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_native_rsqrt(__CLC_GENTYPE val) {
+  _Pragma("clang fp reciprocal(on)");
   return 1.0f / __clc_native_sqrt(val);
 }

libclc/clc/lib/generic/math/clc_native_tan.inc

Lines changed: 1 addition & 0 deletions
@@ -7,5 +7,6 @@
 //===----------------------------------------------------------------------===//

 _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_native_tan(__CLC_GENTYPE val) {
+  _Pragma("clang fp reciprocal(on)");
   return __clc_native_sin(val) / __clc_native_cos(val);
 }
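
The performance gain from `arcp` comes with a possible change in rounding, since `x / y` may be evaluated as `x * (1 / y)`. The following C sketch compares the two forms by hand; the function names are hypothetical, and the two-epsilon bound in the check is an estimate for normal-range `float` inputs under round-to-nearest, not a guarantee made by the commit.

```c
#include <assert.h>
#include <float.h>

/* Ordinary division, rounded once. */
static float div_direct(float x, float y) { return x / y; }

/* The form an arcp-enabled optimizer is permitted to use instead:
 * one rounding for the reciprocal, one more for the multiply. */
static float div_via_recip(float x, float y) {
  float r = 1.0f / y;
  return x * r;
}

/* Relative difference between the two forms. Assumes x and y are
 * nonzero and the results stay in the normal float range. */
static float rel_diff(float x, float y) {
  float q = div_direct(x, y);
  float m = div_via_recip(x, y);
  float d = q - m;
  if (d < 0.0f) d = -d;
  if (q < 0.0f) q = -q;
  return d / q;
}

/* The two forms agree exactly for power-of-two divisors and otherwise
 * differ by at most a few rounding errors. */
static void check_rel_diff(void) {
  assert(rel_diff(8.0f, 2.0f) == 0.0f);
  assert(rel_diff(1.0f, 3.0f) <= 2.0f * FLT_EPSILON);
  assert(rel_diff(10.0f, 7.0f) <= 2.0f * FLT_EPSILON);
}
```

A difference of this size is the kind of accuracy trade-off that the `native_` variants are meant to tolerate, which is why the flag is a natural fit for these four functions.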
