Introducing the /forceInterlockedFunctions switch for ARM64

In Visual Studio 2022 17.14, we are introducing the /forceInterlockedFunctions[-] switch, which generates and links with out-of-line atomics that select Armv8.1+ Large System Extension (LSE) atomic instructions based on CPU support.

This switch is on by default for Armv8.0 and off for Armv8.1+. Outlining (calling an out-of-line helper function instead of expanding the intrinsic inline) is necessary in Armv8.0 because this version’s interlocked intrinsics use exclusive instructions (LoadExcl/StoreExcl) that do not guarantee forward progress. This can cause performance issues due to intermittent livelocks. See the Arm Architecture Reference Manual for A-profile architecture, section “B2.17.5 Load-Exclusive and Store-Exclusive instruction usage restrictions”, for examples of when the LoadExcl/StoreExcl loop may not make forward progress.

Below is an example of code that was previously generated when using the _InterlockedAdd64 intrinsic. You can see the ldaxr and stlxr instructions being used.

Main.cpp:

#include <intrin.h>
#include <stdio.h>
#include <Windows.h>

void main() {
    volatile __int64 Addend = 5;
    __int64 Value = 1;
    _InterlockedAdd64(&Addend, Value);
}

Main.asm snippet:

; _InterlockedAdd64(&Addend, Value);
ldr x10,[sp]
add x9,sp,#8
|$LN3@main|
ldaxr x8,[x9]
add x8,x8,x10
stlxr wip0,x8,[x9]
cbnz wip0,|$LN3@main|
dmb ish
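
For readers less familiar with exclusive-access loops, the listing above has roughly the following shape in portable C++. This is only a sketch, not the code MSVC generates: add_with_retry_loop and RetryAddResult are hypothetical names, and std::atomic's compare_exchange_weak stands in for the ldaxr/stlxr pair because it, too, is permitted to fail and be retried.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch (not part of any MSVC header).
struct RetryAddResult {
    std::int64_t new_value;  // value stored by the successful attempt
    unsigned attempts;       // how many times the store was attempted
};

// Adds `value` to `target` with a retry loop shaped like the
// ldaxr / add / stlxr / cbnz sequence: read, compute, try to store, repeat.
inline RetryAddResult add_with_retry_loop(std::atomic<std::int64_t>& target,
                                          std::int64_t value) {
    std::int64_t observed = target.load(std::memory_order_relaxed);
    unsigned attempts = 1;
    // compare_exchange_weak is allowed to fail spuriously, and it also fails
    // when another thread modified `target`. On failure it refreshes
    // `observed`, so the next iteration recomputes from the current value.
    while (!target.compare_exchange_weak(observed, observed + value,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
        ++attempts;  // nothing bounds this count under heavy contention
    }
    return {observed + value, attempts};
}

// Usage: same values as Main.cpp (Addend = 5, Value = 1).
inline void demo_retry_loop() {
    std::atomic<std::int64_t> addend{5};
    RetryAddResult r = add_with_retry_loop(addend, 1);
    assert(r.new_value == 6 && addend.load() == 6);
    assert(r.attempts >= 1);
}
```

The attempts counter is there only to make the retry visible: with several threads contending for the same location, the number of iterations has no fixed upper bound, which is the forward-progress concern described earlier.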

With the /forceInterlockedFunctions option, the ldaxr and stlxr instructions are gone; they have been replaced with a bl _InterlockedAdd64 instruction that calls the out-of-line function.

Main.cpp:

#include <intrin.h>
#include <stdio.h>
#include <Windows.h>

void main() {
    volatile __int64 Addend = 5;
    __int64 Value = 1;
    _InterlockedAdd64(&Addend, Value);
}

Main.asm snippet:

; _InterlockedAdd64(&Addend, Value);
ldr x1,[sp,#0x10]
add x0,sp,#0x18
bl _InterlockedAdd64
nop
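
Conceptually, the function reached by that bl can pick its implementation according to what the CPU supports. The sketch below illustrates that idea only; it is not the MSVC runtime's implementation. OutlinedAdd64, AtomicPath, AddResult, and the cpu_has_lse flag are hypothetical, and both paths use std::atomic operations as stand-ins for the LSE and exclusive-loop instruction sequences.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Which implementation the helper selected (hypothetical, for illustration).
enum class AtomicPath { ExclusiveLoop, Lse };

struct AddResult {
    std::int64_t new_value;  // value after the addition
    AtomicPath path;         // which path performed it
};

// Hypothetical out-of-line helper: the path is chosen once, from a CPU
// feature flag supplied by the caller, and reused for every call.
class OutlinedAdd64 {
public:
    explicit OutlinedAdd64(bool cpu_has_lse)
        : path_(cpu_has_lse ? AtomicPath::Lse : AtomicPath::ExclusiveLoop) {}

    AddResult operator()(std::atomic<std::int64_t>& target,
                         std::int64_t value) const {
        if (path_ == AtomicPath::Lse) {
            // Stand-in for a single LSE read-modify-write instruction.
            return {target.fetch_add(value) + value, path_};
        }
        // Stand-in for the exclusive load/store retry loop.
        std::int64_t observed = target.load();
        while (!target.compare_exchange_weak(observed, observed + value)) {
            // `observed` was refreshed by the failed attempt; try again.
        }
        return {observed + value, path_};
    }

private:
    AtomicPath path_;
};

// Usage: both paths produce the same arithmetic result.
inline void demo_outlined_add() {
    std::atomic<std::int64_t> addend{5};
    OutlinedAdd64 with_lse(true);
    OutlinedAdd64 without_lse(false);
    assert(with_lse(addend, 1).new_value == 6);
    assert(without_lse(addend, 1).new_value == 7);
}
```

On Windows, one way a program could obtain such a flag is IsProcessorFeaturePresent(PF_ARM_V81_ATOMIC_INSTRUCTIONS_AVAILABLE); whether the out-of-line functions use that particular check is not stated in this post.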

The /forceInterlockedFunctions option applies only to Arm64 targets and is ignored for other targets. Additionally, enabling the LSE feature will override the default outlining behavior in Armv8.0.

Note that the option is on by default for all Arm64EC versions. We do not recommend turning the option off for Arm64EC, as outlining helps address the memory model differences between Arm64 and x64.

This flag impacts the following interlocked intrinsics:

Key:

Full: supports the plain, _acq, _rel, and _nf forms.
None: not supported.

Operation	8-bit	16-bit	32-bit	64-bit	128-bit	Pointer
`Add`	None	None	Full	Full	None	None
`And`	Full	Full	Full	Full	None	None
`CompareExchange`	Full	Full	Full	Full	Full	Full
`Decrement`	None	Full	Full	Full	None	None
`Exchange`	Full	Full	Full	Full	None	Full
`ExchangeAdd`	Full	Full	Full	Full	None	None
`Increment`	None	Full	Full	Full	None	None
`Or`	Full	Full	Full	Full	None	None
`Xor`	Full	Full	Full	Full	None	None
`bittestandset`	None	None	Full	Full	None	None
`bittestandreset`	None	None	Full	Full	None	None
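
The four forms in the key differ in memory-ordering strength: the plain form acts as a full barrier, _acq has acquire semantics, _rel has release semantics, and _nf means "no fence". As an approximate analogy only, they line up with std::memory_order values as sketched below. The Form enum, order_for, and exchange_add64 are hypothetical names for this illustration; with MSVC targeting Arm64 the real intrinsics carry the suffix in their name, for example _InterlockedExchangeAdd64_acq.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical tag for the four intrinsic forms listed in the key.
enum class Form { Plain, Acq, Rel, Nf };

// Approximate mapping from intrinsic suffix to a C++ memory order.
// This is an analogy for readers, not a statement of how MSVC lowers them.
constexpr std::memory_order order_for(Form form) {
    switch (form) {
        case Form::Acq: return std::memory_order_acquire;
        case Form::Rel: return std::memory_order_release;
        case Form::Nf:  return std::memory_order_relaxed;
        case Form::Plain: break;
    }
    return std::memory_order_seq_cst;  // plain form: strongest ordering
}

// Like the ExchangeAdd family, returns the value held BEFORE the addition.
inline std::int64_t exchange_add64(std::atomic<std::int64_t>& target,
                                   std::int64_t value, Form form) {
    return target.fetch_add(value, order_for(form));
}

// Usage: the ordering changes, the arithmetic does not.
inline void demo_forms() {
    std::atomic<std::int64_t> counter{5};
    assert(exchange_add64(counter, 1, Form::Plain) == 5);
    assert(exchange_add64(counter, 1, Form::Nf) == 6);
    assert(counter.load() == 7);
}
```

Note the return-value difference between rows of the table: ExchangeAdd returns the previous value, whereas Add (as in the _InterlockedAdd64 example earlier) returns the result of the addition.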

Feedback

That’s all about this new compiler option and default setting that you can find starting in Visual Studio 2022 version 17.14. Please give it a try and let us know how it goes! We always welcome feedback, questions, or concerns from the community, as it helps make Visual Studio better.

Please share your thoughts, comments and questions with us through Developer Community. You can also reach us on X @VisualC, or via email at visualcpp@microsoft.com.

The post Introducing the /forceInterlockedFunctions switch for ARM64 appeared first on C++ Team Blog.
