-
Notifications
You must be signed in to change notification settings - Fork 3k
Mdepkg baselib remove ub #11853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Mdepkg baselib remove ub #11853
Conversation
e0d98fa to
d232a38
Compare
d232a38 to
e609c1b
Compare
#if defined (_MSC_EXTENSIONS)
#define MISALIGNED __unaligned
#else
#define MISALIGNED __attribute__((__aligned__(1)))
#endifIn my tests, it does not enable the compiler to automatically generate instructions that avoid triggering exceptions when it detects unaligned addresses. |
8fad00d to
38c7a48
Compare
Thanks for testing - turns out I should be using packed structs instead, I have updated the branch, could you please give it another try? Thanks |
|
See here for a demonstration of the resulting codegen on VS and GCC/Clang |
38c7a48 to
9300c78
Compare
wxjstz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
ISO 639-3 language codes are 3 letter abbreviations that identify natural languages. Comparing them for equality involves performing a byte by byte check, without taking case into account. So use CompareMem() rather than the peculiar ReadUnaligned24 () API, given that we are not dealing with 24-bit integers here. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
24-bit integers have no natural alignment, and have no representation in CPU hardware registers. This means that there is no way values of such types can appear misaligned in memory, and therefore there is no need for a special API that performs misaligned reads or writes on them. There are no longer any users so remove the API altogether. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The C standard qualifies misaligned pointers as undefined behavior. This means it is fundamentally impossible to define a conformant C API for unaligned access that is expressed in terms of types with a minimum alignment greater than 1 byte. Undefined behavior means that the compiler is free to generate code that assumes that only the defined behavior occurs. This means that a C implementation of ReadUnaligned64() might be emitted using a 64-bit load operation that does not tolerate misalignment, depending on the CPU architecture. So the only correct way to define such an API is in terms of types such as VOID* that have no implied alignment. Then, given that compilers today are perfectly capable of generating the right code, given accurate annotations, let's rely on those in the generic implementation. This will ensure that the correct access sequences are used, depending on the CPU architecture, and on the compiler flags (e.g., AArch64 uses -mstrict-align in XIP code as it may execute with the MMU disabled). Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Now that the generic code takes alignment constraints into account, there is no longer a need for a special AArch64 version that performs byte by byte access in C. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
9300c78 to
46c8208
Compare
leiflindholm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we could theoretically have out-of-tree users of the 24-bit primitives, but ... I'd be happy to see them gone.
Description
As Marvin reported in #10125 a while ago, the ReadUnaliged/WriteUnaligned APIs in BaseLib are defined in a way that relies on undefined behavior in C. Reading a quantity that appears misaligned in memory cannot be done via a typed pointer, as typed pointers can only point to quantities that appear aligned.
This is essentially why we have needed a special implementation for AArch64 (and previously ARM), and the same problem has come up with RISC-V as well (#11809).
Fix the API, as well as the generic implementation, to not rely on undefined behavior, and instead, communicate to the compiler in a portable manner that the untyped pointer argument points to quantity of a certain type that may appear misaligned in memory, by using
#pragma pack(1)and a union type encapsulating all the variants.On x86, the codegen does not change at all, which is expected given that the underlying hardware always tolerates misalignment.
On AArch64, the codegen depends on whether or not strict alignment is required, which is indicated by passing the
-mstrict-alignargument. This is needed, e.g., in SEC or PEI, where code execution may occur with the MMU and caches disabled, requiring strict alignment.On RISC-V, which never tolerates misalignment, the codegen is as expected on both GCC and Clang.