|
|
Created:
10 years, 1 month ago by fdegans Modified:
9 years, 10 months ago Base URL:
https://chromium.googlesource.com/external/google-breakpad/src.git@master Visibility:
Public. |
DescriptionFix breakpad on mips and x86_64 for the NDK r10c update. This change introduces the necessary glue typedefs to deal with the mismatch introduced by the latest Android NDK (w.r.t. desktop Linux): - [x86_64] Rename fpregs mxcr_mask -> .mxcsr_mask - [mips] uc_mcontext.fpregs.fp_r.fp_dregs -> uc_mcontext.fpreg - [mips] restore the forked user.h TBR=mark Patch Set 1 #
Total comments: 5
Patch Set 2 : Review comment #
Total comments: 7
Patch Set 3 : Fix indent #
MessagesTotal messages: 17
PTAL I do not include the sys/user.h from the NDK for MIPS, but I do on other platforms. Had to do some specific code paths for x86_64 in some cases since some structures are defined differently in glibc and the NDK.
Sign in to reply to this message.
Looks good overall, can you just take a look to the few notes below? https://breakpad.appspot.com/9714003/diff/1/client/linux/dump_writer_common/t... File client/linux/dump_writer_common/thread_info.cc (right): https://breakpad.appspot.com/9714003/diff/1/client/linux/dump_writer_common/t... client/linux/dump_writer_common/thread_info.cc:184: out->flt_save.mx_csr_mask = fpregs.mxcr_mask; Cit: "Drop the 's', just mxcr, it's more sexy" :( (Just to be clear, this is right, I'm not suggesting to change it. Just sad to realize that we need this) https://breakpad.appspot.com/9714003/diff/1/common/android/include/sys/user.h File common/android/include/sys/user.h (right): https://breakpad.appspot.com/9714003/diff/1/common/android/include/sys/user.h... common/android/include/sys/user.h:33: // The definition for these is completely different in the Android NDK. Can we reword a bit this header and make clear the quirks being done in this file? something like: // The purpose of this file is to glue the mismatching headers (Android NDK vs // glibc) and therefore avoid doing otherwise awkward #ifdefs in the code. // The following quirks are currently handled by this file: // - MIPS: Keep using forked definitions of user.h structs. The definition in // the NDK has completely different names. // - i386: Use the Android NDK but alias user_fxsr_struct > user_fpxregs_struct. // - Other platforms: Just use the Android NDK unchanged. https://breakpad.appspot.com/9714003/diff/1/common/android/ucontext_constants.h File common/android/ucontext_constants.h (right): https://breakpad.appspot.com/9714003/diff/1/common/android/ucontext_constants... common/android/ucontext_constants.h:129: #if defined(__android__) Is it intentional the fact that you were using __ANDROID__ in the other file and __android__ here? Also, can you clarify a bit this? Under which circumstance you expect __android__ not being defined in a file which path is common/**android**/...
Sign in to reply to this message.
Thanks for the review. I think patch set 2 covers everything. https://breakpad.appspot.com/9714003/diff/1/common/android/include/sys/user.h File common/android/include/sys/user.h (right): https://breakpad.appspot.com/9714003/diff/1/common/android/include/sys/user.h... common/android/include/sys/user.h:33: // The definition for these is completely different in the Android NDK. On 2014/10/23 10:15:31, primiano wrote: > Can we reword a bit this header and make clear the quirks being done in this > file? something like: > > // The purpose of this file is to glue the mismatching headers (Android NDK vs > // glibc) and therefore avoid doing otherwise awkward #ifdefs in the code. > // The following quirks are currently handled by this file: > // - MIPS: Keep using forked definitions of user.h structs. The definition in > // the NDK has completely different names. > // - i386: Use the Android NDK but alias user_fxsr_struct > user_fpxregs_struct. > // - Other platforms: Just use the Android NDK unchanged. Done. Adding the internal bug reference for MIPS. https://breakpad.appspot.com/9714003/diff/1/common/android/ucontext_constants.h File common/android/ucontext_constants.h (right): https://breakpad.appspot.com/9714003/diff/1/common/android/ucontext_constants... common/android/ucontext_constants.h:129: #if defined(__android__) On 2014/10/23 10:15:31, primiano wrote: > Is it intentional the fact that you were using __ANDROID__ in the other file and > __android__ here? > Also, can you clarify a bit this? > Under which circumstance you expect __android__ not being defined in a file > which path is common/**android**/... You're right, I missed the file path. I thought the header was shared between Linux and Android.
Sign in to reply to this message.
https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... File client/linux/dump_writer_common/thread_info.cc (right): https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... client/linux/dump_writer_common/thread_info.cc:183: out->flt_save.mx_csr_mask = fpregs.mxcsr_mask; You could do a hacky: #define mxcsr_mask mxcr_mask #include_next <user.h> #undef mxcsr_mask in the android user.h instead? Not sure if this is better, but it at least keeps the hacks in android specific headers. https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... File common/android/ucontext_constants.h (right): https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... common/android/ucontext_constants.h:129: #define MCONTEXT_FPREGS_MEM 304 I'm a bit worried about this since it means we were either previously writing the stuct incorrectly, or are going to do so in the future - I'm adding Anton who wrote the x86_64 code to check if it is OK.
Sign in to reply to this message.
Actually adding Anton - Anton could you check the x86_64 change please?
Sign in to reply to this message.
https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... File common/android/include/sys/user.h (right): https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... common/android/include/sys/user.h:104: struct user_fpregs_struct { incorrect indentation https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... File common/android/ucontext_constants.h (right): https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... common/android/ucontext_constants.h:129: #define MCONTEXT_FPREGS_MEM 304 On 2014/10/23 14:36:53, rmcilroy wrote: > I'm a bit worried about this since it means we were either previously writing > the stuct incorrectly, or are going to do so in the future - I'm adding Anton > who wrote the x86_64 code to check if it is OK. MCONTEXT_FPREGS_MEM is checked by a compile assert in breakpad_getcontext_unittest.cc, please check that you have built this. If that is passing, I would still like to understand what changed, as I am fairly confident that the original value was correct at time of writing.
Sign in to reply to this message.
https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... File client/linux/dump_writer_common/thread_info.cc (right): https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... client/linux/dump_writer_common/thread_info.cc:183: out->flt_save.mx_csr_mask = fpregs.mxcsr_mask; On 2014/10/23 14:36:53, rmcilroy wrote: > You could do a hacky: > > #define mxcsr_mask mxcr_mask > #include_next <user.h> > #undef mxcsr_mask > > in the android user.h instead? Not sure if this is better, but it at least > keeps the hacks in android specific headers. I feel strongly against using define because of side effects. With this, the build will break if the structure is changed in the NDK so we know when we can remove it. However, with the define approach, the build will silently succeed and we may forget all about our hack which may end up having nasty side effects, even if it is very unlikely. https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... File common/android/include/sys/user.h (right): https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... common/android/include/sys/user.h:104: struct user_fpregs_struct { On 2014/10/23 15:39:12, anton wrote: > incorrect indentation Done. https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... File common/android/ucontext_constants.h (right): https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... common/android/ucontext_constants.h:129: #define MCONTEXT_FPREGS_MEM 304 On 2014/10/23 15:39:12, anton wrote: > On 2014/10/23 14:36:53, rmcilroy wrote: > > I'm a bit worried about this since it means we were either previously writing > > the stuct incorrectly, or are going to do so in the future - I'm adding Anton > > who wrote the x86_64 code to check if it is OK. > > MCONTEXT_FPREGS_MEM is checked by a compile assert in > breakpad_getcontext_unittest.cc, please check that you have built this. If that > is passing, I would still like to understand what changed, as I am fairly > confident that the original value was correct at time of writing. > > The definition of the struct is different in the NDK. There is some padding defined in the glibc version but not in the NDK so I had to change this to the new "proper" value.
Sign in to reply to this message.
On 2014/10/23 15:43:54, fdegans wrote: > https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... > File client/linux/dump_writer_common/thread_info.cc (right): > > https://breakpad.appspot.com/9714003/diff/50001/client/linux/dump_writer_comm... > client/linux/dump_writer_common/thread_info.cc:183: out->flt_save.mx_csr_mask = > fpregs.mxcsr_mask; > On 2014/10/23 14:36:53, rmcilroy wrote: > > You could do a hacky: > > > > #define mxcsr_mask mxcr_mask > > #include_next <user.h> > > #undef mxcsr_mask > > > > in the android user.h instead? Not sure if this is better, but it at least > > keeps the hacks in android specific headers. > > I feel strongly against using define because of side effects. > With this, the build will break if the structure is changed in the NDK so we > know when we can remove it. > However, with the define approach, the build will silently succeed and we may > forget all about our hack which may end up having nasty side effects, even if it > is very unlikely. > > https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... > File common/android/include/sys/user.h (right): > > https://breakpad.appspot.com/9714003/diff/50001/common/android/include/sys/us... > common/android/include/sys/user.h:104: struct user_fpregs_struct { > On 2014/10/23 15:39:12, anton wrote: > > incorrect indentation > > Done. > > https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... > File common/android/ucontext_constants.h (right): > > https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... > common/android/ucontext_constants.h:129: #define MCONTEXT_FPREGS_MEM 304 > On 2014/10/23 15:39:12, anton wrote: > > On 2014/10/23 14:36:53, rmcilroy wrote: > > > I'm a bit worried about this since it means we were either previously > writing > > > the stuct incorrectly, or are going to do so in the future - I'm adding > Anton > > > who wrote the x86_64 code to check if it is OK. > > > > MCONTEXT_FPREGS_MEM is checked by a compile assert in > > breakpad_getcontext_unittest.cc, please check that you have built this. If > that > > is passing, I would still like to understand what changed, as I am fairly > > confident that the original value was correct at time of writing. > > > > > > The definition of the struct is different in the NDK. There is some padding > defined in the glibc version but not in the NDK so I had to change this to the > new "proper" value. I think they have to match otherwise user code compiled with the NDK will be incorrect, even if the implementation of getcontext.S is correct. If the glibc struct is larger than the NDK struct then there can be no way to make correct code by compiling with the NDK (it will write into unallocated memory). However, your change makes this value smaller, which suggest maybe the difference is in the other direction, which is potentially survivable if the padding is at the end of the struct. Is breakpad_getcontext_unittest working?
Sign in to reply to this message.
On 2014/10/23 15:57:42, anton wrote: > > > > > https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... > > File common/android/ucontext_constants.h (right): > > > > > https://breakpad.appspot.com/9714003/diff/50001/common/android/ucontext_const... > > common/android/ucontext_constants.h:129: #define MCONTEXT_FPREGS_MEM 304 > > On 2014/10/23 15:39:12, anton wrote: > > > On 2014/10/23 14:36:53, rmcilroy wrote: > > > > I'm a bit worried about this since it means we were either previously > > writing > > > > the stuct incorrectly, or are going to do so in the future - I'm adding > > Anton > > > > who wrote the x86_64 code to check if it is OK. > > > > > > MCONTEXT_FPREGS_MEM is checked by a compile assert in > > > breakpad_getcontext_unittest.cc, please check that you have built this. If > > that > > > is passing, I would still like to understand what changed, as I am fairly > > > confident that the original value was correct at time of writing. > > > > > > > > > > The definition of the struct is different in the NDK. There is some padding > > defined in the glibc version but not in the NDK so I had to change this to the > > new "proper" value. > > I think they have to match otherwise user code compiled with the NDK will be > incorrect, even if the implementation of getcontext.S is correct. If the glibc > struct is larger than the NDK struct then there can be no way to make correct > code by compiling with the NDK (it will write into unallocated memory). However, > your change makes this value smaller, which suggest maybe the difference is in > the other direction, which is potentially survivable if the padding is at the > end of the struct. Actually, I just checked the ucontext struct on my workstation (/usr/include/sys/ucontext.h), and the definition is identical to the one now in the NDK, with no padding. For reference, this is how ucontext was defined in the previous file in breakpad: typedef struct ucontext { [other stuff] uint64_t __padding[18]; _libc_fpstate __fpregs_mem; } ucontext_t; The padding part is not present in the glibc or NDK version of ucontext.h. I am not sure why there was padding in the breakpad version of the file. Maybe it was taken from an older version of the glibc? > > Is breakpad_getcontext_unittest working? Yes.
Sign in to reply to this message.
The short version is that whatever value we put in ucontext_constants.h it's going to be wrong. As enh@ suggested time ago in the offline Bionic thread, the only way to reliably get the fpstate is dereferencing the mcontext.fpstate pointer. And looking at the actual kernel sources I am more and more convinced about that. Also, http://lwn.net/Articles/281921/ (see below) seems to confirm this theory. Longer version: The position of the actual fpstate buffer is not static. It is somewhere after the struct rt_sigframe (which contains the ucontext that is coped to userspace). But the rt_sigframe is based on the stack, and is dynamically expanded depending by a large set of both compile-time and run-time conditions: - From [1] (line 235) the position of the fpstate member is determined by alloc_mathframe() - In turn alloc_mathframe ([2] line 605) allocates the FP frame depending on the xstate_sigframe_size() unsigned long frame_size = xstate_sigframe_size(); *buf_fx = sp = round_down(sp - frame_size, 64); - In turn xstate_sigframe_size() behavior depends whether the cpu supports XSAVE and the kernel configuration (HAVE_HWFP) - The value is than further aligned on a 64 boundary (and the starting value depends on the stack pointer, which is unpredictable) So, as far as I am concerned, 448 here seems to me as arbitrary as 304. I have the strong suspicious that we are dumping garbage in the FP state and we just happen to not care (or is there any test checking this?). I suspect that the GLibc (for Linux desktop) implementation either: - Is wrong (refers to the pre-xsave era) and nobody bothered updating it. - Does a user-space copy from the kernel-passed pointer to the user-space static buffer __fpregs_mem). Anton: at this point I am curious where did the initial value of MCONTEXT_FPREGS_MEM came from? The only right way to deal with this sounds to me doing the same thing that libunwind for x86_64 does in [3], that is, dereferencing the fpstate pointer passed by the kernel (line 60) and using fldenv to load that address. From: http://lwn.net/Articles/281921/ Though the kernel Vs user ucontext look somewhat similar, kernel's ucontext struct is different from glibc's ucontext struct because of sigset_t size differences between user Vs kernel. So fpstate in signal handlers must always be referred through the pointer in sigcontext, and not directly through __fpregs_mem in userlevels ucontext struct. > Is breakpad_getcontext_unittest working? Those unittests are pointless as regards this discussion. They just check that the magic number in ucontext_constants.h matches the offset in the header. IIUC they don't check that the memory layout defined in the header is valid and contains valid data. [1] https://android.googlesource.com/kernel/goldfish/+/android-3.10/arch/x86/kern... [2] https://android.googlesource.com/kernel/goldfish/+/android-3.10/arch/x86/incl... [3] https://android.googlesource.com/platform/external/libunwind/+/master/src/x86... Goodnight, your friendly code archaeologist.
Sign in to reply to this message.
I did some other checks with fresh mind. Confirming that we should NOT rely on the offset of fpregs, as it changable, I just checked that, even on Linux desktop, there is no uc_mcontext.fpregs != & __fpregs_mem. The latter is not modeling the memory layout of kernel space, but an internal implementation detail of GLibc which is used when calling getcontext() / makecontext(). On Desktop linux, a good speculation of the offset seems to be 448. Again, we should not trust this static offset. On the other side, I don't think that is a change that belongs to this CL. In more concrete words, this CL LGTM once the change in ucontext_constants.h is reverted. (x86_64 Desktop Proof below) primiano@primiano:/tmp$ make test && ./test g++ test.cc -o test CTX 0x7fff2e918740 mcontext off 40 gregs off 40 fpregs off 448 __fpregs_mem off 424 XMM0 1122334455667788 XMM1 aabbaabbaabbcccc =========================================== #include <signal.h> #include <stdio.h> #include <string.h> #include <ucontext.h> #include <stdint.h> #include <stdlib.h> void print(int sig, siginfo_t *info, void* v) { ucontext_t* ctx = (ucontext_t*)v; printf("CTX %p\n",ctx); printf("mcontext off %ld\n",(intptr_t)&ctx->uc_mcontext - (intptr_t)ctx); printf("gregs off %ld\n", (intptr_t)&ctx->uc_mcontext.gregs - (intptr_t)ctx); printf("fpregs off %ld\n", (intptr_t)ctx->uc_mcontext.fpregs - (intptr_t)ctx); printf("__fpregs_mem off %ld\n", (intptr_t)& ctx->__fpregs_mem - (intptr_t)ctx); printf("XMM0 %llx\n", *((long long*) & ctx->uc_mcontext.fpregs->_xmm[0])); printf("XMM1 %llx\n", *((long long*) & ctx->uc_mcontext.fpregs->_xmm[1])); exit(-1); } int main(){ long long C1 = 0x1122334455667788LL; long long C2 = 0xAABBAABBAABBCCCCLL; struct sigaction sa; //sa.sa_handler=print; sa.sa_sigaction=print;// warning here: assignment from incompatible pointer type sa.sa_flags=SA_SIGINFO; sigaction(SIGSEGV,&sa,NULL); asm volatile( "movdqu %0, %%xmm0;" "movdqu %1, %%xmm1;" "movl $42, (0)" // BOOOOOOOOOM : : "m" (C1), "m" (C2)); }
Sign in to reply to this message.
Also this line should be removed from common/android/breakpad_getcontext_unittest.cc. ASSERT_EQ(static_cast<size_t>(UCONTEXT_FPREGS_MEM_OFFSET), offsetof(ucontext_t,__fpregs_mem)); This is conceptually wrong, and fails even on Desktop/GlibC *. That check would prevent rolling the new NDK. __fpregs_mem is NOT meant to be the place where the kernel puts fpregs. Proof (on Linux desktop): printf("offsetof(ucontext_t,__fpregs_mem): %lu\n", __builtin_offsetof(ucontext_t,__fpregs_mem)); prints out 424 which is !== 448 and !== 304
Sign in to reply to this message.
Ok, another long talk with Anton. Forget my previous two replies. The MCONTEXT_FPREGS_MEM constant is NOT used to read the ucontext from the kernel as I was mistakenly assuming. it is used to create a ucontext for breakpad when requested (to implement getcontext). This means that the offset of the fpregs storage is irrelevant. What matters is that: - The struct that breakpad allocates (which depends on the contexts of ucontext.h is consistent with the MCONTEXT_FPREGS_MEM offset, and that we put the fpregs pointer accordingly). In other words, it looks like Patch Set 3, as it is, is fine, without doing any other change. Anton, will follow up soon.
Sign in to reply to this message.
LGTM, with MCONTEXT_FPREGS_MEM set to 304 After more discussion we are retracting some of the our previous statements and accepting this change as is. We follow the argument that fpregs are only ever read through the pointer and that they can therefore be in a different location in the signal handler. As the earlier change made the struct ucontext smaller, we want to make the offset smaller so that the assembler code only writes into the part of the struct which is allocated. On 2014/10/24 09:49:14, primiano wrote: > Also this line should be removed from > common/android/breakpad_getcontext_unittest.cc. > > ASSERT_EQ(static_cast<size_t>(UCONTEXT_FPREGS_MEM_OFFSET), > offsetof(ucontext_t,__fpregs_mem)); > > This is conceptually wrong, and fails even on Desktop/GlibC *. > That check would prevent rolling the new NDK. > > __fpregs_mem is NOT meant to be the place where the kernel puts fpregs. > > Proof (on Linux desktop): > printf("offsetof(ucontext_t,__fpregs_mem): %lu\n", > __builtin_offsetof(ucontext_t,__fpregs_mem)); > prints out 424 which is !== 448 and !== 304
Sign in to reply to this message.
Can you guys please make sure that the resolution of the tricky questions like in ucontext_constants.h gets written as comments in the code? Reading this discussion in rietveld after a few months will be hard.
Sign in to reply to this message.
|