|
|
Created:
11 years, 7 months ago by chrisha Modified:
11 years ago Reviewers:
Mark Mentovai Base URL:
http://google-breakpad.googlecode.com/svn/trunk/ Visibility:
Public. |
DescriptionCreate ClientData structure. This creates a class for representing crash key/value pairs. Further CLs will plumb the writing of a user data stream in the minidump which contains serialized crash keys. Patch Set 1 : #
Total comments: 29
Patch Set 2 : Addressed Mark's comments. #
Total comments: 7
Patch Set 3 : More detailed comments. #
MessagesTotal messages: 9
https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc File src/common/client_data.cc (right): https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:45: const uint32_t kClientDataVersion = 0; I’m happy to see that you’ve included a version! https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:90: "additional-stack-trace"; Are you only supporting one additional stack trace? I thought you had suggested you’d support multiple. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:97: // to change. 4-byte version number: is it big- or little-endian or is it native-endian and subject to swapping when being read? Say so somewhere. It’s OK to treat this as an opaque format, but since you’re writing a writer and a reader (perhaps more than one reader), you should have the answer nailed down and written down somewhere for yourself and other people working on the reader and writer. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:106: uint8_t* o = reinterpret_cast<uint8_t*>(buffer->data()); o is a funny name. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:118: bool ClientData::Decode(const void* void_data, size_t length) { The minidump reader will use this, right? This will fail an assertion if there’s badly-formatted data in the minidump, right? And that’ll happen in the processor? Maybe it’s better to be a little more fault-tolerant. I think that most of the places that we read minidump stuff, we don’t just assert and croak if the minidump is badly-formatted. We reject the stream or the entire minidump instead. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:129: assert(sizeof(version) == sizeof(kClientDataVersion)); Seems like a weird thing to assert. It’s like int i, j; assert(sizeof i == sizeof j); https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:130: if (version > kClientDataVersion) { != is kind of more obvious. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... File src/google_breakpad/common/client_data.h (right): https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:33: #ifndef CLIENT_CLIENT_DATA_H_ GOOGLE_BREAKPAD_COMMON_CLIENT_DATA_H_ https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:37: #include <string> Blank line before this one, separate the C and C++ headers. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:45: struct ClientData { This is kinda way more than just the data that it holds, Encode and Decode are pretty non-trivial, there’s some static const data, …. I think it’s definitely “class”-worthy. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:49: struct KeyValuePair { But this one’s simple, it’s little more than just its data and a constructor, it’s OK as a struct. But do you want to inline the default constructor in the header? Probably not, it should go in the .cc. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:54: template<typename StringType1, typename StringType2> What’s with the templatization here? You don’t use this outside of the test, and even in the test, you only have const char* arguments. I’d consider getting rid of this entirely since it’s unused outside of the test and since the test can just set key and value like is done outside of the test. If you’re keeping it because you’re using it someplace else (that I can’t see yet), consider getting rid of the templatization. You can just have a constructor that takes const char*s or that takes std::string&s and let conversions do their thing. Or is there another reason for this that I’ve missed? If you’re keeping this but untemplatizing it, you probably don’t want to keep the implementation in the header, you probably want to move it to the .cc like I’ve suggested for the default constructor. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:82: // "str-name, hex-frame-address, hex-frame-count, hex-microseconds-ago, I don’t know what str-name is based on your description, and there’s no code that adheres to this alleged format yet, so I can’t glean it from that yet either. I don’t know for sure what to do with hex-frame-address either. Is it used as a pointer? What’s the format of what it points to? hex-frame-count pointers? Why would something abstract like this use a pointer and refer to crash process memory? This isn’t enough to describe what’s actually happening. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/mini... File src/google_breakpad/common/minidump_format.h (right): https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/mini... src/google_breakpad/common/minidump_format.h:345: // This is an optional stream that contains a serialized Follow the example around you and use /* */ for comments in this file. (That was originally done because this file could be used in non-++ C, in the days where // wasn’t always accepted as introducing a comment in C.)
Sign in to reply to this message.
Thanks for the detailed comments. Another look? https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc File src/common/client_data.cc (right): https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:45: const uint32_t kClientDataVersion = 0; On 2013/09/05 20:16:11, Mark Mentovai wrote: > I’m happy to see that you’ve included a version! If only this had been done elsewhere (like for example, in the IPC protocol). Then we could get rid of the limited key/value mechanism and move to something a little more expressive! https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:90: "additional-stack-trace"; On 2013/09/05 20:16:11, Mark Mentovai wrote: > Are you only supporting one additional stack trace? I thought you had suggested > you’d support multiple. I wanted to double check with you first (I didn't get around to digging through the whole implementation): do keys have to be unique in the current implementation? Or is it possible to have multiple crash keys with the same key? If not, this will more appropriately be a 'prefix' that we'll look for. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:97: // to change. On 2013/09/05 20:16:11, Mark Mentovai wrote: > 4-byte version number: is it big- or little-endian or is it native-endian and > subject to swapping when being read? Say so somewhere. It’s OK to treat this as > an opaque format, but since you’re writing a writer and a reader (perhaps more > than one reader), you should have the answer nailed down and written down > somewhere for yourself and other people working on the reader and writer. Very good point. Not used to doing cross-platform binary formats. Will specify in the docs. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:106: uint8_t* o = reinterpret_cast<uint8_t*>(buffer->data()); On 2013/09/05 20:16:11, Mark Mentovai wrote: > o is a funny name. Done. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:118: bool ClientData::Decode(const void* void_data, size_t length) { On 2013/09/05 20:16:11, Mark Mentovai wrote: > The minidump reader will use this, right? > > This will fail an assertion if there’s badly-formatted data in the minidump, > right? > > And that’ll happen in the processor? > > Maybe it’s better to be a little more fault-tolerant. I think that most of the > places that we read minidump stuff, we don’t just assert and croak if the > minidump is badly-formatted. We reject the stream or the entire minidump > instead. The first assertion will only fire if you pass NULL data into the decoder, which is programmer error. On badly formatted data it'll return false. So fault tolerant if used properly. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:129: assert(sizeof(version) == sizeof(kClientDataVersion)); On 2013/09/05 20:16:11, Mark Mentovai wrote: > Seems like a weird thing to assert. It’s like > > int i, j; > assert(sizeof i == sizeof j); This should actually be compile assert (wanted to make sure I was reading the something of the correct size). No longer an issue after refactor. https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:130: if (version > kClientDataVersion) { On 2013/09/05 20:16:11, Mark Mentovai wrote: > != is kind of more obvious. Done. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... File src/google_breakpad/common/client_data.h (right): https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:33: #ifndef CLIENT_CLIENT_DATA_H_ On 2013/09/05 20:16:11, Mark Mentovai wrote: > GOOGLE_BREAKPAD_COMMON_CLIENT_DATA_H_ Done. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:37: #include <string> On 2013/09/05 20:16:11, Mark Mentovai wrote: > Blank line before this one, separate the C and C++ headers. Done. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:45: struct ClientData { On 2013/09/05 20:16:11, Mark Mentovai wrote: > This is kinda way more than just the data that it holds, Encode and Decode are > pretty non-trivial, there’s some static const data, …. I think it’s definitely > “class”-worthy. Done. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:49: struct KeyValuePair { On 2013/09/05 20:16:11, Mark Mentovai wrote: > But this one’s simple, it’s little more than just its data and a constructor, > it’s OK as a struct. But do you want to inline the default constructor in the > header? Probably not, it should go in the .cc. Done. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:54: template<typename StringType1, typename StringType2> On 2013/09/05 20:16:11, Mark Mentovai wrote: > What’s with the templatization here? > > You don’t use this outside of the test, and even in the test, you only have > const char* arguments. I’d consider getting rid of this entirely since it’s > unused outside of the test and since the test can just set key and value like is > done outside of the test. If you’re keeping it because you’re using it someplace > else (that I can’t see yet), consider getting rid of the templatization. You can > just have a constructor that takes const char*s or that takes std::string&s and > let conversions do their thing. > > Or is there another reason for this that I’ve missed? > > If you’re keeping this but untemplatizing it, you probably don’t want to keep > the implementation in the header, you probably want to move it to the .cc like > I’ve suggested for the default constructor. This was purely for easing the test so I could generate data easily. Created a factory method there. (And the template was an ugly poor man's workaround to lack of StringPiece. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:82: // "str-name, hex-frame-address, hex-frame-count, hex-microseconds-ago, On 2013/09/05 20:16:11, Mark Mentovai wrote: > I don’t know what str-name is based on your description, and there’s no code > that adheres to this alleged format yet, so I can’t glean it from that yet > either. > > I don’t know for sure what to do with hex-frame-address either. Is it used as a > pointer? What’s the format of what it points to? hex-frame-count pointers? Why > would something abstract like this use a pointer and refer to crash process > memory? This isn’t enough to describe what’s actually happening. The name is simply syntactic sugar. The frame address is a hex-encoded pointer to the location of a bunch of pointers in the crashed process' address space. More documentation added. https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/mini... File src/google_breakpad/common/minidump_format.h (right): https://breakpad.appspot.com/625002/diff/2001/src/google_breakpad/common/mini... src/google_breakpad/common/minidump_format.h:345: // This is an optional stream that contains a serialized On 2013/09/05 20:16:11, Mark Mentovai wrote: > Follow the example around you and use /* */ for comments in this file. > > (That was originally done because this file could be used in non-++ C, in the > days where // wasn’t always accepted as introducing a comment in C.) Done.
Sign in to reply to this message.
https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc File src/common/client_data.cc (right): https://breakpad.appspot.com/625002/diff/2001/src/common/client_data.cc#newco... src/common/client_data.cc:90: "additional-stack-trace"; chrisha wrote: > I wanted to double check with you first (I didn't get around to digging through > the whole implementation): do keys have to be unique in the current > implementation? Or is it possible to have multiple crash keys with the same key? > > If not, this will more appropriately be a 'prefix' that we'll look for. The keys are usually keys to a dictionary-like structure on most (or maybe all) client implementations, so that enforces uniqueness. https://breakpad.appspot.com/625002/diff/3008/src/common/client_data.cc File src/common/client_data.cc (right): https://breakpad.appspot.com/625002/diff/3008/src/common/client_data.cc#newco... src/common/client_data.cc:32: // See client_data.h for details. Details? It seems like the details are all here. https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... File src/google_breakpad/common/client_data.h (right): https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:92: // included in the minidump then this stack trace will not be processed. Now I’m wondering about how you’re designing the client side of this. When we talked about this a couple of days ago, one idea was to stuff this new stream into the minidump after it had been written, before upload. But now I’m seeing that your proposal for this stream’s contents refers to process memory. Our minidumps aren’t normally full-memory dumps. In order to make this work, the dump writer would need to know which memory regions this stream was going to wind up pointing to, so that it could write those regions out in the dump. But if the dump writer knows what this stream is going to contain, then we wouldn’t need to write this stream out separately after the minidump was written. What are your current thoughts on the client-side design of this feature? I want to make sure we’re both on the same page. (It’s OK if you’ve moved on to a different page, but I want to know about it if you have so that I can give you a proper review.) https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:96: // The number of microseconds ago since the stack trace was collected. µs since it was collected until…the dump was written? The process crashed? The “-ago” nature of this isn’t necessarily the easiest thing to deal with here, and unless you specify it more precisely, microsecond precision is kind of pointless. An alternative might be to track this in “time since process launch,” if that’s something you can get at easily. That at least provides a well-defined epoch.
Sign in to reply to this message.
Answers to your questions inline. Loosely thought out rant below: In terms of over-arching design, I'm trying to add new functionality using existing mechanisms: basically, I only want to add a few directives that the MinidumpProcessor can then use to surface additional information of interest. In the perfect world, I'd scrap the existing key/value mechanism for a more flexible generic message passing system. These messages would be pushed to the breakpad client, and unique handles to them returned (so they could later be deleted or modified). Some of the messages would be directives to the crash server to be followed at the time of generating a minidump (things like: grab this region of memory). Some message would be aimed at the MinidumpProcessor, and describe additional processing to perform (can think of these as commands to be performed in an interactive debugging session). Messages of the first type may be needed to support the work done by messages of the second type. Finally, other messages would simply be additional metadata to be surfaced visibly in crash reports (like crash keys right now). https://breakpad.appspot.com/625002/diff/3008/src/common/client_data.cc File src/common/client_data.cc (right): https://breakpad.appspot.com/625002/diff/3008/src/common/client_data.cc#newco... src/common/client_data.cc:32: // See client_data.h for details. On 2013/09/06 16:55:48, Mark Mentovai wrote: > Details? It seems like the details are all here. Indeed they are. Removed. https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... File src/google_breakpad/common/client_data.h (right): https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:92: // included in the minidump then this stack trace will not be processed. On 2013/09/06 16:55:48, Mark Mentovai wrote: > Now I’m wondering about how you’re designing the client side of this. > > When we talked about this a couple of days ago, one idea was to stuff this new > stream into the minidump after it had been written, before upload. But now I’m > seeing that your proposal for this stream’s contents refers to process memory. > > Our minidumps aren’t normally full-memory dumps. In order to make this work, the > dump writer would need to know which memory regions this stream was going to > wind up pointing to, so that it could write those regions out in the dump. > > But if the dump writer knows what this stream is going to contain, then we > wouldn’t need to write this stream out separately after the minidump was > written. > > What are your current thoughts on the client-side design of this feature? I want > to make sure we’re both on the same page. (It’s OK if you’ve moved on to a > different page, but I want to know about it if you have so that I can give you a > proper review.) I'm not picturing any extra involvement with the minidump generator. The exact use case I have in mind is for ASAN, which arranges for additional stack traces to be on the stack of the crashing thread, thus ensuring the data makes it into the minidump. Hence use of an 'additional stack trace POINTER' makes sense. I think this approach is a sane one; the only people adding crash keys with extra debugging information or going to be running in the process, and they're likely going to be causing the crash or handling one caused elsewhere. They're thus in position to put relevant data onto the stack before passing control to the crash server. https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:96: // The number of microseconds ago since the stack trace was collected. On 2013/09/06 16:55:48, Mark Mentovai wrote: > µs since it was collected until…the dump was written? The process crashed? The > “-ago” nature of this isn’t necessarily the easiest thing to deal with here, and > unless you specify it more precisely, microsecond precision is kind of > pointless. > > An alternative might be to track this in “time since process launch,” if that’s > something you can get at easily. That at least provides a well-defined epoch. Good point. I was having in mind that the person generating the crash key was also the person causing the crash (and indirectly the minidump generation), thus the time of those two events would be effectively the same. (In SyzyASAN we actually explicitly track absolute time using RDTSC, and convert to an 'ago' time at the moment of the crash.) The intent is not for this to be *too* precise, as we can't see into the past anyways. Rather its simply intended to give you some idea of time scale. Maybe milliseconds is sufficient? If we move to time since process launch, are process start and crash times already recorded?
Sign in to reply to this message.
https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... File src/google_breakpad/common/client_data.h (right): https://breakpad.appspot.com/625002/diff/3008/src/google_breakpad/common/clie... src/google_breakpad/common/client_data.h:92: // included in the minidump then this stack trace will not be processed. chrisha wrote: > I'm not picturing any extra involvement with the minidump generator. The exact > use case I have in mind is for ASAN, which arranges for additional stack traces > to be on the stack of the crashing thread, thus ensuring the data makes it into > the minidump. Hence use of an 'additional stack trace POINTER' makes sense. > > I think this approach is a sane one; the only people adding crash keys with > extra debugging information or going to be running in the process, and they're > likely going to be causing the crash or handling one caused elsewhere. They're > thus in position to put relevant data onto the stack before passing control to > the crash server. We already have use cases that this won’t address. On the Mac, we currently stuff backtraces into crash keys in a number of situations that aren’t immediately followed by a crash or a dump. We do this in cases where a backtrace might be useful in the event that a crash happens “soon,” and for which the actual stack trace at the time of the crash would be less useful. (This can happen if the crash happens after cycling through the run loop again, or if it happens after unwinding the stack due to an exception.) The current usage on the Mac stuffs the backtrace itself into the crash key. Making it work with your design would require some sort of non-stack memory, because by the time the crash occurs, the stack frame where we’d save the backtrace will be long gone. Outside of asan, I think that this is probably the way that the backtrace feature would be used most frequently. If the backtrace was something you could just get when the stack is in the shape it’ll be in at crash time, you’re probably either in the crash frame, or in some caller of the crash frame, and in either case, the thread’s normal backtrace as currently walked by the Breakpad stackwalker would be identical or a superset of a hypothetical client-walked backtrace. chrome/common/crash_keys.cc has some of the keys I’m talking about on the Mac. The ones ending in _bt are stack traces.
Sign in to reply to this message.
> We already have use cases that this won’t address. On the Mac, we > currently stuff backtraces into crash keys in a number of situations > that aren’t immediately followed by a crash or a dump. We do this in > cases where a backtrace might be useful in the event that a crash > happens “soon,” and for which the actual stack trace at the time of the > crash would be less useful. (This can happen if the crash happens after > cycling through the run loop again, or if it happens after unwinding the > stack due to an exception.) The current usage on the Mac stuffs the > backtrace itself into the crash key. Making it work with your design > would require some sort of non-stack memory, because by the time the > crash occurs, the stack frame where we’d save the backtrace will be long > gone. This would simply be another type of key recognized by the processor: an additional stack trace with the frames provided explicitly in the value. If we want to formalize a spec for that, I'm all for it. The hardest part in doing this is the current 64-byte size limitation to key/values (does this size limit exist on platforms other than Windows?). As I understand things there is some automatic chunking which occurs to allow transmitting longer keys. We'd either have to undo the chunking in the crash server before stuffing the client data into the minidump, or undo the chunking in the minidump processor. (In terms of a spec, maybe something like keys of the form "name_bt", and data of the form "hex-thread-id, hex-us-since-process-start, hex-frame-0, hex-frame-1, ..."?)
Sign in to reply to this message.
I mean: Why should we have two ways to supply what’s conceptually the same data: a prewalked stack? We don’t have to carry the data in the crash key’s value itself, the pointer that you’re proposing is fine, as long as we have a good way to make sure the data actually winds up in the dump. Perhaps this is where an additional backtrace-specific stream would have an advantage over the generic crash key stream, because of the size limitation in the key stream. The size limits on Windows are 64-wchar (not 64-byte) keys and values (breakpad src/client/windows/common/ipc_protocol.h) with uniqueness guaranteed for “dynamic” entries limited to 256 pairs using the SetCrashKeyValue API in Chrome (chrome chrome/app/breakpad_win.cc). Mac (breakpad src/common/simple_string_dictionary.h SimpleStringDictionary) and Linux (chrome chrome/app/breakpad_linux_impl.h CrashKeyStorage) use 256-char keys and values limited to 64 pairs with key uniqueness guaranteed by the dict-like structure. I don’t think there’s any good reason for the limits to be as low as they are on Windows, except for protocol compatibility. :/ The inconsistency between Windows and Mac/Linux is a little annoying. Even the limits on Mac and Linux could reasonably be raised. On Fri, Sep 6, 2013 at 2:15 PM, Chris Hamilton <chrisha@chromium.org> wrote: > > We already have use cases that this won’t address. On the Mac, we > > currently stuff backtraces into crash keys in a number of situations > > that aren’t immediately followed by a crash or a dump. We do this in > > cases where a backtrace might be useful in the event that a crash > > happens “soon,” and for which the actual stack trace at the time of the > > crash would be less useful. (This can happen if the crash happens after > > cycling through the run loop again, or if it happens after unwinding the > > stack due to an exception.) The current usage on the Mac stuffs the > > backtrace itself into the crash key. Making it work with your design > > would require some sort of non-stack memory, because by the time the > > crash occurs, the stack frame where we’d save the backtrace will be long > > gone. > > This would simply be another type of key recognized by the processor: > an additional stack trace with the frames provided explicitly in the > value. If we want to formalize a spec for that, I'm all for it. The > hardest part in doing this is the current 64-byte size limitation to > key/values (does this size limit exist on platforms other than > Windows?). As I understand things there is some automatic chunking > which occurs to allow transmitting longer keys. We'd either have to > undo the chunking in the crash server before stuffing the client data > into the minidump, or undo the chunking in the minidump processor. > > (In terms of a spec, maybe something like keys of the form "name_bt", > and data of the form "hex-thread-id, hex-us-since-process-start, > hex-frame-0, hex-frame-1, ..."?) >
Sign in to reply to this message.
|