Description
Hello,
Thanks for all your work on the glog library; we use it extensively in our code base!
I have a quick question about the expected behavior of the InstallFailureSignalHandler function when it is used in a binary that runs as the PID 1 process (i.e. the binary is the entrypoint of a Docker container).
I noticed that when one of the signals that InstallFailureSignalHandler handles (kFailureSignals: SIGSEGV, SIGABRT, etc.) is raised, the FailureSignalHandler function dumps the stack trace successfully, but the application then hangs indefinitely.
After some debugging, it seems that this is due to the behavior of the kill(getpid(), signal_number) call (ref) in the InvokeDefaultSignalHandler function.
My understanding of that function is that it "un-registers" the user handler for the given signal by resetting it to SIG_DFL, and then re-sends the signal with kill() so that the OS handles it through its default signal handler.
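To illustrate that understanding, here is a minimal standalone sketch of the reset-and-re-raise pattern. It is not glog's code: the names ResetAndReraise and TerminatingSignalOfChild are hypothetical, and SIGTERM is used in the example only to avoid producing a core dump. The sketch forks a child, lets the child's handler reset the signal to SIG_DFL and re-send it, and reports which signal terminated the child.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handler: restore the default disposition, then re-send the
// signal to ourselves so the OS applies its default action.
extern "C" void ResetAndReraise(int signal_number) {
  struct sigaction sig_action = {};
  sigemptyset(&sig_action.sa_mask);
  sig_action.sa_handler = SIG_DFL;
  sigaction(signal_number, &sig_action, nullptr);
  kill(getpid(), signal_number);
}

// Hypothetical helper: run the pattern in a forked child and return the
// signal that terminated it, or -1 if it was not terminated by a signal.
int TerminatingSignalOfChild(int signal_number) {
  pid_t child = fork();
  if (child < 0) return -1;
  if (child == 0) {
    struct sigaction sig_action = {};
    sigemptyset(&sig_action.sa_mask);
    sig_action.sa_handler = ResetAndReraise;
    sigaction(signal_number, &sig_action, nullptr);
    raise(signal_number);
    // Reached only if the re-sent signal did not terminate the process.
    std::_Exit(0);
  }
  int status = 0;
  if (waitpid(child, &status, 0) != child) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In an ordinary (non-PID 1) process I would expect TerminatingSignalOfChild(SIGTERM) to return SIGTERM, since the forked child is never PID 1 and the default action applies to it.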
Digging around, I found that for PID 1 processes in Linux, there are no default signal handlers, so the process won't receive the signal sent by kill():
PID 1 processes in Linux do not have any default signal handlers and as a result will not receive and propagate signals.
This will ultimately leave the process running indefinitely.
I added a patch to signalhandler.cc for our use case to handle this situation:
diff --git a/src/signalhandler.cc b/src/signalhandler.cc
index b6d6e25..338ce00 100644
--- a/src/signalhandler.cc
+++ b/src/signalhandler.cc
@@ -36,6 +36,7 @@
 #include "symbolize.h"
 #include "glog/logging.h"
+#include <cstdlib>
 #include <csignal>
 #include <ctime>
 #ifdef HAVE_UCONTEXT_H
@@ -252,6 +253,13 @@ void InvokeDefaultSignalHandler(int signal_number) {
   sigemptyset(&sig_action.sa_mask);
   sig_action.sa_handler = SIG_DFL;
   sigaction(signal_number, &sig_action, NULL);
+  if (1 == getpid()) {
+    // From Advanced Bash-Scripting Guide
+    // Fatal error signal "n" should return with 128 + n
+    // Where n is the fatal error signal
+    // https://tldp.org/LDP/abs/html/exitcodes.html
+    std::_Exit(128 + signal_number);
+  }
   kill(getpid(), signal_number);
 #elif defined(OS_WINDOWS)
   signal(signal_number, SIG_DFL);
However, I wanted to understand whether this PID 1 behavior is expected with glog, and whether others have had to do something similar.
FYI, we are using v0.5.0, but the InvokeDefaultSignalHandler function is the same in the latest version of glog.
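For reference, here is a small sketch of the 128 + n exit-code convention that the patch above relies on, showing how a parent process (for example, whatever supervises the container) would observe it. The helper names ExitCodeForFatalSignal and ObservedExitStatus are hypothetical and not part of glog; the forked child simply stands in for a PID 1 process taking the std::_Exit path.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: shell convention for "terminated by fatal signal n".
int ExitCodeForFatalSignal(int signal_number) { return 128 + signal_number; }

// Hypothetical helper: fork a child that exits the way the patched PID 1
// branch does, and return the exit status the parent observes (-1 on error).
int ObservedExitStatus(int signal_number) {
  pid_t child = fork();
  if (child < 0) return -1;
  if (child == 0) {
    // Same call as in the patch: exit immediately, without running atexit
    // handlers or static destructors.
    std::_Exit(ExitCodeForFatalSignal(signal_number));
  }
  int status = 0;
  if (waitpid(child, &status, 0) != child) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this convention the parent sees a normal exit with status 128 + n rather than a death by signal, so anything that inspects WIFSIGNALED on the PID 1 process would need to check the exit status instead.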