[wplug] killing process that will not die

Kuzman Ganchev kuzman at sccs.swarthmore.edu
Thu Aug 7 20:45:55 EDT 2003


On Thu, Aug 07, 2003 at 12:24:12PM -0400, Michael Skowvron wrote:
> > This means that
> > there are no signals it can't ignore... I know this is the case for
> > kernel threads (a signal is just a flag that gets set, and the thread
> > has to be nice about listening). 
> 
> Sort of. Any process/thread can mask or ignore certain signals if 
> programmed to do so. There are other signals that cannot be masked or 
> ignored (KILL, STOP, CONT). 

A little pedantic at this point, but my $0.02.

There are two things people mean by "ignoring" a signal. A process can
set the disposition to SIG_IGN, in which case the signal is simply
discarded, or it can define a custom signal handler that doesn't do
anything, so the handler returns right away and the program continues
execution from where it stopped. The difference is significant, because
if you are in the middle of a blocking system call (e.g. a read), a
caught signal can interrupt it, and you return from the system call
with an error (typically -1, with errno set to EINTR). In that case, if
what we want to do is ignore the signal, we need to make the call again.
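
To make the "make the call again" part concrete, here is a rough
userspace sketch. The names noop_handler, install_noop_handler and
read_retry are made up for illustration; the handler is installed
without SA_RESTART on the assumption that we want to see EINTR rather
than have the kernel restart the call for us.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler that does nothing: the signal is caught, not discarded. */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Install the do-nothing handler. sa_flags is 0 (no SA_RESTART), so a
 * blocking call interrupted by this signal fails with EINTR. */
static int install_noop_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* read() wrapper that reissues the call whenever it was interrupted
 * by a signal, which is what "ignoring" amounts to for the caller. */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Setting SA_RESTART in sa_flags would let the kernel restart many (not
all) interrupted calls on its own; the explicit loop just makes the
behaviour visible.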

Kernel threads are not interruptible, meaning that the code they are
running has to check for signals explicitly by calling 

static inline int signal_pending(struct task_struct *p)

so you might see a line like:

if (signal_pending(current)) goto out;

If there are no such checks, then the thread will keep doing its
thing. IIRC signals do kick you out of I/O sometimes, since the
blocking I/O functions check for pending signals and return early.
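
The same "a signal is just a flag, and the code has to be nice about
checking it" idea can be mimicked in userspace. This is only an
analogy, not kernel code: pending, mark_pending, install_flag_handler
and run_worker are hypothetical names, and the flag check stands in
for signal_pending(current).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only effect of the signal is to set this flag. */
static volatile sig_atomic_t pending = 0;

static void mark_pending(int signo)
{
    (void)signo;
    pending = 1;
}

static int install_flag_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = mark_pending;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Do up to max_steps units of work, checking the flag before each
 * unit. To simulate delivery, the signal is raised once send_at units
 * are done (pass a negative send_at to never raise it). Returns the
 * number of units completed. */
static long run_worker(long max_steps, long send_at, int signo)
{
    long done = 0;

    assert(max_steps >= 0);
    while (done < max_steps) {
        if (pending)        /* stands in for signal_pending(current) */
            break;
        done++;             /* one unit of "work" */
        if (done == send_at)
            raise(signo);
    }
    return done;
}
```

Take the check out of the loop and the worker runs all max_steps units
no matter how many signals arrive, which is the "keeps doing its
thing" case.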

> However, if the thread gets stuck in the system call because it is
> waiting for a sync variable or semaphore to clear, then it will
> never get scheduled again and never see the signal.

Process or thread? Kernel threads don't make system calls -- they are
already part of the kernel. I think that while in a system call,
processes behave like kernel threads -- i.e. it's up to whoever wrote
the system call to make sure they eventually return if they get a
signal, but I'm not 100% sure about that. 

> This is what I believe happened to the unkillable process. It reached a 
> barrier and was not allowed to continue executing until a certain 
> "thing" happened. Most likely this would be some kind of sync variable 
> or semaphore. 
> I speculate that it is probably related to coordination of 
> the various threads within the process. It's definitely a bug, but there 
> is no way to determine its origin. It could be a kernel bug relating to 
> the scheduler, a bug in the threads library, or a bug in the process's 
> code itself.

A user-level thread library shouldn't affect receiving SIGKILL, since
the whole set of threads should be a single process. Unless I'm
missing something about thread libraries that use kernel
functionality. 
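
For what it's worth, userspace code (a thread library included) can't
change what SIGKILL does even if it tries. A quick way to check this,
as a sketch: can_ignore is a made-up helper that asks sigaction() to
set a signal to SIG_IGN and reports whether the kernel accepted it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Returns 1 if the disposition of signo can be set to SIG_IGN, and 0
 * if sigaction() refuses. The previous disposition is restored on
 * success so the probe has no lasting effect. */
static int can_ignore(int signo)
{
    struct sigaction sa, old;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old) == -1)
        return 0;               /* refused, e.g. for SIGKILL or SIGSTOP */
    sigaction(signo, &old, NULL);
    return 1;
}
```

On Linux I'd expect can_ignore(SIGKILL) and can_ignore(SIGSTOP) to
return 0 and can_ignore(SIGTERM) to return 1. That only says the
process can't opt out of SIGKILL; it says nothing about when the
kernel actually gets around to acting on it.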

Kuzman



