
single syscall process status check

Written by Ledion Bitincka

May 21, 2018

WARNING: this post talks about low-level stuff like named pipes, O_RDONLY, and O_NONBLOCK, which might sound like gibberish, so read on only if you need to reliably check the status (alive or not) of a given process.

Problem statement: we have a process and we need to programmatically check whether it is running. We want this check to be both reliable and cheap.

This problem is pretty common whenever there is a daemon process and some other process (not necessarily a child) needs to check whether the daemon is alive. Many solutions to this problem have been implemented in the wild; I’ll mention two common ones:

  • PID file – the daemon creates a file and writes its process ID to it. Processes interested in its status read the .pid file and check whether that process is still running, plus some extra checks to ensure the PID wasn’t recycled. On graceful exit, the daemon cleans up the .pid file.
  • socket/endpoint – the daemon opens a local port and listens for connections. Processes interested in the daemon's status connect to the local port and probe it.

The above either have reliability issues, are not exactly cheap (at least a few syscalls each), or both.
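
To make the cost comparison concrete, here is a minimal Python sketch of what a PID-file check typically involves. It is an illustration of the general approach rather than any particular daemon's implementation; the function name `pid_file_says_alive` and the file layout (a single decimal PID) are assumptions.

```python
import os


def pid_file_says_alive(pid_file):
    """Return True if the PID recorded in pid_file names an existing process."""
    try:
        with open(pid_file) as f:      # open + read + close: three syscalls already
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False                   # no file, or its content is not a number
    if pid <= 0:
        return False                   # kill() treats 0 and negatives as process groups
    try:
        os.kill(pid, 0)                # signal 0 delivers nothing, only checks existence
    except ProcessLookupError:
        return False                   # no process with that PID
    except PermissionError:
        return True                    # process exists but belongs to another user
    # Caveat: this cannot tell the daemon apart from an unrelated
    # process that happens to have been given the same (recycled) PID.
    return True
```

Even this simplified version needs four syscalls and still cannot rule out PID reuse, which is the reliability gap mentioned above.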

The proposed solution relies on the following behavior of named pipes, as described in the Linux fifo(7) man page:

  A process can open a FIFO in nonblocking mode.  In this case, opening
  for read-only succeeds even if no one has opened on the write side
  yet and opening for write-only fails with ENXIO (no such device or
  address) unless the other end has already been opened.
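
The quoted behavior is easy to observe directly. The following Python sketch (the function name and the temporary path are made up for the demonstration) opens a fresh FIFO for writing before and after a reader exists and reports what happened.

```python
import errno
import os
import tempfile


def fifo_open_semantics():
    """Return (errno of a writer-only open, whether a writer opens once a reader exists)."""
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "demo.fifo")   # hypothetical location
    os.mkfifo(path)

    # 1. Write-only + non-blocking while nobody has the read side open.
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        os.close(fd)
        writer_errno = None
    except OSError as exc:
        writer_errno = exc.errno

    # 2. Read-only + non-blocking succeeds even though there is no writer.
    read_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)

    # 3. Now that the read side is open, the same write-only open succeeds.
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        os.close(fd)
        writer_ok_with_reader = True
    except OSError:
        writer_ok_with_reader = False

    os.close(read_fd)
    os.unlink(path)
    os.rmdir(tmp_dir)
    return writer_errno, writer_ok_with_reader
```

On a Linux system, the first element should be `errno.ENXIO` and the second `True`, matching the man page text.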

Solution

At startup, the daemon process creates and opens a named pipe (in a well-known location) in read-only and non-blocking mode (O_RDONLY | O_NONBLOCK). This will succeed even if no writers are attached to the pipe.
Cost: 2 syscalls (mkfifo + open)
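
As a rough sketch of the daemon side in Python (the name `announce_alive` is hypothetical, and reusing a FIFO left over from a previous run is an assumption about how restarts should behave):

```python
import os
import stat


def announce_alive(fifo_path):
    """Create the FIFO if needed and open its read end without blocking.

    Returns the file descriptor. The daemon must keep it open for as long
    as it wants to be reported as alive; the kernel closes it on exit.
    """
    try:
        os.mkfifo(fifo_path)
    except FileExistsError:
        # Assumption: a FIFO left behind by an earlier run is simply reused.
        # Anything else at that path is treated as an error.
        if not stat.S_ISFIFO(os.stat(fifo_path).st_mode):
            raise
    return os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
```

The daemon never needs to read from the descriptor; holding it open is the whole signal. Note that the leftover-file check adds a stat call on restart, on top of the two syscalls counted above.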

Processes interested in knowing the status of the daemon attempt to open the same pipe in write-only and non-blocking mode (O_WRONLY | O_NONBLOCK). This will only succeed if some other process/thread (the daemon) has opened the pipe for reading.
Cost: 1 syscall (open)
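
The checking side could look like the following Python sketch. The name `is_alive` is hypothetical, and treating a missing FIFO (ENOENT) as "not running" is an assumption; other errors, such as a permissions problem, are re-raised rather than guessed at.

```python
import errno
import os


def is_alive(fifo_path):
    """Return True if some process currently holds the FIFO open for reading."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False      # FIFO exists but nobody is reading: daemon is gone
        if exc.errno == errno.ENOENT:
            return False      # FIFO was never created: daemon never started
        raise                 # anything else is a real error, not a status
    os.close(fd)              # success path only: release the descriptor
    return True
```

The answer comes from the open call alone; the close on the success path is housekeeping so that the checker does not leak descriptors.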

The solution is simple, cheap, and resilient to the daemon process exiting uncleanly, OS crash/reboot, etc. If the daemon process ends up in a deadlock or some other hung state, this solution is not sufficient, as another bit of information is needed, i.e. alive + well.
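
The resilience to unclean exits can be checked with a small experiment. The Python sketch below (all names and the temporary path are made up for the demonstration) forks a stand-in "daemon" that holds the FIFO open, kills it with SIGKILL so it gets no chance to clean up, and probes before and after.

```python
import os
import signal
import tempfile


def probe(path):
    """Return True if the FIFO at path currently has a reader."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError:
        return False
    os.close(fd)
    return True


def sigkill_demo():
    """Return (alive before SIGKILL, alive after SIGKILL)."""
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "daemon.fifo")
    os.mkfifo(path)
    ready_r, ready_w = os.pipe()       # lets the child say "FIFO is open"

    pid = os.fork()
    if pid == 0:                       # child: the stand-in daemon
        try:
            os.close(ready_r)
            os.open(path, os.O_RDONLY | os.O_NONBLOCK)
            os.write(ready_w, b"1")
            while True:
                signal.pause()         # sleep until killed
        finally:
            os._exit(1)

    os.close(ready_w)
    os.read(ready_r, 1)                # wait until the child holds the FIFO
    os.close(ready_r)

    before = probe(path)
    os.kill(pid, signal.SIGKILL)       # no handlers run, no cleanup happens
    os.waitpid(pid, 0)
    after = probe(path)                # the kernel has closed the child's descriptors

    os.unlink(path)
    os.rmdir(tmp_dir)
    return before, after
```

The FIFO file itself is still on disk after the kill, yet the probe fails, because what is being tested is the open read end rather than the file's existence. That is the difference from a stale .pid file.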

Enough explaining: check out the C/C++ code or Go code, and feel free to reuse/borrow. The concept works equally well in other languages, as long as you have access to mkfifo and non-blocking flags for file open.

If you need to do the same thing on Windows, you can use a similar approach but open a file in exclusive mode (see dwShareMode).
