single syscall process status check

Last edited: May 21, 2018

WARNING: this post covers low-level topics such as named pipes, O_RDONLY, and O_NONBLOCK, which might sound like gibberish. Read on only if you need to reliably check the status (alive or not) of a given process.

Problem statement: we have a process and we need to programmatically check whether it is running. We want this check to be both reliable and cheap.

This problem is pretty common whenever there is a daemon process and some other process (not necessarily a child) needs to check whether the daemon is alive. Many solutions to this problem have been implemented in the wild. I’ll mention two common ones:

  • PID file – the daemon creates a file and writes its process ID to it. Processes interested in its status read the .pid file and check whether that process is still running, plus some extra checks to ensure the PID wasn’t recycled. On graceful exit, the daemon cleans up the .pid file.

  • socket/endpoint – the daemon opens a local port and listens for connections. Processes interested in the daemon's status connect to the local port and probe it.

The above have reliability issues, are not exactly cheap (at least a few syscalls each), or both.
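For comparison, here is a minimal sketch of what a PID-file check might look like. It is written in Python only for brevity; the function name `pid_file_alive` and the file layout (a single decimal PID) are assumptions for illustration, not taken from any particular daemon. Even this bare version needs to open, read, and close the file and then probe the process with signal 0, and it does nothing to detect a recycled PID.

```python
import os


def pid_file_alive(pid_path):
    """Return True if the PID stored in pid_path refers to an existing process.

    Hypothetical helper: assumes the file holds one decimal PID.
    It does NOT protect against the PID having been recycled.
    """
    try:
        with open(pid_path) as f:      # open + read + close
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False                   # missing, unreadable, or malformed file
    if pid <= 0:
        return False                   # 0 and negatives address process groups
    try:
        os.kill(pid, 0)                # signal 0: existence check, nothing is sent
    except ProcessLookupError:
        return False                   # no process with that PID
    except PermissionError:
        return True                    # process exists but belongs to another user
    return True
```

A stale .pid file left behind by an unclean exit can make this return True for an unrelated process that happens to have the same PID, which is the reliability problem mentioned above.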

The proposed solution relies on the following behavior of named pipes:

A process can open a FIFO in nonblocking mode. In this case, opening for read-only succeeds even if no one has opened on the write side yet and opening for write-only fails with ENXIO (no such device or address) unless the other end has already been opened.
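The behavior quoted above can be observed directly. The following is a small sketch, assuming a Unix-like system where Python's `os.mkfifo` is available; the helper name `fifo_open_results` and the temporary file name are made up for the demonstration.

```python
import errno
import os
import tempfile


def fifo_open_results():
    """Open a fresh FIFO in both directions and record what each open did."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.fifo")   # hypothetical name
    os.mkfifo(path)
    results = {}
    try:
        # No reader yet: a non-blocking write-only open is expected to fail.
        try:
            os.close(os.open(path, os.O_WRONLY | os.O_NONBLOCK))
            results["write_without_reader"] = "ok"
        except OSError as e:
            results["write_without_reader"] = errno.errorcode[e.errno]

        # A non-blocking read-only open succeeds even with no writer.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        results["read_without_writer"] = "ok"

        # Now that the read end is held open, the write-only open succeeds.
        wfd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        results["write_with_reader"] = "ok"
        os.close(wfd)
        os.close(rfd)
    finally:
        os.unlink(path)
        os.rmdir(workdir)
    return results
```

On a system that follows the quoted behavior, the first entry should be the string "ENXIO" and the other two "ok".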

Solution

At startup, the daemon process creates and opens a named pipe (in a well-known location) in read-only and non-blocking mode (O_RDONLY | O_NONBLOCK). This will succeed even if no writers are attached to the pipe. Cost: 2 syscalls (mkfifo + open).
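A minimal sketch of the daemon side could look like the following. The function name `announce_alive`, the default mode `0o600` (which assumes the checkers run as the same user as the daemon), and the handling of a leftover FIFO are illustrative choices, not part of the original code.

```python
import os
import stat


def announce_alive(fifo_path, mode=0o600):
    """Create the FIFO if needed and hold its read end open.

    Returns the file descriptor; the daemon keeps it open for its whole
    lifetime and the kernel closes it when the process exits, cleanly or not.
    The mode is an assumption: checkers need write permission on the FIFO.
    """
    try:
        os.mkfifo(fifo_path, mode)
    except FileExistsError:
        # Left over from an earlier run: reuse it only if it really is a FIFO.
        if not stat.S_ISFIFO(os.stat(fifo_path).st_mode):
            raise
    # Succeeds immediately even though no writer is attached.
    return os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
```

The extra `stat` call only happens on the restart path, when the FIFO already exists; a first start stays at mkfifo plus open.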

Processes interested in knowing the status of the daemon attempt to open the same pipe in write-only and non-blocking mode (O_WRONLY | O_NONBLOCK). This will only succeed if some other process/thread (the daemon) has opened the pipe for reading. Cost: 1 syscall (open).
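The checking side might be sketched as follows. The name `daemon_alive` is hypothetical, and treating a missing FIFO (ENOENT) as "not alive" is an assumption about what callers want; other errors such as EACCES are re-raised because they say nothing about the daemon's state.

```python
import errno
import os


def daemon_alive(fifo_path):
    """Return True if some process currently holds the FIFO open for reading."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno in (errno.ENXIO, errno.ENOENT):
            return False   # no reader attached, or the FIFO was never created
        raise              # e.g. EACCES: status unknown, let the caller decide
    os.close(fd)           # housekeeping only; the open already gave the answer
    return True
```

The `close` is there to avoid leaking a descriptor in a long-running checker; the status itself is decided by the single `open`. Note that the sketch does not verify that the path is a FIFO, so a regular file at that location would be reported as alive.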

The solution is simple, cheap, and resilient to the daemon process exiting uncleanly, OS crash/reboot, etc. If the daemon process ends up in a deadlock or some other hung state, this solution is not sufficient, as another bit of information is needed, i.e. alive + well.
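To illustrate the claim about unclean exits, here is a small self-contained experiment, assuming a Unix-like system with `os.fork`. A child process plays the daemon, is killed with SIGKILL (so it gets no chance to clean up), and the check is run before and after. All names (`crash_demo`, `_alive`, the temporary path) are made up for this sketch.

```python
import os
import signal
import tempfile


def _alive(path):
    """Single-open liveness probe, as described above."""
    try:
        os.close(os.open(path, os.O_WRONLY | os.O_NONBLOCK))
        return True
    except OSError:
        return False


def crash_demo():
    """Return (alive_before_kill, alive_after_kill) for a forked 'daemon'."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "daemon.fifo")
    os.mkfifo(path)
    ready_r, ready_w = os.pipe()       # lets the child signal "read end is open"
    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            os.open(path, os.O_RDONLY | os.O_NONBLOCK)   # hold the read end
            os.write(ready_w, b"1")
            signal.pause()             # sleep until killed
        finally:
            os._exit(1)                # never fall back into the parent's code
    os.close(ready_w)
    os.read(ready_r, 1)                # wait until the child holds the FIFO
    os.close(ready_r)
    before = _alive(path)
    os.kill(pid, signal.SIGKILL)       # unclean exit: no cleanup code runs
    os.waitpid(pid, 0)
    after = _alive(path)
    os.unlink(path)
    os.rmdir(workdir)
    return before, after
```

The FIFO file itself is still on disk after the kill, yet the check reports the daemon as gone, because what matters is whether a reader holds the pipe open, not whether the file exists.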

Enough explaining: check out the C/C++ code or Go code, and feel free to reuse/borrow etc. The concept works equally well in other languages, as long as you have access to mkfifo and non-blocking flags for file open.

If you need to do the same thing on Windows, you can use a similar approach but open a file in exclusive mode (see dwShareMode).
