postgre-database-development-india

AppScope: Postgres SQL Observability

Last edited: April 6, 2021

Postgres, like many database applications, has a robust dynamic trace capability. Combined with a highly configurable log facility, it’s quite possible to track database activity. But as with most attempts at observability, it isn’t quite that simple. 

Logs provide high-level views into a Postgres service, emitting details describing errors, connections, checkpoints, and duration. A trace capability allows an external utility to be called at specific points in the code, enabling execution trace. This is based on probes inserted into Postgres source code. The standard external utility supported is Dtrace. Supporting other dynamic tracing utilities is theoretically possible. 

One of the first issues you encounter when combining Postgres probes with a trace capability is that the probes are not configured by default. There are several reasons for this, including tracing’s potential to impose a non-trivial performance penalty. It’s generally something that is enabled only in a development environment. From the Postgres docs: By default the probes are not compiled into PostgreSQL; the user needs to explicitly tell the configure script to make the probes available.

AppScope has the ability to track all SQL activity associated with a Postgres service. This type of observability is not intended as a development tool. It should be useful to administrators and SREs without the need to modify Postgres source code. The observability detail can be readily enabled and disabled, in dynamic fashion, by updating AppScope config. More on dynamic config in a different post. 

It’s relatively easy to enable. The biggest limitation is that the database must be restarted to allow it to be “scoped.” We expect to alleviate this limitation in future releases. 

Scope Postgres

The Postgres service is “scoped” by causing the libscope library to be loaded when the service starts. The example below uses the libscope library directly. It’s also possible to use the ldscope application in lieu of a direct library load.

You can enable payload activity by using a config file, or by means of environment variables. The example below points to a config file, scope.yml, in /etc/scope

$ sudo -u postgres LD_PRELOAD=/usr/lib/libscope.so SCOPE_HOME=/etc/scope /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c config_file=/etc/postgresql/10/main/postgresql.conf

Client Activity

Connect to the Postgres service and perform whatever activity is needed. 

$ sudo -u postgres psql -h 127.0.0.1

View Commands

Commands sent to the Postgres service will be emitted in the configured payload dir (/tmp by default) with the filename suffix *.in. The filename uses the pattern pid_localIP:port_remoteIP:port.in. Where IP addresses are not available, asin some SSL instances, the filename pattern replaces IP:port with tlsrx or netrx to indicate how the payload was extracted.

Given a binary data format, it’s convenient to use hexdump to observe the detail. 

$ hd -v 27270_tlsrx:0_tlsrx:0.in

View Responses

Responses sent from a Postgres service are emitted as *.out files. As described above, they are written to the configured payload dir, and utilize the same filename pattern. 

$ hd -v 27270_tlstx:0_tlstx:0.out    

 

In Conclusion

While AppScope makes no attempt to provide an execution trace for Postgres, you can use AppScope to track all payloads sent and received by a Postgres service. Normally, payload capture of this sort is not very useful, due to encryption. However, AppScope can access payloads before they are encrypted, and after they are decrypted. This means that you can readily observe every command received, and every response returned. There is no need to enable probes, modify source code, or incur a severe performance penalty to track all SQL activity.

The ability to observe all SQL activity creates a unique concern. There can be details that qualify as PII (Personally Identifiable Information) in the SQL statements and/or responses. Recognizable PII will usually need to be redacted. The details about performing such redaction are beyond the scope of this post. Check out Cribl LogStream for reference.

While the prospect of observing any and all SQL traffic may not be new or unique, this prospect offers the ability to answer a number of difficult questions. Questions such as: when did that admin have access to the DB; was a specific table modified; what tables have been accessed; which remote clients have had access, and to which databases? You get the idea. 

AppScope is available today. Check out the website, and get started by downloading it. Try it yourself by playing in our AppScope Fundamentals sandbox, which will walk you through how to use AppScope in an interactive shell right from your browser. Star us on GitHub and share what you’re doing with AppScope in our #appscope channel on the Cribl Community Slack!

Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.

We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.

More from the blog

get started

Choose how to get started

See

Cribl

See demos by use case, by yourself or with one of our team.

Try

Cribl

Get hands-on with a Sandbox or guided Cloud Trial.

Free

Cribl

Process up to 1TB/day, no license required.