AppScope 1.0: Changing the Game for Infosec, Part 2

April 27, 2022
Categories: Engineering

We’re introducing AppScope 1.0 with a series of stories that demonstrate how AppScope changes the game for SREs and developers, as well as Infosec, DevSecOps, and ITOps practitioners. This blog is the second of two Infosec stories. For both Part 1 and Part 2, Randy Rinehart, Principal Product Security Engineer at Cribl, contributed extensively. For Part 2, we also draw upon independent open source work by Michal Biesek, Staff Software Engineer at Cribl, where he’s a member of the AppScope team.

In our first Infosec story, we used AppScope to vet an application before it’s allowed to run in an enterprise environment. In this one, we’ll explore how an app interacts with other entities in its environment and beyond. The app we’ve chosen as our test subject is Okta, a popular enterprise Single Sign-On (SSO) solution.

The basic scenario is this: Your user navigates to a website; they are redirected to your SSO provider, where they authenticate; then, if they authenticate successfully, SSO sends them to the site they’re trying to visit. How this works is well-understood at a high level; but what if you want detailed information about what goes on under the surface?

With traditional Infosec tools, obtaining such information is a fairly heavy task. With AppScope it’s quick and easy. And you might find a few surprises.

Elastified AppScope, What???

If you’ve read Part 1 or have been trying AppScope out, you know that the most basic thing you can do with AppScope is to “scope” a command, which makes a great deal of information available in your terminal.

For this investigation, we’ll go further and visualize the data that AppScope ingests, in a Kibana dashboard. This is still quick and easy, thanks to Elastified AppScope, an open source project from Michal Biesek. Elastified AppScope deploys several containers with pre-defined routing and filters. In one container, a bash shell is running, and that bash is scoped, so all commands executed at its prompt are scoped. The result is that the data that AppScope ingests passes through Cribl Stream and Elasticsearch to Kibana.

The kinds of questions we’ll need to answer (how many of one type of request Okta makes compared to another, for example) are easier to explore in graphical form, so the Kibana dashboard helps a lot. There is a nice overview diagram that explains what’s happening. The filters allow us to ask questions that would otherwise be difficult to answer.

At the risk of stating the obvious, we can only do this investigation in an environment where SSO is live, protecting various apps. That way, we can scope the browser through which we try to access an app that SSO is protecting. When we’re redirected to the SSO, and afterwards, AppScope will see everything that goes in and out of the browser.

The procedure is this:

  1. Download Elastified AppScope from https://github.com/michalbiesek/elastified-appscope.
  2. Start Elastified AppScope.
  3. At the bash command line in Elastified AppScope, start Firefox.
  4. In Firefox, navigate to a site that would cause us to log in through the SSO — we’ll pick Google Drive.
  5. Log in through SSO as directed.
  6. Look at the results in Kibana.

When we get through, we’ll have a very clear, deep picture of what’s actually happening in SSO interactions.

What Are We Looking For, Overall?

The key questions are about your user’s credentials. Are they ever …

  • Written to disk?
  • Sent to a third-party site (other than the SSO provider, which in this case is Okta)?
  • Sent by an insecure means?

Elastified AppScope provides a generous complement of tables, pie charts, bar charts, and text output that will help us delve into the big questions by looking at a series of more granular and detailed questions.

We’ll seek to progress logically from one detailed question to another, until the answers to our high-level questions emerge. Along the way, we’ll encounter some results which, while beyond the scope of the immediate investigation, warrant looking into at a later date. We won’t dig into every intriguing tidbit right now — the point is to show how AppScope brings these things to the surface.

A Tour of Our Results

We’ve already described the basic procedure we followed. Now, we’ll follow the thread of our detailed questions, and discuss our results.

To keep the discussion simple, we focused on the data that mattered:

  • We ignored any HTTP events that came from starting Firefox.
  • All of the disk writes we discuss were made by Firefox. We used AppScope’s ability to write data to disk to capture payloads, but we excluded those writes from the data we discuss.
  • While we chose Google Drive as the SSO-protected app to invoke Okta, our discussion ignores any data that relates to Google Drive.

How Many HTTP Requests Does This Generate?

A logical place to start is to ask what HTTP events occur in the process of redirection and SSO. Elastified AppScope’s HTTP protocol information table shows nearly 1400 HTTP requests!

That is a surprisingly high number. Why so many? To answer that question, we’ll look at what HTTP verbs and IP addresses or hosts are involved.

What Are All Those HTTP Requests For?

Elastified AppScope’s HTTP methods pie chart shows that 67% of the HTTP requests are POSTs.

Most of these are to Cloudflare (104.16.249.249), specifically mozilla.cloudflare-dns.com, with the HTTP endpoint /dns-query. So, we’ve discovered that nearly two-thirds of HTTP requests were for DNS over HTTPS. Off the bat, we don’t know why, and we’ll set that question aside for investigation later on.
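If you would rather compute this breakdown yourself than read it off the pie chart, a minimal sketch might look like the following. It assumes the events are available as parsed JSON objects, and that HTTP request events carry a source of http.req with an http_method field under body.data. Those field names are our assumption about the event layout, and the sample events are made up for illustration, not taken from the investigation.

```python
from collections import Counter


def method_share(events):
    """Return {HTTP method: percentage of all HTTP requests}.

    Assumes AppScope-style events where body.source == "http.req" and the
    method lives in body.data.http_method (field names are an assumption).
    """
    methods = Counter(
        e["body"]["data"]["http_method"]
        for e in events
        if e.get("body", {}).get("source") == "http.req"
    )
    total = sum(methods.values())
    if total == 0:
        return {}
    return {m: round(100 * n / total, 1) for m, n in methods.items()}


# Made-up sample events, NOT data from the investigation.
SAMPLE = [
    {"body": {"source": "http.req", "data": {
        "http_method": "POST", "http_host": "mozilla.cloudflare-dns.com",
        "http_target": "/dns-query"}}},
    {"body": {"source": "http.req", "data": {
        "http_method": "POST", "http_host": "mozilla.cloudflare-dns.com",
        "http_target": "/dns-query"}}},
    {"body": {"source": "http.req", "data": {
        "http_method": "GET", "http_host": "login.okta.com",
        "http_target": "/"}}},
    # Non-HTTP events are skipped by the filter above.
    {"body": {"source": "dns.req", "data": {"domain": "login.okta.com"}}},
]

if __name__ == "__main__":
    print(method_share(SAMPLE))
```

Grouping the same events by http_host and http_target instead of http_method would show which endpoint accounts for the POSTs.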

But it leads us to ask: What about “regular” DNS requests (those that are not over HTTPS)? We’d have a more complete picture if we looked at those.

What DNS Requests Do We Have?

In the Top values of body.data.domain.keyword bar chart, Elastified AppScope shows which domain names were hit, and how many times.

There were 96 DNS requests and responses.

 

Based on the chart, we can conclude that Okta uses DNS for all requests to endpoints not related to Cloudflare. Now we’ll want to know what those endpoints are.
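One way to double-check a conclusion like this is to compare the set of hosts seen in HTTP requests against the set of domains seen in DNS requests. The sketch below does that under a few assumptions: HTTP requests are tagged http.req with an http_host field, DNS requests are tagged dns.req with a domain field (the chart's body.data.domain suggests the latter), and the DNS-over-HTTPS host is excluded by name. The sample events and the files.example.test host are hypothetical.

```python
def hosts_without_dns(events, doh_hosts=frozenset({"mozilla.cloudflare-dns.com"})):
    """Return HTTP hosts that never appeared in a regular DNS request.

    Field names (http_host, domain) and source tags (http.req, dns.req)
    are assumptions about the event layout.
    """
    http_hosts = set()
    dns_domains = set()
    for event in events:
        body = event.get("body", {})
        data = body.get("data", {})
        if body.get("source") == "http.req" and "http_host" in data:
            http_hosts.add(data["http_host"])
        elif body.get("source") == "dns.req" and "domain" in data:
            dns_domains.add(data["domain"])
    # Anything left over was contacted without a matching DNS lookup.
    return http_hosts - dns_domains - doh_hosts


# Made-up sample events, NOT data from the investigation.
SAMPLE = [
    {"body": {"source": "dns.req", "data": {"domain": "login.okta.com"}}},
    {"body": {"source": "http.req", "data": {"http_host": "login.okta.com"}}},
    {"body": {"source": "http.req", "data": {"http_host": "mozilla.cloudflare-dns.com"}}},
    {"body": {"source": "http.req", "data": {"http_host": "files.example.test"}}},
]

if __name__ == "__main__":
    print(hosts_without_dns(SAMPLE))
```

An empty result would support the conclusion. A real data set may need extra normalization, for example stripping a port from the host value.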

What Are the HTTP Requests Not Related to Cloudflare?

Elastified AppScope’s HTTP host and target table shows all endpoints accessed along with their associated hosts. Here’s an example:

We were surprised to see all the requests to a Google host. What has that got to do with Okta and SSO? A bit of research seems to indicate that clients6.google.com might be an alias for a Google API involved in cookie tracking. We should look into this further, at a future date. For now, it’s clear how effortlessly AppScope surfaces information that you might find surprising and want to investigate.

If you want to see each IP address, they’re in the Peer IP and Port table:

What Is HTTP 1.1 Used For?

For an SSO operation like Okta, you might expect most, if not all, HTTP traffic to be encrypted. But Elastified AppScope’s Insecure Communication pie chart reveals that more than 8% of the requests Okta made used plain HTTP, which is considered insecure.

Do these insecure connections represent an issue? Are there any user credentials in there? AppScope makes these connections, including headers and payloads, transparent.
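To see where the plain HTTP requests are going, you could tally them per host. This is a minimal sketch that assumes each HTTP request event records its scheme in an http_scheme field next to http_host. Both field names are assumptions, and the sample events and hosts are hypothetical.

```python
from collections import Counter


def plain_http_hosts(events):
    """Count unencrypted (scheme == "http") requests per host.

    Assumes body.data.http_scheme and body.data.http_host exist on
    http.req events; these field names are an assumption.
    """
    hits = Counter()
    for event in events:
        body = event.get("body", {})
        data = body.get("data", {})
        if body.get("source") == "http.req" and data.get("http_scheme") == "http":
            hits[data.get("http_host", "unknown")] += 1
    return dict(hits)


# Made-up sample events, NOT data from the investigation.
SAMPLE = [
    {"body": {"source": "http.req", "data": {
        "http_scheme": "https", "http_host": "login.okta.com"}}},
    {"body": {"source": "http.req", "data": {
        "http_scheme": "http", "http_host": "plain.example.test"}}},
    {"body": {"source": "http.req", "data": {
        "http_scheme": "http", "http_host": "plain.example.test"}}},
]

if __name__ == "__main__":
    print(plain_http_hosts(SAMPLE))
```

The hosts this returns are the ones whose headers and payloads deserve a closer look.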

Are Credentials Exposed in Payloads?

The -p or --payload option to the scope run command tells AppScope to save a payload file for each network connection. When scoping the browser, the command looks like this:

scope run -p firefox

Now every payload will be stored in a directory of the form:

~/.scope/history/firefox_190_127682_1642718347433272806/payloads/.

Since we’re looking for credentials, we just grep the payloads directory for the user login name. In our Okta exercise, we found it in several UNIX domain socket connections and one external connection.
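The same search can be scripted if you want a repeatable check. Here is a minimal sketch that scans every file in a payloads directory for a byte string. The demo function builds a throwaway directory with made-up file names and a made-up login name, so nothing in it comes from the investigation.

```python
import tempfile
from pathlib import Path


def payloads_containing(payload_dir, needle):
    """Return the names of payload files whose raw bytes contain needle."""
    needle_bytes = needle.encode("utf-8")
    matches = []
    for path in sorted(Path(payload_dir).iterdir()):
        # Payloads are binary, so compare bytes rather than decoded text.
        if path.is_file() and needle_bytes in path.read_bytes():
            matches.append(path.name)
    return matches


def demo():
    """Build a temporary payloads directory with hypothetical files and search it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "conn_a.out").write_bytes(b'\x00\x1f{"username":"alice@example.test"}')
        Path(tmp, "conn_b.in").write_bytes(b"\x16\x03\x01 no credentials here")
        return payloads_containing(tmp, "alice@example.test")


if __name__ == "__main__":
    print(demo())
```

Pointing payloads_containing at a real payloads directory and passing the login name gives the list of connections to inspect.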

Given that payloads are binary data, we’ll use hexdump to look at the content. It’s quick.

The one that interests us most is the external connection. Here’s the hexdump command:

hd -v .scope/history/firefox_190_127682_1642718347433272806/payloads/127688_3.15.36.192:443_172.16.198.210:48812.out

The output contains usernames and passwords in clear text. With those redacted, the results look like this:

000012a0 00 1f 7b 22 70 61 73 73 77 6f 72 64 22 3a 22 78 |..{"password":"X|
000012b0 78 78 78 78 78 78 78 78 78 78 78 78 75 73 65 72 |XXXXXXXXX","user|
000012c0 6e 61 6d 65 22 3a 78 78 78 78 78 78 78 78 78 78 |name":"XXXXXXXXX|
000012d0 78 78 78 78 78 22 6f 70 74 69 6f 6e 73 22 3a |XXXX","options":|

We also found credentials in clear text in two of the UNIX domain socket connections.
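If hd is not available where you are working, a few lines of Python give a similar view. This is a simplified sketch of the offset, hex bytes, and printable-text layout shown above, not a reimplementation of hd. The start parameter is our own addition so the offsets can match a slice taken from a larger file.

```python
def hexdump(data, start=0, width=16):
    """Return hexdump-style lines: offset, hex bytes, then printable ASCII."""
    pad = width * 3 - 1  # width of a full row of "xx " groups, minus the last space
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as "." in the text column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start + i:08x}  {hex_part:<{pad}}  |{text}|")
    return lines


if __name__ == "__main__":
    # Hypothetical bytes shaped like the start of the redacted payload above.
    for line in hexdump(b'\x00\x1f{"password":"x', start=0x12A0):
        print(line)
```

The text column makes JSON keys such as "password" and "username" easy to spot in an otherwise binary payload.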

Why and How Are Credentials Present in Clear Text?

When you use AppScope, you are using the libscope library, which can extract payload data both before and after data is encrypted. That’s why, in the previous section, we saw credentials in clear text when looking at payloads. AppScope captured the payloads before the app encrypted them to send along in HTTP requests.

When you use AppScope, be mindful of its ability to see payloads before they’re encrypted. Depending on what you’re looking at, you may need to refrain from running certain AppScope commands to avoid seeing things you should not. Or, you might be required to report certain things, like PII, if AppScope reveals it to you.

Seeing these payloads in clear text just proves that AppScope does what we say it does. What we have verified here is that Firefox and Okta were properly encrypting payloads which contain credentials.
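To make the idea concrete, here is a toy analogy in Python. It is not how libscope is implemented. It only shows that if you can wrap the function an application calls to send data, you see the plaintext even though only ciphertext leaves the process. The XOR "encryption" is a stand-in for TLS and is not real cryptography, and all names and values are hypothetical.

```python
def toy_encrypt(plaintext: bytes, key: int = 0x5A) -> bytes:
    """Stand-in for TLS: XOR every byte with a fixed key. NOT real cryptography."""
    return bytes(b ^ key for b in plaintext)


def with_capture(send_encrypted, captured):
    """Wrap a send function so the plaintext is recorded before it is encrypted."""
    def wrapper(plaintext: bytes) -> bytes:
        captured.append(plaintext)          # what an observer inside the process sees
        return send_encrypted(plaintext)    # what actually goes on the wire
    return wrapper


captured = []
send = with_capture(toy_encrypt, captured)
wire = send(b'{"username":"alice"}')

if __name__ == "__main__":
    print("captured in process:", captured[0])
    print("sent on the wire:   ", wire)
```

The captured list holds the readable credentials, while the bytes in wire do not contain the username, which mirrors the distinction between what AppScope can show you and what a network observer can see.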

What Kind of HTTP Responses Do We See?

Having looked closely at HTTP requests, let’s turn to HTTP responses.

Of 363 total HTTP responses, most received a 200 status code. Only four received status codes other than 200, and none of those look like errors. Those four are:

  • 304 “Not Modified” from:
    • contile-images.services.mozilla.com/obgoOYObjIFea_bXuT6L4LbBJ8j425AD87S1HMD3BWg.9991.jpg
    • contile-images.services.mozilla.com//ZKZXG9SskBgN3rF4jHA_ml11Y3JEwyywOTNrpu4WN8U.9378.jpg
  • 101 “Switching Protocols” from:
    • push.services.mozilla.com/
  • 302 “Found” from:
    • drive.google.com/drive/my-drive
    • lh3.google.com//u/0/ogw/ADea4I7GABeecrrHC-DjM4BtrAZAAbGIQz2p2FxKD3a5=s32-c-m

None of these non-200 responses seem problematic; so far, so good. What about response times?

We see several really long response times, greater than 32,000 ms, from /punctual/multi-watch/channel and /drive/v2beta/apps. Presumably these were waiting for 2FA to complete.

The response times from mozilla.cloudflare-dns.com/dns-query range from roughly 300 ms to a maximum of 2,500 ms.
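A quick way to pull out the slow responses is to filter on a duration threshold. The sketch below assumes HTTP response events are tagged http.resp and carry an http_target plus a duration in milliseconds, which we call http_client_duration here. That field name is an assumption, and the sample values are made up.

```python
def slow_responses(events, threshold_ms):
    """Return (target, duration_ms) pairs slower than threshold_ms, slowest first.

    Assumes http.resp events with body.data.http_target and
    body.data.http_client_duration (field names are an assumption).
    """
    slow = []
    for event in events:
        body = event.get("body", {})
        data = body.get("data", {})
        duration = data.get("http_client_duration")
        if body.get("source") == "http.resp" and duration is not None:
            if duration > threshold_ms:
                slow.append((data.get("http_target", "?"), duration))
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Made-up sample events, NOT data from the investigation.
SAMPLE = [
    {"body": {"source": "http.resp", "data": {
        "http_target": "/dns-query", "http_client_duration": 310}}},
    {"body": {"source": "http.resp", "data": {
        "http_target": "/slow/endpoint", "http_client_duration": 32500}}},
    {"body": {"source": "http.resp", "data": {
        "http_target": "/dns-query", "http_client_duration": 2400}}},
]

if __name__ == "__main__":
    print(slow_responses(SAMPLE, threshold_ms=2000))
```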

Here’s an example of HTTP responses from the data set:

To sum up, we found nothing surprising or concerning in the HTTP responses.

Is There IPC Happening?

Browsers use UNIX domain sockets for inter-process communication (IPC).

Knowing that, it’s not surprising that we found 40 UNIX domain sockets opened in the course of the operation we investigated. We found that 15 distinct processes were spawned, of which 12 created network connections.

Did all of this result from just opening Google Drive in the browser? The answer turned out to be “yes.” To find out, we went outside of Elastified AppScope, signed in, and accessed my-drive (the Google Drive endpoint). Then we ran a ps command, which showed the same pattern of process creation that we saw within Elastified AppScope.

Examining the UNIX domain socket connections, we see an interesting design approach at work. Among the 12 processes that create network connections, a single process makes external network connections. All other processes connect to that process using UNIX domain sockets.

In our exercise, we only opened one tab in Firefox. If we opened more tabs, we’d expect to see more IPC, since each tab uses one or more worker processes.

AppScope data shows you something about the architecture of the browser. It’s not that the tab you open Google Drive in has an outbound connection to Google. Instead, the tab has a local IPC connection to another process — only one process has an outbound connection.
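You can check this pattern in the raw events by grouping connection-open events by process. The sketch below assumes that connection opens are tagged net.open, that each event carries a pid in its body, and that a net_transport value beginning with "Unix" marks a UNIX domain socket. All three are assumptions about the event layout, and the sample PIDs are made up.

```python
from collections import defaultdict


def connection_kinds_by_pid(events):
    """Map each pid to the kinds of connections it opened: "unix", "external", or both.

    Assumes net.open events with body.pid and body.data.net_transport,
    where values starting with "Unix" mean a UNIX domain socket (assumption).
    """
    kinds = defaultdict(set)
    for event in events:
        body = event.get("body", {})
        if body.get("source") != "net.open":
            continue
        transport = body.get("data", {}).get("net_transport", "")
        kind = "unix" if transport.startswith("Unix") else "external"
        kinds[body.get("pid")].add(kind)
    return dict(kinds)


def external_talkers(events):
    """Return the sorted pids that opened at least one non-UNIX connection."""
    by_pid = connection_kinds_by_pid(events)
    return sorted(pid for pid, kinds in by_pid.items() if "external" in kinds)


# Made-up sample events, NOT data from the investigation.
SAMPLE = [
    {"body": {"source": "net.open", "pid": 101, "data": {"net_transport": "IP.TCP"}}},
    {"body": {"source": "net.open", "pid": 101, "data": {"net_transport": "Unix.TCP"}}},
    {"body": {"source": "net.open", "pid": 102, "data": {"net_transport": "Unix.TCP"}}},
    {"body": {"source": "net.open", "pid": 103, "data": {"net_transport": "Unix.TCP"}}},
]

if __name__ == "__main__":
    print(external_talkers(SAMPLE))
```

If the design described above holds, external_talkers returns a single pid while every other process shows only "unix" connections.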

Figure 1 illustrates the design approach.

What this demonstrates is that AppScope quickly gives you a solid initial picture of the IPC in the SSO app. To figure out exactly what the IPC is used for, we’d want to do a separate side-investigation with AppScope.

What Files Are Accessed?

We’d expect the SSO operation to access cache files. Beyond that, we’d like to learn: What other kinds of files are accessed? In particular, are any configuration files involved?

Elastified AppScope gives us File Reads, File Writes, and List of removed files tables to examine.

It turns out that 12,357 files are accessed, which seems like a surprisingly large number. Here’s how the main categories break down:

  • 2,508 cache entries
  • 767 config files under ~/.mozilla
  • 4,613 files under /proc
  • 786 files under /sys
  • 158 references to libraries under /usr/lib/firefox

Our takeaway from this is that the browser hit many more files than we expected. The large number of cache files makes sense right away, but why hit /proc more than 4,000 times? We do not assume that there is a problem here. We’re just saying wow, this is food for thought, and another point to investigate further at a later date.
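A breakdown like this can be reproduced from a plain list of accessed paths. The sketch below buckets paths by location. The rules are our own guesses at sensible categories (for instance, treating anything under .cache/mozilla as a cache entry), and the sample paths are hypothetical.

```python
from collections import Counter


def classify(path):
    """Assign a file path to a coarse category (rules are illustrative assumptions)."""
    if path.startswith("/proc/"):
        return "proc"
    if path.startswith("/sys/"):
        return "sys"
    if path.startswith("/usr/lib/firefox/"):
        return "firefox libs"
    if "/.cache/mozilla/" in path:
        return "cache"
    if "/.mozilla/" in path:
        return "mozilla config"
    return "other"


def breakdown(paths):
    """Count accessed paths per category."""
    return dict(Counter(classify(p) for p in paths))


# Made-up sample paths, NOT data from the investigation.
SAMPLE_PATHS = [
    "/proc/self/maps",
    "/proc/190/stat",
    "/sys/devices/system/cpu/online",
    "/usr/lib/firefox/libxul.so",
    "/home/user/.cache/mozilla/firefox/profile/cache2/entries/ABC123",
    "/home/user/.mozilla/firefox/profile/prefs.js",
    "/etc/hosts",
]

if __name__ == "__main__":
    print(breakdown(SAMPLE_PATHS))
```

Sorting the paths in the "proc" bucket by frequency would be a natural next step toward explaining the /proc activity.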

What Files Are Related to the SSO Service?

To better understand what’s going on with file access, let’s look at it from another angle and ask: Which accessed files belong to Okta, the SSO app we’re investigating?

Turns out that 13 file open operations have okta in the path name. The unique file names include:

  • ~/.mozilla/firefox/7towh7jc.default-release-1634848201657/storage/default/https+++login.okta.com/.metadata-v2-tmp
  • ~/.mozilla/firefox/7towh7jc.default-release-1634848201657/storage/default/https+++login.okta.com/ls/data.sqlite
  • ~/.mozilla/firefox/7towh7jc.default-release-1634848201657/storage/default/https+++login.okta.com/ls/data.sqlite-journal
  • ~/.mozilla/firefox/7towh7jc.default-release-1634848201657/storage/default/https+++login.okta.com/ls/usage-journal
  • ~/.mozilla/firefox/7towh7jc.default-release-1634848201657/storage/default/https+++login.okta.com/.metadata-v2

This shows us that Okta is doing something with sqlite, something with metadata, and also records a usage journal. To learn more, we’ll look at each of those files.

Do Files in an SSO-Specific Subdirectory Contain Account Detail?

Here’s what we find by catting out the files of interest:

  • The .metadata-v2 file defines the endpoint to use: okta.com, specifically https://login.okta.com.
  • There’s nothing noteworthy in .metadata-v2-tmp, usage-journal, or data.sqlite-journal.
  • The sqlite data file contains the user’s login name and display name, in clear text.
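Catting a sqlite file works, but you can also search it in a more structured way. The sketch below walks every table in a sqlite database and reports which columns contain a given string. The demo builds a throwaway database with a hypothetical schema and a made-up login name. The real data.sqlite file has its own schema and may encode values differently, so a plain substring search like this could miss matches there.

```python
import os
import sqlite3
import tempfile


def find_text_in_sqlite(db_path, needle):
    """Return sorted (table, column) pairs where any cell contains needle."""
    needle_bytes = needle.encode("utf-8")
    hits = set()
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in
                  conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
        for table in tables:
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            columns = [d[0] for d in cursor.description]
            for row in cursor:
                for column, cell in zip(columns, row):
                    # Text and blob cells need different comparisons.
                    if isinstance(cell, bytes) and needle_bytes in cell:
                        hits.add((table, column))
                    elif isinstance(cell, str) and needle in cell:
                        hits.add((table, column))
    finally:
        conn.close()
    return sorted(hits)


def demo():
    """Create a temporary database with a hypothetical schema and search it."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "data.sqlite")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE data (key TEXT, value BLOB)")
        conn.execute("INSERT INTO data VALUES (?, ?)",
                     ("profile", b'{"login":"alice@example.test"}'))
        conn.commit()
        conn.close()
        return find_text_in_sqlite(db_path, "alice@example.test")


if __name__ == "__main__":
    print(demo())
```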

What did we find surprising here? First of all, that Okta-specific files persist between logins. Secondly, that the user’s login name is among what gets persisted. A quick test showed that the login name is not removed when the user logs out.

This may reflect a design tradeoff by Okta, where persisting the files makes sense for their most common use cases (the user’s company laptop used at home, for example) but is possibly risky if public computers are used. We’re thinking about someone using a public computer in a library. The user thinks they’re safe but since their login name is persisted, a shoulder surfer could potentially find that login name. This is not as worrisome as it would be if the password were persisted, too, but it’s worth noting.

If we had not already looked to see whether credentials were exposed in payloads, we’d definitely check now, given that a user login name is present in an sqlite data file.

Summing Up: What Did We Find Out?

Now we can revisit our original questions about user credentials, this time with answers.

Are creds ever written to disk?

Only the login name, but not the password, was ever written to disk.

Are creds ever sent to a third-party site? (Meaning, in this scenario, a third-party site other than Okta.)

No. This is as it should be.

Are creds ever sent by an insecure means?

No. Again, this is a good result.

In summary, we found that creds are sent to the expected place, login.okta.com, in encrypted form — no surprises there.

It’s important to note that since AppScope allows us to see the content of encrypted payloads (before that content gets encrypted), we did find that credentials were present in a payload. We know that this would have to be the case at some point, for the SSO to function. But where did those payloads go? We’d want to see if they ever went to anyone other than the identity provider, which in this case is Okta. This is a question we’d like to investigate further.

What about unexpected finds? There were a few:

Lots of IPC communication.

40 UNIX domain sockets opened; 15 distinct processes spawned, of which 12 created network connections. Our working theory is that this is just a side effect of the browser’s design approach, where communication between the browser’s processes uses UNIX domain sockets, while communication with the internet goes through a single process. So the IPC activity might not have anything to do with SSO itself. Validating that theory would require further investigation.

Lots of HTTP/S traffic, and use of different ports.

This was definitely interesting, and seemed like a lot of communication for an SSO process, which we assumed should just be going to and from the IDP. AppScope allowed us to see where all of this traffic was going, though, and none of it seemed unreasonable.

Use of standard (unencrypted) HTTP.

This involved a small number of requests to two IP addresses: one for Mozilla Cloudflare and the other for clients6.google.com. We’d probably want to look more closely at that Google IP address, since we’re not certain what it’s for. Why were these requests not done over HTTPS?

Postscript: What if We Didn’t Have AppScope?

To do this same investigation without AppScope, you’d probably need to do something like this:

  • Use a web proxy, such as Burp Suite. You would have to install a certificate. Then, you’d downgrade the encrypted traffic. The web proxy would be able to gather everything that’s sent back and forth, and see which URLs you’re communicating with over HTTP, and which over HTTPS.
  • Use tcpdump to get all of the non-HTTP/S traffic, which the Burp proxy can’t see.
  • Use a different tool, such as strace, to monitor the file writes and file reads. You’d look at the strace output to be able to say what’s a write and what’s a read, how many bytes written or read, and so on. Wrangling strace command switches to get the desired output can be pretty funky.
  • Finally, you’d have to process and collate the output from the various tools, in order to make sense of what you’re seeing, and form conclusions.

We think it’s fair to say that AppScope (especially Elastified AppScope) is the more appealing alternative. We hope you’ll give it a try, and let us know about any infosec use cases where you find AppScope helpful.
