x

The AppScope Origin Story

Written by The AppScope Team

February 24, 2022

Since we introduced AppScope in 2021, we’ve been relentlessly working towards the production-ready milestone. Last week we released AppScope 1.0. It’s been a long haul getting to this point. Not really sure if it took this long because we solved difficult problems, or if we’re just that slow. Someone told me that what we are doing would go a lot faster if we use a modern high-level language. Maybe … Can you imagine doing this in TypeScript? Yeah, me either.

This milestone had me musing on why we did this at all. Why did we start this? The simple answer is, to answer the fundamental question: What does that thing – that application – actually do?

My engagement with that question began in 2019 with a call I received from Clint Sharp (Cribl founder and CEO), whom I knew from my time as CTO at a previous company. Clint asked if I would be interested in creating a new, ubiquitous mechanism of getting application data. Ledion Bitincka (another Cribl founder) had created a prototype using an approach similar to work of mine that Clint remembered.

I was super excited to join, and to bring in my good friend John Chelikowsky, who had spent years building medical instruments. The kind that get implanted inside people. You don’t want to get that wrong, and you need to know what is actually happening. This experience made John a logical person to enlist in grappling with the question, What does that thing actually do?

I, too, had been doing a whole lot of embedded stuff. The kind of thing where a federal agency crawls through your code and checks every branch; where if you get it wrong people could be in danger. Stuff like spending 8+ hours meeting with people describing a network issue that prevented them from shutting off gamma rays as part of cancer treatment. We finally figured out what was happening. At the end of the day, they invited me to tour their lab. Do you mean the one where you are testing gamma ray emissions and can’t turn it off effectively? Thanks, but it’s getting late and I need to catch my train seemed like the only sane response.

It’s expected that people involved in certain embedded projects understand the details. The nature of many such projects requires that all dependencies and all behavior is known and measurable. And this is possible because these projects are highly constrained.

Later, interacting with people building and supporting many hundreds of enterprise applications was an eye opener. I began to realize that there are services deployed where few, if any, people know what the thing actually does. An architecture might be described. An overall operation might be communicated. Beyond that, there is not a lot of detail to be had.

Out of pure naivety I’d ask about dependencies, what files are accessed, what is modified, what connections are made, traffic, exfiltration, responses. I was told that there were too many apps and too much detail for anyone to really know any of this. Which of course made sense.

But then there were those meetings where people wanted, and appeared to need, that kind of detail. What does a suite of apps need to ensure that it could be failed over? What files get accessed? What connections are made? And someone had to go dig it all up as best they could.

There were regulators who insisted on knowing what was performed during a given maintenance window.

Then, the lawyers:

What’s being sent out over external connections? We need to approve this.

And we thought:

It’s all encrypted. What are we supposed to tell you?

The lawyers are the ones that signed off on the license for the vendor in question. Now they need to know what’s in all those payloads? Best not go further on this front.

We needed to be able to observe application behavior at a level of detail that was not readily available, for example:

  • Number of processes spawned
  • Network calls made
  • External network connections made
  • Network protocols used
  • Insecure communication
  • Network traffic
  • File reads
  • File writes
  • Files deleted
  • Directories read
  • IP addresses/hosts
  • DNS requests
  • Risky system calls
  • Log data written
  • HTTP endpoints
  • HTTP headers
  • External services accessed
  • Encrypted payloads
  • Clear text payloads
  • Stdout/Stderr traffic
  • Commands executed by any and all Users

Why in the world would anyone ever need all this? In practice there are numerous cases where the detail, and more importantly, the ability to ask and readily answer questions about an app, is critical.

  • A new vendor is introduced to the organization. Is it safe to run their app? What does it actually do? What does it exfiltrate?
  • An app is reported to be slow. Is it actually slow? Is it the app or something the app connects to? Is that app supposed to make these connections? Does that app normally modify files in that directory? Is that app supposed to modify config files? Is that right?

We could go on like this for a long time. But, let’s not.

If you’ve been there you get that this detail is difficult and time-consuming to gather for any given app, let alone for lots of apps. Practical experience says that not a lot of people know how to obtain the necessary detail.

And the detail is only valuable if it’s easy to obtain, right? But for a lot of these questions, if you had to manually get the needed detail in order to answer them, you wouldn’t do it. It’s just not worth it. So, we proceed without knowing, without answering the questions. Questions become moot because we can’t obtain viable answers. In many cases, we don’t think about what questions we really need to ask because we know there are no good answers.

Of course some questions must get answered in some form for security reasons. But it’s not easy; it slows things down, and causes people to be exasperated with the investigators.
Overall, many of us find ourselves in a situation where we are required to make progress without knowing, in any detail, what is happening and what should be happening with an app.

That’s pretty much why we did this AppScope thing and why it’s open source. We needed to be able to ask difficult questions and readily get accurate answers. To accomplish that with minimal effort. To democratize the ability to ask and answer questions about any given application.

Persisting in our naivety, we said to each other, there must be a way to get all this detail. I wonder if we just… So, we tried a number of things that don’t work, a few things that aren’t very practical and a few things that seem to work quite well.

Have we done anything at all? Can we answer the question, What does that thing actually do? Have we created a way to break the cycle of not thinking about what questions we really need to ask because we know there are no good answers? You be the judge.

Give it a whirl:

scope run -- <that_app>

We hope you’ll find that, with AppScope, you can finally answer the question, What does that thing actually do?

 

Questions about our technology? We’d love to chat with you.