In 2015, we founded Huntress to build a solution that was different from the plethora of existing security solutions. We wanted to make a real difference in the industry—but we also knew that as a small startup with limited resources, it would be difficult to compete directly with the largest vendors.
So, we started to look at what businesses those incumbent vendors weren’t serving. While enterprise organizations had all the security tools and gadgets they could possibly need at their fingertips, managed service providers (MSPs) and the small to mid-market businesses (SMBs) they served were woefully overlooked. These SMBs make up 99% of businesses in the US, yet they are largely ignored by vendors or priced out of the solutions on offer.
This is where we have positioned Huntress to provide the most value. By focusing on the value derived from the security solution and not on all the bells and whistles—and by building a broad set of versatile solutions to many problems—we have had a lot of success serving our partners.
Naturally, this success has led to broad adoption of The Huntress Managed Security Platform. That means we have to scale our infrastructure and team to support all of the partners we have today and to prepare ourselves to serve the partners we will onboard in the coming years. This is by no means an easy problem to solve, and it’s even harder to execute while ensuring the products we deliver are top-notch security solutions at an affordable cost. All of the large security solution providers we have spoken with would love to provide their products to SMBs, but they haven’t been able to do that at an affordable cost.
It’s no secret that we’ve had some struggles lately to keep up with the amount of data that we’re receiving and processing on a daily basis—and we’re not afraid to admit that. Solving hard problems requires significant effort and a lot of iteration.
To help frame the sheer size of the problem, I want to present a sneak peek behind the curtains and give some insight into just how much data we’re processing today and how much we expect to process in the future on our journey to protecting more than 2M endpoints in 2022.
Communicating With the Endpoints
Currently, Huntress is protecting 1.5M endpoints across the world. This means that 1.5M computers have the Huntress agent installed, running and communicating with our servers in AWS. These agents talk to our servers to check for updates, fetch their configuration, send us metadata about running applications and the state of the endpoint, and receive tasking from our ThreatOps analysts and Assisted Remediations.
We always consider the tradeoffs of faster reaction time, which requires additional data and CPU utilization, versus a slower reaction time that requires less data and CPU utilization. This way, we can balance the security needs with the infrastructure to process the data—but this many endpoints still results in an absolute ton of requests our servers have to handle.
Recently, we’ve been averaging between 500K and 700K requests per minute! That’s right: roughly 10,000 requests per second, every second of every day.
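To make those numbers a little more concrete, here is a small back-of-the-envelope sketch in Python. It only uses figures quoted in this post (1.5M endpoints, 500K to 700K requests per minute, and the 2M endpoint goal). The function names are ours for illustration, and the projection assumes each endpoint keeps making requests at today's average rate, which is a simplification rather than a forecast.

```python
# Back-of-the-envelope math using only figures quoted in this post.
ENDPOINTS_TODAY = 1_500_000                 # endpoints currently protected
ENDPOINTS_TARGET = 2_000_000                # endpoint goal mentioned in the post
REQUESTS_PER_MINUTE = (500_000, 700_000)    # reported range

def per_second(requests_per_minute: float) -> float:
    """Convert a per-minute request rate into a per-second rate."""
    return requests_per_minute / 60

def minutes_between_requests(endpoints: int, requests_per_minute: float) -> float:
    """Average gap between requests from a single endpoint, in minutes."""
    return endpoints / requests_per_minute

def projected_requests_per_minute(requests_per_minute: float,
                                  endpoints_now: int,
                                  endpoints_later: int) -> float:
    """Scale the request rate linearly with endpoint count.

    Assumption (ours): per-endpoint behavior stays the same as today.
    """
    return requests_per_minute * endpoints_later / endpoints_now

if __name__ == "__main__":
    for rpm in REQUESTS_PER_MINUTE:
        print(
            f"{rpm:,} req/min = {per_second(rpm):,.0f} req/s, "
            f"one request per endpoint every "
            f"{minutes_between_requests(ENDPOINTS_TODAY, rpm):.1f} min, "
            f"projected at 2M endpoints: "
            f"{projected_requests_per_minute(rpm, ENDPOINTS_TODAY, ENDPOINTS_TARGET):,.0f} req/min"
        )
```

In other words, each agent is making a request every two to three minutes on average, and the midpoint of the range (600K requests per minute) works out to exactly 10,000 requests per second.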
This is a screenshot of the request throughput to our servers with a spike to more than 1M requests per minute!
When you’re dealing with that many requests that need to query and update a database, even small mistakes can cause big issues.
You can see in the screenshot that we ran into an issue that caused us to return errors to agents asking for updates, and the number of requests dipped down into the 400K requests/minute range. When we fixed this issue, all of the agents that had been checking for updates were finally able to make requests, and we suddenly had to process more than 1M requests/minute! Thankfully, we run all of our infrastructure in AWS and were able to add servers within a few minutes to handle the additional requests.
EDR and Process Insights Data
For the last 18 months, we’ve been hard at work integrating and enhancing the endpoint detection and response technology we acquired from Level Effect. In typical Huntress fashion, we started providing our Process Insights capabilities to our partners through an early access program so that we could evaluate and learn how our solution would work in the real world and not just in a lab. We knew there would be a lot of data, and as we started to deploy, we learned there was even more data than we were expecting. In many cases, we found hosts sending back data indicating the same processes running several times each minute.
Even with our continuous efforts to de-duplicate and aggregate similar processes that run frequently, we’re still collecting on average 22 billion process creation events in a seven-day period! On an average weekday during North American daytime hours, we add 60,000 process creation events every second.
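As a rough illustration of what de-duplicating and aggregating repeated process events can look like, here is a minimal Python sketch. It is not our production pipeline. The `ProcessEvent` fields, the one-minute window, and the choice of grouping key are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class ProcessEvent:
    # Hypothetical fields; a real agent reports far more metadata.
    host_id: str
    image_path: str
    command_line: str
    timestamp: float  # seconds since the epoch

@dataclass
class AggregatedEvent:
    host_id: str
    image_path: str
    command_line: str
    window_start: int  # start of the time window, in epoch seconds
    count: int
    first_seen: float
    last_seen: float

def aggregate(events: Iterable[ProcessEvent],
              window_seconds: int = 60) -> list[AggregatedEvent]:
    """Collapse identical process launches on the same host within the
    same time window into one record that carries a count."""
    buckets: dict[tuple, AggregatedEvent] = {}
    for ev in events:
        window_start = int(ev.timestamp // window_seconds) * window_seconds
        key = (ev.host_id, ev.image_path, ev.command_line, window_start)
        agg = buckets.get(key)
        if agg is None:
            # First time we see this process on this host in this window.
            buckets[key] = AggregatedEvent(
                ev.host_id, ev.image_path, ev.command_line,
                window_start, 1, ev.timestamp, ev.timestamp,
            )
        else:
            # Repeat launch: bump the count instead of storing a new row.
            agg.count += 1
            agg.first_seen = min(agg.first_seen, ev.timestamp)
            agg.last_seen = max(agg.last_seen, ev.timestamp)
    return list(buckets.values())
```

With this approach, a host that launches the same command several times in a minute produces one stored record with a count rather than several nearly identical rows. The tradeoff is that the exact timestamp of each individual launch is lost, and only the first and last are kept.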
Processing, storing and querying this volume of data while maintaining a cost structure that makes sense for the 99% comes with some unique challenges. We’re committed to solving them, but doing what other vendors do (throwing a ton of hardware at the problem and passing that cost on to partners) isn’t an option for Huntress. As a result, we have to get creative with how we process and store the data so that we can extract the maximum security value from it while considering the cost of everything we do.
Our Journey To Protect 2M Endpoints
Hopefully, this peek behind the curtains gives a little more insight into the scale of the problems we’re trying to solve for the community as we do everything we can to bring security to the 99%. Transparency and integrity are core values at Huntress, and that means that we strive to be open and honest about both the good and the bad.
While we never anticipate issues with our site, these types of things do happen, and they can be frustrating for our partners—and for us. We’re going to continue improving to make sure that we can deliver the best experience and security possible to our partners and the community.