Why is supporting customer-hosted software deployments still so hard?
Dec 3, 2024
Supporting customer-hosted deployments of your software is painful. But lots of enterprises want to self-host – so vendors support it to access the most valuable customers.
I experienced this pain first-hand as co-founder of Context.ai, building a conversation analytics product. Large enterprise customers wanted to deploy our product into their own private cloud environment that we did not have access to. They wanted this due to the sensitivity of the data we were analyzing. This was challenging to support. It was hard to deploy, to upgrade, and particularly to debug when something went wrong.
I’ve spent the last month talking to ~50 other software vendors about how they manage their customer-hosted deployments. These ranged from the largest software businesses to startups working towards their first customer-hosted deployment. I wanted to understand if this deployment model was enduring, what the biggest challenges were, and how these challenges were being solved. In this post I’ll share my answers to these questions.
I was surprised how broadly and intensely vendors dislike supporting their customer-hosted deployments. They are seen as a drain on resources that delivers a worse product experience to the end customer. Particular challenges included customer upgrade aversion, debugging issues across vastly different environments, and testing releases before deployment.
Let's dive in.
What is customer-hosting?
Customer-hosting refers to software deployed in an environment controlled by the customer, ranging from on-premise to private cloud accounts.
There is a broad spectrum of deployment models this can refer to. One extreme is airgapped on-premise deployment onto a customer’s physical servers. The other extreme involves a customer-provided private cloud (e.g. AWS) account, with the vendor fully managing the instance in that account. Most customer-hosted deployments are somewhere in the middle – often hosted within the customer's cloud account or their own data centers, with the vendor not having access to the deployment. In this post I refer to the whole spectrum as customer-hosted for simplicity.
Is customer-hosting an enduring deployment model?
The first question to answer is whether customer-hosted deployments are here to stay. There has clearly been a trend towards cloud SaaS over the past 15 years, but will this continue until everyone has adopted SaaS? I don’t know the answer, but in my conversations I heard:
Many vendors are trying to move away from customer-hosted deployments. It takes a significant amount of support capacity to deliver a worse product experience, and this support cost scales linearly with the number of deployments. This breaks the efficiency of the B2B SaaS business model that requires very low marginal cost per incremental user. Vendors are pushing their customers towards cloud SaaS deployments as a result, but the results are mixed.
The reality of the market is that customer requirements drive deployment decisions more than vendor preferences. Vendors will largely do what is needed to get a deal across the line. And several groups of customers have persistent requirements to self-host: large enterprises; companies in regulated industries such as health, finance, defence, and government; and EU-based organisations.
My conclusion: the market is moving towards more cloud SaaS. But customer-hosted isn’t going away any time soon, particularly for AI products that require access to large amounts of sensitive data.
What are the biggest challenges with customer-hosted deployments? And how are they being tackled?
The root cause of the support challenges is that the vendor is supporting software in a unique environment they don’t control.
This is not a solvable problem. All customer environments are different, and one of the primary motivators for the customer to require self-hosting is to ensure their data is not accessible to the vendor – or anyone else.
There are many downstream problems that stem from this challenge. These include:
Debugging problems across a wide variety of customer environments
Getting access to application logs is often challenging, due to data sharing restrictions and redaction requirements. But application logs may not have caught the problem, either because the instance has restarted or because the issue is caused by a problem that is logged elsewhere, such as the Kubernetes logs. This is a huge problem, with several vendors telling us it takes ~10x longer for a support engineer to reproduce and resolve an issue with a customer-hosted deployment compared to a cloud SaaS issue.
To solve this:
Vendors and support teams often schedule Zoom calls with the customers to screenshare and work through issues. Sometimes onsite visits are even required.
Better solutions here look like DataDog’s Agent Flare, pulling a package of logs for easy sharing with the vendor.
Customers often also require logs to be redacted of sensitive information, but in practice this is extremely challenging.
A small group of customers are receptive to breakglass access to their deployments for debugging. This grants the vendor temporary access to the deployment, usually via SSH, to debug issues and make changes. This access requires explicit approval from the customer every time, is time-limited, and actions are logged. But most customers won't accept this.
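To illustrate why the redaction requirement mentioned above is hard in practice, here is a minimal sketch of pattern-based log redaction. The patterns, placeholder tokens, and function names are my own assumptions for illustration, not any vendor's actual tooling. Regexes like these catch well-structured values such as emails and IP addresses, but free-text sensitive data (names, conversation content) slips straight through, which is a large part of the difficulty.

```python
import re

# Hypothetical patterns; a real list would be agreed with each customer.
PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<EMAIL>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
    # Keep the key name so the log stays readable, drop the value.
    (re.compile(r"(?i)\b(token|password|api_key)=\S+"), r"\1=<SECRET>"),
]


def redact_line(line: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, replacement in PATTERNS:
        line = pattern.sub(replacement, line)
    return line


def redact_lines(lines):
    """Redact a batch of log lines before they leave the customer environment."""
    return [redact_line(line) for line in lines]
```

A customer could run something like this over a log bundle before sharing it, then review the output by hand for anything the patterns missed.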
Upgrade aversion
Customers have plenty of incentives not to upgrade and few incentives to do so. Upgrades require an investment of customer time and risk key functionality breaking. This can be in return for critical bug fixes (pretty persuasive), new features (often not persuasive), or product ‘improvements’ (often a negative when you’re happy with the current version). One large enterprise software buyer told us none of their vendors provided a great solution for self-hosting, mainly due to scheduled downtime requirements.
To solve this:
We’ve seen vendors invest to both reduce the customer effort required and reduce the risk of an upgrade breaking the deployment.
To reduce the risk of problems, vendors should enforce upgrade best practices such as taking pre-upgrade backups and using a blue/green deployment.
To reduce the investment required, upgrades should avoid requiring scheduled downtime whenever possible. An automated upgrade process is helpful, and helps reduce the risk of a process being completed incorrectly by a human.
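The upgrade practices above (backup first, blue/green, automated rather than manual) can be sketched as a single function. This is a simplified outline under my own assumptions: every callable passed in is a hypothetical hook the vendor would implement for their own stack, and a real process would also handle database migrations and partial failures.

```python
from typing import Callable


def upgrade_blue_green(
    take_backup: Callable[[], str],
    deploy_green: Callable[[str], None],
    green_is_healthy: Callable[[], bool],
    switch_traffic_to_green: Callable[[], None],
    tear_down_green: Callable[[], None],
    new_version: str,
) -> dict:
    """Upgrade without scheduled downtime; the old (blue) stack keeps serving
    traffic until the new (green) stack has passed its health check."""
    backup_id = take_backup()      # always back up before touching anything
    deploy_green(new_version)      # new version runs alongside the old one
    if green_is_healthy():
        switch_traffic_to_green()  # cut over only once green is verified
        return {"upgraded": True, "backup_id": backup_id, "version": new_version}
    tear_down_green()              # blue was never touched, so nothing to restore
    return {"upgraded": False, "backup_id": backup_id, "version": None}
```

The point of the design is that a failed upgrade leaves the customer exactly where they started, which lowers the perceived risk of upgrading.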
Ongoing monitoring
Tracking the health and even the versions of the customer-hosted deployments is challenging, and worsens the debugging issue. Vendors don’t know the basic status of their deployments, or the versions that are being run.
To solve this:
Vendors often have spreadsheets tracking customer versions. More mature vendors have dashboards reporting deployment status, health, and versions. These are powered by an agent that sits in each deployment and reports low-risk metadata back to the vendor, but again, not every customer would accept this level of data sharing.
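As a rough sketch of what "low-risk metadata" could look like, the agent might build its report from an explicit allowlist, so that nothing reaches the vendor unless the customer has agreed to that field. The field names and the `build_heartbeat` function are assumptions of mine for illustration, and how the payload is sent is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical allowlist; each customer would review and approve these fields.
ALLOWED_FIELDS = {"deployment_id", "app_version", "healthy", "uptime_seconds"}


def build_heartbeat(status: dict, now: Optional[datetime] = None) -> str:
    """Return a JSON heartbeat containing only allowlisted fields."""
    payload = {k: v for k, v in status.items() if k in ALLOWED_FIELDS}
    payload["reported_at"] = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(payload, sort_keys=True)
```

An allowlist rather than a blocklist means new fields added to the application's internal status are withheld by default, which is an easier guarantee to offer a security team.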
Testing releases on many environments
When your customers are operating vastly different environments it becomes challenging to guarantee a new release will work on all of them. This quickly becomes a ‘matrix from Hell’ in the words of one founder I spoke to. Testing the new release on all these environments is not practical, particularly given that customers may be jumping many versions as they upgrade from a very out-of-date version.
To solve this:
Vendors test releases on recreations of common environments, and environments used by particularly high priority customers. But it’s impossible to test on every possible combination.
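To make the scale of the matrix concrete, here is a small sketch that counts the combinations and then picks a test set within a fixed budget, weighted towards high-priority customers. The axes, their values, and the weighting scheme are all made up for illustration; real matrices have more dimensions than this.

```python
from collections import Counter
from itertools import product

# Hypothetical axes of variation across customer environments.
KUBERNETES = ["1.27", "1.28", "1.29", "1.30"]
PLATFORMS = ["aws", "gcp", "azure", "on-prem"]
DATABASES = ["postgres-13", "postgres-15", "postgres-16"]
UPGRADING_FROM = ["2.0", "2.1", "2.2", "2.3", "2.4"]


def full_matrix():
    """Every combination of the axes above: 4 * 4 * 3 * 5 environments."""
    return list(product(KUBERNETES, PLATFORMS, DATABASES, UPGRADING_FROM))


def pick_environments(customer_envs, budget):
    """Choose up to `budget` environments to test, ranked by the total
    priority weight of the customers running each one.

    customer_envs: iterable of (environment_tuple, priority_weight) pairs.
    """
    weights = Counter()
    for env, weight in customer_envs:
        weights[env] += weight
    return [env for env, _ in weights.most_common(budget)]
```

Even these four modest axes give 240 combinations, which is why vendors fall back on testing the environments their customers actually run rather than the full grid.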
Initial deployments into hugely varied environments
Gathering specific requirements from the customer can take months, and once you have the requirements it can be a lot of work to containerize your application and deploy it with its dependencies. One Solutions Architect told me they had lost 50% of their customer-hosted customers because ‘we’ve never had a smooth install’.
To solve this:
We’ve seen many vendors rely on Helm charts given the popularity of Kubernetes for their customer-hosted deployments. But these Helm charts can be time-consuming to create and manage. One Sales Engineer told us that ‘Kubernetes raises a lot of questions that are hard for a non-expert to answer’ when creating a Helm chart.
How intense are these challenges? The CTO of a cloud-only vendor estimated they’re leaving a 30% revenue lift on the table because they don’t want to support customer-hosting. That’s the intensity of pain businesses associate with customer-hosting.
Customer-hosting is here to stay, but supporting it doesn’t have to be so painful. Large software vendors have invested significantly in tooling to improve the process of supporting their customer-hosted deployments, with tools for deployment, upgrades, testing, monitoring, and debugging. This can reduce the pain significantly, but there aren’t great off-the-shelf solutions for everyone else to solve these problems.
Thinking about customer-hosting?
We’re building a product to solve the challenge of supporting these customer-hosted deployments: packaging the product for deployment, debugging issues in environments you can’t access, and testing releases across many environments.
I’d love to chat to share ideas and get feedback from anyone who offers customer-hosted deployment options and is thinking about these problems. Please reach out on LinkedIn or at henry@context.ai