Much ink has been spilled on the software opportunity, specifically software as a service (SaaS). The business model is the perfect combination of near-zero marginal costs and the potential for ever-growing customer lifetime values. The finer intricacies lie beneath the surface of this assertion, but every software entrepreneur attempts to turn the thesis into reality.
The SaaS model itself is a catch-all and has constantly evolved in terms of licensing, pricing models, customer personas and industries, form factors, etc. Software product development, go-to-market and delivery have been honed into a science. Competition within the category is at an all-time high, and most opportunities get picked off quickly.
The latest sea change within the domain results from how customer imaginations have been captured by advances in Generative AI over the last five years, culminating in end applications like DALL·E, Midjourney, ChatGPT, and Bard. The transformation has brought about a total reconfiguration of customer expectations. Analysts and investors alike are now inquiring about companies' AI strategies, while product professionals and aspiring entrepreneurs view this technology as a vehicle to elevate their current offerings or to create something novel that stands out from existing alternatives.
While software has long had zero marginal costs*, AI-powered features mean this is no longer the case, and better products are often determined by the amount of compute available. Companies can choose from various approaches, including developing their own foundational models, racking their own GPUs for training and inference, or allowing customers to bring their own models or API keys.
Let’s use a simple example of a Generative AI use-case and start by analysing the cost of third-party inference and what it means for the software business model. Third-party foundational model APIs are the easiest to set up and give us an outer bound of the costs involved.
Let’s take the case of large language models, since text output is the most universally applicable.
The building blocks of these models (both input & output) are called tokens. Tokens are units of text or code that a model uses to process and generate output. Tokens can be characters, words, subwords, or other segments of text or code, depending on the chosen tokenisation approach.
The cost per inference is:

Cost per inference = Ct × (Lti + Lto)

where Ct is the cost per token, Lti is the average number of tokens per input, and Lto is the average number of tokens per output.
When we take this result and consider how often the service is used, we arrive at the total cost of providing this service over a specific time period.
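In code, this calculation is straightforward. The sketch below uses a single blended per-token price for simplicity; in practice, most provider APIs price input and output tokens at different rates, and all the numbers in the usage example are hypothetical.

```python
def cost_per_inference(cost_per_token: float,
                       avg_input_tokens: float,
                       avg_output_tokens: float) -> float:
    """Ct x (Lti + Lto): blended per-token price times total tokens processed."""
    return cost_per_token * (avg_input_tokens + avg_output_tokens)


def total_cost(cost_per_token: float,
               avg_input_tokens: float,
               avg_output_tokens: float,
               inferences_per_period: int) -> float:
    """Cost of providing the service over a period at a given usage level."""
    per_call = cost_per_inference(cost_per_token, avg_input_tokens, avg_output_tokens)
    return per_call * inferences_per_period


# Hypothetical figures: $0.002 per 1K tokens, 1,500-token prompt,
# 500-token response, 1,000 inferences in the period
print(f"${total_cost(0.002 / 1000, 1500, 500, 1000):.2f}")
```

At these assumed rates, the period cost comes out to $4.00; the point is less the specific number than that cost now scales linearly with usage.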
Here’s what it takes to summarise the transcript from a 30-minute call:
Let’s assume this product is built into a transcription service - a user uses this product to record 20 calls a week. At this usage level, the summarisation and Q&A service could result in costs ranging from $20 to several hundred dollars annually for the business. Even at the lower bound, this is a substantial increase in the cost of the product. A $100 base transcription service now costs $120/year.
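As a back-of-the-envelope check on that lower bound, the sketch below assumes a blended price of $0.002 per 1K tokens, an ~8,000-token transcript per 30-minute call, and a ~1,000-token summary; all three figures are hypothetical and exist only to show the shape of the arithmetic.

```python
PRICE_PER_TOKEN = 0.002 / 1000   # hypothetical blended $/token
TRANSCRIPT_TOKENS = 8_000        # assumed tokens in a 30-minute call transcript
SUMMARY_TOKENS = 1_000           # assumed tokens in the generated summary
CALLS_PER_WEEK = 20

cost_per_call = PRICE_PER_TOKEN * (TRANSCRIPT_TOKENS + SUMMARY_TOKENS)
annual_cost = cost_per_call * CALLS_PER_WEEK * 52

print(f"${cost_per_call:.3f} per call, ${annual_cost:.2f} per year")
```

Under these assumptions the service costs on the order of $19/year per user, consistent with the ~$20 lower bound; richer prompting, longer context, or Q&A follow-ups push the figure up quickly.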
This additional $20 could be addressed in a few ways:
A savvy team will start by designing a Proof of Concept (POC) using off-the-shelf APIs from platforms like OpenAI or Anthropic, but will eventually have to confront these economics. Teams could choose to self-host smaller, fine-tuned models, leading to a lower cost of inference, but this opens up its own set of challenges: estimating the number of queries per second (the peak capacity) to be handled, the GPU architecture and memory bandwidth the model requires, and the DevOps effort needed to keep the infrastructure running.
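A first-order capacity estimate for that self-hosting decision might look like the sketch below. The per-GPU throughput and request shape are hypothetical placeholders; real figures would come from benchmarking the actual model on the actual hardware.

```python
import math


def gpus_needed(peak_qps: float,
                avg_output_tokens: float,
                gpu_tokens_per_sec: float,
                headroom: float = 0.7) -> int:
    """Rough GPU count: peak token demand divided by usable per-GPU throughput.

    headroom < 1 derates each GPU so traffic bursts and batching
    inefficiency don't saturate the fleet.
    """
    demand = peak_qps * avg_output_tokens            # tokens/sec to generate at peak
    supply_per_gpu = gpu_tokens_per_sec * headroom   # usable tokens/sec per GPU
    return math.ceil(demand / supply_per_gpu)


# Hypothetical: 5 requests/sec at peak, 500-token responses,
# 1,500 tokens/sec of generation throughput per GPU
print(gpus_needed(5, 500, 1500))
```

Even this crude model makes the trade-off visible: halving response length or doubling per-GPU throughput directly halves the fleet, which is why inference optimisation work pays for itself.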
Another aspect worth highlighting is that products with an established customer base will have a much easier time absorbing these features, because their gross margins give them more room to do so. AI-first startups face an added challenge: unlike established companies, which can cushion the costs with existing revenues, newcomers face tighter margins and a narrower pathway to profitability, even though they are also best positioned to innovate on products or take on new problem statements.
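To make the margin point concrete with purely hypothetical numbers: the same $20/year of inference cost is a small dent for an incumbent with a healthy gross margin, but a large one for a lower-priced, AI-first newcomer.

```python
def gross_margin(price: float, cogs: float) -> float:
    """Gross margin as a fraction of price."""
    return (price - cogs) / price


AI_COST = 20.0  # assumed annual inference cost per user

# Incumbent: $100/year product with $20 of existing COGS
print(f"incumbent: {gross_margin(100, 20):.0%} -> {gross_margin(100, 20 + AI_COST):.0%}")

# AI-first startup: $40/year product with $5 of existing COGS
print(f"startup:   {gross_margin(40, 5):.0%} -> {gross_margin(40, 5 + AI_COST):.0%}")
```

The incumbent drops from an 80% to a 60% gross margin; the startup drops from roughly 88% to below 40%, which is the narrower pathway to profitability described above.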
While we are optimistic that architectures will improve, models will get smaller, and generic hardware will be more performant through open-source innovation and community efforts, AI applications are expected to be compute-bound for the foreseeable future. This means that not every problem is a nail suited to the AI hammer.
This paradigm shift has massive potential for increased customer value, satisfaction, and revenue streams. However, as companies rush to launch their AI features, and in some cases shift their entire business model around AI, teams must align on a clear business case and confirm customers' willingness to pay for the additional cost of the service.
As investors, we are interested in the founders’ decision-making process for implementing AI/ML systems into products, not just from a product perspective but more crucially in terms of economic viability. Some of the most interesting conversations we've had have been brainstorming sessions on these aspects, and we look forward to more such engaging discussions and opportunities to invest.