As the demand for AI infrastructure skyrockets, organizations are under increasing pressure to squeeze as much inference as possible out of the GPUs they already have. For those with deep expertise in inference-optimization techniques, this environment presents an excellent opportunity to secure investment.
This is one of the main reasons behind the emergence of Tensormesh, which has just come out of stealth mode after raising $4.5 million in seed capital. The round was headed by Laude Ventures, with additional backing from angel investor and database innovator Michael Franklin.
Tensormesh intends to use these funds to develop a commercial offering based on the open-source LMCache tool, which was created and is maintained by co-founder Yihua Cheng. When implemented effectively, LMCache can cut inference expenses by up to a factor of ten—a capability that has made it popular in open-source projects and attracted collaborations with major players like Google and Nvidia. Now, Tensormesh aims to transform its academic achievements into a sustainable business.
At the core of this technology is the key-value cache (KV cache), a memory layer that speeds up the processing of long, complex inputs by storing the intermediate attention states (keys and values) already computed for earlier tokens, so the model does not have to recompute them. In most conventional systems, the KV cache is discarded after each query, but Tensormesh CEO Junchen Jiang believes this practice leads to significant inefficiency.
“It’s comparable to a brilliant analyst who reviews all the information but forgets everything after answering each question,” Jiang explains.
Rather than deleting the cache, Tensormesh’s approach preserves it, making it available for reuse when the model encounters a similar task in a future query. Since GPU memory is a limited resource, this often involves distributing data across multiple storage solutions, but the payoff is a substantial boost in inference capacity without increasing server demands.
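The tiered-storage idea described above can be sketched in a few lines. The class below is a toy illustration, not Tensormesh’s or LMCache’s actual design: a small “GPU” tier holds the hottest entries, and instead of discarding least-recently-used KV state, it spills the state to a larger, slower tier from which it can later be promoted back.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy KV cache keyed by prompt prefix, with a small 'GPU' tier that
    spills least-recently-used entries to a larger 'slow' tier (CPU RAM
    or disk) rather than discarding them. Purely illustrative: real
    systems store per-token attention tensors, not arbitrary objects."""

    def __init__(self, gpu_capacity=2):
        self.gpu_capacity = gpu_capacity
        self.gpu = OrderedDict()  # fast tier, limited slots
        self.slow = {}            # slow tier, effectively unbounded

    def put(self, prefix, kv_state):
        self.gpu[prefix] = kv_state
        self.gpu.move_to_end(prefix)
        while len(self.gpu) > self.gpu_capacity:
            # Spill the least-recently-used entry instead of dropping it.
            evicted_prefix, evicted_kv = self.gpu.popitem(last=False)
            self.slow[evicted_prefix] = evicted_kv

    def get(self, prefix):
        if prefix in self.gpu:
            self.gpu.move_to_end(prefix)
            return self.gpu[prefix], "gpu"
        if prefix in self.slow:
            kv = self.slow.pop(prefix)
            self.put(prefix, kv)  # promote back to the fast tier
            return kv, "slow"
        return None, "miss"  # cache miss: the expensive recompute path
```

The payoff in this model is that a “miss” (a full recompute of the KV state) only happens for genuinely new inputs, while anything seen before is served from one of the two tiers.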
This improvement is especially impactful for chat-based applications, where models must repeatedly access an expanding conversation history. Agentic systems face a similar challenge, as their logs of actions and objectives continually grow.
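Why chat workloads benefit so much comes down to prefix overlap: each new turn’s prompt contains the entire previous conversation as a prefix, so a kept KV cache covers all but the newest tokens. A hedged sketch of that prefix-matching step (the function name and token representation are illustrative assumptions, not any library’s API):

```python
def longest_cached_prefix(prompt_tokens, cached_prefixes):
    """Return the longest cached token prefix of this prompt.
    Illustrative: real systems match hashed token blocks, not lists."""
    best = []
    for prefix in cached_prefixes:
        if len(prefix) > len(best) and prompt_tokens[:len(prefix)] == prefix:
            best = prefix
    return best


# Turn 1 of a conversation, whose KV state was kept after serving it.
history = ["<sys>", "Hi", "<asst>", "Hello!"]
cached = [history]

# Turn 2 appends to the same history, so most tokens are already cached.
turn2 = history + ["<user>", "Tell", "me", "more"]
reused = longest_cached_prefix(turn2, cached)
fresh_tokens = len(turn2) - len(reused)  # only these need a new prefill
```

As the conversation grows, the reused prefix grows with it, which is exactly why discarding the cache after every query is so wasteful for chat and agentic workloads.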
While, in principle, AI firms could implement these changes themselves, the technical hurdles make it a formidable undertaking. Given the Tensormesh team’s expertise and the complexity involved, the company is confident there will be strong interest in a ready-made solution.
“Storing the KV cache in secondary storage and reusing it efficiently without degrading system performance is a tough technical challenge,” Jiang notes. “We’ve observed teams hiring dozens of engineers and spending several months to build such a system. Alternatively, they can leverage our product and achieve the same results much more efficiently.”