Microservices are motivated by domain-driven design, which defines bounded business domains and ultimately manifests itself as loosely coupled services that can be updated independently. Nanoservices, by contrast, are a response to the need of machine learning engineering teams to manage an increasing number of models that each act as a single service.

This practice emerged in machine learning engineering as the function matured and teams managed an increasing number of models in production. Managing multiple models in the same service came with increasing complexity:

  • running multiple models in the same service leads to heavy services that are harder to scale
  • by definition, two different models are loosely coupled; running them in different services allows teams to take advantage of this loose coupling and manage them independently.

Why nanoservice and not microservice?

The challenge for ML Engineering teams differs from that of other backend engineering teams:

  • More services to manage: while backend teams reason in services, ML teams reason in projects. The number of projects grows as the team matures. Each project usually represents a feature or a task powered by a model
  • Less codebase logic: this is one of the main reasons why machine learning gained popularity in the first place. Instead of maintaining an increasingly complex rule-based system, teams replace it with a statistical entity that has learned the patterns from the data and abstracts away this complexity. ML teams' codebases are focused on data processing and model management.

So while ML Engineering teams end up managing more services, these services are much simpler, because the model weights stand in for the logic that would otherwise have to be written and maintained as code.
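To make the "less codebase logic" point concrete, here is a minimal sketch of what such a service could look like, using only the Python standard library. Everything in it is an assumption for illustration: the `WEIGHTS` values, the `predict` function, the request shape and the port are all hypothetical, and a real service would load its weights from a trained model artifact.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for trained weights; a real service would load
# these from a model artifact instead of hard-coding them.
WEIGHTS = {"bias": -1.0, "coef": [0.8, 0.2]}


def predict(features):
    """Score the input with a linear model: the behaviour lives in WEIGHTS,
    not in hand-written business rules."""
    if len(features) != len(WEIGHTS["coef"]):
        raise ValueError(f"expected {len(WEIGHTS['coef'])} features")
    score = WEIGHTS["bias"] + sum(w * x for w, x in zip(WEIGHTS["coef"], features))
    return {"score": score, "label": int(score > 0)}


class PredictHandler(BaseHTTPRequestHandler):
    """The whole service: parse JSON, call the model, return JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            body, status = predict(payload["features"]), 200
        except (ValueError, KeyError, TypeError) as exc:
            body, status = {"error": str(exc)}, 400
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8080):
    """Start the service; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service. Almost all of the code is plumbing around a single `predict` call, which is why one team can reasonably own many of these.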

So what's the difference between a nanoservice and a microservice?

  • while one microservice is for one team, one team can manage multiple nanoservices
  • while one microservice is a domain, a nanoservice is a feature
  • a nanoservice is much smaller than a microservice (hence the name)
  • nanoservices are aligned with functionality while microservices are aligned with the organisation

What's similar?

  • both can be deployed independently
  • both offer flexibility

AI nanoservices

The increasing popularity of large language models has only confirmed this trend.

The Model Context Protocol (MCP), a popular standard introduced by Anthropic in 2024, is designed around the interaction of agents with tools and resources.

Agents, tools and resources communicate via JSON-RPC 2.0 messages, carried over stdio or HTTP rather than a REST API, and agents and the servers exposing tools and resources are designed to run as separate services. For instance, an agent that needs to make reservations and bookings would run on its own service and connect to multiple MCP servers.

For AI engineering teams, maturing as a function and developing more features for their products means running multiple agent, tool and resource servers.

The principle remains the same: these nanoservices are going to be feature-specific and much smaller, as the LLM abstracts away most of the implementation complexity (though some of it can still be managed in the codebase).
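As a rough illustration of how small a feature-specific tool service can be, the sketch below handles a single JSON-RPC 2.0 request shaped like an MCP `tools/call` message. It is a simplification under stated assumptions: the `book_table` tool, its arguments and the `handle_message` helper are hypothetical, and a real MCP server (typically built with an official SDK) also handles initialization, tool listing and the transport layer.

```python
import json


# Hypothetical tool for a booking nanoservice; name and arguments are illustrative.
def book_table(restaurant: str, guests: int) -> str:
    return f"Booked a table for {guests} at {restaurant}"


TOOLS = {"book_table": book_table}


def _error(request_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle_message(raw: str) -> str:
    """Handle one JSON-RPC 2.0 request shaped like an MCP `tools/call`."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return _error(request_id, -32602, f"Unknown tool: {params.get('name')}")
    text = tool(**params.get("arguments", {}))
    # Tool results are returned as a list of content items.
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "result": {"content": [{"type": "text", "text": text}], "isError": False},
        }
    )


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "book_table", "arguments": {"restaurant": "Chez Nous", "guests": 2}},
}
response = json.loads(handle_message(json.dumps(request)))
print(response["result"]["content"][0]["text"])
```

The service owns one feature (booking) and very little logic; deciding when to call it and with which arguments is left to the agent and its LLM.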

Closing words

As ML and AI engineering teams manage an increasing number of services, multiple pain points will start to arise:

  • A monolithic codebase becomes bloated (flaky test suite, long-running CI/CD pipeline ...), while managing an increasing number of repositories comes with its own set of challenges
  • The cost of networking, monitoring and scaling grows with the number of services

While microservices are deep and few, nanoservices are shallow and many. Distributed systems technologies were not necessarily designed for microservices, but rather complemented by them, and their use for nanoservices presents some challenges.

I like to end my posts with opportunities (because all challenges are opportunities in disguise). Distributed systems technologies (such as Kubernetes) are the de facto choice to run nanoservices at scale, but as nanoservices challenge the horizontal scale limits of these systems, there is an opportunity for new technologies, or improvements to existing ones, to emerge and simplify the maintenance of such systems.