Deep Dive

Where the compute lives: on-device inference, NPUs, and the cloud round-trip

A robot's intelligence runs in one of two places, and which one decides its latency, its privacy posture, and whether a server outage turns it back into a dumb appliance. In 2026 the line is moving onto the device, unevenly.

By Robovations · 2 min read

When a robot recognizes a cord on the floor, something somewhere ran a model. The question that shapes almost everything else about the product is where that something sits: on a chip inside the robot, or on a server the robot has to reach over the network. The answer is rarely printed on the box, and it determines how fast the robot reacts, what data leaves your home, and what the robot can still do when the internet is down.

The industry shorthand is edge versus cloud. The reality in 2026 is a split system, with the fast reflexes moving on-device and the heavier, optional intelligence staying in the cloud.
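The split can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `handle_frame`, `local_classifier`, and `cloud_service` are hypothetical names, and the sketch assumes the cloud call signals an outage by raising `ConnectionError`.

```python
from typing import Callable, Optional, Tuple


def handle_frame(
    frame: bytes,
    local_classifier: Callable[[bytes], str],
    cloud_service: Optional[Callable[[bytes], str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (reflex_label, optional_cloud_result) for one camera frame."""
    # Reflex layer: runs on the robot and never touches the network.
    reflex_label = local_classifier(frame)

    # Optional layer: heavier analysis on a server, skipped when unreachable.
    cloud_result = None
    if cloud_service is not None:
        try:
            cloud_result = cloud_service(frame)
        except ConnectionError:
            cloud_result = None  # outage: keep driving on the reflex alone

    return reflex_label, cloud_result
```

The design point is that the reflex label never waits on the network. When the cloud call fails, the robot loses the optional result and nothing else.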

Two places a decision can happen

On-device inference versus the cloud round-trip

[Diagram: YOUR HOME and VENDOR CLOUD. Legend: 1 Map data, 2 Photos, 3 Voice clips, retained per policy.]
Local inference runs the model on a chip inside the robot: fast, private, and available offline. A cloud round-trip sends data out, runs the model on a server, and waits for the answer, which adds latency and a dependency on connectivity.

Term

NPU (Neural Processing Unit): A processor built specifically to run the matrix math behind neural networks. It is the chip a robot uses to classify what its camera sees without sending the image anywhere. When a product says it recognizes objects on-device, an NPU or equivalent accelerator is usually what makes that possible.

The case for the edge: Why obstacle recognition moved on-device

Real-time obstacle handling cannot tolerate a cloud round-trip. A robot moving across a floor has a fraction of a second to decide whether the shape ahead is a sock or a power strip. Sending each camera frame to a server and waiting for a reply would add latency the robot does not have, and it would fail completely the moment the connection dropped. So the reflexive layer, the recognition that has to happen now, has moved onto the robot. Modern premium vacuums pair an RGB camera and a structured-light or LiDAR sensor with an onboard neural unit that runs a trained classifier locally, with no server in the loop.
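A rough way to see why the round-trip hurts is to convert latency into distance travelled before the answer arrives. The figures below are assumptions chosen only for illustration (a driving speed of 0.3 m/s, 30 ms for local inference, 200 ms for a cloud round-trip), not measurements of any product, and `blind_distance_cm` is a hypothetical helper.

```python
def blind_distance_cm(speed_m_per_s: float, latency_ms: float) -> float:
    """Distance the robot covers before a decision about a frame comes back."""
    return speed_m_per_s * (latency_ms / 1000.0) * 100.0


# Assumed, illustrative figures only; not measurements of any product.
SPEED_M_PER_S = 0.3          # assumed driving speed
ON_DEVICE_MS = 30.0          # assumed local inference time
CLOUD_ROUND_TRIP_MS = 200.0  # assumed upload + server + reply time

local_cm = blind_distance_cm(SPEED_M_PER_S, ON_DEVICE_MS)
cloud_cm = blind_distance_cm(SPEED_M_PER_S, CLOUD_ROUND_TRIP_MS)
print(f"on-device: {local_cm:.1f} cm, cloud: {cloud_cm:.1f} cm")
```

Under these assumptions the robot moves about 0.9 cm while a local decision is pending and about 6 cm while waiting on a server. With no connection the cloud figure is unbounded, because the answer never arrives.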

This is a genuine capability shift, and it is also quietly a privacy improvement. A robot that classifies objects on-device does not need to stream your living room to a data center to decide where to drive. The image can be processed and discarded in place. Whether it actually is discarded is a separate question, governed by the app’s data settings rather than the chip, but local inference at least makes private operation possible.

How to tell: Reading a robot for where it thinks

You can often infer the architecture without a teardown. A robot that advertises offline cleaning, local maps, and recognition that works without an account is telling you the reflex layer is on-device. A robot whose key features are gated behind a connected account and a monthly fee is telling you those features live on a server. The first arrangement survives an outage and keeps your camera frames at home. The second is more capable on a good day and more fragile on a bad one.
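The reading above can be written down as a simple rule of thumb. This is a hedged sketch of that heuristic only: the `AdvertisedClaims` fields and the `infer_compute_location` function are hypothetical names, and real products may not map cleanly onto five yes/no claims.

```python
from dataclasses import dataclass


@dataclass
class AdvertisedClaims:
    offline_cleaning: bool
    local_maps: bool
    recognition_without_account: bool
    key_features_need_account: bool
    monthly_fee: bool


def infer_compute_location(claims: AdvertisedClaims) -> str:
    """Guess where the model runs from what the product advertises."""
    # All three offline-friendly claims together suggest an on-device reflex layer.
    on_device = (
        claims.offline_cleaning
        and claims.local_maps
        and claims.recognition_without_account
    )
    # Account-gated features plus a subscription suggest server-side features.
    cloud = claims.key_features_need_account and claims.monthly_fee

    if on_device and cloud:
        return "split"
    if on_device:
        return "on-device"
    if cloud:
        return "cloud"
    return "unclear"
```

For example, `infer_compute_location(AdvertisedClaims(True, True, True, False, False))` returns `"on-device"`. A product that makes both sets of claims comes back as `"split"`, which matches the reflexes-local, extras-in-the-cloud arrangement described earlier.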

For classification, this matters because the location of compute is a structural fact about the product, not a setting. It predicts the offline behavior, the privacy ceiling, and the subscription exposure all at once. It is one of the most useful things to know about a robot, and one of the least likely to be printed on the box.

The sensor list tells you what a robot can see. Where the compute lives tells you what it can do when the network goes dark.

Published June 18, 2026 · Updated June 23, 2026 · 435 words