Chapter 5: The Surgeon and the Machine — When AI Gets a Body
The Hands That Weren’t There
On a Wednesday morning in July 2025, a surgical team at Johns Hopkins University gathered around an operating table. On it lay a pig — anesthetized, prepped, draped in sterile blue. The procedure was a cholecystectomy: removal of the gallbladder. Routine, as surgeries go. Performed hundreds of thousands of times a year by human hands around the world.
But on this Wednesday, there were no human hands.
The Smart Tissue Autonomous Robot — STAR — performed the entire operation. Not a single step was teleoperated. No human finger touched a control. The robot planned its approach, identified the cystic duct and cystic artery, dissected the tissue planes, clipped the critical structures, and freed the gallbladder from its hepatic bed. Seventeen sequential surgical tasks, each requiring real-time decisions about where to cut, how much tension to apply, when to pause and reassess. The team watched on monitors, hands at their sides, as a machine navigated living anatomy with a precision that one observer described as “unsettlingly smooth.”
I want you to sit with that image for a moment. Not the technical achievement — we will get to the engineering. I want you to sit with the feeling of it. A room full of surgeons watching a machine do what their hands were trained to do. The strange silence of an operating room where no one is operating. The particular vertigo of watching something you spent a decade learning to do performed by something that learned it differently, learned it faster, and does not tremble.
In every chapter of this book so far, AI has been an advisor. It read scans. It scored probabilities. It flagged warnings. Even when it failed — as it failed Maria in Chapter 4 — it failed in the role of consultant. The physician remained the actor. The machine remained the voice in the ear.
In this chapter, the machine gets a body.
“The hand is the cutting edge of the mind.” — Jacob Bronowski
This changes everything. Not because autonomous surgery is imminent for your next operation — it is not. But because the moment AI acquires physical agency in the most intimate of medical domains, every assumption about augmentation, transparency, and equity must be renegotiated from the ground up.
The Taxonomy of Control
To understand where surgical AI actually stands — as opposed to where headlines place it — we need a framework. The clearest one available was not built for medicine. It was built for cars.
The Society of Automotive Engineers defines six levels of driving automation, from L0 (no automation — the human does everything) to L5 (full automation — the car handles all conditions without human input). This taxonomy has become the shared language for discussing self-driving technology because it replaces binary thinking (Is it autonomous or not?) with a spectrum (How much autonomy, in which situations, with what fallback?).
Surgery needs the same spectrum. Here is how it maps:
Level 0 — No Assistance. The surgeon’s hands, unaugmented. Traditional open surgery. A scalpel, a headlamp, and a lifetime of training. This is what surgery was for centuries, and it remains the reality in much of the world.
Level 1 — Passive Assistance. The machine provides information but does not act. Think of surgical navigation systems that display a patient’s anatomy in 3D, or AI that overlays a tumor’s boundaries on a laparoscopic video feed. The surgeon sees more clearly, but every motion is human-initiated. Most “AI in surgery” today lives here.
Level 2 — Active Assistance. The machine can perform specific subtasks under direct human control. The da Vinci Surgical System, which has dominated robotic surgery for two decades, operates at this level. The surgeon sits at a console, gripping hand controllers, and the robot translates their motions into movements of surgical instruments inside the patient’s body. The robot adds tremor filtration, motion scaling, and enhanced dexterity. But it does not decide anything. It is a sophisticated extension of the surgeon’s hands — a teleoperator, not an autonomous agent. Every cut is the surgeon’s cut, executed through a mechanical intermediary.
This distinction matters because most of what the public calls “robotic surgery” is Level 2. The robot is the instrument. The surgeon is the intelligence. Calling it “robotic surgery” is like calling a telephone conversation “electronic speaking” — technically accurate, functionally misleading.
Level 3 — Conditional Autonomy. The machine can perform defined sequences of tasks autonomously, but the human must be ready to intervene at any moment. Think of a system that can suture a wound on its own — choosing needle placement, managing tissue tension, tying knots — while the surgeon monitors and can take over if the tissue behaves unexpectedly. Several research groups have demonstrated this capability in controlled settings. The machine acts; the human supervises; the handoff between them must be instantaneous.
Level 4 — High Autonomy. The machine can handle a complete surgical task or procedure in a defined domain without human intervention, though a human is present and the domain is constrained. This is where the STAR robot operates. It performed a complete cholecystectomy autonomously — but on a pig, in a controlled research environment, with a known anatomy and a team ready to abort. The domain was bounded. Within those bounds, the machine made every decision.
Level 5 — Full Autonomy. The machine handles any surgical situation, in any patient, without human oversight. This does not exist. It may never exist. And whether it should exist is a question that deserves more than a paragraph — we will return to it.
The reason this taxonomy matters is that it punctures the binary narrative that dominates public discourse: either robots are replacing surgeons or they are mere tools. The truth is a gradient, and the gradient is where all the interesting questions live. At Level 2, the philosophical challenge is modest — the machine is clearly an instrument. At Level 4, the challenge becomes acute: if the machine is making real-time decisions about where to cut living tissue, what does augmentation mean? Who is augmenting whom?
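The spectrum above can be sketched as a small data structure. This is a minimal illustration, with invented names throughout (`SurgicalAutonomy`, `human_role`); no real surgical platform exposes such an API:

```python
from enum import IntEnum

class SurgicalAutonomy(IntEnum):
    """Autonomy levels for surgical systems, mirroring the SAE driving-automation spectrum."""
    NO_ASSISTANCE = 0         # unaugmented human hands: traditional open surgery
    PASSIVE_ASSISTANCE = 1    # machine informs but never acts: navigation, overlays
    ACTIVE_ASSISTANCE = 2     # machine executes the surgeon's motions: da Vinci teleoperation
    CONDITIONAL_AUTONOMY = 3  # machine performs defined sequences; human ready to intervene
    HIGH_AUTONOMY = 4         # complete procedure in a bounded domain: STAR's cholecystectomy
    FULL_AUTONOMY = 5         # any situation, any patient; does not exist

def human_role(level: SurgicalAutonomy) -> str:
    """Who is the primary actor at each level?"""
    if level <= SurgicalAutonomy.ACTIVE_ASSISTANCE:
        return "surgeon acts; machine informs or executes the surgeon's motions"
    if level == SurgicalAutonomy.CONDITIONAL_AUTONOMY:
        return "machine acts; surgeon supervises with instantaneous takeover"
    if level == SurgicalAutonomy.HIGH_AUTONOMY:
        return "machine acts within a bounded domain; team stands ready to abort"
    return "no human oversight (hypothetical)"

print(human_role(SurgicalAutonomy.HIGH_AUTONOMY))
```

The point of the ordering is the one the chapter makes in prose: the interesting questions live in the gaps between adjacent levels, not at the endpoints.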
The Haptic Gap
There is something that every surgeon knows and almost no engineer fully grasps. It is the thing that makes surgery an art and not merely a procedure. It has no good name in the technical literature, so surgeons speak of it in the language of craft: feel.
When a surgeon’s hand holds a scalpel and draws it through tissue, the hand is not merely executing a motion. It is listening. The resistance of the tissue tells a story — the drag of healthy fascia versus the gritty crunch of fibrosis, the subtle give of a vessel wall approaching its rupture point, the difference between a plane that separates cleanly (the right plane) and one that tears with reluctance (the wrong one). An experienced surgeon adjusts in real time, calibrating force, angle, and speed based on continuous tactile feedback that arrives below the level of conscious thought.
This is proprioception — the body’s sense of its own position and force in space. And in surgery, proprioception is not a luxury. It is the primary sensory channel through which the surgeon reads the tissue. A vascular surgeon can feel the difference between a calcified artery and a healthy one before the imaging confirms it. A neurosurgeon can sense when a tumor’s border has been reached because the texture changes under the bipolar forceps. These are not mystical claims. They are somatosensory computations, performed by the surgeon’s nervous system at speeds that make conscious analysis unnecessary.
The da Vinci system, for all its elegance, largely eliminates this channel. The surgeon sits at a console, feet away from the patient, manipulating hand controllers that translate into robotic arm movements. The visual feedback is superb — a magnified, three-dimensional view of the operative field. But the haptic feedback is minimal. The surgeon cannot feel what the instruments are touching. They must infer tissue properties from visual cues alone: how the tissue deforms, how it bleeds, how it responds to traction. Skilled da Vinci surgeons develop remarkable visual-proprioceptive substitution — they learn to “see” what they would normally feel. But the translation is imperfect, and every experienced robotic surgeon will tell you, if you ask honestly: something is lost.
Now compound that loss. At Level 2, the human brain still integrates visual information with cognitive models of anatomy and a career’s worth of tactile memory. The surgeon remembers what healthy tissue feels like, even when working through a robot that cannot feel it. At Level 4, where the machine is acting autonomously, that compensatory mechanism vanishes. The machine has no archive of tactile memories. It has never held a beating heart. It navigates tissue the way a driver navigates a city using only satellite imagery — accurately, perhaps, but without the ground-level knowledge of potholes, one-way streets with ambiguous signs, the particular way rain changes everything.
This is the haptic gap, and it is not merely an engineering problem awaiting a sensor upgrade. It is a philosophical problem about what it means to know tissue. Force sensors and strain gauges can measure resistance, deformation, and shear. But the surgeon’s proprioceptive intelligence is not raw measurement — it is measurement integrated with years of embodied experience, contextual judgment, and the ability to recognize when something feels wrong before the data confirms it. Can a machine replicate not just the sensing but the wisdom that accrues from a body that has felt ten thousand tissues and remembers them all?
The honest answer is: not yet. Perhaps not for a long time. And this answer should shape how we think about the autonomy spectrum. The levels where haptic intelligence matters most — complex dissections, vascular repairs, tumor resections near critical structures — may be the levels where human hands remain essential longest. The machine excels where geometry is clear and tissue is predictable. The human excels where the territory is uncertain and the only guide is a feeling that cannot be formalized.
The Movie Inverted
In every chapter of this book, the photograph-to-movie metaphor has pointed in one direction: AI takes static, isolated data points and reveals the dynamic patterns connecting them. The single lab value becomes a trajectory. The frozen scan becomes a temporal narrative. The isolated risk factor joins a web of interactions that unfolds over time. Photographs become movies. The hidden dimension is time.
Surgery inverts this.
The operative field is already a movie. It is continuous, flowing, alive. Tissue shifts. Blood wells and is suctioned away. Organs pulse with each heartbeat. The anatomy that looked fixed on a preoperative CT scan is, under the surgeon’s hands, a dynamic landscape in constant motion. The surgeon’s art is inherently cinematic — reading the flow, anticipating the next frame, adjusting the narrative in real time.
What the machine must do in surgery is the opposite of what it does in diagnosis. It must decompose the movie into photographs.
A surgical AI watching a continuous operative field must segment that flow into discrete decision points: This is the moment to clip. This is the plane to enter. This structure must be avoided. The tension here exceeds the safe threshold — release. Each decision is a still frame extracted from the film — a moment of photographic precision imposed on cinematic reality. The robot’s contribution to surgery is not adding the temporal dimension (the movie already exists) but adding spatial precision within the flow — the ability to execute at a specific point in space with a tolerance measured in fractions of a millimeter, at a specific moment in time chosen by algorithms processing visual data faster than any human eye.
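The decomposition described above can be caricatured as a per-frame decision loop. Everything here is invented for illustration: the frame rate, the idea of a single scalar "tension" reading, and the threshold values:

```python
# Hypothetical sketch: a continuous operative feed reduced to discrete decisions.
# The tension reading and thresholds are invented; real systems fuse many signals.
FRAME_INTERVAL_S = 1 / 30   # a 30 fps video stream
SAFE_TENSION_N = 1.5        # invented safe-tension threshold, in newtons

def decide(frame_tension_n: float, target_identified: bool) -> str:
    """Collapse one frame of the 'movie' into a single photographic decision."""
    if frame_tension_n > SAFE_TENSION_N:
        return "release"    # the tension here exceeds the safe threshold
    if target_identified:
        return "clip"       # this is the moment to clip
    return "hold"           # keep watching; no action this frame

# Four frames of a simulated stream: (tension in newtons, target visible?)
stream = [(0.8, False), (1.1, False), (1.9, False), (1.0, True)]
print([decide(t, seen) for t, seen in stream])  # ['hold', 'hold', 'release', 'clip']
```

Each call is a still frame extracted from the film: the continuous flow enters, a discrete act-or-hold verdict comes out.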
This inversion matters because it reveals something about the division of labor between human and machine that the diagnostic chapters did not. In diagnosis, the machine’s gift is seeing more — perceiving the movie that humans, limited to photographs, could not see. In surgery, the machine’s gift is acting more precisely — executing within the movie at a resolution that human hands, limited by tremor and fatigue, cannot sustain.
The surgeon is the filmmaker. They understand the narrative of the operation — its arc, its risks, its contingencies, the moment when the plan must change because the tissue has told a different story than the imaging predicted. The machine is the cinematographer — executing each shot with technical precision that serves the film but does not direct it. At Level 2, this collaboration is explicit: the surgeon directs, the robot executes. At Level 4, the roles begin to merge, and the question of who is directing becomes genuinely hard to answer.
But here is what the STAR experiment revealed, and what makes it both thrilling and sobering: even at Level 4, the machine’s “direction” is fundamentally photographic. It executes a sequence of planned steps with extraordinary spatial precision. What it cannot do — not yet — is improvise. When the anatomy deviates from expectation, when a vessel is anomalous, when the tissue behaves in a way the training data never included, the machine does not have the filmmaker’s ability to rewrite the script in real time. It has the cinematographer’s ability to shoot each frame beautifully, but not the director’s ability to abandon the storyboard when the scene demands it.
The movie inverted. In diagnosis, AI is the projector that reveals the film. In surgery, AI is the precision instrument that operates within the film. Both are forms of augmentation. But they augment different things — one augments perception, the other augments execution — and the limits of each are mirror images of the other.
The Three Principles Under the Knife
Augmentation: Who Leads?
The Augmentation Principle — AI amplifies human capability, not replaces human judgment — was formulated in the context of diagnosis. A system that flags a suspicious nodule augments the radiologist. A system that predicts sepsis augments the intensivist. The human decides. The machine informs.
In surgery, augmentation becomes physically literal and conceptually murky.
At Level 2, augmentation is clean. The da Vinci system augments the surgeon’s hands — adding tremor filtration, motion scaling, and instrument articulation that exceeds the human wrist’s degrees of freedom. The surgeon’s judgment is unambiguously in command. No philosophical crisis here.
At Level 4, the STAR performing a cholecystectomy, the language of augmentation strains. The machine is not amplifying a human decision — it is the decision-maker. It identifies the cystic duct, selects the clip placement, determines the dissection angle. If the human team is present but not acting, in what sense is this augmentation? The machine is not a more powerful scalpel. It is a more powerful surgeon.
One answer — and I think it is the right one, though it requires careful thinking — is that augmentation operates at a higher level of abstraction. The human team designed the protocol. Human surgeons validated the approach. Human judgment determined that this particular patient, with this particular anatomy, was a candidate for autonomous operation. The machine acts within a space of possibility that human expertise defined. The augmentation is not hand-to-hand but system-to-system: human surgical knowledge, accumulated over centuries, compressed into a framework that the machine executes with superhuman consistency.
This is a different kind of augmentation than the photograph-to-movie shift. It is augmentation as delegation — the way a senior surgeon augments a resident by defining the parameters of a procedure and then allowing the junior surgeon to execute within them. The machine is the most consistent, most tireless, most spatially precise resident in history. But the attending — the corpus of human surgical knowledge — is still in the room.
Whether this framing survives Level 5, if Level 5 ever arrives, is a question I cannot answer. It may be the point where augmentation becomes a polite fiction, and we must find new language for what the human role has become.
Transparency: The 200-Millisecond Problem
In diagnosis, transparency means the system can show its work. The sepsis model displays the features driving its score. The imaging AI highlights the region it found suspicious. The physician reviews, challenges, overrides. This process operates on a timescale of minutes to hours — there is time for inspection.
In surgery, the timescale collapses.
When a surgical AI decides to clip a structure, it makes that decision in milliseconds. The visual processing, the anatomical identification, the force calculation, the motor command — the entire chain from perception to action executes faster than a human can formulate a question about it, let alone challenge the answer. If the machine misidentifies the common bile duct as the cystic duct — the most feared error in cholecystectomy, the error that can leave a patient in lifelong misery — the clip is placed before anyone can say wait.
This is the 200-millisecond problem, and it breaks the conventional model of transparency. You cannot inspect reasoning that has already been acted upon. Post-hoc explanation — “Here is why I clipped that structure” — is useful for learning, for quality improvement, for litigation. But it is useless for prevention. The Transparency Principle, as formulated for diagnostic AI, assumes a gap between recommendation and action in which human oversight can operate. Autonomous surgery eliminates that gap.
The solution, I believe, lies not in slowing the machine down (which would defeat the purpose of its precision) but in moving transparency upstream. Before the procedure begins, the system must present its surgical plan: here are the structures I expect to encounter, here is how I intend to navigate them, here are the decision points where my confidence is highest and lowest. The surgeon reviews this plan the way an air traffic controller reviews a flight path — not intervening in each maneuver, but validating the framework within which the maneuvers will occur. And the system must define its own abort criteria: if the anatomy deviates from expectation by more than a defined threshold, if the visual confidence drops below a critical level, the machine stops. It invokes apoptosis — the self-silencing we described in Chapter 4 — not for a recommendation on a screen, but for a blade in living tissue.
Transparency in surgery is not a window you look through in real time. It is a contract you negotiate in advance and a kill switch you design before the first incision.
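A minimal sketch of such a contract, assuming invented names and thresholds (`AbortCriteria`, `should_abort`, the 2 mm deviation limit); it shows the shape of the idea, not any certified implementation:

```python
from dataclasses import dataclass

@dataclass
class AbortCriteria:
    """Pre-negotiated conditions under which the machine must stop and hand back control."""
    max_anatomy_deviation_mm: float  # tolerated drift from the preoperative plan
    min_visual_confidence: float     # identification confidence below which the blade halts

def should_abort(deviation_mm: float, confidence: float, criteria: AbortCriteria) -> bool:
    """True if either pre-agreed threshold is breached: the 'apoptosis' trigger."""
    return (deviation_mm > criteria.max_anatomy_deviation_mm
            or confidence < criteria.min_visual_confidence)

# The contract is negotiated before the first incision, not inspected in real time.
criteria = AbortCriteria(max_anatomy_deviation_mm=2.0, min_visual_confidence=0.95)
print(should_abort(deviation_mm=0.5, confidence=0.99, criteria=criteria))  # False: within the contract
print(should_abort(deviation_mm=3.1, confidence=0.99, criteria=criteria))  # True: anatomy has deviated
```

The design choice is the one the chapter argues for: the thresholds are validated by humans in advance, so oversight happens at the level of the framework, never inside the 200-millisecond window.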
Equity: The $2.5 Million Question
A da Vinci Surgical System costs approximately $2.5 million. Annual maintenance runs another $200,000. The instruments are proprietary and expensive. The training required to achieve competency is extensive, and the learning curve is steep.
Now consider the geography of surgical need. The Lancet Commission on Global Surgery estimated that five billion people — two-thirds of the world’s population — lack access to safe, affordable surgical care when they need it. The highest burden of surgically treatable disease falls on low- and middle-income countries, where surgical infrastructure is thinnest, where trained surgeons are scarcest, and where a $2.5 million robot is a fantasy bordering on insult.
This is the Equity Principle tested at its most brutal. If surgical AI remains tethered to platforms that cost as much as a rural hospital’s entire annual equipment budget, then the technology will amplify the existing distribution of surgical capability: excellent care concentrated in wealthy urban centers, scarcity everywhere else. The movie plays in high definition at Johns Hopkins and in no resolution at all in rural Bihar.
But here is where the trajectory bends toward hope — if we choose to bend it.
The STAR robot is not a da Vinci. It is a research platform, but its underlying architecture points toward a future where surgical autonomy is not locked inside a multi-million-dollar console. As computer vision improves, as actuators become cheaper, as surgical AI algorithms mature and become deployable on more modest hardware, the cost curve will follow the same path that every computational technology has followed: exponentially downward. The smartphone in your pocket is more powerful than the supercomputers of three decades ago. There is no physical law preventing a surgical robot from becoming, eventually, cheaper than the surgeon it assists.
The equity question, then, is not whether the technology can be democratized. It is whether we will choose to democratize it. Whether the business models, the regulatory frameworks, the global health priorities will be designed to push surgical AI toward the places where it is most needed rather than the places where it is most profitable. Whether the five billion people without surgical access will be the technology’s primary audience or its afterthought.
The precedent is not encouraging. The da Vinci system has been commercially available for over two decades. It has transformed surgery in wealthy nations. It has done almost nothing for surgical access in the developing world. The technology followed the money. If autonomous surgical AI follows the same money, the equity gap will not narrow. It will deepen — with the added cruelty that the technology capable of closing the gap chose not to.
This is not inevitable. But it is the default trajectory, and defaults are powerful. Changing the default requires intention — the kind of intention that writes equity into the design specification, not the marketing brochure.
The Ghost Limb
There is one more dimension to the surgeon-machine relationship that I have not yet named, because it is the hardest to articulate. It is not about capability or safety or cost. It is about identity.
Surgery is, among all the medical specialties, the one most defined by the body of the practitioner. Surgeons train their hands for years. They develop muscle memory that is as individual as a fingerprint — the particular way they hold a needle driver, the angle at which they tie a knot, the rhythm of their dissection that other surgeons can recognize the way musicians recognize each other’s phrasing. A surgeon’s hands are not tools. They are extensions of their clinical mind, shaped by every case they have ever performed, carrying the accumulated knowledge of ten thousand gestures that succeeded and a hundred that did not.
What happens to this identity when the hands are no longer needed?
I do not ask this as a Luddite lament. I ask it as a genuine question about what medicine loses when it gains autonomous surgery. The da Vinci surgeon still operates — their hands move, their proprioceptive intelligence (however attenuated) still engages, their embodied skill still matters. The Level 4 surgeon monitors. The Level 5 surgeon, if such a thing ever exists, is absent from the procedure entirely.
This loss is not sentimental. It is functional. The surgeon who operates develops judgment through the body — through the ten thousand cases where muscle memory and conscious analysis fused into something greater than either alone. The surgeon who monitors develops judgment through observation — a different kind of intelligence, valuable in its own right, but not the same. The question is whether surgical judgment can survive the transition from embodied practice to disembodied oversight. Whether the attending who never operates can teach the resident who never will. Whether the accumulated haptic wisdom of centuries of surgical craft can be preserved in a form the machines can use, or whether it will be lost — a phantom limb, still felt, no longer there.
I do not have the answer. But I know that any honest account of surgical AI must ask the question, because the stakes are not only clinical. They are civilizational. Surgery is one of humanity’s oldest healing arts — the direct, physical intervention of one human body upon another in the service of repair. If we transfer that art to machines, we must be clear-eyed about what we are choosing to preserve and what we are choosing to release. And we must ensure that the release is a choice, made with full understanding, not a drift that happens while we are distracted by the technical marvel of a machine that can remove a gallbladder from a pig without human hands.
The Blade’s Edge
We have now traveled through four domains of medical AI. We have watched the machine read — interpreting scans, literature, and clinical data in Chapters 1 and 2. We have watched it diagnose — detecting disease with alien perception in Chapter 3. We have watched it fail — cascading through correlated errors that harmed a patient in Chapter 4. And now we have watched it act — taking physical form in the operating room, holding the blade, navigating living tissue.
Each domain has stressed the three principles differently. Diagnosis demanded transparency. Failure demanded adaptive resilience. Surgery demands all three simultaneously — transparency before the cut, augmentation during it, equity in deciding who receives it — and adds a fourth demand that the previous chapters only hinted at: the preservation of human skill and identity in an age when machines can do what human hands once monopolized.
The STAR robot removed a gallbladder from a pig. Somewhere in the next decade, a descendant of that robot will remove a gallbladder from a human. And then an appendix. And then a tumor. Each step will be validated, regulated, debated. Each step will save lives that would have been lost to surgical error, fatigue, or geographic isolation from skilled hands. Each step will also ask us who we want surgeons to be when they are no longer the ones cutting.
The operating room has always been a place where the stakes of medicine are at their most visceral — where the abstraction of illness becomes the concreteness of tissue, where theory meets the body, where the physician’s commitment is measured not in prescriptions but in the steadiness of their hands. Bringing AI into that room is not adding a tool to the table. It is inviting a new kind of intelligence into the most intimate space in medicine and asking it to share the work that defines us.
We have seen what happens when AI reads. When it diagnoses. When it fails. When it acts. The next frontier is stranger still. In the chapters ahead, we will follow AI out of the hospital entirely — into the laboratory, where it is redesigning the molecular search space of drug discovery, generating compounds that no chemist imagined and testing them at speeds that compress decades of pharmaceutical development into years. The machine moves from the operating room to the molecule — from the body it can touch to the one it must imagine.
Next: Chapter 6 — The Molecule as Patient: AI Reimagines Drug Discovery