
Plant Phenotyping — Autonomous Root Inoculation Pipeline

Ana-Maria Farazica

This project sits at an unusual intersection: plant biology, computer vision, and robotics. The client was the Netherlands Plant Eco-phenotyping Centre (NPEC), a research facility that studies plant root systems at scale. Their problem was straightforward to describe and genuinely difficult to solve — automatically find the root tips of individual plants in Petri dish images, then direct a liquid-handling robot to inoculate each one with precision.

The dataset consisted of black-and-white images of Arabidopsis thaliana seedlings growing in Petri dishes, captured daily by the Hades phenotyping system. Over time the roots grow, branch, and start crossing each other, which made both segmentation and individual root detection increasingly challenging.

Sample raw images from the Hades system

Raw black-and-white images from the Hades system — each dish contains five Arabidopsis thaliana seedlings at different growth stages.

The Pipeline

The full system runs end-to-end from a raw image to a completed inoculation sequence:

Segmentation output showing root, shoot and seed masks

U-Net predictions for image 13 — red for roots, green for shoots, blue for seeds, with a blended overlay on the right.

Petri dish detection — Before any deep learning, traditional computer vision methods (Otsu thresholding, contour detection, bounding box extraction) crop the image down to just the dish. This keeps the model focused on the area that actually matters and reduces noise from the surrounding environment.
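The classical cropping step can be sketched without any deep learning machinery. The project used OpenCV (Otsu thresholding via `cv2.threshold` and `cv2.findContours` for contours), but the same idea fits in pure NumPy; the function names below are illustrative, not the project's actual code:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: pick the threshold that maximises
    between-class variance of the 8-bit grayscale histogram."""
    probs = np.bincount(gray.ravel(), minlength=256).astype(float) / gray.size
    bins = np.arange(256)
    w0 = np.cumsum(probs)                 # weight of the dark class
    w1 = 1.0 - w0
    mu0 = np.cumsum(probs * bins)         # unnormalised dark-class mean
    with np.errstate(divide="ignore", invalid="ignore"):
        m0 = mu0 / w0
        m1 = (mu0[-1] - mu0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
    return int(np.argmax(np.nan_to_num(between)))

def crop_to_dish(gray):
    """Binarise with Otsu, then crop to the foreground bounding box."""
    mask = gray > otsu_threshold(gray)
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return gray[y0:y1, x0:x1], (y0, y1, x0, x1)
```

In the real pipeline the bounding box comes from the dish contour rather than all foreground pixels, but the cropping effect is the same: the model only ever sees the dish.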

Semantic segmentation with U-Net — A U-Net model trained on annotated images from three datasets (Y2B_23, Y2B_24, Y2B_25) predicts pixel-level masks for three classes: roots, shoots, and seeds. The model reached a validation F1 of 0.875 and a validation loss of 0.122.
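For reference, a pixel-level F1 over the three foreground classes can be computed directly from label maps. The project's exact metric definition isn't stated here, so this is an illustrative macro-averaged version:

```python
import numpy as np

def pixel_f1(y_true, y_pred, num_classes=4):
    """Macro-averaged pixel F1 over foreground classes (0 = background).
    A class absent from both maps contributes a perfect score of 1.0."""
    scores = []
    for c in range(1, num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 1.0)
    return float(np.mean(scores))
```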


Instance segmentation — With the root mask produced, the next step is separating the five individual plants. This was done using a watershed-based approach seeded from the seed mask predictions, which gave each plant a distinct region to work within.
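The seeding principle can be illustrated with a plain breadth-first flood fill: each seed expands outward one ring per step over the root mask, so every root pixel gets the label of whichever plant's seed reaches it first. This is a simplified stand-in for the marker-controlled watershed, not the project's actual implementation:

```python
from collections import deque
import numpy as np

def grow_from_seeds(root_mask, seed_points):
    """Assign each root pixel to the nearest seed by multi-source BFS."""
    labels = np.zeros(root_mask.shape, dtype=int)
    q = deque()
    for lab, (r, c) in enumerate(seed_points, start=1):
        labels[r, c] = lab          # one seed (from the seed mask) per plant
        q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < root_mask.shape[0] and 0 <= nc < root_mask.shape[1]
                    and root_mask[nr, nc] and labels[nr, nc] == 0):
                labels[nr, nc] = labels[r, c]   # inherit the seed's label
                q.append((nr, nc))
    return labels
```

Because all seeds expand at the same rate, ties at plant boundaries resolve by distance, which is exactly where the real watershed earns its keep once roots start crossing.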

Primary root extraction — For each plant, the primary root is identified by tracing the longest connected path from the attachment point (where root meets shoot) down to the root tip. The tip coordinates are the output that feeds into the robotics component.
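The longest-path trace can be sketched as a BFS over skeleton pixels: starting from the attachment point, the last pixel dequeued is the farthest reachable one, which serves as the root tip, and backtracking through parents recovers the path. A minimal sketch, assuming a clean 8-connected skeleton:

```python
from collections import deque
import numpy as np

def trace_primary_root(skeleton, attachment):
    """BFS from the attachment point over skeleton pixels.
    Returns (path from attachment to tip, tip coordinates)."""
    H, W = skeleton.shape
    parent = {attachment: None}
    q = deque([attachment])
    last = attachment
    while q:
        r, c = q.popleft()
        last = (r, c)               # BFS order: last dequeued is farthest
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if ((dr or dc) and 0 <= nr < H and 0 <= nc < W
                        and skeleton[nr, nc] and (nr, nc) not in parent):
                    parent[(nr, nc)] = (r, c)
                    q.append((nr, nc))
    path, node = [], last           # backtrack tip -> attachment
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], last
```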

Primary root extraction results for all five plants

Primary root paths traced for each of the five plants — each colour corresponds to one plant, with the root tip marked by a circle.

PID controller integration — The root tip pixel coordinates are transformed into robot workspace coordinates, and a PID controller moves the Opentrons OT-2 pipette to each location to perform the inoculation.
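A minimal discrete PID loop of the kind described looks like this. The gains, timestep, and the simple velocity-integration plant model are illustrative assumptions, not the values used with the OT-2:

```python
class PID:
    """Discrete PID controller: u = Kp*e + Ki*sum(e*dt) + Kd*de/dt."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

def move_to(target, pos, pid, dt=0.01, steps=2000, tol=1e-3):
    """Drive one axis toward `target`; PID output acts as velocity."""
    for _ in range(steps):
        error = target - pos
        if abs(error) < tol:
            break
        pos += pid.update(error, dt) * dt
    return pos
```

In the real pipeline each root tip's pixel coordinates are first mapped into the robot workspace, and a loop like this runs per axis until the pipette settles over the tip.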

PID controller autonomously inoculating root tips

The Opentrons OT-2 simulation moving to each root tip location in sequence, controlled by the PID pipeline.

Root tip coordinates overlaid on segmented plants

Final root tip coordinates overlaid on the cropped Petri dish — these pixel positions are what gets passed to the robot.

The Hard Parts

Instance segmentation

Getting five clean, separated plant masks out of a single root mask was the most technically frustrating part of the project. The roots of different plants grow close together and eventually cross, which means any approach based purely on spatial proximity starts to break down.

The first attempts used distance transform maxima as watershed seeds — reasonable in theory, but the seed logic kept producing too many or too few markers, causing plants to either merge or split incorrectly. Marker-controlled watershed helped but introduced its own problem: when the unknown region boundaries weren’t defined precisely, the algorithm would drift into background areas. Centroid-based assignment caused label bleeding across plant boundaries. Each approach had a specific failure mode, and the final solution was essentially a fusion of the most reliable parts of each attempt, using seed mask predictions as stable anchors and combining multiple segmentation cues to assign pixels to plants more reliably.

Primary root detection

Finding the primary root specifically — not just any root, not all roots — required building a skeleton of the root system and then identifying the longest continuous path from the attachment point. This sounds clean but in practice the skeletonized masks had gaps, spurious branches, and noise that all interfered with path tracing. Morphological gap closing helped significantly, but the radius had to be tuned carefully: too small and the path broke; too large and separate roots merged together.
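The radius trade-off shows up clearly in a toy morphological closing. The project would have used standard OpenCV/SciPy morphology; this pure-NumPy sliding-window version just makes the dilate-then-erode structure explicit:

```python
import numpy as np

def binary_close(mask, radius):
    """Morphological closing (dilate, then erode) with a square element
    of side 2*radius + 1. Small radii leave skeleton gaps unbridged;
    large radii can fuse neighbouring roots into one blob."""
    k = 2 * radius + 1
    # Dilation: a pixel is on if ANY pixel in its k x k window is on
    padded = np.pad(mask, radius, constant_values=False)
    dilated = np.lib.stride_tricks.sliding_window_view(padded, (k, k)).any(axis=(2, 3))
    # Erosion: a pixel stays on only if ALL pixels in its window are on
    padded = np.pad(dilated, radius, constant_values=True)
    return np.lib.stride_tricks.sliding_window_view(padded, (k, k)).all(axis=(2, 3))
```

With `radius=1` a one-pixel gap in a root gets bridged while the surrounding thickening from dilation is eroded away again; push the radius much higher and two parallel roots within the kernel width merge.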

A key realisation was that the shoot and seed masks weren’t just useful for segmentation — they were essential for anchoring where each plant’s root system actually started. Without that reference point, distinguishing primary from lateral roots became much harder.

What Actually Moved the Score

The biggest single improvement came from fixing the training data, not the model. The initial U-Net runs were bottlenecked by imbalanced patch sampling — most patches were background, so the model learned to predict background well and roots poorly. Rebalancing to 80% plant patches and 20% background in the training set pushed the validation F1 from underwhelming to 0.875 on the first rebalanced run, without changing the architecture at all.
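The rebalancing idea can be sketched as a sampler that keeps drawing random patches until the plant/background quota is met. The patch size, the "any foreground pixel" criterion, and the function name are illustrative assumptions:

```python
import numpy as np

def sample_patches(image, mask, patch=64, n=100, plant_frac=0.8, rng=None):
    """Draw n training patches so that ~plant_frac of them contain
    at least one foreground (plant) pixel in the mask."""
    rng = rng or np.random.default_rng(0)
    H, W = mask.shape
    p_target = int(n * plant_frac)
    b_target = n - p_target
    plant, background = [], []
    while len(plant) < p_target or len(background) < b_target:
        r = int(rng.integers(0, H - patch + 1))
        c = int(rng.integers(0, W - patch + 1))
        m = mask[r:r + patch, c:c + patch]
        if m.any():
            if len(plant) < p_target:
                plant.append((image[r:r + patch, c:c + patch], m))
        elif len(background) < b_target:
            background.append((image[r:r + patch, c:c + patch], m))
    return plant + background
```

The point of the exercise: nothing about the U-Net changes, only the distribution of patches it sees, which is what moved the F1 here.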

That result made a broader point that stuck with me: when a model is underperforming, the first question should be about the data, not the architecture.

For the Kaggle competition, the pipeline achieved a private leaderboard sMAPE of 14.777%, well within the target threshold of 45%, though not quite under the 10% stretch goal.
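For context, sMAPE in its common percentage form is computed as below. Kaggle competitions vary in the exact denominator convention, so treat this as the standard textbook definition rather than the competition's exact scorer:

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.
    Per element: |t - p| / ((|t| + |p|) / 2); a 0/0 pair counts as 0."""
    t = np.asarray(y_true, dtype=float)
    p = np.asarray(y_pred, dtype=float)
    denom = (np.abs(t) + np.abs(p)) / 2.0
    safe = np.where(denom == 0.0, 1.0, denom)
    ratio = np.where(denom == 0.0, 0.0, np.abs(t - p) / safe)
    return 100.0 * float(ratio.mean())
```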


Reflections

This was the most technically layered project I had worked on up to that point — it combined image annotation, deep learning, classical computer vision, root system analysis, simulation environments, PID control, and system integration, all within one block. The robotics component was genuinely new territory for me, and building a PID controller from scratch, tuning the gains, and then watching the pipette navigate to the right coordinates was one of those moments where everything feels like it connects.

The instance segmentation and root detection challenges also taught me something about how to work through hard problems. The instinct is usually to try a more complex approach when something fails. Most of the actual progress here came from slowing down, figuring out exactly why something was failing, and making a targeted fix rather than a wholesale replacement.

  • Python
  • TensorFlow / Keras
  • U-Net
  • OpenCV
  • PID Controller
  • Opentrons OT-2
  • Watershed Segmentation
  • NumPy
  • Matplotlib
  • Kaggle