Week 22 · Space GIS Architect

ML for satellite imagery: CNNs and U-Net segmentation

Deep learning has rewritten remote sensing. CNNs (object detection) and U-Nets (semantic segmentation) are now standard. This week you train one on real GOES data.

Learning objectives

- Decide when deep learning beats threshold rules for satellite imagery analysis
- Explain the U-Net encoder-decoder architecture and the role of skip connections
- Generate segmentation labels programmatically with weak supervision
- Evaluate segmentation models with IoU and per-pixel confusion matrices

Primer

Deep learning has rewritten the playbook for satellite imagery analysis over the past decade. Convolutional neural networks now do object detection, semantic segmentation, super-resolution, and change detection at production scale across every major Earth-observation platform. This week is the practical primer: when to use deep learning vs threshold rules, the U-Net architecture, and how to train one on real GOES data.

When deep learning beats thresholding

Threshold rules (Week 14's Band 7 > 320 K) work when the discriminator is a single scalar feature. They break down when:

- The signal depends on spatial context (shape, texture, growth over frames) rather than one pixel value
- Multiple confusers share the scalar signature — wildfires, industrial hotspots, and plumes can all clear 320 K
- Conditions vary (season, background temperature, viewing geometry), so no single cutoff works everywhere

Threshold rules are great for fast, explainable, debuggable baseline detection. Deep learning shines for the next layer: scoring, classification, and segmentation refinement.
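That baseline layer is cheap to state in code. A minimal sketch of a Week 14-style rule (the function name and toy data here are illustrative, not the course's actual pipeline code):

```python
import numpy as np

def threshold_detect(band7_k, threshold=320.0):
    """Baseline hotspot mask: Band 7 brightness temperature above a cutoff.

    band7_k: 2-D array of brightness temperatures in kelvin.
    Returns a boolean mask the same shape as the input.
    """
    return band7_k > threshold

# Toy 4x4 scene: cool background with one hot pixel
scene = np.full((4, 4), 290.0)
scene[2, 1] = 335.0
mask = threshold_detect(scene)
```

One comparison, fully explainable — and exactly the kind of rule that a wildfire or hot rooftop will also trip, which is where the learned layer comes in.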

The U-Net architecture

U-Net (Ronneberger et al. 2015) is the workhorse for image segmentation in remote sensing. It's an encoder-decoder with skip connections:

- Encoder: repeated conv blocks with 2×2 max-pooling halve the spatial resolution while deepening the features
- Bottleneck: the coarsest, most abstract representation of the scene
- Decoder: transposed convolutions upsample back toward full resolution
- Skip connections: each decoder stage concatenates the matching encoder output, restoring the fine spatial detail that pooling threw away

The output is a same-size map of per-pixel class probabilities. For plume segmentation, the classes are {background, plume}; for multi-class fire/plume/cloud, expand accordingly.

import torch.nn as nn

class UNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1, n_features=32):
        super().__init__()
        # ... 4 down blocks + bottleneck + 4 up blocks ...
        # Each block: Conv3x3 → BatchNorm → ReLU → Conv3x3 → BatchNorm → ReLU
        # Down: Conv block + MaxPool2x2
        # Up: ConvTranspose2x2 + concat with skip + Conv block

Weak supervision

The training-data problem: who hand-labels rocket plumes in tens of thousands of GOES frames? Nobody. The trick is weak supervision — generate the training labels programmatically.

For plumes: run Week 14's threshold detector + Week 20's morphology cleanup over a year of GOES frames around known launches. Cross-check against the published launch schedule. Use those pixel masks as training labels. The labels are noisy (some false positives, some false negatives), but with enough volume the U-Net learns to denoise — it picks up on spatial context the threshold rule can't see.
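That label-generation pipeline can be sketched as follows — assuming NumPy/SciPy; `weak_labels`, the `launch_active` schedule gate, and the size cutoff are illustrative stand-ins for the course's Week 14 + Week 20 pipeline:

```python
import numpy as np
from scipy import ndimage

def weak_labels(band7_k, launch_active, threshold=320.0, min_pixels=3):
    """Programmatic plume labels: threshold + morphology, gated by the launch schedule.

    band7_k: 2-D brightness-temperature frame (kelvin).
    launch_active: was a launch under way for this frame (from the published schedule)?
    """
    if not launch_active:
        # No launch -> every pixel labeled background, whatever the detector says
        return np.zeros_like(band7_k, dtype=bool)
    mask = band7_k > threshold
    mask = ndimage.binary_opening(mask)        # Week 20-style cleanup: drop speckle
    # Keep only connected components large enough to plausibly be a plume
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    keep_ids = [i + 1 for i, s in enumerate(sizes) if s >= min_pixels]
    return np.isin(labels, keep_ids)
```

The schedule gate is what makes the labels usable: false positives away from launch windows are labeled background for free, and the network inherits that correction.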

Evaluation: IoU and confusion matrices

For segmentation, accuracy is misleading: a network that predicts "no plume anywhere" scores 99.99% accuracy, because almost every pixel really is background. Use instead:

- IoU (intersection over union): overlap between the predicted and ground-truth masks — the standard segmentation score
- Per-pixel confusion matrix: TP/FP/FN/TN counts, from which precision, recall, and the false-positive rate fall out
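Both metrics are a few lines over boolean masks. A minimal sketch (function names are illustrative; the empty-union convention of IoU = 1 when both masks are empty is a choice, not a standard):

```python
import numpy as np

def iou(pred, truth):
    """Intersection over union for boolean masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0  # both empty -> perfect by convention

def confusion(pred, truth):
    """Per-pixel (TP, FP, FN, TN) counts."""
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    return tp, fp, fn, tn
```

Note that the degenerate "no plume anywhere" predictor gets IoU = 0 on any frame that contains a plume, which is exactly the failure accuracy hides.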

Small models, not big

For thermal plume segmentation in 200×200 pixel windows, a 32-feature U-Net (~1M parameters) is more than enough. Don't reach for big pretrained models — they need huge training sets, they're slow to deploy, and the feature distribution of satellite imagery is far enough from ImageNet that pretrained weights help less than you'd expect.

The lab

You'll generate weakly-supervised training data from threshold detections + morphology over a year of GOES Band 7 frames, train a small U-Net in PyTorch, evaluate on held-out launches with IoU + confusion matrices, and confirm the per-pixel false-positive rate is below 5%. This is the architecture for LaunchDetect's "Layer 3" classifier — the production model that scores threshold-detected hotspots for plume-vs-fire-vs-noise.

Hands-on lab: U-Net for plume segmentation

Generate training data from threshold-detected plumes in GOES Band 7. Train a small U-Net to segment plume pixels. Evaluate on held-out launches with IoU and confusion matrices.
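The training step of the lab might look like the following sketch — a single conv layer stands in for the U-Net, random tensors stand in for real GOES batches, and the `pos_weight` value is an illustrative guess at the class-imbalance correction:

```python
import torch
import torch.nn as nn

# Stand-in for the U-Net: any module mapping (N, 1, H, W) -> per-pixel logits fits here
model = nn.Conv2d(1, 1, 3, padding=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# pos_weight up-weights the rare class: plume pixels are a tiny fraction of each frame
loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(10.0))

# Hypothetical batch: frames and weak labels would come from the GOES pipeline
frames = torch.randn(8, 1, 64, 64)
labels = (torch.rand(8, 1, 64, 64) < 0.01).float()  # sparse plume mask

for epoch in range(3):
    opt.zero_grad()
    loss = loss_fn(model(frames), labels)  # logits in, mask out
    loss.backward()
    opt.step()
```

`BCEWithLogitsLoss` combines the sigmoid and the binary cross-entropy in one numerically stable op, which is why the model emits raw logits rather than probabilities.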

Quiz

Test yourself. Answer key on the certificate-track page (Gold-tier feature: progress tracking and auto-grading).

Q1. U-Net architecture is:
  1. Encoder-decoder with skip connections, ideal for segmentation
  2. Just a CNN
  3. Recurrent
  4. Transformer-only
Q2. IoU (intersection over union) measures:
  1. Overlap between predicted and ground-truth mask
  2. Loss only
  3. Reprojection error
  4. Compression ratio
Q3. Generating training data via thresholding is called:
  1. Weak supervision (programmatic labels)
  2. Manual labeling
  3. Synthetic data
  4. Augmentation
Q4. CNNs work well on images because:
  1. Translation invariance and locality
  2. They're newest
  3. Only choice
  4. Marketing
Q5. Why use a small U-Net (not a giant model)?
  1. Faster inference, less overfitting on small training sets, deployable to edge
  2. Always smaller is worse
  3. Required by law
  4. No reason