Production geospatial isn't a notebook. It's a pipeline: data lands somewhere, code runs, results land somewhere else, the pipeline is monitored, alerts fire when something breaks. This week is the AWS architecture that LaunchDetect actually runs in production — minus a few enterprise-specific layers.
Four services, each doing one thing well: S3 for ingest, Lambda for compute, EventBridge for scheduling and events, DynamoDB for state.
LaunchDetect's flow:
A new GOES file lands at s3://noaa-goes18/... → scorer Lambda → publisher Lambda, which decides whether the candidate is a real launch (vs. fire / glint / industrial source), writes the public detection JSON to S3, and emits a "launch detected" event to EventBridge. Total latency from NOAA file landing to a push notification on a user's phone: typically 30–90 seconds.
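The publisher's "launch detected" event boils down to one EventBridge entry. A minimal sketch of building that entry in Python — the source name, detail-type string, and detail fields here are illustrative assumptions, not LaunchDetect's real schema:

```python
import json
from datetime import datetime, timezone

def launch_detected_entry(detection_id: str, lat: float, lon: float) -> dict:
    # Shape of an EventBridge PutEvents entry (Source/DetailType/Detail/Time
    # are the real API field names; the values are hypothetical).
    return {
        "Source": "launchdetect.publisher",
        "DetailType": "launch.detected",
        "Time": datetime.now(timezone.utc).isoformat(),
        "Detail": json.dumps({"detection_id": detection_id, "lat": lat, "lon": lon}),
    }

entry = launch_detected_entry("det-0001", 28.56, -80.57)
# In the publisher Lambda this would be sent with:
#   boto3.client("events").put_events(Entries=[entry])
```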
DynamoDB's #1 footgun is hot partitions. Every item has a partition key (PK) and optionally a sort key (SK). DynamoDB hashes PK and routes the item to a physical partition. If 90% of your writes go to a single PK, you bottleneck on that one partition's WCU/RCU limit (3,000 reads / 1,000 writes per second).
Good PK choices spread writes evenly across partitions. For launch detections, a natural PK is DETECTION#{ulid} — ULIDs are time-ordered but have enough entropy that they distribute evenly. Bad PK: DATE#{yyyy-mm-dd} — all today's writes go to one partition.
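The difference between the two key shapes is easy to simulate. A sketch — md5 stands in for DynamoDB's internal hash function, and uuid4 stands in for a ULID (both are assumptions for illustration):

```python
import hashlib
from datetime import datetime, timezone
from uuid import uuid4

def partition_of(pk: str, n_partitions: int = 8) -> int:
    # DynamoDB hashes the partition key to route the item to a physical
    # partition; md5 here simulates that routing, it is not the real hash.
    return int(hashlib.md5(pk.encode()).hexdigest(), 16) % n_partitions

# Bad PK: every write today shares one key, so one partition takes all writes.
bad_keys = [f"DATE#{datetime.now(timezone.utc):%Y-%m-%d}" for _ in range(1000)]
# Good PK: a per-detection key with entropy spreads writes across partitions.
good_keys = [f"DETECTION#{uuid4().hex}" for _ in range(1000)]

bad_partitions = {partition_of(k) for k in bad_keys}
good_partitions = {partition_of(k) for k in good_keys}
print(len(bad_partitions))   # 1 — every write lands on the same partition
print(len(good_partitions))  # close to 8 — writes spread out
```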
When Lambda receives a request and has no warm container available, it cold-starts: provision a sandbox, download the function code, initialize the runtime, run the handler. Cold start can be 200 ms (Python 3.13 lightweight) to 3+ seconds (heavy Java / large dependency tree).
For latency-sensitive request paths (API endpoints), cold start matters and you mitigate with: provisioned concurrency, smaller deployment packages, lighter runtimes, lazy imports. For event-driven batch (which is most space-GIS pipelines), cold start is fine — a launch detection that takes 90 seconds doesn't care about 500 ms cold start.
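Lazy imports, the cheapest of those mitigations, look like this inside a handler. A minimal sketch — `statistics` stands in for a genuinely heavy dependency like numpy or xarray, and the event shape is hypothetical:

```python
# Anything at module scope runs during cold-start init; keep it light.

def handler(event, context=None):
    # Lazy import: heavy dependencies load only when the handler actually
    # runs, not during cold-start init. 'statistics' is a stand-in here.
    import statistics
    values = event["radiances"]
    return {"max": max(values), "mean": round(statistics.fmean(values), 2)}

print(handler({"radiances": [290.0, 310.5, 405.2]}))
# → {'max': 405.2, 'mean': 335.23}
```

The trade-off: the first invocation that hits the lazy import pays the load cost instead, which is fine for event-driven batch and wrong for hot request paths.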
AWS CDK (Cloud Development Kit) is infrastructure-as-code in real programming languages — TypeScript, Python, Java, Go. You write classes that instantiate AWS resources; CDK synthesizes them to CloudFormation templates; CloudFormation deploys them.
import * as cdk from 'aws-cdk-lib';
import { Bucket, EventType } from 'aws-cdk-lib/aws-s3';
import { Function, Runtime, Code } from 'aws-cdk-lib/aws-lambda';
import { S3EventSource } from 'aws-cdk-lib/aws-lambda-event-sources';
import { Table, AttributeType } from 'aws-cdk-lib/aws-dynamodb';

export class DetectionStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    // Bucket where ingested GOES files land.
    const bucket = new Bucket(this, 'IngestBucket');

    // Detection records, keyed by pk/sk (see the hot-partition notes above).
    const table = new Table(this, 'Detections', {
      partitionKey: { name: 'pk', type: AttributeType.STRING },
      sortKey: { name: 'sk', type: AttributeType.STRING }
    });

    // Python Lambda that scores hotspot candidates.
    const scorer = new Function(this, 'Scorer', {
      runtime: Runtime.PYTHON_3_13,
      handler: 'handler.handler',
      code: Code.fromAsset('lambda/scorer')
    });

    // Fire the scorer on every new object in the ingest bucket.
    scorer.addEventSource(new S3EventSource(bucket, {
      events: [EventType.OBJECT_CREATED]
    }));

    // Least-privilege grant: the scorer can write detections, nothing more.
    table.grantWriteData(scorer);
  }
}
You'll build a mini detection pipeline: a Lambda triggered by S3 PutObject, that reads a small GOES NetCDF, threshold-detects hotspots, and writes detection records to a DynamoDB table. Deploy with AWS CDK. This is the architecture of LaunchDetect's artgis-cluster-scorer Lambda in production, minus the ML scoring layer and parallax correction.
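The threshold-detection core of that Lambda can be sketched in pure Python. This is an illustrative sketch, not LaunchDetect's scorer: the 400.0 threshold is a made-up value, and production code would run numpy over the NetCDF's radiance array rather than loop over lists:

```python
def detect_hotspots(radiance, threshold=400.0):
    # Flag every pixel whose value exceeds the threshold as a hotspot
    # candidate. 'radiance' is a 2-D grid (list of rows) of floats.
    hits = []
    for row, line in enumerate(radiance):
        for col, value in enumerate(line):
            if value > threshold:
                hits.append({"row": row, "col": col, "value": value})
    return hits

grid = [[290.0, 295.1],
        [412.7, 301.4]]
print(detect_hotspots(grid))
# → [{'row': 1, 'col': 0, 'value': 412.7}]
```

In the full exercise, each hit becomes one DynamoDB item written by the Lambda.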
Test yourself. Answer key on the certificate-track page (Gold-tier feature: progress tracking and auto-grading).