Glossary
Annotation – A point or polygon used for training that has an associated label. Also referred to as "ground truth," especially when collected from the field.
Annotating – The process of assigning labels to points or polygons, often by marking features on imagery. Also referred to as labeling.
Area of Interest – A specific geospatial region (specified by a GeoJSON Polygon or MultiPolygon) with a start time and an end time.
Category – A named group used in classification tasks (e.g., water, forest, buildings). Also referred to as a class.
Classification – A machine learning task that assigns a discrete class label to a pixel or window (e.g., land cover type, presence of a feature). Pixel-based classification is also referred to as segmentation.
Class – A category the model predicts in a classification task. Also referred to as a category.
Class distribution – The proportion of training examples belonging to each class in a dataset.
Dataset – A collection of annotations.
F1 – A measure of a model's accuracy that combines precision and recall into a single number (their harmonic mean), ranging from 0 to 1.
Fine-tuning – The process of adapting a pretrained model using labeled data for a specific task.
Foundation model – A machine learning model trained on large and diverse datasets that can be adapted to many different tasks.
Ground truth – Verified labeled data used to train or evaluate a model.
Inference – The process of applying a trained model to new data to generate predictions. Doing so is also referred to as running inference.
Label – The value assigned to an annotation that the model learns to predict (e.g., crop type or biomass).
Metadata – Data associated with the point or polygon in an annotation. Metadata can be freeform or categorical text (string), boolean, or numerical data.
Model – A machine learning system that learns patterns from data to make predictions.
Model run – The process of running a fine-tuned model on an Area of Interest for a specific time or time period.
Modality – A type of input data used by the model, such as radar or optical satellite imagery.
Negative examples – Training samples representing features the model should learn not to classify as the target class.
Overfitting – When a model learns patterns specific to the training data that do not generalize well to new data.
Per-pixel prediction – A prediction generated for every pixel in an image. Also referred to as pixel-level prediction.
Prediction – The output produced by a model when analyzing input data.
Project – The collection of all datasets, annotations, labels, fine-tuned models, predictions, and prediction results for a given geospatial intelligence task.
Regression – A machine learning task that predicts a numeric value (e.g., biomass or soil moisture).
Revisit rate – How often a satellite collects imagery for the same location. Also referred to as temporal resolution.
Segmentation – A machine learning task that assigns a class label to every pixel in an image, producing a continuous map of features (e.g., land cover boundaries, crop type extents). Also referred to as pixel-based classification.
Spatial context – The surrounding area of imagery provided to the model to help interpret the prediction window. Also referred to as the context window.
Spatial distribution – The geographic spread of labeled data across an area of interest.
Spatial resolution – The size of the ground area represented by a pixel in an image.
Task – A collection of annotations to be labeled or reviewed.
Task type – The type of prediction the model is trained to perform (e.g., classification, regression, segmentation). Not to be confused with an annotation task (see "Task").
Temporal context – The time range of data the model uses when learning or making predictions.
Training – The process of teaching a model using labeled data.
Training data – The labeled data used to train a model.
Training split – The portion of a dataset used to train the model.
Validation split – The portion of a dataset used to monitor model performance and guide training.
Test split – The portion of a dataset used to evaluate the final model after training.
Window – A fixed area of imagery used as the unit of prediction. Also referred to as a scene or patch.
Window-level prediction – A prediction made for an entire window rather than individual pixels.
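
Two of the entries above, Class distribution and F1, describe simple arithmetic that may be easier to follow in code. The following is a minimal sketch, assuming labels are plain strings held in Python lists. The function names class_distribution and f1_score are hypothetical illustrations and are not part of any product API.

```python
from collections import Counter


def class_distribution(labels):
    """Return the proportion of examples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def f1_score(true_labels, predicted_labels, positive_class):
    """Compute F1 for one class as the harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: the class was predicted and is correct.
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    # False positives: the class was predicted but the true label differs.
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    # False negatives: the class was the true label but was not predicted.
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only; these values are made up for the example.
truth = ["water", "water", "forest", "buildings"]
predicted = ["water", "forest", "forest", "buildings"]

print(class_distribution(truth))
print(f1_score(truth, predicted, positive_class="water"))
```

Here F1 is computed for a single class at a time. Averaging the per-class scores into one number for a multi-class model is one possible extension, and whether that matches how a given tool reports F1 is an assumption to check.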