Upload a GeoJSON Dataset

tip

OlmoEarth API Endpoints

OlmoEarth API endpoints are documented in the Interactive OlmoEarth API Browser.

You can use it to understand request and response schemas, make requests directly to the API, and generate client code in many common languages and libraries.

The following OlmoEarth API endpoints are used in this guide:

  • POST /api/v1/datasets/upload-samples
  • GET /api/v1/datasets/{dataset_id}

This guide explains how to create a dataset by uploading a GeoJSON file containing geometries and associated metadata. The GeoJSON format supports complex geometry types, including polygons and lines, which makes it well suited to representing areas, boundaries, or linear features.

Prerequisites

API Token

To get your API Token, see Authentication.

Project ID

Datasets must be uploaded to a Project. To get a Project ID, view Your Projects, select your project, and copy the ID from the URL.

For example, if your project URL is:

https://olmoearth.allenai.org/projects/7e160260-5a5a-4120-ab33-8ce15998b982/tasks

Then your Project ID is 7e160260-5a5a-4120-ab33-8ce15998b982
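If you prefer to extract the ID programmatically, the following is a minimal sketch. The helper name project_id_from_url is hypothetical, and the UUID check is an assumption based only on the shape of the example ID above.

```python
import uuid
from urllib.parse import urlparse


def project_id_from_url(project_url: str) -> str:
    """Return the path segment that follows 'projects' in a project URL."""
    segments = [s for s in urlparse(project_url).path.split("/") if s]
    index = segments.index("projects")  # raises ValueError if 'projects' is absent
    candidate = segments[index + 1]
    # ASSUMPTION: Project IDs are UUIDs, as the example ID suggests.
    uuid.UUID(candidate)  # raises ValueError if the segment is not UUID-shaped
    return candidate


url = "https://olmoearth.allenai.org/projects/7e160260-5a5a-4120-ab33-8ce15998b982/tasks"
print(project_id_from_url(url))
```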

GeoJSON Format

Your GeoJSON file must be a FeatureCollection where each feature includes:

Required Properties

  • task_name - Name for grouping features into tasks

Optional Properties

  • start_time - Start of the time range in ISO 8601 format (e.g., 2023-01-15T00:00:00Z)
  • end_time - End of the time range in ISO 8601 format
  • Any additional metadata properties you want associated with each feature

Supported Geometry Types

  • Point
  • Polygon
  • MultiPolygon
  • LineString
  • MultiLineString
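Before uploading, it can help to check a file against the rules above. The sketch below is one possible local pre-check, not the validation the API itself performs; the function name find_problems is hypothetical, and the time check only covers timestamps in the format shown in this guide.

```python
from datetime import datetime

SUPPORTED_GEOMETRY_TYPES = {
    "Point", "Polygon", "MultiPolygon", "LineString", "MultiLineString",
}


def _parse_iso8601(value: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" on Python 3.11+,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_problems(collection: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if collection.get("type") != "FeatureCollection":
        problems.append("top-level type must be FeatureCollection")
    for i, feature in enumerate(collection.get("features", [])):
        props = feature.get("properties") or {}
        geometry = feature.get("geometry") or {}
        if not props.get("task_name"):
            problems.append(f"feature {i}: missing task_name")
        if geometry.get("type") not in SUPPORTED_GEOMETRY_TYPES:
            problems.append(
                f"feature {i}: unsupported geometry type {geometry.get('type')!r}"
            )
        for key in ("start_time", "end_time"):
            if key in props:
                try:
                    _parse_iso8601(props[key])
                except (AttributeError, TypeError, ValueError):
                    problems.append(f"feature {i}: {key} is not ISO 8601")
    return problems
```

To use it, load your file with json.load and pass the resulting dictionary to find_problems.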

Example GeoJSON File

Here's an example GeoJSON file for monitoring agricultural fields:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-120.6234, 47.1523],
          [-120.6198, 47.1523],
          [-120.6198, 47.1489],
          [-120.6234, 47.1489],
          [-120.6234, 47.1523]
        ]]
      },
      "properties": {
        "task_name": "Yakima_Valley_Block_A",
        "start_time": "2023-04-01T00:00:00Z",
        "end_time": "2023-09-30T00:00:00Z",
        "crop_type": "Wheat",
        "field_id": "YV-2023-001",
        "irrigation_type": "Center Pivot"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-120.6198, 47.1523],
          [-120.6162, 47.1523],
          [-120.6162, 47.1489],
          [-120.6198, 47.1489],
          [-120.6198, 47.1523]
        ]]
      },
      "properties": {
        "task_name": "Yakima_Valley_Block_A",
        "start_time": "2023-04-01T00:00:00Z",
        "end_time": "2023-09-30T00:00:00Z",
        "crop_type": "Corn",
        "field_id": "YV-2023-002",
        "irrigation_type": "Drip"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-120.5987, 47.1345],
          [-120.5951, 47.1345],
          [-120.5951, 47.1311],
          [-120.5987, 47.1311],
          [-120.5987, 47.1345]
        ]]
      },
      "properties": {
        "task_name": "Yakima_Valley_Block_B",
        "start_time": "2023-04-01T00:00:00Z",
        "end_time": "2023-09-30T00:00:00Z",
        "crop_type": "Alfalfa",
        "field_id": "YV-2023-003",
        "irrigation_type": "Flood"
      }
    }
  ]
}

In this example:

  • Each feature represents an agricultural field with polygon boundaries
  • Features are grouped into two tasks: Yakima_Valley_Block_A and Yakima_Valley_Block_B
  • Additional metadata (crop_type, field_id, irrigation_type) is included for each field
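To preview how features will be grouped before you upload, you can group them by task_name locally. This is only an illustrative sketch: features_by_task is a hypothetical helper, and the sample data is a trimmed copy of the example above with geometries set to null for brevity.

```python
from collections import defaultdict


def features_by_task(collection: dict) -> dict[str, list[dict]]:
    """Group the features of a FeatureCollection by their task_name property."""
    groups = defaultdict(list)
    for feature in collection["features"]:
        groups[feature["properties"]["task_name"]].append(feature)
    return dict(groups)


# Trimmed copy of the example file: geometries omitted, one metadata field kept.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"task_name": "Yakima_Valley_Block_A", "field_id": "YV-2023-001"}},
        {"type": "Feature", "geometry": None,
         "properties": {"task_name": "Yakima_Valley_Block_A", "field_id": "YV-2023-002"}},
        {"type": "Feature", "geometry": None,
         "properties": {"task_name": "Yakima_Valley_Block_B", "field_id": "YV-2023-003"}},
    ],
}

for task, features in features_by_task(sample).items():
    field_ids = [f["properties"]["field_id"] for f in features]
    print(task, field_ids)
```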

Components of the Request

1. Input File

The GeoJSON file containing your georeferenced features.

2. Sources and Time Ranges

Define the satellite imagery sources to include for each task. See Creating a Gridded Dataset for details on available sources.

3. Buffer Size

Specify the buffer size, in meters, around each geometry. This determines how much additional area beyond the geometry boundaries is included in the imagery. The default is 500 meters.

note

For polygon and line geometries, the buffer extends outward from the geometry edges.

Example Request

Below is a complete example using curl to upload a GeoJSON dataset. Visit the API Endpoint Documentation for a complete schema and sample requests in other languages or libraries.

curl -X POST "https://olmoearth.allenai.org/api/v1/datasets/upload-samples" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "input_file=@agricultural_fields.geojson" \
  -F 'source_time_ranges=[
    {
      "source": "sentinel2",
      "start_time": "2023-04-01T00:00:00Z",
      "end_time": "2023-09-30T00:00:00Z",
      "count": 10
    },
    {
      "source": "landsat",
      "start_time": "2023-04-01T00:00:00Z",
      "end_time": "2023-09-30T00:00:00Z",
      "count": 5
    }
  ]' \
  -F "name=Agricultural Fields - Growing Season 2023" \
  -F "project_id=7e160260-5a5a-4120-ab33-8ce15998b982" \
  -F "buffer_size=500" \
  -F "resolution=10.0"

note

This endpoint uses multipart/form-data encoding since it includes file uploads. The source_time_ranges parameter must be provided as a JSON string.
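The same request can be sketched in Python. This is a rough equivalent of the curl command, not official client code: build_form_fields and upload_geojson are hypothetical helpers, the third-party requests library is assumed to be installed, and treating the response body as JSON is an assumption. The form field names are taken from the curl example.

```python
import json

API_URL = "https://olmoearth.allenai.org/api/v1/datasets/upload-samples"


def build_form_fields(name, project_id, source_time_ranges,
                      buffer_size=500, resolution=10.0):
    """Build the non-file form fields used in the curl example."""
    return {
        "name": name,
        "project_id": project_id,
        # source_time_ranges must be sent as a JSON string, not nested form data.
        "source_time_ranges": json.dumps(source_time_ranges),
        "buffer_size": str(buffer_size),
        "resolution": str(resolution),
    }


def upload_geojson(path, api_token, fields):
    """POST the file and fields as multipart/form-data."""
    import requests  # third-party; imported lazily so the helper above has no dependency

    with open(path, "rb") as handle:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_token}"},
            data=fields,
            files={"input_file": handle},
        )
    response.raise_for_status()
    return response.json()  # ASSUMPTION: the API answers with a JSON body


fields = build_form_fields(
    name="Agricultural Fields - Growing Season 2023",
    project_id="7e160260-5a5a-4120-ab33-8ce15998b982",
    source_time_ranges=[
        {"source": "sentinel2", "start_time": "2023-04-01T00:00:00Z",
         "end_time": "2023-09-30T00:00:00Z", "count": 10},
        {"source": "landsat", "start_time": "2023-04-01T00:00:00Z",
         "end_time": "2023-09-30T00:00:00Z", "count": 5},
    ],
)
# upload_geojson("agricultural_fields.geojson", "YOUR_API_TOKEN", fields)
```

Passing both data and files to requests.post makes the library encode the request as multipart/form-data, which matches what the -F flags do in curl.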

Checking Dataset Status

The dataset will progress through several stages as it builds:

  • pending - Dataset creation has been queued
  • acquiring - Satellite imagery is being acquired
  • ingesting - Data is being processed and ingested
  • completed - Dataset is ready for use

You can monitor the dataset progress using GET /api/v1/datasets/{dataset_id}.
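A simple polling loop might look like the sketch below. It relies on several assumptions that this guide does not confirm: that the response is JSON, that the stage is exposed in a field named status, and that the endpoint accepts the same Bearer token as the upload. Check the API Endpoint Documentation for the actual response schema. The helper names are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://olmoearth.allenai.org/api/v1/datasets"
IN_PROGRESS = {"pending", "acquiring", "ingesting"}


def fetch_dataset(dataset_id, api_token):
    """GET /api/v1/datasets/{dataset_id} and decode the body as JSON."""
    request = urllib.request.Request(
        f"{BASE_URL}/{dataset_id}",
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_completed(fetch, interval_seconds=60.0, max_polls=1000,
                         sleep=time.sleep):
    """Call fetch() repeatedly until the dataset reports 'completed'."""
    for _ in range(max_polls):
        status = fetch().get("status")  # ASSUMPTION: field name is "status"
        if status == "completed":
            return status
        if status not in IN_PROGRESS:
            raise RuntimeError(f"unexpected dataset status: {status!r}")
        sleep(interval_seconds)
    raise TimeoutError("dataset did not complete within the polling budget")


# Example usage (requires a real dataset ID and token):
# wait_until_completed(lambda: fetch_dataset("YOUR_DATASET_ID", "YOUR_API_TOKEN"))
```

Because dataset creation can take hours, a polling interval of a minute or more is usually sufficient.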

Notes

  • Dataset creation is an asynchronous process that may take hours depending on the number of features and images requested
  • The task_name property groups features into annotation tasks: all features with the same task_name belong to the same task
  • The combined geometry of all features in a task cannot span more than 6 degrees (the width of one UTM zone)
  • If start_time and end_time are provided in the properties, they will be used for the annotation and task times
  • All coordinates should be in the WGS84 (EPSG:4326) coordinate reference system, given in [longitude, latitude] order as GeoJSON requires
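The 6-degree limit can be approximated locally by measuring the bounding box of each task. This guide does not say exactly how the service measures the extent, so the sketch below conservatively flags a task when either its longitude span or its latitude span exceeds the limit. The helper names are hypothetical, and geometries that cross the antimeridian are not handled.

```python
def _positions(coords):
    """Yield every position from arbitrarily nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords  # a single [lon, lat] or [lon, lat, elevation] position
    else:
        for part in coords:
            yield from _positions(part)


def task_extents(collection):
    """Return {task_name: (lon_span, lat_span)} in degrees."""
    bounds = {}
    for feature in collection["features"]:
        task = feature["properties"]["task_name"]
        for lon, lat, *_ in _positions(feature["geometry"]["coordinates"]):
            west, south, east, north = bounds.get(task, (lon, lat, lon, lat))
            bounds[task] = (min(west, lon), min(south, lat),
                            max(east, lon), max(north, lat))
    return {t: (e - w, n - s) for t, (w, s, e, n) in bounds.items()}


def oversized_tasks(collection, limit_degrees=6.0):
    """List tasks whose bounding box exceeds the limit in either direction.

    ASSUMPTION: a bounding-box check approximates the service's own rule.
    """
    return [
        task
        for task, (lon_span, lat_span) in task_extents(collection).items()
        if lon_span > limit_degrees or lat_span > limit_degrees
    ]
```

Any task returned by oversized_tasks is a candidate for splitting into several smaller tasks with distinct task_name values.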