
Adding a New Dataset

Add existing geometries and labels

This covers the case when you have geometries (points, polygons, line strings, etc.) and (optionally) labels ready to import.

note

Supported formats: CSV, GeoJSON


Configure your project

If your data has fields used for classification or segmentation tasks (e.g. "grass", "water", "built-up")

From the project "Annotation tags" section:

  1. Click on the "Tags" tab
  2. For each class, add a tag:
    • The internal name must exactly match the value that's used in your dataset
    • The display name can be anything
    • Pick a color for visual differentiation
  3. Add it to a tag group; "default" is fine for most use cases

Next:

  1. Click on the "Metadata" tab
  2. Add a new metadata field
    • The internal name must exactly match the column name (for CSV) or property (for GeoJSON) that's in your dataset
    • The display name can be anything
    • Set the data type to "tag group" and select the tag group that contains your tags, likely "default"
    • Mark it as required if every sample must have it
    • Mark it as read-only if you don't want it to be editable later
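
Because tag internal names must exactly match the values in your dataset, it can help to list the distinct values before creating tags. The following is a minimal sketch, not part of the product: the `distinct_values` helper and the `land_cover` column name are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def distinct_values(csv_text: str, column: str) -> Counter:
    """Count each distinct value in one CSV column, exactly as written."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Hypothetical sample data; "land_cover" is an illustrative column name.
SAMPLE = """latitude,longitude,task_name,land_cover
10.0,20.0,task-1,grass
10.1,20.1,task-1,water
10.2,20.2,task-2,Grass
"""

counts = distinct_values(SAMPLE, "land_cover")
for value, count in sorted(counts.items()):
    # repr() makes stray whitespace and case differences visible.
    # Since names must match exactly, "grass" and "Grass" would
    # presumably be treated as different values.
    print(repr(value), count)
```

Each distinct value printed here would need a tag whose internal name is identical to it, or the data could be normalized before upload.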

If your data has fields used for regression tasks or any other fields to include

From the project "Annotation tags" section:

  1. Click on the "Metadata" tab
  2. Add a new metadata field for each column name (for CSV) or property (for GeoJSON) that's in your dataset
    • The internal name must exactly match the column name (for CSV) or property (for GeoJSON) that's in your dataset
    • The display name can be anything
    • Set the data type to match your dataset, for example "number" for any numeric type (integer or floating point), "boolean" for true or false values, "text" for free-form text, etc.
      • To fine-tune a regression model, the field must be a numeric or boolean type
    • Mark it as required if every sample must have it
    • Mark it as read-only if you don't want it to be editable later
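
To decide which data type to pick for each field, you could scan the columns first. This is a rough sketch under stated assumptions: the `suggest_type` helper and the sample columns (`yield_t_ha`, `irrigated`, `notes`) are hypothetical, and the heuristic only distinguishes the three types named above.

```python
import csv
import io


def suggest_type(values: list[str]) -> str:
    """Suggest "boolean", "number", or "text" for a column's string values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "text"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    try:
        for v in non_empty:
            float(v)  # raises ValueError for non-numeric text
    except ValueError:
        return "text"
    return "number"


# Hypothetical sample data; the extra column names are illustrative only.
SAMPLE = """latitude,longitude,task_name,yield_t_ha,irrigated,notes
10.0,20.0,task-1,3.2,true,north field
10.1,20.1,task-1,4,false,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
suggestions = {name: suggest_type([row[name] for row in rows]) for name in rows[0]}
for name, type_name in suggestions.items():
    print(f"{name}: {type_name}")
```

The result is only a starting point; a column such as a numeric ID may parse as a number but be better stored as text.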

Prepare your dataset

For CSV, the following columns are required:

  • latitude
  • longitude

note

Only points are supported for CSV data
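
A quick local check of the coordinate columns can catch problems before upload. The sketch below is an illustration only: the `check_points` helper is hypothetical, and the range check (latitude within ±90, longitude within ±180) is a general sanity check rather than a documented rule of the importer.

```python
import csv
import io

REQUIRED = ("latitude", "longitude")


def check_points(csv_text: str) -> list[str]:
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column: {c}" for c in missing]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            lat, lon = float(row["latitude"]), float(row["longitude"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric coordinate")
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append(f"line {line_no}: coordinate out of range")
    return problems


# Made-up rows: the second data row has an impossible latitude.
SAMPLE = "latitude,longitude\n10.0,20.0\n95.0,20.0\n"
for problem in check_points(SAMPLE):
    print(problem)
```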


For GeoJSON, the geometry type must be one of:

  • Point
  • Polygon
  • MultiPolygon
  • LineString
  • MultiLineString
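
To confirm that a GeoJSON file only contains the geometry types listed above, a check along these lines may help. It is a minimal sketch that assumes the file is a FeatureCollection; the `unsupported_features` helper and the sample features are hypothetical.

```python
import json

SUPPORTED = {"Point", "Polygon", "MultiPolygon", "LineString", "MultiLineString"}


def unsupported_features(geojson_text: str) -> list[tuple[int, str]]:
    """Return (index, geometry type) for features outside the supported set."""
    data = json.loads(geojson_text)
    bad = []
    for i, feature in enumerate(data.get("features", [])):
        geometry = feature.get("geometry") or {}  # geometry may be null
        gtype = geometry.get("type", "null")
        if gtype not in SUPPORTED:
            bad.append((i, gtype))
    return bad


# Made-up sample: the second feature is a MultiPoint, which is not in the list.
SAMPLE = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [20.0, 10.0]},
         "properties": {"task_name": "task-1"}},
        {"type": "Feature",
         "geometry": {"type": "MultiPoint", "coordinates": [[20.0, 10.0], [20.1, 10.1]]},
         "properties": {"task_name": "task-1"}},
    ],
})

for index, gtype in unsupported_features(SAMPLE):
    print(f"feature {index}: unsupported geometry type {gtype}")
```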

The following columns (for CSV) or properties (for GeoJSON) are required:

  • task_name

note

The task name determines how samples are grouped. Samples with the same task_name are grouped into the same task.

The combined extent of all samples in the same task cannot exceed 6 degrees (the width of a UTM zone); a larger extent causes problems with re-projection and area calculations.
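
The grouping and extent rule can be checked locally before upload. The sketch below makes several assumptions: it handles point samples only, it checks both the longitude and the latitude span because the text does not say which one the limit applies to, and it ignores tasks that cross the antimeridian. The `task_spans` helper and the sample values are hypothetical.

```python
MAX_SPAN_DEGREES = 6.0  # limit stated in the note above


def task_spans(samples):
    """Map each task_name to (longitude span, latitude span) in degrees.

    samples: iterable of (task_name, latitude, longitude) tuples.
    """
    boxes = {}
    for task, lat, lon in samples:
        # Start from the point itself the first time a task is seen.
        min_lon, min_lat, max_lon, max_lat = boxes.get(task, (lon, lat, lon, lat))
        boxes[task] = (min(min_lon, lon), min(min_lat, lat),
                       max(max_lon, lon), max(max_lat, lat))
    return {t: (b[2] - b[0], b[3] - b[1]) for t, b in boxes.items()}


# Made-up samples: task "a" spans 7.5 degrees of longitude.
SAMPLES = [("a", 10.0, 20.0), ("a", 10.5, 27.5), ("b", 0.0, 0.0)]

spans = task_spans(SAMPLES)
too_wide = [t for t, (dx, dy) in spans.items() if max(dx, dy) > MAX_SPAN_DEGREES]
print("tasks exceeding the limit:", too_wide)
```

If a task is flagged, splitting its samples across several task_name values would bring each group back under the limit.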


The following columns (for CSV) or properties (for GeoJSON) are strictly optional but strongly recommended, especially if the data will be used for fine-tuning a model:

  • start_time
  • end_time

note

Times can be in any common date format, that is, any format the dateutil parser library can parse.

Having both start_time and end_time will set annotation and task times.
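
Putting the required and recommended properties together, a GeoJSON feature might be built as follows. This is a sketch under assumptions: the `make_feature` helper, the sample values, and the choice to wrap features in a FeatureCollection are illustrative, and ISO 8601 is used simply because it is one format that dateutil can parse.

```python
import json
from datetime import datetime, timezone


def make_feature(lon, lat, task_name, start, end, **extra):
    """Build one GeoJSON Point feature with task_name and time properties."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "task_name": task_name,
            "start_time": start.isoformat(),  # ISO 8601 string
            "end_time": end.isoformat(),
            **extra,  # any additional metadata fields
        },
    }


# Made-up values for illustration.
feature = make_feature(
    20.0, 10.0, "task-1",
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2023, 12, 31, tzinfo=timezone.utc),
    land_cover="grass",  # hypothetical label property
)
collection = {"type": "FeatureCollection", "features": [feature]}
print(json.dumps(collection, indent=2))
```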


Add your dataset

  1. Go to "My projects"
  2. Select your project
  3. Click "Build dataset" in the table's toolbar
  4. Select "Upload file"
  5. Add the file
  6. The default buffer size of 500 meters is likely fine; it sets how large the box around all geometries in the annotation task will be
  7. Skip adding data sources unless you want to include imagery for annotation
  8. Give your dataset a memorable name
  9. Click "Acquire"; this spawns an asynchronous job
  10. The job may take a while to complete; you can track its status in the dataset list