Semantic segmentation

Semantic segmentation module.

remote_sensing_processor.semantic.generate_tiles(x, y, output, tile_size=128, shuffle=False, split=None, filter_nodata='x', x_dtype=None, y_dtype=None, x_nodata=None, y_nodata=None)

Cut rasters into tiles.

Parameters:
  • x (list of paths as strings) – Rasters to use as training data.

  • y (dict or list of dicts (optional)) – Target variable or multiple target variables. Can be set to None if no target variable is needed in the dataset. Each dict should contain: name: the name of the target variable, used later to refer to it. path: raster or vector file to use as the target variable. burn_value (optional): a field to use for the burn-in value; the field must be numeric. If the dict has a burn_value key, the target variable is treated as a vector file; if it has no burn_value key, it is treated as a raster file. We strongly recommend changing class values to 0, 1, 2, …, n (where 0 is nodata) before generating tiles.

  • output (path as a string) – Path to save generated output x data. Data is saved in a .rspds format (custom dataset format based on WebDataset).

  • tile_size (int (default = 128)) – Size of tiles to generate (tile_size x tile_size).

  • shuffle (bool (default = False)) – Whether to shuffle the samples randomly.

  • split (dict (optional)) – Splitting data in subsets. Is a dict, where keys are the names of split subsets and values are numbers defining proportions of every subset. For example, {“train”: 3, “validation”: 1, “test”: 1} will generate 3 subsets (train, validation, and test) in proportion 3 to 1 to 1.

  • filter_nodata (str (default = "x")) – How the nodata values should be treated. None: do not filter nodata. “x”: filter out pixels that are nodata in x. “y”: filter out pixels that are nodata in y. “x_or_y”: filter out pixels that are nodata in x or y. “x_and_y”: filter out pixels that are nodata in x and y.

  • x_dtype (dtype definition as a string (optional)) – If you run out of memory, you can try converting the x data to a less memory-consuming dtype.

  • y_dtype (dtype definition as a string (optional)) – If you run out of memory, you can try converting the y data to a less memory-consuming dtype.

  • x_nodata (int or float (optional)) – You can define which value in x raster corresponds to nodata and areas that contain nodata in x raster will be ignored while training and testing. Tiles that contain only nodata in both x and y will be omitted. If not defined, then the most common nodata value amongst x files will be used. If there are no nodata values, will be set to 0.

  • y_nodata (int (optional)) – You can define which value will be used to fill nodata. If there are polygons with the same value as y_nodata, they will be ignored while training and testing. Tiles that contain only nodata in both x and y will be omitted. If not defined, then it will be set to 0.

Returns:

Path to the output dataset.

Return type:

pathlib.Path
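The recommendation to renumber class values to 0, 1, 2, …, n (with 0 as nodata) before generating tiles can be sketched as follows. This is only an illustration on a plain nested list: remap_classes and the sample class codes are hypothetical and not part of RSP, and a real raster would be remapped with your usual raster tooling.

```python
def remap_classes(grid, nodata):
    """Map arbitrary class codes to 0..n, reserving 0 for nodata.

    `grid` is a list of rows of integer class codes (a stand-in for a raster band).
    Returns the remapped grid and the lookup table that was used.
    """
    # Sort the real classes so the mapping is reproducible between runs.
    classes = sorted({value for row in grid for value in row if value != nodata})
    lookup = {nodata: 0}
    lookup.update({value: index for index, value in enumerate(classes, start=1)})
    remapped = [[lookup[value] for value in row] for row in grid]
    return remapped, lookup


# Hypothetical land cover codes, with 255 marking nodata.
landcover = [
    [11, 11, 42],
    [255, 21, 42],
]
remapped, lookup = remap_classes(landcover, nodata=255)
print(remapped)  # [[1, 1, 3], [0, 2, 3]]
print(lookup)    # {255: 0, 11: 1, 21: 2, 42: 3}
```

Keeping the lookup table makes it possible to translate class indices back to the original codes later.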

Examples

>>> import remote_sensing_processor as rsp
>>> x = ["/home/rsp_test/mosaics/sentinel/sentinel.json", "/home/rsp_test/mosaics/dem/dem.json"]
>>> y = [
...     {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover.tif"},
...     {"name": "forest_types", "path": "/home/rsp_test/mosaics/forest_types.gpkg", "burn_value": "class"},
... ]
>>> out_file = "/home/rsp_test/model/landcover_dataset.rspds"
>>> out_dataset = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> print(out_dataset)
/home/rsp_test/model/landcover_dataset.rspds
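To make the split proportions concrete, the sketch below turns a split dict into tile counts. It is an illustration only: split_counts is a hypothetical helper, and the way RSP rounds fractional tile counts is not documented here, so the largest-remainder rounding is an assumption.

```python
def split_counts(n_tiles, split):
    """Distribute n_tiles over subsets in proportion to the weights in `split`."""
    total_weight = sum(split.values())
    exact = {name: n_tiles * weight / total_weight for name, weight in split.items()}
    counts = {name: int(share) for name, share in exact.items()}
    # Hand out the tiles lost to rounding down, largest fractional part first.
    leftover = n_tiles - sum(counts.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


print(split_counts(1000, {"train": 3, "val": 1, "test": 1}))
# {'train': 600, 'val': 200, 'test': 200}
print(split_counts(11, {"train": 3, "val": 1, "test": 1}))
# {'train': 7, 'val': 2, 'test': 2}
```

The weights are relative, so {"train": 3, "val": 1, "test": 1} and {"train": 60, "val": 20, "test": 20} describe the same proportions.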

remote_sensing_processor.semantic.train(train_datasets, val_datasets, model_file, model, backbone=None, checkpoint=None, weights=None, epochs=None, loss=None, metrics=None, batch_size=32, repeat=1, augment=False, lr=0.001, generate_features=False, num_workers=0, precision=None, **kwargs)

Trains segmentation model.

Parameters:
  • train_datasets (dict or list of dicts) – Dataset generated by generate_tiles() function that will be used to train the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for training should be defined. Optional parameter. You can provide a list of datasets to train the model on multiple datasets.

  • val_datasets (dict or list of dicts or None) – Dataset generated by generate_tiles() function that will be used to validate the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for validation should be defined. Optional parameter. You can provide a list of datasets to validate model on multiple datasets. Can be set to None if no validation is needed.

  • model_file (path as a string) – Checkpoint file where model will be saved after training. File extension must be *.ckpt for neural networks and *.joblib for scikit-learn models.

  • model (str or torch.nn or sklearn model) – Name of model architecture, pytorch semantic segmentation model or sklearn classification model.

  • backbone (str (optional)) – Backbone, solver or kernel of a model, if multiple backbones are supported.

  • checkpoint (path as a string (optional)) – Checkpoint file (*.ckpt or *.joblib) of a pre-trained model to fine-tune.

  • weights (str (optional)) – Name of pre-trained weights to fine-tune. Only works for neural networks.

  • epochs (dict (optional)) – Dict of values that set the number of training epochs and the early stopping parameters for deep learning models. max_epochs (int): the maximum number of epochs. early_stopping (bool): whether early stopping is enabled. min_delta (float): minimum change in the monitored quantity to qualify as an improvement. Optional parameter. patience (int): number of epochs with no improvement after which training is stopped. Optional parameter. If you only want to initialize a model for later testing or prediction, set max_epochs to 0. If not set, max_epochs = 5 and early stopping with default parameters are used. epochs has no effect on Scikit-Learn models; set num_iter, tol and other epoch-related parameters via **kwargs instead.

  • loss (str or torch.nn (optional)) – Loss function that will be used during the training. The default one is CrossEntropy or default loss for HuggingFace Transformers models. You can use any custom loss function, but it must inherit torch.nn.modules.loss._Loss.

  • metrics (dict or list of dicts (optional)) – Metrics that will be used to evaluate model performance and logged. Can be a single dict or a list of dicts. Each dict corresponds to one metric. name (str): name of a metric. If name is one of the supported metrics, it will be loaded and used automatically. log (str): logging level; ‘epoch’ logs the metric only at the end of each epoch, ‘step’ logs it on each training step, and ‘verbose’ logs it on each step and shows it alongside the progress bar. metric (Metric): your custom metric object. Optional parameter. You can use any custom metrics, but they must inherit torchmetrics.Metric. If not set, accuracy and mean IoU are verbose logged and precision and recall are logged after each epoch.

  • batch_size (int (default = 32)) – Number of training samples used in one iteration. Only works for neural networks.

  • repeat (int (default = 1)) – Increase size of a dataset by repeating it n times. Can be useful if dataset is very small.

  • augment (bool or sequence of str (default = False)) – Apply augmentations to dataset. Only works for neural networks. No augmentations applied if set to False. If set to True then the default augmentations (RandomResizedCrop, RandomHorizontalFlip) are applied. You can pass your own sequence of augmentations, they will be applied to data in the given order. You can use any custom augmentations, but they must inherit torchvision.transforms.v2.Transform.

  • lr (float (default = 1e-3)) – Learning rate of a model. A lower value usually results in better model convergence, but much slower training. lr has no effect on Scikit-Learn models; set learning_rate_init, alpha and other lr-related parameters via **kwargs instead.

  • generate_features (bool (default = False)) – If set to True, intensity, gradient intensity and local structure features will be generated, as described here. Can result in better segmentation quality, but can also significantly increase training time. Only works for scikit-learn models.

  • num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. Can increase training speed, but can also cause errors (e.g. pickling errors).

  • precision (str (optional)) – Precision that will be used in the training process. Lower precision requires less memory, but can sometimes cause errors. More info can be found here

  • **kwargs – Additional keyword arguments that are used to initialize model. They are different for every model, so read the documentation.

Returns:

Trained model.

Return type:

torch.nn model or SklearnModel
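The dataset dicts described above for train_datasets and val_datasets have two required keys (path and sub) and one optional key (y). A minimal sketch of checking them before a long training run is shown below; check_dataset_spec and check_datasets are hypothetical helpers, not RSP functions, and RSP may validate its input differently.

```python
REQUIRED_KEYS = {"path", "sub"}
OPTIONAL_KEYS = {"y"}


def check_dataset_spec(spec):
    """Return a list of problems found in one dataset dict (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - set(spec)
    unknown = set(spec) - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if unknown:
        problems.append(f"unknown keys: {sorted(unknown)}")
    sub = spec.get("sub")
    if sub is not None and not isinstance(sub, (str, list)):
        problems.append("'sub' must be a name, a list of names, or 'all'")
    return problems


def check_datasets(datasets):
    """Accept a single dict or a list of dicts and check each one."""
    specs = [datasets] if isinstance(datasets, dict) else list(datasets)
    return {index: check_dataset_spec(spec) for index, spec in enumerate(specs)}


good = {"path": "/home/rsp_test/model/landcover_dataset.rspds", "sub": "train", "y": "forest_types"}
print(check_datasets(good))
# {0: []}
print(check_datasets([{"path": "/home/rsp_test/model/montana.rspds"}]))
# {0: ["missing keys: ['sub']"]}
```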

Examples

>>> import remote_sensing_processor as rsp
>>> x = ["/home/rsp_test/mosaics/sentinel/sentinel.json", "/home/rsp_test/mosaics/dem/dem.json"]
>>> y = [
...     {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover.tif"},
...     {"name": "forest_types", "path": "/home/rsp_test/mosaics/forest_types.gpkg", "burn_value": "class"},
... ]
>>> out_file = "/home/rsp_test/model/landcover_dataset.rspds"
>>> dataset_path = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> # We will train model to predict forest types
>>> train_ds = {"path": dataset_path, "sub": "train", "y": "forest_types"}
>>> val_ds = {"path": dataset_path, "sub": "val", "y": "forest_types"}
>>> model = rsp.semantic.train(
...     train_ds,
...     val_ds,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 10, "early_stopping": False},
...     batch_size=32,
... )
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name    | Type                           | Params
-----------------------------------------------------------
0 | model   | UperNetForSemanticSegmentation | 59.8 M
1 | loss_fn | CrossEntropyLoss               | 0
-----------------------------------------------------------
59.8 M    Trainable params
0         Non-trainable params
59.8 M    Total params
239.395   Total estimated model params size (MB)
Epoch 9: 100% #############################################
223/223 [1:56:20<00:00, 31.30s/it, v_num=54,
train_loss_step=0.326, train_acc_step=0.871, train_auroc_step=0.796, train_iou_step=0.655,
val_loss_step=0.324, val_acc_step=0.869, val_auroc_step=0.620, val_iou_step=0.678,
val_loss_epoch=0.334, val_acc_epoch=0.807, val_auroc_epoch=0.795, val_iou_epoch=0.688,
train_loss_epoch=0.349, train_acc_epoch=0.842, train_auroc_epoch=0.797, train_iou_epoch=0.648]
`Trainer.fit` stopped: `max_epochs=10` reached.
>>> ds_mo = "/home/rsp_test/model/montana.rspds"
>>> ds_id = "/home/rsp_test/model/idaho.rspds"
>>> # Training on two different datasets - one from Montana and one from Idaho
>>> train_datasets = [
...     {"path": ds_mo, "sub": ["area_1", "area_2"]},
...     {"path": ds_id, "sub": ["area_3", "area_6", "area_8"]},
... ]
>>> val_datasets = [
...     {"path": ds_mo, "sub": ["area_3", "area_4"]},
...     {"path": ds_id, "sub": ["area_1"]},
... ]
>>> model = rsp.semantic.train(
...     train_datasets,
...     val_datasets,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 100, "early_stopping": False},
...     batch_size=32,
... )
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name    | Type                           | Params
-----------------------------------------------------------
0 | model   | UperNetForSemanticSegmentation | 59.8 M
1 | loss_fn | CrossEntropyLoss               | 0
-----------------------------------------------------------
59.8 M    Trainable params
0         Non-trainable params
59.8 M    Total params
239.395   Total estimated model params size (MB)
Epoch 99: 100% #############################################
223/223 [1:56:20<00:00, 31.30s/it, v_num=54, train_loss_step=0.326,
train_acc_step=0.871, train_auroc_step=0.796, train_iou_step=0.655,
val_loss_step=0.324, val_acc_step=0.869, val_auroc_step=0.620, val_iou_step=0.678,
val_loss_epoch=0.334, val_acc_epoch=0.807, val_auroc_epoch=0.795, val_iou_epoch=0.688,
train_loss_epoch=0.349, train_acc_epoch=0.842, train_auroc_epoch=0.797, train_iou_epoch=0.648]
`Trainer.fit` stopped: `max_epochs=100` reached.
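The min_delta and patience keys of the epochs dict follow the usual early-stopping convention, which can be sketched in plain Python. This is an illustration, not RSP's code: stopping_epoch is a hypothetical helper, the losses are made up, and it assumes the monitored quantity is a validation loss where lower is better (which quantity RSP monitors is not stated here).

```python
def stopping_epoch(val_losses, min_delta=0.0, patience=3):
    """Return the index of the epoch at which training would stop, or None.

    An epoch counts as an improvement only if the loss drops by more than
    `min_delta` below the best value seen so far.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: large improvements end after the third epoch.
losses = [0.60, 0.45, 0.40, 0.399, 0.41, 0.398, 0.40]
print(stopping_epoch(losses, min_delta=0.01, patience=3))  # 5
print(stopping_epoch(losses, min_delta=0.0, patience=3))   # None
```

With min_delta=0.01 the tiny gains at epochs 3 and 5 do not count, so patience runs out; with min_delta=0.0 they reset the counter and training continues.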

remote_sensing_processor.semantic.test(test_datasets, model, metrics=None, batch_size=32, num_workers=0)

Tests segmentation model.

Parameters:
  • test_datasets (dict or list of dicts) – Dataset generated by generate_tiles() function that will be used to test the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for testing should be defined. Optional parameter. You can provide a list of datasets to test model on multiple datasets.

  • model (torch.nn model or SklearnModel or path to a model file) – Model to test. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.

  • metrics (dict or list of dicts (optional)) – Metrics that will be used to evaluate model performance and logged. Can be a single dict or a list of dicts. Each dict corresponds to one metric. name (str): name of a metric. If name is one of the supported metrics, it will be loaded and used automatically. log (str): logging level; ‘epoch’ logs the metric only at the end of each epoch, ‘step’ logs it on each step, and ‘verbose’ logs it on each step and shows it alongside the progress bar. metric (Metric): your custom metric object. Optional parameter. You can use any custom metrics, but they must inherit torchmetrics.Metric. If not set, the metrics used in the training process are evaluated.

  • batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.

  • num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. Can speed up testing, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.test({"path": ds, "sub": "test"}, model=model, batch_size=32)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│      test_acc_epoch       │    0.8231202960014343     │
│     test_auroc_epoch      │    0.7588028311729431     │
│      test_iou_epoch       │    0.69323649406433105    │
│      test_loss_epoch      │    0.40799811482429504    │
│   test_precision_epoch    │    0.8231202960014343     │
│     test_recall_epoch     │    0.8231202960014343     │
└───────────────────────────┴───────────────────────────┘

remote_sensing_processor.semantic.generate_map(dataset, model, output, reference_dataset=None, batch_size=32, num_workers=0, write_stac=True)

Create a map using pre-trained model.

Parameters:
  • dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Optional parameter. If not defined, prediction for the whole dataset will be performed. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined. Optional parameter.

  • model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.

  • output (path as a string) – Path where to write an output map.

  • reference_dataset (dict (optional)) – Dataset generated by generate_tiles() function that will be used to reconstruct original class values and nodata if the prediction dataset has no target variable (‘y’). Dataset can contain 2 elements: path: a path to a dataset. y: if there is more than one target variable in the dataset, the name of the variable that should be used for reconstruction.

  • batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.

  • num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can speed up prediction, but can also cause errors (e.g. pickling errors).

  • write_stac (bool (default = True)) – If True, then output metadata is saved to a STAC file.

Returns:

Path where output raster is saved.

Return type:

pathlib.Path

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> output_map = "/home/rsp_test/prediction.tif"
>>> rsp.semantic.generate_map({"path": ds, "y": "landcover"}, model, output_map)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]
>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/upernet.ckpt"
>>> output_map = "/home/rsp_test/prediction.tif"
>>> rsp.semantic.generate_map(ds, model, output_map)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]
>>> # Train model on data from Montana
>>> x_montana_files = "/home/rsp_test/mosaics/landsat_montana/landsat.json"
>>> y_montana_files = {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover_montana/landcover.tif"}
>>> ds_montana = rsp.semantic.generate_tiles(
...     x_montana_files,
...     y_montana_files,
...     "/home/rsp_test/model/montana.rspds",
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> train_ds = {"path": ds_montana, "sub": "train"}
>>> val_ds = {"path": ds_montana, "sub": "val"}
>>> model_montana = rsp.semantic.train(
...     train_ds,
...     val_ds,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 10, "early_stopping": False},
...     batch_size=32,
... )
>>> # Use model to map landcover of Idaho
>>> x_idaho_files = "/home/rsp_test/mosaics/landsat_idaho/landsat.json"
>>> ds_idaho = rsp.semantic.generate_tiles(x_idaho_files, None, "/home/rsp_test/model/idaho.rspds", tile_size=256)
>>> output_map = "/home/rsp_test/prediction_idaho.tif"
>>> pred_ds = {"path": ds_idaho}
>>> ref_ds = {"path": ds_montana, "y": "landcover"}
>>> rsp.semantic.generate_map(pred_ds, model_montana, output_map, reference_dataset=ref_ds)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]

remote_sensing_processor.semantic.band_importance(dataset, model, target_class=None, num_init_images=100, num_images=500, batch_size=32, num_workers=0)

Explain the band importance for a pre-trained model using SHAP.

Parameters:
  • dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path: a path to a dataset. sub: subdataset name, list of subdataset names or ‘all’. If not defined, prediction for the whole dataset will be performed. y: if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined.

  • model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.

  • target_class (int (optional)) – Index of the class to analyze band importance. If not set, will analyze all the classes.

  • num_init_images (int (default = 100)) – Number of images that will be used to initialize the SHAP explainer. We strongly recommend using very small num_init_images with sklearn models.

  • num_images (int (default = 500)) – Number of images that will be used to explain band importance. We strongly recommend using very small num_images with sklearn models.

  • batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.

  • num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can speed up data loading, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.band_importance({"path": ds, "y": "landcover"}, model)
PartitionExplainer explainer: 100%|██████████████████████████████████████████▉| 499/500 [42:07<00:05,  5.07s/it]
Landsat-B1: 0.0162
Landsat-B2: 0.0493
Landsat-B3: 0.0875
Landsat-B4: 0.0243
Landsat-B5: 0.0319
Landsat-B7: 0.0194
NDVI: 0.0353
NBR: 0.0281
slope: 0.0134
curvature: 0.0239
aspect: 0.0311
dem-norm: 0.0236
>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/xgboost.joblib"
>>> rsp.semantic.band_importance(ds, model, num_init_images=1, num_images=1)
PartitionExplainer explainer: 16385it [1:12:01,  3.78it/s]
coastal: 0.0266
blue: 0.0266
green: 0.0253
red: 0.0542
rededge071: 0.0194
rededge075: 0.0194
rededge078: 0.0196
nir: 0.0034
nir08: 0.0111
nir09: 0.0758
swir16: 0.2894
swir22: 0.0309
NDVI: 0.0483
canopyheight_norm: 0.0005
dem_norm: 0.0741
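The examples above print one "name: value" line per band. A small sketch for ranking the bands from that printed text follows; rank_bands is a hypothetical helper, and it works from the printed output because the return value of band_importance() is not documented here.

```python
def rank_bands(report):
    """Parse 'name: value' lines and return (name, value) pairs, largest first."""
    scores = {}
    for line in report.strip().splitlines():
        name, _, value = line.rpartition(":")
        scores[name.strip()] = float(value)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Scores copied from the second example above.
report = """
coastal: 0.0266
blue: 0.0266
green: 0.0253
red: 0.0542
rededge071: 0.0194
rededge075: 0.0194
rededge078: 0.0196
nir: 0.0034
nir08: 0.0111
nir09: 0.0758
swir16: 0.2894
swir22: 0.0309
NDVI: 0.0483
canopyheight_norm: 0.0005
dem_norm: 0.0741
"""
ranking = rank_bands(report)
for name, value in ranking[:3]:
    print(f"{name}: {value:.4f}")
# swir16: 0.2894
# nir09: 0.0758
# dem_norm: 0.0741
```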

remote_sensing_processor.semantic.confusion_matrix(dataset, model, batch_size=32, num_workers=0)

Generate confusion matrix.

Row indices of the confusion matrix correspond to the true class labels and column indices correspond to the predicted class labels.

Parameters:
  • dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path: a path to a dataset. sub: subdataset name, list of subdataset names or ‘all’. If not defined, prediction for the whole dataset will be performed. y: if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined.

  • model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.

  • batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.

  • num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can speed up prediction, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.confusion_matrix({"path": ds, "y": "landcover"}, model)
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [03:35<00:00, 6.71s/it]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [06:22<00:00, 16.10s/it]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|██████████████████████████████████████████▉|77/77 [11:27<00:00, 0.11it/s]
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [01:04<00:00, 2.73s/it]
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
|   | 0 |        1 |       2 |       3 |        4 |       5 |       6 |     7 | 8 | 9 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 0 | 0 |        0 |       0 |       0 |        0 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 1 | 0 | 57715149 |  616341 |  457550 |  1406527 |  508368 |   92479 |  6481 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 2 | 0 |  2844286 | 1599082 | 1083022 |  1371587 |  174298 |   40392 |  2411 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 3 | 0 |  1831478 |  558398 | 2673250 |  5310441 |  555055 |   94868 |  8320 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 4 | 0 |  1981444 |  188564 | 1732960 | 18768424 | 5257066 |  408550 |  8889 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 5 | 0 |   798174 |   24353 |  120365 |  4698820 | 7825820 | 1432090 | 20451 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 6 | 0 |   246685 |    9658 |   13940 |   280521 | 2202678 | 2090811 | 57364 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 7 | 0 |    12279 |    1288 |    2347 |    11409 |  112254 |  345227 | 57887 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 8 | 0 |        0 |       0 |       0 |      585 |       8 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 9 | 0 |       27 |       0 |       0 |      643 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/xgboost.joblib"
>>> rsp.semantic.confusion_matrix(ds, model)
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [03:35<00:00, 6.71s/it]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [06:22<00:00, 16.10s/it]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|██████████████████████████████████████████▉|77/77 [11:27<00:00, 0.11it/s]
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [01:04<00:00, 2.73s/it]
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
|   | 0 |        1 |       2 |       3 |        4 |       5 |       6 |     7 | 8 | 9 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 0 | 0 |        0 |       0 |       0 |        0 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 1 | 0 | 57715149 |  616341 |  457550 |  1406527 |  508368 |   92479 |  6481 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 2 | 0 |  2844286 | 1599082 | 1083022 |  1371587 |  174298 |   40392 |  2411 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 3 | 0 |  1831478 |  558398 | 2673250 |  5310441 |  555055 |   94868 |  8320 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 4 | 0 |  1981444 |  188564 | 1732960 | 18768424 | 5257066 |  408550 |  8889 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 5 | 0 |   798174 |   24353 |  120365 |  4698820 | 7825820 | 1432090 | 20451 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 6 | 0 |   246685 |    9658 |   13940 |   280521 | 2202678 | 2090811 | 57364 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 7 | 0 |    12279 |    1288 |    2347 |    11409 |  112254 |  345227 | 57887 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 8 | 0 |        0 |       0 |       0 |      585 |       8 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 9 | 0 |       27 |       0 |       0 |      643 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
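Because rows are true classes and columns are predicted classes, per-class recall is the diagonal value divided by its row sum, and per-class precision is the diagonal value divided by its column sum. The sketch below shows this on a small made-up matrix held in a nested list; per_class_scores and overall_accuracy are hypothetical helpers, and the object actually returned by confusion_matrix() may have a different type.

```python
def per_class_scores(matrix):
    """Compute recall and precision per class from a square confusion matrix.

    Rows are true classes and columns are predicted classes.
    Classes with no samples (or no predictions) get None instead of a score.
    """
    size = len(matrix)
    scores = {}
    for k in range(size):
        true_total = sum(matrix[k])                               # row sum
        predicted_total = sum(matrix[i][k] for i in range(size))  # column sum
        recall = matrix[k][k] / true_total if true_total else None
        precision = matrix[k][k] / predicted_total if predicted_total else None
        scores[k] = {"recall": recall, "precision": precision}
    return scores


def overall_accuracy(matrix):
    """Share of samples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up matrix; class 0 is empty, like the nodata class in the output above.
matrix = [
    [0, 0, 0],
    [0, 80, 20],
    [0, 10, 90],
]
scores = per_class_scores(matrix)
print(scores[0])                          # {'recall': None, 'precision': None}
print(round(scores[1]["recall"], 3))      # 0.8
print(round(scores[1]["precision"], 3))   # 0.889
print(overall_accuracy(matrix))           # 0.85
```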

List of available NN models

Transformers backbones are:

  • BEiT

  • BiT

  • ConvNeXT

  • ConvNeXTV2

  • DiNAT

  • DINOV2

  • DINOV2WithRegisters

  • DINOV3ViT

  • DINOV3ConvNeXT

  • FocalNet

  • HGNet-V2

  • Hiera

  • LW-DETR

  • MaskFormer-Swin

  • Pixio

  • PVTV2

  • ResNet

  • RT-DETR-ResNet

  • Swin

  • SwinV2

  • ViTDet

  • Any TIMM backbone (experimental support)

You can fine-tune pre-trained model by defining weights. For models from Transformers you can get available weights from Huggingface Hub, for Torchvision models you just set weights = True.

rsp.segmentation.train also saves CSV and TensorBoard logs in the directory where the checkpoint file is saved.

The DiNAT backbone requires the natten library, which is not available on Windows or macOS and is not available via Conda. RSP supports the DiNAT backbone, but you need to install natten in your Python environment manually.

List of available Scikit-learn models

Supported Scikit-Learn Models

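+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Model Name            | Kernel/solver                                                        | Warm start                                   | Reference    |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Logistic Regression   | "lbfgs", "liblinear", "newton-cg", "newton-cholesky", "sag", "saga"  | Only for lbfgs, newton-cg, sag, saga solvers | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Ridge                 | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| SGD                   | "hinge", "log_loss", "modified_huber", "squared_hinge", "perceptron" | Supported                                    | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Nearest Neighbors     | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Radius Neighbors      | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| SVM                   | "rbf", "linear", "poly", "sigmoid"                                   | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Gaussian Process      | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Naive Bayes           | "gaussian", "bernoulli", "categorical", "complement", "multinomial"  | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| QDA                   | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| LDA                   | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Decision Tree         | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Extra Tree            | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Random Forest         | Not available                                                        | Supported                                    | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Extra Trees           | Not available                                                        | Supported                                    | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| AdaBoost              | Not available                                                        | Not supported                                | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Gradient Boosting     | Not available                                                        | Supported                                    | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| Multilayer Perceptron | "adam", "sgd", "lbfgs"                                               | Supported                                    | Scikit-learn |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| XGBoost               | Not available                                                        | Not supported                                | XGBoost      |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+
| XGB Random Forest     | Not available                                                        | Not supported                                | XGBoost      |
+-----------------------+----------------------------------------------------------------------+----------------------------------------------+--------------+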

The model kernel or solver is set with the backbone argument.

Models that support warm start can be fine-tuned from a pre-trained model with the checkpoint argument.

Some models can have problems when being saved, especially after training on large datasets. Others can take a very long time to train (SVM, for example) or run into memory issues (Gaussian Process, for example) on large datasets. We therefore recommend using Scikit-learn models only for small datasets.

For Random Forest and Extra Trees models, RSP sets max_depth to 6 by default, because Scikit-learn leaves the depth unlimited by default and training could be very slow. To train with unlimited tree depth, set max_depth = None.
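The warm-start and max_depth notes can be tried out in plain Scikit-learn. The following is a minimal sketch, not RSP's own code: it assumes warm start in RSP behaves as it does in Scikit-learn itself, and the synthetic data and variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for pixel samples: 400 samples, 6 features, 2 classes.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)

# First round: 10 trees, with depth capped at 6 as in the RSP default.
forest = RandomForestClassifier(
    n_estimators=10, max_depth=6, warm_start=True, random_state=0
)
forest.fit(X[:200], y[:200])
trees_after_first_fit = len(forest.estimators_)

# Second round: with warm_start=True, raising n_estimators keeps the
# existing trees and fits only the 10 additional ones on the new data.
forest.set_params(n_estimators=20)
forest.fit(X[200:], y[200:])
trees_after_second_fit = len(forest.estimators_)

# No tree grows deeper than the max_depth cap.
deepest_tree = max(tree.get_depth() for tree in forest.estimators_)
```

Passing max_depth=None instead removes the cap, which gives deeper trees at the cost of longer training.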

List of available losses

Supported loss functions

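+---------------+-----------------------------+
| Loss          | Reference                   |
+---------------+-----------------------------+
| cross_entropy | Torch                       |
+---------------+-----------------------------+
| nll           | Torch                       |
+---------------+-----------------------------+
| jaccard       | Segmentation Models Pytorch |
+---------------+-----------------------------+
| dice          | Segmentation Models Pytorch |
+---------------+-----------------------------+
| tversky       | Segmentation Models Pytorch |
+---------------+-----------------------------+
| focal         | Segmentation Models Pytorch |
+---------------+-----------------------------+
| lovasz        | Segmentation Models Pytorch |
+---------------+-----------------------------+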

You can also use a custom loss, which is useful if you want to initialize a loss with custom parameters. Any custom loss function can be passed, with one restriction: it must inherit from torch.nn.modules.loss._Loss.

List of available metrics

Supported metrics

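+----------------------------------+-----------------------+--------------+
| Metric                           | Additional parameters | Reference    |
+----------------------------------+-----------------------+--------------+
| accuracy_macro                   | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| accuracy_micro                   | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| cohen_kappa                      | None                  | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| exact_math                       | None                  | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| f1_macro                         | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| f1_micro                         | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| hamming_distance_macro           | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| hamming_distance_micro           | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| jaccard_index_macro              | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| jaccard_index_micro              | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| matthews_correlation_coefficient | None                  | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| negative_predictive_value_macro  | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| negative_predictive_value_micro  | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| precision_macro                  | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| precision_micro                  | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| recall_macro                     | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| recall_micro                     | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| dice_score_macro                 | average="macro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| dice_score_micro                 | average="micro"       | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| generalized_dice_score           | None                  | Torchmetrics |
+----------------------------------+-----------------------+--------------+
| mean_iou                         | None                  | Torchmetrics |
+----------------------------------+-----------------------+--------------+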

For most semantic segmentation metrics, both micro- and macro-averaged versions are available by default.
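The difference between the two averaging modes can be illustrated with a small pure-Python sketch. The confusion matrix is made up for illustration, and the helper names recall_micro and recall_macro are hypothetical; they are not part of RSP or Torchmetrics.

```python
# Toy 3-class confusion matrix: rows are true classes, columns are predictions.
confusion = [
    [80, 0, 0],  # class 0: 80 of 80 pixels correct
    [5, 5, 0],   # class 1: 5 of 10 pixels correct
    [5, 0, 5],   # class 2: 5 of 10 pixels correct
]


def recall_micro(cm):
    """Pool all pixels first, then divide correct pixels by total pixels."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def recall_macro(cm):
    """Compute recall per class, then take the unweighted mean over classes."""
    per_class = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return sum(per_class) / len(per_class)


micro = recall_micro(confusion)
macro = recall_macro(confusion)
```

Here the micro average is 0.9 while the macro average is about 0.67: the two small classes are only half correct, and macro averaging weights every class equally regardless of its pixel count.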

You can also use any custom metric for evaluation. The only restriction is that it must inherit from torchmetrics.metric.Metric.

Supported augmentations

Supported augmentations

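+----------------------+---------------------------------------------+-------------+
| Augmentation         | Additional parameters                       | Reference   |
+----------------------+---------------------------------------------+-------------+
| ScaleJitter          | None                                        | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomResizedCrop    | antialias=True                              | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomHorizontalFlip | p=0.5                                       | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomVerticalFlip   | p=0.5                                       | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomZoomOut        | None                                        | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomRotation       | degrees=90                                  | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomAffine         | degrees=90, translate=(0.5, 0.5), shear=0.5 | Torchvision |
+----------------------+---------------------------------------------+-------------+
| RandomPerspective    | None                                        | Torchvision |
+----------------------+---------------------------------------------+-------------+
| ElasticTransform     | None                                        | Torchvision |
+----------------------+---------------------------------------------+-------------+
| GaussianBlur         | kernel_size=(5, 9)                          | Torchvision |
+----------------------+---------------------------------------------+-------------+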

If you just pass augment=True, RSP uses a default sequence of augmentations: ("RandomResizedCrop", "RandomHorizontalFlip"). You can also pass your own sequence of augmentations, which are applied to the data in the given order. The sequence can mix supported augmentation names and custom augmentations; custom augmentations must inherit from torchvision.transforms.v2.Transform.