Semantic segmentation

Semantic segmentation module.

remote_sensing_processor.semantic.generate_tiles(x, y, output, tile_size=128, shuffle=False, split=None, filter_nodata='x', x_dtype=None, y_dtype=None, x_nodata=None, y_nodata=None)

Cut rasters into tiles.

Parameters:

x (list of paths as strings) – Rasters to use as training data.
y (dict or list of dicts (optional)) – Target variable or multiple target variables. Can be set to None if a target variable is not needed in the dataset. Each dict should contain: name: the name of the target variable, used later to refer to it. path: raster or vector file to use as the target variable. burn_value (optional): a field to use as the burn-in value. The field should be numeric. If the dict has a burn_value key, the target variable is treated as a vector file; if it has only a path key, it is treated as a raster file. We strongly recommend changing class values to 0, 1, 2, …, n (where 0 is nodata) before generating tiles.
output (path as a string) – Path where the generated dataset is saved. Data is saved in the .rspds format (a custom dataset format based on WebDataset).
tile_size (int (default = 128)) – Size of tiles to generate (tile_size x tile_size).
shuffle (bool (default = False)) – Whether to shuffle the samples randomly.
split (dict (optional)) – Splitting data in subsets. Is a dict, where keys are the names of split subsets and values are numbers defining proportions of every subset. For example, {“train”: 3, “validation”: 1, “test”: 1} will generate 3 subsets (train, validation, and test) in proportion 3 to 1 to 1.
filter_nodata (str (default = "x")) – How the nodata values should be treated. None: do not filter nodata. “x”: filter out pixels that are nodata in x. “y”: filter out pixels that are nodata in y. “x_or_y”: filter out pixels that are nodata in x or y. “x_and_y”: filter out pixels that are nodata in x and y.
x_dtype (dtype definition as a string (optional)) – Data type to convert x data to. If you run out of memory, you can try converting your data to a less memory-consuming type.
y_dtype (dtype definition as a string (optional)) – Data type to convert y data to. If you run out of memory, you can try converting your data to a less memory-consuming type.
x_nodata (int or float (optional)) – You can define which value in x raster corresponds to nodata and areas that contain nodata in x raster will be ignored while training and testing. Tiles that contain only nodata in both x and y will be omitted. If not defined, then the most common nodata value amongst x files will be used. If there are no nodata values, will be set to 0.
y_nodata (int (optional)) – You can define which value will be used to fill nodata. If there are polygons with the same value as y_nodata, they will be ignored while training and testing. Tiles that contain only nodata in both x and y will be omitted. If not defined, then it will be set to 0.

Returns:

Path to the output dataset.

Return type:

pathlib.Path
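
The recommendation to renumber class values as 0, 1, 2, …, n (with 0 as nodata) before generating tiles can be sketched in plain Python. The helper name build_class_map and the sample class codes are hypothetical and not part of RSP; a real raster would normally be remapped with an array library rather than nested lists.

```python
def build_class_map(values, nodata):
    """Map arbitrary class codes to 0..n, reserving 0 for nodata."""
    classes = sorted(set(values) - {nodata})
    mapping = {nodata: 0}
    for new_value, old_value in enumerate(classes, start=1):
        mapping[old_value] = new_value
    return mapping


# Hypothetical land cover codes with 255 as nodata.
raster = [
    [11, 11, 42],
    [255, 42, 90],
]
class_map = build_class_map([v for row in raster for v in row], nodata=255)
remapped = [[class_map[v] for v in row] for row in raster]
print(class_map)  # {255: 0, 11: 1, 42: 2, 90: 3}
print(remapped)   # [[1, 1, 2], [0, 2, 3]]
```

Keeping the mapping dict around makes it possible to translate predicted values back to the original codes later.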

Examples

>>> import remote_sensing_processor as rsp
>>> x = ["/home/rsp_test/mosaics/sentinel/sentinel.json", "/home/rsp_test/mosaics/dem/dem.json"]
>>> y = [
...     {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover.tif"},
...     {"name": "forest_types", "path": "/home/rsp_test/mosaics/forest_types.gpkg", "burn_value": "class"},
... ]
>>> out_file = "/home/rsp_test/model/landcover_dataset.rspds"
>>> out_dataset = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> print(out_dataset)
/home/rsp_test/model/landcover_dataset.rspds
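
As a rough illustration of how split proportions translate into subset sizes, the sketch below turns the weights from the split dict into tile counts. The function split_counts is hypothetical; how RSP actually distributes tiles (including how it handles remainders) is not described here, so treat the remainder rule as an assumption.

```python
def split_counts(n_tiles, split):
    """Turn proportion weights into tile counts that sum to n_tiles."""
    total = sum(split.values())
    counts = {name: n_tiles * weight // total for name, weight in split.items()}
    # Assumption: tiles left over after integer division go to the first subsets.
    leftover = n_tiles - sum(counts.values())
    for name in list(split)[:leftover]:
        counts[name] += 1
    return counts


counts = split_counts(1000, {"train": 3, "val": 1, "test": 1})
print(counts)  # {'train': 600, 'val': 200, 'test': 200}
```

The weights are relative, so {"train": 3, "val": 1, "test": 1} and {"train": 60, "val": 20, "test": 20} describe the same proportions.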

remote_sensing_processor.semantic.train(train_datasets, val_datasets, model_file, model, backbone=None, checkpoint=None, weights=None, epochs=None, loss=None, metrics=None, batch_size=32, repeat=1, augment=False, lr=0.001, generate_features=False, num_workers=0, precision=None, **kwargs)

Trains segmentation model.

Parameters:

train_datasets (dict or list of dicts) – Dataset generated by generate_tiles() function that will be used to train the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for training should be defined. Optional parameter. You can provide a list of datasets to train the model on multiple datasets.
val_datasets (dict or list of dicts or None) – Dataset generated by generate_tiles() function that will be used to validate the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for validation should be defined. Optional parameter. You can provide a list of datasets to validate model on multiple datasets. Can be set to None if no validation is needed.
model_file (path as a string) – Checkpoint file where model will be saved after training. File extension must be *.ckpt for neural networks and *.joblib for scikit-learn models.
model (str or torch.nn or sklearn model) – Name of model architecture, pytorch semantic segmentation model or sklearn classification model.
backbone (str (optional)) – Backbone, solver or kernel of a model, if multiple backbones are supported.
checkpoint (path as a string (optional)) – Checkpoint file (*.ckpt or *.joblib) of a pre-trained model to fine-tune.
weights (str (optional)) – Name of pre-trained weights to fine-tune. Only works for neural networks.
epochs (dict (optional)) – Dict of values that set the number of training epochs and the early stopping parameters for deep learning models. max_epochs (int): the maximum number of epochs. early_stopping (bool): whether early stopping is enabled. min_delta (float): minimum change in the monitored quantity to qualify as an improvement. Optional parameter. patience (int): number of epochs with no improvement after which training will be stopped. Optional parameter. If you only want to initialize a model for future testing or prediction, set max_epochs to 0. If not set, max_epochs = 5 and early stopping with default parameters are used. epochs has no effect for Scikit-Learn models; set num_iter, tol, and other epoch-related parameters via **kwargs instead.
loss (str or torch.nn (optional)) – Loss function that will be used during the training. The default one is CrossEntropy or default loss for HuggingFace Transformers models. You can use any custom loss function, but it must inherit torch.nn.modules.loss._Loss.
metrics (dict or list of dicts (optional)) – Metrics that will be used to evaluate model performance and logged. Can be a single dict or a list of dicts. Each dict corresponds to one metric. name (str): name of a metric. If the name is one of the supported metrics, it will be loaded and used automatically. log (str): logging level. Can be ‘epoch’ to log the metric only at the end of each epoch, ‘step’ to log it on each training step, or ‘verbose’ to log it on each step and show it alongside the progress bar. metric (Metric): your custom metric object. Optional parameter. You can use any custom metrics, but they must inherit from torchmetrics.Metric. If not set, accuracy and mean IoU are verbose logged and precision and recall are logged after each epoch.
batch_size (int (default = 32)) – Number of training samples used in one iteration. Only works for neural networks.
repeat (int (default = 1)) – Increase size of a dataset by repeating it n times. Can be useful if dataset is very small.
augment (bool or sequence of str (default = False)) – Apply augmentations to dataset. Only works for neural networks. No augmentations applied if set to False. If set to True then the default augmentations (RandomResizedCrop, RandomHorizontalFlip) are applied. You can pass your own sequence of augmentations, they will be applied to data in the given order. You can use any custom augmentations, but they must inherit torchvision.transforms.v2.Transform.
lr (float (default = 1e-3)) – Learning rate of a model. A lower value usually results in better model convergence, but much slower training. lr has no effect for Scikit-Learn models; set learning_rate_init, alpha, and other lr-related parameters via **kwargs instead.
generate_features (bool (default = False)) – If set to True, intensity, gradient intensity and local structure features will be generated, as described here. Can result in better segmentation quality, but can also significantly increase training time. Only works for scikit-learn models.
num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. Can increase training speed, but can also cause errors (e.g. pickling errors).
precision (str (optional)) – Precision that will be used in the training process. Lower precision requires less memory, but can sometimes cause errors. More info can be found here.
**kwargs – Additional keyword arguments that are used to initialize model. They are different for every model, so read the documentation.

Returns:

Trained model.

Return type:

torch.nn model or SklearnModel
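
The dataset dicts described above (path and sub required, y optional) can be assembled with a small helper. make_dataset_spec is a hypothetical name, not an RSP function; it only checks the shape of the dict and does not verify that the dataset or its subdatasets exist.

```python
def make_dataset_spec(path, sub, y=None):
    """Build a dataset dict with the documented keys: path, sub, optional y."""
    if not isinstance(sub, (str, list)):
        raise TypeError("sub must be a name, a list of names, or 'all'")
    spec = {"path": str(path), "sub": sub}
    if y is not None:
        # Only needed when the dataset holds more than one target variable.
        spec["y"] = y
    return spec


train_ds = make_dataset_spec("/home/rsp_test/model/montana.rspds", ["area_1", "area_2"])
val_ds = make_dataset_spec("/home/rsp_test/model/montana.rspds", "val", y="forest_types")
print(train_ds)
print(val_ds)
```

A list of such dicts can then be passed as train_datasets or val_datasets to train on several datasets at once.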

Examples

>>> import remote_sensing_processor as rsp
>>> x = ["/home/rsp_test/mosaics/sentinel/sentinel.json", "/home/rsp_test/mosaics/dem/dem.json"]
>>> y = [
...     {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover.tif"},
...     {"name": "forest_types", "path": "/home/rsp_test/mosaics/forest_types.gpkg", "burn_value": "class"},
... ]
>>> out_file = "/home/rsp_test/model/landcover_dataset.rspds"
>>> dataset_path = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> # We will train model to predict forest types
>>> train_ds = {"path": dataset_path, "sub": "train", "y": "forest_types"}
>>> val_ds = {"path": dataset_path, "sub": "val", "y": "forest_types"}
>>> model = rsp.semantic.train(
...     train_ds,
...     val_ds,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 10, "early_stopping": False},
...     batch_size=32,
... )
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name    | Type                           | Params
-----------------------------------------------------------
0 | model   | UperNetForSemanticSegmentation | 59.8 M
1 | loss_fn | CrossEntropyLoss               | 0
-----------------------------------------------------------
59.8 M    Trainable params
0         Non-trainable params
59.8 M    Total params
239.395   Total estimated model params size (MB)
Epoch 9: 100% #############################################
223/223 [1:56:20<00:00, 31.30s/it, v_num=54,
train_loss_step=0.326, train_acc_step=0.871, train_auroc_step=0.796, train_iou_step=0.655,
val_loss_step=0.324, val_acc_step=0.869, val_auroc_step=0.620, val_iou_step=0.678,
val_loss_epoch=0.334, val_acc_epoch=0.807, val_auroc_epoch=0.795, val_iou_epoch=0.688,
train_loss_epoch=0.349, train_acc_epoch=0.842, train_auroc_epoch=0.797, train_iou_epoch=0.648]
`Trainer.fit` stopped: `max_epochs=10` reached.

>>> ds_mo = "/home/rsp_test/model/montana.rspds"
>>> ds_id = "/home/rsp_test/model/idaho.rspds"
>>> # Training on two different datasets - one from Montana and one from Idaho
>>> train_datasets = [
...     {"path": ds_mo, "sub": ["area_1", "area_2"]},
...     {"path": ds_id, "sub": ["area_3", "area_6", "area_8"]},
... ]
>>> val_datasets = [
...     {"path": ds_mo, "sub": ["area_3", "area_4"]},
...     {"path": ds_id, "sub": ["area_1"]},
... ]
>>> model = rsp.semantic.train(
...     train_datasets,
...     val_datasets,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 100, "early_stopping": False},
...     batch_size=32,
... )
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name    | Type                           | Params
-----------------------------------------------------------
0 | model   | UperNetForSemanticSegmentation | 59.8 M
1 | loss_fn | CrossEntropyLoss               | 0
-----------------------------------------------------------
59.8 M    Trainable params
0         Non-trainable params
59.8 M    Total params
239.395   Total estimated model params size (MB)
Epoch 99: 100% #############################################
223/223 [1:56:20<00:00, 31.30s/it, v_num=54, train_loss_step=0.326,
train_acc_step=0.871, train_auroc_step=0.796, train_iou_step=0.655,
val_loss_step=0.324, val_acc_step=0.869, val_auroc_step=0.620, val_iou_step=0.678,
val_loss_epoch=0.334, val_acc_epoch=0.807, val_auroc_epoch=0.795, val_iou_epoch=0.688,
train_loss_epoch=0.349, train_acc_epoch=0.842, train_auroc_epoch=0.797, train_iou_epoch=0.648]
`Trainer.fit` stopped: `max_epochs=100` reached.
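
Both examples above disable early stopping. To show how the min_delta and patience keys of epochs interact when it is enabled, here is a minimal sketch of the usual patience-based rule. It assumes the monitored quantity is a validation loss where lower is better; the function stopping_epoch, the default values, and the loss series are made up for illustration, and the source does not state which quantity RSP monitors or what its defaults are.

```python
def stopping_epoch(val_losses, min_delta=0.0, patience=3):
    """Return the index of the epoch at which training would stop."""
    best = float("inf")
    stale = 0  # epochs since the last qualifying improvement
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:
            # The loss dropped by more than min_delta: count it as an improvement.
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1  # ran through all epochs without stopping early


losses = [0.90, 0.70, 0.65, 0.66, 0.64, 0.645, 0.65, 0.66]
print(stopping_epoch(losses, min_delta=0.02, patience=3))  # 5
print(stopping_epoch(losses, min_delta=0.0, patience=3))   # 7
```

With min_delta=0.02 the small drop to 0.64 does not count as an improvement, so training stops two epochs earlier than with min_delta=0.0.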

remote_sensing_processor.semantic.test(test_datasets, model, metrics=None, batch_size=32, num_workers=0)

Tests segmentation model.

Parameters:

test_datasets (dict or list of dicts) – Dataset generated by generate_tiles() function that will be used to test the model. Each dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Required parameter. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for testing should be defined. Optional parameter. You can provide a list of datasets to test model on multiple datasets.
model (torch.nn model or SklearnModel or path to a model file) – Model to test. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.
metrics (dict or list of dicts (optional)) – Metrics that will be used to evaluate model performance and logged. Can be a single dict or a list of dicts. Each dict corresponds to one metric. name (str): name of a metric. If the name is one of the supported metrics, it will be loaded and used automatically. log (str): logging level. Can be ‘epoch’ to log the metric only at the end of each epoch, ‘step’ to log it on each training step, or ‘verbose’ to log it on each step and show it alongside the progress bar. metric (Metric): your custom metric object. Optional parameter. You can use any custom metrics, but they must inherit from torchmetrics.Metric. If not set, the metrics used in the training process are evaluated.
batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.
num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. Can increase training speed, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.test({"path": ds, "sub": "test"}, model=model, batch_size=32)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│      test_acc_epoch       │    0.8231202960014343     │
│     test_auroc_epoch      │    0.7588028311729431     │
│      test_iou_epoch       │    0.69323649406433105    │
│      test_loss_epoch      │    0.40799811482429504    │
│   test_precision_epoch    │    0.8231202960014343     │
│     test_recall_epoch     │    0.8231202960014343     │
└───────────────────────────┴───────────────────────────┘
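
In the output above, test_acc_epoch, test_precision_epoch, and test_recall_epoch share the same value. The source does not say how these metrics are averaged. One possible explanation, stated here as an assumption, is micro-averaging: for single-label multiclass predictions, micro-averaged precision and recall both equal the overall accuracy. The sketch below checks this on made-up labels; micro_scores is a hypothetical helper, not an RSP or torchmetrics function.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged accuracy, precision and recall for single-label data."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        for t, p in zip(y_true, y_pred):
            tp += (t == c and p == c)  # correctly predicted as c
            fp += (t != c and p == c)  # predicted c, but true class differs
            fn += (t == c and p != c)  # true class c, but predicted otherwise
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, tp / (tp + fp), tp / (tp + fn)


y_true = [1, 1, 2, 2, 3, 3, 3, 1]
y_pred = [1, 2, 2, 2, 3, 1, 3, 1]
accuracy, precision, recall = micro_scores(y_true, y_pred)
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

Every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), which is why the three values coincide.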

remote_sensing_processor.semantic.generate_map(dataset, model, output, reference_dataset=None, batch_size=32, num_workers=0, write_stac=True)

Create a map using pre-trained model.

Parameters:

dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path (path as str): a path to a dataset. Required parameter. sub (str): subdataset name, list of subdataset names or ‘all’. Optional parameter. If not defined, prediction for the whole dataset will be performed. y (str): if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined. Optional parameter.
model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.
output (path as a string) – Path where to write an output map.
reference_dataset (dict (optional)) – Dataset generated by generate_tiles() function that will be used to reconstruct original class values and nodata if the prediction dataset has no target variable (‘y’). Dataset can contain 2 elements: path: a path to a dataset. y: if there is more than one target variable in the dataset, the name of the variable that should be used for reconstruction.
batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.
num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can increase training speed, but can also cause errors (e.g. pickling errors).
write_stac (bool (default = True)) – If True, then output metadata is saved to a STAC file.

Returns:

Path where output raster is saved.

Return type:

pathlib.Path

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> output_map = "/home/rsp_test/prediction.tif"
>>> rsp.semantic.generate_map({"path": ds, "y": "landcover"}, model, output_map)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]

>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/upernet.ckpt"
>>> output_map = "/home/rsp_test/prediction.tif"
>>> rsp.semantic.generate_map(ds, model, output_map)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]

>>> # Train model on data from Montana
>>> x_montana_files = ["/home/rsp_test/mosaics/landsat_montana/landsat.json"]
>>> y_montana_files = {"name": "landcover", "path": "/home/rsp_test/mosaics/landcover_montana/landcover.tif"}
>>> ds_montana = rsp.semantic.generate_tiles(
...     x_montana_files,
...     y_montana_files,
...     "/home/rsp_test/model/montana.rspds",
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> train_ds = {"path": ds_montana, "sub": "train"}
>>> val_ds = {"path": ds_montana, "sub": "val"}
>>> model_montana = rsp.semantic.train(
...     train_ds,
...     val_ds,
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     epochs={"max_epochs": 10, "early_stopping": False},
...     batch_size=32,
... )
>>> # Use model to map landcover of Idaho
>>> x_idaho_files = ["/home/rsp_test/mosaics/landsat_idaho/landsat.json"]
>>> ds_idaho = rsp.semantic.generate_tiles(x_idaho_files, None, "/home/rsp_test/model/idaho.rspds", tile_size=256)
>>> output_map = "/home/rsp_test/prediction_idaho.tif"
>>> pred_ds = {"path": ds_idaho}
>>> ref_ds = {"path": ds_montana, "y": "landcover"}
>>> rsp.semantic.generate_map(pred_ds, model_montana, output_map, reference_dataset=ref_ds)
Predicting: 100% #################### 372/372 [32:16, 1.6s/it]

remote_sensing_processor.semantic.band_importance(dataset, model, target_class=None, num_init_images=100, num_images=500, batch_size=32, num_workers=0)

Explain the band importance for a pre-trained model using SHAP.

Parameters:

dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path: a path to a dataset. sub: subdataset name, list of subdataset names or ‘all’. If not defined, prediction for the whole dataset will be performed. y: if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined.
model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.
target_class (int (optional)) – Index of the class for which to analyze band importance. If not set, all classes are analyzed.
num_init_images (int (default = 100)) – Number of images that will be used to initialize the SHAP explainer. We strongly recommend using very small num_init_images with sklearn models.
num_images (int (default = 500)) – Number of images that will be used to explain band importance. We strongly recommend using very small num_images with sklearn models.
batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.
num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can increase training speed, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.band_importance({"path": ds, "y": "landcover"}, model)
PartitionExplainer explainer: 100%|██████████████████████████████████████████▉| 499/500 [42:07<00:05,  5.07s/it]
Landsat-B1: 0.0162
Landsat-B2: 0.0493
Landsat-B3: 0.0875
Landsat-B4: 0.0243
Landsat-B5: 0.0319
Landsat-B7: 0.0194
NDVI: 0.0353
NBR: 0.0281
slope: 0.0134
curvature: 0.0239
aspect: 0.0311
dem-norm: 0.0236

>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/xgboost.joblib"
>>> rsp.semantic.band_importance(ds, model, num_init_images=1, num_images=1)
PartitionExplainer explainer: 16385it [1:12:01,  3.78it/s]
coastal: 0.0266
blue: 0.0266
green: 0.0253
red: 0.0542
rededge071: 0.0194
rededge075: 0.0194
rededge078: 0.0196
nir: 0.0034
nir08: 0.0111
nir09: 0.0758
swir16: 0.2894
swir22: 0.0309
NDVI: 0.0483
canopyheight_norm: 0.0005
dem_norm: 0.0741
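
The printed scores are easier to compare once ranked. The sketch below parses the "name: value" lines from the second example's output, sorts them, and reports each band's share of the summed scores. It works on the printed text only; parse_importance is a hypothetical helper, and treating the scores as non-negative magnitudes that can be summed is an assumption, since the source does not say how they are aggregated.

```python
report = """\
coastal: 0.0266
blue: 0.0266
green: 0.0253
red: 0.0542
rededge071: 0.0194
rededge075: 0.0194
rededge078: 0.0196
nir: 0.0034
nir08: 0.0111
nir09: 0.0758
swir16: 0.2894
swir22: 0.0309
NDVI: 0.0483
canopyheight_norm: 0.0005
dem_norm: 0.0741
"""


def parse_importance(text):
    """Parse 'band: score' lines into a dict of floats."""
    scores = {}
    for line in text.strip().splitlines():
        name, value = line.rsplit(":", 1)
        scores[name.strip()] = float(value)
    return scores


scores = parse_importance(report)
total = sum(scores.values())
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked[:3]:
    print(f"{name}: {value:.4f} ({value / total:.1%} of the summed scores)")
```

For this output the three highest-ranked bands are swir16, nir09, and dem_norm.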

remote_sensing_processor.semantic.confusion_matrix(dataset, model, batch_size=32, num_workers=0)

Generate confusion matrix.

Row indices of the confusion matrix correspond to the true class labels and column indices correspond to the predicted class labels.

Parameters:

dataset (dict) – Dataset generated by generate_tiles() function that will be used for prediction. Dataset can contain 3 elements: path: a path to a dataset. sub: subdataset name, list of subdataset names or ‘all’. If not defined, prediction for the whole dataset will be performed. y: if there is more than one target variable in dataset, then the name of the variable that should be used for original data reconstruction should be defined.
model (torch.nn model or SklearnModel or path to a model file) – Pre-trained model to predict target values. You can pass the model object returned by train() function or file (*.ckpt or *.joblib) where model is stored.
batch_size (int (default = 32)) – Number of samples used in one iteration. Only works for neural networks.
num_workers (int or 'auto' (default = 0)) – Number of parallel workers that will load the data. Set ‘auto’ to let RSP choose the optimal number of workers, set 0 to disable multiprocessing. It can increase training speed, but can also cause errors (e.g. pickling errors).

Examples

>>> import remote_sensing_processor as rsp
>>> x, y, out_file = ...
>>> ds = rsp.semantic.generate_tiles(
...     x,
...     y,
...     out_file,
...     tile_size=256,
...     shuffle=True,
...     split={"train": 3, "val": 1, "test": 1},
... )
>>> model = rsp.semantic.train(
...     {"path": ds, "sub": "train"},
...     {"path": ds, "sub": "val"},
...     model="UperNet",
...     backbone="ConvNeXTV2",
...     model_file="/home/rsp_test/model/upernet.ckpt",
...     batch_size=32,
... )
>>> rsp.semantic.confusion_matrix({"path": ds, "y": "landcover"}, model)
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [03:35<00:00, 6.71s/it]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [06:22<00:00, 16.10s/it]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|██████████████████████████████████████████▉|77/77 [11:27<00:00, 0.11it/s]
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [01:04<00:00, 2.73s/it]
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
|   | 0 |        1 |       2 |       3 |        4 |       5 |       6 |     7 | 8 | 9 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 0 | 0 |        0 |       0 |       0 |        0 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 1 | 0 | 57715149 |  616341 |  457550 |  1406527 |  508368 |   92479 |  6481 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 2 | 0 |  2844286 | 1599082 | 1083022 |  1371587 |  174298 |   40392 |  2411 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 3 | 0 |  1831478 |  558398 | 2673250 |  5310441 |  555055 |   94868 |  8320 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 4 | 0 |  1981444 |  188564 | 1732960 | 18768424 | 5257066 |  408550 |  8889 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 5 | 0 |   798174 |   24353 |  120365 |  4698820 | 7825820 | 1432090 | 20451 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 6 | 0 |   246685 |    9658 |   13940 |   280521 | 2202678 | 2090811 | 57364 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 7 | 0 |    12279 |    1288 |    2347 |    11409 |  112254 |  345227 | 57887 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 8 | 0 |        0 |       0 |       0 |      585 |       8 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 9 | 0 |       27 |       0 |       0 |      643 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+

>>> ds = {"path": "/home/rsp_test/model/ds.rspds"}
>>> model = "/home/rsp_test/model/xgboost.joblib"
>>> rsp.semantic.confusion_matrix(ds, model)
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [03:35<00:00, 6.71s/it]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [06:22<00:00, 16.10s/it]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|██████████████████████████████████████████▉|77/77 [11:27<00:00, 0.11it/s]
Loading dataset from disk: 100%|██████████████████████████████████████████▉|37/37 [01:04<00:00, 2.73s/it]
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
|   | 0 |        1 |       2 |       3 |        4 |       5 |       6 |     7 | 8 | 9 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 0 | 0 |        0 |       0 |       0 |        0 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 1 | 0 | 57715149 |  616341 |  457550 |  1406527 |  508368 |   92479 |  6481 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 2 | 0 |  2844286 | 1599082 | 1083022 |  1371587 |  174298 |   40392 |  2411 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 3 | 0 |  1831478 |  558398 | 2673250 |  5310441 |  555055 |   94868 |  8320 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 4 | 0 |  1981444 |  188564 | 1732960 | 18768424 | 5257066 |  408550 |  8889 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 5 | 0 |   798174 |   24353 |  120365 |  4698820 | 7825820 | 1432090 | 20451 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 6 | 0 |   246685 |    9658 |   13940 |   280521 | 2202678 | 2090811 | 57364 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 7 | 0 |    12279 |    1288 |    2347 |    11409 |  112254 |  345227 | 57887 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 8 | 0 |        0 |       0 |       0 |      585 |       8 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
| 9 | 0 |       27 |       0 |       0 |      643 |       0 |       0 |     0 | 0 | 0 |
+---+---+----------+---------+---------+----------+---------+---------+-------+---+---+
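
Since rows are true labels and columns are predicted labels, per-class precision, recall, and IoU can be derived from the matrix directly. The following is a minimal sketch on a made-up 3-class matrix; per_class_metrics is a hypothetical helper, not part of RSP, and it assumes the matrix is available as a list of lists.

```python
def per_class_metrics(matrix):
    """Precision, recall and IoU per class; rows are true, columns predicted."""
    n = len(matrix)
    metrics = {}
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true c, predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, true class differs
        metrics[c] = {
            # Classes with no samples and no predictions get 0.0 instead of a division error.
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "iou": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        }
    return metrics


# Made-up 3-class matrix, not taken from the output above.
cm = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 10, 40],
]
class_metrics = per_class_metrics(cm)
for label, scores in class_metrics.items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
overall_accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))
print(overall_accuracy)  # 0.8
```

The zero-division guard matters for matrices like the ones above, where the nodata class 0 has an all-zero row and column.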

List of available NN models

Transformers backbones are:

BEiT
BiT
ConvNeXT
ConvNeXTV2
DiNAT
DINOV2
DINOV2WithRegisters
DINOV3ViT
DINOV3ConvNeXT
FocalNet
HGNet-V2
Hiera
LW-DETR
MaskFormer-Swin
Pixio
PVTV2
ResNet
RT-DETR-ResNet
Swin
SwinV2
ViTDet
Any TIMM backbone (experimental support)

You can fine-tune a pre-trained model by defining weights. For models from Transformers, you can get the available weights from the Hugging Face Hub; for Torchvision models, just set weights = True.

rsp.semantic.train also saves CSV and TensorBoard logs in the directory where the checkpoint file is saved.

The DiNAT backbone requires the natten library, which is not available on Windows and Mac and is not available via Conda. RSP supports the DiNAT backbone, but you need to install natten in your Python environment manually.

List of available Scikit-learn models

Supported Scikit-Learn Models
Model Name	Kernel/solver	Warm start	Reference
Logistic Regression	“lbfgs”, “liblinear”, “newton-cg”, “newton-cholesky”, “sag”, “saga”	Only for lbfgs, newton-cg, sag, saga solvers	Scikit-learn
Ridge	Not available	Not supported	Scikit-learn
SGD	“hinge”, “log_loss”, “modified_huber”, “squared_hinge”, “perceptron”	Supported	Scikit-learn
Nearest Neighbors	Not available	Not supported	Scikit-learn
Radius Neighbors	Not available	Not supported	Scikit-learn
SVM	“rbf”, “linear”, “poly”, “sigmoid”	Not supported	Scikit-learn
Gaussian Process	Not available	Not supported	Scikit-learn
Naive Bayes	“gaussian”, “bernoulli”, “categorical”, “complement”, “multinomial”	Not supported	Scikit-learn
QDA	Not available	Not supported	Scikit-learn
LDA	Not available	Not supported	Scikit-learn
Decision Tree	Not available	Not supported	Scikit-learn
Extra Tree	Not available	Not supported	Scikit-learn
Random Forest	Not available	Supported	Scikit-learn
Extra Trees	Not available	Supported	Scikit-learn
AdaBoost	Not available	Not supported	Scikit-learn
Gradient Boosting	Not available	Supported	Scikit-learn
Multilayer Perceptron	“adam”, “sgd”, “lbfgs”	Supported	Scikit-learn
XGBoost	Not available	Not supported	XGBoost
XGB Random Forest	Not available	Not supported	XGBoost

The model kernel or solver is defined with the backbone argument.

Models that support warm start can be fine-tuned from a pre-trained model with the checkpoint argument.
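To illustrate what warm start means at the Scikit-learn level, here is a minimal sketch using plain Scikit-learn. It does not show how RSP loads a checkpoint, and the synthetic data and variable names are assumptions made for the example. A warm-started Random Forest keeps the trees it already has and only grows the additional ones on the next fit:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for pixel samples (hypothetical data): 3 classes, 10 features.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_old, y_old = X[:1000], y[:1000]
X_new, y_new = X[1000:], y[1000:]

# "Pre-training": fit 10 trees with warm_start enabled.
model = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
model.fit(X_old, y_old)
first_trees = list(model.estimators_)

# "Fine-tuning": raise n_estimators and fit again.
# The first 10 trees are kept; 10 new trees are grown on the new data.
model.set_params(n_estimators=20)
model.fit(X_new, y_new)

print(len(model.estimators_))
```

Models listed as "Not supported" in the table have no equivalent mechanism, so they are always trained from scratch.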

Some models can have issues while saving, especially when trained on big datasets. Some models (like SVM) can take a very long time to train, and others (like Gaussian Process) can run into memory issues on big datasets. We therefore recommend using Scikit-learn models only for small datasets.

For Random Forest and Extra Trees models, RSP sets max_depth to 6 by default, because the Scikit-learn default is unlimited depth and training could be very slow. To train with unlimited tree depth, set max_depth = None.
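The effect of the depth cap can be checked directly in Scikit-learn. The sketch below uses plain Scikit-learn estimators rather than RSP; the synthetic data and the helper max_tree_depth are assumptions made for the example. It compares a forest capped at depth 6 with an uncapped one:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for pixel samples (hypothetical data).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=6, n_classes=3, random_state=0
)

def max_tree_depth(forest):
    """Return the depth of the deepest tree in a fitted forest."""
    return max(tree.get_depth() for tree in forest.estimators_)

# Depth capped at 6 (the RSP default described above).
capped = RandomForestClassifier(n_estimators=20, max_depth=6, random_state=0).fit(X, y)

# Unlimited depth (the Scikit-learn default): trees grow until leaves are pure.
uncapped = RandomForestClassifier(n_estimators=20, max_depth=None, random_state=0).fit(X, y)

print(max_tree_depth(capped), max_tree_depth(uncapped))
```

Uncapped trees usually fit the training data more closely, at the cost of longer training and larger saved models.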

List of available losses

Supported loss functions
Loss	Reference
cross_entropy	Torch
nll	Torch
jaccard	Segmentation Models Pytorch
dice	Segmentation Models Pytorch
tversky	Segmentation Models Pytorch
focal	Segmentation Models Pytorch
lovasz	Segmentation Models Pytorch

You can also use a custom loss, which is useful if you want to initialize a loss with custom parameters. Any custom loss class can be passed; the only requirement is that it inherits from torch.nn.modules.loss._Loss.

List of available metrics

Supported metrics
Metric	Additional parameters	Reference
accuracy_macro	average="macro"	Torchmetrics
accuracy_micro	average="micro"	Torchmetrics
cohen_kappa	None	Torchmetrics
exact_math	None	Torchmetrics
f1_macro	average="macro"	Torchmetrics
f1_micro	average="micro"	Torchmetrics
hamming_distance_macro	average="macro"	Torchmetrics
hamming_distance_micro	average="micro"	Torchmetrics
jaccard_index_macro	average="macro"	Torchmetrics
jaccard_index_micro	average="micro"	Torchmetrics
matthews_correlation_coefficient	None	Torchmetrics
negative_predictive_value_macro	average="macro"	Torchmetrics
negative_predictive_value_micro	average="micro"	Torchmetrics
precision_macro	average="macro"	Torchmetrics
precision_micro	average="micro"	Torchmetrics
recall_macro	average="macro"	Torchmetrics
recall_micro	average="micro"	Torchmetrics
dice_score_macro	average="macro"	Torchmetrics
dice_score_micro	average="micro"	Torchmetrics
generalized_dice_score	None	Torchmetrics
mean_iou	None	Torchmetrics

For most semantic segmentation metrics, both micro-averaged and macro-averaged versions are available by default.
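The difference between the two averaging modes matters most when classes are imbalanced. The following plain-Python sketch illustrates the general definitions for accuracy; it is not the Torchmetrics implementation, and the labels are made up for the example. Micro averaging counts every pixel equally, while macro averaging computes the score per class and then averages over classes:

```python
def micro_accuracy(y_true, y_pred):
    """Fraction of all samples that are predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Mean of per-class accuracies, so every class has the same weight."""
    per_class = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        per_class.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


# Imbalanced example: 8 pixels of class 1, 2 pixels of class 2.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2]

micro = micro_accuracy(y_true, y_pred)  # 9 of 10 pixels correct -> 0.9
macro = macro_accuracy(y_true, y_pred)  # (8/8 + 1/2) / 2 -> 0.75
print(micro, macro)
```

Here the rare class is misclassified half of the time, which the macro score reflects much more strongly than the micro score.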

You can also use any custom metric for evaluation. The only requirement is that it inherits from torchmetrics.metric.Metric.

Supported augmentations

Supported augmentations
Augmentation	Additional parameters	Reference
ScaleJitter	None	Torchvision
RandomResizedCrop	antialias=True	Torchvision
RandomHorizontalFlip	p=0.5	Torchvision
RandomVerticalFlip	p=0.5	Torchvision
RandomZoomOut	None	Torchvision
RandomRotation	degrees=90	Torchvision
RandomAffine	degrees=90, translate=(0.5, 0.5), shear=0.5	Torchvision
RandomPerspective	None	Torchvision
ElasticTransform	None	Torchvision
GaussianBlur	kernel_size=(5, 9)	Torchvision

If you just pass augment=True, RSP will use a default sequence of augmentations: (“RandomResizedCrop”, “RandomHorizontalFlip”). You can also pass your own sequence of augmentations, which are applied to the data in the given order. The sequence can mix supported augmentation names and custom augmentations; custom augmentations must inherit torchvision.transforms.v2.Transform.