ffs_train is a wrapper function for a simple use of the forward feature selection approach to training random forest classification models. This kind of validation is particularly suitable for leave-location-out cross-validation, where variable selection must be based on the performance of the model on the held-out location (station). See Meyer et al. (2018) for further details. This is the case when vegetation patterns that vary in time and space are used for classification. For UAV-based RGB/NIR imagery, the function provides a preconfiguration optimized for this classification task.
ffs_train(
trainingDF = NULL,
predictors = c("R", "G", "B"),
response = "ID",
spaceVar = "FN",
names = c("ID", "R", "G", "B", "A", "FN"),
noLoc = NULL,
sumFunction = "twoClassSummary",
pVal = 0.5,
prefin = "final_",
preffs = "ffs_",
modelSaveName = "model.RData",
runtest = FALSE,
seed = 100,
withinSE = TRUE,
mtry = 2,
noClu = 1
)
trainingDF: data frame. Contains the training data.
predictors: character. Vector of predictor names as given in the header of the training data table.
response: character. Name of the response variable as given in the header of the training data table.
spaceVar: character. Name of the space-time splitting variable as given in the header of the training data table.
names: character. All names of the data frame header.
noLoc: numeric. Number of locations to leave out; usually the number of discrete training locations/images.
sumFunction: character. Function used to summarize model performance; default is "twoClassSummary".
pVal: numeric. Fraction of the training data that is used; default is 0.5.
prefin: character. Name pattern used for the model; default is "final_".
preffs: character. Name pattern used for ffs; default is "ffs_".
modelSaveName: character. Name pattern used for saving the model; default is "model.RData".
runtest: logical. Default is FALSE; if TRUE, an external validation is performed.
seed: numeric. Seed for the random number generator.
withinSE: logical. If TRUE, performance is compared to models that use fewer variables (e.g. if a model using 5 variables is better than a model using 4 variables but still within the standard error of the 4-variable model, then the 4-variable model is rated as the better model).
mtry: numeric. Number of variables randomly sampled as candidates at each split.
noClu: numeric. Number of clusters to be used.
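The withinSE rule described above can be illustrated with a small base R sketch. The results table and the walk-forward comparison are hypothetical simplifications made up for illustration; they are not the code that ffs_train runs internally.

```r
# Hypothetical performance per model size (values invented for illustration)
results <- data.frame(
  nvar = c(2, 3, 4, 5),             # number of variables in the model
  perf = c(0.70, 0.78, 0.84, 0.85), # performance metric, higher is better
  se   = c(0.03, 0.03, 0.02, 0.02)  # standard error of that metric
)

# Walk from small to large models. A larger model only replaces the
# current best if it beats it by more than the current best's standard error.
best <- 1
for (i in seq_len(nrow(results))[-1]) {
  if (results$perf[i] > results$perf[best] + results$se[best]) {
    best <- i
  }
}

selected_within_se <- results$nvar[best]                    # 4
selected_by_max    <- results$nvar[which.max(results$perf)] # 5
```

In this made-up table the 5-variable model has the highest raw score, but it lies within one standard error of the 4-variable model, so the simpler model is preferred.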
A random forest classification model obtained by forward feature selection.
The workflow of uavRst is intended to use forward feature selection as described by Meyer et al. (2018).
This approach needs at least a pair of images that differ in time and/or space for leave-one-location-out validation. If you have only a single image, you can work around this requirement by tiling the image and providing separate training data for each tile.
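What "leave one location out" means for the training table can be sketched in base R. The data frame below is hypothetical; its column names simply follow the defaults of ffs_train (response "ID", predictors "R", "G", "B", splitting variable "FN"), and the fold construction is an illustration of the idea rather than the function's internal code.

```r
# Hypothetical training table: three images (or tiles), four samples each
set.seed(100)
trainingDF <- data.frame(
  ID = factor(rep(c("veg", "soil"), times = 6)),
  R  = runif(12),
  G  = runif(12),
  B  = runif(12),
  FN = rep(c("img_a", "img_b", "img_c"), each = 4)
)

# One fold per location: train on all other locations, hold this one out
locations <- unique(trainingDF$FN)
folds <- lapply(locations, function(loc) {
  list(
    train   = which(trainingDF$FN != loc),
    holdout = which(trainingDF$FN == loc)
  )
})
names(folds) <- locations

# Number of held-out samples per fold
n_holdout <- vapply(folds, function(f) length(f$holdout), integer(1))
```

With a single value in the splitting column there would be only one fold and nothing left to train on, which is why at least two locations (or tiles) are needed.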
If you just want to classify a single image using a single training file, use the normal procedure as provided by the trainControl function.
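For the single-image case, a minimal sketch with ordinary k-fold cross-validation might look as follows. It assumes the caret and randomForest packages are installed and is skipped otherwise; the training table is hypothetical, and the choice of 5 folds and mtry = 2 is only an example, not a recommended configuration.

```r
# Hypothetical single-image training table with two separable classes
set.seed(100)
n <- 60
trainingDF <- data.frame(
  ID = factor(rep(c("veg", "soil"), each = n / 2)),
  R  = c(rnorm(n / 2, 0.3, 0.1), rnorm(n / 2, 0.6, 0.1)),
  G  = c(rnorm(n / 2, 0.6, 0.1), rnorm(n / 2, 0.4, 0.1)),
  B  = c(rnorm(n / 2, 0.2, 0.1), rnorm(n / 2, 0.3, 0.1))
)

fit <- NULL
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  # Plain 5-fold cross-validation instead of leave-location-out folds
  ctrl <- caret::trainControl(
    method          = "cv",
    number          = 5,
    classProbs      = TRUE,
    summaryFunction = caret::twoClassSummary
  )
  # Random forest on the three predictors, evaluated by ROC
  fit <- caret::train(
    x         = trainingDF[, c("R", "G", "B")],
    y         = trainingDF$ID,
    method    = "rf",
    metric    = "ROC",
    trControl = ctrl,
    tuneGrid  = data.frame(mtry = 2)
  )
}
```

Because all samples come from one image, the cross-validated score here says nothing about how well the model transfers to other locations.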