Optimal sample size and composition for crop classification with Sen2-Agri’s random forest classifier
Citation
Schulthess, U., Rodrigues, F., Taymans, M., Bellemans, N., Bontemps, S., Ortiz-Monasterio, I., Gérard, B., & Defourny, P. (2023). Optimal Sample Size and Composition for Crop Classification with Sen2-Agri’s Random Forest Classifier. Remote Sensing, 15(3), 608. https://doi.org/10.3390/rs15030608
Abstract/Description
Sen2-Agri is a software system that was developed to facilitate the use of multi-temporal satellite data for crop classification with a random forest (RF) classifier in an operational setting. It automatically ingests and processes Sentinel-2 and Landsat 8 images. Our goal was to provide practitioners with recommendations for the best sample size and composition. The study area was located in the Yaqui Valley in Mexico. Using polygons of more than 6000 labeled crop fields, we prepared training data sets in which the nine crops had either an equal or a proportional representation, called Equal and Ratio, respectively. Increasing the size of the training set improved the overall accuracy (OA). Gains became marginal once the total number of fields approximated 500, or 40 to 45 fields per crop type. Equal achieved slightly higher OAs than Ratio for a given number of fields. However, recall and F-scores of the individual crops tended to be higher for Ratio than for Equal. The high number of wheat fields in the Ratio scenarios, ranging from 275 to 2128, produced a more accurate classification of wheat than the maximum of 80 wheat fields in Equal. This resulted in a higher recall for wheat in the Ratio than in the Equal scenarios, which in turn limited the errors of commission of the non-wheat crops. Thus, a proportional representation of the crops in the training data is preferable and yields better accuracies, even for the minority crops.
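The difference between the Equal and Ratio training sets can be illustrated with a minimal Python sketch. This is not Sen2-Agri code or the authors' procedure: the function name `sample_fields`, the three crop names, and the field counts are hypothetical and serve only to show how the two allocation rules differ for the same total number of fields.

```python
import random
from collections import defaultdict


def sample_fields(fields, total, mode, seed=0):
    """Draw about `total` training fields from (field_id, crop) pairs.

    mode="equal": every crop gets the same number of fields.
    mode="ratio": each crop gets a share proportional to its share of all fields.
    """
    rng = random.Random(seed)

    # Group field ids by crop label.
    by_crop = defaultdict(list)
    for field_id, crop in fields:
        by_crop[crop].append(field_id)

    chosen = {}
    for crop, ids in sorted(by_crop.items()):
        if mode == "equal":
            quota = total // len(by_crop)
        elif mode == "ratio":
            quota = round(total * len(ids) / len(fields))
        else:
            raise ValueError("mode must be 'equal' or 'ratio'")
        # A crop cannot contribute more fields than it has.
        quota = min(quota, len(ids))
        chosen[crop] = rng.sample(ids, quota)
    return chosen


# Hypothetical field inventory (assumption, not the study's data).
counts = {"wheat": 600, "maize": 300, "chickpea": 100}
fields = [(f"{crop}-{i}", crop) for crop, n in counts.items() for i in range(n)]

equal = sample_fields(fields, total=90, mode="equal")
ratio = sample_fields(fields, total=90, mode="ratio")

print({crop: len(ids) for crop, ids in equal.items()})
print({crop: len(ids) for crop, ids in ratio.items()})
```

With these made-up counts, Equal assigns 30 fields to each crop, while Ratio assigns 54 wheat, 27 maize, and 9 chickpea fields, so the dominant crop is far better represented in the Ratio training set.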
Author ORCID identifiers
Francelino Rodrigues https://orcid.org/0000-0001-7273-2217
Ivan Ortiz-Monasterio https://orcid.org/0000-0002-2572-3219
Bruno Gerard https://orcid.org/0000-0002-1079-7493