TY - JOUR
T1 - CloudSEN12, a global dataset for semantic understanding of cloud and cloud shadow in Sentinel-2
AU - Aybar, Cesar
AU - Ysuhuaylas, Luis
AU - Loja, Jhomira
AU - Gonzales, Karen
AU - Herrera, Fernando
AU - Bautista, Lesly
AU - Yali, Roy
AU - Flores, Angie
AU - Diaz, Lissette
AU - Cuenca, Nicole
AU - Espinoza, Wendy
AU - Prudencio, Fernando
AU - Llactayo, Valeria
AU - Montero, David
AU - Sudmanns, Martin
AU - Tiede, Dirk
AU - Mateo-García, Gonzalo
AU - Gómez-Chova, Luis
N1 - © 2022. The Author(s).
PY - 2022/12/24
Y1 - 2022/12/24
AB - Accurately characterizing clouds and their shadows is a long-standing problem in the Earth Observation community. Recent works showcase the necessity to improve cloud detection methods for imagery acquired by the Sentinel-2 satellites. However, the lack of consensus and transparency in existing reference datasets hampers the benchmarking of current cloud detection methods. Exploiting the analysis-ready data offered by the Copernicus program, we created CloudSEN12, a new multi-temporal global dataset to foster research in cloud and cloud shadow detection. CloudSEN12 has 49,400 image patches, including (1) Sentinel-2 level-1C and level-2A multi-spectral data, (2) Sentinel-1 synthetic aperture radar data, (3) auxiliary remote sensing products, (4) different hand-crafted annotations to label the presence of thick and thin clouds and cloud shadows, and (5) the results from eight state-of-the-art cloud detection algorithms. At present, CloudSEN12 exceeds all previous efforts in terms of annotation richness, scene variability, geographic distribution, metadata complexity, quality control, and number of samples.
UR - http://www.scopus.com/inward/record.url?scp=85144636643&partnerID=8YFLogxK
U2 - 10.1038/s41597-022-01878-2
DO - 10.1038/s41597-022-01878-2
M3 - Article
C2 - 36566333
SN - 2052-4463
VL - 9
JO - Scientific Data
JF - Scientific Data
IS - 1
M1 - 782
ER -