Convolutional Neural Network (CNN) for Dwelling Extraction in Refugee/IDP Camps

Activity: Talk or presentation; Oral presentation; science to science / art to art


Both human-made and natural disasters are the main causes of population displacement, and refugee / IDP (internally displaced people) camps are often the first accommodation for people who have been forced to flee their homes. Optical very high spatial resolution (VHR) satellite images are increasingly available and increasingly used to support camp planning and humanitarian aid efficiently. These images have significant potential to give humanitarian organisations critical information for a deeper understanding of the camp situation. VHR satellite images are therefore considered the primary source of information, including the number, type and size of dwellings and an estimated number of refugees, for tasks that demand such high-level information. However, processing the satellite images into useful high-level information is challenging. Extracting and categorising dwellings is demanding because dwellings vary widely in colour, size, and placement. Several studies have implemented object-based image analysis (OBIA), template matching and (semi-)automated workflows for dwelling extraction and classification. These workflows are mostly expert-knowledge-based rule-sets, which are easily transferable to different camps. Several neural networks (NN) and other machine learning methods have been used in different studies for VHR satellite image classification. During the past decade, deep neural networks, and convolutional neural networks (CNN) in particular, marked a new epoch in the application of NNs in computer vision and image understanding. Because of the state-of-the-art achievements of CNNs in image analysis tasks and the vast availability of labelled VHR satellite images, there is growing interest in using CNNs for object detection, image classification, scene annotation and so-called semantic segmentation.
CNNs are supervised multilayer feed-forward neural networks that are tailored to specific image analysis tasks. Even though CNN models have reached high accuracies for some object extraction tasks in VHR satellite images, e.g., detection of vehicles, roads and aeroplanes, the potential and challenges of using these models for dwelling extraction in refugee/IDP camps have not been fully explored. This application poses specific challenges, including non-uniform and varied dwelling shapes and objects that are very small relative to the spatial resolution of the usable (satellite) imagery. In this study, we evaluate a specific CNN model in Trimble’s eCognition software environment, based on the Google TensorFlow library, for extracting different dwelling types in a refugee camp in Cameroon. We used a training data set of labelled images obtained from an operational service for humanitarian mapping at the University of Salzburg, Interfaculty Department of Geoinformatics (Z_GIS). The input data comprise four spectral bands of a WorldView-3 image captured on 12 April 2015: blue (450–510 nm), green (510–580 nm), red (630–690 nm), and near-infrared (770–895 nm). We used different input window sizes to match the various sizes of the different dwelling types. To handle the resulting differences in sample patch size, our eCognition-based CNNs were structured with different layer depths. Integrating the CNN in OBIA software will allow the CNN results to be used in knowledge-based analyses in future studies. To validate the resulting dwelling extractions precisely, we used the mean intersection-over-union (mIOU) and a manually labelled test dataset. The mIOU is a validation metric that is widely used in computer vision, mostly to measure the accuracy of object detection results.
It is a general validation metric: any method that produces bounding polygons can be validated with the mIOU against a precise test dataset of target polygons. It is defined as the mean of the IOU values given by equation (1):

IOU = (Area of Overlap) / (Area of Union)    (1)

The mIOU value was calculated from the resulting true positives (TP), false positives (FP), and false negatives (FN) for each dwelling extraction.
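As an illustration of equation (1), the following minimal Python sketch computes the IOU and mIOU under one stated assumption: each dwelling is represented as a set of pixel coordinates, so that TP, FP and FN are pixel counts and IOU = TP / (TP + FP + FN). The abstract does not say whether TP, FP and FN were counted per pixel or per object, and the names `iou`, `mean_iou` and `rect` are hypothetical helpers introduced here, not part of eCognition or of the study's workflow.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]  # (row, column) of one raster cell


def iou(predicted: Set[Pixel], reference: Set[Pixel]) -> float:
    """Intersection-over-union of two pixel sets (equation 1)."""
    tp = len(predicted & reference)   # pixels in both: area of overlap
    fp = len(predicted - reference)   # extracted but not in the reference
    fn = len(reference - predicted)   # in the reference but not extracted
    union = tp + fp + fn              # area of union
    return tp / union if union else 0.0


def mean_iou(pairs: Iterable[Tuple[Set[Pixel], Set[Pixel]]]) -> float:
    """Mean of the IOU values over (predicted, reference) pairs."""
    scores = [iou(pred, ref) for pred, ref in pairs]
    return sum(scores) / len(scores) if scores else 0.0


def rect(r0: int, c0: int, r1: int, c1: int) -> Set[Pixel]:
    """Toy dwelling footprint: all pixels in rows r0..r1-1, columns c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two illustrative dwellings (made-up footprints, not data from the study).
pairs = [
    (rect(0, 0, 4, 4), rect(0, 0, 4, 4)),  # identical footprints
    (rect(0, 0, 4, 4), rect(0, 2, 4, 6)),  # shifted by two columns
]

for pred, ref in pairs:
    print(f"IOU = {iou(pred, ref):.3f}")
print(f"mIOU = {mean_iou(pairs):.3f}")
```

In the second pair, the two 16-pixel footprints share 8 pixels, so TP = FP = FN = 8 and the IOU is 8 / 24 = 1/3; averaged with the perfect match, the mIOU is 2/3. For real bounding polygons, the same ratio would be taken over polygon areas instead of pixel counts.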
Period: 3 Jul 2019
Event title: 39th Annual EARSeL Symposium
Event type: Conference
Location: Salzburg, Austria


Keywords

  • IDP Camps
  • Convolutional Neural Network (CNN)
  • Dwelling Extraction

Fields of Science and Technology Classification 2012

  • 207 Environmental Engineering, Applied Geosciences
  • 507 Human Geography, Regional Geography, Regional Planning