Human-centric Computing and Information Sciences volume 12, Article number: 58 (2022)
https://doi.org/10.22967/HCIS.2022.12.058
Although the use of high-resolution land cover maps is increasing, securing a reasonable level of accuracy while preserving detailed land cover information remains a challenge. This study optimized a deep learning model to automatically classify 41 land cover classes with remote sensing data. D-LinkNet, a semantic segmentation network, was selected as an automated extraction method to ensure the accuracy and efficiency of sub-divided (level 3) land cover classification. Aerial ortho-photos and level 3 land cover maps were used to train the model and to validate it against the training labels. D-LinkNet with a ResNet-34 backbone was applied to classify the 41 semantic classes. Accuracy was evaluated with the mean intersection over union (mIoU), which was 22.81%. To improve accuracy, the learning datasets were refined and reclassified from 41 into 28 classes, and the model was improved by exchanging the encoder for ResNet-101 and modifying the hyperparameters. The results show that the mIoU improved to 51.24% and 30.1% for training and testing, respectively.
Land cover mapping is essential for detecting environmental changes and monitoring regional environmental planning [1]. Most land cover classification methods originated in the 1970s and 1980s; however, land cover mapping technology using satellite images has made rapid progress in recent years with the incorporation of algorithms and specific classifiers [2]. Since the 2010s, various land cover classification methods using machine learning have emerged, and deep learning methods have also been explored to overcome machine learning's need for expert knowledge at the feature extraction stage [3].
Since 2019, the creation of sub-divided land cover maps across the Republic of Korea has been completed, and the nationwide renewal of such maps has continued there [4]. The land cover mapping project of the Korean Ministry of Environment has established a foundation for eco-friendly land use and effective environmental policy. Above all, level 3 land cover maps with a regional scale of 1:5,000 provide scientific evidence that can be used to assist the decision-making process of environmental impact assessments and urban planning [5]. Therefore, it is critical to improve their quality and update them promptly to keep pace with the increasing use of level 3 land cover maps [6].
Deep learning is a technology that has advanced machine learning: it is a self-learning approach in which a computer learns through an artificial neural network and verifies and refines itself through feedback [7]. Deep learning models have become a robust method for object recognition, image segmentation, and image classification. As they show better performance, with higher accuracy and shorter computation times, many studies have focused on overcoming their remaining drawbacks [8, 9].
Deep learning algorithms are also popular in land use and land cover (LULC) classification [10]. Even though traditional datasets such as color and grayscale images can be analyzed successfully, limitations remain in processing remote sensing data for land cover classification [11]. Recent research reviews have proposed improvements for the level 3 land cover mapping system focusing on three aspects: pre-processing of primary datasets, improvement of automatic extraction models, and the land cover classification system.
As the use of level 3 land cover maps increases, quality improvements and rapid updates are becoming critical. Therefore, this study applies remote sensing and artificial intelligence (AI) learning to improve the results of level 3 land cover classification while reducing human intervention. Currently, AI is applied only to classify simple properties, such as buildings, roads, rice paddies, fields, barren land, street trees, parking lots, and forests. To date, automatic classification by deep learning has generally been applied to between five and seven classes; for example, a recent study by Yoo et al. [12] applied ResU-Net to the automatic classification of LULC, which was limited to only five land use classes.
Most importantly, this study attempted to automatically classify 41 land cover attributes using remote sensing data and to improve the results by reflecting the model modifications. The objectives of this study include the following: (1) the proposal of an automated extraction method with a semantic segmentation network named D-LinkNet to explore tools for securing the accuracy and efficiency of level 3 land cover classification; (2) an investigation of a systematic and practical process to improve the pre-processing of learning datasets and the quality of datasets; and (3) an examination of the current level 3 land cover classification system, along with suggested improvements to minimize the update budget and expand various data uses.
The remainder of this study is structured as follows. Section 2 introduces related works; Section 3 describes the proposed methodology, which includes D-LinkNet application, research process, study site selection, and dataset preparation; Section 4 presents the results of the proposed model and a comparison with other models; and Section 5 presents the conclusions and limitations of this study.
Updating and constructing land cover data on an annual basis is challenging due to the high construction cost and amount of time required, so an innovative approach is needed to develop a highly accurate land cover classification system at low cost while shortening the processing time. However, the land cover maps produced in Korea have only used verified classification methods, such as classification by human judgment or simple attribute classification through AI; visual examination techniques with reference data (such as cadastral and digital maps) are still mainly used to update the land cover maps [5]. Thus, the effectiveness of the current classification system of level 3 land cover maps can be improved by reducing, integrating, and reclassifying the existing 41 items of the classification system. The standard for revising level 3 classification items needs to reflect widespread use in the field of environmental management and to reduce dependence on reference materials, such as digital topographic maps and property maps, in order to achieve the rapid construction and renewal of land cover maps [6].
However, the extraction of sub-divided land cover classifications from high-resolution satellite images presents many challenges owing to their technical limitations and unique dataset features. The classification level of input images is high and sub-divided, but features such as forest types appear similar in satellite images. Therefore, ensuring a reasonable rate of accuracy for preserving detailed land cover information is critical [13].
According to reviews of deep learning in remote sensing, convolutional neural networks (CNNs) are the most widely used deep learning models [14]. Applications of deep learning in remote sensing have increased recently, and CNNs appear to be the most prevalent model in remote sensing image analysis [13, 15]. Ma et al. [13] claimed that CNNs are highly suitable for processing multi-band remote sensing data in image classification tasks such as LULC classification, object detection, and scene classification. Carranza-Garcia et al. [15] compared deep learning models such as CNNs to machine learning methods, with CNNs showing better results in terms of the computation time required for training and testing. In addition, Pashaei et al. [16] claimed that deep learning is a powerful technique for remote sensing, particularly semantic image segmentation. Their study used deep CNN architectures to train and evaluate a wetland area with hyper-spatial resolution unmanned aircraft system (UAS) imagery, and the results showed great potential for securing land cover prediction accuracy with deep CNNs. Mahdianpari et al. [17] investigated a deep learning tool called InceptionResNetV2 with a view to attaining better wetland classification accuracy. More recent studies have applied multi-series satellite images and cross-sensor remote sensing data using deep learning models for automatic land cover classification. For instance, Kim et al. [18] used U-Net to classify seven sub-classes and compared the land cover classification accuracy of single-period data and multi-series satellite images. Kalita et al. [19] investigated a multi-sensor framework and classified samples into four classes, namely water, trees, meadow, and barren land. Naushad et al. [20] applied high-resolution imagery and deep transfer learning techniques named VGG16 (Visual Geometry Group 16) and Wide ResNet-50, using the EuroSAT dataset for LULC classification. Because the study classified only ten classes, Wide ResNet-50 achieved an accuracy of 99.17%, and classes such as forest, highway, residential, and sea/lake reached 99%.
CNN models such as U-Net, SegNet, and D-LinkNet have been used for image segmentation. U-Net, which won the ISBI cell tracking challenge in 2015, has shown outstanding performance in various biomedical segmentation applications. In particular, thanks to data augmentation it requires very few annotated images, and its training time is reasonable [21]. Yoo et al. [22] used the Residual U-Net model to generate land cover maps in a new and quicker way. Aerial ortho-images and Landsat 8 satellite images were used for level 1 and level 2 land cover classifications, which yielded a classification accuracy of 86.6% for level 1 and 71% for level 2, suggesting that fewer semantic classes result in better class accuracy. Ulmas and Liiv [1] used U-Net to examine the accuracy of land cover mapping. Pollatos et al. [23] used multispectral satellite images and trained and tested their dataset with the ResUNet model. The study implemented three levels of the class hierarchy: level 1 (five classes), level 2 (13 classes), and level 3 (22 classes); however, limitations remained in predicting uncommon sub-classes and in the mismatched resolutions of the dataset. Tan et al. [24] used U-Net with ResNet-34 to extract roads from remote sensing images, and comparative experiments showed outstanding overall accuracy.
SegNet was proposed as an efficient architecture for road scenes and indoor environments. It processes memory efficiently and reduces the calculation time during inference while using significantly fewer trainable parameters than competing models [25]. Lee and Kim [26] used SegNet to evaluate a land cover classification method with four semantic classes (urban, farmland, forest, and water), achieving an overall classification accuracy of 85.48%. Lee et al. [27] also applied SegNet and categorized six classes, namely building, tree, low vegetation (grass), impervious surface (road), car, and water, for land cover classification using multisource data.
D-LinkNet is a semantic segmentation model that was used in the 2018 DeepGlobe Road Extraction Challenge. Although more recent models have reported higher accuracy [28], D-LinkNet remains an efficient method for extracting roads by fusing low- and high-dimensional features at different resolutions [30]. Yuan et al. [29] applied D-LinkNet to generate road probability maps and proposed an improved seamline determination method for urban image mosaicking. As compared in Table 1, D-LinkNet has the advantages of securing continuous linear features and generating less pixel-based noise.
Table 1. Model comparison
Model | Competition | Feature | Note
U-Net [21] | Biomedical Image Segmentation ('15) | Image segmentation with an end-to-end fully convolutional network. Exact localization: acquisition of overall image information. Data augmentation: good performance guaranteed with little training data. Skip architecture: combines the feature maps of shallow and deep layers. | Image segmentation in the biomedical field
SegNet [25] | Pixel-wise Semantic Segmentation ('16) | Maintains boundary information: excellent performance in recognizing the shapes of large objects (roads, buildings, etc.). Efficient design: good performance with savings in memory and computation time. | Understanding of road scenes
D-LinkNet [28] | DeepGlobe Challenge ('18) | Dilated convolution layers added to the center part of LinkNet (2017). ImageNet-pretrained encoder used as the backbone. Residual blocks: two residual blocks per encoder block. Skip connections: spatial information lost during pooling is re-supplied during decoding. | Road extraction in remote sensing
D-LinkNet was originally developed for remote sensing: the challenge was to extract roads from high-resolution satellite imagery. Compared to U-Net, D-LinkNet showed superior results on the validation set and excelled at classifying linear objects, thus solving the road connection problem. Furthermore, it can be applied to a large study area. Therefore, this study applies D-LinkNet, with aerial photographs serving as the learning material. The main features of D-LinkNet are the addition of a dilated convolution layer (center part) to LinkNet (2017) and the use of an ImageNet-pretrained CNN as the backbone of the encoder part. Each encoder block uses two residual blocks to prevent gradient vanishing and to draw significant results (Fig. 1).
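The effect of the dilated center part can be illustrated with a small receptive-field calculation. The sketch below is illustrative, not taken from the D-LinkNet source (the cascade of 3×3 kernels with dilation rates 1, 2, 4, 8 is an assumption in the spirit of the paper's center part); it shows how dilated convolutions widen the receptive field far faster than plain convolutions:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 convolution layers."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d  # each layer widens the field by (k-1) * dilation
    return rf

# Four 3x3 layers with dilation rates 1, 2, 4, 8 (illustrative cascade)
print(receptive_field([3, 3, 3, 3], [1, 2, 4, 8]))  # 31
# The same four layers without dilation cover only 9 pixels
print(receptive_field([3, 3, 3, 3], [1, 1, 1, 1]))  # 9
```

The wider field lets the center part aggregate context along long linear features such as roads without adding pooling steps.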
This study used D-LinkNet to improve classification accuracy and learning speed after training and verification. A deep learning process for automated land cover classification begins with the preparation of learning datasets and the selection of suitable network models. Thereafter, a deep learning network model with the applicable configurations was applied in order to conduct a network learning and validation process. As shown in Fig. 2, this study follows four steps: pre-processing of learning datasets, configuration of the model, learning, and implementation of the model with improved model configurations.
First, the learning datasets were divided into training, validation, and testing. Each dataset was used to train the model and validate the data based on the accuracy of model training. The test data were used to evaluate the final performance of the training model. Geometric errors and attribute errors were checked through the pre-processing of the learning data.
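As a minimal sketch, the dataset split described above could be implemented as follows (the function name, tile IDs, and fixed seed are illustrative assumptions, not the authors' actual pipeline):

```python
import random

def split_dataset(tile_ids, train_ratio=0.7, seed=42):
    """Shuffle tile IDs reproducibly and split them into training and validation sets."""
    rng = random.Random(seed)       # fixed seed keeps the split reproducible
    ids = list(tile_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

train, val = split_dataset(range(100))
print(len(train), len(val))  # 70 30
```

A held-out test set would be carved out the same way before this call, so that final evaluation never touches training tiles.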
Next, the network model was trained through a deep learning procedure using the prepared learning data. D-LinkNet conducts the network learning and validation process by applying 70% of the learning datasets to training and the remaining 30% to validation. The confusion matrix was used to check the accuracy, precision, recall (sensitivity), fall-out, and F1-score for each attribute item.
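The per-class metrics listed above follow directly from the confusion matrix. A minimal sketch (the 2-class matrix is a toy example, not the study's data):

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, recall, fall-out, and F1 from a confusion matrix.

    cm[i, j] = number of pixels of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                  # correctly predicted pixels per class
    fp = cm.sum(axis=0) - tp          # predicted as the class but actually other
    fn = cm.sum(axis=1) - tp          # actually the class but predicted other
    tn = cm.sum() - tp - fp - fn
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)          # sensitivity
    fallout = fp / np.maximum(fp + tn, 1)         # false positive rate
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return precision, recall, fallout, f1

p, r, fo, f1 = per_class_metrics([[50, 10], [5, 35]])
```

For class 0 in the toy matrix, precision is 50/55 and recall is 50/60; in the study these quantities are computed for each of the 41 attribute items.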
Class (code No.) | Class (code No.) |
1. Independent residence (111) | 22. Other arable land (252) |
2. Group residence (112) | 23. Broadleaf tree forest (311) |
3. Industrial (121) | 24. Needle leaf tree forest (320) |
4. Commercial & business (131) | 25. Mixed stand forest (331) |
5. Mixed (132) | 26. Pasture natural (411) |
6. Entertainment (141) | 27. Golf course (421) |
7. Airport (151) | 28. Cemetery (422) |
8. Seaport (152) | 29. Other pasture (423) |
9. Railroad (153) | 30. Inland swamp (511) |
10. Road (154) | 31. Mudflat (521) |
11. Other transportation & Communication (155) | 32. Salt pond (522) |
12. Basic environment (161) | 33. Seacoast (611) |
13. Education & Adm. (162) | 34. River basin (612) |
14. Other public facilities (163) | 35. Cliff, rock (613) |
15. Rice paddy readjusted (211) | 36. Mining area (621) |
16. Rice paddy not yet readjusted (212) | 37. Sports ground (622) |
17. Field adjusted (221) | 38. Other barren land (623) |
18. Field not yet readjusted (222) | 39. River (711) |
19. Greenhouse (231) | 40. Lake and pond (712) |
20. Orchard (241) | 41. Seawater (721) |
21. Ranch (251) | - |
To validate the results of the D-LinkNet automated classification, the current level 3 land cover map and the learning model's classification results were compared visually (Fig. 8). Accuracy was evaluated with the mean intersection over union (mIoU), which measures the degree of overlap between the predicted region and the actual (ground truth) region of each class: for each class, the IoU is the ratio of the intersection area to the union area (in pixels) of the predicted and actual regions, and the mIoU averages this ratio over all classes. The closer the index is to 1, the higher the classification accuracy of the algorithm.
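A minimal per-pixel implementation of the mIoU definition above can be sketched as follows (the 2×2 arrays are a toy example, not the study's maps):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes present in either map."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))   # overlapping pixels
        union = np.sum((pred == c) | (target == c))   # union of both regions
        if union > 0:            # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 example with two classes
print(mean_iou([[0, 0], [1, 1]], [[0, 1], [1, 1]], num_classes=2))  # ~0.583
```

In the toy case, class 0 has IoU 1/2 and class 1 has IoU 2/3, giving a mean of about 0.583.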
D-LinkNet hyper-parameter | Existing condition | Improved condition |
Backbone | ResNet-34 | ResNet-101 |
Loss function | Categorical cross-entropy loss | Categorical cross-entropy loss |
Optimizer | Adam | Adam |
Learning rate | 2.00E-04 | 1.00E-04 |
Epoch | 30 | 100 |
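The two training conditions in the table can be captured as configuration dictionaries; the sketch below (key names are illustrative, not the authors' code) diffs the baseline run against the improved one:

```python
# Hypothetical configurations mirroring the hyper-parameter table
baseline = {
    "backbone": "ResNet-34",
    "loss": "categorical_cross_entropy",
    "optimizer": "Adam",
    "learning_rate": 2e-4,
    "epochs": 30,
}

improved = dict(baseline,
                backbone="ResNet-101",
                learning_rate=1e-4,
                epochs=100)

# Report which settings changed between the two runs
changed = {k: (baseline[k], improved[k])
           for k in baseline if baseline[k] != improved[k]}
print(changed)
```

Only the backbone, learning rate, and epoch count differ; the loss function and optimizer are held fixed across both conditions.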
Current level 3 classes (code No.) | Revised level 3 classes |
Commercial & business (131) | Commercial area |
Mixed (132) | |
Airport (151) | Transportation area |
Seaport (152) | |
Railroad (153) | |
Vehicle road (154) | |
Other transportation & communication (155) | |
Basic environment (161) | Public facilities |
Education & Administration (162) | |
Other public facilities (163) | |
Broadleaf tree forest (311) | Forest area |
Needle leaf tree forest (320) | |
Mixed stand forest (331) | |
Golf course (421) | Artificial pasture |
Cemetery (422) | |
Other pasture (423) | |
Seacoast (611) | Natural barren land |
River basin (612) | |
Cliff, rock (613) |
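The reclassification above amounts to a many-to-one mapping of level 3 codes onto merged classes. A hedged sketch (only the merged groups shown in the table are included; all other codes pass through unchanged):

```python
# Merged classes from the revision table (subset of the full 41-class system)
MERGE_MAP = {
    131: "Commercial area", 132: "Commercial area",
    151: "Transportation area", 152: "Transportation area",
    153: "Transportation area", 154: "Transportation area",
    155: "Transportation area",
    161: "Public facilities", 162: "Public facilities",
    163: "Public facilities",
    311: "Forest area", 320: "Forest area", 331: "Forest area",
    421: "Artificial pasture", 422: "Artificial pasture",
    423: "Artificial pasture",
    611: "Natural barren land", 612: "Natural barren land",
    613: "Natural barren land",
}

def remap(code):
    """Map a level 3 code to its merged class; unmerged codes keep their own class."""
    return MERGE_MAP.get(code, code)

print(remap(152), remap(711))  # Transportation area 711
```

Applying this mapping to every label pixel reduces the 41 original classes to the 28 revised classes used for the improved training run.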
Data | Aerial imagery (all experiments)
Class | 41 (level 3) | 41 (level 3) | 22 (level 2) | 7 (level 1) | 28 (level 3 revised)
Model | ResNet-34, D-LinkNet | ResNet-101, D-LinkNet | ResNet-101, D-LinkNet | ResNet-101, D-LinkNet | ResNet-101, D-LinkNet
Total epoch | 30 | 100 | 100 | 10 | 100
Training accuracy (mIoU, %) | 32.25 | 40.31 | 51.34 | 69.38 | 51.24
Test accuracy (mIoU, %) | 22.81 | 26.71 | 26.45 | 56.55 | 30.1
Study | Data | Classes | Model | Accuracy (%)
Benbahria et al. [31] | Landsat8 (NIR, R, G) | 7 | ResNet-50, U-Net | 51 (mIoU)
Seferbekov et al. [32] | DeepGlobe | 7 | ResNet-50, FPN | 49 (mIoU)
Lee and Kim [26] | Aerial imagery | 5 | SegNet | 85 (PA)
Rakhlin et al. [33] | DeepGlobe | 7 | U-Net | 64.8 (mIoU)
This study | Aerial imagery | 7 | ResNet-101, D-LinkNet | 56.55 (mIoU)
This study | Aerial imagery | 28 | ResNet-101, D-LinkNet | 30.1 (mIoU)
For the purposes of this study, a deep learning technology named D-LinkNet was applied to train, validate, and extract 41 land cover classes from multi-layered land cover datasets. The results showed that optimizing a model by refining the learning datasets and hyper-parameters improves the results of automatic classification. Most importantly, applying automatic extraction to all 41 level 3 land cover classes is significant, even if the results show a relatively low level of accuracy; existing studies have classified only approximately five to seven broad-level classes using deep learning-based automatic extraction. This study found that quality control of accuracy and fast dataset preparation were crucial to improving automatic land cover classification. In particular, the hyperparameters had to be modified to improve the accuracy and efficiency of the D-LinkNet learning network. Further work is needed to revise and integrate similar sub-classes of the current 41-item level 3 classification in order to enable the rapid construction and periodic renewal of land cover maps. Finally, classifying too many classes poses a significant challenge: AI learning alone cannot ensure the accurate classification of many land cover properties. Therefore, the application of various models and training data should be explored to increase the level of accuracy. Moreover, it is necessary to establish a methodology for assigning land use attributes that cannot be confirmed from images after AI learning, or for processing attributes affected by seasonal influences. Instead of AI learning with a single image, multi-resolution segmentation using multiple images with different resolutions or seasons will be considered for further advancement.
Conceptualization, BS, SJ. Funding acquisition, BS, SJ. Investigation and methodology, BS, SJ, KP, MJ. Project administration, JB. Supervision, MJ. Writing of the original draft, BS, JB, MJ. Writing of the review and editing, MJ. Software, KP, IL. Validation, IL. Visualization, KP, IL.
None.
The authors declare that they have no competing interests.
Name : Bongsang Seo
Affiliation : Managing Director, Project Development Division, ALLforLAND, Korea
Biography : Bongsang Seo is a Ph.D. student at the University of Seoul. Since 2018, he has served as the Managing Director of the All4land Public Works Headquarters in Korea. He is currently researching digital twin data construction technology that virtualizes real space based on LiDAR in the fields of environment and forestry, as well as automatic object recognition technology based on artificial intelligence.
Name : Sejin Jang
Affiliation : Managing Director, ALLforLAND, Korea
Biography : Sejin Jang received his Ph.D. degree in Computer Science and Engineering from the Graduate School of Kyung Hee University, Korea in 2006, and acquired the Professional Engineer qualification in Surveying Geo-Spatial Information in the same year. He has worked as a managing director in the Project Development Division, ALLforLAND, Korea since February 2019. Currently, he is researching digital twin data construction technology that virtualizes real space and deep learning-based automatic object recognition technology.
Name : Jinsik Bong
Affiliation : General Manager, ALLforLAND, Korea
Biography : Jinsik Bong graduated from the Department of Forestry at Kookmin University. Since 2015, he has served as the general manager of the All4land Public Works Division in Korea. He is currently researching digital twin data construction technology that virtualizes real space based on LiDAR in the field of environment and forestry and automatic object recognition technology based on artificial intelligence.
Name : Kangmin Park
Affiliation : Geoinformatics, University of Seoul, Korea
Biography : Kangmin Park is in the master’s course of Dept. of Geoinformatics at the University of Seoul. He is mainly interested in photogrammetric computer vision based on deep learning, especially in semantic segmentation and uncertainty estimation, and its applications.
Name : Impyeong Lee
Affiliation : Geoinformatics, University of Seoul, Korea
Biography : Impyeong Lee received a Ph.D. degree in Geodetic Science and Surveying from The Ohio State University. He is now a Professor in the Dept. of Geoinformatics and the Director of the AI Center for Complex System at the University of Seoul. His current research interests include photogrammetric computer vision applications such as real-time drone mapping and analysis, change detection and semantic segmentation, and other geospatial deep learning topics.
Name : Moonsun Jeong
Affiliation : Dept. of Human and Environment Design, Cheongju University, Korea
Biography : Moonsun Jeong received a Ph.D. degree in Environmental Design and Planning from Virginia Polytechnic Institute and State University, USA in 2010. She is an assistant professor of the Dept. of Human and Environment Design at Cheongju University. Her research interests include green infrastructure, low impact development, participatory GIS mapping, and natural resource protection and management.
Bongsang Seo, Sejin Jang, Jinsik Bong, Kangmin Park, Impyeong Lee, and Moonsun Jeong, Application of Deep Learning to the Production of Sub-divided Land Cover Maps, Human-centric Computing and Information Sciences, Article number: 12:58 (2022)