Efficient Classification of Hyperspectral Data Using Deep Neural Network Model
  • Omaimah Bamasaq1, Daniyal Alghazzawi2,*, Suhair Alshehri2, Arwa Jamjoom1, and Muhammad Zubair Asghar3

Human-centric Computing and Information Sciences volume 12, Article number: 35 (2022)
https://doi.org/10.22967/HCIS.2022.12.035

Abstract

In recent years, there has been tremendous progress in the classification of hyperspectral images (HSI) using deep neural networks. Due to the high complexity of spectral properties, limited ground-truth samples, and extreme inhomogeneity of hyperspectral data, efficient classification of HSI using deep convolutional neural networks remains difficult. The ever-increasing volume of data therefore necessitates effective categorization of remote sensor-based HSI using advanced deep learning. Over time, several deep learning models for HSI classification have been developed, and many researchers have used convolutional neural networks (CNNs). Previous research on hyperspectral data categorization using deep learning models has faced performance degradation due to insufficient layer selection in the CNN, which is the issue studied in this work. As a solution, we propose a CNN model for effectively classifying hyperspectral data using deep learning. We tested our method on land cover benchmark datasets and found that it outperforms the current state of the art. Compared to conventional machine learning methods and baseline procedures, the results of this study reveal that the upgraded CNN model significantly improves performance (accuracy = 93.621, precision = 91.571, recall = 92, F1-score = 90.714).


Keywords

Hyperspectral Images, Deep Learning, Remote Sensor, Convolutional Neural Networks


Introduction

Hyperspectral imaging (HSI) is a remote sensing technique that gathers electromagnetic radiation across many narrow wavelength bands, typically spanning the visible to near-infrared range. Hyperspectral imaging sensors observing the same location on the earth's surface thus provide many narrow spectral channels. Each pixel in a hyperspectral picture may be considered a high-dimensional vector whose entries correspond to the spectral reflectance at particular wavelengths [1].
The most recent remote sensing technology, neural network-based HSI, can capture both geometrical two-dimensional (2D) spatial data and one-dimensional (1D) continuous spectral data from a specific object in tandem. Spectral data reflects the basic form and chemical characteristics of the underlying object, so HSI gathers detailed information on the studied objects [2]. Environmental data analysis [3, 4], agricultural remote sensing [5, 6], viscosity prediction [7, 8], marine remote sensing [9, 10], and other remotely sensed applications all make use of hyperspectral remote sensing.
A significant part of using hyperspectral remote sensing techniques is determining how to build a more efficient and accurate classification strategy. Traditional approaches that use manual feature extraction to reduce the dimension of the original image and project it into a low-level feature space include the support vector machine (SVM) [8], the 3D wavelet transform [9], and Gaussian mixtures [10]. The efficacy of these algorithms, however, depends on the performance and quality of the manually derived features. Furthermore, these methods are restricted to extracting shallow characteristics. As a result of these constraints, traditional pattern recognition-based techniques suffer from low prediction accuracy and model uniformity. These methods typically change the way the original image bands are related, resulting in the loss of spectral information or the inability to extract abstract HSI features. Unsurprisingly, these problems affect the accuracy of the classified images.

Research Motivation
Deep learning has progressed quickly in recent years, drawing a significant amount of research activity, notably in computer vision, computational linguistics, object tracking, and other data processing fields, and has achieved remarkable results [11]. Convolutional neural networks (CNNs) have become increasingly popular in image classification [2, 4], speech recognition [3, 11], target tracking [4, 5], and image thresholding [6, 7] as a result of the application and progress of deep learning architectures. In each of these domains, CNNs exhibit an excellent capability for extracting information from images. Accordingly, a growing number of studies have replaced traditional classification algorithms with CNNs in order to quickly capture the spatial and spectral features of HSI and achieve improved classification accuracy. The majority of previous research on HSI classification [8, 12–14] has used supervised machine learning and deep learning algorithms. For HSI classification from hyperspectral data, Zhou et al. [11] used a deep learning model; however, its performance suffered due to inadequate parameterization and layer selection, including an insufficient number of hidden layers. In contrast, the CNN model proposed in this research for classifying hyperspectral data includes a modified layered structure and parameter setup. We also ran trials with several variants of the CNN model and evaluated their efficacy against the baseline study.

Problem Statement
The task can be described in the following way: with HSI as an input, the classification model must classify the remote sensor image in a land cover environment. Given a land cover hyperspectral image HSI = {hsi_1, hsi_2, hsi_3, ..., hsi_n} as an input, the purpose is to construct a model that recognizes the imagery and assigns a label to it. In this study, we present a system for HSI classification based on an enhanced CNN and remote sensor data. In contrast to current state-of-the-art machine learning classifiers using conventional feature representation approaches, we employ a CNN with an adequate number of hidden layers and an ideal set of parameters to construct the model successfully with a smaller proportion of learnable parameters.

Research Questions
To produce rapid and accurate classification results, a CNN model with an adequate number of hidden layers and an optimal combination of parameters is utilized in this study. The purpose of this research is to classify HSI of land cover. The research questions posed in this paper are summarized in Table 1.

Table 1. Research questions

Research questions Motivation
RQ1. How can the CNN model with an appropriate number of hidden layers and an optimum set of parameters be used to effectively categorize HSI? Examine the CNN model with an appropriate number of hidden layers and an optimum set of parameters to understand how it might be used to categorize HSI.
RQ2. In terms of several performance evaluation measures, how efficient is the proposed approach in comparison to other DL and ML models? Assess the utility of the proposed deep learning model, which classifies HSI, using numerous performance measures such as accuracy, recall, F1 measure, and precision.
RQ3. What is the proposed method's performance in relation to benchmark methods? Using numerous assessment criteria such as precision, recall, F1-score, and accuracy, we compare the usefulness of the proposed deep learning model in categorizing HSI to baseline works.

Research Contributions
The following is an overview of the study's major contributions:

We describe a robust hyperspectral image classification system based on a CNN deep learning model with an ideal number of hidden layers and set of parameters. The suggested multi-layer network can be trained in half the time with roughly half the learnable parameters of comparable models.

We process remote sensor-based input patterns (HSI) with an enhanced CNN model, making efficient use of internal memory arrangements and computational resources.

To address overfitting, we use a dropout technique that involves switching off particular neurons at random during the training phase.

The proposed method has been tested on publicly available datasets, and the results show that it surpasses the state of the art in classification performance.

We evaluate the proposed approach by contrasting it with a baseline work on the same dataset, using publicly available datasets of remote sensor data.


The remainder of this article is structured as follows. Section 2 reviews comparable research, Section 3 describes the proposed methodology, and Section 4 presents the results and analysis. Section 5 then outlines the work's conclusions as well as future perspectives.


Related Work

For the first time, Hu et al. [15] used a convolutional neural network to classify HSI, building a 1D-CNN with one convolutional layer and two fully connected layers. The model, however, is based solely on spectral data; without taking into account the spatial information provided by HSI, its classification performance was slightly lower than that of conventional techniques. Following that, Makantasis et al. [16] used randomized principal component analysis (R-PCA) to reduce the number of spectral bands in the source images, followed by a convolutional layer model to encapsulate the spectral features contained at pixel resolution, and finally a multilayer perceptron (MLP) to complete the categorization. They obtained the highest degree of categorization accuracy. It should be noted, however, that using PCA to reduce dimensionality disrupts the spectral consistency and results in the loss of certain spectral data. In another work, Zhang et al. [17] used data augmentation to expand the training samples of the original HSI data and suggested an inter-CNN, using a single convolution to retrieve each pixel's spectral features and a double convolution to retrieve the target pixel's neighborhood spatial features; the two convolution branches were merged to achieve classification. Bing et al. [21] constructed a multilayer deep network using 3D-CNN and residual connections. To make the network denser, Lee et al. [12] employed residual connectivity and 1×1 convolution kernels to acquire hierarchical attributes. Roy et al. [18] recently developed a compact model (HybridSN) that merged 3D- and 2D-CNNs to retrieve spectral characteristics and compared it to more advanced systems in the field. Cao and Guo [19] presented a novel end-to-end hybrid dilated residual deep CNN model based on multi-dilated convolutions (HDC) and residual blocks built on SSRN. Wu et al. [20] developed the 3D ResNeXt structure by combining feature fusion and label smoothing methods.
The research reviewed above (also refer to Table 2) [12, 15, 17–20] clearly demonstrates that utilizing a deep learning model for HSI classification without proper layer adjustment and efficient parameter selection will not provide a good discriminating feature space from the spectral plane, and spectral information will be damaged as a result of this limitation. When working with classes that have comparable patterns across a wide range of spectral channels, using a multi-dimensional CNN for feature extraction can improve accuracy. To avoid any efficiency loss resulting from inefficient parameter and layer selection in the CNN, it is necessary to develop an effective deep CNN, which is the aim of this research.

Table 2. A review of summarized works

Study Technique(s) Results Limitations
Hu et al. [15] CNN 2% gain over baseline methods Does not consider HSI's spatial data
Makantasis et al. [16] Random principal component analysis (R-PCA) Improved results over SVM and other classifiers Disruption of spectral consistency by PCA
Zhang et al. [17] Inter-CNN using a single convolution Classification accuracy improved Unsupervised and semi-supervised settings not addressed
Bing et al. [21] 3D-CNN and residual connections Improved results (88%) over SVM and other DNN models Dimensionality reduction needs to be incorporated
Lee et al. [12] Residual connectivity with 1×1 convolution kernels 91% (Acc.) Needs a stronger network with better training and cooperative exploitation of spatial information
Roy et al. [18] HybridSN merging 3D-CNN and 2D-CNN 3D-CNN approach is relatively inefficient compared to the suggested approach Dimension reduction needs to be integrated
Cao and Guo [19] Deep CNN model based on multi-dilated convolutions 94% (Acc.) Integrating HSI features with transfer learning or other ways of extending samples
Wu et al. [20] 3D ResNeXt structure 93% (Acc.) Combining deep learning methods could further improve classification results


Proposed Methodology


The methodology of the research (refer to Fig. 1) is described in this section. The primary goal of our technique is to develop an enhanced convolutional neural network model-driven deep learning solution for HSI in a land cover scenario. As an input, the system gets a stream of source data from remote sensors, and the result is the classification of the input HSI into different classes, i.e., Asphalt, Meadows, Gravel, Trees, Painted Metal Sheets, Bare Soil, Bitumen, Self-Blocking Bricks, and Shadows. The dataset used and the proposed method are explained in the following subsections.

Fig. 1. Proposed framework.


Data Collection
Researchers have employed numerous datasets for classification in the HSI environment. An evaluation of the suggested approach was carried out using the well-known Pavia University HSI dataset. The ROSIS sensor captured the Pavia University HSI during a flight over Pavia in northern Italy. The data is publicly available from the Grupo de Inteligencia Computacional [22]. The Pavia University image contains 610×340 pixels and comprises 103 spectral bands, with a geometric resolution of 1.3 m, and its ground truth is divided into nine categories. Before the evaluation, certain pixels in the pictures must be eliminated since they offer insufficient content. Fig. 2 depicts important information about the dataset.

Fig. 2. Detailed illustration of the Pavia University dataset.


Preprocessing
Hyperspectral pictures provide a wealth of spectral and spatial information that may be used to classify ground objects. Nevertheless, the very high dimensionality and highly redundant information can significantly increase computing overhead, which may in turn affect classification accuracy. As a result, performing dimension reduction on hyperspectral pictures before categorization is critical [23].
The data for HSI is generally available in MATLAB .mat format, which may be accessed using a programming language such as Python. Hyperspectral data has a unique spatial–spectral structure. Among the most important preprocessing activities are band selection and normalization of the HSI data. This phase aids in the handling of HSI data as well as the implementation of classifiers [24]. Some spectral bands may be hard to analyze or include outliers that cause the spectral kinetics to change depending on the sensor. Bands affected by moisture content, bands with a poor signal-to-noise ratio, and saturated readings, for instance, are frequently removed [25]. This not only enhances the consistency of the classifications by reducing data distortion, but also helps combat the well-known curse of dimensionality, which causes statistical classification techniques to perform worse as the dimensionality of the data grows. Band selection, employing principal component analysis (PCA) [13] or mutual information [26], may also be used to eliminate uninformative bands.
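To make the preprocessing concrete, the following is a minimal Python sketch of loading and normalizing the Pavia University scene; the file names and .mat dictionary keys are assumptions that may differ depending on where the data is downloaded from.

```python
# Minimal preprocessing sketch for the Pavia University scene.
# File names and dict keys ("paviaU", "paviaU_gt") are assumed.
import numpy as np
from scipy.io import loadmat

data = loadmat("PaviaU.mat")["paviaU"]          # (610, 340, 103) hyperspectral cube
labels = loadmat("PaviaU_gt.mat")["paviaU_gt"]  # (610, 340) ground-truth map

# Min-max normalize each spectral band to [0, 1] to stabilize training.
data = data.astype(np.float32)
mins = data.min(axis=(0, 1), keepdims=True)
maxs = data.max(axis=(0, 1), keepdims=True)
data = (data - mins) / (maxs - mins + 1e-8)

# Discard unlabeled pixels (class 0) when building the labeled sample set.
mask = labels > 0
pixels = data[mask]          # (n_labeled, 103) spectral vectors
targets = labels[mask] - 1   # class indices 0..8 for the nine categories
```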

3.2.1 Band selection using PCA
PCA [27] is a well-known dimensionality reduction technique. A two-dimensional approach is a viable option: spatial processing is leveraged by extracting spatial characteristics from two-dimensional patches or from the whole band. However, band selection should always be applied with caution. Unsupervised compression can occasionally produce inferior results compared to utilizing the original data, because it may remove content that is not useful for compression yet is helpful for classification [28]. We use the typical four phases to calculate PCA [28].
PCA compresses all of the information from an N-band raw image into a smaller number of new bands (or principal components), accomplishing reduced dimensionality by maximizing the retained variance and removing inter-band redundancy [27].
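As an illustration of these phases, here is a hedged NumPy sketch of PCA-based band reduction operating on the normalized pixel matrix from the preprocessing sketch above; the number of retained components (30) is illustrative, not a value from the paper.

```python
# Sketch of the four standard PCA phases for band reduction.
import numpy as np

def pca_bands(pixels: np.ndarray, n_components: int = 30) -> np.ndarray:
    # Phase 1: mean-center the spectral vectors (n_pixels x n_bands).
    centered = pixels - pixels.mean(axis=0)
    # Phase 2: compute the band-by-band covariance matrix.
    cov = np.cov(centered, rowvar=False)
    # Phase 3: eigendecompose and order components by explained variance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_components]]
    # Phase 4: project the data onto the leading principal components.
    return centered @ components

reduced = pca_bands(pixels)  # e.g., (n_labeled, 30) instead of (n_labeled, 103)
```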

3.3 A Detailed Overview of the Proposed Technique
For classifying HSI, we use the CNN model shown in Fig. 3.
Fig. 3. Detailed illustration of the proposed scheme.



3.3.1 Source image
An image is represented by arrays of pixels spanning its size and shape. A color photo is characterized by three channel matrices, whereas a grayscale image is represented by a single band matrix. Color intensity is denoted as a number between 0 and 255 for each pixel in the grid. An incoming image's structure is described by the "input shape" option in a convolution layer. Three dimensions describe the structure of a source image, specified as h×w×d, where h denotes the height, w the width, and d the depth, i.e., the number of channels (red, green, and blue for color images). This creates a 3D input picture grid [29]. The incoming picture is now ready for the convolutional layer to interpret.

3.3.2 First convolution layer
Extraction of features is the primary purpose of a CNN's convolutional layer, which is the first layer in the architecture [30]. In order to create the feature space, the convolutional layer utilizes trainable filters (smaller matrices) that are convolved with the input picture matrix. In a convolutional operation, a filtering grid K is repeatedly applied to an image matrix I, resulting in the formation of a feature space F.
The formula for calculating the feature space F is expressed as:

F(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)    (1)


Fig. 4 provides an explanation of the variables used in Equation (1).
Fig. 4. Explanation of variables.


If there is a negative pixel value in the feature space, ReLU substitutes the value with 0, performing an element-wise operation. The CNN model's nonlinearity may thus be expressed through ReLU's primary function f(x) = max(x, 0).
Formula (2) is used to compute the rectified feature map F′:

F′(i, j) = max(F(i, j), 0)    (2)
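The following toy NumPy sketch illustrates Equations (1) and (2) on a small single-channel example; the sizes are arbitrary and the loops are written for clarity rather than speed.

```python
# Toy illustration of Equations (1) and (2): valid convolution plus ReLU.
import numpy as np

def conv2d_valid(I: np.ndarray, K: np.ndarray) -> np.ndarray:
    kh, kw = K.shape
    out_h, out_w = I.shape[0] - kh + 1, I.shape[1] - kw + 1
    F = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Equation (1): sum of element-wise products under the filter.
            F[i, j] = np.sum(I[i:i + kh, j:j + kw] * K)
    return F

def relu(F: np.ndarray) -> np.ndarray:
    # Equation (2): negative entries are replaced with 0, element-wise.
    return np.maximum(F, 0)

I = np.random.randn(7, 8)   # toy 7x8 single-channel image
K = np.random.randn(3, 3)   # 3x3 trainable filter
feature_map = relu(conv2d_valid(I, K))  # 5x6 rectified feature map
```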



3.3.3 First maxpooling layer
A subsampling layer or pooling layer follows the convolution layer and produces a downstream sampling representation of input vectors, while reducing the feature space map without compromising vital information. To reduce computational cost and overfitting, it is employed [31]. Since the feature map generated from the previous layer was too large, we used a max pooling technique in our suggested model.
The formula for calculating maxpool is shown as follows:

P(i, j) = max_{(m, n) ∈ R(i, j)} F′(m, n)    (3)


n_out = ⌊(n_in − f) / s⌋ + 1    (4)

where R(i, j) denotes the f×f pooling window at output position (i, j), s is the stride, and n_in and n_out are the input and output spatial dimensions.
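A matching toy sketch of Equations (3) and (4) follows, using the 2×2 window and stride of 2 employed later in the applied example.

```python
# Toy illustration of Equations (3) and (4): max pooling with window f, stride s.
import numpy as np

def maxpool2d(F: np.ndarray, f: int = 2, s: int = 2) -> np.ndarray:
    out_h = (F.shape[0] - f) // s + 1   # Equation (4): output height
    out_w = (F.shape[1] - f) // s + 1   # Equation (4): output width
    P = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = F[i * s:i * s + f, j * s:j * s + f]
            P[i, j] = window.max()      # Equation (3): keep the maximum
    return P
```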



3.3.4 Second convolutional layer
The maxpooling feature map is used as the input for the subsequent convolution layer, which retrieves the high-level features through an analogous computation (Equations (1) and (2)).
3.3.5 Second max pooling layer
In order to lower the volume of the vector, a second pooling layer is implemented. It is computed in the same way as the first maxpooling layer (Equations (3) and (4)).
3.3.6 Flattening layer
This layer operates on the outcome of the previous maxpooling layer (a feature map). In this layer, the features are flattened into a single column or array: a reshape function [32] is applied within the layer to convert the feature maps. It is computed as follows:

X = reshape(P) = [p_1, p_2, …, p_{m·n}]    (5)



3.3.7 Classification
Class probabilities are determined by using a dense layer in conjunction with an activation function (i.e., sigmoid). Equation (6) calculates the net input as follows:

N = x_1·w_1 + x_2·w_2 + … + x_n·w_n + b    (6)



3.3.8 Making use of the activation function
At the classification layer, the ReLU and sigmoid activation functions are utilized. The sigmoid is calculated using Equation (7), while the ReLU is calculated as f(n) = max(n, 0).

S(N) = 1 / (1 + e^(−N))    (7)


Applied Example
In this section, we explain how the suggested approach works for HSI in terms of its implementation. To do this, we begin with an HSI image and process it through the suggested model. Fig. 5 shows how the proposed model classifies land cover HSI.
Fig. 5. Applied example.


3.4.1 Source image interpretation
After capturing the image, we use the proposed model to identify the land cover HSI. To begin, a pixel-by-pixel breakdown of the source image is performed. In the case of a multicolor graphic, a three-channel grid of red, green, and blue slices is constructed, which is 7×8×3 in our example. As shown in Fig. 6, all of the pixels in the grid are expressed in matrix form.
Fig. 6. Three channels for the input picture matrix.


3.4.2 Employing the first convolutional layer to extract features
After interpreting a picture, a convolutional layer extracts features from the picture map. Within this tier, the filtering grid and the source image grid are convolved: a feature map is generated by convolving the filtering grid with the source image grid (Equations (1) and (2)). The activities involved in filtering include:
Activity 1: Synchronization of filtering and picture matrices.
Activity 2: A filtering grid element is multiplied by each picture’s pixels.
Activity 3: Accumulate the values from Activity 2 and compute the total to get the element μ_12 of the feature space. The filtering grid is then shifted right, and the procedure is repeated for the other entries in the feature map.

3.4.3 Dimension reduction with the first pooling layer
Convolved feature maps are maxpooled right after the preceding convolution layer of the CNN model has been applied. The pooling layer is used to reduce the picture vector dimension. Several activities are carried out within the pooling layer, including the following:
Activity 1: determining the scale factor (in our case, 2),
Activity 2: establishing a step (also in our case, 2),
Activity 3: moving the frame across the feature space, and finally,
Activity 4: selecting the highest value.

3.4.4 Feature extraction with the second convolutional layer
The second convolutional layer operates much like the first. It plays an important role in the extraction of high-level features from the feature space, utilizing all of the filtering activities discussed in Section 3.4.2. Since the CNN model performs feature extraction layer by layer, deeper layers are capable of extracting higher-level information [30].

3.4.5 Applying 2nd pooling layer
The workings of the second pooling layer are identical to those of the first. Using the approach described in Section 3.4.3, the second pooling layer reduces the dimensional scale to make the features easier to recognize.

3.4.6 Using the flatten layer
An extended feature set is obtained from the outcome of the subsequent pooling layer (the maxpooling map).

3.4.7 Utilization of activation functions for classification

The classification of the image is finally conducted. The input (feature map) received from the flatten layer is exploited for classification of the land cover HSI. Here is a description of the classification task. Using the ReLU and sigmoid activation functions, we calculate the probabilities of the nine land cover classes.
The net input for classifying the final picture representation may be calculated using Equation (6): N = x_1·w_1 + x_2·w_2 + x_3·w_3 + … + x_n·w_n + b. Assuming x_1 through x_4 are equal to 4 and b is equal to 0.10, the following calculation is used:

Utilization of the ReLU activation function: The ReLU function is applied using the equation O = max(0, N), where O is the outcome and N is the net input.

Using sigmoid activation: The sigmoid function is computed as S = F(N), with S being the output, N = (n_1, n_2, …, n_9) denoting the vector of calculated net inputs, and F the sigmoid activation function; S is calculated using Equation (7). Since the "Meadows" land cover class has the highest probability (S = 0.99942) based on the calculations described above, the source picture is predicted to be "Meadows" (refer to Fig. 7).
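The calculation above can be reproduced with a few lines of Python; the weight values here are illustrative assumptions, since the trained weights are not listed in the paper.

```python
# Worked numeric sketch of Equations (6) and (7) for one output neuron.
import numpy as np

x = np.array([4.0, 4.0, 4.0, 4.0])  # flattened features, x1..x4 = 4 (from the text)
w = np.array([0.9, 0.8, 0.7, 0.6])  # assumed illustrative weights
b = 0.10                            # bias from the example

N = np.dot(x, w) + b                # Equation (6): net input
O = max(0.0, N)                     # ReLU: O = max(0, N)
S = 1.0 / (1.0 + np.exp(-N))        # Equation (7): sigmoid probability
print(S)  # a value near 1, cf. S = 0.99942 for "Meadows" in the text
```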
Fig. 7. Classification of land cover data from HSI using the CNN model.



Results and Discussion

This section summarizes and evaluates the information gathered by establishing experiment setups and conducting multiple trials to address the research questions.
Answering Research Question No. 1


To answer RQ1 ("How can the CNN model with an appropriate number of hidden layers and an optimum set of parameters be used to effectively categorize HSI?"), we first examine the hardware and software used to create the recommended system. The training parameters and performance of the CNN structure are then fully explored.

4.1.1 Hardware and software configuration
For the experiments, we utilized an Intel Core i7 CPU, a 1080 GPU, and Windows 10. The programming language used was Python 3.5. The PyTorch toolkit in Python 3.5 is used to implement deep learning, whereas MATLAB 2019b is utilized to process and analyze image embeddings. To use the pre-trained model, we reduced the STIF frame dimension produced from the sample to 224×224×3. We evaluate the proposed approach on the dataset described in Section 3.1, which is split into training and testing sections. During the first stage, we set the learning rate to 0.001, which decreases by a ratio of 0.9 every 10th epoch. The Adam optimizer is used for optimization with a momentum value of 0.999. The training process repeats until 100 epochs have elapsed.
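A minimal PyTorch sketch of this training schedule is given below, assuming a `model` and a `train_loader` defined elsewhere; only the optimizer, decay schedule, and epoch count come from the description above.

```python
# Sketch of the stated schedule: Adam (lr=0.001, beta2=0.999),
# decayed by a factor of 0.9 every 10 epochs, for 100 epochs.
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(100):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decays the learning rate every 10th epoch
```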

4.1.2 Parameter setting
We carried out an investigation using numerous parameter settings for effectively categorizing HSI: the number of convolutional layers ranges from one to six, the number of filters from 8 to 64, the number of epochs from 2 to 10, and the filter sizes include 3×3 and 2×2. Several parameters were fixed, such as the number of maxpooling layers, picture size, activation function, and batch size. The number of convolutional layers in each CNN model is varied along with the number of epochs and the size of the filter used. Table 3 shows the configuration setup for each CNN model, and Table 4 shows the test accuracy, training duration, and training loss for all CNN models tested with varied parameter values. We found that the suggested CNN model, CNN-HSI(10), which has two convolutional layers, 10 epochs, a 3×3 filter size, and 32 filters, performed best and achieved the highest accuracy of 94.77%; a sketch of this configuration follows. Increasing the epochs and limiting the number of convolutional layers lowers the suggested CNN model's training loss rate.
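The sketch below shows one plausible PyTorch realization of the CNN-HSI(10) configuration (two 3×3 convolutional layers with 32 filters each, two maxpooling layers, a flatten layer, and a dense classifier); the input band count, patch size, and dropout rate are assumptions for illustration, not values confirmed by the paper.

```python
# Hedged sketch of the CNN-HSI(10) configuration from Table 3.
import torch.nn as nn

class CNNHSI10(nn.Module):
    def __init__(self, in_bands: int = 30, n_classes: int = 9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, kernel_size=3), nn.ReLU(),  # first conv layer
            nn.MaxPool2d(2),                                    # first pooling layer
            nn.Conv2d(32, 32, kernel_size=3), nn.ReLU(),        # second conv layer
            nn.MaxPool2d(2),                                    # second pooling layer
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),           # dropout against overfitting (rate assumed)
            nn.LazyLinear(n_classes),  # dense output layer, sized on first call
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```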

Table 3. Different CNN models with parameter settings for HSI classification

Model for HSI Number of convolutional layers Number of filters Size of filter Number of epochs
CNN-HSI(1) 6 16 2 6
CNN-HSI(2) 5 64 3 2
CNN-HSI(3) 2 12 3 3
CNN-HSI(4) 5 8 3 4
CNN-HSI(5) 5 16 3 3
CNN-HSI(6) 5 8 2 3
CNN-HSI(7) 2 64 3 2
CNN-HSI(8) 3 12 3 8
CNN-HSI(9) 4 64 2 4
CNN-HSI(10) 2 32 3 10


Table 4. Training time, training loss, and test accuracy of CNN HSI models for HSI classification
Model for HSI Training loss Training time (s) Test accuracy (%)
CNN-HSI(1) 0.08 40 74.92
CNN-HSI(2) 0.28 291 79.27
CNN-HSI(3) 0.28 24 80.17
CNN-HSI(4) 0.16 141 83.61
CNN-HSI(5) 0.16 54 87.96
CNN-HSI(6) 0.28 149 88.6
CNN-HSI(7) 0.28 44 92.4
CNN-HSI(8) 0.03 53 92.67
CNN-HSI(9) 0.05 156 92.69
CNN-HSI(10) 0.01 41 94.77

4.1.3 Implementation of HSI classification in land cover
For deployment of the HSI classification system, Table 5 shows the distribution of processing time over the network layers for a single inference. At roughly three inferences per second, the network spends an adequate amount of time on feature extraction and source image processing.

Table 5. Layered process time distribution and classification assessment tasks
Layer Execution time (%) Millions of operations (%)
Conv1 41.01 21.1
max_pool1 2.9 0.08
Conv2 3.5 8.1
max_pool2 2.8 0.02
flatten 3.3 7.7
ReLU and sigmoid 0.2 0.1


Responding to Research Question No. 2
To assess the proposed CNN model for categorizing HSI, we tested the performance of traditional models on the obtained datasets to answer RQ2: "In terms of several performance evaluation measures, how efficient is the proposed approach in comparison to the other DL and ML models?"

4.2.1 Experimental results
Because the model has been trained on a limited number of labeled samples, lowering the number of system parameters makes training easier; therefore, the dimension of the convolution kernel in the neural network is set at 1×1. Simultaneously, the number of filters per layer is adjusted to 64, yielding a final output of 5×5×64. To guarantee the experiment's consistency, the training images for all investigations were selected at random from the HSI collection. The enhanced CNN model is designed to train small quantities of sample data more thoroughly, which not only lowers the computing cost but also enhances the model's training efficiency. Table 6 shows the confusion matrix of the CNN-based HSI model in a laboratory setup.

Table 6. A CNN-based HSI model's confusion matrix
Asphalt Meadows Gravel Trees Painted metal sheets Bare soil Bitumen Self-blocking bricks Shadows
Asphalt 1921 1 5 1 0 2 81 31 1
Meadows 5 4388 10 21 1 171 1 7 0
Gravel 31 12 392 1 0 1 0 121 0
Trees 1 73 6 831 1 5 0 1 1
Painted metal sheets 0 9 6 0 511 1 1 0 1
Bare soil 11 43 6 5 1 1261 0 2 0
Bitumen 31 8 6 1 0 1 290 6 1
Self-blocking bricks 43 2 159 0 1 4 0 1050 0
Shadows 2 5 6 1 1 0 1 0 389


4.2.2 Contrast of suggested CNN with a traditional SVM
Table 7 displays the computation cost as the total number of multiplications over one frame of data in the convolution-based layers for the forward pass, together with the total number of learnable model parameters. Table 7 compares the efficacy of the recommended technique, namely the convolutional neural network, with that of a traditional SVM. It indicates that, in comparison to a standard SVM, using the convolution-based CNN improves accuracy by roughly 1.2% and dramatically lowers computation time and the number of trainable parameters. The results of the experiment and the complexity analysis demonstrate that the proposed model may be used in spectral image-based applications with reasonable accuracy.

Table 7. CNN (proposed) and conventional SVM performance comparison
Accuracy (avg %) Overhead in computing (min) Parameters
Conventional SVM 92.008 8.428 14,362
CNN (proposed) 93.164 2.106 6,844


4.2.3 Results of cross-validation for classification techniques
The ten-fold cross-validation technique was used to test the classification models [14, 33, 34]. Table 8 presents the average and standard deviation of the accuracy, macro precision, macro recall, and macro F1 scores. These measures assist us in evaluating and forecasting the effectiveness of HSI classification [35–37]. Table 8 shows that the greatest average accuracy on the Pavia University dataset is 93.621%, compared to SVM and LSTM.
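A sketch of this ten-fold protocol is shown below for the SVM baseline using scikit-learn; `reduced` and `targets` refer to the arrays built in the earlier preprocessing sketches, and the CNN and LSTM runs follow the same splitting scheme.

```python
# Ten-fold cross-validation sketch reporting the metrics used in Table 8.
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
scores = cross_validate(SVC(kernel="rbf"), reduced, targets,
                        cv=10, scoring=scoring)

for metric in scoring:
    vals = scores[f"test_{metric}"]
    print(f"{metric}: avg={vals.mean():.3f}, sd={vals.std():.3f}")
```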

Table 8. Cross-validation of classification methods
Classifier Accuracy (avg / SD) Precision Macro (avg / SD) Recall Macro (avg / SD) F1 Macro (avg / SD)
SVM 91.041 / 0.06 85 / 0.06 88 / 0.07 86 / 0.07
LSTM 92.96 / 0.06 86 / 0.05 89 / 0.06 87 / 0.06
CNN (proposed) 93.621 / 0.05 89 / 0.04 90 / 0.05 88 / 0.05


Because of their efficacy, performance, and capacity to handle a heavy feature set, where the number of features is quite high, SVM and CNN are the most commonly used classification techniques in HSI investigations. Both are supervised learning methods that may be applied to classification problems. In SVM [38], each pixel is plotted as a point in r-dimensional space, with the value of each component being the score of a specific feature arrangement; classification is then undertaken by locating the hyperplane that best distinguishes the two different groups.
The classification maps of three techniques on the Pavia University dataset are shown in Fig. 8.
Fig. 8. Three techniques' classification maps using the Pavia University dataset: (a) CNN (93.621%), (b) LSTM (92.960%), and (c) SVM (91.041%).


4.2.4 Evaluation of performance
The effectiveness of the recommended model is examined using four sets of measures (accuracy, precision, recall, and F1-score) and the detection time.
Table 9 demonstrates that the model can correctly identify spectral images with high accuracy and recall (sensitivity). The model's excellent recall and precision in the experiment suggest that it has considerable potential for reducing false positives and false negatives in HSI applications that use remote sensor data for land cover classification. The model's average classification time for 1,004 samples was 50.3 seconds throughout the testing, making it acceptable for real-time HSI in the land cover classification paradigm.

Table 9. Results of the proposed CNN model for land cover classification in the HSI paradigm (unit: %)
Land cover class Precision Recall F1-score
Asphalt 96 96 95
Meadows 97 96 97
Gravel 79 78 78
Trees 92 88 89
Painted metal sheets 97 97 97
Bare soil 95 98 95
Bitumen 85 91 84
Self-blocking bricks 87 86 87
Shadows 98 98 98


Answering the Third Research Question
While answering RQ3: “What is the proposed method’s performance in relation to benchmark methods?” we evaluated the effectiveness of the baseline on the given dataset to assess the proposed CNN model for HSI. We also conducted a statistical assessment to verify the usefulness of the proposed technique.

4.3.1 Using the baseline methods as a comparison
We chose a study published in [8] that used the 10-fold cross-validation approach on the same dataset to compare the proposed system with a notable recent work on remote sensor-based HSI classification. They also created a system based on deep CNN categorization methods. The comparative evaluation in terms of precision, recall, and F1 measure is presented in Table 10. As observed, the proposed model's recall (sensitivity) and F1-score findings are generally greater than those of the baseline framework. This improvement is due to the recommended platform's ability to learn and retain the classification of remote sensor-based HSI in the land cover paradigm.
Furthermore, we also compared the proposed method with other studies and presented the results in Table 11. It is evident that the proposed system with an enhanced CNN has outperformed the comparable studies.

Table 10. Comparative results against the benchmark work (unit: %)
Land cover class Sun et al. [8] Proposed framework
Precision Recall F1-score Precision Recall F1-score
Asphalt 87 92 91 96 96 95
Meadows 92 91 96 97 96 97
Gravel 78 77 77 79 78 78
Trees 85 88 87 92 88 89
Painted metal sheets 88 85 86 97 97 97
Bare soil 91 94 91 95 98 95
Bitumen 89 90 85 85 91 84
Self-blocking bricks 87 88 87 87 86 87
Shadows 91 90 91 98 98 98



Table 11. Comparison with similar works
Study Technique Accuracy (%)
Zhou et al. [11] LSTM 91.51
Sun et al. [8] SVM 89.06
Hu et al. [15] CNN 88.91
Suggested approach (proposed work) CNN with an appropriate number of hidden layers and an optimum set of parameters 93.621


External Validation of the Proposed Approach
Following an internal validation of the proposed approach to ensure model stability and design reliability, we collected two additional datasets to justify the proposed methodology externally. These datasets are described as follows.

Dataset 2 (Indian Pines): The second dataset, which covers 224 spectral bands, was taken by the AVIRIS sensor over the Indian Pines test site in Northwestern Indiana, USA. In this study, we used 200 bands after excluding 4 bands with a value of 0 and 20 disruptive bands impacted by moisture content. The picture has a spatial resolution of 20 m and spatial dimensions of 145×145 pixels. The classes together contain a total of 10,249 samples, with individual class sizes ranging from 20 to 2,455. We have specified the following layer parameters for this dataset in our CNN: n1 = 226, n2 = 204, n3 = 42, n4 = 102, n5 = 18, k1 = 26, and k2 = 6, with the maximum tally of training examples inside the given data being 82,216.

Dataset 3 (IEEE GRSS data): This dataset was collected via the IEEE GRSS data fusion contest [39]. It includes 146 spectral bands spanning wavelengths between 380 nm and 1050 nm; each band has a 4.65-nm spectral width and a 2.5-m spatial resolution. There were 2,832 pixels in the training set and 12,197 pixels in the test set, and 649,816 of the image's 664,845 pixels are unlabeled. When using this dataset in the suggested CNN classifier, the layer parameters are as follows: n1 = 224, n2 = 199, k1 = 246, k2 = 6, n3 = 402, n4 = 102, and n5 = 10, and the overall number of training examples in the given data is 21,481.

By assessing it against the datasets presented in this section, we demonstrate that the suggested model for categorizing HSI data is efficient and precise. Classifiers are developed on the available datasets in order to assess the proposed technique; models trained on the main dataset (dataset 1) are then evaluated on the two other datasets. Table 12 summarizes the results. We evaluated the suggested CNN model against both the SVM and LSTM baseline approaches [40, 41], and the suggested system was found to outperform the comparable approaches, achieving roughly 82% accuracy on average across the two datasets. These results support the suggested model's ability to improve classification efficacy.

Table 12. Proposed method's external validation
Sensitivity Specificity Accuracy
Dataset 2 (Indian Pines) SVM 0.786 0.742 0.79
LSTM 0.755 0.776 0.78
Proposed 0.82 0.827 0.825
Dataset 3 (IEEE GRSS data) SVM 0.753 0.937 0.802
LSTM 0.745 0.761 0.751
Proposed 0.824 0.824 0.816


Figs. 9 and 10 depict the effectiveness of the proposed approach when compared to comparable approaches on different datasets.
Fig. 9. External validation of the proposed approach on the Indian Pines dataset: (a) LSTM (78.0%), (b) SVM (79.0%), and (c) CNN (82.5%).


Fig. 10. External validation of the proposed approach on the IEEE GRSS dataset: (a) LSTM (75.11%), (b) SVM (80.02%), and (c) CNN (81.6%).



Conclusion

We proposed an effective CNN model in this work to handle the problem of HSI classification. Our method comprises three modules: dataset collection, data preparation, and CNN model deployment. On the given dataset, we compared our results to those of the original convolutional filters, and the proposed system outperformed classical convolutions in our trials. The proposed method is also compared to past work through a motivated baseline method. Our results (accuracy = 93.621, precision = 91.571, recall = 92, F1-score = 90.714) demonstrate that the suggested technique achieves the best overall performance across a wide range of evaluation criteria. However, the main limitations of the proposed model are the following: (1) we utilized a limited dataset from a few domains; (2) we used just one deep learning-based CNN model, with no further deep learning models; and (3) we used embeddings instead of a pre-trained CNN model. Future studies may look at the use of varied datasets from many HSI domains, different configurations of deep learning models, and specific pre-trained neural network models, such as ImageNet-pretrained networks, ResNet, and others.


Author’s Contributions

Conceptualization, DA. Methodology, DA, OB. Software, AA, MA. Validation, DA, MUA. Formal analysis, DA, AA. Investigation, OB. Resources, MZA. Data curation, AA, OB. Writing of the original draft, MZA. Writing of the review and editing, MUA. Visualization, MZA, MUA. Supervision, DA. Project administration, DA. Funding acquisition, DA.


Funding

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research work through the project number IFPRC-114-612-2020 and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.


Competing Interests

The authors declare that they have no competing interests.


Author Biography

Author
Name : Omaimah Bamasaq
Affiliation :Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
Biography : She is a Professor of Cybersecurity at the Department of Computer Science, FCIT, KAU, Dean of Community Services and Continuing Education at UJ, and a Visiting Researcher at MIT. She received her Ph.D. in Computer Science (Electronic Information Security) from the University of Manchester, UK (2006).

Author
Name : Daniyal Alghazzawi
Affiliation : Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
Biography : He is a Professor of Cybersecurity at the Computing Information Systems Department and the head of the Information Security Research Group at King Abdulaziz University. He graduated with a Ph.D. in computer science from the University of Kansas in 2007. He served in a variety of administrative and leadership roles and was awarded the Leadership Management International Certificate (LMI). In 2010, he was appointed Honorary Lecturer at the University of Essex. Daniyal has organized both domestic and international seminars and conferences. In the disciplines of smart e-learning, cybersecurity, and artificial intelligence, he is the author of multiple scholarly papers and patents. He has also served as a reviewer and editor for a number of local and international conferences, journals, workshops, and contests. Daniyal has worked as a consultant for a number of companies, assisting them in developing information security policies and obtaining certifications such as ABET, ISO27001, ISO22301, and others.

Author
Name : Suhair Alshehri
Affiliation :Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
Biography :She received the Ph.D. degree in computing and information sciences from Golisano College of Computing and Information Sciences, Rochester Institute of Technology, in 2014. She is currently an Assistant Professor with the Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University. Her main research interests include security and privacy in computer and information systems and applied cryptography.


Author
Name : Arwa Jamjoom
Affiliation : Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
Biography : She is an assistant professor in the Department of Computer Science at King Abdulaziz University in Jeddah, Saudi Arabia. Her research interests lie in data warehousing and discrete-event simulation of healthcare delivery systems.

Author
Name : Muhammad Zubair Asghar
Affiliation : Institute of Computing and Information Technology (ICIT), Gomal University, D. I. Khan (KP), Pakistan
Biography : He is an HEC-approved supervisor recognized by the Higher Education Commission (HEC), Pakistan. His Ph.D. research includes recent issues in opinion mining and sentiment analysis, computational linguistics, and natural language processing. He has more than 50 publications in journals of international repute (JCR and ISI indexed) and more than 20 years of university teaching and laboratory experience in social computing, text mining, computational linguistics, and opinion mining and sentiment analysis. Currently, he is acting as a reviewer and academic editor for different top-tier journals, such as IEEE Access and PLOS ONE. Furthermore, he has also acted as Special Session Chair (Social Computing) at the BESC 2018 International Conference (Taiwan) and as Lead Guest Editor of a Special Issue.


References

[1] S. Kakarla, “Land cover classification of hyperspectral imagery using deep neural networks: using deep learning (DL) for land cover classification of hyperspectral imagery,” 2020 [Online]. Available: https://towardsdatascience.com/land-cover-classification-of-hyperspectral-imagery-using-deep-neural-networks-2e36d629a40e.
[2] L. Dang, P. Pang, and J. Lee, “Depth-wise separable convolution neural network with residual connection for hyperspectral image classification,” Remote Sensing, vol. 12, no. 20, article no. 3408, 2020. https://doi.org/10.3390/rs12203408
[3] C. Zhao, H. Zhao, G. Wang, and H. Chen, “Hybrid depth-separable residual networks for hyperspectral image classification,” Complexity, vol. 2020, article no. 4608647, 2020. https://doi.org/10.1155/2020/4608647
[4] T. H. Hsieh and J. F. Kiang, “Comparison of CNN algorithms on hyperspectral image classification in agricultural lands,” Sensors, vol. 20, no. 6, article no. 1734, 2020. https://doi.org/10.3390/s20061734
[5] Z. Pan, C. L. Glennie, J. C. Fernandez-Diaz, C. J. Legleiter, and B. Overstreet, “Fusion of LiDAR orthowaveforms and hyperspectral imagery for shallow river bathymetry and turbidity estimation,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 7, pp. 4165-4177, 2016.
[6] D. K. Prasad and K. Agarwal, “Classification of hyperspectral or trichromatic measurements of ocean color data into spectral classes,” Sensors, vol. 16, no. 3, article no. 413, 2016. https://doi.org/10.3390/s16030413
[7] J. M. Bioucas-Dias, A. Plaza, G. Camps-Valls, P. Scheunders, N. Nasrabadi, and J. Chanussot, “Hyperspectral remote sensing data analysis and future challenges,” IEEE Geoscience and Remote Sensing Magazine, vol. 1, no. 2, pp. 6-36, 2013.
[8] W. Sun, C. Liu, Y. Xu, L. Tian, and W. Li, “A band-weighted support vector machine method for hyperspectral imagery classification,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 10, pp. 1710-1714, 2017.
[9] A. Ghasemzadeh and H. Demirel, “Hyperspectral face recognition using 3D discrete wavelet transform,” in Proceedings of 2016 6th International Conference on Image Processing Theory, Tools and Applications (IPTA), Oulu, Finland, 2016, pp. 1-4.
[10] J. Qu, Q. Du, Y. Li, L. Tian, and H. Xia, “Anomaly detection in hyperspectral imagery based on gaussian mixture model,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 11, pp. 9504-9517, 2021.
[11] F. Zhou, R. Hang, Q. Liu, and X. Yuan, “Hyperspectral image classification using spectral-spatial LSTM,” Neurocomputing, vol. 328, pp. 39-47, 2019.
[12] H. Lee and H. Kwon, “Going deeper with contextual CNN for hyperspectral image classification,” IEEE Transactions on Image Processing, vol. 26, no. 10, pp. 4843-4855, 2017.
[13] M. D. Farrell and R. M. Mersereau, “On the impact of PCA dimension reduction for hyperspectral detection of difficult targets,” IEEE Geoscience and Remote Sensing Letters, vol. 2, no. 2, pp. 192-195, 2005.
[14] M. Hammad, A. M. Iliyasu, I. A. Elgendy, and A. A. Abd El-Latif, "End-to-end data authentication deep learning model for securing IoT configurations," Human-centric Computing and Information Sciences, vol. 12, article no. 04, 2022. https://doi.org/10.22967/HCIS.2022.12.004
[15] W. Hu, Y. Huang, L. Wei, F. Zhang, and H. Li, “Deep convolutional neural networks for hyperspectral image classification,” Journal of Sensors, vol. 2015, article no. 258619, 2015. https://doi.org/10.1155/2015/258619
[16] K. Makantasis, K. Karantzalos, A. Doulamis, and N. Doulamis, “Deep supervised learning for hyperspectral data classification through convolutional neural networks,” in Proceedings of 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 2015, pp. 4959-4962.
[17] H. Zhang, Y. Li, Y. Zhang, and Q. Shen, “Spectral-spatial classification of hyperspectral imagery using a dual-channel convolutional neural network,” Remote Sensing Letters, vol. 8, no. 5, pp. 438-447, 2017.
[18] S. K. Roy, G. Krishna, S. R. Dubey, and B. B. Chaudhuri, “HybridSN: exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification,” IEEE Geoscience and Remote Sensing Letters, vol. 17, no. 2, pp. 277-281, 2020.
[19] F. Cao and W. Guo, “Deep hybrid dilated residual networks for hyperspectral image classification,” Neurocomputing, vol. 384, pp. 170-181, 2020.
[20] P. Wu, Z. Cui, Z. Gan, and F. Liu, “Three-dimensional ResNeXt network using feature fusion and label smoothing for hyperspectral image classification,” Sensors, vol. 20, no. 6, article no. 1652, 2020. https://doi.org/10.3390/s20061652
[21] B. Liu, X. Yu, P. Zhang, and X. Tan, “Deep 3D convolutional network combined with spatial-spectral features for hyperspectral image classification,” Acta Geodaetica et Cartographica Sinica, vol. 48, no. 1, pp. 53-63, 2019.
[22] M. Grana, M. A. Veganzons, and B. Ayerdi, “Hyperspectral remote sensing scenes,” [Online]. Available: http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes.
[23] L. Lin, C. Chen, and T. Xu, “Spatial-spectral hyperspectral image classification based on information measurement and CNN,” EURASIP Journal on Wireless Communications and Networking, vol. 2020, article no. 59, 2020. https://doi.org/10.1186/s13638-020-01666-9
[24] I. Bidari, S. Chickerur, H. Ranmale, S. Talawar, H. Ramadurg, and R. Talikoti, “Hyperspectral imagery classification using deep learning,” in Proceedings of 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), London, UK, 2020, pp. 672-676.
[25] N. Audebert, B. Le Saux, and S. Lefevre, “Deep learning for classification of hyperspectral data: a comparative review,” IEEE Geoscience and Remote Sensing Magazine, vol. 7, no. 2, pp. 159-173, 2019.
[26] B. Guo, S. R. Gunn, R. I. Damper, and J. D. B. Nelson, “Band selection for hyperspectral image classification using mutual information,” IEEE Geoscience and Remote Sensing Letters, vol. 3, no. 4, pp. 522-526, 2006.
[27] S. Yu, S. Jia, and C. Xu, “Convolutional neural networks for hyperspectral image classification,” Neurocomputing, vol. 219, pp. 88-98, 2017.
[28] K. Koonsanit and C. Jaruskulchai, “Band selection for hyperspectral image using principal components analysis and maxima-minima functional,” in Knowledge, Information, and Creativity Support Systems, Berlin, Heidelberg, Germany: Springer Berlin Heidelberg, 2011, pp. 103-112.
[29] H. Hassan, A. K. Bashir, M. Ahmad, V. G. Menon, I. U. Afridi, R. Nawaz, and B. Luo, “Real-time image dehazing by superpixels segmentation and guidance filter,” Journal of Real-Time Image Processing, vol. 18, no. 5, pp. 1555-1575, 2021.
[30] M. H. Khan, Z. Saleem, M. Ahmad, A. Sohaib, H. Ayaz, M. Mazzara, and R. A. Raza, “Hyperspectral imaging-based unsupervised adulterated red chili content transformation for classification: Identification of red chili adulterants,” Neural Computing and Applications, vol. 33, no. 21, pp. 14507-14521, 2021.
[31] A. Khattak, M. Z. Asghar, M. Ali, and U. Batool, “An efficient deep learning technique for facial emotion recognition,” Multimedia Tools and Applications, vol. 81, no. 2, pp. 1649-1683, 2022.
[32] A. Khattak, M. U. Asghar, U. Batool, M. Z. Asghar, H. Ullah, M. Al-Rakhami, and A. Gumaei, “Automatic detection of citrus fruit and leaves diseases using deep neural network model,” IEEE Access, vol. 9, pp. 112942-112954, 2021.
[33] N. Hussain, M. A. Khan, S. Kadry, U. Tariq, R. R. Mostafa, J. I. Choi, and Y. Nam, "Intelligent deep learning and improved whale optimization algorithm based framework for object recognition," Human-centric Computing and Information Sciences, vol. 11, article no. 34, 2021. https://doi.org/10.22967/HCIS.2021.11.034
[34] S. Jang and C. Choi, “Prioritized environment configuration for drone control with deep reinforcement learning,” Human-centric Computing and Information Sciences, vol. 12, article no. 01, 2022. https://doi.org/10.22967/HCIS.2022.12.002
[35] M. M. Salim, V. Shanmuganathan, V. Loia, and J. H. Park, "Deep learning enabled secure IoT handover authentication for blockchain networks," Human-centric Computing and Information Sciences, vol. 11, article no. 21, 2021. https://doi.org/10.22967/HCIS.2021.11.021
[36] S. Rathore, J. H. Park, and H. Chang, “Deep learning and blockchain-empowered security framework for intelligent 5G-enabled IoT,” IEEE Access, vol. 9, pp. 90075-90083, 2021.
[37] S. K. Singh, A. E. Azzaoui, T. W. Kim, Y. Pan, and J. H. Park, "DeepBlockScheme: a deep learning-based blockchain driven scheme for secure smart city," Human-centric Computing and Information Sciences, vol. 11, article no. 12, 2021. https://doi.org/10.22967/HCIS.2021.11.012
[38] A. Ozdemir and K. Polat, “Deep learning applications for hyperspectral imaging: a systematic review,” Journal of the Institute of Electronics and Computer, vol. 2, no. 1, pp. 39-56, 2020.
[39] M. Khodadadzadeh, J. Li, S. Prasad, and A. Plaza, “Fusion of hyperspectral and LiDAR remote sensing data using multiple feature learning,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 6, pp. 2971-2983, 2015.
[40] T. R. Gadekallu, D. S. Rajput, M. P. K. Reddy, K. Lakshmanna, S. Bhattacharya, S. Singh, A. Jolfaei, and M. Alazab, “A novel PCA–whale optimization-based deep neural network model for classification of tomato plant diseases using GPU,” Journal of Real-Time Image Processing, vol. 18, no. 4, pp. 1383-1396, 2021.
[41] T. R. Gadekallu, M. Alazab, R. Kaluri, P. K. R. Maddikunta, S. Bhattacharya, K. Lakshmanna, and M. Parimala, "Hand gesture classification using a novel CNN-crow search algorithm," Complex & Intelligent Systems, vol. 7, no. 4, pp. 1855-1868, 2021.

About this article
Cite this article

Omaimah Bamasaq, Daniyal Alghazzawi, Suhair Alshehri, Arwa Jamjoom, and Muhammad Zubair Asghar, "Efficient Classification of Hyperspectral Data Using Deep Neural Network Model," Human-centric Computing and Information Sciences, vol. 12, article no. 35, 2022. https://doi.org/10.22967/HCIS.2022.12.035

  • Received 4 January 2022
  • Accepted 27 February 2022
  • Published 30 July 2022