Radiation Pneumonitis Prediction Using Multi-Omics Fusion Based on a Novel Machine Learning Pipeline
• Bing Li1, Xiaoli Zheng1, Wei Guo1, Yunhan Wang1, Ronghu Mao1, Xiuyan Cheng1, Chengcheng Fan1, Ting Wang1, Zhaoyang Lou1, Hongchang Lei1, Lingguang Meng1, Yuanpeng Zhang2,3,*, and Hong Ge1,*

Human-centric Computing and Information Sciences volume 12, Article number: 49 (2022)
https://doi.org/10.22967/HCIS.2022.12.049

Abstract

Radiomics and dosiomics, two kinds of imaging features, are widely used for machine learning-based prognosis prediction in adaptive radiotherapy. Feature selection and modeling are the two main components of the radiotherapy prognosis prediction pipeline. So far, few studies have considered both the stability and the discrimination ability of features at the feature selection stage. In the modeling phase, most works fuse radiomics and dosiomics features by directly concatenating them as inputs to a model, which may omit the complementary information across different omics. Additionally, overfitting is a common issue when the training data is insufficient or noisy. To solve these problems, we have developed a novel machine learning-based pipeline and applied it to predict radiation pneumonitis for stage III non-small cell lung cancer patients under the medical Internet of Things. The contributions are as follows: in the feature selection phase, a decision criterion that considers both feature stability and feature discrimination is developed to determine appropriate feature selection methods; in the modeling phase, we have developed a non-sparse multi-kernel learning method with manifold regularization for multi-omics fusion, which can fully explore patterns from both radiomics and dosiomics features and reduce overfitting at the same time. Experimental results show that the decision criterion is effective for choosing feature selection methods. Compared to direct feature concatenation, the proposed multi-kernel fusion strategy performs better. Moreover, manifold regularization can alleviate the overfitting problem.

Keywords

Multi-Omics Fusion, Radiomics, Dosiomics, Machine Learning, Feature Selection, Multiple Kernel Learning

Introduction

uts for modeling. As indicated in [15], direct feature concatenation may omit the complementary information contained in different omics. Through the above analysis of FS and modeling, it is found that there are still some issues that need to be further addressed, which are summarized as follows:
(1) In the FS phase, in addition to strong discrimination of selected features, the feature stability is also a factor that needs to be considered. Therefore, a strategy to evaluate the stability of FS is required.
(2) In the modeling phase, direct concatenation of both radiomics and dosiomics features cannot make full use of complementary information contained in different omics. Therefore, we need a new model to explore the complementary information across radiomics and dosiomics features. Additionally, the new model should have mechanisms to alleviate the overfitting problem.
To address the above two issues, for this study, we have developed a new machine learning pipeline for radiation pneumonitis prediction and embedded it into the treatment planning system under the medical Internet of Things (IoT) [16–18]. The main contributions are summarized as follows.
(1) In the FS phase, a decision criterion which considers both the stability and discrimination ability of the features is developed to determine appropriate FS methods;
(2) In the modeling phase, we have developed a non-sparse multi-kernel learning method with manifold regularization for multi-omics fusion, which can fully explore patterns from both radiomics and dosiomics features and alleviate the overfitting problem at the same time.
The following sections are organized as follows. In Section 2, we state data preprocessing including patient characteristics and feature engineering, while in Section 3, we present our multi-kernel fusion method. In Section 4, we report our experimental results, and in Section 5, we conclude this study.

Data Preprocessing

Patient Characteristics
Table 1. The overall characteristic information of all patients

Characteristic | Overall (n=126) | p-value
Sex | | 0.04
  Male | 109 (86.5) |
  Female | 17 (13.5) |
Age (yr) | 61 (29–82) | 0.67
Pathology | | 0.46
  Squamous cell carcinoma | 79 (62.7) |
  Adenocarcinoma | 42 (33.3) |
  Others | 5 (4.0) |
RT dose (Gy) | 60 (50–70) | 0.94
Smoking | | 0.23
  Active or former | 97 (77.0) |
  Never | 29 (23.0) |
Overall stage | | 0.23
  IIIA | 80 (63.5) |
  IIIB | 46 (36.5) |
Treatment method | | 0.97
  Sequential chemoradiotherapy | 83 (65.9) |
  Concurrent chemoradiotherapy | 42 (33.3) |
  Radiotherapy | 1 (0.8) |
Adaptive radiotherapy | 54 (42.9) |
Acute radiation pneumonitis | 64 (50.8) |
Values are presented as number (%) or median (range).

The pre-treatment planning CT images, planning dose distributions, and lung segmentations were collected from the treatment planning system. All CT images were scanned on the Brilliance Big Bore CT scanner (Philips Electronics, Eindhoven, Netherlands) with a 3-mm slice thickness. The dose grid size was set at 3 mm in the dose calculation, and the lung volumes were contoured by a radiologist with more than 5 years of experience.

Extraction of Radiomics and Dosiomics Features
In this study, 4,610 radiomics features are calculated from the pre-treatment planning CT image for the whole lung region using the Python package pyradiomics [19]. The radiomics features fall into three categories: first-order statistical, three-dimensional (3D) shape-based, and textural. The first-order statistical category contains 19 features describing the distribution of voxel intensities within the region of interest (ROI). The shape-based category contains 16 descriptors of the 3D size and shape of the ROI. The textural category contains 75 features characterizing the distribution of gray levels inside the ROI. The original image and 11 filtered images are employed to calculate the first-order and higher-order features, resulting in a total of 27,660 radiomics features. The image filters include three Laplacian-of-Gaussian filters (sigma=1, 3, 6 mm) and eight wavelet filters covering all combinations of high-pass and low-pass filtering along each imaging axis. All the images are discretized into fixed bin counts using five different bin count numbers (20, 50, 100, 150, 200).
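To make the filter bank concrete, the short sketch below enumerates the image types and bin counts listed above. The string labels are illustrative only (the Laplacian-of-Gaussian labels in particular are not pyradiomics' exact naming), and the sketch does not attempt to reproduce the reported feature totals.

```python
from itertools import product

# Three Laplacian-of-Gaussian filters, sigma in mm, as listed in the text.
log_sigmas_mm = [1, 3, 6]
log_names = [f"log-sigma-{s}-mm" for s in log_sigmas_mm]  # illustrative labels

# Eight wavelet decompositions: low (L) or high (H) pass along each of 3 axes.
wavelet_names = ["wavelet-" + "".join(bands) for bands in product("LH", repeat=3)]

# Original image plus 11 filtered images.
image_types = ["original"] + log_names + wavelet_names

# Each image is discretized with five different fixed bin counts.
bin_counts = [20, 50, 100, 150, 200]
settings = [(img, b) for img in image_types for b in bin_counts]

print(len(wavelet_names), len(image_types), len(settings))
```

The wavelet labels follow the pattern seen in feature names such as "wavelet-LLL" and "wavelet-HHL" in Table 2.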
The dose features included in this study were commonly used in previous research. They are categorized into scale-invariant 3D dose moments, dose-volume histogram (DVH) dosimetric factors, and dosiomics features.
Scale-invariant 3D dose moments: They describe the weighted dose center within the organ-at-risk (OAR) volume with varying orders along with anterior-posterior, medial-lateral, and craniocaudal directions. In this study, the maximum order (3) is chosen for each dimension, resulting in 64 possible combinations of orders. Since the order (0, 0, 0) results in a constant value (1), a total of 63 dose moments are included in the dosiomics feature set.
DVH dosimetric factors: DVH summarizes the dose accumulation within a volume. It is defined as the isodose volume at varying levels of doses and is widely used in the clinic for convenient dose comparisons. DVH dosimetric factors, which are the dose values at specific volumes or volume values at specific doses, are commonly used as the evaluation metrics for plan quality assessment. In this study, we selected dose values (Gy) at multiple relative volumes from 0 to 1 and volume values at multiple relative and absolute dose values on the DVH curve.
Dosiomics: A total of 91 first-order and higher-order radiomics features are extracted from the dose map to describe the dose histogram statistics and dose texture. Only the original dose map is employed without further preprocessing.
In total, 213 dose features for each subregion and the whole lung region are extracted.
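The dose moments and DVH dosimetric factors described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact definitions used in the study: the moment is taken as a dose-weighted average of coordinate powers with each axis rescaled to [0, 1] (so that order (0, 0, 0) equals 1), voxels are assumed to have equal volume, and the function names, the mapping of array axes to anatomical directions, and the toy dose values are hypothetical.

```python
from itertools import product
import numpy as np

def dose_moments(dose, mask, max_order=3):
    """Dose-weighted coordinate moments inside `mask` (assumed definition)."""
    idx = np.argwhere(mask)                      # voxel indices, row-major order
    lo, hi = idx.min(axis=0), idx.max(axis=0)
    span = np.maximum(hi - lo, 1)                # guard against single-slice axes
    coords = (idx - lo) / span                   # each axis rescaled to [0, 1]
    weights = dose[mask] / dose[mask].sum()      # normalized dose, sums to 1
    moments = {}
    for p, q, r in product(range(max_order + 1), repeat=3):
        term = coords[:, 0] ** p * coords[:, 1] ** q * coords[:, 2] ** r
        moments[(p, q, r)] = float(np.sum(weights * term))
    return moments

def dose_at_volume(dose_voxels, rel_volume):
    """D_x: lowest dose (Gy) within the hottest `rel_volume` fraction of voxels."""
    return float(np.quantile(dose_voxels, 1.0 - rel_volume))

def volume_at_dose(dose_voxels, dose_gy):
    """V_d: fraction of voxels receiving at least `dose_gy` Gy."""
    return float(np.mean(np.asarray(dose_voxels) >= dose_gy))

# Toy dose grid (hypothetical values): every voxel lies inside the OAR.
rng = np.random.default_rng(0)
dose = rng.uniform(1.0, 60.0, size=(6, 5, 4))
mask = np.ones_like(dose, dtype=bool)

all_moments = dose_moments(dose, mask)
# Order (0, 0, 0) is constant, so it is dropped from the feature set.
moment_features = {k: v for k, v in all_moments.items() if k != (0, 0, 0)}

# Toy lung dose samples in Gy, one value per equal-sized voxel.
lung_dose = np.array([0.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 60.0, 60.0])
v20 = volume_at_dose(lung_dose, 20.0)    # relative volume receiving >= 20 Gy
d_max = dose_at_volume(lung_dose, 0.0)   # dose at relative volume 0
d_min = dose_at_volume(lung_dose, 1.0)   # dose at relative volume 1

print(len(all_moments), len(moment_features), v20, d_max, d_min)
```

With a maximum order of 3 per axis there are 4×4×4=64 order combinations, and removing the constant (0, 0, 0) term leaves the 63 dose moments mentioned above.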

Selection of Radiomics and Dosiomics Features
After feature extraction, we obtain 4,610 radiomics and 213 dosiomics features. Since the number of features is far higher than the number of patients (126 in our study), applying machine learning models to such high-dimensional data runs into the curse of dimensionality, i.e., the phenomenon that data become sparse in high-dimensional space [20]. Therefore, in this study, we need an FS strategy to reduce redundancy and irrelevancy before the modeling phase. Moreover, feature stability is also an important factor that should be considered during the FS phase: if the FS methods are not stable, the selected biomarkers will fluctuate from run to run as well. Fig. 1 shows our FS strategy.
In [20], the authors shared a Python package containing 33 methods for FS, in which supervised methods are used for irrelevancy reduction and unsupervised methods are used for redundancy reduction. In our study, we tried all methods and set the following exclusion criteria for FS method selection:
1) If the AUC value of an FS method is lower than 0.5, the method is excluded.
2) If a single run of an FS method takes more than 30 minutes, the method is excluded.

Fig. 1. Feature selection strategy: (a) SU combination and (b) decision graph.

Based on the exclusion criteria, we finally select four supervised methods, S1 (F score [21]), S2 (T score [20]), S3 (ReliefF [22]), and S4 (Fisher score [20]), and six unsupervised methods, U1 (Lap_score [20]), U2 (SPEC [20]), U3 (MCFS [20]), U4 (NDFS [20]), U5 (UDFS [20]), and U6 (Pearson_score [20]), for our following experiments. To determine the best supervised and unsupervised (SU) combination for redundancy and irrelevancy reduction, we generate all 24 (4×6) SU combinations, as shown in Fig. 1(a). In addition, the selected features are expected to be stable across different tries. Therefore, according to [23], we introduce a frequency-based criterion to measure the stability of each SU combination. Let’s suppose we have a binary matrix Z that can be defined as:

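$$Z=\begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1d}\\ z_{21} & z_{22} & \cdots & z_{2d}\\ \vdots & \vdots & \ddots & \vdots\\ z_{M1} & z_{M2} & \cdots & z_{Md} \end{pmatrix},\quad z_{ij}\in\{0,1\} \qquad (1)$$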

where each row represents the feature selection result of the d features in response to a SU combination in one try. As an example, $z_{Md}$=1 means that the d-th feature in the M-th try is selected. Then the stability based on the frequency-based criterion can be defined as:

(2)

which ranges from 0 to 1. The greater the value, the better the stability will be.
Fig. 1(b) shows the decision graph for each SU combination in which the combination with the highest AUC*Stability is selected as the final SU combination for redundancy and irrelevancy reduction.
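The selection rule can be illustrated with a small sketch. Since the exact form of (2) is not reproduced here, the code below swaps in a well-known frequency-based stability estimator (Nogueira, Sechidis, and Brown), which may differ from the criterion adopted from [23]. The function names and the AUC and stability values in the usage example are hypothetical.

```python
import numpy as np

def frequency_stability(Z):
    """Stand-in stability estimator on a binary selection matrix Z (M tries x d features)."""
    Z = np.asarray(Z, dtype=float)
    M, d = Z.shape
    p_hat = Z.mean(axis=0)                      # selection frequency of each feature
    s2 = M / (M - 1) * p_hat * (1.0 - p_hat)    # unbiased per-feature variance
    k_bar = Z.sum(axis=1).mean()                # average number of selected features
    denom = (k_bar / d) * (1.0 - k_bar / d)     # undefined if k_bar is 0 or d
    return 1.0 - s2.mean() / denom

def pick_best_combination(auc, stability):
    """Return the SU combination with the largest AUC * Stability product."""
    return max(auc, key=lambda name: auc[name] * stability[name])

# All 24 SU combinations: 4 supervised x 6 unsupervised methods.
su_combinations = [f"S{i}U{j}" for i in range(1, 5) for j in range(1, 7)]

# Identical selections in every try give perfect stability.
Z_stable = np.tile([1, 1, 0, 0, 0], (10, 1))
# Selections that alternate between two disjoint subsets are unstable.
Z_unstable = np.array([[1, 1, 0, 0, 0], [0, 0, 1, 1, 0]] * 5)

# Hypothetical scores for two combinations, for illustration only.
auc = {"S1U1": 0.70, "S4U6": 0.65}
stability = {"S1U1": 0.50, "S4U6": 0.80}
best = pick_best_combination(auc, stability)

print(len(su_combinations), frequency_stability(Z_stable), best)
```

In this toy case the combination with the lower AUC wins because its features are selected far more consistently, which is the trade-off the decision graph is meant to expose.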

Multi-Kernel Fusion with Manifold Regularization

In [24], based on the criterion of least square and manifold learning, a regression model was proposed to overcome the overfitting problem. But the proposed model is linear, so it is limited in dealing with nonlinear data. Additionally, the proposed model cannot be directly used for multi-modal data fusion. Therefore, in this section, we will extend the model proposed in [24] to a multi-kernel version for multi-modal nonlinear medical data fusion.

Ridge Regression with Manifold Regularization
Ridge regression (RR) has been widely used in many fields of machine learning and pattern recognition because it is mathematically tractable and efficient [25, 26]. Let’s suppose we have a training set {X, Y}, where X∈$R^{N\times d}$ and Y∈$R^{N\times C}$; N is the size of the training set, d is the number of features, and C is the number of classes. Then the objective function of RR can be expressed as follows:

(3)

where A is the transformation matrix. When RR is applied to classification tasks, margins between different classes are expected to be as large as possible after X∈$R^{N\times d}$ is transformed into Y∈$R^{N\times C}$. This criterion can easily cause overfitting when the training data is sparsely distributed or contains noisy samples or outliers. Many efforts have been made to solve this problem, with great success. As one representative work, Fang et al. introduced manifold learning to find a new space in which the transformed samples preserve the intrinsic geometric structure of the original input samples [24]. They assumed that samples sharing the same labels should be kept as close as possible after they are transformed. To achieve this goal, Fang et al. designed an objective function [24] expressed as:

(4)

where $W_{ij}$ is used to capture the relationship of samples $x_i$ and $x_j$. $W_{ij}$ is defined as follows:

(5)

f($x_i$)=$x_i$ A in (4) represents the transformed result of $x_i$ by the transformation matrix A. Therefore, minimizing the objective function defined in (4) ensures that if any two samples in the feature space share the same label, they are kept close together when they are transformed into the label space. By substituting f($x_i$)=$x_i$ A and (5) into (4), the manifold regularization in (4) can be rewritten as follows:

(6)

where L denotes the Laplacian matrix.
Therefore, the objective function of RR with manifold regularization can be expressed as follows:

(7)

where γ is a regularization parameter. Please note that since all training samples are used to learn the transformation matrix A, the regularization term in (3) is removed. Hence, the burden of parameter tuning is reduced.
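A minimal numerical sketch of RR with manifold regularization is given below. It rests on assumptions that are not confirmed by the text: $W_{ij}$ is taken as 1 for two different samples with the same label and 0 otherwise, L is the unnormalized Laplacian D−W, and the objective is taken as ‖XA−Y‖² + γ·tr(AᵀXᵀLXA), whose stationarity condition gives (XᵀX + γXᵀLX)A = XᵀY. The function names and toy data are hypothetical.

```python
import numpy as np

def label_laplacian(y):
    """Graph Laplacian L = D - W with W_ij = 1 for same-label pairs (assumed form)."""
    y = np.asarray(y)
    W = (y[:, None] == y[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)                     # no self-loops
    return np.diag(W.sum(axis=1)) - W

def fit_manifold_rr(X, Y, L, gamma):
    """Solve (X^T X + gamma X^T L X) A = X^T Y for the transformation matrix A."""
    lhs = X.T @ X + gamma * X.T @ L @ X
    return np.linalg.pinv(lhs) @ X.T @ Y

def manifold_penalty(X, A, L):
    """tr(A^T X^T L X A): spread of same-label samples after transformation."""
    F = X @ A
    return float(np.trace(F.T @ L @ F))

# Toy data: 40 samples, 5 features, 2 classes with one-hot targets.
rng = np.random.default_rng(0)
N, d, C = 40, 5, 2
X = rng.normal(size=(N, d))
y = rng.integers(0, C, size=N)
Y = np.eye(C)[y]
L = label_laplacian(y)

A_plain = fit_manifold_rr(X, Y, L, gamma=0.0)    # ordinary least squares
A_reg = fit_manifold_rr(X, Y, L, gamma=10.0)     # with manifold regularization

print(manifold_penalty(X, A_plain, L), manifold_penalty(X, A_reg, L))
```

Increasing γ pulls transformed samples of the same class closer together, at the cost of a larger least-squares residual on the training set.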

Kernelization and Fusion
Most medical data is very complex and nonlinear, which limits the application of RR. What’s more, in the field of radiotherapy, radiomics and dosiomics both contain useful omics-patterns that can be used for prognosis prediction. Therefore, how to fuse them in a kernelized feature space is very important. In this section, we will derive the kernelized version of RR to tackle the nonlinear and multi-omics fusion problems.
The pivotal difference between RR and kernel ridge regression (KRR) [27] is that the features used in KRR are represented in the reproducing kernel Hilbert space (RKHS), whereas in RR they are represented in the original feature space. Let’s suppose we have a nonlinear function φ(x):$R^d$→RKHS that maps the input features x =[$x_1$,$x_2$,...,$x_d$ ]^T from $R^d$ to RKHS, and a positive semidefinite kernel Gram matrix K∈$R^{N\times N}$, where each matrix element is the inner product between two input samples represented in RKHS. However, in practice, finding an appropriate φ() is difficult or even impossible. Thus, by introducing kernel techniques, the inner product is replaced by a specific kernel function. That is, $K_{ij}$=k($x_i$,$x_j$), where k is a specific kernel function, e.g., the Gaussian kernel function or the polynomial kernel function. In addition, according to [24], in RKHS, the solution to A can be expressed as a linear combination of training samples, that is, A=($X_φ$ )^T G, where G=[($g_1$ )^T,($g_2$ )^T,...,($g_N$ )^T] can be considered as a new transformation matrix in RKHS. With the above definitions, the objective function in (7) can be updated as:

(8)

To achieve multi-omics fusion, based on (8), we suppose that the kernel matrix K can be expressed as a linear combination of several different types of kernel matrices, that is, $K=\sum_{m}\alpha_m K_m$, where $\alpha_m$ is the kernel combination coefficient of the m-th kernel matrix $K_m$. In order to fully explore the complementary information contained in each omics-pattern, we impose a p-norm (p>1) constraint on α to get a non-sparse distribution of kernel weights. By substituting $K=\sum_{m}\alpha_m K_m$ into (8), we have the following:

(9)

Compared to the proposed model in [24] and the sparse kernel weighting strategy used in [10], the innovation of our method can be refined as follows:
1) We extend the proposed model in [24] to a multi-kernel version, so that our method not only inherits the merits of the model in [24], but also the training patterns can be observed from different types of kernel windows in the reproducing kernel Hilbert space. In this way, more detailed recognition patterns are expected to be found.
2) In contrast to the sparse kernel weighting strategy which focuses on few modalities, we introduce p-norm (p>1) regularization to act on kernel weight learning. In this way, the complementary patterns across different modalities are expected to be explored.
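The weighted kernel combination at the heart of the fusion can be sketched as follows. This only illustrates forming a combined kernel from non-negative weights scaled to unit p-norm; the exact constraint and the learned weights of the proposed method are not reproduced. The linear kernels, the weight values, and the function names are assumptions for illustration.

```python
import numpy as np

def normalize_lp(alpha, p):
    """Clip weights at zero and rescale them to unit p-norm."""
    alpha = np.maximum(np.asarray(alpha, dtype=float), 0.0)
    return alpha / np.linalg.norm(alpha, ord=p)

def combine_kernels(kernels, alpha):
    """Weighted sum of kernel matrices: K = sum_m alpha_m * K_m."""
    return sum(a * K for a, K in zip(alpha, kernels))

# Toy data: 8 patients with 3 radiomics and 2 dosiomics features (hypothetical).
rng = np.random.default_rng(0)
X_rad = rng.normal(size=(8, 3))
X_dos = rng.normal(size=(8, 2))

# One linear kernel per omics, used here only for simplicity.
kernels = [X_rad @ X_rad.T, X_dos @ X_dos.T]

alpha = normalize_lp([0.7, 0.3], p=2)
K = combine_kernels(kernels, alpha)

print(alpha, K.shape)
```

Because the weights are non-negative and each $K_m$ is positive semidefinite, the combined matrix K remains a valid kernel matrix.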
Optimization and Algorithm
The optimization of (9) can be implemented by alternately updating G and α. When α is given, G has a closed-form solution that can be derived by setting the partial derivative of (9) with respect to G to 0. Therefore, we have the following:

(10)

Similarly, when G is given, α has a closed-form solution that can be derived by setting the partial derivative of (9) with respect to α to 0. Therefore, we have the following:

(11)

With the two update rules in (10) and (11) for G and α, the algorithm of multiple kernel fusion with manifold regularization (MRMKF) can be deduced, with the related steps listed in Algorithm 1. The time complexity of MRMKF mainly consists of two parts: the computation of L and $K_m$, and the computation of α and G. The asymptotic time complexity of computing L and $K_m$ is O($N^2$), where N is the number of training samples. The asymptotic time complexity of computing α and G is O($N^3$). Therefore, the asymptotic time complexity of MRMKF is O($N^3$).
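As a rough illustration of the G-step only, the sketch below solves for G with the combined kernel K held fixed. It assumes the kernelized objective has the form ‖KG−Y‖² + γ·tr(GᵀKLKG), so it may differ from the update in (10); the α-step is not sketched. The same-label graph Laplacian, the Gaussian bandwidth, the small `jitter` term, and the toy data are all assumptions.

```python
import numpy as np

def solve_G(K, L, Y, gamma, jitter=1e-8):
    """G-step under the assumed objective ||K G - Y||^2 + gamma * tr(G^T K L K G).

    The gradient is 2 K [(K + gamma L K) G - Y], so any G solving
    (K + gamma L K) G = Y is a stationary point. The N x N solve is O(N^3).
    """
    N = K.shape[0]
    return np.linalg.solve(K + gamma * (L @ K) + jitter * np.eye(N), Y)

# Toy data: 20 samples, 3 features, 2 classes with one-hot targets.
rng = np.random.default_rng(0)
N, C = 20, 2
X = rng.normal(size=(N, 3))
y = rng.integers(0, C, size=N)
Y = np.eye(C)[y]

# Gaussian kernel matrix (bandwidth 0.2 chosen only to keep K well conditioned).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-d2 / (2 * 0.2 ** 2))

# Same-label graph Laplacian (assumed form of W).
W = (y[:, None] == y[None, :]).astype(float)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

gamma = 10.0
G = solve_G(K, L, Y, gamma)

# Gradient of the assumed objective at G; it should be close to zero.
grad = 2 * K @ ((K + gamma * (L @ K)) @ G - Y)
print(G.shape, np.linalg.norm(grad))
```

The dense linear solve is what drives the cubic cost in N noted above.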

Experimental Study

Settings
The workflow for determining the best SU combination for redundancy and irrelevancy reduction is shown in Fig. 2. First, the bootstrap strategy is used to partition each omics dataset into “bootData” and “OOBData,” where “bootData” is used for training and “OOBData” for testing [28]. As we can see, supervised and unsupervised FS methods are combined to select features in this workflow. After FS, RR is selected as the classifier for modeling. This procedure is repeated M times for stability evaluation.
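The bootstrap partition can be sketched as follows, assuming the usual scheme in which patients are drawn with replacement for training and the patients never drawn form the out-of-bag test set. The function name, the seed, and the number of repeats M are hypothetical.

```python
import numpy as np

def bootstrap_split(n_samples, rng):
    """Return (bootData indices, OOBData indices) for one bootstrap try."""
    boot_idx = rng.integers(0, n_samples, size=n_samples)     # draw with replacement
    oob_idx = np.setdiff1d(np.arange(n_samples), boot_idx)    # never drawn
    return boot_idx, oob_idx

rng = np.random.default_rng(0)
n_patients = 126   # cohort size reported in Table 1
M = 100            # hypothetical number of repeats

splits = [bootstrap_split(n_patients, rng) for _ in range(M)]
oob_fraction = np.mean([len(oob) / n_patients for _, oob in splits])
print(round(float(oob_fraction), 3))
```

On average roughly a third of the patients fall out of bag in each try, so every repeat is evaluated on data the model has not seen.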

Fig. 2. Workflow process for the best SU combination for feature redundancy and irrelevancy reduction.

After we determine the best SU combination for feature redundancy and irrelevancy reduction, we use the workflow in Fig. 3 to evaluate the proposed model. First, the patient cohort is divided into a training set (70%) and a testing set (30%). In phase 1, for each omics in the training set, we further partition it into a training set (70%) and a validation set (30%), where the validation set is used to determine the best model. That is, the best model (optimal feature set) is the one with the best validation AUC. Once the best models for both radiomics and dosiomics are obtained, we further use the best SU combination to reduce the redundancy and irrelevancy that may exist in both radiomics and dosiomics. In phase 2, we use our proposed method to fuse the radiomics and dosiomics features.

Fig. 3. Workflow process of multi-omics fusion.

Regarding multi-omics fusion in phase 2, the “all-single” strategy for multi-kernel learning is adopted [15] because it considers the contribution of both combined features and single features, as shown in Fig. 4. “A” denotes the combination of radiomics and dosiomics features, “S” denotes one single feature, and “KM” denotes the kernel matrix. As an example, suppose we have a dataset with four patients, each with three radiomics features and three dosiomics features. Then “A” is the 4×6 matrix of all six features, and each “S” is a single 4×1 feature column, giving seven feature sets in total. In our study, we select the Gaussian kernel with kernel parameters {0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20} and the polynomial kernel with kernel parameters {1, 3, 5} for multi-kernel fusion, i.e., 13 kernel functions. Therefore, in this example we finally have 7×13=91 kernel matrices, and the multi-kernel learning aims to learn the weights of these 91 kernel matrices.
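The all-single construction can be sketched as below. It assumes that the Gaussian kernel parameter is a bandwidth σ and that the polynomial kernel parameter is the degree, neither of which is stated explicitly; the function names and the ordering of the resulting kernel matrices are illustrative and need not match the indices used in the figures.

```python
import numpy as np

GAUSS_PARAMS = [0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20]   # assumed to be bandwidths
POLY_PARAMS = [1, 3, 5]                                 # assumed to be degrees

def gaussian_kernel(X, sigma):
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    return np.exp(-d2 / (2 * sigma ** 2))

def polynomial_kernel(X, degree):
    return (X @ X.T + 1.0) ** degree

def all_single_kernels(X):
    """Kernel matrices for "A" (all columns) and each "S" (one column)."""
    views = [X] + [X[:, [j]] for j in range(X.shape[1])]
    kernels = []
    for V in views:
        kernels += [gaussian_kernel(V, s) for s in GAUSS_PARAMS]
        kernels += [polynomial_kernel(V, d) for d in POLY_PARAMS]
    return kernels

rng = np.random.default_rng(0)
X_example = rng.normal(size=(4, 6))   # four patients, 3 radiomics + 3 dosiomics features
X_top5 = rng.normal(size=(4, 5))      # four patients, five selected features

kernels_example = all_single_kernels(X_example)
kernels_top5 = all_single_kernels(X_top5)
print(len(kernels_example), len(kernels_top5))
```

With 13 kernel functions, six features yield (1+6)×13=91 kernel matrices, and five features yield (1+5)×13=78.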

Fig. 4. All-single fusion strategy.

Experimental Results
In this section, we report our experimental results from four aspects: selection of an SU combination, validation of radiomics and dosiomics, modeling of multi-omics, and parameter sensitivity analysis.

4.2.1 Selection of a SU combination
Fig. 5 shows the decision results of a SU combination in terms of AUC*Stability. In Fig. 5(a), we see that both S2U1 and S4U6 achieve the highest AUC*Stability (0.50), which means that S2U1 and S4U6 keep the best balance between feature stability and prediction performance. Therefore, in the modeling phase, we can choose either S2U1 or S4U6 as the best SU combination for supervised and unsupervised radiomics FS. In Fig. 5(b), we see that S4U6 achieves the highest AUC*Stability (0.51). Accordingly, S4U6 is used to select dosiomics features.

Fig. 5. Decision results of SU combination in terms of AUC*Stability: (a) radiomics and (b) dosiomics.

4.2.2 Validation of radiomics and dosiomics
Fig. 6 shows the validation results in terms of AUC on radiomics and dosiomics, respectively. As shown in Fig. 3, we can get the best model on each omics by finding the best validation performance. From Fig. 6(a) and 6(b), we see that the first 10 features (best feature set) correspond to the best model on radiomics, and the first five features correspond to the best model on dosiomics.

Fig. 6. Modeling results of single-omics: (a) radiomics and (b) dosiomics.

4.2.3 Modeling of multi-omics
With the best feature set of each omics, Fig. 7 shows the multi-omics fusion results based on multi-kernel learning with manifold regularization. In Fig. 7(a), we see that in comparison to the single-omics modeling results shown in Fig. 6, multi-omics fusion significantly improves the radiation pneumonitis prediction performance. From Fig. 7(a), we also see that when we select the first five features, shown in Table 2, our model achieves the best prediction result. We see from Table 2 that the final model contains two radiomics and three dosiomics features.
During multi-omics fusion, we select the Gaussian kernel with kernel parameters {0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20} and the polynomial kernel with kernel parameters {1, 3, 5} as the kernel functions. Therefore, for the top 5 features, we have 78 kernel matrices in total (13 kernel functions applied to the combined feature set and to each of the five single features). Fig. 7(b) shows the kernel weight of each kernel matrix. As we can see, kernel matrices 64, 67, 61, and 73 contribute the most to the final decision. Looking more closely, we find that kernel matrix 64 is generated from the feature “CT_Image_LungT_wavelet-LLL_glszm_GrayLevelVariance_20_binCount,” using the polynomial kernel with kernel parameter 1. Kernel matrix 67 is generated from the feature “LungT_dose_original_glrlm_RunEntropy,” using the polynomial kernel with kernel parameter 1. Kernel matrix 61 is generated from all of the top 5 features, using the polynomial kernel with kernel parameter 1. Kernel matrix 73 is generated from the feature “CT_Image_LungT_wavelet-HHL_firstorder_Minimum,” using the polynomial kernel with kernel parameter 1.

Fig. 7. Modeling results of multi-omics fusion: (a) AUC against number of features and (b) weight against number of kernels.

Table 2. Final top 5 radiomics and dosiomics features
Category | Feature name
Radiomics | CT_Image_LungT_wavelet-LLL_glszm_GrayLevelVariance_20_binCount
Dosiomics | LungT_dose_original_glrlm_RunEntropy
Dosiomics | LungT_minimum_dose
Radiomics | CT_Image_LungT_wavelet-HHL_firstorder_Minimum
Dosiomics | LungT_dose_moment_0_3_3

To further highlight the superiority of multi-omics fusion, we directly combine radiomics and dosiomics features as inputs for RR. Fig. 8 shows the comparison results. As we can see, multi-omics fusion performs better than direct feature combination in terms of both training and testing AUC.

Fig. 8. Multi-omics fusion versus direct feature combination: (a) training and (b) testing.

4.2.4 Parameter sensitivity analysis
As shown in the objective function in (9), we have two parameters in total, γ and p, to be set in advance. γ is a regularization parameter used to control the contribution of the manifold regularization term. In particular, when γ is set to 0, our method degenerates into a version that does not address the overfitting problem. p is the parameter of the p-norm used to control the distribution of kernel weights. To observe the impact of these two parameters, we plot the training and testing AUC against different values of γ and p, as shown in Fig. 9. From Fig. 9(a), we see that the training AUC basically remains stable as γ increases, while the testing AUC gradually increases and then becomes stable. This indicates two findings. The first is that as γ increases, the difference between the training AUC and the testing AUC gradually decreases, which means that the overfitting problem is reduced by manifold regularization. This is because manifold regularization assumes that samples sharing the same labels should be kept as close as possible after they are transformed into the label space. The second finding is that our method is not highly sensitive to γ, because the testing AUC remains stable when γ ranges from 1e02 to 1e05. Therefore, in practice, it is easy to set γ. From Fig. 9(b), we see that when p>2.5, the testing AUC becomes stable. Therefore, our method is also not highly sensitive to p. Additionally, we see that when p=1.1 or p=1.5, the testing AUC is lower than 0.6. The reason can be seen in Fig. 10. When p is very small, such as p=1.1, the kernel weight distribution is very sparse (Fig. 10(a)); that is, only a few kernel matrices are used for the final decision. However, since radiomics and dosiomics features may contain complementary patterns, their effective fusion may have a positive effect on the final decision. When p is set to 4 and 6 (Fig. 10(b) and 10(c)), we see that the kernel weight distribution becomes non-sparse, and thus more kernel matrices take part in the final decision. With more kernel matrices, the testing AUC increases from less than 0.6 to 0.67.

Fig. 9. Parameter sensitivity analysis: (a) AUC against γ and (b) AUC against p.

Fig. 10. Kernel weight distribution for different values of p: (a) p=1.1, (b) p=4, and (c) p=6.

Discussion
In this experimental study, we have evaluated a machine learning pipeline proposed for radiation pneumonitis prediction using both radiomics and dosiomics features. The pipeline contains an FS phase and a modeling phase.
In medical data analysis, notably radiomics- or dosiomics-based analysis, the number of patients is often limited, but the number of features describing patients is very large. Therefore, selecting discriminative features is very important for machine learning-based model construction [15, 20]. For instance, Yu et al. [6] employed LASSO to select discriminative radiomics features for distant metastasis prediction of nasopharyngeal carcinoma patients. For the same application scenario, Zhang et al. [5] employed mRMR for feature selection. Although their studies generated acceptable results, they did not consider the stability of the FS methods, which is very important to the robustness of the prediction pipeline. Therefore, in this study, in the FS phase, a decision criterion that considers both stability and prediction performance is developed to determine appropriate FS methods for reducing the redundancy and irrelevancy of features. From Fig. 5, we find that in terms of AUC*Stability, the decision graph can easily identify the best SU combination for feature selection.
In addition to radiomics features, dosiomics features have also been demonstrated to correlate with radiotherapy prognosis [7, 8]. For instance, in [7], the authors combined radiomics and dosiomics features to predict locoregional recurrence for head and neck cancer patients. However, in these studies, different kinds of features were directly concatenated as inputs for modeling. As indicated in [15], direct feature concatenation may omit the complementary information across different omics. Therefore, in this study, based on the model proposed in [24], we proposed a multi-kernel model with manifold regularization to fuse radiomics and dosiomics features in the modeling phase. Whereas the sparse kernel weighting strategy focuses on few modalities, we introduce p-norm (p>1) regularization to act on kernel weight learning. In this way, the complementary patterns across different modalities are expected to be explored. From Table 2 and the comparison with direct feature concatenation shown in Fig. 8, we indeed see the power of multi-omics fusion. Additionally, we see from Fig. 9(a) that the overfitting problem is reduced by manifold regularization.
Our pipeline is not without limitations. For instance, multi-kernel fusion requires a large amount of memory to store kernel matrices and incurs a high cost to compute kernel weights. Therefore, improving the efficiency of multi-kernel fusion remains an open problem. Recent studies [29, 30] have shown that the stochastic-variance-reduced-gradient (SVRG) method enables efficient multi-kernel learning, so adopting it is expected to reduce the computational burden of multi-kernel learning. Therefore, in our future studies, SVRG or SVRG-based variants are expected to be introduced for fast model optimization.

Conclusion

In this study, under medical IoT, we developed a novel machine learning-based pipeline including an FS phase and a modeling phase to predict radiation pneumonitis for stage III NSCLC patients. The study was approved by the hospital ethical committee of the Affiliated Cancer Hospital of Zhengzhou University, and stage IIIA/B NSCLC patients were retrospectively collected from 2015 to 2019. We used pyradiomics to extract features, finally obtaining 4,610 radiomics features and 213 dosiomics features. In the feature selection phase, we designed a decision criterion that considers both stability and accuracy to select appropriate FS methods. In the modeling phase, we developed a multi-kernel fusion method to fully explore patterns from both radiomics and dosiomics features. The experimental results show that, in terms of AUC*Stability, the proposed decision criterion can easily identify the best SU combination for feature selection. Also, in comparison to direct feature concatenation, multi-omics fusion can explore more patterns for the prediction model.

Author’s Contributions

Conceptualization, HG, YZ. Funding acquisition, HG, YZ, BL. Investigation and methodology, YZ, BL. Project administration, HG, YZ, BL. Resources, WG, YW. Supervision, HG, YZ. Writing of the original draft, BL, XL, WG, XC, CF, TW. Writing of the review and editing, HG, YZ, HL, ZL, TW, CF. Software, BL, ZL, XC. Validation, XZ, WG, RM. Formal analysis, YZ, BL, RM, CF, ZL. Data curation, XZ, YW, CF, TW. Visualization, WG, YW, RM. All the authors have proofread the final version.

Funding

This work was supported in part by the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (No. SBGJ202103038 and SBGJ202102056), Henan Province Key R&D and Promotion Project (Science and Technology Research) (No. 222102310015), Natural Science Foundation of Henan Province (No. 222300420575), Henan Province Science and Technology Research (No. 222102310322), the National Natural Science Foundation of China (No. 82072019), the Natural Science Foundation of Jiangsu Province (No. BK20201441), and Jiangsu Post-doctoral Research Funding Program (No. 2020Z020).

Competing Interests

The authors declare that they have no competing interests.

Author Biography

Name : Bing Li
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Bing Li received his Ph.D. degree in Nuclear Science and Technology from the University of Science and Technology of China in 2018. He is currently a radiotherapy physicist, focusing on medical imaging, radiomics, and machine learning.

Name : Xiaoli Zheng
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Xiaoli Zheng is a physician of radiation oncology. She is focusing on radiotherapy research for lung and esophageal cancer.

Name : Wei Guo
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Wei Guo is a medical physicist with a master’s degree. He is focusing on research of radiotherapy physics.

Name : Yunhan Wang
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Yunhan Wang is a master’s student in radiotherapy oncology.

Name : Ronghu Mao
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Ronghu Mao is a medical physicist with a master’s degree. He is focusing on radiotherapy physics technology and Flash radiotherapy.

Name : Xiuyan Cheng
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Xiuyan Cheng is a medical physicist with a master’s degree. He is focusing on research of radiotherapy physics.

Name : Chengcheng Fan
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Chengcheng Fan is a physician of radiation oncology. He is focusing on radiotherapy research for lung and esophageal cancer.

Name : Ting Wang
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Ting Wang is a physician of radiation oncology. She is focusing on radiotherapy research for lung and esophageal cancer.

Name : Zhaoyang Lou
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Zhaoyang Lou is a medical physicist with a Ph.D. degree. He is focusing on radiotherapy physics technology.

Name : Hongchang Lei
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Hongchang Lei is a chief medical physicist. He is focusing on radiotherapy physics technology.

Name : Lingguang Meng
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Lingguang Meng is a medical physicist and engineer with a master’s degree. He is focusing on radiotherapy physics technology.

Name : Yuanpeng Zhang
Affiliation : Department of Medical Informatics, Nantong University, Nantong, China
Biography : Yuanpeng Zhang received his Ph.D. degree in information engineering from the School of Computer Application Technology, Jiangnan University, Wuxi, China, in 2018. He is currently an Associate Professor with the Department of Medical Informatics, Nantong University, Nantong, China. He is also a Postdoctoral Fellow with the Department of Health Information Technology, Hong Kong Polytechnic University, Hong Kong. He has authored or coauthored about 20 papers in international/national journals including IEEE TRANSACTIONS ON FUZZY SYSTEMS, IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, KNOWLEDGE-BASED SYSTEMS, and ACM Transactions on Internet Technology. His main research interests include pattern recognition and data mining.

Name : Hong Ge
Affiliation : Department of Radiation Oncology, Affiliated Cancer Hospital of Zhengzhou University, Henan Cancer Hospital, Zhengzhou, China
Biography : Hong Ge is a full professor in radiotherapy oncology. She is the director of the department of radiotherapy oncology. She is focusing on the research of lung and esophageal cancer with radiotherapy.

References

[1] J. D. Bradley, C. Hu, R. R. Komaki, G. A. Masters, G. R. Blumenschein, S. E. Schild, et al., “Long-term results of NRG oncology RTOG 0617: standard-versus high-dose chemoradiotherapy with or without cetuximab for unresectable stage III non–small-cell lung cancer,” Journal of Clinical Oncology, vol. 38, no. 7, pp. 706-714, 2020.
[2] M. Cheng, S. Jolly, W. O. Quarshie, N. Kapadia, and F. D. Vigneau, “Modern radiation further improves survival in non-small cell lung cancer: an analysis of 288,670 patients,” Journal of Cancer, vol. 10, no. 1, pp. 168-177, 2019.
[3] S. H. Abid, V. Malhotra, and M. C. Perry, “Radiation-induced and chemotherapy-induced pulmonary injury,” Current Opinion in Oncology, vol. 13, no. 4, pp. 242-248, 2001.
[4] S. Ramella, M. Fiore, S. Silipigni, M. C. Zappa, M. Jaus, A. M. Alberti, et al., “Local control and toxicity of adaptive radiotherapy using weekly CT imaging: results from the LARTIA trial in stage III NSCLC,” Journal of Thoracic Oncology, vol. 12, no. 7, pp. 1122-1130, 2017.
[5] Y. Zhang, S. Lam, T. Yu, X. Teng, J. Zhang, F. K. H. Lee, et al., “Integration of an imbalance framework with novel high-generalizable classifiers for radiomics-based distant metastases prediction of advanced nasopharyngeal carcinoma,” Knowledge-Based Systems, vol. 235, article no. 107649, 2022. https://doi.org/10.1016/j.knosys.2021.107649
[6] T. T. Yu, S. K. Lam, L. H. To, K. Y. Tse, N. Y. Cheng, Y. N. Fan, et al., “Constructing novel prognostic biomarkers of advanced nasopharyngeal carcinoma from multiparametric MRI radiomics using ensemble-model based iterative feature selection,” in Proceedings of 2019 International Conference on Medical Imaging Physics and Engineering (ICMIPE), Shenzhen, China, 2019, pp. 1-7.
[7] B. Liang, H. Yan, Y. Tian, X. Chen, L. Yan, T. Zhang, Z. Zhou, L. Wang, and J. Dai, “Dosiomics: extracting 3D spatial features from dose distribution to predict incidence of radiation pneumonitis,” Frontiers in Oncology, vol. 9, article no. 269, 2019. https://doi.org/10.3389/fonc.2019.00269
[8] A. Wu, Y. Li, M. Qi, X. Lu, Q. Jia, F. Guo, et al., “Dosiomics improves prediction of locoregional recurrence for intensity modulated radiotherapy treated head and neck cancer cases,” Oral Oncology, vol. 104, article no. 104625, 2020. https://doi.org/10.1016/j.oraloncology.2020.104625
[9] J. S. Park and J. H. Park, “Advanced technologies in Blockchain, machine learning, and big data,” Journal of Information Processing Systems, vol. 16, no. 2, pp. 239-245, 2020.
[10] V. Shanmuganathan, H. R. Yesudhas, K. Madasamy, A. A. Alaboudi, A. K. Luhach, and N. Z. Jhanjhi, “AI based forecasting of influenza patterns from Twitter information using random forest algorithm,” Human-centric Computing and Information Sciences, vol. 11, article no. 33, 2021. https://doi.org/10.22967/HCIS.2021.11.033
[11] Z. Zhou, S. Li, G. Qin, M. Folkert, S. Jiang, and J. Wang, “Multi-objective-based radiomic feature selection for lesion malignancy classification,” IEEE Journal of Biomedical and Health Informatics, vol. 24, no. 1, pp. 194-204, 2019.
[12] Q. Zhang, Y. Xiao, J. Suo, J. Shi, J. Yu, Y. Guo, Y. Wang, and H. Zheng, “Sonoelastomics for breast tumor classification: a radiomics approach with clustering-based feature selection on sonoelastography,” Ultrasound in Medicine & Biology, vol. 43, no. 5, pp. 1058-1069, 2017.
[13] S. Li, K. Wang, Z. Hou, J. Yang, W. Ren, S. Gao, et al., “Use of radiomics combined with machine learning method in the recurrence patterns after intensity-modulated radiotherapy for nasopharyngeal carcinoma: a preliminary study,” Frontiers in Oncology, vol. 8, article no. 648, 2018. https://doi.org/10.3389/fonc.2018.00648
[14] M. Vallieres, E. Kay-Rivest, L. J. Perrin, X. Liem, C. Furstoss, H. J. Aerts, et al., “Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer,” Scientific Reports, vol. 7, article no. 10117, 2017. https://doi.org/10.1038/s41598-017-10371-5
[15] Y. Zhang, S. Wang, K. Xia, Y. Jiang, and P. Qian, “Alzheimer’s disease multiclass diagnosis via multimodal neuroimaging embedding feature selection and fusion,” Information Fusion, vol. 66, pp. 170-183, 2021.
[16] J. D. Lee, H. S. Cha, S. Rathore, and J. H. Park, “M-IDM: a multi-classification based intrusion detection model in healthcare IoT,” Computers, Materials and Continua, vol. 67, no. 2, pp. 1537-1553, 2021.
[17] M. M. Salim, S. Rathore, and J. H. Park, “Distributed denial of service attacks and its defenses in IoT: a survey,” The Journal of Supercomputing, vol. 76, no. 7, pp. 5320-5363, 2020.
[18] J. H. Park, S. Rathore, S. K. Singh, M. M. Salim, A. E. Azzaoui, T. W. Kim, Y. Pan, and J. H. Park, “A comprehensive survey on core technologies and services for 5G security: taxonomies, issues, and solutions,” Human-centric Computing and Information Sciences, vol. 11, article no. 3, 2021. https://doi.org/10.22967/HCIS.2021.11.003
[19] S. Loetpipatwanich and P. Vichitthamaros, “Sakdas: a Python package for data profiling and data quality auditing,” in Proceedings of 2020 1st International Conference on Big Data Analytics and Practices (IBDAP), Bangkok, Thailand, 2020, pp. 1-4.
[20] J. Li, K. Cheng, S. Wang, F. Morstatter, R. P. Trevino, J. Tang, and H. Liu, “Feature selection: a data perspective,” ACM Computing Surveys, vol. 50, no. 6, article no. 94, 2017. https://doi.org/10.1145/3136625
[21] P. Tao, H. Yi, C. Wei, L. Y. Ge, and L. Xu, “A method based on weighted F-score and SVM for feature selection,” in Proceedings of 2013 25th Chinese Control and Decision Conference (CCDC), Guiyang, China, 2013, pp. 4287-4290.
[22] X. Liu, X. Wang, and Q. Su, “Feature selection of medical data sets based on RS-RELIEFF,” in Proceedings of 2015 12th International Conference on Service Systems and Service Management (ICSSSM), Guangzhou, China, 2015, pp. 1-5.
[23] S. Nogueira, K. Sechidis, and G. Brown, “On the stability of feature selection algorithms,” Journal of Machine Learning Research, vol. 18, article no. 174, 2017.
[24] X. Fang, Y. Xu, X. Li, Z. Lai, W. K. Wong, and B. Fang, “Regularized label relaxation linear regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 4, pp. 1006-1018, 2018.
[25] Z. Deng, K. S. Choi, Y. Jiang, and S. Wang, “Generalized hidden-mapping ridge regression, knowledge-leveraged inductive transfer learning for neural networks, fuzzy systems and kernel methods,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2585-2599, 2014.
[26] M. Byrtek, F. O'Sullivan, M. Muzi, and A. M. Spence, “An adaptation of ridge regression for improved estimation of kinetic model parameters from PET studies,” IEEE Transactions on Nuclear Science, vol. 52, no. 1, pp. 63-68, 2005.
[27] C. Zhao, W. Liu, Y. Xu, and J. Wen, “Hyperspectral image classification via spectral–spatial shared kernel ridge regression,” IEEE Geoscience and Remote Sensing Letters, vol. 16, no. 12, pp. 1874-1878, 2019.
[28] A. Brenning, “Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing: the R package sperrorest,” in Proceedings of 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 2012, pp. 5372-5375.
[29] M. Alioscha-Perez, M. C. Oveneke, and H. Sahli, “Svrg-mkl: a fast and scalable multiple kernel learning solution for features combination in multi-class classification problems,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 5, pp. 1710-1723, 2020.
[30] Z. Zhang, “An efficient empirical solver for localized multiple kernel learning via DNNs,” in Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 2021, pp. 647-654.
