An ensemble prediction model of solar proton events (SPEs), combining information on solar flares and coronal mass ejections (CMEs), is built. In this model, solar flares are parameterized by the peak flux, the duration and the longitude, and CMEs are parameterized by the width, the speed and the measurement position angle. The importance of each parameter for the occurrence of SPEs is estimated by the information gain ratio. We find that the CME width and speed are more informative than the flare’s peak flux and duration. As the physical mechanism of SPEs is not very clear, a hidden naive Bayes approach, which is a probability-based method from the field of machine learning, is used to build the prediction model from the observational data. SPEs are known to originate from solar flares and/or shock waves associated with CMEs. Hence, we first build two base prediction models using the properties of solar flares and CMEs, respectively. The outputs of these models are then combined to generate the ensemble prediction model of SPEs. The ensemble prediction model, which incorporates the complementary information of solar flares and CMEs, achieves better performance than either base prediction model taken separately.
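The information gain ratio used to rank the parameters can be illustrated with a minimal sketch. It assumes that each continuous parameter (for example the CME width) has already been discretized into bins; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_bins, labels):
    """Information gain of a binned feature divided by its split information."""
    n = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    # Entropy of the labels that remains once the feature bin is known.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_bins)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data (assumed): 1 = SPE occurred, 0 = no SPE.
spe = [1, 1, 0, 0]
width_bin = ["wide", "wide", "narrow", "narrow"]   # separates the classes
duration_bin = ["long", "short", "long", "short"]  # carries no information

ratio_width = gain_ratio(width_bin, spe)
ratio_duration = gain_ratio(duration_bin, spe)
```

In this toy case the binned width receives a higher gain ratio than the binned duration, which is the kind of ranking the abstract reports for the real parameters.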
Three new longitudinal magnetic field parameters are extracted from SOHO/MDI magnetograms to characterize properties of the stressed magnetic field in active regions, and their flare productivities are calculated for 1055 active regions. We find that the proposed parameters can be used to distinguish flaring samples from non-flaring samples. Using the long-term accumulated MDI data, we build a solar flare prediction model with a data mining method. The decision boundary that separates flaring from non-flaring samples is determined by the decision tree algorithm. Finally, the performance of the prediction model is evaluated by 10-fold cross-validation. We conclude that an efficient solar flare prediction model can be built from the proposed longitudinal magnetic field parameters with the data mining method.
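The evaluation procedure can be sketched as follows. This is a simplified illustration only: a one-level decision tree (a "stump") on a single parameter stands in for the full decision tree algorithm, and the data, function names and fold assignment are assumptions rather than the setup used in the paper.

```python
def fit_stump(xs, ys):
    """Choose the threshold that minimizes training errors,
    predicting 'flaring' (1) whenever x > threshold."""
    best_t, best_err = None, len(xs) + 1
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def k_fold_accuracy(xs, ys, k=10):
    """Hold out every k-th sample in turn, train on the rest, and
    return the fraction of held-out samples predicted correctly."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        t = fit_stump(train_x, train_y)
        correct += sum((xs[i] > t) == bool(ys[i]) for i in test_idx)
    return correct / n


# Toy parameter values (assumed): larger values tend to flare.
param = list(range(20))
flaring = [0] * 10 + [1] * 10
acc = k_fold_accuracy(param, flaring, k=10)
```

Because every sample is held out exactly once, the returned accuracy is measured only on data the stump did not see during training.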
A multi-model integration method is proposed to develop a multi-source and heterogeneous model for short-term solar flare prediction. Different prediction models are constructed on the basis of predictors extracted from a pool of observation databases. Because these models extract predictors from many data sources using different prediction methods, the outputs of the base models are first normalized. Weighted integration of the base models is then used to develop a multi-model integrated model (MIM). The weights assigned to the individual models are optimized by a genetic algorithm. Seven base models and data from Solar and Heliospheric Observatory/Michelson Doppler Imager longitudinal magnetograms are used to construct the MIM, and its performance is evaluated by cross-validation. Experimental results show that the MIM outperforms any individual model in nearly every data group, and that the richer the diversity of the base models, the better the performance of the MIM. Thus, integrating more diversified models, such as an expert system, a statistical model and a physical model, is expected to further improve the performance of the MIM.
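The integration scheme can be sketched in a few lines. The sketch below assumes min-max normalization, accuracy as the fitness function, and a very small elitist genetic algorithm with arithmetic crossover and Gaussian mutation; none of these choices, nor the function names and toy data, are confirmed by the abstract.

```python
import random


def min_max(scores):
    """Rescale one base model's outputs to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def integrate(weights, outputs):
    """Weighted sum of normalized base-model outputs, one value per sample."""
    n = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(n)]


def accuracy(weights, outputs, labels, cut=0.5):
    combined = integrate(weights, outputs)
    return sum((c >= cut) == bool(y) for c, y in zip(combined, labels)) / len(labels)


def normalize_weights(w):
    total = sum(w)
    return [x / total for x in w] if total > 0 else [1 / len(w)] * len(w)


def optimize_weights(outputs, labels, pop_size=20, generations=30, seed=0):
    """Search for a weight set with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    m = len(outputs)
    pop = [normalize_weights([rng.random() for _ in range(m)]) for _ in range(pop_size)]

    def fitness(w):
        return accuracy(w, outputs, labels)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]         # crossover
            j = rng.randrange(m)
            child[j] = max(0.0, child[j] + rng.gauss(0, 0.1))   # mutation
            children.append(normalize_weights(child))
        pop = parents + children
    return max(pop, key=fitness)


# Toy outputs of three hypothetical base models on six samples.
labels = [1, 1, 1, 0, 0, 0]
raw = [
    [9.0, 8.0, 7.0, 2.0, 1.0, 3.0],   # informative, arbitrary scale
    [0.4, 0.9, 0.2, 0.8, 0.1, 0.6],   # mostly noise
    [60, 55, 40, 35, 20, 10],         # informative, different scale
]
outputs = [min_max(r) for r in raw]
best = optimize_weights(outputs, labels)
```

Normalizing first keeps a model with a large output scale from dominating the weighted sum, so the weights reflect the usefulness of each model rather than its units.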
The misprediction costs of flaring and non-flaring samples differ between applications of solar flare prediction. Hence, solar flare prediction is treated as a cost-sensitive problem. A cost-sensitive solar flare prediction model is built by modifying the basic decision tree algorithm. The inconsistency rate, combined with an exhaustive search strategy, is used to determine the optimal combination of magnetic field parameters in an active region. The selected parameters are used as the inputs of the solar flare prediction model. The performance of the cost-sensitive solar flare prediction model is evaluated for different thresholds of solar flares. It is found that, as the cost of wrongly predicting flaring samples as non-flaring samples increases, more flaring samples are correctly predicted and more non-flaring samples are wrongly predicted, and that a higher threshold of solar flares requires a larger cost for wrongly predicting flaring samples as non-flaring samples. This can serve as a guideline for choosing a proper cost to meet the requirements of different applications.
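The effect of the misprediction cost can be illustrated with a minimal sketch of cost-sensitive leaf labeling. It assumes that the decision tree is modified only in how each leaf is labeled (by minimum expected cost rather than by majority class); the abstract does not say how the algorithm was modified, and the leaf counts and names below are hypothetical.

```python
def leaf_label(n_flaring, n_quiet, cost_miss, cost_false_alarm=1.0):
    """Label a leaf 'flaring' (1) when that choice has the lower total cost."""
    cost_if_predict_quiet = n_flaring * cost_miss         # every flare is missed
    cost_if_predict_flaring = n_quiet * cost_false_alarm  # every quiet sample is a false alarm
    return 1 if cost_if_predict_flaring < cost_if_predict_quiet else 0


def hits_and_false_alarms(leaves, cost_miss):
    """Count correctly predicted flaring samples and wrongly predicted quiet ones."""
    hits = false_alarms = 0
    for n_flaring, n_quiet in leaves:
        if leaf_label(n_flaring, n_quiet, cost_miss) == 1:
            hits += n_flaring
            false_alarms += n_quiet
    return hits, false_alarms


# Assumed leaves of a trained tree: (flaring samples, non-flaring samples).
leaves = [(8, 2), (3, 7), (1, 9)]

low = hits_and_false_alarms(leaves, cost_miss=1)    # (8, 2)
mid = hits_and_false_alarms(leaves, cost_miss=3)    # (11, 9)
high = hits_and_false_alarms(leaves, cost_miss=10)  # (12, 18)
```

Raising the cost of a missed flare turns more leaves into "flaring" leaves, so both the number of correctly predicted flaring samples and the number of false alarms grow, which matches the trade-off described in the abstract.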
Solar flares strongly influence space weather and human activities, and their prediction is highly complex. Existing solutions, such as data-based and model-based approaches, share a common shortcoming: the lack of human engagement in the forecasting process. An image-case-based reasoning method is introduced to address this shortcoming. The image case library is composed of SOHO/MDI longitudinal magnetograms, from which the maximum horizontal gradient, the length of the neutral line and the number of singular points are extracted for retrieving similar image cases. Genetic optimization algorithms are employed to optimize the weights assigned to the image features and the number of similar image cases retrieved. The similar image cases, together with the prediction result derived from them by majority voting, are shown to the forecaster so that his or her experience can be integrated into the final prediction. Experimental results demonstrate that the case-based reasoning approach performs slightly better than other methods, and that it is more effective when the forecasts are improved by humans.
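The retrieval and voting steps can be sketched as follows. The sketch assumes a weighted Euclidean distance over the three features, which are taken to be normalized to [0, 1] beforehand; the distance measure, the fixed weights and the toy case library are illustrative assumptions, and in the described method the weights and the number of retrieved cases would be set by the genetic optimization.

```python
from collections import Counter


def retrieve(query, cases, weights, k):
    """Return the k library cases closest to the query features."""
    def distance(features):
        return sum(w * (q - f) ** 2 for w, q, f in zip(weights, query, features)) ** 0.5
    return sorted(cases, key=lambda case: distance(case[0]))[:k]


def majority_vote(similar_cases):
    """Predict the label that occurs most often among the retrieved cases."""
    return Counter(label for _, label in similar_cases).most_common(1)[0][0]


# Assumed case library: (max horizontal gradient, neutral line length,
# number of singular points), each normalized, with label 1 = flaring.
library = [
    ((0.9, 0.8, 0.7), 1),
    ((0.8, 0.9, 0.6), 1),
    ((0.1, 0.2, 0.1), 0),
    ((0.2, 0.1, 0.2), 0),
    ((0.7, 0.7, 0.9), 1),
]
weights = (0.5, 0.3, 0.2)   # hypothetical feature weights
query = (0.85, 0.8, 0.7)

similar = retrieve(query, library, weights, k=3)
prediction = majority_vote(similar)
```

Returning the retrieved cases alongside the vote is what allows a forecaster to inspect the evidence and override the automatic result when his or her experience suggests otherwise.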