Data Construction Method for Small Sample Sets: Theory and Applications

Book by Hsiao-Fan Wang
A Data Construction Method (DCM) based on multiset division is proposed. The DCM not only generates additional data within the value domain of the given sample to reveal the data's patterns, but also creates a membership function from the generated data for further applications. In this way, the DCM is used to fill the information gaps caused by small sample sets. To demonstrate the effectiveness of the DCM, after presenting its theoretical background, properties, and algorithm, we compare it with several existing approaches in estimating the population mean and in improving the learning performance of supervised neural networks. The results show that the DCM performs better than the approaches it was compared with. To show its applicability, we have applied the membership function derived from the DCM data to predicting severe earthquakes in Taiwan and forecasting the psychotic episodes of individuals with schizophrenia. The results show that the DCM can provide appropriate references for prediction from both spatial and temporal small data sets.