Abstract: Imbalanced data classification is currently a research hotspot in machine learning. Traditional classification algorithms are usually designed for balanced datasets and cannot be applied directly to imbalanced data. This paper proposes an imbalanced boosting algorithm based on feature selection. First, feature selection balances the importance that different types of attributes carry for the minority class samples, which both selects the attributes most meaningful for predicting minority class samples and reduces the data dimension. Then, an imbalanced boosting algorithm is applied to balance the dataset. Finally, because the weights of misclassified samples grow too quickly in the original algorithm, a modified algorithm that effectively restrains the growth of sample weights is put forward. Experimental results show that the proposed algorithm effectively improves classification performance on imbalanced datasets, especially for the minority class.
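
The abstract does not state the rule used to restrain sample-weight growth, so the following is only an illustrative sketch of the general idea, not the proposed algorithm. It assumes a standard AdaBoost-style exponential reweighting and adds a hypothetical cap (`cap_factor`, expressed as a multiple of the uniform weight 1/n) on how large a misclassified sample's weight may grow in one round. The function name and the capping rule are assumptions introduced here for illustration.

```python
import math


def capped_weight_update(weights, misclassified, error, cap_factor=2.0):
    """One AdaBoost-style reweighting round with a cap on weight growth.

    weights:       current sample weights (assumed to sum to 1)
    misclassified: booleans, True where the weak learner was wrong
    error:         weighted error of the weak learner, 0 < error < 0.5
    cap_factor:    hypothetical limit, as a multiple of the uniform weight 1/n
    """
    n = len(weights)
    # Standard AdaBoost learner coefficient.
    alpha = 0.5 * math.log((1.0 - error) / error)
    cap = cap_factor / n

    updated = []
    for w, wrong in zip(weights, misclassified):
        if wrong:
            grown = w * math.exp(alpha)
            # Assumed restraint: growth stops at the cap, but a weight that
            # is already above the cap is never reduced by this step.
            updated.append(min(grown, max(w, cap)))
        else:
            updated.append(w * math.exp(-alpha))

    # The cap is applied before renormalization, so the normalized weight
    # can still exceed cap; it is only smaller than the uncapped result.
    total = sum(updated)
    return [w / total for w in updated], alpha
```

With four equally weighted samples, one of them misclassified, and a weighted error of 0.25, the uncapped update gives the misclassified sample half of the total weight after a single round. A smaller `cap_factor` leaves it with a smaller share, which is the kind of restraint the abstract describes.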