Optimized SMOTE for Imbalanced Data Handling in Machine Learning

Gomathy, B (2025) Optimized SMOTE for Imbalanced Data Handling in Machine Learning. 2025 3rd International Conference on Advancement in Computation & Computer Technologies (InCACCT). pp. 349-354.

Optimized SMOTE for Imbalanced Data Handling in Machine Learning.pdf - Published Version

Official URL: https://doi.org/10.1109/InCACCT65424.2025.11011340

Abstract

Imbalanced data remains a significant challenge for machine learning models after more than two decades of research. Class imbalance biases models toward the majority class and degrades performance on minority classes. This research presents an optimized version of SMOTE for imbalanced data handling in machine learning (SMOTE-IDEA), a technique for imbalanced data classification. SMOTE approaches have traditionally addressed class imbalance by generating synthetic minority-class samples. Standard SMOTE, however, can introduce irrelevant data points into the dataset, which degrades results. The proposed version of SMOTE addresses these limitations to improve the oversampling procedure and generates synthetic data more efficiently through several additions. It automatically selects significant features and removes unimportant information. It also filters samples and generates high-quality synthetic instances. An adaptive weighting scheme concentrates data generation in high-density regions, improving the composition of the minority class. The upgraded SMOTE approach was evaluated with several classification methods on datasets with different imbalance ratios. The proposed strategy outperforms traditional SMOTE and leading benchmark approaches, improving accuracy, precision, recall, and F1-score on both minority and majority classes.
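
For readers unfamiliar with SMOTE, the following is a minimal illustrative sketch and not the paper's SMOTE-IDEA implementation. It shows the standard SMOTE interpolation step (a synthetic point is placed on the segment between a minority sample and one of its nearest minority neighbours) combined with one possible density weighting. The weighting rule (inverse mean distance to the k nearest neighbours) and the function names `k_nearest`, `density_weights`, and `smote_density_weighted` are assumptions made for illustration only; the abstract does not specify how the weights are computed, and the feature-selection and sample-filtering steps are not shown.

```python
import math
import random


def k_nearest(points, i, k):
    """Return indices of the k nearest neighbours of points[i], excluding i itself."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def density_weights(points, k):
    """Assumed weighting: inverse of the mean distance to the k nearest neighbours.

    Points in dense regions get larger weights, isolated points get smaller ones.
    """
    weights = []
    for i in range(len(points)):
        nbrs = k_nearest(points, i, k)
        mean_d = sum(math.dist(points[i], points[j]) for j in nbrs) / len(nbrs)
        weights.append(1.0 / (mean_d + 1e-12))  # small constant avoids division by zero
    return weights


def smote_density_weighted(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    Base samples are drawn in proportion to the assumed density weights.
    Requires len(minority) > k.
    """
    rng = random.Random(seed)
    weights = density_weights(minority, k)
    synthetic = []
    for _ in range(n_new):
        # Pick a base sample, favouring dense regions.
        i = rng.choices(range(len(minority)), weights=weights, k=1)[0]
        # Pick one of its k nearest minority neighbours uniformly at random.
        j = rng.choice(k_nearest(minority, i, k))
        # Standard SMOTE step: interpolate between the base sample and the neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic
```

Calling `smote_density_weighted(minority, n_new=20, k=2)` on a list of minority-class feature tuples returns 20 synthetic tuples. Under this assumed weighting, an isolated minority point is rarely chosen as a base sample, which is one way to read the abstract's aim of keeping generation in high-density regions.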

Item Type:	Article
Subjects:	C Computer Science and Engineering > Database Management System; C Computer Science and Engineering > Machine Learning
Divisions:	Computer Science and Engineering
Depositing User:	Dr Krishnamurthy V
Date Deposited:	04 Jul 2025 04:44
Last Modified:	04 Jul 2025 04:44
URI:	https://ir.psgitech.ac.in/id/eprint/1463
