F1-score, ensuring a multidimensional view of classification effectiveness. Precision and recall offered insights
into the model’s ability to correctly identify spending categories without misclassification, while the F1-score
balanced these two metrics for scenarios involving category imbalance. Confusion matrices were generated to
visualize misclassification patterns, particularly in categories that historically exhibited overlap such as
Clothing vs. Other and Fruits vs. Food. Receiver Operating Characteristic (ROC) curves and Area Under the
Curve (AUC) values were analyzed to measure the classifier's discriminative ability across different decision thresholds. For
forecasting tasks using LSTM, evaluation metrics such as Mean Squared Error (MSE), Root Mean Squared
Error (RMSE), and Mean Absolute Error (MAE) were employed to quantify prediction accuracy. The
evaluation results confirmed that the Random Forest classifier and LSTM predictor significantly outperformed
the baseline models, achieving high reliability in both category classification and future spending predictions.
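As a minimal sketch of how these evaluation metrics can be computed with scikit-learn, the snippet below scores a set of category predictions and a set of spending forecasts; the arrays, labels, and weighted-averaging choice are illustrative assumptions rather than Smart Pocket's actual outputs.

import numpy as np
from sklearn.metrics import (
    precision_recall_fscore_support,
    confusion_matrix,
    mean_squared_error,
    mean_absolute_error,
)

# Classification metrics (placeholder predictions for a handful of transactions).
y_true = ["Food", "Clothing", "Other", "Food", "Fruits", "Utilities"]
y_pred = ["Food", "Other", "Other", "Food", "Food", "Utilities"]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
cm = confusion_matrix(
    y_true, y_pred, labels=["Food", "Clothing", "Fruits", "Utilities", "Other"]
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(cm)
# ROC/AUC in the multi-class case would use per-class probability scores,
# e.g. roc_auc_score(..., multi_class="ovr") on the classifier's predict_proba output.

# Forecasting metrics for the LSTM predictor (placeholder daily spending totals).
actual = np.array([120.0, 95.5, 210.0, 80.0])
forecast = np.array([110.0, 100.0, 190.0, 85.0])
mse = mean_squared_error(actual, forecast)
rmse = np.sqrt(mse)
mae = mean_absolute_error(actual, forecast)
print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")

Weighted averaging is used here because of the category imbalance noted above; per-class scores can be obtained by dropping the average argument.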
Dataset Description
The expanded dataset contains 1,250 transactions collected over 90 days from 27 anonymized users aged 18–45. Data sources include digital receipts, POS logs, bank statements, and self-reported entries. Attributes include:
transaction date, amount, category, merchant name, payment mode, and user demographic group. Category
breakdown: Food (310), Clothing (180), Utilities (160), Entertainment (140), Fruits (120), Other (340).
Because the 'Other' category was disproportionately large (340 of 1,250 transactions) and dominated classification, k-means clustering was applied to split it into refined subcategories such as Transport, Gifts, and Household Supplies, improving model clarity.
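A sketch of this refinement step is shown below, assuming the transactions are available as a pandas DataFrame with merchant_name, amount, and category columns; the column names, feature choices, and number of clusters are illustrative, not the project's actual configuration.

import pandas as pd
from scipy.sparse import hstack
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("transactions.csv")          # hypothetical export of the dataset
other = df[df["category"] == "Other"].copy()  # the dominant catch-all category

# Represent each 'Other' transaction by its merchant-name text and scaled amount.
text_features = TfidfVectorizer(min_df=2).fit_transform(other["merchant_name"])
amount_scaled = StandardScaler().fit_transform(other[["amount"]])
features = hstack([text_features, amount_scaled])

# Cluster into candidate subcategories; k is chosen by inspecting the clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
other["subcategory_id"] = kmeans.fit_predict(features)

# Each cluster is then reviewed and labelled manually (e.g. Transport, Gifts,
# Household Supplies) before being merged back into the classifier's label set.
print(other.groupby("subcategory_id")["merchant_name"].apply(lambda s: s.head(3).tolist()))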
Model Tuning
Model tuning was performed to optimize predictive performance and ensure that both the classification and
forecasting components of Smart Pocket generalized effectively across diverse spending patterns. Hyperparameter optimization was carried out using a combination of grid search, random search, and cross-validation
to systematically explore optimal parameter configurations for each model. For traditional machine learning
classifiers such as Random Forest, the number of trees, maximum depth, minimum sample split, and feature
selection strategies were fine-tuned to balance accuracy and computational efficiency. Similarly, tuning of
SVM involved identifying the most effective kernel function, regularization parameter (C), and gamma values.
For the LSTM forecasting model, architectural refinements were explored including variations in the number
of hidden layers, number of units per layer, dropout rates, and activation functions (ReLU, tanh). Sequence
window lengths and batch sizes were also optimized to capture temporal dependencies more effectively. Early
stopping and learning rate scheduling were incorporated to stabilize training and avoid overfitting. Feedback
from validation metrics, confusion matrices, and domain insights on transaction behavior guided iterative adjustments to the model architecture.
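A hedged sketch of the classifier-side search, using scikit-learn's GridSearchCV with a synthetic stand-in for the engineered transaction features, is given below; the parameter ranges and scoring choice are illustrative rather than the values finally adopted.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the engineered transaction features and category labels.
X_train, y_train = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=4, random_state=0
)

param_grid = {
    "n_estimators": [100, 200, 400],        # number of trees
    "max_depth": [None, 10, 20],            # maximum depth
    "min_samples_split": [2, 5, 10],        # minimum sample split
    "max_features": ["sqrt", "log2"],       # feature selection strategy
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    scoring="f1_weighted",  # weighted F1 reflects the category imbalance
    cv=5,                   # 5-fold cross-validation
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_, round(search.best_score_, 3))

An equivalent search over the SVM's kernel, regularization parameter C, and gamma values follows the same pattern with an SVC estimator.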
These tuning strategies collectively resulted in improved classification accuracy, reduced forecasting error, and
enhanced robustness, ensuring that the final models performed reliably across different spending categories
and time-series patterns.
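For the LSTM side of the tuning described above, the following Keras sketch shows how early stopping and learning-rate scheduling can be attached to a stacked-LSTM forecaster; the layer sizes, 14-day window, and random placeholder sequences are assumptions for illustration only.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

window_len = 14  # illustrative sequence window length
X = np.random.rand(200, window_len, 1).astype("float32")  # placeholder spending windows
y = np.random.rand(200, 1).astype("float32")              # placeholder next-day totals

model = keras.Sequential([
    layers.Input(shape=(window_len, 1)),
    layers.LSTM(64, return_sequences=True),
    layers.Dropout(0.2),                    # dropout rate explored during tuning
    layers.LSTM(32),
    layers.Dense(1),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse", metrics=["mae"])

callbacks = [
    # Stop once validation loss stops improving, to avoid overfitting.
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    # Halve the learning rate when progress stalls (learning-rate scheduling).
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]
model.fit(X, y, validation_split=0.2, epochs=50, batch_size=32, callbacks=callbacks, verbose=0)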
Deployment
The final machine learning models were integrated into the Smart Pocket application through a secure, scalable, and modular deployment architecture. Containerization using Docker ensured consistent
runtime environments across development, testing, and production stages, eliminating dependency conflicts
and enabling reproducible builds. The machine learning components—responsible for expense classification
and spending prediction—were deployed as independent microservices, allowing efficient scaling and maintenance.
A RESTful API layer built using Flask connected the machine learning modules with the Next.js frontend, enabling real-time predictions and interactive financial insights for users. These APIs facilitated smooth communication of input features, category predictions, budget utilization metrics, and time-series forecasts. To optimize performance, caching strategies and load balancing mechanisms were incorporated to handle concurrent user requests while maintaining low latency.
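As a hedged illustration of how such a Flask microservice might expose the classifier, the minimal endpoint below accepts a JSON transaction and returns a predicted category; the route name, feature encoding, and model artefact path are hypothetical placeholders rather than Smart Pocket's actual API.

import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
classifier = joblib.load("models/expense_classifier.joblib")  # hypothetical artefact path

@app.route("/api/classify", methods=["POST"])
def classify_transaction():
    """Return the predicted spending category for one JSON-encoded transaction."""
    payload = request.get_json(force=True)
    # Hypothetical numeric feature encoding; the real service would reuse the
    # training-time preprocessing pipeline.
    features = [[payload["amount"], payload["day_of_week"], payload["merchant_id"]]]
    category = classifier.predict(features)[0]
    return jsonify({"category": str(category)})

if __name__ == "__main__":
    # In deployment this would run behind a production WSGI server inside the Docker container.
    app.run(host="0.0.0.0", port=5000)

A matching forecasting endpoint would wrap the LSTM model in the same way, returning the predicted spending series instead of a category.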