To address these challenges, the SMARTSTOCK system leverages artificial intelligence (AI), computer vision,
and image processing technologies, including YOLO (You Only Look Once) for object detection and SORT
(Simple Online and Realtime Tracking) for tracking and counting. The system provides a comprehensive,
automated solution for monitoring inventory, detecting OOS products, and analyzing in-store and parking traffic.
By automating critical warehouse processes, SMARTSTOCK reduces reliance on manual operations, minimizes
human error, and improves operational efficiency. The system integrates three core modules: stock detection and
tracking, customer presence monitoring, and car park counting, all supported by a user-friendly desktop
application that provides real-time reporting and analytics. A robust database underpins the system, enabling
efficient storage, retrieval, and historical analysis of inventory and traffic data.
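As an illustration of how tracking supports counting, the following minimal sketch counts tracked objects that cross a virtual line, as might be done at a car park entrance. It assumes a tracker such as SORT has already assigned a stable ID to each object; the `Tracks` structure, the function name `count_line_crossings`, and the line position are hypothetical and are not taken from the SMARTSTOCK implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical tracker output: track ID -> list of (x, y) centroids, one per frame.
Tracks = Dict[int, List[Tuple[float, float]]]


def count_line_crossings(tracks: Tracks, line_y: float) -> Tuple[int, int]:
    """Count tracks whose centroid crosses a horizontal line, split by direction.

    Returns (entering, leaving), where "entering" means the centroid moved
    from above the line (smaller y) to on or below it (larger y).
    """
    entering = 0
    leaving = 0
    for centroids in tracks.values():
        if len(centroids) < 2:
            continue  # a single observation cannot show a crossing
        first_y = centroids[0][1]
        last_y = centroids[-1][1]
        if first_y < line_y <= last_y:
            entering += 1
        elif last_y < line_y <= first_y:
            leaving += 1
    return entering, leaving


example_tracks: Tracks = {
    1: [(100.0, 40.0), (102.0, 80.0), (105.0, 130.0)],  # crosses downward
    2: [(300.0, 150.0), (298.0, 90.0)],                 # crosses upward
    3: [(200.0, 20.0), (205.0, 60.0)],                  # never reaches the line
}
entering, leaving = count_line_crossings(example_tracks, line_y=100.0)
print(entering, leaving)
```

Because each track ID contributes at most one crossing, an object that stays in view for many frames is counted once, which is why stable tracking identities matter for counting.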
The expected outcomes of SMARTSTOCK include accurate real-time detection and tracking of warehouse
stock, automated identification of low-stock and OOS items, monitoring of customer presence and parking
availability, and provision of actionable insights through an intuitive desktop interface. These capabilities allow
warehouse managers and staff to optimize resource allocation, expedite replenishment, and make informed
operational decisions. By enhancing inventory visibility, improving traffic monitoring, and automating repetitive
tasks, SMARTSTOCK aims to strengthen overall warehouse efficiency, reduce operational costs, and improve
customer satisfaction. Its scalability and adaptability make it suitable for deployment across diverse retail and
warehouse environments, offering a sustainable and intelligent approach to modern warehouse management.
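The automated identification of low-stock and OOS items can be sketched as a simple threshold rule applied to per-product detection counts. The function name `classify_stock`, the status labels, and the default threshold of 5 units are assumptions chosen for illustration; the source does not specify how SMARTSTOCK sets these values.

```python
from typing import Dict


def classify_stock(counts: Dict[str, int], low_threshold: int = 5) -> Dict[str, str]:
    """Label each product as OOS, LOW, or OK from its detected item count.

    low_threshold is a hypothetical cut-off: counts at or below it
    (but above zero) are flagged as low stock.
    """
    status: Dict[str, str] = {}
    for product, count in counts.items():
        if count <= 0:
            status[product] = "OOS"
        elif count <= low_threshold:
            status[product] = "LOW"
        else:
            status[product] = "OK"
    return status


detected = {"bottled_water": 0, "rice_5kg": 3, "cooking_oil": 12}
report = classify_stock(detected)
print(report)
```

In practice the threshold would likely differ per product and could be stored alongside the inventory records in the database.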
LITERATURE REVIEW
The development of the SMARTSTOCK system draws on prior work in three key technical areas:
the challenges and digitization of inventory management, the application of computer vision in logistics, and the
evolution of real-time object detection models.
The Need for Automated Inventory Management
Traditional inventory systems rely on manual processes such as periodic audits and barcode scanning, which are
inherently inefficient and prone to human transcription errors and data latency. The
integration of cyber-physical systems, a core tenet of Industry 4.0, mandates a shift towards automated,
continuous monitoring solutions. Research in smart warehousing highlights the critical need for systems that
provide real-time visibility into stock levels to minimize holding costs and prevent stockout scenarios [1].
Studies have explored utilizing Radio-Frequency Identification (RFID) and sensor networks; however, these
technologies often require direct tagging of every item, presenting scalability and cost challenges for high-
volume, dynamic environments [2]. The limitations of conventional methods establish a clear requirement for
non-intrusive, vision-based alternatives.
Object Detection in Logistics and Supply Chain
Computer vision has emerged as a disruptive technology in supply chain optimization, offering solutions for
quality inspection, package sorting, and inventory assessment. Early implementations primarily used traditional
image processing techniques, but modern approaches leverage deep learning for superior performance under
complex conditions. Convolutional Neural Networks (CNNs) have driven significant breakthroughs in object
detection, and CNN-based detectors fall into two main categories: two-stage detectors (e.g., R-CNN, Faster R-CNN) and
single-stage detectors (e.g., SSD, YOLO) [3]. While two-stage models typically yield marginally higher
localization accuracy, single-stage detectors are generally favored in industrial applications where high
inference speed is a prerequisite for real-time operation.
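Localization accuracy is commonly quantified with Intersection over Union (IoU), the ratio of the overlap between a predicted box and a ground-truth box to the area of their union. The sketch below is a minimal illustration that assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function name `iou` is chosen here for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 <= x2 and y1 <= y2


def iou(a: Box, b: Box) -> float:
    """Return the Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0.0, 0.0, 10.0, 10.0)
ground_truth = (5.0, 5.0, 15.0, 15.0)
print(round(iou(predicted, ground_truth), 4))
```

The same overlap measure is also what IoU-based trackers such as SORT use to associate detections with existing tracks between frames.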
Evolution of the YOLO Framework
The You Only Look Once (YOLO) architecture, as analyzed in Table 1 and Figure 1, has become a de facto
standard for real-time object detection because it frames object detection as a single regression problem,
substantially improving speed while maintaining competitive accuracy.
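To make the single-regression framing concrete, the sketch below follows the grid formulation of the original YOLO model, in which the image is divided into an S x S grid and each cell regresses B boxes (five values each) plus C class scores. It is a simplified illustration rather than the configuration used in SMARTSTOCK: the function names are hypothetical, and it assumes the network outputs the box centre as an offset within its cell and the width and height as fractions of the whole image.

```python
from typing import Tuple


def output_depth(boxes_per_cell: int, num_classes: int) -> int:
    """Values regressed per grid cell: (x, y, w, h, confidence) per box, plus class scores."""
    return boxes_per_cell * 5 + num_classes


def decode_cell(row: int, col: int, x: float, y: float, w: float, h: float,
                grid_size: int, img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert one cell-relative prediction into pixel corner coordinates.

    x, y are offsets in [0, 1] within the cell at (row, col);
    w, h are fractions of the full image width and height (an assumption here).
    """
    cx = (col + x) / grid_size * img_w
    cy = (row + y) / grid_size * img_h
    bw = w * img_w
    bh = h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)


# Original YOLO settings: S = 7, B = 2, C = 20, giving a 7 x 7 x 30 output tensor.
depth = output_depth(boxes_per_cell=2, num_classes=20)
box = decode_cell(row=3, col=3, x=0.5, y=0.5, w=0.25, h=0.5,
                  grid_size=7, img_w=448, img_h=448)
print(depth, box)
```

Because every cell's boxes and class scores come out of one forward pass, no separate region-proposal stage is needed, which is the source of the speed advantage described above.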