Advanced and Reliable Cooking Oil Frequency Usage Classification via Deep Vision Analysis of Challenging Visual Features
Norazrai Daniel Afiq Razali1, Nik Mohd Zarifie Hashim1,2*, Muhammad Nur Amir Che Hamid3, Masrullizam Mat Ibrahim1,2, Fadhli Syahrial4, Mohd Fazli Mohd Sam5, Salizawati Mohd Yusof6, Mahmud Dwi Sulistiyo7
1Faculty of Electronics and Computer Technology and Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
2Centre for Telecommunication Research and Innovation (CeTRI), Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
3Wisma Genting, 28 Jalan Sultan Ismail, 50250 Kuala Lumpur, Malaysia
4Faculty of Mechanical Technology and Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
5Faculty of Technology Management and Technopreneurship, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
6Bahagian Keselamatan & Kualiti Makanan, Jabatan Kesihatan Negeri Perak, Malaysia
7School of Computing, Telkom University, West Java, Indonesia
*Corresponding author
DOI: https://dx.doi.org/10.47772/IJRISS.2025.909000303
Received: 27 August 2025; Accepted: 04 September 2025; Published: 09 October 2025
ABSTRACT
The classification of cooking oil usage in real-world scenarios presents significant challenges due to varying visual conditions such as angular perspectives, blurriness, and occlusions. Traditional computer vision approaches often struggle with these challenges, leading to reduced reliability in automated systems. This study explores the effectiveness of different deep learning architectures in addressing these challenges for robust cooking oil usage classification. Several selected convolutional neural network (CNN) architectures and our proposed models were evaluated to determine their performance in handling distorted, blurred, and partially obscured oil images. Through extensive experimentation, the proposed model demonstrates superior performance over existing methods, achieving over 99% accuracy. These findings highlight the potential of deep vision analysis in improving classification accuracy for real-world applications, providing insights into model selection for challenging visual feature extraction.
Keywords— Angular, Blurry, Classification, CNN, Computer vision, Cooking oil usage, Deep learning, Food safety, Occlusion
INTRODUCTION
The accurate classification of cooking oil usage in practical, real-world settings represents a significant and complex challenge, with critical implications for food safety, quality assurance, and automated monitoring systems. Cooking oils undergo various chemical and physical changes when used repeatedly or under different conditions, and these changes are reflected in the oil’s visual appearance [1]. However, practical environments often introduce complicating factors such as angular perspectives, motion blur, and partial occlusions, which hinder reliable detection [1]. Traditional computer vision methods that depend on handcrafted features struggle under these distortions or with incomplete data [2].
Recent advances in deep learning, particularly convolutional neural networks (CNNs), have shown strong capability in extracting rich, hierarchical features from complex images, even under adverse conditions [1][2]. This approach, often referred to as deep vision, leverages CNNs to automatically learn feature representations from raw visual data, enabling high performance in tasks such as object detection, classification, segmentation, and tracking [3]. Its adaptability makes it highly suitable for food-related applications, including cooking oil classification.
Nevertheless, deep vision systems face persistent challenges in practice. Blurriness, angular variations, and occlusion can obscure critical features, complicating recognition and classification [4]. Blurry images, whether from motion or poor focus, reduce the clarity of features; angular changes alter the perceived appearance of objects; and occlusions hide essential visual cues. Addressing these limitations is essential for improving the robustness of deep vision in applications ranging from autonomous driving and medical imaging to manufacturing inspection and food safety monitoring. Strategies such as data augmentation, attention mechanisms, and multi-view learning continue to push the field toward more reliable real-world performance.
LITERATURE REVIEW
The pursuit of effective cooking oil classification has driven extensive exploration of diverse analytical techniques, each possessing unique strengths and limitations [5][6]. Traditional methods often rely on subjective criteria or hand-engineered algorithms, limiting their adaptability across diverse oil types and conditions. In contrast, machine vision integrated with advanced image processing and deep learning has emerged as a powerful solution for food quality assessment [7]. Over the past few decades, integrating artificial intelligence with food category recognition has emerged as an important and active research domain.
Deep learning models, especially CNNs, excel at discerning complex visual patterns, enabling robust recognition without explicit physical modelling [5][6]. Their effectiveness is further supported by the availability of large-labelled datasets and increasing computational resources, accelerating adoption in food processing and safety applications [6]. These methods have proven particularly effective in tasks requiring fine-grained classification, making them well-suited for identifying cooking oil usage.
Challenges Posed by Visual Features
Despite these advancements, visual factors such as angular variations, blurriness, and occlusions remain major obstacles to reliable deep vision-based classification. Each can degrade image quality and hinder accurate feature extraction, underscoring the need for models that perform well under real-world conditions.
Angular Variations
Angular variations refer to changes in the viewing angle or orientation of an object within an image, which can dramatically alter its visual appearance. For cooking oil classification, the same sample may look different when captured from diverse perspectives, affecting the perceived texture, colour, or visual indicators of degradation. To improve inspection precision, some studies integrate deep learning frameworks with multi-angle imaging. Research by Lihong Xie et al. on agricultural multi-view learning [8] and Bin Liu et al. on multi-angle surface defect detection [9] both demonstrate how diverse viewing angles strengthen convolutional networks. Despite these developments, challenges persist in real-time inference, computational demands, and the need for large, varied datasets to ensure model generalizability.
Further addressing object recognition, Nik Mohd Zarifie Hashim et al. proposed a method for “Next Viewpoint Recommendation” to minimize pose ambiguity. Their approach recommends optimal poses by quantifying pose entropy, significantly increasing the accuracy of 6D pose estimation, especially when initial observations are limited or unclear [10]. This strategy directly supports food safety imaging where items require thorough, multi-perspective scanning for contamination or degradation. Similarly, Ashish Reddy Mulaka et al.’s work on pine seedlings found that low-angle side views and wide fields of view improved detection and differentiation, highlighting how camera settings and spatial arrangements crucially impact machine vision systems for applications like food processing inspections [11].
Blurriness
Blurriness caused by motion, limited depth of field, or sensor imperfections leads to loss of image detail and reduced clarity. In cooking oil images, blur can arise from various sources, including motion blur due to camera movement or object displacement, blur caused by out-of-focus lenses, and atmospheric blur due to steam or smoke present during cooking. Blur introduces uncertainty in edge detection and feature extraction, which are critical for CNNs to learn discriminative representations [2]. The challenge lies in developing algorithms robust to various types of blurs, including motion blur, Gaussian blur, and out-of-focus blur, each requiring tailored preprocessing or network architectures to mitigate their effects [12].
This degradation compromises a deep learning model’s ability to extract the subtle, fine-grained features necessary for accurate classification. Sayed et al. highlight the challenges posed by motion blur in real-time object detection [13], while Chang [14] discusses advanced blur kernel estimation and deblurring methods aimed at restoring image sharpness. In cooking oil analysis, blur can obscure visual cues such as subtle changes in viscosity or particulate presence, making blur-robust feature learning or effective deblurring essential for reliable assessment.
Occlusion
Occlusion occurs when parts of an object are hidden behind other items or artifacts in the scene, disrupting the complete visual information required for accurate classification and potentially leading to misclassification if the occluded regions contain critical features. Cooking oil samples may be partially obscured by food particles, cooking utensils, or other objects, making occlusion a common and complex problem in real-world cooking scenarios.
Robust object detection and segmentation techniques can help identify and isolate the cooking oil region from occluding elements, enabling more accurate analysis. AI algorithms and models are transforming food safety applications by offering advanced capabilities in contamination detection, quality assurance, and risk management [15]. Integrating deep learning techniques in food image recognition is leading to innovative dietary assessment methods with higher accuracy and precision [16]. Occlusions involving visually similar objects further complicate discrimination. For cooking oil usage classification, occlusion may cause misinterpretation of clarity or contamination indicators. Therefore, developing models that infer full object characteristics from partial observations is crucial for practical deployment.
Together, these challenging visual factors constitute significant barriers for deep vision analysis in cooking oil usage classification. Future research must prioritize the development and integration of advanced algorithms capable of mitigating the adverse effects of angular variations, blurriness, and occlusions, thereby enhancing the robustness, adaptability, and reliability of deep learning models in real-world application scenarios.
METHODOLOGY
This section outlines the systematic methodology used to achieve reliable cooking oil frequency usage classification through deep vision analysis. The approach begins with image acquisition of cooking oil samples under real-world conditions, including challenging visual features such as blurriness, angular variation, and occlusion. Preprocessing techniques are applied to enhance image quality before feeding the data into a deep convolutional neural network for feature extraction and classification. Finally, model performance is validated using standard evaluation metrics to ensure the robustness, accuracy, and generalizability of the proposed framework.
Dataset Preparation
Accurately classifying the cooking oil usage associated with three selected food types hinges on several distinct phases, all critical within the proposed methodology. In this study, the chosen foods were ‘chicken’, representing raw fresh food; ‘lekor’, representing a processed traditional food; and ‘nugget’, representing a processed industrial food, as illustrated in Fig. 1.
Fig. 1 Types of food: (a) ‘chicken’, (b) ‘lekor’, and (c) ‘nugget’
Food classification based on processing levels is a common framework used to understand the nature and implications of food products on diet and health. Raw or unprocessed foods refer to natural, whole foods that have undergone minimal alteration from their natural state, such as fresh fruits, vegetables, and meats. Processed foods encompass a wide range of products that have been altered to varying degrees, from minimal processes like washing and freezing to more extensive treatments like canning, preserving, and instant preparation. Importantly, not all processed foods contain chemical preservatives; some rely on physical preservation methods such as freezing or vacuum sealing, which maintain the product’s integrity without additives.
Fig. 2 Proposed method flowcharts
A flowchart of the overall proposed method is shown in Fig. 2. The first phase is the collection and preprocessing of data, in which a comprehensive dataset of cooking oil usage was assembled through systematic photography of every cooking oil sample.
In this study, the dataset plays a critical role in ensuring the accuracy and reliability of cooking oil frequency usage classification. A custom dataset was created because no existing public dataset adequately represents the unique visual challenges associated with cooking oil samples, such as blurriness, angular variations, and occlusion. By collecting images directly under controlled and real-world conditions, the dataset reflects the true diversity and variability of oil usage scenarios, ensuring that the deep vision model can generalize effectively.
As shown in Table I, a total of 2160 images were captured for each cooking oil sample. The images for each sample were meticulously annotated with class labels covering four classes, namely “1x usage”, “2x usage”, “3x usage”, and “Others”, with 540 images per class, ensuring that the training data was accurate and representative.
TABLE I. TOTAL IMAGE FOR EVERY CLASS
| Cooking Oil Samples | 1x Usage | 2x Usage | 3x Usage | Others | Total Images |
|---|---|---|---|---|---|
| ‘Chicken’ | 540 | 540 | 540 | 540 | 2160 |
| ‘Lekor’ | 540 | 540 | 540 | 540 | 2160 |
| ‘Nugget’ | 540 | 540 | 540 | 540 | 2160 |
All finalized images for each class were collected, and Fig. 3 shows examples of the four cooking oil usage classes for all samples.
Fig. 3 Samples of the four cooking oil usage classes: (a) ‘chicken’, (b) ‘lekor’, and (c) ‘nugget’
Angular Datasets: Multi-angle image capture is critical to ensure that all visual features are recorded, because the quality of the oil may differ based on the food type fried and the number of times it was used. Following the multi-angle surface defect detection approach of Bin Liu et al. [9], sample collection was carried out by capturing images of each cooking oil sample sequentially from three elevation angles: 0°, 45°, and 90°.
Fig. 4 Camera elevation angle for image capturing
The camera angle schematic shows how all oil samples are viewed from three different heights as shown in Fig. 4. Having a 0° view gives a great advantage for clearly inspecting colour and clarity since it provides a horizontal view of the oil. A 45° angle works best when searching for particles and other surfaces since it provides oblique views which offers both depth and surface appearance capturing. The 90° angle, or top-down view, is ideal for inspecting surface texture, colour uniformity, and possible floating residues.
Fig. 5 Example of images observed from different elevation angles:
(a) 0° (b) 45° (c) 90°
By capturing images from these three perspectives as shown in Fig. 5, the system gathers richer visual data, making it more suitable for detailed analysis. This setup enhances the ability to detect subtle differences between oil samples, thereby improving the accuracy of quality assessment models. It plays a key role in supporting a vision-based food safety monitoring system.
As shown in Table II, a total of 1080 images were captured for each cooking oil sample, comprising 120 images per elevation angle for each of the 1x, 2x, and 3x usage classes.
TABLE II. TOTAL IMAGE FOR EVERY CLASS
| Cooking Oil Samples | Usage | 0° | 45° | 90° | Total Images |
|---|---|---|---|---|---|
| ‘Chicken’ | 1x | 120 | 120 | 120 | 360 |
| | 2x | 120 | 120 | 120 | 360 |
| | 3x | 120 | 120 | 120 | 360 |
| ‘Lekor’ | 1x | 120 | 120 | 120 | 360 |
| | 2x | 120 | 120 | 120 | 360 |
| | 3x | 120 | 120 | 120 | 360 |
| ‘Nugget’ | 1x | 120 | 120 | 120 | 360 |
| | 2x | 120 | 120 | 120 | 360 |
| | 3x | 120 | 120 | 120 | 360 |
Blurry Datasets: Blurriness variation in image datasets is critical for evaluating the robustness of vision-based classification systems, as the quality of captured features can be significantly affected by different levels of blur. In this study, cooking oil image samples were prepared with four controlled blurriness levels: 25%, 50%, 75%, and 100% as shown in Fig. 6. These levels were applied systematically to the original dataset to simulate real-world conditions where factors such as camera motion, focus issues, and environmental vibrations may degrade image clarity.
Fig. 6 Example of image samples with different blurriness levels:
(a) 25% (b) 50% (c) 75% (d) 100%
Fig. 6 illustrates the progressive loss of detail from mild (25%) to extreme (100%) blurring. Images with 25% blurriness retain most of the original visual characteristics, allowing relatively accurate inspection of colour and clarity. At 50% blurriness, finer surface details begin to be obscured, making it more challenging to detect particles and subtle texture variations. At 75% blurriness, significant feature degradation occurs, with only general colour tones and large shapes remaining distinguishable. Finally, 100% blurriness represents a fully degraded condition in which surface features, particles, and even colour uniformity are difficult to evaluate.
By incorporating these four blurriness levels, the dataset provides a comprehensive benchmark for testing model performance under varying levels of image degradation. This approach supports the development of more robust and adaptive deep learning models capable of maintaining classification accuracy even in visually compromised conditions, thereby enhancing the reliability of automated cooking oil quality assessment systems.
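To make the blur preparation concrete, a minimal Python sketch is given below. It assumes the blur percentages are realized as Gaussian blur with increasing radii; the percentage-to-radius mapping and the directory layout are illustrative assumptions, not the exact settings used in this study.

```python
# Minimal sketch: generate four blurred variants (25%-100%) of each image.
# The percentage-to-radius mapping and folder layout are assumptions.
from pathlib import Path
from PIL import Image, ImageFilter

BLUR_LEVELS = {25: 4, 50: 8, 75: 12, 100: 16}  # percent -> assumed Gaussian radius (px)

def build_blur_dataset(src_dir: str, dst_dir: str) -> None:
    for img_path in sorted(Path(src_dir).glob("*.jpg")):
        original = Image.open(img_path).convert("RGB")
        for percent, radius in BLUR_LEVELS.items():
            out_dir = Path(dst_dir) / f"blur_{percent}"
            out_dir.mkdir(parents=True, exist_ok=True)
            blurred = original.filter(ImageFilter.GaussianBlur(radius=radius))
            blurred.save(out_dir / img_path.name)

if __name__ == "__main__":
    build_blur_dataset("dataset/original/chicken/1x_usage",
                       "dataset/blurry/chicken/1x_usage")
```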
Occlusion Datasets: Occlusion image capture plays a crucial role in testing the robustness of vision-based systems, as real-world environments often involve partially blocked visual information. In this study, the dataset was designed with varying occlusion patterns to simulate challenging inspection conditions. The occlusions were applied in different orientations (horizontal, vertical, left-diagonal, and right-diagonal), covering different portions of the visual field, as shown in Fig. 7.
Fig. 7 Example of image samples with different occlusion orientation: (a) horizontal (b) vertical (c) right-diagonal (d) left-diagonal
Horizontal occlusions primarily block key visual features across the middle or upper/lower sections of the image, while vertical occlusions hide parts of the left or right side of the oil sample. Left-diagonal and right-diagonal occlusions create slanted obstructions that block both vertical and horizontal information simultaneously, making object recognition and feature extraction more difficult. By training and evaluating the system on these occluded datasets, the model can be tested for its ability to accurately assess cooking oil quality even when parts of the visual information are missing. This method enhances the robustness of the quality assessment system, ensuring it can operate reliably in practical scenarios where complete visibility is not guaranteed.
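A hypothetical sketch of how such occlusion masks could be synthesised is shown below; the band width (a quarter of the image dimension), the black fill, and the diagonal orientation conventions are assumptions made for illustration only.

```python
# Minimal sketch: apply one of four occlusion patterns to an oil image.
# Band width (fraction of image size), black fill, and diagonal naming are assumptions.
import numpy as np
from PIL import Image

def occlude(image: Image.Image, pattern: str, frac: float = 0.25) -> Image.Image:
    arr = np.array(image.convert("RGB"))
    h, w, _ = arr.shape
    if pattern == "horizontal":                       # band over the middle rows
        band = int(h * frac)
        arr[(h - band) // 2:(h + band) // 2, :] = 0
    elif pattern == "vertical":                       # band over the middle columns
        band = int(w * frac)
        arr[:, (w - band) // 2:(w + band) // 2] = 0
    else:                                             # slanted band across the image
        band = int(min(h, w) * frac)
        ys, xs = np.mgrid[0:h, 0:w]
        diag = xs - ys if pattern == "right-diagonal" else xs + ys - (w - 1)
        arr[np.abs(diag) < band // 2] = 0             # blank pixels near the diagonal
    return Image.fromarray(arr)

for pattern in ["horizontal", "vertical", "left-diagonal", "right-diagonal"]:
    occlude(Image.open("sample_oil.jpg"), pattern).save(f"occluded_{pattern}.jpg")
```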
Proposed Model for Deep Vision Analysis
The second phase focuses on the development and training of the deep learning model. Before developing the model, a dataset splitting strategy was applied to the cooking oil image classification task (original, blurry, and occlusion datasets). These images were divided into three distinct sets to ensure proper model training and evaluation. Specifically, 70% of the images, equivalent to 1,512, were allocated to the training dataset, which is used to train the deep learning model. Another 20%, or 432 images, were assigned to the evaluation (or validation) dataset to fine-tune model parameters and prevent overfitting during training. The remaining 10% (216 images) were reserved for the testing dataset, which is used to assess the model’s performance on unseen data. This split ensures that the model is both effectively trained and accurately evaluated, supporting reliable and generalizable predictions.
However, the dataset splitting strategy for the angular datasets differs from that of the original, blurry, and occlusion datasets because the split was applied to each angle separately: 80% of the images were allocated to the training dataset used to train the deep learning model, and the remaining 20% were assigned to the evaluation (or validation) dataset to fine-tune model parameters and prevent overfitting during training.
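The 70/20/10 split described above can be reproduced with a short script such as the following sketch; the folder layout and random seed are assumptions, and for the angular datasets the same routine would simply be run per angle with an 80/20 ratio.

```python
# Minimal sketch: 70/20/10 train/validation/test split of one sample's images.
# Folder layout and seed are assumptions; the routine is run once per dataset variant.
import random
from pathlib import Path

def split_images(image_dir: str, seed: int = 42):
    paths = sorted(Path(image_dir).rglob("*.jpg"))
    random.Random(seed).shuffle(paths)
    n = len(paths)
    n_train, n_val = int(0.7 * n), int(0.2 * n)   # 2160 images -> 1512 train, 432 val
    return {
        "train": paths[:n_train],
        "val": paths[n_train:n_train + n_val],
        "test": paths[n_train + n_val:],           # remaining 10% -> 216 test
    }

splits = split_images("dataset/original/chicken")
print({name: len(files) for name, files in splits.items()})
```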
Convolutional neural networks (CNNs), known for their efficiency in image recognition tasks, were employed. A key advantage of CNNs over other algorithms is their ability to learn important features without human supervision, which makes them useful in many applications, especially image-related tasks such as image classification, semantic segmentation, and object detection. The model architecture was carefully designed to balance complexity and performance, featuring multiple convolutional layers to capture intricate patterns in the cooking oil images. The model was fine-tuned using the annotated cooking oil image dataset, with hyperparameters optimized through a combination of grid search and cross-validation techniques to ensure the best possible performance.
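The exact layer configuration of the proposed models is not detailed here; as an illustration only, a compact CNN classifier for the four usage classes might be assembled as follows. The input size, filter counts, and training settings are assumptions, not the configuration of the proposed methods evaluated later.

```python
# Illustrative sketch only: a small CNN for four-class oil usage classification.
# Layer sizes, 224x224 input, and training settings are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(num_classes: int = 4) -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(224, 224, 3)),
        layers.Rescaling(1.0 / 255),                 # normalize pixel values
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
```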
Reliable computer vision methods must be evaluated and tested so that the resulting models capture every relevant detail, which is of utmost importance for model refinement through accuracy-focused evaluations. With the help of confusion matrices, specific classification challenges can be analysed in greater detail for accurate calibration. Once the model’s results met expectations, it was deployed onto a tool interface intended for monitoring restaurant food preparation equipment, food vending machines, and laboratory centres. Among the functions embedded in the application is an instant usage-class evaluation through oil image input. Additionally, the system was designed to be scalable, capable of handling large volumes of data, and adaptable to future enhancements, for example incorporating additional classes when other types of food or other types of cooking oil are used.
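Continuing the sketch above, the confusion-matrix analysis mentioned here could be produced with scikit-learn as shown below, assuming hypothetical arrays x_test and y_test holding the held-out 10% test images and their integer class labels.

```python
# Minimal sketch: per-class error analysis with a confusion matrix.
# `model`, `x_test`, and `y_test` are assumed to come from the training sketch
# above and the 10% held-out test split.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

class_names = ["1x usage", "2x usage", "3x usage", "Others"]

y_prob = model.predict(x_test)          # softmax scores, shape (n_samples, 4)
y_pred = np.argmax(y_prob, axis=1)      # predicted class indices

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=class_names))
```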
RESULT AND ANALYSIS
This section presents the experimental findings obtained from the proposed cooking oil quality assessment framework, evaluated under multiple controlled conditions. The results are organised into four key components: state-of-the-art (SOTA) model comparison, angular variation analysis, blurriness evaluation, and occlusion assessment.
State-of-the-Art (S.O.T.A.) Methods Comparison Analysis
S.O.T.A. comparison establishes a performance benchmark by evaluating the proposed method against existing deep learning architectures widely used in food quality inspection tasks. This highlights the relative effectiveness of the proposed approach in terms of classification accuracy, robustness, and computational efficiency.
TABLE III. S.O.T.A. PERFORMANCE ACCURACY
| Cooking Oil Samples | Model | Training Accuracy (%) | Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | VGG19 | 99.04 | 99.77 |
| | ResNet50 | 95.84 | 99.56 |
| | EfficientNet | 93.76 | 97.48 |
| | MobileNet | 96.97 | 100.00 |
| | Proposed Method 1 | 99.09 | 99.77 |
| | Proposed Method 2 | 92.79 | 98.91 |
| | Proposed Method 3 | 94.95 | 100.00 |
| ‘Lekor’ | VGG19 | 97.31 | 99.42 |
| | ResNet50 | 90.11 | 96.83 |
| | EfficientNet | 87.51 | 92.75 |
| | MobileNet | 91.25 | 98.31 |
| | Proposed Method 1 | 97.46 | 99.56 |
| | Proposed Method 2 | 85.44 | 91.90 |
| | Proposed Method 3 | 88.31 | 95.81 |
| ‘Nugget’ | VGG19 | 97.67 | 98.61 |
| | ResNet50 | 91.67 | 98.80 |
| | EfficientNet | 88.92 | 93.70 |
| | MobileNet | 94.25 | 99.72 |
| | Proposed Method 1 | 98.36 | 99.54 |
| | Proposed Method 2 | 89.75 | 97.25 |
| | Proposed Method 3 | 92.77 | 99.05 |
From Table III for ‘Chicken’ samples, performance is consistently high, with several models reaching perfect or near-perfect evaluation accuracy. Proposed Method 3 and MobileNet achieve 100% evaluation accuracy, and Proposed Method 1 and VGG19 also score 99.77%. ResNet50 performs slightly lower at 99.56%, while EfficientNet again trails with 97.48% evaluation accuracy. This suggests that chicken oil samples are generally easier for the models to classify, with MobileNet and Proposed Method 3 being the most reliable choices. However, ‘Lekor’ samples show more variation in results, with Proposed Method 1 and VGG19 maintaining high evaluation accuracy above 99%, while Proposed Method 2, Proposed Method 3, and ResNet50 drop to 91.90%, 95.81%, and 96.83%, respectively. This indicates that ‘lekor’ oil samples may present more complex or less consistent visual features, making classification more challenging for certain models.
For ‘Nugget’ samples, most models perform strongly, with Proposed Method 1 achieving 99.54% evaluation accuracy and MobileNet recording the highest at 99.72%. VGG19 and Proposed Method 3 also perform well above 97%, while EfficientNet records the lowest results (88.92% training and 93.70% evaluation accuracy). This indicates that while most models can classify nugget oil samples effectively, EfficientNet may require optimization for better results. Overall, MobileNet and Proposed Method 1 stand out as consistently strong performers across different sample types, whereas EfficientNet shows weaker performance in all categories, suggesting it may not be as well-suited for this dataset without further tuning.
Angular-based Image Capturing Analysis
The angular variation analysis examines the effect of multi-angle image capture (0°, 45°, and 90°) on model performance, simulating real-world variability in camera positioning. This is essential for assessing the system’s ability to maintain accuracy despite changes in viewpoint. Three experiments examine how the model behaves when trained and tested on the same angle, on different angles, and on all angles combined. By analysing the outputs, this study observes how each capture angle contributes to the prediction of cooking oil usage.
First Experiment: Training and Testing Dataset Using the Same Angle: In the first experiment, the model was developed with training and testing datasets captured from the same angle to predict and classify the image samples.
TABLE IV. PERFORMANCE ACCURACY FOR FIRST EXPERIMENT
| Cooking Oil Samples | Image Angle | Proposed Method Training Accuracy (%) | Proposed Method Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | 0° | 100.00 | 100.00 |
| | 45° | 77.78 | 76.67 |
| | 90° | 100.00 | 98.89 |
| ‘Lekor’ | 0° | 58.06 | 58.89 |
| | 45° | 58.89 | 52.22 |
| | 90° | 74.72 | 51.11 |
| ‘Nugget’ | 0° | 60.00 | 71.11 |
| | 45° | 81.94 | 53.33 |
| | 90° | 100.00 | 100.00 |
Second Experiment: Trained on One Angle but Tested with All Angles: In the second experiment, the model was trained on one angle but tested with all angles, simulating a more practical scenario with varied input orientations.
TABLE V. PERFORMANCE ACCURACY FOR SECOND EXPERIMENT
| Cooking Oil Samples | Image Angle | Proposed Method Training Accuracy (%) | Proposed Method Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | 0° | 66.67 | 78.89 |
| | 45° | 66.67 | 77.78 |
| | 90° | 65.00 | 44.00 |
| ‘Lekor’ | 0° | 51.67 | 43.33 |
| | 45° | 53.06 | 30.00 |
| | 90° | 70.00 | 38.89 |
| ‘Nugget’ | 0° | 80.56 | 55.56 |
| | 45° | 53.06 | 42.22 |
| | 90° | 62.50 | 50.00 |
Third Experiment: Combining All Angles of Images: In this experiment, the model was developed by combining all angles in the training and testing datasets to predict and classify the image samples, enhancing its exposure to orientation variations.
TABLE VI. PERFORMANCE ACCURACY FOR THIRD EXPERIMENT
| Cooking Oil Samples | Proposed Method Training Accuracy (%) | Proposed Method Validation Accuracy (%) |
|---|---|---|
| ‘Chicken’ | 99.80 | 99.82 |
| ‘Lekor’ | 99.93 | 97.92 |
| ‘Nugget’ | 100.00 | 98.84 |
The results of all three experiments, summarized in the three tables, collectively offer valuable insights into how image angle affects the model’s performance across different types of cooking oil samples. The comparison reveals three key aspects: consistency in image angle, generalization capability, and the impact of diverse training data. When the model is trained and tested on the same image angle (Table IV), it performs well, especially for ‘Chicken’ and ‘Nugget’ samples, owing to its ability to memorize features from a consistent perspective, though it still struggles with less distinctive samples like ‘Lekor’. However, when tested on different angles than it was trained on (Table V), the model’s performance drops significantly, particularly for ‘Lekor’, revealing poor generalization and strong angle sensitivity. In contrast, training with images from all angles combined (Table VI) leads to a substantial improvement in accuracy across all samples, indicating that exposure to diverse orientations enables the model to learn more robust, angle-invariant features and perform reliably in varied real-world conditions.
The overall findings underline a crucial principle: data diversity during training leads to generalization and robustness. A model trained only under fixed conditions (such as a single angle) may excel in controlled settings but fail in real-world applications. By combining data from multiple perspectives, the model becomes more adaptive and reliable, making it far better suited for real-life implementation.
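For clarity, the three angular training regimes compared above can be expressed as simple configuration sketches; the directory names below are hypothetical, and each training list would still be split 80/20 into training and validation sets as described in the methodology.

```python
# Minimal sketch: assembling the three angular experiment configurations.
# Folder names (angle_0, angle_45, angle_90) are assumptions.
from pathlib import Path

ANGLES = ["angle_0", "angle_45", "angle_90"]
root = Path("dataset/angular/chicken")

def images_for(angles):
    return [p for angle in angles for p in sorted((root / angle).rglob("*.jpg"))]

# Experiment 1: train and test on the same angle (one run per angle).
exp1 = {a: {"train": images_for([a]), "test": images_for([a])} for a in ANGLES}

# Experiment 2: train on one angle, test on images from all angles.
exp2 = {a: {"train": images_for([a]), "test": images_for(ANGLES)} for a in ANGLES}

# Experiment 3: train and test on all angles combined.
exp3 = {"all": {"train": images_for(ANGLES), "test": images_for(ANGLES)}}
```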
Blurriness-based Image Capturing Analysis
The blurriness tolerance evaluation investigates model performance degradation when image clarity is progressively reduced at four levels: 25%, 50%, 75%, and 100% blur. This simulates situations where images are captured under suboptimal conditions such as motion, focus issues, or poor lighting.
Blurriness Analysis for ‘Chicken’ Sample:
Fig. 8 Performance Accuracy for ‘Chicken’ Sample
The results as shown in Fig. 8 indicate that the Proposed Methods (PM1, PM2, PM3) consistently demonstrate strong classification performance across different blurriness levels when compared with conventional deep vision architectures. Among them, Proposed Method 2 achieves the highest accuracy across all degraded conditions, reaching 95.6% at 25% blurriness and maintaining 93.55% at 50% blurriness, which is superior to VGG19, ResNet, EfficientNet, and MobileNet under the same conditions. Similarly, Proposed Method 3 also shows high robustness, with 94.21% at 25% blurriness and 90.2% at 50% blurriness, reflecting its reliability in handling moderate image distortions.
As the blurriness level increases to 75% and 100%, all models experience a natural decline in accuracy. However, the proposed methods still outperform traditional networks, particularly in high distortion scenarios where standard models such as MobileNet and EfficientNet drop below 80% accuracy. In contrast, the proposed methods maintain accuracies above 80%, highlighting their resilience in dealing with challenging visual features. At the original (non-blurred) dataset, all models achieve near-perfect results, with Proposed Method 3 and MobileNet reaching 100% accuracy, showing that performance degradation is primarily influenced by visual distortion rather than dataset bias.
Overall, these findings demonstrate that the Proposed Methods are more robust and reliable than established architectures, especially under conditions of visual degradation. This robustness confirms the suitability of the proposed approaches for real-world food safety monitoring applications, where image imperfections such as blur, occlusion, and angular variation are common.
Blurriness Analysis for ‘Lekor’ Sample:
Fig. 9 Performance Accuracy for ‘Lekor’ Sample
The classification results for the ‘Lekor’ sample as shown in Fig. 9 demonstrate that the Proposed Methods deliver consistently strong performance across varying levels of blurriness, though a gradual decrease in accuracy is observed as distortion increases. At 25% blurriness, Proposed Method 2 achieves the highest accuracy (93.2%), followed closely by Proposed Method 3 (91.8%) and Proposed Method 1 (88.5%). These results highlight the capability of the proposed approaches to capture key visual cues even under moderate image degradation.
At 50% and 75% blurriness, accuracies decline across all models, with Proposed Method 2 maintaining a relatively high performance (91.0% and 83.5%, respectively). Despite the reduction, the proposed approaches remain superior to conventional architectures such as EfficientNet and MobileNet, which experience more significant drops in accuracy under similar conditions. At 100% blurriness, all models show notable decreases, with accuracies ranging from 64.2% to 78.9%, reflecting the challenge of extracting meaningful features from heavily degraded images.
On the original (clear) dataset, nearly perfect classification is achieved across all models, with Proposed Method 3 reaching 99.1% accuracy, confirming the effectiveness of the framework when image quality is preserved. Overall, the findings confirm that the Proposed Methods are more resilient to visual distortions and are well-suited for reliable cooking oil frequency usage classification in real-world applications where image imperfections are inevitable.
Blurriness Analysis for ‘Nugget’ Sample:
Fig. 10 Performance Accuracy for ‘Nugget’ Sample
From Fig. 10, the results for the ‘Nugget’ sample show that while the Proposed Methods maintain relatively strong performance, the overall accuracies are slightly lower compared to other sample types. At 25% blurriness, Proposed Method 2 records the best performance at approximately 88%, followed by Proposed Method 3 (86%) and Proposed Method 1 (83%). These values reflect a modest reduction in robustness when handling early-stage image distortion, suggesting that nugget samples may present more complex or less distinctive visual patterns.
As the blurriness level increases, classification performance decreases further. At 50% blurriness, the proposed approaches range between 80–85%, still outperforming conventional architectures but with a noticeable accuracy drop compared to clearer conditions. By 75% and 100% blurriness, accuracies fall into the 70–78% range, highlighting the difficulty in extracting reliable visual features under heavy distortion. Traditional models such as MobileNet and EfficientNet drop further, reaching the mid-60% range at maximum blurriness, reinforcing the sensitivity of these architectures to degraded visual inputs.
On the original dataset, however, all models achieve near-perfect classification, with Proposed Method 3 reaching about 97% accuracy, confirming that image clarity strongly influences detection reliability. Overall, the findings suggest that while the proposed framework remains effective, nugget samples introduce additional classification challenges under blurred conditions, making this food type more demanding for deep vision analysis.
Occlusion-based Image Capturing Analysis
Finally, the occlusion robustness assessment tests the system’s resilience when parts of the visual features are blocked. Four occlusion patterns (horizontal, vertical, left-diagonal, and right-diagonal) are applied at varying coverage levels to evaluate the model’s ability to infer cooking oil quality from incomplete visual information.
Horizontal Occlusion Analysis: Horizontal occlusion, in which a thick band obscures the middle horizontal region of the oil sample, introduced a greater performance challenge compared to vertical blocking. This type of distortion interrupts both the oil’s surface reflection at the top and the depth representation at the bottom, two critical features necessary for evaluating transparency and quality.
TABLE VII. PERFORMANCE ACCURACY FOR HORIZONTAL OCCLUSION
| Cooking Oil Samples | Model | Training Accuracy (%) | Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | VGG19 | 90.76 | 87.98 |
| | ResNet50 | 92.55 | 88.81 |
| | EfficientNet | 89.77 | 88.79 |
| | MobileNet | 88.31 | 85.21 |
| | Proposed Method 1 | 91.70 | 91.38 |
| | Proposed Method 2 | 93.78 | 92.57 |
| | Proposed Method 3 | 93.14 | 92.19 |
| ‘Lekor’ | VGG19 | 85.75 | 86.25 |
| | ResNet50 | 88.12 | 86.71 |
| | EfficientNet | 85.94 | 85.54 |
| | MobileNet | 83.73 | 81.86 |
| | Proposed Method 1 | 89.58 | 85.59 |
| | Proposed Method 2 | 90.85 | 86.80 |
| | Proposed Method 3 | 90.76 | 86.62 |
| ‘Nugget’ | VGG19 | 85.66 | 83.19 |
| | ResNet50 | 83.48 | 82.51 |
| | EfficientNet | 83.10 | 80.96 |
| | MobileNet | 81.70 | 78.93 |
| | Proposed Method 1 | 86.17 | 84.74 |
| | Proposed Method 2 | 89.29 | 86.44 |
| | Proposed Method 3 | 85.96 | 84.54 |
As shown in Table VII, most baseline deep learning models exhibited notable decreases in accuracy, with VGG19 and ResNet experiencing significant performance loss. This indicates that central horizontal visual cues are more influential than side regions in distinguishing subtle oil quality features.
The proposed methods (PM2 and PM3 in particular) demonstrated superior robustness against horizontal occlusion, suggesting that their design effectively incorporates redundancy from unblocked areas, such as the left and right regions of the sample. These models appear capable of reconstructing context from side features and compensating for missing top-bottom depth information. Nevertheless, even the proposed methods showed some decrease in accuracy, underscoring that horizontal occlusion is more damaging than vertical since it interferes with both surface-level reflections and sedimentation layers. This implies that future methods may need specialized augmentation strategies to simulate horizontal occlusion during training to further enhance resilience.
Vertical Occlusion Analysis: The vertical occlusion scenario introduces a central block that disrupts the middle portion of the image, where key visual cues such as the oil’s clarity, colour uniformity, and sedimentation levels are typically most prominent.
TABLE VIII. PERFORMANCE ACCURACY FOR VERTICAL OCCLUSION
| Cooking Oil Samples | Model | Training Accuracy (%) | Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | VGG19 | 92.13 | 88.36 |
| | ResNet50 | 89.66 | 89.47 |
| | EfficientNet | 88.44 | 88.80 |
| | MobileNet | 88.15 | 86.55 |
| | Proposed Method 1 | 93.37 | 89.98 |
| | Proposed Method 2 | 92.75 | 93.83 |
| | Proposed Method 3 | 94.40 | 91.27 |
| ‘Lekor’ | VGG19 | 89.06 | 85.39 |
| | ResNet50 | 89.93 | 86.23 |
| | EfficientNet | 87.08 | 83.78 |
| | MobileNet | 86.35 | 83.92 |
| | Proposed Method 1 | 89.62 | 88.62 |
| | Proposed Method 2 | 90.98 | 89.65 |
| | Proposed Method 3 | 89.48 | 91.71 |
| ‘Nugget’ | VGG19 | 83.85 | 81.76 |
| | ResNet50 | 87.05 | 84.98 |
| | EfficientNet | 83.94 | 83.11 |
| | MobileNet | 81.78 | 80.81 |
| | Proposed Method 1 | 87.56 | 86.69 |
| | Proposed Method 2 | 86.97 | 85.83 |
| | Proposed Method 3 | 87.94 | 84.24 |
From the experimental results as shown in Table VIII, it is evident that this type of occlusion reduces model accuracy, though not as severely as other forms of occlusion. The reason is that while the center is blocked, enough peripheral features at the top and bottom of the oil sample remain visible, allowing models to capture partial patterns necessary for classification. Traditional deep networks such as VGG19 and ResNet showed moderate resilience but still recorded performance degradation compared to the unoccluded baseline.
On the other hand, the proposed methods (PM1–PM3) consistently maintained stronger performance under vertical occlusion. Their ability to extract peripheral texture cues and compensate for the missing central region reflects better adaptability to real-world scenarios where objects may be partially blocked. Interestingly, MobileNet and EfficientNet, which are optimized for lightweight feature extraction, demonstrated steeper drops, indicating that smaller architectures struggle when critical regions are missing. Overall, the vertical occlusion test highlights the importance of designing models capable of leveraging contextual background information when key regions are obscured.
Left-Diagonal Occlusion Analysis: The left diagonal occlusion case presented one of the most challenging conditions, as it disrupted both vertical and horizontal symmetry across the image.
TABLE IX. PERFORMANCE ACCURACY FOR LEFT-DIAGONAL OCCLUSION
| Cooking Oil Samples | Model | Training Accuracy (%) | Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | VGG19 | 89.01 | 88.26 |
| | ResNet50 | 90.58 | 85.41 |
| | EfficientNet | 88.54 | 86.65 |
| | MobileNet | 86.80 | 84.36 |
| | Proposed Method 1 | 89.56 | 88.69 |
| | Proposed Method 2 | 92.58 | 88.57 |
| | Proposed Method 3 | 90.94 | 88.99 |
| ‘Lekor’ | VGG19 | 83.98 | 83.21 |
| | ResNet50 | 85.26 | 85.35 |
| | EfficientNet | 83.41 | 83.03 |
| | MobileNet | 81.80 | 83.40 |
| | Proposed Method 1 | 87.88 | 85.11 |
| | Proposed Method 2 | 88.49 | 84.83 |
| | Proposed Method 3 | 88.40 | 86.61 |
| ‘Nugget’ | VGG19 | 84.56 | 81.33 |
| | ResNet50 | 83.57 | 84.04 |
| | EfficientNet | 82.89 | 81.25 |
| | MobileNet | 78.88 | 78.54 |
| | Proposed Method 1 | 85.48 | 82.64 |
| | Proposed Method 2 | 84.91 | 82.46 |
| | Proposed Method 3 | 85.05 | 83.77 |
Unlike straight occlusions, diagonal blocking introduces irregular distortion that overlaps with multiple critical regions simultaneously. This led to a substantial drop in accuracy across most models, especially EfficientNet and MobileNet, which struggled to generalize when diagonal cues were missing. The diagonal cut reduced visibility of the oil’s surface shine, its central transparency, and even part of its lower sedimentation, leading to incomplete feature representation for classification.
Among all models, as shown in Table IX, the proposed methods again performed relatively better, with PM1 sustaining moderate performance compared to the deep baselines. This suggests that the proposed techniques may have better adaptability to irregular distortions by extracting context from available fragments rather than relying heavily on global symmetry. The left diagonal occlusion highlights the vulnerability of standard architectures when confronted with non-orthogonal noise patterns. It also emphasizes the necessity of training models with occlusion-aware augmentation strategies that simulate diagonal blockages to improve their robustness for real-world applications.
Right-Diagonal Occlusion Analysis: The right diagonal occlusion produced results similar to the left diagonal case but with slightly less severe degradation. This type of occlusion cuts across the sample from the opposite side, again interfering with both vertical and horizontal regions simultaneously.
TABLE X. PERFORMANCE ACCURACY FOR RIGHT-DIAGONAL OCCLUSION
| Cooking Oil Samples | Model | Training Accuracy (%) | Validation Accuracy (%) |
|---|---|---|---|
| ‘Chicken’ | VGG19 | 86.89 | 85.76 |
| | ResNet50 | 88.53 | 85.45 |
| | EfficientNet | 85.35 | 82.49 |
| | MobileNet | 85.76 | 82.27 |
| | Proposed Method 1 | 91.76 | 87.79 |
| | Proposed Method 2 | 89.57 | 90.60 |
| | Proposed Method 3 | 89.73 | 88.74 |
| ‘Lekor’ | VGG19 | 85.38 | 84.47 |
| | ResNet50 | 87.01 | 83.25 |
| | EfficientNet | 86.40 | 80.22 |
| | MobileNet | 85.50 | 78.55 |
| | Proposed Method 1 | 86.73 | 84.92 |
| | Proposed Method 2 | 86.24 | 88.65 |
| | Proposed Method 3 | 86.28 | 85.70 |
| ‘Nugget’ | VGG19 | 82.32 | 81.26 |
| | ResNet50 | 82.02 | 79.65 |
| | EfficientNet | 80.50 | 80.40 |
| | MobileNet | 78.27 | 77.58 |
| | Proposed Method 1 | 84.50 | 80.91 |
| | Proposed Method 2 | 83.81 | 86.19 |
| | Proposed Method 3 | 81.94 | 81.27 |
The drop in performance was somewhat smaller compared to left diagonal occlusion. This difference may be attributed to dataset asymmetry, where lighting conditions or the natural distribution of oil features may have made the right side less critical for accurate classification. Nevertheless, baseline models such as MobileNet and EfficientNet still suffered noticeable declines, indicating their sensitivity to oblique distortions.
As shown in Table X, the proposed methods, particularly PM2, demonstrated stronger robustness, maintaining higher testing accuracy under diagonal occlusion than the other models. This suggests that their feature extraction mechanisms are less dependent on symmetry and more capable of reconstructing contextual cues. The resilience under right diagonal blocking shows promise for deployment in real environments, where random occlusions may occur due to glass markings, utensils, or camera angles. Overall, this test reinforces that diagonal occlusions represent one of the toughest challenges for visual inspection systems, but the proposed models show encouraging adaptability.
CONCLUSION
This study introduced a vision-based framework for cooking oil quality assessment that effectively addresses key challenges in food safety monitoring. The proposed method outperformed state-of-the-art deep learning models and showed resilience to variations in angle, blurriness, and occlusion, with multi-angle capture proving especially useful. While performance remained high at moderate blur levels, extreme distortions reduced accuracy, and occlusion effects varied by orientation. Overall, the findings confirm the model’s reliability under common visual challenges. Future improvements could include automated imaging, real-time preprocessing, larger datasets with synthetic distortions, hybrid sensing with spectral data, and optimization for edge deployment, making the system a scalable and practical solution for real-world cooking oil monitoring.
ACKNOWLEDGMENT
The authors would like to acknowledge and thank Bahagian Keselamatan dan Kualiti Makanan, Jabatan Kesihatan Negeri Perak, Kementerian Kesihatan Malaysia for the Research Project Collaboration with Universiti Teknikal Malaysia Melaka and Telkom University during discussion and research activities conducted together.
REFERENCES
- L. Zhu, P. Spachos, E. Pensini, and K. N. Plataniotis, “Deep learning and machine vision for food processing: A survey,” Current Research in Food Science, vol. 4, pp. 233–249, 2021.
- A. Susanto, T. Cenggoro, and B. Pardamean, “Oil Palm Fruit Image Ripeness Classification with Computer Vision using Deep Learning and Visual Attention,” Journal of Telecommunication, Electronic and Computer Engineering, vol. 12, no. 2, 2020.
- Q. Abbas, M. E. A. Ibrahim, and M. A. Jaffar, “A comprehensive review of recent advances on deep vision systems,” Artificial Intelligence Review, vol. 52, no. 1, pp. 39–76, 2018.
- Z. Xiao, J. Wang, L. Han, S. Guo, and Q. Cui, “Application of Machine Vision System in Food Detection,” Frontiers in Nutrition, vol. 9, 2022.
- K. Lim, K. Pan, Z. Yu, and R. H. Xiao, “Pattern recognition based on machine learning identifies oil adulteration and edible oil mixtures,” Nature Communications, vol. 11, no. 1, 2020.
- I. N. Rafiqah and Sriani, “Classification of Crude Palm Oil Quality Eligibility Using Support Vector Machine Algorithm,” Journal La Multiapp, vol. 5, no. 4, pp. 371–376, 2024.
- H. Zhao et al., “The application of machine-learning and Raman spectroscopy for the rapid detection of edible oils type and adulteration,” Food Chemistry, vol. 373, p. 131471, 2022.
- L. Xie, R. Han, S. Xie, D. Chen, and Y. Chen, “Multi-View Fusion Network for Crop Disease Recognition,” pp. 121–126, 2021.
- B. Liu, Y. Yang, S. Wang, Y. Bai, Y. Yang, and J. Zhang, “An automatic system for bearing surface tiny defect detection based on multi-angle illuminations,” Optik, vol. 208, p. 164517, 2020.
- N. M. Z. Hashim, Y. Kawanishi, D. Deguchi, I. Ide, A. Amma, N. Kobori, and H. Murase, “Best Next-Viewpoint Recommendation by Selecting Minimum Pose Ambiguity for Category-Level Object Pose Estimation,” Journal of the Japan Society for Precision Engineering, vol. 87, no. 5, pp. 440–446, 2021.
- A. R. Mulaka, R. Bidese, and Y. Bao, “Effects of Viewing Angle and Field of View on Detection, Tracking, and Counting of Pine Seedlings Towards Automated Forest Nursery Inventory,” Smart Agricultural Technology, p. 100951, 2025.
- B. N. Jagadesh et al., “Enhancing food recognition accuracy using hybrid transformer models and image preprocessing techniques,” Scientific Reports, vol. 15, no. 1, 2025.
- M. Sayed and G. Brostow, “Improved Handling of Motion Blur in Online Object Detection,” arXiv preprint, 2020.
- Y. Chang, “Research on de-motion blur image processing based on deep learning,” Journal of Visual Communication and Image Representation, vol. 60, pp. 371–379, 2019.
- S. Naseem and M. Rizwan, “The Role of Artificial Intelligence in Advancing Food Safety: A Strategic Path to Zero Contamination,” Food Control, p. 111292, 2025.
- R. Bianco et al., “Tailoring the Nutritional Composition of Italian Foods to the US Nutrition5k Dataset for Food Image Recognition: Challenges and a Comparative Analysis,” Nutrients, vol. 16, no. 19, p. 3339, 2024.