International Journal of Research and Innovation in Applied Science (IJRIAS)


Optimizing STL Image Compression with Recurrent Neural Networks and Binarized LSTM

K. Himaja, M. Christina, Dr. D. Anu Disney, M.E., Ph.D.

Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Semmancheri, Chennai, Tamil Nadu

DOI: https://doi.org/10.51584/IJRIAS.2025.10040065

Received: 17 April 2025; Accepted: 21 April 2025; Published: 15 May 2025

ABSTRACT

Industry 4.0 has transformed the production process and affects virtually every area of life and every kind of business. This paper discusses how the healthcare sector can be integrated with Industry 4.0 to become automated and intelligent. Medical science continues to produce inventions, orthopedics being one such specialty. Orthopedics requires customized products, since implants and equipment differ from patient to patient, and the smart production systems of Industry 4.0 can meet these requirements easily: quality implants, bio-models, surgical instruments, and many other orthopedic devices can be designed and manufactured quickly. Doctors as well as patients are supported by virtual reality simulation and 3D views of equipment and patients, holography benefits education, and Industry 4.0 helps reduce the patient's pain during the planning of a surgical task. Its world-class production systems can effectively produce smarter medical and orthopedic devices. Skin cancer, for example, is curable if diagnosed early by a dermatologist using a dermatoscope, and many companies already use AI to improve sales, productivity, speed, efficiency, segmentation, targeting, compliance, conversions, product development, and business growth. The approach presented here compresses 3D images using RNN and autoencoder network designs without losing image quality; the results are therefore evaluated through rate-distortion analysis performed before and after compression.

Keywords: Industry 4.0, Orthopedics, Artificial Intelligence, 3D Imaging, Smart Manufacturing.

INTRODUCTION

Image compression is essential for reducing the storage space used by images and the bandwidth consumed when transmitting them over networks. These goals have become indispensable in a multitude of applications, including web content delivery, digital media, medical imaging, and video streaming. Conventional compression techniques must evolve because present-day images are growing more complex and face greater demands for real-time processing. Classic algorithms such as the discrete cosine transform (DCT) reduce large images to a compact frequency-domain representation without altering the structural complexity of the image, yet they lack flexibility in handling general classes of images and application scenarios. Recent advancements in deep learning have opened new opportunities for improving image compression.

Unlike fixed algorithmic methods, deep learning models, including both CNNs and RNNs, can automatically learn complex patterns from vast datasets, enabling better generalization and adaptability while optimizing an objective function. CNNs have a prominent ability to identify spatial features inside images, mainly because of their localized receptive fields and hierarchical architecture; they have played a crucial role in improving compressibility by picking out patterns and redundancy at every spatial scale. RNNs, however, shine in dealing with sequential data and modelling relationships over time, which makes them well suited to capturing spatial and temporal dependencies in image data.

Binarized LSTM networks are a variation of RNNs that reduces computational overhead while retaining the sequential processing strength of traditional LSTMs and their ability to represent complex patterns, making them ideal for high-performance, resource-efficient image compression tasks. The high-level objective is to create a more advanced image compression model that consolidates the sequential processing power of RNNs, the compactness and efficiency of binarized LSTMs, high-quality feature extraction from CNNs, and the strengths of the DCT domain, all of which converge to improve compression. The model will be trained and tested on the STL image dataset, which contains a broad set of images, so that its robustness across a wide variety of scenarios can be evaluated.

Furthermore, it will be compared against state-of-the-art image compression techniques to demonstrate improvements in compression ratio, image quality, and computational efficiency.

LITERATURE REVIEW

Although efficient, conventional image compression techniques like JPEG introduce artifacts and information loss at low bit rates. To overcome these issues, researchers proposed image compression with recurrent neural networks in an end-to-end framework. The approach iteratively refines image representations, incorporating residual learning of fine details, followed by binarization and entropy coding. RNN-based compression adapts well to variable resolutions, delivering superior quality with fewer artifacts. Although it reaches state-of-the-art compression rates, it still suffers from high computational requirements, and further optimization is needed to balance quality and efficiency. [1]

Artificial intelligence has reached an advanced stage in which deep learning has become the transformative approach. Early breakthroughs were mainly CNNs used in image processing to tackle problems of feature extraction. These networks, however, struggled with sequential data, which motivated the development of RNNs. RNNs are well suited to handling temporal patterns but suffer limitations with long-term dependencies. Unsupervised pretraining methods such as autoencoders and RBMs were added to improve feature representations. Together, these algorithmic improvements, greater computational power, and large datasets allowed deep learning to advance quickly on tasks such as image recognition, speech processing, and reinforcement learning. [2]

Deep learning models are computationally intensive and difficult to deploy on resource-constrained hardware. Binarized neural networks (BNNs) therefore constrain weights and activations to two binary values. The resulting reductions in computational complexity and memory usage make BNNs suitable for mobile phones and embedded systems. However, BNNs suffer from reduced accuracy, especially in complex tasks. To overcome this, novel training strategies, optimization techniques, and hybrid approaches have been proposed to improve precision while maintaining efficiency. Although BNNs offer impressive cost and energy savings, the trade-offs in precision make them most suitable for targeted edge AI applications. [3]

Although powerful, deep learning models usually require substantial computational resources and are therefore hard to deploy in constrained environments. Binarized Neural Networks (BNNs) address this problem by binarizing the weights and activations of a network, simplifying its complexity at the cost of some accuracy. The concept was further advanced by XNOR-Net, which builds binary convolutional neural networks on efficient XNOR and bit-count operations. To close the accuracy gap caused by binarization, an approximation scheme was proposed. XNOR-Net achieves competitive accuracy on datasets like ImageNet while reducing computational cost by an order of magnitude, though minor performance differences remain compared with full-precision models. [4]

Though traditional image compression methods such as JPEG and H.264 are widely used, they cannot ensure quality at low bitrates. Deep learning-based approaches arose to overcome this problem. Autoencoder-based approaches are especially effective at producing compact representations, saving storage space without sacrificing significant detail, but they struggle to adapt to varying compression rates, an issue that progressive coding with RNNs addresses. The perceptual quality of RNN outputs is sometimes lacking; GAN-based enhancements generate realistic details that improve subjective quality significantly. In addition, recent breakthroughs in entropy modeling and optimization objectives have further established superiority over traditional codecs at low bitrates, in both objective metrics and visual appeal. [5]

Traditional codecs such as JPEG and HEVC dominated image compression for decades but could not easily balance quality and efficiency at low bitrates; deep learning broke this bottleneck with data-driven, end-to-end optimized strategies. Autoencoder-based models such as VAEs are excellent probabilistic models but tend to perform poorly on perceptual quality.

Transformer-based models have recently emerged with higher adaptability and efficiency. Together with advances in entropy coding and rate-distortion optimization, deep learning techniques can now consistently surpass traditional codecs, which has changed the landscape of image compression for modern applications. [6]

A large-scale image database of more than 14 million labeled images transformed the field. Organized using the WordNet hierarchy, it provided a foundation for training powerful models such as CNNs. ImageNet's impact was soon manifested in the ILSVRC, where AlexNet established a new benchmark for object recognition. However, as ImageNet became the benchmark against which algorithms are tested, concerns about the scalability of manual labeling surfaced, and researchers began exploring alternatives such as self-supervised learning. [7]

JPEG and HEVC are image compression algorithms that efficiently reduce file sizes but usually fail to preserve good visual quality at low bitrates. Overcoming this motivated the development of generative adversarial networks (GANs) for image compression, with perceptual quality as the focus. Combining deep autoencoders for feature extraction with an adversarial loss function, GANs generate visually realistic images even at high compression rates. This approach outperforms the state of the art on perceptual metrics such as MS-SSIM and LPIPS while providing competitive PSNR values. The downside of GAN-based methods is that they can be computationally expensive. [8]

Spatial features in image processing have traditionally been extracted using CNNs, which struggle with the temporal dependencies present in image sequences. RNNs were introduced to overcome this by modelling sequential information, but RNNs alone cannot extract spatial features well. Hybrid models that combine CNNs and RNNs were therefore proposed, capturing both spatial and temporal features. This improved accuracy in image classification and object recognition tasks, but at an increased computational cost due to the added complexity of hybrid models. [9]

DNNs have shown remarkable performance but consume large amounts of computational resources, making their deployment on resource-constrained devices difficult. Model compression addresses this problem by shrinking DNNs without loss of accuracy. An effective method is a three-stage approach: pruning, trained quantization, and Huffman coding. Pruning first removes redundant weights, trained quantization then reduces precision, and Huffman coding compresses the model further.

This technique achieved up to 35x compression with comparable accuracy. The challenge, however, is balancing compression and model performance; aggressive compression can result in a small loss of accuracy. [10]

PROPOSED SYSTEM

The proposed system puts forward an advanced methodology for optimizing STL (stereolithography) image compression by integrating preprocessing, feature extraction, and cutting-edge compression, leveraging the strengths of binarized long short-term memory (LSTM) networks and recurrent neural networks (RNNs) to enhance both compression efficiency and image quality.

The preprocessing stage prepares and standardizes the image data for further processing. Pixel values are normalized to a uniform range, typically [0, 1] or [-1, 1], so that the neural network models can process the data more efficiently. Images are resized to uniform dimensions, ensuring consistency for batch processing and stable model training. Data augmentation such as rotation, flipping, cropping, and color adjustment is also used to synthetically enlarge the dataset and create more variability; this reduces the risk of overfitting during training and makes the system robust to varying image conditions.

For feature extraction, the system uses RNNs, which are well suited to capturing sequential dependencies within image data. Although originally developed for sequential data such as text or time series, RNNs can be adapted effectively to image processing tasks. Applied to STL image compression, they identify and encode spatially and sequentially important features, offering significant advantages over methods such as CNNs when temporal or sequential relationships within the image must be considered. This feature extraction step is crucial for encoding the information needed to maintain image integrity after compression and decompression.
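
As a concrete illustration of the preprocessing stage described above, the sketch below uses torchvision transforms; the resize dimensions, normalization range, and augmentation parameters are illustrative assumptions, not values reported in this paper.

```python
# Illustrative preprocessing pipeline (assumed parameters, not the authors' exact settings).
from torchvision import transforms

# Resize to fixed dimensions, augment, and normalize pixel values to [-1, 1].
train_transform = transforms.Compose([
    transforms.Resize((96, 96)),                           # uniform dimensions for batch processing (assumed size)
    transforms.RandomHorizontalFlip(p=0.5),                # flip augmentation
    transforms.RandomRotation(degrees=10),                 # small rotations
    transforms.ColorJitter(brightness=0.1, contrast=0.1),  # mild color augmentation
    transforms.ToTensor(),                                 # scales pixel values to [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # maps [0, 1] -> [-1, 1]
])

# Evaluation uses the same resize and normalization but no augmentation.
eval_transform = transforms.Compose([
    transforms.Resize((96, 96)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```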

The core of the system's compression mechanism is the binarized LSTM. These networks extend the traditional LSTM architecture by applying binarization to weights and activations, constraining them to two binary values. This significantly reduces computational complexity and storage requirements. Binarized LSTMs compact the features extracted by the RNNs into binary representations, improving the efficiency of the encoding process. The resulting binary representation is much smaller than a conventional floating-point representation, so the compressed model retains only the features critical for image reconstruction.

Binarizing the LSTM also speeds up computation during both training and inference, which makes the system suitable for real-time applications and resource-constrained environments.
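
A minimal sketch of how activations can be constrained to binary values while keeping gradients usable is shown below; it uses a sign-based binarizer with a straight-through estimator, which is one common realization and not necessarily the exact scheme used in this system.

```python
# Minimal straight-through binarizer sketch (one common approach; assumed, not this paper's exact method).
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: hard 0/1 binarization. Backward: pass gradients straight through where inputs are not saturated."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()          # binary code in {0, 1}

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Straight-through estimator: block gradients where |x| is large (saturated).
        return grad_output * (x.abs() <= 1).float()

def binarize(x):
    return BinarizeSTE.apply(x)

# Example: binarize the hidden states produced by an LSTM.
lstm = torch.nn.LSTM(input_size=64, hidden_size=32, batch_first=True)
features = torch.randn(8, 10, 64)       # (batch, sequence, feature)
hidden_seq, _ = lstm(features)
codes = binarize(hidden_seq)            # compact {0, 1} representation
print(codes.shape, codes.unique())
```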

The compression algorithm has two major stages: encoding and decoding. In the encoding stage, the trained RNN and the binarized LSTM networks translate the input images into compressed binary codes. The information is highly compressed and retains only what is needed to reconstruct the original image with minimal loss. In the decoding stage, this compressed binary information passes through the binarized LSTM decoder, which reconstructs the image with as little loss of detail as possible while keeping it visually faithful. This step is critical because it determines how closely the reconstruction matches the original, thereby balancing the compression ratio against image quality.
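
The encode/decode stages could be organized roughly as in the sketch below; the layer sizes, patch layout, and the simple hard binarizer are assumptions for illustration rather than the authors' exact architecture.

```python
# Rough sketch of the encode/decode stages (assumed layer sizes and patch layout; illustrative only).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim=256, code_dim=32):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, code_dim, batch_first=True)

    def forward(self, x):
        h, _ = self.rnn(x)              # sequential feature extraction
        return (h > 0).float()          # hard 0/1 code (a straight-through estimator would be used in training)

class Decoder(nn.Module):
    def __init__(self, code_dim=32, out_dim=256):
        super().__init__()
        self.rnn = nn.LSTM(code_dim, out_dim, batch_first=True)

    def forward(self, codes):
        recon, _ = self.rnn(codes)      # reconstruct the image patches from the binary codes
        return recon

encoder, decoder = Encoder(), Decoder()
patches = torch.randn(4, 16, 256)               # e.g. 16 flattened patches per image (assumed layout)
codes = encoder(patches)                        # compressed binary representation
recon = decoder(codes)                          # reconstruction
loss = nn.functional.mse_loss(recon, patches)   # distortion term used for training
print(codes.shape, recon.shape, float(loss))
```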

In addition, the inclusion of the DCT greatly enhances the compression process. The DCT is a mathematical tool that transforms spatial-domain data, such as the pixel values of images and signals, into frequency components. Once the image data is transformed into the frequency domain, the DCT separates the most important image features (such as edges, textures, and patterns) from less important detail. This selective compression reduces data size without noticeably affecting perceptual quality. The DCT is particularly good at capturing low-frequency components, which are generally most important to human visual perception, while discarding high-frequency components often regarded as noise, improving compression efficiency while maintaining visual quality.
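
As a small illustration of the DCT step, the sketch below applies a 2D DCT to an image block, keeps only the low-frequency corner of the coefficients, and inverts the transform; the block size and the fraction of coefficients kept are arbitrary assumptions.

```python
# 2D DCT low-frequency retention sketch (block size and keep-fraction are assumptions).
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
block = rng.random((8, 8))                       # stand-in for an 8x8 image block

coeffs = dctn(block, norm='ortho')               # spatial domain -> frequency domain

keep = 4                                         # keep only the top-left (low-frequency) 4x4 coefficients
mask = np.zeros_like(coeffs)
mask[:keep, :keep] = 1.0
compressed = coeffs * mask                       # discard high-frequency detail (treated as noise)

recon = idctn(compressed, norm='ortho')          # frequency domain -> spatial domain
mse = np.mean((block - recon) ** 2)
print(f"kept {keep * keep}/{block.size} coefficients, reconstruction MSE = {mse:.5f}")
```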

The performance of the proposed system has been rigorously evaluated using key metrics: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and compression ratio. PSNR measures the quality of the reconstructed image by comparing the original and compressed images, with higher values indicating less distortion. SSIM measures the structural similarity between the original and reconstructed images, taking luminance, contrast, and texture into account; the higher the SSIM, the better the perceptual structure is preserved. The compression ratio quantifies the factor by which the data is reduced and is a direct measure of the system's performance. Testing covers several combinations of image categories, resolutions, and formats so that the system remains useful across diverse application scenarios and environments.
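
The evaluation metrics mentioned here can be computed as in the hedged sketch below; it assumes 8-bit grayscale images and uses scikit-image for SSIM, with a hypothetical 10:1 compressed size standing in for real encoder output.

```python
# Evaluation metric sketch: PSNR, SSIM, and compression ratio (assumes 8-bit grayscale inputs).
import numpy as np
from skimage.metrics import structural_similarity

def psnr(original, reconstructed, max_val=255.0):
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def compression_ratio(original_bytes, compressed_bytes):
    return original_bytes / compressed_bytes

rng = np.random.default_rng(1)
original = rng.integers(0, 256, size=(96, 96), dtype=np.uint8)
# Simulated reconstruction: the original plus small pixel errors.
reconstructed = np.clip(original.astype(int) + rng.integers(-5, 6, size=(96, 96)), 0, 255).astype(np.uint8)

print("PSNR :", psnr(original, reconstructed))
print("SSIM :", structural_similarity(original, reconstructed, data_range=255))
print("CR   :", compression_ratio(original.nbytes, original.nbytes // 10))  # hypothetical 10:1 compression
```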

This STL image compression system optimizes compression efficiency and image quality through a combination of advanced preprocessing, RNN-based feature extraction, binarized LSTM encoding, and DCT-based frequency-domain transformation.

Working in an integrated manner, it compresses very large images while preserving strong fidelity, even at high compression factors. This makes it well suited to scenarios such as 3D printing and CAD, which require efficient storage and transmission of STL image files, as well as other fields with similar demands.

Flow of the proposed system

Fig. 1. Flow of the proposed system

Figure 1 illustrates the image processing pipeline, which merges several advanced techniques to achieve efficient compression and reconstruction. The input image is the raw data to be processed. The pipeline begins with a discrete cosine transform, converting the image into the frequency domain. This step emphasizes the low-frequency coefficients, which often contain the most important visual information, so few coefficients are required to represent the image. The DCT coefficients are then passed through CNNs, which are good at deconstructing images and extracting features by identifying local patterns and correlations such as edges, textures, and structures. These features are essential for reducing redundancy and understanding the content of an image. The extracted features are then fed sequentially to a recurrent neural network (RNN), which captures long-range dependencies and temporal patterns, further enriching the representation of the image. To optimize efficiency, the RNN output is passed through a binarized LSTM unit. Binarization reduces computational complexity and dimensionality while retaining the important information, making subsequent operations more efficient.

The compressed representation is then subjected to entropy encoding, a lossless step that reduces the number of bits required to store or transmit the image; it maximizes compression without sacrificing information integrity. During decompression, the decoding process reverses the entropy encoding to recover the compressed data, and the earlier transforms such as the DCT and feature extraction are inverted to reconstruct the image from the compressed representation. The result is a reconstructed image quite close to the original input. This pipeline balances compression, feature extraction, and reconstruction, making it feasible for applications such as image storage, transmission, and analysis.
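
Since entropy encoding is lossless, its achievable rate can be estimated from the empirical symbol distribution. The sketch below computes the Shannon entropy of a hypothetical binary code as a lower bound on bits per symbol; it is a simplification of whatever entropy coder the pipeline actually uses.

```python
# Shannon-entropy estimate of the rate achievable by lossless entropy coding (simplified sketch).
import numpy as np

def empirical_entropy_bits(symbols):
    """Average bits per symbol implied by the empirical symbol distribution."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(2)
# Hypothetical binarized code: mostly zeros, so it should compress well below 1 bit/symbol.
codes = (rng.random(10_000) < 0.2).astype(np.uint8)

bits_per_symbol = empirical_entropy_bits(codes)
print(f"entropy ~ {bits_per_symbol:.3f} bits/symbol "
      f"(vs. 1 bit/symbol for storing the raw binary code)")
```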

RESULTS

Loss Curves: The loss curve plots the training and validation losses for each epoch, making it possible to judge whether the model is learning. Ideally, the loss decays smoothly in both architectures, with the validation loss converging toward the training loss; overfitting would be reflected in a growing gap between the training and validation losses. For binarized LSTMs, a more rapid decline in loss may be seen because binary representations reduce model complexity; for RNNs, a slower but steady loss reduction is expected due to the sequential nature of the data processing.
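
Plotting such curves can be done with matplotlib as in the sketch below; the loss values are synthetic placeholders, not results from this paper.

```python
# Loss-curve plotting sketch (synthetic placeholder values, not the paper's results).
import numpy as np
import matplotlib.pyplot as plt

epochs = np.arange(1, 51)
train_loss = 0.05 + 0.95 * np.exp(-epochs / 8)   # synthetic, smoothly decaying training curve
val_loss = 0.07 + 0.95 * np.exp(-epochs / 9)     # synthetic validation curve

plt.figure(figsize=(6, 4))
plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.title("Training vs. validation loss")
plt.legend()
plt.tight_layout()
plt.show()
```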

Accuracy Graphs: Figure 2. Accuracy graphs are often plotted alongside loss curves to understand how well the model is performing. These plots display the accuracy on both the training and validation sets as the number of epochs increases. For binarized LSTMs, the graph should show that accuracy remains high even though the representation is compact and binary, indicating that the network preserves the important features of the image despite the reduced complexity. RNN-based graphs are expected to show a slight accuracy loss compared with traditional CNN-based models; however, their handling of sequential dependencies can still yield large improvements in compression tasks, especially where image features evolve over the sequence.

Fig. 2. Accuracy Graph of Binarized LSTM and RNN

Figure 3: When the model is tested on unseen data, the confusion matrix gives a useful graphical view of how well it classifies. It highlights false positives and false negatives alongside true positives and true negatives, in this case for compressing and reconstructing images without losing structural integrity. For the binarized LSTM model, the confusion matrix illustrates how binarization affects its ability to discriminate between image features; for the RNN, it reflects how efficiently the network handles sequences and recovers patterns in the compressed data.

Fig. 3. Confusion matrix of Binarized LSTM and RNN

ROC Curves: Figure 4. The ROC curve evaluates a model's capacity to separate classes and is mainly used to assess two-class classification tasks, though it can be applied to the task at hand. It shows the trade-off between the true positive rate (sensitivity) and the false positive rate (1 - specificity); a better model lies closer to the top-left corner (higher sensitivity, fewer false positives). For binarized LSTMs, the curve describes how well the binary representation captures relevant information without overfitting; for the RNN, it illustrates the network's ability to reconstruct the compressed image while discriminating the features useful for classification.
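
A hedged sketch of computing the ROC curve and its area with scikit-learn follows; the labels and scores are random placeholders rather than values from this study.

```python
# ROC curve sketch with scikit-learn (random placeholder labels and scores).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(3)
y_true = rng.integers(0, 2, size=500)                           # ground-truth binary labels
y_score = np.clip(y_true * 0.6 + rng.random(500) * 0.7, 0, 1)   # noisy scores correlated with the labels

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC =", roc_auc_score(y_true, y_score))
# (fpr, tpr) can then be plotted to obtain a curve like the one shown in Figure 4.
```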

Fig. 4. ROC curve for Binarized LSTM and RNN

PSNR and SSIM plots: In image compression, PSNR and SSIM are key metrics that measure the quality of the reconstructed image against the original. PSNR considers pixel-wise differences, while SSIM accounts for structural detail in terms of luminance, contrast, and texture. High PSNR and SSIM values for both the binarized LSTM and RNN models indicate that the quality and structural properties of the compressed images are retained and that these methods preserve the important information in the image. Plotting these metrics over epochs illustrates how image quality improves during training.

An important analysis of the compression task is a plot of compression ratio against image quality metrics such as PSNR or SSIM. Such a graph shows the trade-off between how much data is compressed and how much quality is preserved. Binarized LSTMs should achieve a strong compression ratio while keeping image quality good, because they encode images in binary form. RNN-based methods are efficient on sequential data, but their trade-off curve may differ, with perhaps less efficient compression yet still effective reconstruction of the images.

Fig. 5. Graph of Binarized LSTM

Figure 5 shows the training and validation loss curves over 50 epochs of model training. The blue curve represents the training loss, which continuously drops as the model optimizes its performance on the training data. The orange curve represents the validation loss, describing how well the model generalizes to unseen data. Both curves converge towards a low value (~0.05), implying that the model learns without overfitting and generalizes well, performing reasonably on both the training and validation datasets.

Fig. 6. Original Point Cloud image vs Reconstructed Image by RNN

Figure 6 shows, in the top row, three different original point clouds (blue points) in 3D space; each point cloud represents a set of data points or a spatial structure, similar to a human figure. The bottom row shows the reconstructed point clouds (red points) corresponding to the originals above. These reconstructions are the model's attempts to recover the original point cloud from a learned representation. The reconstructed point clouds are orders of magnitude smaller and much simpler than the originals, which may reflect that the model compresses information at the cost of fine detail. Such setups are commonly used to benchmark autoencoders and similar models for dimensionality reduction or reconstruction.

Fig. 7: Before Compression of the original image using RNN

The Figure 7 plot shows a high-density original point cloud in a 3D coordinate space. The data appears to be uniformly distributed across a cubical volume, with the x, y, and z axes ranging from 0 to 1. Such a distribution can be thought of as raw points sampled from a uniform random distribution or a synthetically generated dataset.

This uniformity makes it a suitable input for testing whether the model compresses and reconstructs the data without introducing bias.

Fig. 8: After Compression of the image using RNN

Figure 8 depicts the latent representation of the original point cloud in the lower-dimensional space produced by compression. It shows sparse red points plotted against "Latent Dimension 1," "Latent Dimension 2," and "Latent Dimension 3" in 3D, illustrating how the data can be reduced in complexity while retaining its characteristic features. This compact encoding of the input data, usually generated by the encoder part of an autoencoder or similar neural network, helps visualize how information is organized and compressed within the latent space.

CONCLUSION

The proposed system integrates binarized LSTM networks so that image feature transformations can be mapped into compact binary representations. This leads to a large reduction in memory use and faster compression compared with traditional methods. Using RNNs makes it possible to capture complex image features and their sequential patterns; by processing data temporally, these architectures ensure that the important details and intricate dependencies of the images are preserved, leading to high-quality reconstructions. The system therefore minimizes compression artifacts and preserves important image characteristics while attaining an optimal balance between compression ratio and image fidelity.

To improve efficiency further, the system performs an initial set of feature extraction operations based on CNNs, which excel at spatial hierarchies and localized pattern detection and therefore help retain texture, edges, and fine detail during compression. In support of this, the system applies the DCT to transform spatial image data into frequency components, allowing it to distinguish significant features inside the image and discard less important information. This hybrid approach involving CNNs, RNNs, and the DCT ensures that images of various kinds and resolutions are compressed appropriately with very high visual quality.

The proposed system exhibits adaptability and robustness through rigorous testing across various datasets and can handle a wide range of scenarios, making it suitable for applications demanding high performance under diverse conditions. The binarization embedded in the LSTMs further increases the system's utility by reducing computational overhead and enabling faster processing, which is especially beneficial for real-time or near-real-time applications such as live video streaming and interactive media. Moving forward, the presented system provides a strong basis for further improvements in image compression technology; for instance, follow-up research could fine-tune its performance and scalability through better optimization techniques.

Extending the approach to video compression and integrating it into existing hardware and software platforms would open practical applications on a much larger scale. Studying new architectures, such as transformers or hybrid models, and novel compression algorithms could unlock further enhancement and innovation in this field. The combination of RNNs, CNNs, the DCT, and binarized LSTM networks therefore offers a forward-looking approach to modernizing image compression. The proposed system provides high compression efficiency with excellent retention of image quality, making it a valuable tool for numerous applications such as digital media storage, streaming services, and real-time multimedia processing.

REFERENCES

  1. Toderici, G., et al. (2017). Full Resolution Image Compression with Recurrent Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  2. Ballé, J., et al. (2016). End-to-End Optimized Image Compression. arXiv preprint arXiv:1611.01704.
  3. Minnen, D., et al. (2018). Learning Structured Compressors with Rate- Autoencoders. arXiv preprint arXiv:1802.06853.
  4. Ballé, J., et al. (2018). Variational Image Compression with a Scale Hyperprior. arXiv preprint arXiv:1802.01436.
  5. Hinton, G. E., et al. (2015). Deep Learning. Nature.
  6. Sutskever, I., et al. (2014). Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems.
  7. Zhou, X., et al. (2020). Recurrent Neural Networks for Image Processing. IEEE Access.
  8. Chen, H., et al. (2020). Binarized Neural Networks: A Survey and Review. IEEE Access.
  9. Rastegari, M., et al. (2016). XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. European Conference on Computer Vision (ECCV).
  10. Han, S., Mao, H., & Dally, W. J. (2016). Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. International Conference on Learning Representations (ICLR).
  11. Huang, Y., et al. (2021). Deep Learning for Image Compression: A Review. IEEE Transactions on Circuits and Systems for Video Technology.
  12. Deng, J., et al. (2009). ImageNet: A Large-Scale Hierarchical Image Database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  13. Jin, L., et al. (2020). Image Compression with Generative Adversarial Networks. IEEE Transactions on Image Processing.
  14. Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114.
  15. Zhang, Wang, S., Wang, M., Li, J., Wang, X., & Kwong, S. (2023). … representation with cross-modality transfer.
  16. Karimi, D., Dou, H., Warfield, S. K., & Gholipour, A. (2020). Deep learning with noisy labels: Exploring techniques and remedies in medical image analysis. Medical Image Analysis, 65, 101759.
