INTERNATIONAL JOURNAL OF RESEARCH AND INNOVATION IN APPLIED SCIENCE (IJRIAS)
ISSN No. 2454-6194 | DOI: 10.51584/IJRIAS |Volume X Issue IX September 2025
Page 64
www.rsisinternational.org
Real-Time Traffic Signal Optimisation Using Deep Q-Network
Algorithm and Camera Data
Samkeliso Suku Dube¹, Presley Nyama², Tinahe Peswa Dube³, Admire Bhuru⁴
National University of Science and Technology, Bulawayo, Zimbabwe
DOI: https://doi.org/10.51584/IJRIAS.2025.100900006
Received: 26 July 2025; Accepted: 01 Aug 2025; Published: 10 October 2025
ABSTRACT
Traffic congestion has become a persistent problem in the urban areas of developing countries. It is largely
caused by fixed-timing traffic signals, which cannot adapt to changing traffic conditions in real time.
This research introduces a reinforcement learning-based solution that uses a Deep Q-Network algorithm to
optimise traffic signal control, with the aim of reducing congestion and improving traffic flow efficiency. The
system is developed in a virtual environment using PTV VISSIM simulation software, and real-time traffic
data is collected using simulated cameras. The collected traffic data is then processed by a Deep Q-Network
implemented in Python and TensorFlow. By making traffic signal timings adaptive, the system reduced
waiting times at intersections and improved traffic flow in simulation compared with traditional
fixed-timing systems. The approach offers a promising, scalable framework for adaptive traffic management
on urban roads.
Keywords: Traffic Signal Optimisation, Deep Q-Network, Reinforcement Learning, Urban Mobility,
Real-Time Simulation, PTV VISSIM.
INTRODUCTION
Traffic congestion presents a growing challenge across many cities in developing countries, slowing economic
growth, reducing quality of life and contributing to environmental degradation through harmful emissions (Lu
et al., 2021). Traditional fixed-timing traffic signal systems fail to account for changing traffic patterns,
resulting in prolonged delays and inefficient intersection throughput. This research uses reinforcement
learning, specifically the Deep Q-Network (DQN) algorithm, to optimise traffic lights in real time using
data collected by cameras. The approach is implemented in a simulated PTV VISSIM environment and is
trained on real-time data collected from simulated cameras.
A. Background of Study
Road infrastructure in developing countries, especially in urban areas, faces significant challenges due to
growing vehicle ownership, underdeveloped public transport systems and poor traffic signal management.
Current fixed-timing traffic signals do not adapt to changing traffic volumes, which results in frequent
congestion, especially during peak hours (Munuhwa, 2020). With rising economic activity and urban
expansion, there is a growing need for efficient traffic management solutions. As advanced technologies such as
machine learning and smart city innovations gain momentum globally, developing countries have an
opportunity to advance into the field of intelligent traffic systems (Papageorgiou et al., 2003).
Agile methods and flexible solutions such as a Deep Q-Network (DQN) based traffic signal control
system show great promise for improving the efficiency of traffic signal management. The Deep Q-Network
algorithm is a form of reinforcement learning (RL). It adapts traffic signal timings to real-time data
inputs, improving traffic flow efficiency by reacting to traffic conditions as they develop (Qi et al.,
2022). Real-time traffic data is collected using virtual cameras in the simulation environment, which
capture key metrics such as vehicle counts and traffic density (Cornell University, 2019).
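As a minimal sketch of the learning rule behind a Deep Q-Network, the function below computes the one-step temporal-difference target that the network's predictions are trained towards. The function name, the discount factor and the sample numbers are illustrative assumptions; the paper does not report its hyperparameters.

```python
def td_target(reward, next_q_values, gamma=0.95, done=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a').

    gamma (the discount factor) is an assumed value, not taken from the paper.
    """
    if done or not next_q_values:
        # No future return is added once the episode has ended.
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical numbers: a reward of -12 (e.g. twelve queued vehicles) and
# estimated Q-values for the two signal phases available in the next state.
target = td_target(reward=-12.0, next_q_values=[-30.0, -20.0], gamma=0.9)
print(target)
```

During training, the network's estimate for the chosen action is nudged towards this target, so that phases which lead to shorter queues accumulate higher Q-values over time.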
With the increasing availability of comprehensive traffic datasets and the growing use and refinement of
deep reinforcement learning techniques, developing countries can utilise reinforcement learning (RL) for
traffic signal control. A key question for applying RL to traffic signal control is how to define the reward and
state. The ultimate objective in traffic signal control is to reduce travel time, which is difficult to optimise
directly (Jiang et al., 2021). Existing studies often define the reward as an ad hoc weighted linear combination
of several traffic measures. However, there is no guarantee that optimising such a reward also optimises
travel time. Recent RL approaches use more complex state representations (e.g., images) to describe the full
traffic state. None of the existing studies has discussed whether such a complex state representation is necessary
(Jiang et al., 2022).
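To make the reward question concrete, the sketch below shows what an ad hoc weighted linear combination of traffic measures might look like. The measure names and weights are hypothetical and are not taken from this study or from the cited works.

```python
def weighted_reward(measures, weights):
    """Negative weighted sum of congestion measures, so less congestion scores higher.

    Both dictionaries are keyed by measure name; every weighted measure must be present.
    """
    missing = set(weights) - set(measures)
    if missing:
        raise KeyError(f"missing measures: {sorted(missing)}")
    return -sum(weights[name] * measures[name] for name in weights)


# Hypothetical measurements for one intersection at one decision step.
measures = {"queue_length": 8, "waiting_time_s": 40.0, "delay_s": 10.0}
weights = {"queue_length": 0.5, "waiting_time_s": 0.25, "delay_s": 0.25}
reward = weighted_reward(measures, weights)
print(reward)  # -16.5
```

The choice of weights is exactly the ad hoc element the text refers to: changing them changes which policy the agent learns, with no guarantee that travel time improves.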
This research explores and addresses these challenges, aiming to provide a scalable solution to improve urban
traffic management and reduce congestion.
Related Work
The traffic signal control sector has undergone significant transformation over the past two decades,
primarily driven by technological advancements that respond to growing vehicle ownership and to demand for
convenience, real-time updates, and integrated services. Traditional methods use pre-timed and
actuated signals (Haimerl, FHWA, and Haddad), which are limited in adaptability and in their scope for
integration. Multi-agent reinforcement learning shows potential for solving heavy traffic problems. It adopts
centralised or distributed strategies but suffers from scalability issues (Kim & Jeong, 2020; Kolat, 2023; Ge et
al., 2022). DRQN models combine Long Short-Term Memory (LSTM) with a Deep Q-Network for temporal
awareness (Ma et al., 2025) but require more data and add training complexity. However, in some developing
countries these innovations are not yet fully leveraged.
Spatharis & Blekas (2024) introduced a concept whereby traffic lights at individual intersections are treated as
autonomous agents. The agents collaborate in managing signal control, taking into account real-time traffic
conditions and optimising their actions to alleviate congestion. This approach mimics a decentralised
decision-making system, in which each traffic light adapts its timing based on local traffic dynamics.
A multi-element traffic light system featuring blue, yellow and red displays was proposed with the aim of
improving communication between traffic lights and drivers, making signal intentions more intuitive.
Genetic algorithms (GA) were used to adaptively adjust and efficiently control these multi-element traffic
lights. This approach highlighted the importance of human-computer interaction and user-centric design in
traffic control (Haimerl et al., 2022).
Zheng et al. (2019) proposed a deep recurrent Q-network (DRQN) technique that combines a recurrent
neural network (RNN) with a deep Q-network (DQN) to learn various traffic environments. The DRQN
minimises the total number of vehicles waiting before the stop line (Liu et al., 2023; Zheng et al., 2019). The
proposed model defines the queue length parameter in the same sense as vehicle travel time.
Studies on reinforcement learning, in particular Deep Q-Networks, have shown that artificial intelligence
(AI)-driven systems can outperform traditional methods by dynamically adjusting to traffic conditions
(Qi et al., 2022).
Research Gap
Existing frameworks and traffic control systems are relevant, but they are mainly applicable to developed
infrastructures (Cui et al., 2020). There is a lack of camera data to enrich the environment used to train
traffic systems to adjust dynamically to changing scenarios (Wang et al., 2021). Traffic light installations
should therefore also be used to collect data on vehicle counts, waiting times and traffic flow at intersections.
METHODOLOGY
The simulation research methodology guided the iterative design, analysis, and validation of the system model,
enabling a controlled and repeatable evaluation of urban traffic scenarios using a virtual environment. This
methodology facilitated the systematic modelling, simulation, and refinement of the Deep Q-Network-based
traffic signal control system, ensuring that it effectively addresses the challenges of congestion at intersections
while contributing to the broader domain of intelligent transportation systems (see Figure 1).
Figure 1: Simulation Research Methodology Steps for the Research.
For the technical implementation, the agile Kanban methodology, a lightweight, visual project management
approach, was adapted to support flexible, efficient, and iterative software development. Kanban enables
clear visualisation of task progress, continuous delivery and adaptation to changing
requirements throughout the development of the system.
By aligning simulation-based research objectives with modular software development goals, the combined use
of simulation research methodology and agile Kanban ensured that the system was not only experimentally
validated but also practically implementable. This dual approach ensured that the resulting platform is robust,
adaptable and capable of supporting real-time traffic signal optimisation in rapidly changing environments
such as cities.
Ethical Considerations
To maintain the integrity of the research and ensure compliance with ethical standards, this study was
conducted with a strong emphasis on confidentiality and data protection. All information gathered from
datasets, free sources and stakeholders such as city officials, transport authorities and road users was
securely stored to protect the privacy of the individuals involved. Participants who contributed feedback during
prototype demonstrations and usability testing were fully informed of the research objectives, and their explicit
consent was obtained prior to their involvement. This ensured they were aware of their rights and the
voluntary nature of their participation. Additionally, in line with data protection regulations, special care was
taken to avoid the collection and misuse of any sensitive or personal information. Although the system uses
simulated data in a virtual environment, future implementations involving real-time camera data will adhere to
ethical standards regarding surveillance, public safety and individual privacy.
System Architecture
Figure 2: System Architecture for Real-Time Traffic Signal Optimisation System.
The system architecture shown in Figure 2 consists of three layers: the local server layer (the workstation),
the Deep Q-Network model layer and the traffic environment layer, which is illustrated as a simulated
environment. The HIK Vision traffic cameras send traffic data in image form to the model. The Deep
Q-Network model is Python-based and runs on a Google Coral Dev Board mini-computer, which is responsible
for all traffic data processing and sends signal timings to the traffic lights at controlled
intersections. The server provides resources such as power, additional storage and cooling to the Google
Coral Dev Board. The local server grants permissions to the PTV VISSIM simulation application, in which the
traffic cameras and traffic environment are simulated, and provides it with the resources to run this
environment. The simulation layer provides collected traffic data to the server and model through the COM
(Component Object Model) API and API analytics respectively. The model receives traffic data collected in the
simulation environment, analyses the data and makes decisions based on the Deep Q-Network algorithm. It
also signals the local server whenever additional resources are needed to run the desired traffic
optimisation. This architectural design promotes easier debugging and consistent data flow.
Implementation
The development of the system followed an object-oriented design approach, facilitating the modular and
iterative implementation of system components. This enabled flexible development and integration of
simulation, data processing, and optimisation modules.
A. System Components and Integration
Three key intersections within the Central Business District of Bulawayo, in the developing country of
Zimbabwe, were modelled using the map functionality in PTV VISSIM. These were the intersections of Ninth
Avenue and Fort Street, Ninth Avenue and Herbert Chitepo Street, and the uncontrolled intersection at Ninth
Avenue and Joshua Mqabuko Nkomo Street. These locations were chosen for their high congestion levels and
varying traffic light control schemes.
Simulated traffic environment: The traffic simulation environment was constructed to mirror the real-world
geometry and behaviour of vehicles, intersections, and traffic signals. The simulation was configured to
observe naturalistic vehicle interactions at intersections, highlighting traffic build-up and delays as would
occur in physical urban settings.
Traffic signal and camera setup: At each intersection, simulated Econolite Cobalt Series traffic lights were
deployed. Additionally, four simulated HIK Vision traffic cameras were placed per junction, each covering one
approach to capture vehicle counts and movements. These cameras facilitated real-time vehicle detection via
the COM API, enabling data-driven optimisation.
Vehicle modelling: The simulation included diverse vehicle models to replicate realistic traffic scenarios.
Vehicles were assigned stochastic behaviours across lanes and directions, reflecting real-world traffic
dynamics.
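With four cameras per junction, each covering one approach, the per-approach vehicle counts can be packed into a fixed-length state vector for the agent. The sketch below is one possible encoding; the approach labels, the normalisation cap of 50 vehicles and the inclusion of the current phase index are assumptions made for illustration, not details reported in the paper.

```python
APPROACHES = ("north", "south", "east", "west")  # assumed labels, one per camera


def build_state(counts, current_phase, max_count=50):
    """Return normalised per-approach counts followed by the active phase index.

    Counts above max_count are clipped so every count feature stays in [0, 1].
    """
    state = [min(counts.get(a, 0), max_count) / max_count for a in APPROACHES]
    state.append(float(current_phase))
    return state


# Hypothetical camera readings; the west approach exceeds the cap and is clipped.
state = build_state({"north": 10, "south": 25, "east": 0, "west": 60}, current_phase=1)
print(state)  # [0.2, 0.5, 0.0, 1.0, 1.0]
```

Keeping the features in a fixed order and range means the same network input layout can be reused at each of the modelled intersections.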
B. Functional Implementation
The core Python script, main.py, controlled system behaviour. It imported modules for traffic analysis and
signal control, calling functions in a continuous loop every 10 seconds. The system collected traffic data,
determined the optimal green light configuration using the DQN model, and applied the decision to the traffic
environment.
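The collect, decide and apply cycle described above can be sketched as a small loop. This is not the authors' main.py: the function names are hypothetical, the three steps are passed in as plain callables so that the sketch runs without PTV VISSIM, and the loop is bounded by a step count where the real system runs continuously.

```python
import time


def run_control_loop(collect, decide, apply, steps, interval_s=10.0, sleep=time.sleep):
    """Repeat collect -> decide -> apply, pausing interval_s seconds between cycles."""
    history = []
    for _ in range(steps):
        observation = collect()       # e.g. read vehicle counts from the cameras
        action = decide(observation)  # e.g. ask the trained model for a green phase
        apply(action)                 # e.g. push the phase to the signal controller
        history.append((observation, action))
        sleep(interval_s)
    return history


# Stand-ins for the real modules (assumptions for illustration only).
def fake_collect():
    return {"north_south": 12, "east_west": 4}


def fake_decide(observation):
    return max(observation, key=observation.get)


applied = []
history = run_control_loop(
    fake_collect, fake_decide, applied.append, steps=3, sleep=lambda seconds: None
)
print(applied)  # ['north_south', 'north_south', 'north_south']
```

Injecting the sleep function keeps the 10-second interval configurable and lets the loop be exercised instantly in tests.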
Using PTV VISSIM’s Component Object Model (COM) API, traffic cameras collected data on vehicle speeds,
volumes, and lane occupancy within a fifty-metre capture radius. This data was logged into CSV files and
served as the system's primary input for DQN training and inference.
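A minimal sketch of such CSV logging with Python's standard csv module follows. The column names and the sample record are hypothetical, since the paper does not specify its file layout, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Assumed column layout; the paper does not list the logged fields by name.
FIELDS = ["timestamp_s", "camera_id", "vehicle_count", "mean_speed_kmh", "occupancy"]


def write_records(stream, records):
    """Write a header row followed by one CSV row per traffic record."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def read_records(stream):
    """Read the rows back as dictionaries; all values are returned as strings."""
    return list(csv.DictReader(stream))


buffer = io.StringIO()  # stands in for open("traffic_log.csv", "w", newline="")
write_records(buffer, [{
    "timestamp_s": 10, "camera_id": "cam_north",
    "vehicle_count": 7, "mean_speed_kmh": 23.5, "occupancy": 0.4,
}])
buffer.seek(0)
rows = read_records(buffer)
print(rows[0]["vehicle_count"])  # 7
```

Because the csv module returns strings, numeric columns need converting (for example with int or float) before they are fed to the model.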
DQN model training: The Deep Q-Network was implemented using TensorFlow v2.18.0 in Python. It was
trained on a dataset of 64 episodes covering many traffic flow conditions. Each episode corresponded to
different lane combinations and signal scenarios. Training aimed to reduce queue lengths and waiting times by
learning the optimal signal policies based on observed states.
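DQN training commonly balances exploration and exploitation with an epsilon-greedy rule whose exploration rate decays over the episodes. The paper does not state which exploration strategy was used, so the sketch below is an assumption; only the figure of 64 episodes comes from the text, and the start and end rates are illustrative.

```python
import random


def epsilon_for_episode(episode, total_episodes=64, start=1.0, end=0.05):
    """Linearly anneal the exploration rate from start to end across training."""
    if total_episodes <= 1:
        return end
    clamped = min(max(episode, 0), total_episodes - 1)
    fraction = clamped / (total_episodes - 1)
    return start + fraction * (end - start)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)
# With epsilon set to 0 the choice is purely greedy: index 1 has the highest Q-value.
action = select_action([-4.0, -1.5, -9.0], epsilon=0.0, rng=rng)
print(action)  # 1
```

Early episodes therefore try signal phases almost at random, while later episodes mostly follow the policy learned so far.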
Data processing and decision making: Once trained, the model processed incoming traffic data in real time. It
compared vehicle volumes across opposing lanes and selected the most congested paths to receive a green
signal. Decision rules were based on learned policies that maximise throughput and minimise delay.
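The volume comparison can be illustrated with a hand-written rule that sums the vehicles served by each signal phase and picks the busiest one. This is a simplified stand-in rather than the trained model, which selects phases from learned Q-values; the two-phase layout and the approach names are assumptions.

```python
# Assumed two-phase layout: each phase gives green to a pair of opposing approaches.
PHASES = {
    "north_south": ("north", "south"),
    "east_west": ("east", "west"),
}


def busiest_phase(volumes, phases=PHASES):
    """Return the phase serving the most vehicles, plus the per-phase totals."""
    totals = {
        name: sum(volumes.get(approach, 0) for approach in approaches)
        for name, approaches in phases.items()
    }
    return max(totals, key=totals.get), totals


# Hypothetical vehicle counts per approach at one decision step.
phase, totals = busiest_phase({"north": 9, "south": 6, "east": 3, "west": 2})
print(phase, totals)  # north_south {'north_south': 15, 'east_west': 5}
```

A learned policy can depart from this greedy rule, for example by holding a green phase slightly longer when switching would cost more delay than it saves.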
Signal optimisation and validation: Simulation outputs were validated by visual inspection of traffic flow
progress at the selected intersections. In scenarios with high traffic density, green light sequences were
allocated to lanes with heavier volumes, ensuring smoother flow and avoiding traffic accumulation.
C. Tools and Technologies Used for System Development
Component Object Model (COM) API: The COM API enabled Python-based interaction with PTV VISSIM, allowing
real-time data extraction and signal control.
Visual Studio Code v1.97.2: Used as the integrated development environment (IDE) for Python and TensorFlow
programming. It offered debugging, version control, and code management features.
Deep Q-Network (DQN) algorithm: Combines Q-learning with deep neural networks. It enables the system to
process traffic data, learn optimal traffic light policies, and adjust signal timings dynamically.
PTV VISSIM 2025 (SP05 Student): A leading microscopic simulation tool used to model urban traffic and
interface with external control systems.
Python v3.11.8: Used for algorithm implementation, simulation control, data processing, and model training
due to its simplicity and robust libraries.
TensorFlow v2.18.0: A framework that provides a scalable, GPU-accelerated environment to build, train, and
deploy the Deep Q-Network model effectively.
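For reference, the combination of Q-learning with a neural network can be written in its standard textbook form. Here α is the learning rate, γ the discount factor, θ the network weights and θ⁻ the weights of a separate target network. The target network is part of the standard DQN formulation and is an assumption here, since the paper does not state whether one was used.

```latex
% Tabular Q-learning update
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

% DQN replaces the table with a network Q(s, a; \theta) and minimises the squared TD error
L(\theta) = \mathbb{E}_{(s_t, a_t, r_t, s_{t+1})}
  \left[ \left( r_t + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^{-}) - Q(s_t, a_t; \theta) \right)^{2} \right]
```

In the traffic setting, the state s_t would hold the observed traffic conditions, the action a_t the chosen signal phase, and the reward r_t a congestion measure such as negative queue length.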
RESULTS
The Traffic Signal Light Optimisation System was tested in a simulated environment and, within that
simulation, it relieved congestion at the controlled intersections.
Figure 3: Traffic simulation at the intersection of Ninth Avenue and Fort Street
Figure 4: Traffic simulation at the intersection of Ninth Avenue and Herbert Chitepo Street
Figure 3 and Figure 4 show the optimisation of traffic signal lights at controlled intersections. The controlled
intersections show signs of intelligence by shortening green light sequences for traffic lights facing lanes with
little or no traffic whilst giving priority to congested lanes. Figure 4 shows lanes with heavier
traffic being given precedence at the intersection over comparison lanes holding fewer than four vehicles. The
Deep Q-Network model uses multiple layers of neurons to extract patterns from traffic flow, vehicle density,
and signal timing. It then learns an optimal traffic signal control policy by mapping observed states (traffic
conditions) to the best possible actions (signal changes) to minimise congestion. This deep learning process is
used in decision-making, helping the Deep Q-Network agent predict and select the best traffic light
adjustments based on real-time data from the PTV VISSIM simulation. It then sends signal timing
instructions to the traffic signals, which minimises congestion at intersections, and the vehicles respond to the
signal changes as expected.
The simulation shows that the developed system effectively prioritised congested lanes, demonstrating
improved traffic optimisation. A real-time traffic signal optimisation system using the Deep Q-Network
algorithm and camera data was designed and implemented. It collects real-time traffic scenarios and
processes them with the trained Deep Q-Network model to optimise traffic flow and reduce congestion at
busy intersections. Figure 5 compares the developed system with existing systems.
Figure 5: Comparison with similar systems
CONCLUSION AND FUTURE WORK
To maintain relevance and scale impact, several enhancements are envisioned for future versions of the system.
OpenCV can be used for real-time traffic analysis, detecting vehicle density
and movement from camera feeds. This data can improve the Deep Q-Network model’s decision-making,
leading to better traffic flow optimisation. Efforts will also focus on real-time deployment, which is
essential for capturing the unpredictable complexities inherent in real-world scenarios, such as
dynamic pedestrian movement and fluctuating traffic volumes. It is vital that the model undergoes rigorous
testing with real-world traffic data obtained from actual surveillance systems. This approach will
capture the full spectrum of complexities, including variations in lighting, diverse weather conditions, and
unpredictable driver behaviour. Furthermore, pilot testing conducted in a controlled real-world environment
will be instrumental in collecting real-time operational data and gaining valuable insights into the model’s
performance stability and its readiness for wider, large-scale implementation.
REFERENCES
1. Liu, B., Liu, X., Chen, C., Huang, J., & Ding, Z. (2023). Decentralized Multi-Agent
Reinforcement Learning for Traffic Signal Control. In 2023 42nd Chinese Control Conference
(CCC) (pp. 6045-6050). IEEE.
2. Liu, J., Qin, S., Su, M., Luo, Y., Wang, Y., & Yang, S. (2023). Multiple Intersections Traffic Signal
Control Based on Cooperative Multi-agent Reinforcement Learning. Information Sciences, 647,
119484.
3. Lu, J., Li, B., Li, H., & Al-Barakani, A. (2021). Expansion of City Scale, Traffic Modes, Traffic
Congestion, and Air Pollution. Cities, 108, 102974.
4. Kolat, M., Kővári, B., Bécsi, T., & Aradi, S. (2023). Multi-agent Reinforcement Learning for
Traffic Signal Control: A Cooperative Approach. Sustainability, 15(4), 3479.
5. Qi, F., He, R., Yan, L., Yao, J., Wang, P., & Zhao, X. (2022, August). Traffic Signal Control with
Deep Q-Learning Network (DQN) Algorithm at Isolated Intersection. In 2022 34th Chinese
Control and Decision Conference (CCDC) (pp. 616-621). IEEE.
6. Jiang, S., Huang, Y., Jafari, M., & Jalayer, M. (2021). A Distributed Multi-agent Reinforcement
Learning with Graph Decomposition Approach for Large-scale Adaptive Traffic Signal
Control. IEEE Transactions on Intelligent Transportation Systems, 23(9), 14689-14701.
7. Cornell University, 2019. Diagnosing Reinforcement Learning for Traffic Signal Control. [Online]
Available at: https://arxiv.org/abs/1905.04716
[Accessed 24 September 2024].
8. Ge, H., Gao, D., Sun, L., Hou, Y., Yu, C., Wang, Y., & Tan, G. (2021). Multi-agent Transfer
Reinforcement Learning with Multi-view Encoder for Adaptive Traffic Signal Control. IEEE
Transactions on Intelligent Transportation Systems, 23(8), 12572-12587.
9. Jiang, X., Zhang, J. and Wang, B. (2022) ‘Energy-Efficient Driving for Adaptive Traffic Signal
Control Environment via Explainable Reinforcement Learning’, Applied Sciences (Switzerland),
12(11). Available at: https://doi.org/10.3390/app12115380.
10. Kim, D. and Jeong, O. (2020) ‘Cooperative Traffic Signal Control with Traffic Flow Prediction in
Multi-intersection’, Sensors (Switzerland), 20(1). Available at: https://doi.org/10.3390/s20010137.
11. Munuhwa, S., 2020. Approaches for Reducing Urban Traffic Congestion in the City of Harare.
[Online] Available at:
https://www.researchgate.net/publication/361860503_Approaches_for_Reducing_Urban_Traffic_Congestion_in_the_City_of_Harare
[Accessed 11 October 2024].
12. Papageorgiou, M., Diakaki, C., Dinopoulou, V., Kotsialos, A., & Wang, Y. (2003). Review of Road
Traffic Control Strategies. Proceedings of the IEEE, 91(12), 2043-2067.
13. Ma, J., Li, C., Hong, L., Wei, K., Zhao, S., Jiang, H., & Qu, Y. (2025). Vision-based attention deep
q-network with prior-based knowledge. Applied Intelligence, 55(6), 565.
14. Zheng, G. et al. (2019) ‘Diagnosing Reinforcement Learning for Traffic Signal Control’. Available
at: http://arxiv.org/abs/1905.04716.
15. Spatharis, C. and Blekas, K. (2024) ‘Multiagent Reinforcement Learning for Autonomous Driving
in Traffic Zones with Unsignalized Intersections’, Journal of Intelligent Transportation Systems:
Technology, Planning, and Operations, 28(1), pp. 103-119. Available at:
https://doi.org/10.1080/15472450.2022.2109416.
16. Wang, T., Cao, J. and Hussain, A. (2021) ‘Adaptive Traffic Signal Control for Large-scale Scenario
with Cooperative Group-based Multi-agent Reinforcement Learning’, Transportation Research
Part C: Emerging Technologies, 125, p. 103046. Available at:
https://doi.org/10.1016/J.TRC.2021.103046.
17. Haimerl, M., Colley, M. and Riener, A. (2022) ‘Evaluation of Common External Communication
Concepts of Automated Vehicles for People with Intellectual Disabilities’, Proceedings of the
ACM on Human-Computer Interaction, 6(MHCI). Available at: https://doi.org/10.1145/3546717.
18. Cui, H. et al. (2020) ‘Convolutional neural network for recognizing highway traffic congestion’,
Journal of Intelligent Transportation Systems: Technology, Planning, and Operations. Taylor and
Francis Inc., pp. 279-289. Available at: https://doi.org/10.1080/15472450.2020.1742121.