INTERNATIONAL JOURNAL OF RESEARCH AND INNOVATION IN APPLIED SCIENCE (IJRIAS)

ISSN No. 2454-6194 | DOI: 10.51584/IJRIAS |Volume X Issue IX September 2025

Page 64

www.rsisinternational.org

Real-Time Traffic Signal Optimisation Using Deep Q-Network Algorithm and Camera Data

Samkeliso Suku Dube, Presley Nyama, Tinahe Peswa Dube, Admire Bhuru

National University of Science and Technology, Bulawayo, Zimbabwe

DOI: https://doi.org/10.51584/IJRIAS.2025.100900006

Received: 26 July 2025; Accepted: 01 Aug 2025; Published: 10 October 2025

ABSTRACT

Traffic congestion has become a persistent problem in the urban areas of developing countries. It is largely caused by fixed-timing traffic signals, which cannot adapt to changing traffic conditions in real time. This research introduces a reinforcement learning-based solution that uses a Deep Q-Network algorithm to optimise traffic signal control, with the aim of reducing congestion and improving traffic flow efficiency. The system is developed in a virtual environment using PTV VISSIM simulation software, and real-time traffic data is collected using simulated cameras. The collected traffic data is then processed by a Deep Q-Network algorithm implemented in Python and TensorFlow. By making traffic signal timings adaptive, the system reduces waiting times at intersections and improves traffic flow in simulation when compared with traditional fixed-timing systems. The system offers a scalable and promising framework for adaptive traffic management on urban roads.

Keywords: Traffic Signal Optimisation, Deep Q-Network, Reinforcement Learning, Urban Mobility, Real-Time Simulation, PTV VISSIM.

INTRODUCTION

Traffic congestion presents a growing challenge across many cities in developing countries, slowing economic growth, reducing quality of life and contributing to environmental degradation through vehicle emissions (Lu et al., 2021). Traditional fixed-timing traffic signal systems fail to account for changing traffic patterns, resulting in prolonged delays and inefficient intersection throughput. This research uses reinforcement learning, specifically the Deep Q-Network (DQN) algorithm, to optimise traffic lights in real time using data collected by cameras. The approach is implemented within a simulated environment built in PTV VISSIM and is trained on real-time data collected from simulated cameras.

A. Background of Study

Road infrastructure in developing countries, especially in urban areas, faces significant challenges due to growing vehicle ownership, underdeveloped public transport systems and poor traffic signal management. The current fixed-timing traffic signals do not adapt to changing traffic volumes, which results in frequent congestion, especially during peak hours (Munuhwa, 2020). With rising economic activity and urban expansion, there is a growing need for efficient traffic management solutions. Since advanced technologies such as machine learning and smart city innovations are gaining momentum globally, developing countries have an opportunity to advance into the field of intelligent traffic systems (Papageorgiou et al., 2003).

Agile methods and other flexible solutions, such as Deep Q-Network (DQN) based traffic signal control, offer great promise for improving the efficiency of traffic signal management systems. The Deep Q-Network algorithm is a form of reinforcement learning (RL). It adapts traffic signal timings to real-time data inputs, improving traffic flow efficiency by reacting to traffic scenarios as they develop (Qi et al., 2022). Real-time traffic data is collected using virtual cameras in the simulation environment, which capture key metrics such as vehicle counts and traffic density (Cornell University, 2019).

With the increasing availability of comprehensive traffic datasets and the growing use and refinement of deep reinforcement learning techniques, developing countries can utilise reinforcement learning (RL) for traffic signal control. A key question in applying RL to traffic signal control is how to define the reward and the state. The ultimate objective in traffic signal control is to reduce travel time, which is difficult to optimise directly (Jiang et al., 2021). Existing studies often define the reward as an ad hoc weighted linear combination of several traffic measures. However, there is no guarantee that travel time will be optimised by such a reward. Recent RL approaches use more complicated state representations (e.g., images) in order to describe the full traffic state. None of the existing studies has discussed whether such a complex state representation is necessary (Jiang et al., 2022).
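The weighted linear reward mentioned above can be sketched as follows. This is a minimal illustration only: the measures chosen (queue length, waiting time, throughput), the `ApproachMeasures` type and the weight values are assumptions made for the example and are not taken from this study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ApproachMeasures:
    """Hypothetical per-approach traffic measures for one control step."""
    queue_length: int      # vehicles waiting before the stop line
    waiting_time_s: float  # accumulated waiting time in seconds
    throughput: int        # vehicles that cleared the intersection


def weighted_reward(approaches: Iterable[ApproachMeasures],
                    w_queue: float = 1.0,
                    w_wait: float = 0.25,
                    w_throughput: float = 1.0) -> float:
    """Ad hoc weighted linear reward: penalise queues and waiting, reward throughput.

    The weights are illustrative assumptions; nothing guarantees that
    maximising this quantity also minimises travel time.
    """
    approaches = list(approaches)
    total_queue = sum(a.queue_length for a in approaches)
    total_wait = sum(a.waiting_time_s for a in approaches)
    total_throughput = sum(a.throughput for a in approaches)
    return -(w_queue * total_queue + w_wait * total_wait) + w_throughput * total_throughput
```

Because the weights are free parameters, two equally plausible settings can rank the same pair of signal plans differently, which is the concern raised in the paragraph above.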

This research explores and addresses these challenges, aiming to provide a scalable solution to improve urban

traffic management and reduce congestion.

Related Work

The traffic signal control sector has gone through significant transformations over the past two decades, driven primarily by technological advancements that respond to growing vehicle ownership and to demand for convenience, real-time updates and integrated services. Some studies cover traditional methods that use pre-timed and actuated signals (Haimerl, FHWA, and Haddad). These methods are limited in adaptability and in their capacity for integration. Multi-agent reinforcement learning shows potential for solving heavy traffic problems. It adopts centralised or distributed strategies but suffers from scalability issues (Kim & Jeong, 2020; Kolat et al., 2023; Ge et al., 2022). DRQN models combine Long Short-Term Memory (LSTM) with a Deep Q-Network for temporal awareness (Ma et al., 2025) but require more data and add training complexity. However, in some developing countries these innovations are not yet fully leveraged.

Spatharis & Blekas (2024) introduced a concept whereby traffic lights at individual intersections are treated as autonomous agents. The agents collaborate in managing signal control, taking into account real-time traffic conditions and optimising their actions to alleviate congestion. This approach mimics a decentralised decision-making system, where each traffic light adapts its timing based on local traffic dynamics.

A multi-element traffic light system featuring blue, yellow and red displays was proposed with the aim of improving communication between traffic lights and drivers, making signal intentions more intuitive. Genetic algorithms (GA) were used to adaptively adjust and efficiently control these multi-element traffic lights. This approach highlighted the importance of human-computer interaction and user-centric design in traffic control (Haimerl et al., 2022).

Zheng et al. (2019) proposed a deep recurrent Q-network (DRQN) technique that combines a recurrent neural network (RNN) with a deep Q-network (DQN) to learn various traffic environments. The DRQN minimises the total number of vehicles waiting before the stop line (Liu et al., 2023; Zheng et al., 2019). The proposed model defined the queue length parameter in the same sense as the travel time of the vehicles.

Studies on reinforcement learning, in particular Deep Q-Networks, have shown that artificial intelligence (AI) driven systems can outperform traditional methods by dynamically adjusting to traffic conditions (Qi et al., 2022).

Research Gap

Existing frameworks and traffic control systems are relevant, but they are mainly applicable to developed infrastructures (Cui et al., 2020). There is a lack of camera data to enrich the environment used for training traffic systems to adjust dynamically to changing scenarios (Wang et al., 2021). Traffic lights should therefore also be used for collecting data on vehicle counts, waiting times and traffic flow at intersections.

METHODOLOGY

The simulation research methodology guided the iterative design, analysis, and validation of the system model,

enabling a controlled and repeatable evaluation of urban traffic scenarios using a virtual environment. This

methodology facilitated the systematic modelling, simulation, and refinement of the Deep Q-Network-based

traffic signal control system, ensuring that it effectively addresses the challenges of congestion at intersections

while contributing to the broader domain of intelligent transportation systems (see Figure 1).

Figure 1: Simulation Research Methodology Steps for the Research.

For the technical implementation, the agile Kanban methodology, a lightweight, visual project management

approach, was adapted to support flexible, efficient, and iterative software development. Kanban enables a

clear visualisation of the progress of tasks, continuous delivery of requirements and adaptation to changing

requirements throughout the development of the system.

By aligning simulation-based research objectives with modular software development goals, the combined use

of simulation research methodology and agile Kanban ensured that the system was not only experimentally

validated but also practically implementable. This dual approach ensured that the resulting platform is robust,

adaptable and capable of supporting real-time traffic signal optimisation in rapidly changing environments such as cities.

Ethical Considerations

To maintain the integrity of the research and ensure compliance with ethical standards, this study was conducted with a strong emphasis on confidentiality and data protection. All information gathered from datasets, free sources and stakeholders such as city officials, transport authorities and road users was securely stored to protect the privacy of the individuals involved. Participants who contributed feedback during prototype demonstrations and usability testing were fully informed of the research objectives, and their explicit consent was obtained prior to their involvement. This ensured they were aware of their rights and the voluntary nature of their participation. Additionally, in line with data protection regulations, special care was taken to avoid the collection and misuse of any sensitive or personal information. Although the system uses simulated data in a virtual environment, future implementations involving real-time camera data will adhere to ethical standards regarding surveillance, public safety and individual privacy.

System Architecture

Figure 2: System Architecture for Real-Time Traffic Signal Optimisation System.

The system architecture shown in Figure 2 consists of three layers: the local server layer (the workstation), the Deep Q-Network model layer and the traffic environment layer, which is illustrated as a simulated environment. The HIK Vision traffic cameras send traffic data in image form to the model. The Deep Q-Network model is Python-based and runs on a Google Coral Dev Board mini-computer, which is responsible for all traffic data processing and sends traffic signal timings to the traffic lights at controlled intersections. The server provides resources such as power, additional storage and cooling to the Google Coral Dev Board. The local server grants permissions to the PTV VISSIM simulation application, in which the traffic cameras and traffic environment are simulated, and provides it with the resources to run this environment. The simulation layer provides collected traffic data to the server through the COM (Component Object Model) API and to the model through API analytics. The model receives the traffic data collected in the simulation environment, analyses it and makes decisions based on the Deep Q-Network algorithm. It also signals the local server whenever special resources are needed to run the desired traffic optimisation. This architectural design promotes easier debugging and consistent data flow.

Implementation

The development of the system followed an object-oriented design approach, facilitating the modular and

iterative implementation of system components. This enabled flexible development and integration of

simulation, data processing, and optimisation modules.

A. System Components and Integration

Three key intersections within the Central Business District of Bulawayo, Zimbabwe (a developing country), were modelled using the map functionality in PTV VISSIM. These were the intersections of Ninth Avenue and Fort Street, Ninth Avenue and Herbert Chitepo Street, and the uncontrolled intersection at Ninth Avenue and Joshua Mqabuko Nkomo Street. These locations were chosen due to their high congestion levels and varying traffic light control schemes.

Simulated traffic environment: The traffic simulation environment was constructed to mirror the real-world

geometry and behaviour of vehicles, intersections, and traffic signals. The simulation was configured to

observe naturalistic vehicle interactions at intersections, highlighting traffic build-up and delays as would

occur in physical urban settings.

Traffic signal and camera setup: At each intersection, simulated Econolite Cobalt Series traffic lights were

deployed. Additionally, four simulated HIK Vision traffic cameras were placed per junction, each covering one

approach to capture vehicle counts and movements. These cameras facilitated real-time vehicle detection via

the COM API, enabling data-driven optimisation.
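One way the per-approach camera counts could be turned into a fixed-length model input is sketched below. The approach labels, the two-phase signal layout, the normalisation cap of 30 vehicles and the `build_state` helper are all hypothetical choices made for illustration; the paper does not specify its state encoding.

```python
from typing import Dict, List

# Assumed labels for the four camera-covered approaches of one junction.
APPROACHES = ("north", "east", "south", "west")
NUM_PHASES = 2  # assumption: one north-south phase and one east-west phase


def build_state(counts: Dict[str, int], current_phase: int,
                max_count: int = 30) -> List[float]:
    """Encode camera counts and the active phase as a flat feature vector.

    Counts are clipped to max_count and scaled to [0, 1]; the active phase
    is appended as a one-hot vector. Missing approaches count as zero.
    """
    scaled = [min(counts.get(name, 0), max_count) / max_count for name in APPROACHES]
    phase_one_hot = [1.0 if p == current_phase else 0.0 for p in range(NUM_PHASES)]
    return scaled + phase_one_hot
```

A compact vector of this kind is the simpler alternative to the image-based state representations discussed in the background section.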

Vehicle modelling: The simulation included diverse vehicle models to replicate realistic traffic scenarios.

Vehicles were assigned stochastic behaviours across lanes and directions, reflecting real-world traffic

dynamics.

B. Functional Implementation

The core Python script, main.py, controlled system behaviour. It imported modules for traffic analysis and

signal control, calling functions in a continuous loop every 10 seconds. The system collected traffic data,

determined the optimal green light configuration using the DQN model, and applied the decision to the traffic

environment.
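The collect, decide and apply cycle described above might be organised as in the following sketch. The function name `run_control_loop` and its callable parameters are hypothetical stand-ins for the traffic analysis and signal control modules imported by main.py, whose actual interfaces are not given in the paper.

```python
import time
from typing import Callable, Dict, Optional


def run_control_loop(collect: Callable[[], Dict[str, int]],
                     choose_phase: Callable[[Dict[str, int]], str],
                     apply_phase: Callable[[str], None],
                     interval_s: float = 10.0,
                     max_steps: Optional[int] = None,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Repeat collect -> decide -> apply every interval_s seconds.

    Runs until max_steps iterations have completed (forever if None)
    and returns the number of completed iterations.
    """
    steps = 0
    while max_steps is None or steps < max_steps:
        observation = collect()            # e.g. vehicle counts per approach
        phase = choose_phase(observation)  # e.g. the model's greedy action
        apply_phase(phase)                 # push the decision to the simulator
        steps += 1
        sleep(interval_s)
    return steps
```

Passing the three stages in as callables keeps the loop independent of the simulator, so the same loop can be exercised with stub functions before it is connected to PTV VISSIM.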

Using PTV VISSIM’s Component Object Model (COM) API, the simulated traffic cameras collected data on vehicle speeds, volumes and lane occupancy within a fifty-metre capture radius. This data was logged to CSV files and served as the system's primary input for DQN training and inference.
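Logging such observations to CSV could look like the sketch below, which uses only Python's standard `csv` module. The column names and the `log_observations` helper are assumptions for illustration; the paper does not list the actual file layout.

```python
import csv
from typing import Iterable, Mapping, TextIO

# Hypothetical column layout for one camera observation.
FIELDNAMES = ["sim_time_s", "approach", "vehicle_count", "mean_speed_kmh", "occupancy"]


def log_observations(stream: TextIO,
                     rows: Iterable[Mapping[str, object]],
                     write_header: bool = True) -> int:
    """Write observation rows to an open text stream and return the row count.

    When writing to a real file, open it with newline="" as the csv
    module documentation recommends.
    """
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    if write_header:
        writer.writeheader()
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written
```

Writing to a stream rather than a path lets the same function append to a log file during simulation or write to an in-memory buffer during testing.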

DQN model training: The Deep Q-Network was implemented using TensorFlow v2.18.0 in Python. It was

trained on a dataset of 64 episodes covering many traffic flow conditions. Each episode corresponded to

different lane combinations and signal scenarios. Training aimed to reduce queue lengths and waiting times by

learning the optimal signal policies based on observed states.
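Two building blocks commonly used when training a DQN agent are an experience replay buffer and epsilon-greedy action selection. The sketch below shows both in plain Python. It is a generic illustration, not the authors' TensorFlow implementation, and the class name, function name, buffer capacity and parameters are assumptions.

```python
import random
from collections import deque
from typing import List, Sequence, Tuple

Transition = Tuple[Sequence[float], int, float, Sequence[float], bool]


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity: int = 10_000) -> None:
        # A bounded deque silently discards the oldest transition when full.
        self._items: deque = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        """Return up to batch_size transitions drawn without replacement."""
        k = min(batch_size, len(self._items))
        return random.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def epsilon_greedy(q_values: Sequence[float], epsilon: float) -> int:
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

Sampling past transitions at random reduces the correlation between consecutive training examples, while a decaying epsilon shifts the agent from exploring signal plans to exploiting the ones it has learned.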

Data processing and decision making: Once trained, the model processed incoming traffic data in real time. It

compared vehicle volumes across opposing lanes and selected the highest congestion paths to receive a green

signal. Decision rules were based on learned policies that maximise throughput and minimise delay.
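The behaviour described above, where the more congested pair of opposing lanes receives the green signal, can be approximated by a simple hand-written rule. The sketch below is such a heuristic baseline and is not the learned policy itself. The phase names, approach labels and the 10-second minimum green time are assumptions.

```python
from typing import Dict

# Assumed two-phase layout: each phase serves a pair of opposing approaches.
PHASES = {
    "NS_GREEN": ("north", "south"),
    "EW_GREEN": ("east", "west"),
}


def heaviest_phase(counts: Dict[str, int], current_phase: str,
                   elapsed_green_s: float, min_green_s: float = 10.0) -> str:
    """Give green to the phase whose approaches hold the most vehicles.

    The current phase is kept until it has run for min_green_s, and also
    on ties, to avoid switching signals unnecessarily.
    """
    if elapsed_green_s < min_green_s:
        return current_phase
    demand = {
        phase: sum(counts.get(approach, 0) for approach in approaches)
        for phase, approaches in PHASES.items()
    }
    best = max(demand, key=demand.get)
    if demand[best] == demand[current_phase]:
        return current_phase
    return best
```

A rule of this kind gives a reference point: a trained agent is only worthwhile if it matches or improves on what direct volume comparison already achieves.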

Signal optimisation and validation: Simulation outputs were validated by visual inspection of traffic flow progression at the selected intersections. In scenarios with high traffic density, green light sequences were allocated to lanes with heavier volumes, ensuring smoother flow and avoiding traffic accumulation.

C. Tools and Technologies Used for System Development

The Component Object Model (COM) API enabled Python-based interaction with PTV VISSIM, allowing for real-time data extraction and signal control. Visual Studio Code v1.97.2 was used as the integrated development environment (IDE) for Python and TensorFlow programming. It offered debugging, version control and code management features. The Deep Q-Network (DQN) algorithm combines Q-learning with deep neural networks. It enables the system to process traffic data, learn optimal traffic light policies and adjust signal timings dynamically. PTV VISSIM 2025 (SP05 – Student) is a leading microscopic simulation tool used to model urban traffic and interface with external control systems. Python v3.11.8 was used for algorithm implementation, simulation control, data processing and model training due to its simplicity and robust libraries. TensorFlow v2.18.0 is a framework that provides a scalable, GPU-accelerated environment in which to build, train and deploy the Deep Q-Network model.
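For reference, the way a DQN combines Q-learning with a neural network is usually written as a regression towards a bootstrapped target. The formulation below is the standard one from the general DQN literature. The symbols are introduced here for illustration, and the paper does not state whether a separate target network with parameters $\theta^{-}$ was used.

```latex
\begin{aligned}
y_t &= r_t + \gamma \max_{a'} Q\left(s_{t+1}, a'; \theta^{-}\right) \\
L(\theta) &= \mathbb{E}\left[\left(y_t - Q\left(s_t, a_t; \theta\right)\right)^{2}\right]
\end{aligned}
```

Here $s_t$ is the observed traffic state, $a_t$ the chosen signal action, $r_t$ the reward, $\gamma \in [0, 1)$ the discount factor, and $Q(\cdot\,; \theta)$ the network's estimate of the action value. For a terminal transition the target reduces to $y_t = r_t$.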

RESULTS

The Traffic Signal Light Optimisation System was tested in a simulated environment, where it successfully addressed traffic congestion at the modelled intersections.

Figure 3: Traffic simulation on intersection of Ninth Avenue and Fort Street

Figure 4: Traffic simulation on Ninth Avenue and Herbert Chitepo Street

Figure 3 and Figure 4 show the optimisation of traffic signal lights at controlled intersections. The controlled intersections show signs of intelligence by shortening green light sequences for traffic lights facing lanes with little or no traffic while giving priority to congested lanes. Figure 4 shows lanes with heavier traffic being given precedence at the intersection over lanes with fewer than four vehicles. The

Deep Q-Network model uses multiple layers of neurons to extract patterns from traffic flow, vehicle density,

and signal timing. It then learns an optimal traffic signal control policy by mapping observed states (traffic

conditions) to the best possible actions (signal changes) to minimise congestion. This deep learning process is

used in decision-making, helping the Deep Q-Network agent predict and select the best traffic light

adjustments based on real-time data from the PTV VISSIM simulation. It then sends signal timing instructions to the traffic lights, which minimises congestion at intersections, and the vehicles respond to the signal changes as expected.

The simulation shows that the developed system effectively prioritised congested lanes, demonstrating improved traffic optimisation. In summary, a real-time traffic signal optimisation system was designed and implemented. It uses camera data to capture real-time traffic scenarios and a trained Deep Q-Network model to process them, optimising traffic flow at congested intersections in order to reduce congestion. Figure 5 compares the developed system with other existing systems.

Figure 5: Comparison with similar systems

CONCLUSION AND FUTURE WORK

To maintain relevance and scale impact, several enhancements are envisioned for future versions of the system. OpenCV can be used for real-time traffic analysis, detecting vehicle density and movement from camera feeds. This data can improve the Deep Q-Network model’s decision-making, leading to better traffic flow optimisation. Efforts will also be focused on real-time deployment, which is essential for accurately reflecting the unpredictable complexities inherent in real-world scenarios, such as dynamic pedestrian movement and fluctuating traffic volumes. It is vital that the model undergoes rigorous testing with real-world traffic data obtained from actual surveillance systems. This approach will allow the full spectrum of complexities to be captured, including variations in lighting, diverse weather conditions and unpredictable driver behaviour. Furthermore, pilot testing conducted in a controlled real-world environment will be instrumental in collecting real-time operational data and gaining valuable insights into the model’s performance stability and its readiness for wider, large-scale implementation.

REFERENCES

1. Liu, B., Liu, X., Chen, C., Huang, J., & Ding, Z. (2023). Decentralized Multi-Agent

Reinforcement Learning for Traffic Signal Control. In 2023 42nd Chinese Control Conference

(CCC) (pp. 6045-6050). IEEE.

2. Liu, J., Qin, S., Su, M., Luo, Y., Wang, Y., & Yang, S. (2023). Multiple Intersections Traffic Signal

Control Based on Cooperative Multi-agent Reinforcement Learning. Information Sciences, 647,

119484.

3. Lu, J., Li, B., Li, H., & Al-Barakani, A. (2021). Expansion of City Scale, Traffic Modes, Traffic

Congestion, and Air Pollution. Cities, 108, 102974.

4. Kolat, M., Kővári, B., Bécsi, T., & Aradi, S. (2023). Multi-agent Reinforcement Learning for

Traffic Signal Control: A Cooperative Approach. Sustainability, 15(4), 3479.

5. Qi, F., He, R., Yan, L., Yao, J., Wang, P., & Zhao, X. (2022, August). Traffic Signal Control with

Deep Q-Learning Network (DQN) Algorithm at Isolated Intersection. In 2022 34th Chinese

Control and Decision Conference (CCDC) (pp. 616-621). IEEE.

6. Jiang, S., Huang, Y., Jafari, M., & Jalayer, M. (2021). A Distributed Multi-agent Reinforcement

Learning with Graph Decomposition Approach for Large-scale Adaptive Traffic Signal

Control. IEEE Transactions on Intelligent Transportation Systems, 23(9), 14689-14701.

7. Cornell University, 2019. Diagnosing Reinforcement Learning for Traffic Signal Control. [Online]

Available at: https://arxiv.org/abs/1905.04716

[Accessed 24 September 2024].

8. Ge, H., Gao, D., Sun, L., Hou, Y., Yu, C., Wang, Y., & Tan, G. (2021). Multi-agent Transfer

Reinforcement Learning with Multi-view Encoder for Adaptive Traffic Signal Control. IEEE

Transactions on Intelligent Transportation Systems, 23(8), 12572-12587.

9. Jiang, X., Zhang, J. and Wang, B. (2022) ‘Energy-Efficient Driving for Adaptive Traffic Signal

Control Environment via Explainable Reinforcement Learning’, Applied Sciences (Switzerland),

12(11). Available at: https://doi.org/10.3390/app12115380.

10. Kim, D. and Jeong, O. (2020) ‘Cooperative Traffic Signal Control with Traffic Flow Prediction in

Multi-intersection’, Sensors (Switzerland), 20(1). Available at: https://doi.org/10.3390/s20010137.

11. Munuhwa, S., 2020. Approaches for Reducing Urban Traffic Congestion in the City of Harare. [Online] Available at: https://www.researchgate.net/publication/361860503_Approaches_for_Reducing_Urban_Traffic_Congestion_in_the_City_of_Harare [Accessed 11 October 2024].

12. Papageorgiou, M., Diakaki, C., Dinopoulou, V., Kotsialos, A., & Wang, Y. (2003). Review of Road

Traffic Control Strategies. Proceedings of the IEEE, 91(12), 2043-2067.

13. Ma, J., Li, C., Hong, L., Wei, K., Zhao, S., Jiang, H., & Qu, Y. (2025). Vision-based attention deep

q-network with prior-based knowledge. Applied Intelligence, 55(6), 565.

14. Zheng, G. et al. (2019) ‘Diagnosing Reinforcement Learning for Traffic Signal Control’. Available

at: http://arxiv.org/abs/1905.04716.

15. Spatharis, C. and Blekas, K. (2024) ‘Multiagent Reinforcement Learning for Autonomous Driving

in Traffic Zones with Unsignalized Intersections’, Journal of Intelligent Transportation Systems:

Technology, Planning, and Operations, 28(1), pp. 103–119. Available at:

https://doi.org/10.1080/15472450.2022.2109416.

16. Wang, T., Cao, J. and Hussain, A. (2021) ‘Adaptive Traffic Signal Control for Large-scale Scenario

with Cooperative Group-based Multi-agent Reinforcement Learning’, Transportation Research

Part C: Emerging Technologies, 125, p. 103046. Available at:

https://doi.org/10.1016/J.TRC.2021.103046.

17. Haimerl, M., Colley, M. and Riener, A. (2022) ‘Evaluation of Common External Communication

Concepts of Automated Vehicles for People with Intellectual Disabilities’, Proceedings of the

ACM on Human-Computer Interaction, 6(MHCI). Available at: https://doi.org/10.1145/3546717.

18. Cui, H. et al. (2020) ‘Convolutional neural network for recognizing highway traffic congestion’,

Journal of Intelligent Transportation Systems: Technology, Planning, and Operations. Taylor and

Francis Inc., pp. 279–289. Available at: https://doi.org/10.1080/15472450.2020.1742121.