Device-to-Device communication in Narrowband Internet of Things
Narrowband-Internet of Things (NB-IoT)
transmission range of more than 3 km in urban areas and 15 km in open areas
supports IoT devices with an intended battery life expectancy of 10 years
improves spectrum efficiency and provides deep, extended coverage
integrated into LTE or Global System for Mobile Communications (GSM) networks to share spectrum and reuse the same hardware
requires one Physical Resource Block (PRB) of the LTE spectrum, i.e., 180 kHz of system bandwidth, for downlink and uplink communication
network efficiency decreases as the number of transmission repetitions increases
IoT devices usually focus on uplink transmission to upload the acquired data to the gateway/sink node or the cloud server.
paper aims
adopt D2D communication as a routing extension to upload urgent NB-IoT User Equipment (UE) data to the eNB in order to maximize the Packet Delivery Ratio (PDR) and minimize the End-to-End Delay (EED).
effective two-step Reinforcement Learning (RL)-enabled intelligent D2D (I-D2D) communication model, which treats relay selection as a Multi-Armed Bandit (MAB) problem and solves it using the Upper Confidence Bound (UCB) algorithm (see the sketch after this list)
background of NB-IoT, D2D communication, and MAB learning schemes.
proposed I-D2D algorithm effectively selects the relay with minimum overhead and uploads the UE's data to the BS/eNB with minimum delay and maximum PDR.
I-D2D communication achieves a higher PDR with minimum EED in terms of processing time.
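A minimal sketch of the UCB-based relay selection idea, not the paper's exact two-step formulation: each candidate CUE relay is treated as one arm, the reward is assumed here to be 1 for a successfully delivered packet and 0 otherwise, and the exploration constant and toy delivery probabilities are illustrative assumptions.

```python
import math
import random

class UCBRelaySelector:
    """Illustrative UCB1 selector: each candidate CUE relay is one arm."""

    def __init__(self, num_relays, c=2.0):
        self.num_relays = num_relays           # K candidate CUE relays
        self.c = c                             # exploration constant (assumed value)
        self.counts = [0] * num_relays         # times each relay was selected
        self.mean_reward = [0.0] * num_relays  # empirical mean reward per relay

    def select(self, t):
        # Try each relay once before applying the UCB index.
        for k in range(self.num_relays):
            if self.counts[k] == 0:
                return k
        # UCB index: empirical mean + exploration bonus.
        ucb = [self.mean_reward[k] + math.sqrt(self.c * math.log(t) / self.counts[k])
               for k in range(self.num_relays)]
        return max(range(self.num_relays), key=lambda k: ucb[k])

    def update(self, relay, reward):
        # Incremental update of the empirical mean for the chosen relay.
        self.counts[relay] += 1
        self.mean_reward[relay] += (reward - self.mean_reward[relay]) / self.counts[relay]

if __name__ == "__main__":
    # Toy example: 4 relays with different (hidden) delivery probabilities.
    delivery_prob = [0.4, 0.7, 0.9, 0.5]
    selector = UCBRelaySelector(num_relays=4)
    for t in range(1, 1001):
        relay = selector.select(t)
        reward = 1.0 if random.random() < delivery_prob[relay] else 0.0
        selector.update(relay, reward)
    print("estimated PDR per relay:", [round(m, 2) for m in selector.mean_reward])
```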
deployment modes of NB-IoT
Stand-alone mode.
NB-IoT can be deployed within a 200 kHz carrier of the GSM spectrum and can exploit the power of the BS, which significantly improves the coverage of the system.
Guard-band mode.
The guard-band of the LTE spectrum is utilized for NB-IoT deployment.
In-band mode.
One of the PRBs of the LTE spectrum is allocated for NB-IoT deployment. The total power of eNB is shared between LTE and NB-IoT
D2D communication
a direct communication link between nearby devices without the intervention of the cellular network.
Reinforcement learning enabled relay selection
a dynamic relay selection approach to learn which relay is more likely to be available and to provide the best PDR.
The learning process to select an optimum relay can be modeled as a Multi-Armed Bandit (MAB) system
quality of the relay changes based on its location and the channel condition (Signal-to-Interference-plus-Noise Ratio, SINR); a simple quality model is sketched below
selecting the optimum relay potentially leads to minimum delay and a reliable PDR, and is less costly in terms of system overhead and energy consumption.
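A rough sketch of how relay quality can depend on location and channel condition (SINR); the path-loss model, transmit power, and the logistic mapping from SINR to delivery probability are hypothetical values used only for illustration, not parameters from the paper.

```python
import math

def received_power_dbm(tx_power_dbm, distance_m, path_loss_exp=3.5, ref_loss_db=40.0):
    """Log-distance path loss model (all parameters are illustrative assumptions)."""
    path_loss_db = ref_loss_db + 10 * path_loss_exp * math.log10(max(distance_m, 1.0))
    return tx_power_dbm - path_loss_db

def sinr_db(signal_dbm, interference_dbm, noise_dbm=-110.0):
    """SINR = received signal power / (interference + noise), expressed in dB."""
    lin = lambda dbm: 10 ** (dbm / 10.0)
    return 10 * math.log10(lin(signal_dbm) / (lin(interference_dbm) + lin(noise_dbm)))

def delivery_probability(sinr_value_db, threshold_db=5.0, steepness=1.0):
    """Hypothetical logistic mapping from SINR to packet delivery probability."""
    return 1.0 / (1.0 + math.exp(-steepness * (sinr_value_db - threshold_db)))

if __name__ == "__main__":
    # The same CUE relay looks very different at 50 m versus 400 m from the UE.
    for distance in (50, 200, 400):
        rx = received_power_dbm(tx_power_dbm=23.0, distance_m=distance)
        s = sinr_db(rx, interference_dbm=-100.0)
        print(f"{distance:4d} m: SINR = {s:6.1f} dB, P(delivery) = {delivery_probability(s):.2f}")
```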
Multi-armed bandit framework
trial-and-error methodology
in real-time network scenarios, the network dynamics are not known in advance
RL algorithms are used to find the optimal solution in such real-time networks
RL is a type of ML in which the learner (agent) has no prior knowledge of which action to perform in order to maximize the numerical reward (i.e., to move towards the main objective)
RL has three main elements: agent, environment, and reward
State-space and action:
the player (agent) is the NB-IoT UE, and the state-space contains K states (machines) of the environment, which are the Cellular UEs (CUEs) that can be used as relay nodes to upload the NB-IoT data.
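A small sketch of this agent/environment mapping, assuming the K candidate CUE relays are the arms and the reward is 1 when the packet reaches the eNB; the availability and delivery probabilities are made-up numbers, and a UCB-based selector like the one sketched earlier would act as the agent.

```python
import random

class RelayEnvironment:
    """Environment seen by the NB-IoT UE (agent): K CUE relays act as the arms."""

    def __init__(self, availability, delivery_prob):
        assert len(availability) == len(delivery_prob)
        self.availability = availability    # probability each CUE is free to relay
        self.delivery_prob = delivery_prob  # PDR of each CUE's link towards the eNB
        self.num_relays = len(availability) # K, the size of the state-space

    def step(self, action):
        """Action = index of the chosen CUE relay; reward = 1 if the packet reaches the eNB."""
        available = random.random() < self.availability[action]
        delivered = available and random.random() < self.delivery_prob[action]
        return 1.0 if delivered else 0.0

if __name__ == "__main__":
    env = RelayEnvironment(availability=[0.9, 0.6, 0.8],
                           delivery_prob=[0.7, 0.9, 0.5])
    # Random agent as a placeholder; a learning agent would come to prefer relay 0 here.
    total = sum(env.step(random.randrange(env.num_relays)) for _ in range(1000))
    print("packets delivered out of 1000:", int(total))
```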