Abstract: This paper addresses the task offloading problem in dynamic fog computing networks (FCNs), which involves allocating tasks and resources between a set of task nodes (TNs) that have tasks to compute and a set of helper nodes (HNs) that have available computing resources. The problem is complicated by the selfish and rational behavior of these nodes: the TNs aim to minimize task completion time by offloading tasks to the HNs, while the HNs seek to maximize the revenue earned from the resources they provide for task offloading. To tackle this problem, we use the fairness and stability principles of matching theory to assign the tasks of TNs to the resources of HNs based on their mutual preferences in a decentralized manner. However, the uncertain availability of computing resources at the HNs, together with the dynamic QoS requirements of tasks, leaves the TNs without well-defined preferences over the HNs, which poses a critical challenge to obtaining a stable and reliable matching outcome. To address this challenge, we develop what is, to our knowledge, the first Thompson sampling (TS) based multi-armed bandit (MAB) learning scheme, which achieves a better exploration-exploitation trade-off and thereby allows TNs to learn informed preference relations over HNs quickly. Motivated by these considerations, this paper aims to design a bandit learning based matching (BLM) model that realizes efficient decentralized task offloading algorithms in dynamic FCNs. Extensive simulation results demonstrate the potential advantages of TS based learning over the ε-greedy and upper confidence bound (UCB) based baselines.
Index terms: Distributed Task Offloading, Dynamic Fog Computing Networks, Thompson Sampling, Multi-Armed Bandit, Bandit Learning
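To make the learning step concrete, the following is a minimal sketch of how a single TN might use Thompson sampling to learn a preference order over HNs. It is an illustration only, not the algorithm of this paper: it assumes a Beta-Bernoulli reward model in which a reward of 1 means an offloaded task met its deadline, and the names `ThompsonSamplingTN`, `select_hn`, `update`, `preference_order`, and `simulate`, as well as the success probabilities, are hypothetical.

```python
import random


class ThompsonSamplingTN:
    """One task node (TN) learning which helper node (HN) to prefer.

    Assumption: binary rewards (1 = task finished in time), modeled with a
    Beta(successes, failures) posterior per HN, starting from Beta(1, 1).
    """

    def __init__(self, num_hns, seed=None):
        self.rng = random.Random(seed)
        self.successes = [1] * num_hns  # Beta alpha parameter per HN
        self.failures = [1] * num_hns   # Beta beta parameter per HN

    def select_hn(self):
        # Draw one sample from each HN's posterior and pick the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, hn, success):
        # Update the posterior of the HN that was just tried.
        if success:
            self.successes[hn] += 1
        else:
            self.failures[hn] += 1

    def preference_order(self):
        # Rank HNs by posterior mean; this ranking could serve as the
        # TN's learned preference list in a matching procedure.
        means = [a / (a + b) for a, b in zip(self.successes, self.failures)]
        return sorted(range(len(means)), key=lambda i: means[i], reverse=True)


def simulate(success_probs, rounds=2000, seed=0):
    """Run one TN against HNs with fixed (hypothetical) success probabilities."""
    env_rng = random.Random(seed + 1)  # separate stream for the environment
    tn = ThompsonSamplingTN(len(success_probs), seed=seed)
    for _ in range(rounds):
        hn = tn.select_hn()
        tn.update(hn, env_rng.random() < success_probs[hn])
    return tn


if __name__ == "__main__":
    tn = simulate([0.2, 0.5, 0.9])
    print("Learned preference order over HNs:", tn.preference_order())
```

In this sketch the posterior sampling itself provides the exploration: HNs with few observations have wide posteriors and are occasionally sampled high, while HNs with consistently good outcomes are selected most of the time. The sketch omits the HN side of the matching and any competition among TNs.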