Skip to Main content Skip to Navigation

Resource allocation with observable and unobservable environments

Santiago Duran 1
1 LAAS-SARA - Équipe Services et Architectures pour Réseaux Avancés
LAAS - Laboratoire d'analyse et d'architecture des systèmes
Abstract : This thesis studies resource allocation problems in large-scale stochastic networks. We work on problems where the availability of resources is subject to time fluctuations, a situation that one may encounter, for example, in load balancing systems or in wireless downlink scheduling systems. The time fluctuations are modelled considering two types of processes, controllable processes, whose evolution depends on the action of the decision maker, and environment processes, whose evolution is exogenous. The stochastic evolution of the controllable process depends on the the current state of the environment. Depending on whether the decision maker observes the state of the environment, we say that the environment is observable or unobservable. The mathematical formulation used is the Markov Decision Processes (MDPs).The thesis follows three main research axes. In the first problem we study the optimal control of a Multi-armed restless bandit problem (MARBP) with an unobservable environment. The objective is to characterise the optimal policy for the controllable process in spite of the fact that the environment cannot be observed. We consider the large-scale asymptotic regime in which the number of bandits and the speed of the environment both tend to infinity. In our main result we establish that a set of priority policies is asymptotically optimal. We show that, in particular, this set includes Whittle index policy of a system whose parameters are averaged over the stationary behaviour of the environment. In the second problem, we consider an MARBP with an observable environment. The objective is to leverage information on the environment to derive an optimal policy for the controllable process. Assuming that the technical condition of indexability holds, we develop an algorithm to compute Whittle's index. We then apply this result to the particular case of a queue with abandonments. We prove indexability, and we provide closed-form expressions of Whittle's index. In the third problem we consider a model of a large-scale storage system, where there are files distributed across a set of nodes. Each node breaks down following a law that depends on the load it handles. Whenever a node breaks down, all the files it had are reallocated to other nodes. We study the evolution of the load of a single node in the mean-field regime, when the number of nodes and files grow large. We prove the existence of the process in the mean-field regime. We further show the convergence in distribution of the load in steady state as the average number of files per node tends to infinity.
Document type :
Complete list of metadata

Cited literature [78 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Monday, October 26, 2020 - 12:21:26 PM
Last modification on : Thursday, June 10, 2021 - 3:05:56 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02900413, version 2


Santiago Duran. Resource allocation with observable and unobservable environments. Systems and Control [cs.SY]. Université Paul Sabatier - Toulouse III, 2020. English. ⟨NNT : 2020TOU30018⟩. ⟨tel-02900413v2⟩



Record views


Files downloads