Skip to Main content Skip to Navigation
New interface

Resource allocation with observable and unobservable environments

Santiago Duran 1, 2 
1 LAAS-SARA - Équipe Services et Architectures pour Réseaux Avancés
LAAS - Laboratoire d'analyse et d'architecture des systèmes
2 IRIT-RMESS - Réseaux, Mobiles, Embarqués, Sans fil, Satellites
IRIT - Institut de recherche en informatique de Toulouse
Abstract : This thesis studies resource allocation problems in large-scale stochastic networks. We work on problems where the availability of resources is subject to time fluctuations, a situation that one may encounter, for example, in load balancing systems or in wireless downlink scheduling systems. The time fluctuations are modelled considering two types of processes, controllable processes, whose evolution depends on the action of the decision maker, and environment processes, whose evolution is exogenous. The stochastic evolution of the controllable process depends on the the current state of the environment. Depending on whether the decision maker observes the state of the environment, we say that the environment is observable or unobservable. The mathematical formulation used is the Markov Decision Processes (MDPs). The thesis follows three main research axes. In the first problem we study the optimal control of a Multi-armed restless bandit problem (MARBP) with an unobservable environment. The objective is to characterise the optimal policy for the controllable process in spite of the fact that the environment cannot be observed. We consider the large-scale asymptotic regime in which the number of bandits and the speed of the environment both tend to infinity. In our main result we establish that a set of priority policies is asymptotically optimal. We show that, in particular, this set includes Whittle index policy of a system whose parameters are averaged over the stationary behaviour of the environment. In the second problem, we consider an MARBP with an observable environment. The objective is to leverage information on the environment to derive an optimal policy for the controllable process. Assuming that the technical condition of indexability holds, we develop an algorithm to compute Whittle's index. We then apply this result to the particular case of a queue with abandonments. We prove indexability, and we provide closed-form expressions of Whittle's index. In the third problem we consider a model of a large-scale storage system, where there are files distributed across a set of nodes. Each node breaks down following a law that depends on the load it handles. Whenever a node breaks down, all the files it had are reallocated to other nodes. We study the evolution of the load of a single node in the mean-field regime, when the number of nodes and files grow large. We prove the existence of the process in the mean-field regime. We further show the convergence in distribution of the load in steady state as the average number of files per node tends to infinity.
Complete list of metadata

Cited literature [78 references]  Display  Hide  Download
Contributor : LAAS HAL-LAAS Connect in order to contact the contributor
Submitted on : Thursday, July 16, 2020 - 10:02:48 AM
Last modification on : Tuesday, October 25, 2022 - 11:58:11 AM
Long-term archiving on: : Tuesday, December 1, 2020 - 6:55:52 PM


DURAN Santiago.pdf
Files produced by the author(s)


  • HAL Id : tel-02900413, version 1



Santiago Duran. Resource allocation with observable and unobservable environments. Networking and Internet Architecture [cs.NI]. Université Toulouse 3 Paul Sabatier (UT3 Paul Sabatier), 2020. English. ⟨NNT : ⟩. ⟨tel-02900413v1⟩



Record views


Files downloads