Stephane Ross


I completed my M.Sc. in Computer Science at McGill University during the summer 2008 semester, under the supervision of Joelle Pineau, as part of the Reasoning and Learning Laboratory. My Master's research focused on developing efficient algorithms that allow computers and robots to simultaneously learn and plan to achieve a task or long-term goal under various sources of uncertainty, akin to those a robot must face in the real world. In particular, I worked on several extensions of Model-Based Bayesian Reinforcement Learning methods to more complex problems, such as problems involving partial observability of the world (POMDPs) and continuous domains. I also worked on an extension of Model-Based Bayesian Reinforcement Learning that can discover and exploit hidden structure in the domain to learn more efficiently. The resulting algorithms find an optimal plan that trades off between 1) learning the probabilistic model of the system, 2) identifying the hidden state of the system, and 3) gathering rewards, so as to maximize the expected long-term rewards. This research has useful applications in Robotics and Human-Computer Interaction, where uncertainty about the parameters of the probabilistic system is common and, to date, not taken into account in the planning process. The final version of my M.Sc. thesis is available in the publications section.

I completed my B.Sc. in Computer Science at Laval University at the end of the winter 2006 semester. In summer 2005, I obtained an NSERC Undergraduate Student Research Award to work at the DAMAS Laboratory with Sébastien Paquet, a former Ph.D. student, on the RoboCupRescue project. I also assisted him and contributed to his research on novel hybrid POMDP approaches. During the last year of my Bachelor's degree, I worked part time at the DAMAS Laboratory under the supervision of Brahim Chaib-draa, investigating multiagent reinforcement learning and game theory, particularly their application to the RoboCupRescue project. In this environment, the problem is to make the different agents learn how to cooperate as efficiently as possible under partial observability and communication constraints. Furthermore, in summer 2006, I worked full time at the DAMAS Laboratory on new efficient algorithms for online search in POMDPs.


A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
as author at 14th International Conference on Artificial Intelligence and Statistics (AISTATS), Ft. Lauderdale 2011