
Season 6 Finale - Data4Good, Hybrid ML-RL, Sequence Predictions, Unreliable Data

Thanks to our host for welcoming us and providing the drinks.
The presentation slides are here:
SPEAKERS:
** Emmanuel Bacry, CNRS, University Paris-Dauphine, Two ambitious AI for Good projects
I will present two ambitious AI-related projects I am involved in: the National Health Data Hub (where I serve as CSO), which started a few months ago, and nam.R, a startup I co-founded two years ago (35 employees as of today).

** Sean Whitbeck, Liftoff, Reliable ML on unreliable data
In academic research, supervised learning benchmarks are run against static datasets (e.g. MNIST, ImageNet). In industrial applications of machine learning, however, dynamic datasets are the norm. Data churn, label drift, delayed data, censored data, corrupted data, training-serving skew, and human error are but a few of the factors that can dramatically degrade the performance of a machine learning system.

Over the years, Liftoff has developed many strategies to deliver reliable predictions from unreliable data. In this talk, we'll discuss real-time monitoring of serving accuracy, automated model safety checks, end-to-end feature integrity tests, and how to efficiently patch immutable append-only datasets.
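The talk's specifics aren't public in this announcement, but one common form of feature integrity test mentioned above (guarding against training-serving skew) is a drift check on each feature's serving distribution versus its training distribution. A minimal sketch, assuming a crude z-test on the means (a real pipeline would use richer statistics and per-feature thresholds):

```python
import statistics

def skew_alert(train_values, serve_values, threshold=3.0):
    """Flag a feature whose serving distribution has drifted from training.

    Returns True when the serving mean lies more than `threshold`
    standard errors away from the training mean.
    """
    mu = statistics.mean(train_values)
    sd = statistics.stdev(train_values)
    se = sd / len(serve_values) ** 0.5
    z = abs(statistics.mean(serve_values) - mu) / se
    return z > threshold

# Training-time distribution of a feature, and two serving-time batches.
train = [1.0, 1.2, 0.9, 1.1, 1.0, 1.05, 0.95, 1.15]
healthy = [1.0, 1.1, 0.9, 1.05]   # same distribution: no alert
broken = [5.0, 5.2, 4.9, 5.1]     # corrupted feed: alert fires
```

Such a check would run continuously at serving time; a firing alert can gate automated model rollbacks.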

** Arnaud de Moissac, DCBrain, ML/RL for Combinatorial Optimisation: the next big thing
Deep RL is nice for winning Atari games but is of little use for solving large, complex real-life problems such as combinatorial optimisation. These problems arise every day in industry, transportation, grid management, supply chains, and more. They are usually solved with Operations Research (OR), but we need new paradigms to solve them fast enough for everyday optimisation. A hybrid approach combining ML and OR can be very efficient, as we have tested at DCBrain. And as Yoshua Bengio said, "we strongly believe that this is just the beginning of a new era for combinatorial optimisation algorithms".
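The abstract doesn't detail DCBrain's method, but one common shape of such an ML/OR hybrid is: a learned model proposes a good initial solution, and a classical OR routine refines it. A minimal sketch on a toy travelling-salesman instance, where a nearest-neighbour heuristic stands in for the learned proposal policy and 2-opt local search plays the OR role (all names and the division of labour here are illustrative assumptions):

```python
import math
import random

def tour_length(tour, pts):
    """Total length of a closed tour over 2-D points."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def proposal_policy(pts):
    """Stand-in for an ML policy: greedy nearest-neighbour construction."""
    unvisited = set(range(1, len(pts)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(pts[last], pts[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def two_opt(tour, pts):
    """Classical OR local search: reverse segments while it improves."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(cand, pts) < tour_length(tour, pts):
                    tour, improved = cand, True
    return tour

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(12)]
init = proposal_policy(pts)   # ML-style proposal
best = two_opt(init, pts)     # OR-style refinement
```

The appeal of the hybrid is that the learned proposal cuts the search space the OR routine must explore, which is what makes everyday re-optimisation fast enough in practice.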

** Gilles Madi, prevision.io, Auto-Lag Networks for Real Valued Sequence to Sequence Prediction
Many machine learning problems involve predicting a sequence of future values of a target variable. State-of-the-art approaches for such use cases involve LSTM-based sequence-to-sequence models.
To improve performance, these models generally use lagged values of the target variable as additional input features. An appropriate lag factor therefore has to be chosen during feature engineering, a choice that often requires business knowledge of the data. Furthermore, state-of-the-art sequence-to-sequence models are not designed to naturally handle hierarchical time series use cases.

In this paper, we propose a novel architecture that naturally handles hierarchical time series. The contribution of this paper is thus twofold. First, we show the limitations of classical sequence-to-sequence models on problems involving a real-valued target variable, namely the error-accumulation problem, and we propose a novel LSTM-based approach to overcome them.
Second, we highlight the limitations of manually selecting fixed lag values to improve the performance of a model. We then use an attention mechanism to introduce dynamic, automatic lag-factor selection that overcomes these limitations and requires no business knowledge of the data.
We call this architecture Auto-Lag Network (AL-Net). Finally, we validate our AL-Net model against state-of-the-art results on real-world time series and hierarchical time series data sets.
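The AL-Net architecture itself is in the paper, but the core idea of attention-based lag selection can be sketched in a few lines: instead of hand-picking one fixed lag, softmax-normalised attention weights blend all candidate lags, and the network learns which lags matter. A minimal illustration, where fixed `scores` stand in for the learned attention logits (an assumption; in AL-Net they would be produced by the network):

```python
import math

def attention_lag_feature(series, t, max_lag, scores):
    """Blend the last `max_lag` values of `series` (before index t)
    using softmax attention weights derived from `scores`."""
    lags = [series[t - k] for k in range(1, max_lag + 1)]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]   # softmax: sums to 1
    return sum(w * x for w, x in zip(weights, lags))

series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
# Dominant score on lag 1: the feature is pulled toward series[4] = 4.0.
focused = attention_lag_feature(series, 5, 3, [10.0, 0.0, 0.0])
# Uniform scores: the feature is the plain mean of the three lags, 3.0.
uniform = attention_lag_feature(series, 5, 3, [0.0, 0.0, 0.0])
```

Because the weights are differentiable, the lag selection trains end-to-end with the rest of the model, replacing the manual feature-engineering step the abstract criticises.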

#machinelearning, #meetup, #datascience
