I think this is a fantastic read. Thank you, Vincent; I'm using it in our experiments as we speak. At the same time, for publishing we do need to learn something general, beyond just the way you solved your specific problem. A set of simulators would be great for demonstrating and generalizing not only the particular methods you tried but also the experimental protocol itself. I know this is tough: a good simulator should reproduce not necessarily the same scores but the same rank ordering of the methods as your results on the real system. Then, playing with the simulators and observing how the rankings change can help us understand what makes methods better or worse on certain types of systems.
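For concreteness, here is a minimal sketch of the kind of check I have in mind, with hypothetical method names and scores: compare the rank ordering of the methods on the simulator with their ranking on the real system, ignoring the absolute score levels.

```python
# Minimal sketch of the ranking-fidelity check: a simulator is "good enough"
# if it preserves the rank ordering of methods observed on the real system.
# Method names and scores below are hypothetical placeholders.
from scipy.stats import spearmanr

# Hypothetical per-method scores (higher is better).
real_system_scores = {"method_A": 0.71, "method_B": 0.64, "method_C": 0.58}
simulator_scores   = {"method_A": 0.52, "method_B": 0.47, "method_C": 0.49}

methods = sorted(real_system_scores)
real = [real_system_scores[m] for m in methods]
sim = [simulator_scores[m] for m in methods]

# Spearman correlation compares rank orderings, not absolute scores.
rho, _ = spearmanr(real, sim)
print(f"rank agreement (Spearman rho): {rho:.2f}")
```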


Motivations and research themes

Joint work with the Huawei Paris Noah’s Ark team: Merwan Barlier, Igor Colin, Ludovic Dos Santos, Gabriel Hurtado, Cedric Malherbe, and Albert Thomas.

What do we aim for?

Our objective is to develop reinforcement learning autopilots which can be deployed by systems engineers within weeks of setting up the project.

We will make the systems better, cheaper, more reliable, safer, and more energy efficient.

Currently we are working on the cooling system of data centers and the systems that manage the connections in the wireless antenna or your local Wifi network, but the applications of the technology are countless: given that engineering systems are the…


What are the crucial model properties and which model to choose?

Joint work with Gabriel Hurtado and Albert Thomas. This research is part of a broader theme to build AI autopilots for engineering systems.
The corresponding research paper.
Open source code to reproduce all our results.
Video teaser.

Context

One of our main research objectives at the Noah’s Ark Lab in Huawei France is to build AI autopilots for engineering systems.

We can improve various metrics: making the systems better, cheaper, more reliable, safer, or more energy efficient. Typical systems we are working on include the cooling system of data centers, the system that manages the connections in the wireless antenna, or…


Start with y. Concentrate on formalizing the predictive problem, building the workflow, and putting it into production rather than optimizing your predictive model. Once the former is done, the latter is easy.

If you like the blog, check out this podcast on the same topic.

There is no debate about how a well-functioning predictive workflow operates once it is finally put into production. Data sources are transformed into a set of features or indicators X, describing each instance (client, piece of equipment, asset) on which the prediction will act. A predictor then turns X into an actionable piece of information y_pred (will the client churn? will the equipment fail? will the asset price go up?). In certain fluid markets (e.g., …
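As an illustration, here is a minimal sketch of this workflow, assuming a scikit-learn classifier and hypothetical feature and label names; the real features and predictor depend on the application (churn, failure, price movement).

```python
# Minimal sketch of the predictive workflow described above:
# data sources -> features X -> predictor -> actionable prediction y_pred.
# Column names and values are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# X: one row per instance (client, piece of equipment, asset), built from data sources.
X = pd.DataFrame({
    "monthly_usage": [120, 30, 75],
    "contract_months": [24, 3, 12],
})
y = [0, 1, 0]  # e.g., did the client churn?

predictor = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# y_pred: the actionable piece of information the rest of the workflow acts on.
y_pred = predictor.predict(X)
print(y_pred)
```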


A friend of mine asked me this question: is there really a market for premium ML as a service, with hundreds of data scientists thinking about features and ever more clever algorithms? Or will big data take over even our jobs, since more data eventually trumps clever algorithms?

The short answer: there are elements of what we do which are AI-complete. Eventually, Artificial General Intelligence will eliminate the data scientist, but it’s not around the corner.

Still, I found it a very good question. Here are some loosely connected thoughts to elaborate on my short answer.

  1. The combination of…


The cyclical process of data science (source).

Curricula for teaching machine learning have existed for decades, and even the more recent technical subjects (deep learning or big data architectures) have almost-standard course outlines and linearized storylines. On the other hand, teaching support for the data science process has been elusive, even though the outlines of the process have been around since the 90s. Understanding the process requires not only a wide technical background in machine learning but also basic notions of business administration. …


Authors’ comments (Akin Kazakçi, Mehdi Cherti, Balázs Kégl)

A sample of new types of “hand-written” symbols discovered by the model.

Computational creativity has been gaining increased attention in the machine learning community and in the larger public. The wave was started by the freaky psychedelic images of deep dream, generated by saturating neurons starting from a natural image (an artificial “migraine”?). Style transfer is another interesting example of a neural net exhibiting artistic traits, although it arguably falls short of what we normally call creative. Closer to the concept, Google’s new Magenta project released their first single, a tune generated by a neural net. Despite the obvious interest (indeed, it is hard to…


In a previous post, I examined the data science ecosystem with its actors, incentives, and challenges in the scientific world. Here I make an attempt to port this analysis to the industrial data science ecosystem. The two ecosystems and the motivations of their actors differ in several aspects, so executives face very different challenges in building and managing them. Yet, the structure of the landscape, the actors, and the roles are remarkably similar.

The schematic data science ecosystem in a company

Business and IT are well-established functional units of virtually all companies, certainly of those which are contemplating going data. Here I will analyze the remaining three new…


Once upon a time we had the great rationalist/empiricist debate: where does knowledge come from? Then came the scientific method. It didn’t settle the debate but it made it irrelevant. We have models about the world (or theories, or hypotheses, it doesn’t matter: the difference between those three is sociological). It doesn’t matter where they come from. The scientific method offers a set of tools and best practices to test the models. You make observations, in deliberate experiments, observatories, or even accidentally. Your goal is not to validate the models but to falsify them. …


A case for randomizing acceptance of borderline papers

One of the strongest opening lines is from this seminal book on pattern recognition (aka machine learning):

“Life is just a long random walk.”

It stuck in my mind. All I could add was “perhaps, maybe, with a drift” that you control, but that’s it. Our life is full of random events that we later try to fit into a well-constructed narrative, with more or less success.

There are several systems that proudly embrace the random: the US green card lottery, the Vietnam war draft lottery, or, more recently, the Paris medical student selection draw, not without controversy of course…

Balázs Kégl

Head of AI research, Huawei France, previously head of the Paris-Saclay Center for Data Science, co-creator of RAMP (http://www.ramp.studio).
