A rather overwhelming flow of interesting new things lately, a very incomplete selection here: twitter.com/ComputingByArts
In particular, it turns out that the creator of Keras wrote this text in the last chapter of his book 3 years ago:
blog.keras.io/the-future-of-deep-learning.html
It resonates a lot with what I am doing.
***
"At a high-level, the main directions in which I see promise are:
- Models closer to general-purpose computer programs, built on top of far richer primitives than our current differentiable layers—this is how we will get to reasoning and abstraction, the fundamental weakness of current models.
- New forms of learning that make the above possible—allowing models to move away from just differentiable transforms.
- Models that require less involvement from human engineers—it shouldn't be your job to tune knobs endlessly.
- Greater, systematic reuse of previously learned features and architectures; meta-learning systems based on reusable and modular program subroutines.
Additionally, do note that these considerations are not specific to the sort of supervised learning that has been the bread and butter of deep learning so far—rather, they are applicable to any form of machine learning, including unsupervised, self-supervised, and reinforcement learning. It is not fundamentally important where your labels come from or what your training loop looks like; these different branches of machine learning are just different facets of a same construct."
***
I am going to include the concluding summary of his text in a comment.
no subject
Date: 2020-08-04 11:12 am (UTC)
***
"In short, here is my long-term vision for machine learning:
* Models will be more like programs, and will have capabilities that go far beyond the continuous geometric transformations of the input data that we currently work with. These programs will arguably be much closer to the abstract mental models that humans maintain about their surroundings and themselves, and they will be capable of stronger generalization due to their rich algorithmic nature.
* In particular, models will blend algorithmic modules providing formal reasoning, search, and abstraction capabilities, with geometric modules providing informal intuition and pattern recognition capabilities. AlphaGo (a system that required a lot of manual software engineering and human-made design decisions) provides an early example of what such a blend between symbolic and geometric AI could look like.
* They will be grown automatically rather than handcrafted by human engineers, using modular parts stored in a global library of reusable subroutines—a library evolved by learning high-performing models on thousands of previous tasks and datasets. As common problem-solving patterns are identified by the meta-learning system, they would be turned into a reusable subroutine—much like functions and classes in contemporary software engineering—and added to the global library. This achieves the capability for abstraction.
* This global library and associated model-growing system will be able to achieve some form of human-like "extreme generalization": given a new task, a new situation, the system would be able to assemble a new working model appropriate for the task using very little data, thanks to 1) rich program-like primitives that generalize well and 2) extensive experience with similar tasks. In the same way that humans can learn to play a complex new video game using very little play time because they have experience with many previous games, and because the models derived from this previous experience are abstract and program-like, rather than a basic mapping between stimuli and action.
* As such, this perpetually-learning model-growing system could be interpreted as an AGI—an Artificial General Intelligence. But don't expect any singularitarian robot apocalypse to ensue: that's a pure fantasy, coming from a long series of profound misunderstandings of both intelligence and technology. This critique, however, does not belong here."
***
I don't necessarily endorse the last paragraph. The author has thought a lot about this and has published extensively on it elsewhere, but that does not mean he is right: there is a rich diversity of opinions on this topic.
The rest of what he is saying, though, makes complete sense to me.
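To make the "global library of reusable subroutines" idea a bit more tangible for myself, here is a toy sketch of my own (it is not Chollet's actual proposal, and every name in it, such as `library` and `assemble_program`, is made up): a tiny set of program-like primitives and a brute-force "meta-learner" that assembles a composition of them from a few input-output examples.

```python
# A toy, purely illustrative sketch (mine, not Chollet's): a "global library"
# of reusable, program-like primitives and a brute-force "meta-learner" that
# assembles a small program for a new task from a handful of examples.
# All names here (library, compose, assemble_program) are hypothetical.

from itertools import product

# The "library": reusable primitives rather than learned differentiable layers.
library = {
    "inc":    lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg":    lambda x: -x,
}

def compose(names):
    """Turn a sequence of primitive names into a single callable program."""
    def program(x):
        for name in names:
            x = library[name](x)
        return x
    return program

def assemble_program(examples, max_depth=3):
    """Search compositions of library primitives (shortest first) that
    reproduce every (input, output) example exactly."""
    for depth in range(1, max_depth + 1):
        for names in product(library, repeat=depth):
            program = compose(names)
            if all(program(x) == y for x, y in examples):
                return names, program
    return None, None

# A "new task" specified by very little data: here, f(x) = (x + 1) * 2.
names, program = assemble_program([(1, 4), (2, 6), (5, 12)])
print(names)        # ('inc', 'double')
print(program(10))  # 22
```

Of course, a real system would learn the primitives themselves and search far more cleverly; the point is only that what gets produced is a small program rather than a fixed geometric transform of the input.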
no subject
Date: 2021-01-31 10:48 pm (UTC)
I think that Eliezer Yudkowsky demolishes it quite convincingly: https://intelligence.org/2017/12/06/chollet/
After reading Yudkowsky I think that Chollet is quite wrong on this one.
They sort of "agreed to disagree": https://mobile.twitter.com/fchollet/status/938855547796779009
What's unusual about Chollet's position is that it comes from someone who seems quite optimistic about the pace of near-future progress. Usually, the view that we don't need to worry about AI-related existential risks comes from people who also state publicly that we are very far from "human-level AI", yet Chollet seems to think that an "artificial programmer" might be achievable soon (or perhaps I am just imputing this and he would not agree, who knows).