Book Review: Why Machines Learn by Anil Ananthaswamy

This week’s review is on Why Machines Learn: The Elegant Math Behind Modern AI by Anil Ananthaswamy.

A very dense book, and one for those who want a history of AI in the 20th century rooted in the mathematical leaps that enabled it, Why Machines Learn is a walk through the leap from calculus and probability to basic AI. The story essentially ends in 2022 and thus lacks some of the relevancy that an avid reader in the field might want. In the end, this is a fun book for nerds that offers little in terms of application for those who use AI.

The book walks through the history of AI and the math that leads to vectors, probabilities, prediction, and training. It “explains the elegant mathematics and algorithms that have, for decades, energized and excited researchers in ‘machine learning’” (2). There are equations, diagrams, graphs, and, yes, even proofs. The book also makes oblique references to some of the deeper cognitive and philosophical connections. This type of thinking is very common on the frontier of AI: “maybe by understanding why [emphasis original], we can one day fully understand how ducklings and, indeed, humans learn” (12). The book does not address whether this is true; instead, such connections are hinted at, especially when tracing the route of several scientists from an interest in the brain or learning to their impact on machine learning.

Each chapter essentially covers one researcher and their impact – mostly focusing on a key mathematical concept. Ananthaswamy does provide relatively easy-to-follow guides to the mathematics, but a background in at least calculus is recommended. This could make the book an interesting high school or collegiate read – one that ties math to AI, or a nonfiction text in a grounded academic subject to a current event. However, it is not a comprehensive history of AI: it jumps from mathematical concept to mathematical concept without filling in the details to the extent that a true history would. As such, a certain amount of background knowledge would be required to truly enjoy the book.

At the same time, by rooting the text in the math, we can see the “leaps” forward as vectors and training are created and iterated. While not a truly great scientific history (like Rhodes’ The Making of the Atomic Bomb, for example), Why Machines Learn is an enjoyable text for the niche reader. Later chapters also present some of the more current challenges in ML: overtraining, bias, optimization, and alignment. Here, too, I found the deeper discussion of self-supervised learning helpful, showing how “the training algorithm takes a small sentence, masks one word, and gives that sentence with the masked word as an input to the network. The network’s task: to predict the missing word and complete the sentence” (402). A similar process happens with pixels for image training. These explanations are repeatedly grounded in the mathematical progress shown in the earlier chapters.
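For readers who want to see the masking step from that quote in concrete terms, here is a rough Python sketch. It is not taken from the book: the `[MASK]` placeholder, the function name `make_masked_example`, and the simple split-on-spaces tokenization are all my own assumptions for illustration. It only builds the (masked sentence, missing word) pair that a network would be trained on; the network itself is left out.

```python
import random

# Hypothetical placeholder token; real systems define their own.
MASK = "[MASK]"


def make_masked_example(sentence, rng):
    """Hide one randomly chosen word and return (masked_sentence, hidden_word).

    Assumes words are separated by single spaces, which is a simplification.
    """
    words = sentence.split()
    idx = rng.randrange(len(words))           # pick the position to hide
    target = words[idx]                       # the word the network must predict
    masked = words[:idx] + [MASK] + words[idx + 1:]
    return " ".join(masked), target


# A seeded generator keeps the example repeatable.
rng = random.Random(0)
original = "the cat sat on the mat"
masked_input, target = make_masked_example(original, rng)

print(masked_input)  # the sentence with one word replaced by [MASK]
print(target)        # the word that was hidden
```

The point of the sketch is that the label comes from the sentence itself, so no human has to annotate anything. That is what makes the approach "self-supervised."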

Why Machines Learn: The Elegant Math Behind Modern AI by Anil Ananthaswamy.

Rating: 3/5 Stars

Good For: People who like math and want to learn about AI.

Best nugget: The math behind self-supervised learning.

Please note: As an Amazon Associate I earn from qualifying purchases. However, I am not paid to provide reviews or use content.
