The Art of Simplifying: Embracing Sparsity and Parsimonious Models

Welcome back, friends! Today, we’re delving into the world of sparsity, compression, and compressed sensing. But before we get into the nitty-gritty, let’s take a moment to ponder the beauty of simplicity and parsimony in modeling.

So, what exactly is sparsity? Well, it’s not just about having data with mostly zeros. It’s about recognizing that out of the myriad pieces of information available, only a few truly matter. That’s where parsimonious models come into play.

Think of parsimony as being greedy in a good way. It means seeking the fewest descriptions or pieces of information needed to fully capture a phenomenon or a dataset. Take, for example, the famous line about greed from the movie Wall Street. Greed, when applied to modeling, can actually be a good thing! In fact, a 2004 paper by Joel Tropp titled “Greed is Good” showcases the remarkable properties of greedy algorithms for sparse approximation—the problem of representing a signal using as few dictionary elements as possible.
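To make that concrete, here is a minimal sketch of the greedy idea Tropp analyzes—orthogonal matching pursuit—written in plain NumPy: at each step it picks the dictionary column most correlated with the current residual, then re-fits on the columns chosen so far. The problem sizes, random seed, and variable names below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A wide "dictionary" A and a signal x that is genuinely sparse:
# only 3 of its 50 entries are nonzero.
n_measurements, n_atoms, sparsity = 20, 50, 3
A = rng.standard_normal((n_measurements, n_atoms))
A /= np.linalg.norm(A, axis=0)                  # unit-norm columns ("atoms")
x_true = np.zeros(n_atoms)
x_true[rng.choice(n_atoms, sparsity, replace=False)] = rng.standard_normal(sparsity)
y = A @ x_true                                  # the measurements we observe

# Orthogonal matching pursuit: greedily pick the atom most correlated
# with the residual, then least-squares re-fit on the atoms chosen so far.
support, residual = [], y.copy()
for _ in range(sparsity):
    support.append(int(np.argmax(np.abs(A.T @ residual))))
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coeffs

x_hat = np.zeros(n_atoms)
x_hat[support] = coeffs
print("recovered support:", sorted(support))
print("true support:     ", sorted(np.flatnonzero(x_true).tolist()))
```

Run on a well-conditioned random dictionary like this one, the greedy loop typically recovers the true support exactly—which is precisely the kind of guarantee Tropp’s paper makes rigorous.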

In the realm of machine learning, when building models to fit data, we’re faced with a choice. Do we opt for a million degrees of freedom and countless parameters to tweak, or do we go for a model with just three simple parameters that gets the job done almost as well? Personally, I’d go with the latter—the parsimonious model that does more with less.

But this idea of parsimony is not new. It has been embraced by brilliant minds throughout history. Albert Einstein, for instance, famously said, “Everything should be made as simple as possible, but not simpler.” He understood that when it comes to describing the physical laws and models that govern our world, simplicity is key. Newton’s second law, F=ma, is a shining example of a parsimonious model that elegantly explains so much. Even Einstein’s own E=mc² is stripped down to its simplest form while still capturing the essence of our observations.


Occam’s razor, the notion that the simplest explanation tends to be the right one, further reinforces the principle of parsimony. This law of parsimony, or lex parsimoniae in Latin, urges us to embrace the simplest effective set of causes to explain phenomena. Aristotle and other thinkers long before Occam espoused the related idea that, when modeling the natural world, the same causes often govern similar effects.

The 80/20 rule, also known as the Pareto principle, states that roughly 80% of the effects come from 20% of the causes. This principle aligns with Occam’s razor, showing us that we can often explain the majority of things with just a few simple causes.

Interestingly, even the ancient astronomer Ptolemy had a parsimonious theory of planetary motion, despite its ultimate incorrectness. His geocentric system of circles within circles—epicycles riding on larger circles—endured for roughly 1,500 years because it offered a simple and effective explanation of the observed data. It wasn’t until Copernicus, Kepler, and Newton came along with an even simpler explanation that the Ptolemaic system was finally dethroned.

In the modern era, as we build models using machine learning from data, parsimonious or sparse models remain as essential as ever. By reducing the number of free variables or parameters to the bare minimum, we can prevent overfitting. Overfitting occurs when a model becomes too complex and begins fitting noise in the data rather than capturing the underlying causes. We want models that are interpretable, easy to understand, and robust.

Sparse models not only prevent overfitting, but they are also more interpretable. With fewer terms and parameters, they are easier to communicate and analyze. And let’s not forget the wisdom of Einstein, Newton, and Pareto—they all championed the importance of parsimony in modeling the physical world.
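As a rough illustration of that point, the sketch below fits noisy samples of a simple law with two models that see the same degree-10 polynomial feature library: an unregularized least-squares fit, which is free to use every term, and an L1-regularized (LASSO) fit, which drives most coefficients to zero. The data, polynomial degree, and penalty strength are illustrative choices, and scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)

# Noisy samples of a simple underlying law: y = 2x + noise.
x = np.linspace(-1, 1, 30)[:, None]
y = 2 * x.ravel() + 0.1 * rng.standard_normal(30)

# Both models get the same rich library of candidate terms: x, x^2, ..., x^10.
X = PolynomialFeatures(degree=10, include_bias=False).fit_transform(x)

dense = LinearRegression().fit(X, y)    # free to use all ten terms
sparse = Lasso(alpha=0.05).fit(X, y)    # L1 penalty zeroes out most of them

print("nonzero terms, ordinary least squares:", int(np.sum(np.abs(dense.coef_) > 1e-6)))
print("nonzero terms, LASSO:                 ", int(np.sum(np.abs(sparse.coef_) > 1e-6)))
```

The dense fit spreads small coefficients across all ten candidate terms, chasing the noise; the sparse fit keeps only a term or two, which is both closer to the true law and far easier to read and communicate.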

Further reading:  "Discovering the Essence of Images: A Journey into Dimensionality Reduction"

As we delve into the mathematics of sparsity and compression, let’s remember why we care about this topic. It’s not just about compression; it’s about embracing simplicity and parsimony. It’s about adhering to Occam’s razor, striving for elegance and effectiveness in our models. Personally, I’m all for embracing the words of Einstein, Newton, and Pareto—they guide us towards the parsimonious and stingy models that truly capture the essence of our data.

And that, my friends, is the art of simplifying.

Thank you!

