Data science is new-age astrology
It all started some six years ago, when I was in the middle of switching careers and thus had a fair bit of time on my hands. My close friend suggested that I use this time to learn astrology.
At first I was taken aback and found the suggestion bizarre, for I’ve never been religious or superstitious, nor shown any inclination to believe in astrology.
My close friend has a background in astrology and specializes in something called “PrashNa shaastra”, where predictions are made based on the time at which the client asks the astrologer a question. She believes this has resulted in largely correct predictions (though I suspect a strong dose of confirmation bias there), and (very strangely to me) seems to believe in the stuff.
“What’s the use of studying astrology if I don’t believe in it one bit?” I asked. “Astrology is very mathematical, and you are very good at mathematics. So you’ll enjoy it a lot,” she countered, sidestepping the question.
We went off into a long discussion on the origins of astrology, and how it resulted in early developments in astronomy (necessary in order to precisely determine the position of planets) and so on.
I’ve spent most of the last six years playing around with data and drawing insights from it. A lot of the work I’ve done falls under the (rather large) umbrella of “data science”, and some of it can be classified as “machine learning”.
Stripped to its bare essentials, machine learning is an exercise in pattern recognition. Given a set of inputs and outputs, the system tunes a set of parameters in a mathematical formula such that the outputs can be predicted with as much accuracy as possible given the inputs (I’m massively oversimplifying here, but this captures enough of the essence for this discussion).
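To make “tuning a set of parameters” concrete, here is a minimal sketch in Python. It assumes the simplest possible formula, a straight line, and made-up data; the function name `fit_line` and its settings are purely illustrative, not taken from any particular library.

```python
def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Tune the two parameters of y = slope * x + intercept by gradient descent."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameter values.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        # Nudge each parameter in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up inputs and outputs that follow y = 2x + 1 exactly (an assumption).
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(inputs, outputs)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

On this toy data the tuned parameters settle close to a slope of 2 and an intercept of 1. Real systems use far richer formulas, but the loop of predicting, measuring the error and adjusting is the same idea.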
One big advantage of machine learning is that algorithms can sometimes recognize patterns that are not easily visible to the human eye. The most spectacular application of this has been in the field of medical imaging, where time and again algorithms have been shown to outperform human experts at analyzing images. The flip side is that algorithms can also latch on to patterns that aren’t really there, that is, spurious correlations.
Traditionally, the most common way to get around spurious correlations has been for the statistician (or data scientist) to inspect their models and make sure that they make “intuitive sense”. In other words, the models are allowed to find patterns, and then a domain expert validates those patterns. Where the patterns don’t make sense, data scientists have tweaked their models so that they give more meaningful results.
The other way statisticians have approached the problem is to pick models that are most appropriate for the data at hand. Different mathematical models are adept at detecting patterns in different kinds of data, and picking the right algorithm for the data ensures that spurious pattern detection is minimized.
The way a large number of data scientists approach a problem today, however, is to take a data set and apply every possible machine learning method to it. They then accept whichever model gives the best results on the data set at hand. No attempt is made to understand why the given inputs lead to the output, or whether the patterns make “physical sense”.
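A minimal sketch in Python of why picking the best performer on the data at hand can mislead. Everything here is an illustrative assumption: the outcome and all 200 candidate predictors are nothing but coin flips, and the helper names (`coin_flips`, `accuracy`) are hypothetical.

```python
import random


def coin_flips(rng, n):
    """Return n random 0/1 values."""
    return [rng.randint(0, 1) for _ in range(n)]


def accuracy(predictor, outcome):
    """Fraction of observations where the predictor matches the outcome."""
    return sum(p == o for p, o in zip(predictor, outcome)) / len(outcome)


rng = random.Random(0)  # fixed seed, only so the run is repeatable
n_observations, n_predictors = 20, 200

# The outcome and every candidate predictor are pure noise.
outcome = coin_flips(rng, n_observations)
predictors = [coin_flips(rng, n_observations) for _ in range(n_predictors)]

# Keep whichever predictor scores best on the data set at hand.
best = max(range(n_predictors), key=lambda k: accuracy(predictors[k], outcome))
in_sample = accuracy(predictors[best], outcome)

# On fresh data the "winning" predictor is still just a coin flip.
fresh_outcome = coin_flips(rng, 1000)
fresh_predictor = coin_flips(rng, 1000)
out_of_sample = accuracy(fresh_predictor, fresh_outcome)

print(f"best in-sample accuracy: {in_sample:.2f}")
print(f"accuracy on fresh data:  {out_of_sample:.2f}")
```

With this many candidates and so few observations, the winner usually looks far better than chance on the original data while hovering around 50% on fresh data. Nothing in the selection step asks whether the pattern makes sense.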
And this is not very different from the way astrology works. There, we have a bunch of predictor variables (the positions of different “planets” in various parts of the “sky”) and observed variables (whether some disaster happened or not, in most cases). Some of our ancients then did some data analysis on this, trying to identify combinations of predictors that predicted the output (unfortunately, they didn’t have the power of statistics or computers, so in that sense the models were limited). And then they simply accepted the outputs, without asking why it should make sense that the position of Jupiter at the time of a wedding affects how the marriage will go.
Armed with this analysis, I recently brought up the topic of astrology and data science with my friend again, telling her that “after careful analysis I admit that astrology is the oldest form of data science”.
“Data science is new-age astrology, and not the other way round.”