Introduction: What’s Data Science?

We allowed ourselves to indulge in our bewilderment for a while, first individually and then, after we met, together over several Wednesday morning breakfasts. But we could not get rid of a nagging sense that there was something real there, possibly something deep and profound representing a paradigm shift in our culture around data. Maybe, we thought, it is even a paradigm shift that plays to our strengths. Rather than dismissing it, we chose to explore it further.

However, before we get into that, let us first delve into what struck us as confusing and vague (maybe you have had similar inclinations). Then we will explain what got us past our own concerns, to the point where Rachel created a course on data science at Columbia University, Cathy blogged the course, and you are reading a book based on it.

Big Data and Data Science Hype

So, what’s eyebrow-raising about Big Data and data science? Let us count the ways:

There is a lack of definitions around the most basic terminology. What is “Big Data” anyway? What does “data science” mean? What is the relationship between Big Data and data science? Is data science the science of Big Data? Is data science only the stuff going on in companies like Google and Facebook and other tech companies? Why do so many people refer to Big Data as crossing disciplines (astronomy, finance, tech, etc.) and to data science as taking place only in tech? Just how big is big? Or is it simply a relative term? These terms are so ambiguous that they are well-nigh meaningless.

There is a distinct lack of respect for the researchers in academia and industry labs who have been working on this kind of stuff for years, and whose work builds on decades (in some cases, centuries) of work by statisticians, computer scientists, mathematicians, engineers, and scientists of all sorts. From the way the media describes it, machine learning algorithms were invented only last week and data was never “big” until Google came along. This is simply not true. Many of the methods and techniques we are using, and the challenges we are facing today, are part of the evolution of everything that has come before. This does not mean that there are no new and exciting things happening, but we think it is important to show some basic respect for all that came before.

The hype is crazy: people throw around tired phrases straight from the height of the pre-financial-crisis era, such as “Masters of the Universe,” to describe data scientists, and that does not bode well. In general, hype masks reality and increases the noise-to-signal ratio. The longer the hype goes on, the more of us will get turned off by it, and the harder it will be to see what is good underneath it all, if anything.

Statisticians already feel that they are studying and working on the “Science of Data.” That is their bread and butter. Perhaps you, dear reader, are not a statistician and do not care, but imagine that for the statistician, this feels a bit like how identity theft might feel for you. Although we will make the case that data science is not just a rebranding of statistics or machine learning but rather a field unto itself, the media often describes data science in a way that makes it sound as if it is simply statistics or machine learning in the context of the tech industry.

People have said, “Anything that has to call itself a science isn’t.” Although there may be some truth in that, it does not mean that the term “data science” itself represents nothing. Of course, what it represents may not be a science but more of a craft.

Getting Past the Hype

Rachel’s experience going from getting a PhD in statistics to working at Google is a good example of why we thought, in spite of the above reasons to be skeptical, there might be some meat in the data science sandwich. In her words:

It was clear to me pretty quickly that the stuff I was working on at Google was different from anything I had learned at school when I got my PhD in statistics. This is not to say that my degree was useless; far from it. What I had learned in school provided a framework and a way of thinking that I relied on daily, and much of the actual content provided a solid theoretical and practical foundation necessary to do my work.

However, there were many skills I had to acquire on the job at Google that I had not learned in school. Of course, my experience is specific to me in the sense that I had a statistics background and picked up more computation, coding, and visualization skills, as well as domain expertise, while at Google. Another person coming in as a computer scientist or a social scientist or a physicist would have different gaps and would fill them in accordingly. But what is important here is that, as individuals, we each had different strengths and gaps, yet we were able to solve problems by putting ourselves together into a data team well suited to solve the data problems that came our way.

Here is a reasonable response you might have to this story. It is a general truism that, whenever you go from school to a real job, you realize there is a gap between what you learned in school and what you do on the job. In other words, you were simply facing the difference between academic statistics and industry statistics.

