What is Data Science, How Does it Work, and Why is it Important?
By Adam Coscia
Foreword: I am a recently graduated Physics undergrad who researches human interactions in digital spaces and loves the interplay of technology, science, and information. I am clearly biased towards this field, which has made me all the more critical of it. In other words, if I’m going to dedicate my time and energy to learning and practicing this subject, I want to know it will be worth my time. And so, along the way, I’ve developed a unique understanding of what it is, how it applies to everyday life, and why it matters to the average person. I present to you my views on one of the hottest fields of research and application you may not know much about: Data Science.
Towards the end, I’ll also be giving an account from Rita Fuller, one of my mentors who has been practicing data science for years and wants to share her experience! Connect with her on LinkedIn: https://www.linkedin.com/in/rita-fuller-4396b38/
What is data science? Well, for that matter, what are machine learning, big data, neural networks, etc.? You may have heard these buzzwords thrown around at university, in the news, at conferences, and across other technology-focused media outlets, usually in conjunction with business or computer applications. You know, like the Google and Facebook algorithms that are “smart” and suggest articles and websites you might like? Or like Walmart and Target, who send you coupons for the things you buy the most? How do these companies do this? What is this “data science” that powers them, and what do the “learning” machines do in order to make these suggestions?
Keep in mind before I ramble that most of the data science “lingo” you hear is buzzwords used for effect. This is a complex subject to talk about because it means different things to different people in different fields. Thankfully though, the core guiding principles are ubiquitous and simple—if they weren’t, this article would take much too long to read to be of use! Honestly, a need for efficiency and connections is a big reason why I took so strongly to data science. It emphasizes knowing a lot about things and making lots of connections between those things which, trust me, makes life easier, faster, and more convenient, and who doesn’t want that?
Let me first describe data science in a useful way: Let’s say you want to learn about something. You might ask around about it, research it, or practice it until you know enough to decide what to do about the information you just obtained. That hopefully creates more questions that you can answer by gathering more information as you work towards accomplishing your goals. Most of us are familiar with this methodology, especially those coming from university, because it’s how learning happens! At the end of the day, data science just formalizes this process with math, statistics, programming, and a good amount of interpretation. In brief, data science is the science of discovering, extracting, transforming, transporting, contextualizing, and employing information in the service of human endeavors.
How does it work? A saying commonly attributed to Sir Francis Bacon expresses the outcome of pursuing data science well: “Knowledge is power.” That is, if one knows something, they can use this information to influence others and create change. Given this new-found perspective on knowledge, I am willing to bet you could guess the question I am about to pose: can humans actually accomplish more if we know more? History has shown us the answer to be a resounding YES! Therein lies our motivation behind the how: data science is employed to gather, create, and utilize information in any field, and this can lead to having more power, influence, or control over whatever you do. If you’re like me and appreciate a little more thoroughness to round out your explanations, then classically data science is a subset of information science, or the study of how information is exchanged. In a modern context, data science has become the practice of using computational machines, i.e., our trusty computers, to perform the time-consuming work of processing the vast amounts of digital information found in “big data” to discover and create the knowledge we humans want to understand, interpret, and use for our benefit.
On the surface this may be clicking, but how does one actually “do” data science? Traditionally, that’s the job of statisticians, programmers, engineers, scientists, and business analysts. A data science team decides (or often their employer decides…) that they want to learn about something using available information, so the team sets about gathering, processing, and examining digital data from various sources. This could be numbers, words, pictures, sounds, or really anything that can be processed on a computer. The statisticians decide how reliable and useful the information is; the programmers write the computer code to gather, process, and extract the information; the engineers design the system to be efficient, scalable, and cost-effective; the scientists use their understanding of the field being studied to interpret the information; and the business analysts present the findings to people who want to monetize it. Each person has a unique role in creating a cohesive unit of discovery, and I absolutely love this. Data science creates insights into how things work from basic information! It can be hard, sure, but it’s immensely rewarding. You can be in control of what to do with the things you learn, and people are willing to pay you to do this. The only thing you need is the desire to ask questions about the things you see around you. Why is this pattern appearing? How do people use this product? What is the best way to plan this project? Hence, data science truly is about knowing things and making connections between those things. When you do, you can understand so much about how the world works.
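To make the gather, process, and examine steps a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the purchase records are made up, and the function names `process` and `most_bought` are hypothetical. It only shows the shape of the workflow, using the coupon example from earlier, and is not how any particular company does it.

```python
from collections import Counter
from statistics import mean

# Gather: hypothetical raw records of (customer, product, amount spent).
# These values are invented for illustration only.
raw_purchases = [
    ("ana", "coffee", "4.50"),
    ("ana", "coffee", "4.50"),
    ("ana", "bagel", "3.00"),
    ("ben", "tea", "3.25"),
    ("ben", "coffee", None),  # a missing amount, dropped during processing
    ("ben", "tea", "3.25"),
]


def process(records):
    """Drop incomplete rows and convert amounts from text to numbers."""
    return [(c, p, float(a)) for c, p, a in records if a is not None]


def most_bought(records):
    """For each customer, find the product they bought most often."""
    counts = {}
    for customer, product, _ in records:
        counts.setdefault(customer, Counter())[product] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in counts.items()}


# Process, then examine.
clean = process(raw_purchases)
favorites = most_bought(clean)
average_spend = mean(amount for _, _, amount in clean)

print(favorites)
print(round(average_spend, 2))
```

A real team would swap the hard-coded list for data pulled from actual sources and would ask the statisticians how much to trust it, but the three stages stay the same.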
Well that’s all fine and dandy, but can anyone do it? I keep emphasizing the ubiquitous nature of data science, and now I feel I can properly express just how many different fields use it. Natural and social sciences? Of course! Physics, chemistry, biology, psychology, sociology: if you perform an experiment, you create data, and you need to analyze that data to see patterns and determine if you’ve discovered something. Engineering? Heck yeah! Any engineer worth their salt needs to plan and prepare for costs, materials, time, labor, and anything else required to create something, and that is data waiting to be crunched to get to the bottom line. Business? You bet your bottom dollar. The patterns and habits of consumers and other businesses are a gold mine of information waiting to be discovered and interpreted using data science for profit. Humanities? Yes! Language, art, writing, history, politics, and other human-centered fields contain vast amounts of information that, when looked at together as a set of data, create an interconnected network of ideas that describe so much about what we do and why we do it. In all of these areas and more, it’s the data that tells us what we need to know, and the processes that study this data are encapsulated in data science.
If my “data” hasn’t yet given you the results you’re looking for, then just ask my coworker Rita, who has been in the business of using data to get ahead for many years! She was and is an incredible mentor to me as I’ve pursued data science-related positions. I recently asked her about her time in the field and what she felt was pivotal in her success as a data scientist, and here is the account she wanted to share with you:
In her undergraduate career, Rita started off as a Physics major and double majored in Math. She reached a defining moment during her junior year when her professor invited her to engage in researching binary stars. For her, it was exciting to use data to seek answers to a question that had never been asked before! She decided to pursue mathematics further and got an M.S. in mathematics as well as a second master’s in applied statistics, after which she began a career as a statistician. She says another pivotal moment came when she was given opportunities to work on non-traditional projects, such as a job as the sole statistician on a sales team. The opportunity to think on her feet and truly understand how data was used in business, as well as how data models were built, really set her up for future success. Rita also worked for a start-up, AGAIN as the only statistician! From merchandising to marketing and operations, her work taught her how providing insights and models to all areas brought great impact to the business. Since it was a start-up, her results were immediately implemented, which was a very satisfying feeling.
If there is anything she would do differently, she would recognize her personal strengths and weaknesses much earlier in her career, exploiting the strengths and working on the weaknesses. Honest self-reflection is vital to growth! Something she would do the same is to take advantage of the opportunities that came her way, even if, at first, they didn’t look so appealing. At one point, she was placed on the ‘Analytics’ team instead of the ‘Statisticians’ team. Analytics was client-facing, and she knew she was picked for it because of her communication skills and ability to generate business insights. Yet that meant she would be spending less time on the things she loved to do! Rather than view it as a negative, however, Rita decided to immerse herself in this role and make the most of things, developing a keen business acumen and allowing herself to flow between the business, statistics, and engineering teams while being a total superstar! When life gives you lemons…
Her favorite moments have been the mentoring she’s done and making a difference in someone’s life. She wants aspiring data scientists to know that a career involving data requires you to be well-rounded. Be good at the data science, but also learn to work with people and make their experience with you a positive one. If you’d like to connect with her, you can reach her via LinkedIn: https://www.linkedin.com/in/rita-fuller-4396b38/. She regularly meets with a group specifically for women interested in data science!
Why does it matter? After all, up to this point, the what and how shouldn’t be too surprising. The idea of learning from data has cropped up in different ways in myriad cultures across the world. It is a large part of human history, playing a formative role in the foundation of societies with their own customs and ways of life. That is, much of what almost any society on Earth uses and values today can be traced back to prior generations, who discovered many things and, most importantly, passed lessons on to their friends, families, and peers. It’s a “neural network” of people that grows organically to (hopefully!) make the best decisions using prior data from lots of different inputs across the world. You’ve probably participated yourself: any debate, discussion, or decision you contribute to is based on the things you know, i.e., the data from your life. Much of what we commonly regard as the dawn of humankind began with the communication of ideas. We are clearly a species that learns from experience, and this has led us to create the amazing world of today using data from our past. We are fundamentally all practicing “data science” every day! So, when that lightbulb goes on over your head and you put “two and two” together from the things you see around you, you’re well on your way to knowing why data science matters to each and every one of us.
That’s why I want to stay in the subject and want other people to join in. It’s amazing because it applies to so much of what so many people do in such a natural way. All that’s required is asking questions and using data to find the answers. A simple Google search on “data science”, “machine learning”, and a field to apply it to will get you on your way to practicing it in a methodical, informative, and useful way.
You know what would end this article well? Some data that really drives home the point… Maybe you could crunch some numbers and make the case to convince yourself.
About the author: Hi, I'm Adam! I'm a 22-year-old recently graduated Physics undergrad from Stevens Institute of Technology. I love people, science, and technology, especially when they come together to inspire people to learn more about the world around them. That's why I'll be continuing my education this Fall at Georgia Institute of Technology in their Human-Centered Computing Ph.D. program as a President's Fellow. Reach out and say hi! firstname.lastname@example.org
Head of Data Science Development
Center for Data Science and Artificial Intelligence
New York Life Insurance Company