Title: Data Smart: Using Data Science to Transform Information into Insight, 2nd Edition
Author: Jordan Goldmeier
Publisher: Wiley
Year: 2024
Pages: 445
Language: English
Format: pdf (true)
Size: 33.6 MB
A straightforward and engaging approach to Data Science that skips the jargon and focuses on the essentials.
In the newly revised second edition of Data Smart: Using Data Science to Transform Information into Insight, accomplished data scientist and speaker Jordan Goldmeier delivers an approachable, conversational introduction to data science using Microsoft Excel’s easily understood features. The author also walks readers through the fundamentals of statistics, machine learning and powerful Artificial Intelligence (AI) concepts, focusing on how to learn by doing.
Data Science is the transformation of data using mathematics and statistics into valuable insights, decisions, and products:
• Data scientists identify relevant questions that can be solved with data. This may sound obvious, but many questions can’t be solved with data and technology. A good data scientist can tease out the problems in which algorithms and analyses make the most sense.
• Data scientists extract meaningful patterns and insights from data. Anyone can eyeball a set of numbers and draw their own conclusions. On the other hand, data scientists focus on what can be said statistically and verifiably. They separate speculation from science, focusing instead on what the data says.
• Finally, data scientists convey results using data visualization and clear communication. In many cases, a data scientist will have to explain how an algorithm works and what it does. Historically, this has been a challenge for many in the field. But a recent crop of books (like this one) aims at giving data scientists a way to explain how they came to their results without getting too stuck in the weeds.
Many (but not all) veteran data scientists will tell you they loathe spreadsheets and Excel in particular. They will say that Excel isn’t the best place to create a data science model. To some extent, they’re right. But before you throw this book away, let’s understand why they say this. You see, there was a time before R and before Python. It was a time when MATLAB and SPSS reigned supreme. The latter tools were expensive and often required a computer with some major horsepower to run a model. Moreover, the files that these tools generated were not easily distributable. And, in a secure corporate or institutional environment, sending files with code in them over email would trip the unsafe-email alarms. As a result, many in the industry began building their work in Excel. This was particularly true of models that helped support executive decision-making. Excel was the secret way around these email systems. It was a way to build a mini data application without having to get approval from the security team. Many executive teams relied on Excel. Unfortunately, this also created a myopic view among executives who didn’t really understand data science. For them, Excel was the only place to do this type of work. It was where they were most comfortable.
They knew the product. They could see what the analyst created. And the analyst could walk them through each step. In fact, that’s why we’re using Excel in this book. But Excel (at the time) was limited. Limited by how much it could process at any moment. Limited by the amount of data it could store. The macro language behind Excel, Visual Basic for Applications (VBA), is still hailed by many executives as an advanced feature. But VBA is based on Visual Basic 6.0, which was deprecated in 1999. The Excel version of this language has received only the barest of updates. When today’s data scientists point out that VBA can’t do what R or Python can, it’s hard to disagree. On the flipside, however, Microsoft has paid attention over the last few years. The Excel product team has come to understand how data scientists use their tool. They’ve poured more research into some very specific use cases. For instance, we’ll talk about an entirely new data wrangling tool in Excel called Power Query. Power Query can do the same data wrangling tasks as in Python and R, often more quickly. And we’ll talk about new Excel functions that make data science in Excel a whole lot easier. Today, there is renewed interest in using Excel for data science problems beyond what was possible only a few years ago.
At the end of this book, I’ll show you how to implement what we’ve built in Excel in R. In fact, this follows my own path in building data science tools for companies. First, I lay out my ideas in Excel, using the spreadsheet to validate them and make sure I understand exactly what the algorithms do. Then, usually, when I’m ready, I move the work to R or Python.
You’ll also find:
• Four-color data visualizations that highlight and illustrate the concepts discussed in the book
• Tutorials explaining complicated data science using just Microsoft Excel
• How to take what you’ve learned and apply it to everyday problems at work and life
A must-read guide to data science for everyday, non-technical professionals, Data Smart will earn a place on the bookshelves of students, analysts, data-driven managers, marketers, consultants, business intelligence analysts, demand forecasters, and revenue managers.