Title: Applied Text Mining
Authors: Usman Qamar, Muhammad Summair Raza
Publisher: Springer
Year: 2024
Pages: 505
Language: English
Format: pdf (true)
Size: 12.9 MB
This textbook covers the concepts, theories, and implementations of text mining and Natural Language Processing (NLP). It addresses both theory and practical implementation, and every concept is explained with simple, easy-to-understand examples.
It consists of three parts. Part 1, which consists of three chapters, provides details about the basic concepts and applications of text mining, such as sentiment analysis and opinion mining. It builds a strong foundation for understanding the remaining parts. The six chapters of Part 2 cover the core concepts of text analytics: feature engineering, text classification, text clustering, text summarization, topic modeling, and text visualization. Finally, Part 3 has three chapters covering deep-learning-based text mining, which is now the dominant approach to practically all text mining tasks. Various Deep Learning approaches to text mining are covered, including models for processing and parsing text, for lexical analysis, and for machine translation. All three parts include substantial Python code that shows the implementation of the described concepts and approaches.
The textbook was specifically written to enable the teaching of both basic and advanced concepts from a single book. The implementation of every text mining task is carefully explained, based on Python as the programming language and spaCy and NLTK as Natural Language Processing libraries. No prior knowledge of Python, spaCy, or NLTK is required. The book is suitable for both undergraduate and graduate students in Computer Science and Engineering who wish to learn more about this active discipline, which is considered an important milestone in Artificial Intelligence (AI). The book also takes a distinctive look at using Generative AI to generate, understand, and interpret text.
“Part I: Text Mining Basics” consists of three chapters. Chapter 1 covers textual data, text mining operations, the structure of text information systems, and other basic concepts. Chapter 2 describes the tasks performed during the preprocessing of text. Chapter 3 discusses two common applications of text mining, i.e., sentiment analysis and opinion mining. Both applications are discussed with examples, an implementation in Python, and an explanation of the complete code.
“Part II: Text Analytics” consists of Chaps. 4 to 9. Chapter 4 discusses the process of feature engineering in detail, along with examples, a Python implementation, and a complete explanation of the source code. Chapter 5 is on text classification. Text classification, also known as categorization, is one of the important text mining tasks. The entire process is based on supervised learning, where text is categorized on the basis of labeled training data. The chapter discusses the task in detail and explains every step with the help of examples, Python code, and a complete description. Chapter 6 is on text clustering. Like text classification, text clustering is an important task in textual analysis. In the clustering process, the text is organized into relevant groups and subgroups for further processing. The chapter explains the clustering process in detail, along with examples and an implementation of each step in Python. Chapter 7 covers text summarization and topic modeling, two tasks that have become critically important, especially in the era of social media. Chapter 8 deals with taxonomy generation and dynamic document organization. Taxonomy generation refers to the automatic definition of categories for a text: it is the process of generating topics or concepts and their relations from a given corpus. The chapter explains each process in detail and provides examples and a complete description of the accompanying Python source code. Finally, Chap. 9 covers visualization approaches. In the context of human-centric text mining, the interaction of the user with text mining systems is critically important.
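As a rough illustration of the supervised text classification described for Chapter 5, the sketch below trains a tiny multinomial Naive Bayes classifier on labeled sentences and uses it to categorize new text. It is not taken from the book: Naive Bayes is chosen here only as a simple stand-in technique, the names `tokenize`, `train`, `predict`, and `TRAINING_DATA` are hypothetical, the training sentences are made up, and only the Python standard library is used rather than spaCy or NLTK.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(labeled_docs):
    """Count documents per label and word occurrences per label.

    labeled_docs is a list of (text, label) pairs, i.e. the training data.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: share of training documents carrying this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data, for illustration only.
TRAINING_DATA = [
    ("the match ended with a late goal", "sports"),
    ("the team won the league title", "sports"),
    ("the central bank raised interest rates", "finance"),
    ("stock markets fell after the earnings report", "finance"),
]

model = train(TRAINING_DATA)
print(predict(model, "the team scored a goal"))            # sports
print(predict(model, "interest rates and stock markets"))  # finance
```

With realistic data, the whitespace tokenizer would normally be replaced by a proper preprocessing step, and a library classifier would likely be preferred over this hand-rolled one.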
Finally, “Part III: Deep Learning in Text Mining” consists of three chapters, i.e., Chaps. 10–12. Deep Learning has become highly important for processing textual data, especially in text clustering and classification. Chapter 10 explains how Deep Learning can be used in the context of text mining, with examples and a complete description of the accompanying Python source code. Chapter 11 explains the concepts related to Deep Learning in lexical analysis and parsing, with practical examples and accompanying code. Chapter 12 introduces the concepts of machine translation (MT) using Deep Learning models and techniques, with examples and a complete description of the accompanying Python source code.