Programming for Corpus Linguistics with Python and Dataframes
Title: Programming for Corpus Linguistics with Python and Dataframes
Author: Daniel Keller
Publisher: Cambridge University Press
Year: 2024
Pages: 114
Language: English
Format: pdf (true), epub
Size: 10.1 MB

This Element offers intermediate or experienced programmers algorithms for corpus linguistics (CL) programming in the Python language using dataframes, which provide a fast, efficient, intuitive set of methods for working with large, complex datasets such as corpora. This Element demonstrates principles of dataframe programming applied to CL analyses, as well as complete algorithms for creating concordances; producing lists of collocates, keywords, and lexical bundles; and performing key feature analysis. An additional algorithm for creating dataframe corpora is presented, including methods for tokenizing, part-of-speech tagging, and lemmatizing using spaCy. This Element provides a set of core skills that can be applied to a range of CL research questions, as well as to original analyses not possible with existing corpus software.

Programming often involves manipulating data. In CL, our data are samples of language, and our operations are things like counting word types, calculating association strength, measuring dispersion, and so on. To accomplish these things, we need to be able to hold and reference data in a computer’s memory, often in discrete chunks. We do this with variables. To perform operations on these variables, we write instructions (code) that the Python interpreter understands how to carry out. We can group sets of instructions and save them to be reused later. These are called functions. Often, we will use functions written by other people to save time and guarantee replicability.
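
As a rough illustration of the ideas above (not code from the book), the following minimal sketch stores a tiny, made-up language sample in a variable, defines a reusable function for counting word types, and relies on `Counter`, a function written by other people in Python's standard library. The names `count_word_types` and `sample_tokens` are hypothetical, and the whitespace split is only a stand-in for real tokenization.

```python
from collections import Counter


def count_word_types(tokens):
    """Return a Counter mapping each lowercased word type to its frequency."""
    return Counter(token.lower() for token in tokens)


# A variable holding a tiny, invented sample of language as a list of tokens.
# Splitting on whitespace is a simplification, not proper tokenization.
sample_tokens = "The cat saw the other cat".split()

# Call the function and keep the result in another variable for later use.
type_counts = count_word_types(sample_tokens)

# Show the two most frequent word types with their counts.
print(type_counts.most_common(2))
```

Because the counting logic lives in a function, the same instructions can be reused on any other list of tokens without being rewritten.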

This section introduces the Pandas DataFrame and Series classes, methods for loading them from and saving them to disk, and methods and functions for counting values, grouping rows, and combining values. These form a core set of tools that can be used to accomplish a range of CL tasks. The focus in this section is on explaining these elements generally, while Section 4 describes algorithms that use these procedures to complete CL analyses specifically. We will use two data types extensively in this Element: DataFrames and Series. These are not core data types in Python and must be imported through the Pandas package. Once imported, however, their built-in methods let us do corpus linguistic tasks quickly, reliably, and with minimal hardware resources.
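
The operations named above might look roughly like the sketch below. It is an illustration under stated assumptions, not the book's own code: the toy corpus, the column names `text_id` and `token`, and the choice of CSV as the storage format are all invented here, and "combining values" is interpreted as joining the tokens of each text back into a string, which may differ from what the author means. An in-memory buffer stands in for a file on disk so the example leaves nothing behind.

```python
import io

import pandas as pd

# Hypothetical toy corpus: one row per token, labelled with the text it came from.
corpus = pd.DataFrame({
    "text_id": ["t1", "t1", "t1", "t2", "t2"],
    "token": ["the", "cat", "the", "the", "dog"],
})

# Selecting a single column of a DataFrame yields a Series.
tokens = corpus["token"]

# Counting values: frequency of each distinct token across the whole corpus.
token_counts = tokens.value_counts()

# Grouping rows: number of tokens in each text.
tokens_per_text = corpus.groupby("text_id")["token"].count()

# Combining values (one possible reading): join each text's tokens into a string.
texts = corpus.groupby("text_id")["token"].agg(" ".join)

# Saving and loading: write CSV to an in-memory buffer, then read it back.
# With a real file, a path would be passed to to_csv and read_csv instead.
buffer = io.StringIO()
corpus.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(token_counts)
print(tokens_per_text)
print(texts)
```

Each step returns a new Series or DataFrame, so the results can be chained or stored for later stages of an analysis.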
