Representing Words and Linguistic Style for Computational Analysis: A Gentle Introduction: DiLCo Methods Day 2022 (7 October) - Dong Nguyen, Anna Wegmann - Universität Hamburg
Neural network approaches have radically changed the field of NLP by introducing a way to represent the meaning of words: word embeddings. Such word embeddings are increasingly used as research objects for studying social and linguistic questions. More recently, researchers have also looked at representing sentences, including their style, in a meaningful, data-driven way.
This lecture will first introduce word embeddings: What are word embeddings, and how are they learned from data? We will then continue with an introduction to representing the linguistic style of sentences: style embeddings. We explain how they can be created and illustrate how they can be used in downstream tasks, e.g., by analyzing linguistic style accommodation in conversations.
DiLCo Methods Day 2022 - Natural language processing for digital language
DiLCo organised a "Methods Day" on computational and quantitative analysis of born-digital language. The workshop targets linguists as well as other students and researchers from the humanities and beyond who want to broaden their methodological skills. Three lectures will introduce current innovative techniques of meaning representation, social media data collection, and analysis.
DiLCo (‘Digital language variation in context’) is a 3-year international research network initiated in 2021 at the University of Hamburg. The network brings together researchers from Europe and the USA with expertise in computational, interactional, and ethnographic approaches to digital language and linguistics. It aims to provide a platform for the development of interdisciplinary ideas in digital language and communication research, and for early-career capacity building.