Lectures: Dr. Andrew Hardie (Lancaster)

Time zone: Europe/Prague

Dr. Andrew Hardie (Lancaster)

27 May 2013, 17:00, Zelená studovna (Green Study Room), FF UK Library (Jana Palacha 2)

Annotation and analysis: an overview of tools and techniques

The corpus research infrastructure at Lancaster’s UCREL research centre is based around a number of standard tools for (a) automated annotation at various levels of language, for instance part-of-speech tagging and semantic tagging, and (b) indexing, searching and analysing the resulting data. In this presentation, I will provide an introductory overview of the nature of these tools and how we make them work together. The presentation will conclude with a live (internet connection permitting!) demonstration of the analytic possibilities afforded by the CQPweb software when it operates across fully annotated corpus data, looking in particular at different approaches to collocational phenomena.

28 May 2013, 13:00, ÚČNK (Národní 37, Platýz Palace)

Applying cluster analysis to the problem of text-type classification

(co-author Ghada Mohamed)

This presentation illustrates (a) a new approach to the bottom-up analysis of text types based on cluster analysis, and (b) its cross-linguistic applicability, exemplified through analyses of English and Arabic corpora. Although there exist many different approaches to the classification of texts into categories, most such work can be considered top-down in orientation. Such approaches must, therefore, be complemented by bottom-up approaches where categorisation is based on features internal to the language of the texts; the most widely known approach of this kind is Biber’s (1988) Multi-Dimensional (MD) analysis of English, extended to cross-linguistic text typology by Biber (1995). Biber’s methodology is based on a multivariate statistical technique, factor analysis; this presentation will explore an alternative methodology for establishing text-type categories based on cluster analysis. Work using the British National Corpus and the Leeds Corpus of Contemporary Arabic shows cluster analysis to be a powerful tool for structuring frequency data from the automated retrieval of lexico-grammatical features, if its output is interpreted with care.
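As a rough illustration of the general idea (not the authors' actual procedure), the sketch below groups texts bottom-up from lexico-grammatical feature frequencies using agglomerative clustering. The feature names, the frequency values, the z-score standardisation, the Euclidean distance and the average-linkage rule are all assumptions chosen for the example; the abstract does not specify any of them.

```python
from math import sqrt

def standardise(rows):
    """Z-score each feature column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(vectors, k):
    """Repeatedly merge the two closest clusters until k remain.

    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up frequencies per 1,000 words (hypothetical features, not corpus results):
# first-person pronouns, passive verbs, nouns.
labels = ["conversation_1", "conversation_2", "academic_1", "academic_2"]
freqs = [
    [60.0, 2.0, 180.0],
    [55.0, 3.0, 190.0],
    [5.0, 20.0, 300.0],
    [8.0, 18.0, 310.0],
]

groups = average_linkage(standardise(freqs), k=2)
for group in groups:
    print([labels[i] for i in group])
```

With these invented values, the two conversation-like rows and the two academic-like rows end up in separate clusters. On real corpus data the number of clusters, the linkage rule and the feature set all shape the result, which is one reason the output needs to be interpreted with care.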
