XML and TEI - by Dr Jukka Tyrkkö

23/11/2014

On 18 November Dr Jukka Tyrkkö delivered the seminar Corpora, XML and TEI at the Faculty of Philology and Translation of the University of Vigo. Dr Tyrkkö is a Fellow at the Institute for Advanced Social Research at the University of Tampere. Before moving to Tampere, he worked for many years at the VARIENG research unit of the University of Helsinki, where he retains an affiliation as Adjunct Professor of English Philology.

In the morning session Dr Tyrkkö gave a thorough and accessible explanation of the rationale behind corpus annotation in XML. Amongst the topics discussed were the concepts of ‘text structure’ and ‘structured data’, the difference between text and document, mark-up languages and corpus annotation, and the nuts and bolts of XML, such as elements, attributes, entities, well-formedness and validity. The afternoon session successfully combined hands-on tasks with discussion of the standard TEI guidelines for mark-up. Working with examples drawn from actual data from Dr Tyrkkö’s research projects, participants had the opportunity to explore the functionalities of the XML editor oXygen, such as building an XML file and preparing a TEI header. Other topics raised included the DTD, XSLT, and the TEI Roma tool.
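For readers who did not attend, the distinction between elements, attributes and well-formedness can be illustrated with a small sketch. It is not taken from the seminar materials: the sample document, its contents and the helper names `is_well_formed` and `header_title` are invented here for illustration, and the TEI header shown is only a minimal one (a `fileDesc` with title, publication and source statements). The sketch uses Python's standard library, which checks well-formedness only; checking validity would additionally require a schema such as a DTD or one generated with Roma.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; every TEI element in the sample lives in it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# An invented, minimal TEI document: a header plus one paragraph of text.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample text</title></titleStmt>
      <publicationStmt><p>Unpublished example</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p n="1">Hello, corpus.</p></body></text>
</TEI>"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML (well-formedness only, not validity)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def header_title(xml_text):
    """Return the title recorded in the TEI header, or None if it is absent."""
    root = ET.fromstring(xml_text)
    title = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS)
    return title.text if title is not None else None


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))                 # properly nested elements
    print(is_well_formed("<p><hi>unclosed</p>"))  # <hi> is never closed
    print(header_title(SAMPLE))
```

In the sample, `p` is an element and `n="1"` is an attribute on it. The second check fails because the `hi` element is opened but never closed, which is exactly the kind of error an XML editor flags as a well-formedness problem.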

The seminar was attended by members of the LVTC research group, colleagues from the English Linguistics Circle, final-year undergraduate students and postgraduate students. It was funded by the Vicerreitoría de Investigación of the University of Vigo, the research group LVTC (grant GPC2014 from the Xunta de Galicia) and the ‘Programa de Doutoramento Interuniversitario en Estudos Ingleses Avanzados: Lingüística, Literatura e Cultura’.