Automatic Mapping Among Lexico-Grammatical Annotation Models (AMALGAM)
SCHOOL OF COMPUTER STUDIES
The AMALGAM project has developed a set of resources
for qualitative comparisons between the main Part-of-Speech tagsets and
phrase structure grammar schemes used in English corpus linguistics.
Software has been developed to tag text with up
to 8 different PoS-tag schemes. This software was used to create a Multi-Tagged
Corpus, a sample of text annotated with a range of alternative PoS-tagging
schemes, to enable researchers to compare how the schemes apply to a common
"gold standard" corpus. We have also collected a MultiTreebank, a set of
sentences each annotated with a range of parse-trees from rival parsers
and parsing schemes.
BRIEF OVERVIEW OF AMALGAM
IN-DEPTH REVIEW OF AMALGAM
PUBLICATIONS
THE AMALGAM MULTI-TAGGED CORPUS
(A collection of 180 sentences tagged with 9 different tagging schemes)
A MULTI-TREEBANK
(A collection of 60 sentences parsed according to several rival parsing schemes)
LINKS TO OTHER SITES
This site (occasionally) maintained by Eric Atwell (eric@comp.leeds.ac.uk) using text provided by the staff and students of the NLP research group of the School of Computing at Leeds University.