Skip to main content
Academia.eduAcademia.edu
Recent research in automated learning has focused on algorithms that learn from a combination of tagged and untagged data. Such algorithms can be referred to as semi-supervised in contrast to unsupervised, which refers to algorithms... more
We describe the results of performing text mining on a challenging problem in natural language processing, word sense disambiguation. We compare two methods of unsupervised learning, Ward's minimum {variance clustering and the EM... more
This paper presents an empirical comparison of a variety of model evaluation criteria used in backwards sequential search (BSS). Both information criteria (IC) and signi cance tests are compared when applied to the problem of word sense... more
The Naive Mix is a new supervised learning algorithm that is based on a sequential method for selecting probabilistic models. The usual objective of model selection is to find a single model that adequately characterizes the data in a... more
The Naive Mix is a new supervised learning algorithm that is based on a sequential method for selecting probabilistic models. The usual objective of model selection is to find a single model that adequately characterizes the data in a... more
bruce~cs, unca. edu We present a corpus-based approach to word-sense disambiguation that only requires information that can be automatically extracted from untagged text. We use unsupervised techniques to estimate the parameters of a... more
The past three decades have seen a steady growth of interest in corpus-based techniques for speech and natural language processing. Beginning with the success of the early Markov chain-based speech recognition systems (Jelinek 1998) and... more
Lost-PLA casting is a digital adaption of a classic sculpture technique-lost-wax casting. Mesopotamian lost-wax cast artifacts date back to approximately 3500 BC [Hunt 1980]. At that time, wax carvings were encased in clay which was fired... more
This paper describes measures for evaluating the three determinants of how well a probabilistic classi er performs on a given test set. These determinants are the appropriateness, for the test set, of the results of (1) feature selection,... more