Speaker: Alexander Clark, lecturer at King’s College London, London, UK
Abstract: We will describe a theoretically well founded approach to learning context-free grammars from strings, based on what is called distributional learning. The approaches we discuss are based on the Syntactic Concept Lattice, which is an algebraic structure canonically associated with every language. One can show that every nonterminal that satisfies a very weak minimality condition will have non terminals that correspond to elements of this structure.
Based on this I will discuss several algorithms for learning different classes of grammars under different learning models. All of these algorithms are computationally efficient in a theoretical sense, but need modifications for application to large corpora. I will also discuss the harder problem of strong learning — learning grammars which generate not just the right strings, but also the right structures. These methods can also be extended to richer classes of mildly context sensitive grammars.