Google Translate can instantly translate between any pair of over fifty human languages (for instance, from French to English). How does it do that? Why does it make the errors that it does? And how can you build something better? Modern translation systems *learn* how to translate by reading millions of words of already-translated text, and this course will show you how they work. Despite demonstrable success over the last decade, much work remains to be done, so we will also identify open questions at the heart of current research, as well as computational and linguistic insights that may help solve them. The course covers a diverse set of fundamental building blocks from linguistics, machine learning, algorithms, data structures, and formal language theory, along with their application to a real and difficult problem in artificial intelligence.
Email: alopez (AT) cs (DOT) jhu (DOT) edu
Adam Lopez is a research scientist at Johns Hopkins University in the Human Language Technology Center of Excellence. His research and teaching focus on technology that will break the language barrier, in particular systems that learn how to translate from vast amounts of data (like Google Translate); his work draws on core ideas from algorithms, machine learning, formal language and automata theory, and computational linguistics. Previously he was a research fellow in the machine translation research group at the University of Edinburgh, where he moved after earning his Ph.D. in computer science from the University of Maryland.