Speaker: Rasmus Berg Palm, doctoral candidate at the Technical University of Denmark, Copenhagen, Denmark.
Abstract: Computers are powerful aids and ubiquitous in modern society. Unfortunately, they don’t yet understand human communication. As a result, to use these powerful aids, humans must constantly act as translators between messy human communication and the structured information the computer needs.
At Tradeshift we strive to learn an end-to-end system for extracting such structured information from invoices. To be viable, our system must learn a general model of invoices, so that it generalizes to new customer layouts with zero configuration, and it must learn under severe constraints on the quality of the available data. This is a challenging task with several interesting academic problems. I’ll detail our current system, the challenges identified, and our research into addressing these challenges. Since the path of research is a winding one, we end up predicting the end of the world and solving Sudokus.