AI is rapidly moving from cloud platforms to edge devices, with edge processors increasingly incorporating hardware accelerators for real-time AI processing. Despite tight cost, bandwidth, and power constraints, edge AI processors must still deliver the high performance system developers require. In this talk, Expedera co-founder and Chief Scientist Sharad Chole will discuss the challenges chip designers and system architects face in applying AI within tight power, performance, and area budgets, and how state-of-the-art packet-based NPUs (Neural Processing Units) can be employed to meet and exceed system design goals.