Update! This is now also a blog post. Read it here.
Back in November 2018, I gave a talk at DataEngConf (which has since rebranded as DataCouncil) in New York City. I had a ton of fun preparing for and giving the talk, which covered some of our work at Cortico on detecting duplicate content in American talk radio. I’ll have a blog post up on how we went about it (featuring Python, Spark, and Kubernetes), but for now, here’s a link to the talk and the slides!