#6 - OCR Part 1: teaching digital machines to read paper documents
Bank statements, credit card statements, and tax forms all contain valuable data, but it's trapped on paper and in PDFs. We humans recognize the ink patterns as letters, but those patterns contain no instructions for the computer. Optical Character Recognition (OCR) is how machines learn to read.
We explore the mechanics of OCR - the scale of the paper problem in financial services and why paper-based data is so difficult for computers to extract. We look at how accuracy statistics for machines can be misleading and why that results in people - lots of people - staying involved in the digitization process.
This week's conversation is a prelude to the next, where we'll look at OCR startups and the tremendous business opportunities they're starting to unlock.
Check out this week's letter for the full story. Follow @FatTailThoughts on Twitter and your co-hosts @KleeBeard and @StevenDickens3 for more content.
34 episodes