14: Building a Character-Based Text Classifier

The Data Life Podcast

Content provided by Sanket Gupta. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Sanket Gupta or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described at https://fi.player.fm/legal.

5y ago 23:20


Ever wonder how to automatically detect language from a script? How does Google do it?

Ever wonder how Amazon knows whether you are searching for a product or a SKU in its search bar?

In this episode we look at character-based text classifiers and cover two types of models. The first is bag-of-words models such as Naive Bayes, logistic regression, and a vanilla neural network. The second is sequence models such as LSTMs: how to prepare your characters for an LSTM (one-hot encoding, padding, and creating character embeddings), how to feed them into the LSTM, and how to set up and compile the model.
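To make the character preparation steps concrete, here is a minimal sketch in plain Python, not the code from the episode. It builds a character vocabulary, integer-encodes and pads a string, expands the indices into one-hot rows, and also produces order-free character counts of the kind a bag-of-words model would take as input. The function names, the reserved `PAD` and `UNK` indices, and the toy inputs are all assumptions made for illustration.

```python
from collections import Counter

PAD = 0  # assumed index reserved for padding
UNK = 1  # assumed index for characters missing from the vocabulary


def build_vocab(texts):
    """Map each character seen in `texts` to an integer index (2 and up)."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Turn a string into a fixed-length list of character indices."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]  # truncate long inputs
    return ids + [PAD] * (max_len - len(ids))            # right-pad short inputs


def one_hot(ids, vocab_size):
    """Expand each index into a one-hot row of length `vocab_size`."""
    return [[1 if j == i else 0 for j in range(vocab_size)] for i in ids]


def bag_of_chars(text, vocab):
    """Order-free character counts, listed in vocabulary index order."""
    counts = Counter(text)
    return [counts.get(ch, 0) for ch in sorted(vocab, key=vocab.get)]


texts = ["hola", "hello"]          # toy corpus, purely illustrative
vocab = build_vocab(texts)
vocab_size = len(vocab) + 2        # +2 for the PAD and UNK slots

ids = encode("hello", vocab, max_len=6)
rows = one_hot(ids, vocab_size)
counts = bag_of_chars("hello", vocab)

print(ids)
print(counts)
```

In a deep learning framework, the padded index lists would typically go into an embedding layer ahead of the LSTM, with the padding index masked so it does not affect the prediction, while the count vectors are the sort of input a Naive Bayes or logistic regression model would use.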

Thanks for listening, and if you find this content useful, please leave a review and consider supporting this podcast from the link below.

--- Send in a voice message: https://podcasters.spotify.com/pod/show/the-data-life-podcast/message Support this podcast: https://podcasters.spotify.com/pod/show/the-data-life-podcast/support

27 episodes

#Tech #Sanket Gupta #Data Science