In part 1, we focused on acquiring the data and formatting the categories. Here, we’ll focus on preparing the product titles (and, optionally, the short descriptions) before training our model.
Recently, I had the opportunity to present part of my work at Lazada: ranking products in catalog and search results to improve customer experience and conversion. Conference session details are available here.
Here’s the deck I presented. Any feedback on how we can improve our ranking framework, or how I can improve my presentation, is welcome.
To gain practice with building data products end-to-end, I recently developed a product classification API. The API classifies a product based on its title: instead of figuring out which category your product belongs to (out of thousands), you provide the title and the API returns the top 3 most likely categories. A demo of the API is available here: datagene.io/categorize_web.
In this series of posts, I’ll walk through the process of building such an API, including:
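To make the "top 3 most likely categories" idea concrete, here is a minimal sketch of the selection step. It assumes some model has already produced a score per category for a given title. The function name `top_k_categories`, the category names, and the scores are all hypothetical and for illustration only; this is not the API's actual implementation.

```python
from typing import Dict, List, Tuple


def top_k_categories(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (category, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical scores for a single product title (made-up values).
example_scores = {
    "Electronics > Headphones": 0.62,
    "Electronics > Speakers": 0.21,
    "Cell Phones > Accessories": 0.09,
    "Toys > Games": 0.05,
    "Home > Kitchen": 0.03,
}

top3 = top_k_categories(example_scores)
for category, score in top3:
    print(f"{category}: {score:.2f}")
```

Returning a short ranked list rather than a single label lets the user pick the right category even when the model's first guess is slightly off.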
- Data acquisition and formatting (this post)
- Data cleaning and preparation (part 2)
- API development (part 3)