Thoughts on CS6460: Education Technology

I recently completed the OMSCS course on Education Technology and found it to be one of the most innovative courses I’ve taken. There is no pre-defined curriculum or syllabus, though there are many videos and materials available. Learners have the autonomy and freedom to view the course videos and materials in any order and at their own pace. The course is centred on a big project, and learners pick up the necessary knowledge and skills as they progress on the project.

Here are my thoughts on the course for those who are looking to enrol as well.

Why take the course?

One question I’ve asked myself (and close friends): “What do you think humanity needs most?” For Bill Gates, it was personal computing. For Elon Musk, it’s becoming a multi-planet species and clean vehicles and energy. Personally, my goals are not as lofty: I believe that humanity needs healthcare and education most. This belief, and the availability of these electives, was one of the key reasons I enrolled in OMSCS. Thus, I was elated to get a spot in the immensely popular EdTech course.

There are many rave reviews on how David Joyner is an excellent professor. His courses (i.e., human-computer interaction, knowledge-based AI, and education technology) have great reviews and are notable for their rigour and educational value. He is also a strong proponent of scaling education (which I believe is one of the key approaches to improving education). Here’s his recent paper on scaling Computer Science education.

Being keenly interested in how I could use technology (and perhaps data science) to improve education and learning outcomes, I enrolled in the course in Summer 2018.

What’s the course like?

If you’re looking for a traditional post-graduate level course, you’ll not find it here. There is a surprising lack of obvious structure and step-by-step instructions. Some learners found this disorienting at first, and a few got lost along the way. Others found the course structure (wait, didn’t you say there’s no structure?) refreshing, as it allowed them to direct their focus and effort more effectively and learn more.

There’s no structure? What do you mean?

For a start, there are no weekly lectures. There is also no weekly reading list. Right from week 1, you’re thrown in at the deep end. Your first assignment requires you to pick a few projects of interest, out of hundreds, and discuss them in an essay. There is a rich repository of curated videos, articles, and papers available from the first week, and you can view all of them in week 1, or none by the end of the course. For some learners, this can feel like too much freedom and be slightly overwhelming.


Thoughts on CS6300: Software Development Process

Recently, I completed the Georgia Tech OMSCS Software Development Process (CS6300) course over the summer. It was very enriching: I learnt about proper software engineering practices and created apps in Java and Android. Here’s an overview of my experience, for those who are considering taking it.

Why did I take this course?

Since entering the data and technology industry a couple of years ago, I’ve always felt the need to improve my skills in software engineering. This is compounded by my lack of (i) a computer science degree (I studied psychology) and (ii) hardcore software industry experience.

Through online courses and work experience (at IBM and Lazada), I picked up decent coding and engineering skills. While I’m able to build robust and maintainable data products in Python and Scala (Spark), I felt the need for a formal class on software engineering fundamentals so as to develop more sophisticated applications with greater efficiency. This includes learning about good architecture design, software development cycles, etc.

What did we build during the course?

For the summer 2017 run of CS6300, the bulk of the work revolved around two main projects and multiple smaller individual assignments:

  • Team project: Teams of 3 to 4 built an Android app where users could log in and solve cryptogram puzzles. Solutions and scores for each puzzle were to be persisted locally as well as updated on an external web service. Users could then view a global leaderboard with top player scores. The app also required functionality for administrative users to add new players and cryptograms. This had to be built over 3 weeks, and many teams found this barely enough.

[Screenshots: player features and screens]

[Screenshots: admin features and screens]


Product Categorization API Part 3: Creating an API

This post is part 3, and the last, of the series on building a product classification API. The API is available for demo here. Parts 1 and 2 are available here and here.

In part 1, we focused on acquiring the data, and cleaning and formatting the categories. Then in part 2, we cleaned and prepared the product titles (and short description) before training our model on the data. In this post, we’ll focus on writing a custom class for the API and building an app around it.

The desired end result is a webpage where users can enter a product title and get the top three most appropriate categories for it, like so.

[Screenshot of the resulting webpage]
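To make the "top three categories" idea concrete, here is a minimal sketch in Python. It is not the code behind the API: the function name `top_categories`, the category names, and the scores are all hypothetical, and it simply assumes the model produces one score per category for a given product title.

```python
import heapq


def top_categories(scores, k=3):
    """Return the k highest-scoring (category, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical model scores for a single product title (made-up values).
scores = {
    "Electronics": 0.05,
    "Toys & Games": 0.62,
    "Baby": 0.21,
    "Home & Kitchen": 0.09,
    "Books": 0.03,
}

top3 = top_categories(scores)
for category, score in top3:
    print(f"{category}: {score:.2f}")
```

Using `heapq.nlargest` avoids sorting the full list of categories, which may matter when there are thousands of them.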


Image search is now live!

After finishing the image classification API, I wondered if I could go further. How about building a reverse image search engine? You can try it out here: Image Search API

What is reverse image search?

In simple terms, given an image, reverse image search finds other similar images—this would be helpful in searching for similar looking products.
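One common way to implement this kind of search is to represent every image as a feature vector and return the images whose vectors are closest to the query's. The sketch below illustrates that idea with cosine similarity. It is an assumption about the general approach, not a description of how this particular engine works; the image ids and the three-dimensional vectors are made up, and real feature vectors would typically come from a trained neural network and be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the ids of the k indexed images most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:k]]


# Hypothetical index of image id -> feature vector.
index = {
    "plush_bear": [0.9, 0.1, 0.0],
    "plush_rabbit": [0.8, 0.3, 0.1],
    "steel_kettle": [0.0, 0.2, 0.9],
}

query = [0.9, 0.15, 0.0]  # made-up features of the uploaded image
results = most_similar(query, index)
print(results)
```

A linear scan like this is fine for a handful of images; a larger catalogue would usually need an approximate nearest-neighbour index instead.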

How do I use it?

“My son has this plushie he really likes, but I don’t know what the name is… How can I find similar plushies?”


Product Classification API Part 2: Data Preparation

This post is part 2 of the series on building a product classification API. The API is available for demo here: datagene.io/categorize_web. Part 1 is available here; Part 3 is available here.

In part 1, we focused on data acquisition and formatting the categories. Here, we’ll focus on preparing the product titles (and short description, if you want) before training our model.


Sharing about my work in Lazada at Strata + Hadoop 2016

Recently, I had the opportunity to share part of my work at Lazada: ranking products in catalog and search results to improve customer experience and conversion. Conference session details are available here.

Here’s the deck presented. Any feedback on how we can improve our ranking framework, or how I can improve my presentation, is welcome.

Image classification API is now live!

After toiling for a few months on this, product image classification is now live on Datagene.io! While the product classification API works with product titles, the image classification API works with product images, though only for fashion.

Some facts about the image classification API:

  • Works best with e-commerce-style fashion images (as that’s what it was trained on)
  • Top-1 validation accuracy: 0.76; Top-5 validation accuracy: 0.974
  • Returns results in under 300 milliseconds (will be faster in batch mode with GPU)
  • Built on Keras and Theano, and runs on a tiny AWS server without GPU.
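For anyone unfamiliar with the top-1 and top-5 figures above, here is a minimal sketch of how top-k accuracy is usually computed: a prediction counts as correct if the true label is among the k highest-scoring classes. The function `top_k_accuracy` and the tiny three-class dataset are made up for illustration and are not taken from the actual evaluation code.

```python
def top_k_accuracy(score_rows, true_labels, k):
    """Fraction of examples whose true label is among the k top-scoring classes."""
    hits = 0
    for scores, label in zip(score_rows, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical per-class scores for four images and three classes.
score_rows = [
    [0.7, 0.2, 0.1],  # true class 0: ranked first
    [0.5, 0.3, 0.2],  # true class 1: ranked second
    [0.6, 0.3, 0.1],  # true class 2: ranked third
    [0.1, 0.8, 0.1],  # true class 1: ranked first
]
true_labels = [0, 1, 2, 1]

top1 = top_k_accuracy(score_rows, true_labels, k=1)
top2 = top_k_accuracy(score_rows, true_labels, k=2)
print(top1, top2)
```

Top-k accuracy can only stay the same or rise as k grows, which is why the top-5 figure is higher than the top-1 figure.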
