Thoughts on CS6460: Education Technology

I recently completed the OMSCS course on Education Technology and found it to be one of the most innovative courses I’ve taken. There is no pre-defined curriculum and syllabus, though there are many videos and materials available. Learners have the autonomy and freedom to view the course videos and materials in any order and at their own pace. The course is focused around a big project, and learners pick up the necessary knowledge and skills as they progress on the project.

Here are my thoughts on the course for those who are looking to enrol as well.

Why take the course?

One question I’ve asked myself (and close friends): “What do you think humanity needs most?” For Bill Gates, it was personal computing. For Elon Musk, it’s becoming a multi-planet species, along with clean vehicles and energy. Personally, my goals are not as lofty: I believe that humanity needs healthcare and education most. This belief, and the availability of these electives, was one of the key reasons I enrolled in OMSCS. Thus, I was elated to get a spot in the immensely popular EdTech course.

There are many rave reviews on how David Joyner is an excellent professor. His courses (i.e., human-computer interaction, knowledge-based AI, and education technology) have great reviews and are notable for their rigour and educational value. He is also a strong proponent of scaling education (which I believe is one of the key approaches to improving education). Here’s his recent paper on scaling Computer Science education.

Being keenly interested in how I could use technology (and perhaps data science) to improve education and learning outcomes, I enrolled in the course in Summer 2018.

What’s the course like?

If you’re looking for a traditional post-graduate level course, you’ll not find it here. There is a surprising lack of obvious structure and step-by-step instructions. Some learners found this disorienting (initially), and some got lost along the way. Others found the course structure (wait, didn’t you say there’s no structure?) refreshing, as it allowed them to direct their focus and effort more effectively and learn more.

There’s no structure? What do you mean?

For a start, there are no weekly lectures. There is also no weekly reading list. Right from week 1, you’re thrown in at the deep end. Your first assignment requires you to pick a few projects of interest, out of hundreds, and discuss them in an essay. There is a rich repository of curated videos, articles, and papers available from the first week, and you can view all of them in week 1, or none by the end of the course. This can feel like too much freedom for some learners, and slightly overwhelming.

Thoughts on CS7642: Reinforcement Learning

I know, I know, I’m guilty of not writing over the last four months. Things have been super hectic with Project Voyager at Lazada, switching over to the new platform at the end of March and then preparing for our birthday campaign at the end of April. Any free time I had outside of that was poured into Georgia Tech’s Reinforcement Learning course (CS7642), which is the subject of this post.

The course was very enriching and fun. Throughout the course, we learnt techniques that allow one to optimize outcomes while navigating the world, and were introduced to several seminal and cutting-edge (at least back in 2018) papers. I highly recommend this course for anyone in the Georgia Tech OMSCS programme.

An especially fun project involved landing a rocket in OpenAI’s LunarLander environment. This was implemented via deep reinforcement learning approaches. Here’s how the agent did on its first try (in reinforcement learning, we refer to the “models” as agents; more here). As you can see, it crashed right onto the surface. =(

Data Science Challenges & Impact @ Lazada

I was recently invited to speak at the Big Data & Analytics Innovation Summit about data science at Lazada. There were plenty of sessions on potential use cases and case studies from other companies, but none on the challenges of building and scaling a data science function. Thus, I decided to share some of the challenges faced during Lazada-Data’s three-year journey, as we grew from 4-5 pioneers to a team of 40-ish.

In a nutshell, the three key challenges faced were:

  • How much business input/overriding to allow?
  • How fast is “too fast”?
  • How to set priorities with the business?

How much business input/overriding to allow?

How do we balance the trade-off between people in the business providing manual input and machine learning systems that make decisions automatically? Business input usually takes the form of rules or manual processes, while machine learning in production usually works via a black-box algorithm.

Before the data science team came to be, processes were done via rules or manual labour. E.g., rules (usually regex) to (i) categorize products, (ii) determine fraudulent transactions, or (iii) redirect users to specific pages based on their search terms. However, this approach was not scalable in the long run.
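
To make the rule-based approach concrete, here is a minimal sketch in Python of regex rules for product categorisation. The patterns, the category names, and the `categorize` helper are all hypothetical and made up for illustration; they are not Lazada's actual rules.

```python
import re

# Hypothetical rules: (pattern, category) pairs, checked in order.
# The first pattern that matches the product title wins.
CATEGORY_RULES = [
    (re.compile(r"\b(iphone|smartphone|mobile phone)\b", re.IGNORECASE), "Mobiles"),
    (re.compile(r"\b(t-?shirt|jeans|dress)\b", re.IGNORECASE), "Fashion"),
    (re.compile(r"\b(laptop|notebook)\b", re.IGNORECASE), "Computers"),
]


def categorize(title, default="Uncategorized"):
    """Return the category of the first rule whose pattern matches the title."""
    for pattern, category in CATEGORY_RULES:
        if pattern.search(title):
            return category
    return default


print(categorize("Apple iPhone 7 32GB"))   # Mobiles
print(categorize("Men's Cotton T-Shirt"))  # Fashion
print(categorize("Notebook paper A5"))     # Computers (a misfire: this is stationery)
```

The last call hints at why rules like these are hard to scale: each misfire needs another rule or exception, and the list keeps growing.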

With the data science team helping with their “black box” algorithms and machine learning systems, the business had to get used to having those tasks automated. While there were several stakeholders who embraced the automation and freeing up of manpower, some resisted. Those who resisted wanted to retain control over business processes, usually through manual input and rules, as they believed the automated systems were inferior in some aspect. There was also the fear of being made redundant.

Our experience has been that manual input to override algorithms and systems is necessary to some extent, but harmful if overdone (example coming up next). In addition, rules are difficult to maintain! When you have more than 1,000 rules in each domain, who will maintain and QA them daily to check if they still make sense, are applied correctly, and lead to the desired outcomes?

Building a Strong Data Science Team Culture

I know, I know. I’m guilty of not posting over the past four months. Things have been super hectic at Lazada with Project Voyager (i.e., migrating to Alibaba’s tech stack) since last September and then preparing for our birthday campaign at the end of April. In fact, I’m writing this while on vacation =)

One of my first objectives after becoming Data Science Lead at Lazada a year ago was to build a strong team culture. Looking back, based on feedback from the team and leadership, this endeavour was largely a success and contributed to increased team productivity and engagement.

Why culture?

When I first joined the Lazada data team, we had 4-5 data engineers and data scientists combined. A year later, we grew to 16. After another year, we were 40-ish. During 1-on-1s with the team, some of the earlier team members raised concerns that our culture was being diluted as we scaled, and it “didn’t feel the same anymore”. Back then, different team members had different views of what our culture was.

In addition, during interviews, many candidates would ask about our culture—this was key in determining if Lazada Data Science was a good fit for them. Having a culture document available for sharing before interviews allowed candidates to learn more about us beforehand, and was more scalable (than answering questions at interviews).

Thoughts on CS7641: Machine Learning

I haven’t had time to write over the past few months because I was away in Hangzhou to collaborate and integrate with Alibaba. The intense 9-9-6 work schedule (9am – 9pm, 6 days a week) and the time-consuming OMSCS Machine Learning class (CS7641) left little personal time to blog.

Thankfully, CS7641 has ended, and the Christmas holidays provide a lull to share my thoughts on it.

Why take this class?

Why take another machine learning course? How will it add to my experience in applying machine learning on real world problems?

Truth be told, I am a victim of imposter syndrome. Most of my machine learning knowledge and skills are self-taught, based on excellent MOOCs including those by Andrew Ng, and by Trevor Hastie and Rob Tibshirani. CS7641 provided an opportunity to re-visit the fundamentals from a different perspective (focusing more on algorithm parameter and effectiveness analysis).

Impact of the C parameter on SVM’s decision boundary

Additionally, CS7641 covers less familiar aspects of machine learning such as randomised optimisation and reinforcement learning. These two topics were covered at an introductory, survey level, and provided sufficient depth to understand how these algorithms work, and how to apply them effectively and analyse outcomes.

Effectiveness of randomised optimisation algorithms on the travelling salesman problem (randomised hill climbing, simulated annealing, genetic algorithm, MIMIC)
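
For readers unfamiliar with these algorithms, here is a minimal sketch of the simplest one, randomised hill climbing, applied to a tiny travelling salesman instance. It is an illustration only and not the course's code: the function names, the segment-reversal neighbour move, the iteration count, and the four-city example are all assumptions, and random restarts are left out for brevity.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def randomised_hill_climb(cities, iterations=2000, seed=0):
    """Start from a random tour and keep only neighbours that shorten it."""
    rng = random.Random(seed)
    tour = list(range(len(cities)))
    rng.shuffle(tour)
    best = tour_length(tour, cities)
    for _ in range(iterations):
        # Neighbour move (an assumption): reverse a randomly chosen segment.
        i, j = sorted(rng.sample(range(len(cities)), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        length = tour_length(candidate, cities)
        if length < best:  # accept improvements only
            tour, best = candidate, length
    return tour, best


# Hypothetical instance: the four corners of a unit square.
cities = [(0, 0), (1, 0), (1, 1), (0, 1)]
tour, best = randomised_hill_climb(cities)
print(tour, best)
```

Because only improvements are accepted, the search can get stuck in local optima on larger instances, which is the weakness that simulated annealing and genetic algorithms try to address.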

My first 100 days as Data Science Lead

I recently passed my 100-day mark as Lazada’s Data Science Lead. Through this period, it wasn’t always clear what to do, or how to do it, in my new leadership role. I had numerous questions about how to transition from an individual contributor to leader, how to lead former peers, etc.

Several mentors, books, and articles provided guidance on how to transition successfully. Looking back on these first 100 days, here are some things I did that were helpful.

Shift in mindset

As an individual contributor, I had the opportunity, and was expected, to know my project inside out. I was deeply involved in the technicalities, writing code and measuring impact, and gained immense technical satisfaction from this depth. In contrast, as a leader, I was expected to know all of the team’s projects in significant, though not necessarily complete, detail, and get involved when necessary. I had to learn how to switch contexts quickly, and be comfortable with not knowing all the nitty-gritty details.

In addition, my new role meant my peers now reported to me. I was aware of the burdens of leadership, such as no longer being able to share information that I previously could as a peer. Mentors also warned that former peers might be less chummy with me, due to the new reporting relationship. Thus, I had to change my thinking on my relationships with the team: we might not remain as close as before, and this is natural given the new leadership role. One mentor suggested that I connect more with peers at my new level to get advice and build more relationships.

Sharing at Singapore Management University on Data Analytics

The Singapore Management University Business Intelligence and Analytics Club approached me with a request to share about data analytics with undergraduates. These undergraduates, who were mostly from non-technical backgrounds, had the following questions in mind:

  • What is data analytics?
  • Why data analytics?
  • How to pick up data analytics? (covered in a previous blog post)
  • How did I enter the data analytics field?

Here’s what I shared with them. Any feedback and suggestions for improvement are welcome =)