Thoughts on CS7646: Machine Learning for Trading

The 2019 spring term ended a week ago and I’ve been procrastinating on writing about how ML4T (and IHI) went. I’ve known all along that writing is DIFFICULT, but recently it seems significantly more so.

Perhaps it’s because I’ve noticed this blog has been getting a lot more traffic recently, including Prof Thad Starner commenting on my post about his course on Artificial Intelligence. This has raised my own expectations of my writing, making it harder for me to start putting pen to paper.

To tackle this, I turned to the Stoic techniques of (i) deciding if something is within my locus of control, and (ii) internalising my goals. Is it within my control how much traffic my writing receives? No. Is it within my control how much feedback I get on my writing? No.

Instead, what is within my control is writing simply and concisely to share my views on the classes, so others can learn from them and be better prepared when they take their own. This has been the goal from the start—I guess I lost track of it over time, and got distracted by other metrics.

With that preamble, let’s dive into how the ML4T course went.

Why take the course?

My personal interest in data science and machine learning is sequential data, especially data on people and behaviour. I believe sequential data will help us understand people better, as it includes the time dimension.

In my past roles in human resources and e-commerce, I worked with sequential data to identify the best notifications to send a person. For example, you would suggest a phone case after a person buys a phone, but not a phone after a person buys a phone case. Similarly, in my current role in healthcare, a great way to model a patient’s medical journey and health is via sequential models (e.g., RNNs, GRUs, transformers). I’ve found that this achieves superior results in predicting hospital admissions and disease diagnoses with minimal feature engineering.
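
To make this concrete, here’s a minimal sketch of the kind of sequential model described above, written in PyTorch. The event vocabulary size, embedding size, and hidden size are illustrative assumptions, not values from any of my actual projects.

```python
import torch
import torch.nn as nn

class NextEventModel(nn.Module):
    """Minimal GRU that predicts the next event in a sequence,
    e.g., the next purchase category or the next diagnosis code."""

    def __init__(self, num_events=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_events, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_events)

    def forward(self, event_ids):
        # event_ids: (batch, seq_len) integer-encoded events
        x = self.embed(event_ids)
        out, _ = self.gru(x)             # (batch, seq_len, hidden_dim)
        return self.head(out[:, -1, :])  # logits over the next event

# Toy usage: a batch of two sequences, five events each
model = NextEventModel()
logits = model(torch.randint(0, 1000, (2, 5)))  # shape: (2, 1000)
```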

Thus, when I heard about the ML4T course, I was excited to take it to learn more about sequential modelling—stock market data is full of sequences, especially where technical analysis is concerned. In addition, framing the problem and data from a machine learning and reinforcement learning perspective should provide useful lessons that can be applied to other datasets as well (e.g., healthcare).

Continue reading


What does a Data Scientist really do?

As a data scientist, I sometimes get approached by others with questions related to data science. This could be at work, at the meetups I organise and attend, or via questions on my blog or LinkedIn. Through these interactions, I’ve realised there is significant misunderstanding about data science: about the skills needed to practise it, as well as about what data scientists actually do.

Perception of what is needed and done

Many people are of the perception that deep technical and programming abilities, olympiad-level math skills, and a PhD are the minimum requirements, and that having such skills and qualifications will guarantee success in the field. This is slightly unrealistic and misleading, and does not help to mitigate the issue of scarce data science talent, such as reported here and here.

Similarly, based on my interactions with people, as well as comments online, many perceive that a data scientist’s main job is machine learning, or researching the latest neural network architectures—essentially, Kaggle as a full-time job. However, machine learning is just a slice of what data scientists actually do (personally, I find it constitutes less than 20% of my day-to-day work).

How do these perceptions come about?

One hypothesis is the statistical fallacy of availability. The average person would probably know about data scientists based on what they’ve seen or heard in the news and articles, or perhaps from a course or two on Coursera.

What’s likely to be the background of these data scientists? If it’s from this article on the recent Turing Award for contributions to AI, you’ll find three very distinguished gentlemen with amazing publication records, who introduced the world to neural networks, backpropagation, CNNs, and RNNs. Or perhaps you read the recent article about how neural networks and reinforcement learning achieved human-expert-level performance, and found that the team was largely composed of PhDs. If it’s from a course, the instructor is likely to have a PhD and to walk through deep mathematical proofs of machine learning techniques. Thus, based on what they can think of, or what is available in memory, many people have a skewed perception of what background a data scientist should have.

The same goes for what data scientists actually do. Most of the sexy headlines on data science involve using machine learning to solve (currently) unsolvable problems, everything from the research-based (computer games) to the very much applied (self-driving cars). In addition, given that the majority of data science courses are on machine learning, it’s no wonder that the statistical fallacy of availability skews people towards thinking that machine learning is the be-all and end-all.

Continue reading

Thoughts on CS6601: Artificial Intelligence

Happy holidays! I have just completed the exceptionally difficult and rewarding course on artificial intelligence, just as my new role involved putting a healthcare data product into production (press release here). The timing could not have been better. The combination of both led to late nights (due to work) and weekends spent entirely at home (due to study).

Why take this course?

I was curious about how artificial intelligence would be defined in a formal education syllabus. In my line of work, the term “Artificial Intelligence” is greatly overhyped, with snake-oil salesmen painting pictures of machines that learn on their own, even without any new data and, sometimes, without data at all. There are also plenty of online courses on “How to do AI in 3 hours” (okay, maybe I’m exaggerating a bit; it’s “How to do AI in 5 hours”).

Against this backdrop, I was interested to know how a top CS and engineering college taught AI. To my surprise, it included topics such as adversarial search (i.e., game playing), search, constraint satisfaction, logic, optimisation, and Bayes networks, just to name a few. This increased my excitement about learning the fundamentals of using math and logic to solve difficult problems.
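
As a taste of the adversarial search material, here’s a minimal sketch of plain minimax. The game interface (legal_moves, apply, is_terminal, score) is a hypothetical abstraction of my own, not code from the course.

```python
def minimax(state, depth, maximising):
    """Plain minimax over a two-player, zero-sum game tree.
    Assumes `state` exposes legal_moves(), apply(move),
    is_terminal(), and score() -- a hypothetical interface."""
    if depth == 0 or state.is_terminal():
        return state.score()
    values = [minimax(state.apply(move), depth - 1, not maximising)
              for move in state.legal_moves()]
    return max(values) if maximising else min(values)
```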

In addition, the course had very good reviews (4.2 / 5, one of the highest), with a difficulty of 4.3 / 5 and an average workload of 23 hours a week. Based on these three metrics, AI was rated better, more difficult, and more time-consuming than Machine Learning, Reinforcement Learning, and Computer Vision—challenge accepted! It was one hell of a ride, and I learnt a lot. However, if I had to go back in time, I’m not sure I would put myself through it again.

What’s the course like?

The course is pretty close to the real deal in AI education. Readings are based on the quintessential AI textbook, Artificial Intelligence: A Modern Approach, co-authored by Stuart Russell and Peter Norvig. The latter is a former Google Search Director who also guest lectures on Search and Bayes Nets. Another guest lecturer is Sebastian Thrun, founder of Udacity and Google X’s self-driving car program. The main lecturer, Thad Starner, is an entrance examiner for the AI PhD program and draws from his industry experience at Google (where he led the development of Google Glass) when structuring assignments.

Continue reading

Thoughts on CS6460: Education Technology

I recently completed the OMSCS course on Education Technology and found it to be one of the most innovative courses I’ve taken. There is no pre-defined curriculum or syllabus, though there are many videos and materials available. Learners have the autonomy and freedom to view the course videos and materials in any order and at their own pace. The course is centred on a big project, and learners pick up the necessary knowledge and skills as they progress on it.

Here are my thoughts on the course for those who are looking to enrol as well.

Why take the course?

One question I’ve asked myself (and close friends): “What do you think humanity needs most?” For Bill Gates, it was personal computing. For Elon Musk, it’s becoming a multi-planet species and clean vehicles and energy. Personally, my goals are not as lofty—I believe that humanity needs healthcare and education most. This belief, and the availability of these electives, were among the key reasons I enrolled in OMSCS. Thus, I was elated to get a spot in the immensely popular EdTech course.

There are many rave reviews of David Joyner as an excellent professor. His courses (i.e., human-computer interaction, knowledge-based AI, and education technology) have great reviews and are notable for their rigour and educational value. He is also a strong proponent of scaling education (which I believe is one of the key approaches to improving education). Here’s his recent paper on scaling computer science education.

Being keenly interested in how I could use technology (and perhaps data science) to improve education and learning outcomes, I enrolled for the course in Summer 2018.

What’s the course like?

If you’re looking for a traditional postgraduate-level course, you’ll not find it here. There is a surprising lack of obvious structure and step-by-step instructions. Some learners found this disorienting (initially), with a few getting lost along the way. Others found the course structure (wait, didn’t you say there’s no structure?) refreshing, allowing them to direct their focus and effort more effectively and learn more.

There’s no structure? What do you mean?

For a start, there are no weekly lectures. There is also no weekly reading list. Right from week 1, you’re thrown into the deep end. Your first assignment requires you to pick a few projects of interest, out of hundreds, and discuss them in an essay. There is a rich repository of curated videos, articles, and papers available from the first week, and you can view all of them in week 1, or none by the end of the course. This can feel like too much freedom for some learners, and slightly overwhelming.

Continue reading

Thoughts on CS7642: Reinforcement Learning

I know, I know, I’m guilty of not writing over the last four months. Things have been super hectic with project Voyager at Lazada, switching over to the new platform at the end of March and then preparing for our birthday campaign at the end of April. Any free time I had outside of that was poured into Georgia Tech’s Reinforcement Learning course (CS7642), which is the subject of this post.

The course was very enriching and fun. Throughout the course, we learnt techniques for optimising outcomes while navigating the world, and were introduced to several seminal and cutting-edge (at least as of 2018) papers. I highly recommend this course for anyone who’s part of the Georgia Tech OMSCS.

An especially fun project involved landing a rocket in OpenAI’s LunarLander environment. This was implemented via deep reinforcement learning approaches. Here’s how the agent did on its first try (in reinforcement learning, we refer to the “models” as agents; more here). As you can see, it crashed right onto the surface. =(
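
For a sense of what that first try looks like in code, here’s a minimal sketch of the environment loop with an untrained, random policy. It assumes the classic Gym API for LunarLander-v2; newer Gymnasium releases return slightly different tuples from reset() and step().

```python
import gym

# An untrained (random) agent on LunarLander-v2 will almost
# certainly crash, much like the first attempt described above.
env = gym.make("LunarLander-v2")
obs = env.reset()
done, total_reward = False, 0.0

while not done:
    action = env.action_space.sample()          # random policy
    obs, reward, done, info = env.step(action)  # classic Gym API
    total_reward += reward

print(f"Episode return: {total_reward:.1f}")    # large negative = crash
env.close()
```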

Continue reading

Thoughts on CS7641: Machine Learning

I haven’t had time to write the past few months because I was away in Hangzhou to collaborate and integrate with Alibaba. The intense 9-9-6 work schedule (9am – 9pm, 6 days a week) and time-consuming OMSCS Machine Learning class (CS7641) left little personal time to blog.

Thankfully, CS7641 has ended, and the Christmas holidays provide a lull to share my thoughts on it.

Why take this class?

Why take another machine learning course? How would it add to my experience in applying machine learning to real-world problems?

Truth be told, I am a victim of imposter syndrome. Most of my machine learning knowledge and skills are self-taught, based on excellent MOOCs, including those by Andrew Ng, and by Trevor Hastie and Rob Tibshirani. CS7641 provided an opportunity to revisit the fundamentals from a different perspective (focusing more on algorithm parameter and effectiveness analysis).

Impact of the C parameter on SVM’s decision boundary
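
For reference, here’s a minimal sketch of the kind of parameter analysis behind the figure above, using scikit-learn on a toy dataset; the dataset and C values are illustrative, not from my actual assignment.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small C regularises heavily (wider margin, simpler boundary);
# large C fits the training data more tightly.
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for C in (0.01, 1, 100):
    clf = SVC(kernel="rbf", C=C).fit(X_train, y_train)
    print(f"C={C:>6}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```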

Additionally, CS7641 covers less familiar aspects of machine learning such as randomised optimisation and reinforcement learning. These two topics were covered at an introductory, survey level, but with sufficient depth to understand how the algorithms work, how to apply them effectively, and how to analyse the outcomes.

Effectiveness of randomised optimisation algorithms on the travelling salesman problem (randomised hill climbing, simulated annealing, genetic algorithm, MIMIC)
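
To illustrate one of the four algorithms in the figure, here’s a minimal sketch of simulated annealing on a random travelling salesman instance; the city count and cooling schedule are illustrative assumptions, not the course’s setup.

```python
import math
import random

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(20)]

def tour_length(tour):
    return sum(math.dist(cities[tour[i]], cities[tour[i - 1]])
               for i in range(len(tour)))

# Simulated annealing: accept worse neighbours with a probability
# that decays as the temperature cools.
tour = list(range(len(cities)))
current_len = tour_length(tour)
temp = 1.0
while temp > 1e-3:
    i, j = sorted(random.sample(range(len(tour)), 2))
    candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # 2-opt reversal
    delta = tour_length(candidate) - current_len
    if delta < 0 or random.random() < math.exp(-delta / temp):
        tour, current_len = candidate, current_len + delta
    temp *= 0.999

print(f"Tour length after annealing: {current_len:.3f}")
```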

Continue reading

Thoughts on CS6300: Software Development Process

I recently completed the Georgia Tech OMSCS Software Development Process (CS6300) course over the summer. It was very enriching—I learnt about proper software engineering practices and created apps in Java and Android. Here’s an overview of my experience for those who are considering taking it.

Why did I take this course?

Since entering the data and technology industry a couple of years ago, I’ve always felt the need to improve my skills in software engineering. This is compounded by my lack of (i) a computer science degree (I studied psychology) and (ii) hardcore software industry experience.

Via online courses and work experience (at IBM and Lazada), I picked up decent coding and engineering skills. While I’m able to build robust and maintainable data products in Python and Scala (Spark), I felt the need for a formal class on software engineering fundamentals so as to develop more sophisticated applications with greater efficiency. This includes learning about good architecture design, software development cycles, etc.

What did we build during the course?

For the summer 2017 run of CS6300, the bulk of the work revolved around two main projects and multiple smaller individual assignments:

  • Team project: Teams of 3–4 built an Android app where users could log in and solve cryptogram puzzles. Solutions and scores for each puzzle had to be persisted locally as well as updated on an external web service. Users could then view a global leaderboard with top player scores. The app also required functionality for administrative users to add new players and cryptograms. All of this had to be built over 3 weeks—many teams found this barely enough.

Player features and screens

Admin features and screens

Continue reading