Data Science and Agile—can or not?

Recently, I was invited to moderate a panel on the topic “Data Science and Agile–can or not?” It’s a Singlish way of asking if Agile can be applied in the domain of data science. The panel was held in conjunction with GovTech’s inaugural STACK conference for developers, programmers, and technologists from the private sector.

[Photo: the panel]

Who was in the panel?

The panel involved the following guests, from right to left in the photo above:

  • Ivan Zimine: Physicist and neuroscientist who works on complex systems while applying open source and open practices.
  • Adam Drake: Formerly Chief Data Officer at Skyscanner and Redmart, with an exemplary record in the design, development, and delivery of cost-effective, high-performance tech teams and systems.
  • Steven Koh: Director of Government Digital Services at GovTech leading the Agile Consulting and Engineering team and evangelising agile development in the government.
  • Eugene Yan (that’s me as moderator): Formerly VP of Data Science at Lazada (acquired by Alibaba), currently Senior Data Scientist at uCare.ai.

What were some of the notable questions and responses?

I don’t do anything related to technology–why should I care about agile or scrum?

The view from the panel was that Agile is a mindset and culture of having small iterations, continuous feedback, and course corrections and improvements. This is applicable not just in tech, but in other areas as well.

One panelist gave a humorous (though not very accurate) example of adopting agile in budget planning, where daily spending is adjusted based on the month’s cumulative spending so far, together with an estimate of total spending for the month. While these estimates are unlikely to be accurate down to the dollars and cents, they provide a ballpark figure for one to aim for.
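To make the budgeting analogy a little more concrete, here is a minimal Python sketch of that feedback loop. The function names and the figures in the usage note are hypothetical and were not part of the panel discussion; the projection simply assumes spending continues at the average daily rate so far.

```python
def remaining_daily_allowance(monthly_budget, spent_so_far, days_left):
    """Spread whatever budget remains evenly over the days left in the month."""
    if days_left <= 0:
        raise ValueError("days_left must be positive")
    # Never return a negative allowance if the budget is already blown.
    return max(monthly_budget - spent_so_far, 0) / days_left


def projected_month_total(spent_so_far, days_elapsed, days_in_month):
    """Ballpark month-end total, assuming the average daily rate so far continues."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spent_so_far / days_elapsed * days_in_month
```

For instance, with a made-up budget of 3,000 and 1,200 spent after 10 days of a 30-day month, `projected_month_total(1200, 10, 30)` gives 3600.0 (over budget at the current pace), while `remaining_daily_allowance(3000, 1200, 20)` gives 90.0 per day as the course correction. Re-running this each day is the "small iteration" in the analogy.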


 

What does agile look like in the context of data science? How does the data science team fit into agile rituals? Do they follow daily stand-ups and planning?

Some audience members had difficulty understanding how agile could be adopted in a data science team. Others were part of data science teams that had tried practising it, but with limited success.

The panelists and I felt that the agile mindset could also be adopted in the context of data science, where projects are done in small iterative cycles. While the deliverables may not always be a working product or additional features, there are still measurable deliverables. These could come in the form of analyses that help understand possible causes of an issue, the testing of multiple hypotheses to identify the key problem, or visualisations that provide a better understanding of the context.

Overall, the intent is iterative development, instead of adopting a traditional waterfall approach. In a waterfall approach, significant time would be spent developing a project plan and technical specs which are then “frozen” (i.e., minor changes are difficult, major changes are almost impossible). Next, the system is thoroughly designed based on the tech specs and developed over several months or years. Sometimes, the system delivered may be less relevant, or no longer relevant, to the organization given that it was planned and designed years ago. Or worse, the project is found to be solving the wrong problem, or solving the right problem with the wrong approach, at the later stages of the waterfall cycle. Months or years of effort would have gone down the drain.

For example, perhaps a decision was made to develop a recommendation engine for an e-commerce site. After significant planning, designing, and development, it was finally released after a year or two. However, there was no measurable lift to site metrics (e.g., conversion, revenue, daily average users). Eventually, a study on user purchasing behaviour and journey found that most of the sales funnel was driven by the search engine, which accounted for more than 90% of clicks and purchases. Very few users actually browsed via the recommendations on the homepage, search, and product pages.

In the example above, had an agile approach been adopted, perhaps some quick analysis would have been done to determine which recommendation engine to develop first–search, product, or homepage? In this process, it would have been discovered that the potential gain from a recommendation engine was significantly less than that from improving the search engine. The team could then have course-corrected and redirected its efforts, saving precious resources and time and delivering measurable value.
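The kind of quick analysis described above can be very small. Below is a rough Python sketch that computes each entry point's share of purchases from an attribution log. The record format, the source names, and the sample counts are all invented for illustration; a real analysis would depend on how the site attributes purchases.

```python
from collections import Counter


def purchase_share_by_source(purchases):
    """Return each source's share of purchases, largest share first."""
    counts = Counter(p["source"] for p in purchases)
    total = sum(counts.values())
    # An empty log yields an empty dict, so there is no division by zero.
    return {source: n / total for source, n in counts.most_common()}


# Made-up attribution log: one record per purchase, tagged with where
# the user came from. The counts are illustrative only.
sample = (
    [{"source": "search"}] * 92
    + [{"source": "homepage_recs"}] * 5
    + [{"source": "product_page_recs"}] * 3
)

shares = purchase_share_by_source(sample)
for source, share in shares.items():
    print(f"{source}: {share:.0%}")
```

If search dominates the result, as it does in this invented sample, that is an early signal to prioritise search improvements before committing a year to a recommendation engine.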


 

What does it mean to practice Agile? Is there such a thing as true Agile? A lot of companies claim they are Agile but in reality have projects crunched by unrealistic deadlines, unclear requirements, etc.

Currently, it seems there are different schools of thought around the concept of agile and a variety of ways that one can be certified for agile and scrum. Nonetheless, at its core, agile is a mindset and the fundamental principles are the same.

The Agile Manifesto was raised, specifically its first value: “Individuals and interactions over processes and tools” (in short, people over processes). Imposing one school or practice of agile over another would violate this value. If a team finds that practising the core principles of agile helps it be more productive, that is good enough. There is no need to nitpick over whether it adheres strictly to a detailed agile methodology or technique.

In case you’ve not seen it before or need a refresher, here’s the Agile Manifesto.

[Screenshot: the Agile Manifesto]


 

I’m a recent technical graduate but have no experience with engineering in production, data science, etc. What can I do to get a technical role in either field?

The key advice from the panelists was to develop a portfolio demonstrating one’s work. This could be in the form of small apps on a cloud server, past analyses and write-ups, or simply blog posts.

One apt example was shared by one of the panelists: He was out shopping for wedding photographers and assessing them based on their past portfolios. How many people would hire a wedding photographer who did not have a portfolio? Similarly, as a technical candidate, having a portfolio helps to showcase your past work, giving potential hiring managers more confidence that you can deliver in the role.


 

Overall, it was an enjoyable discussion with the expert practitioners on the panel. I hope the audience learnt and benefited a lot from their sharing–I certainly did!

As I moderate more and more panels, I find myself enjoying the discussions to a greater extent. In some of my past panels, I was nervous about keeping the conversation going and ensuring it was useful for the audience. Recently, the conversations have been more casual, and I’m even able to joke around with the panelists. There were also natural follow-up questions that unearthed valuable experiences and anecdotes that the audience could take away. Looking forward to my next one!
