A Primer on Electronic Health Records (EHRs)

My team recently had a brownbag on the types of healthcare data available and I took the opportunity to share a bit about electronic health records. Other types of data shared included the MIMIC dataset and imaging data (e.g., X-rays, CTs, MRIs). I received feedback that it was a useful “EHR 101” and thought to share it here too.

Healthcare has a problem

In most places around the world, primary care providers and hospitals maintain their own distinct systems for electronic medical record (EMR) data. As a result, patient and medical data held by different providers are incompatible with each other, and the systems lack interoperability.

Providers want to control all digital records of their patients to ensure patient retention. This leads to data being siloed at each institution. Patients’ prescriptions, lab tests, diagnoses, etc. are not visible across institutions, contributing to significant wastage.


The other problem is poor usability. Often, these systems don’t account for human-computer interaction principles. Thus, clinicians often spend more time talking to their laptop than to the patient, contributing to clinician burnout. Furthermore, while the system works and data is dumped in, the data is often such a mess that it is impossible to use.

Enter the electronic health record (EHR).



Data Science and Agile (Frameworks for effectiveness)

This is the second post in a two-part series on Data Science and Agile. In the last post, we discussed the aspects of Agile that work, and don’t work, in the data science process. You can find the previous post here.

A quick recap of what works well

Periodic planning and prioritisation: This ensures that sprints and tasks are aligned with organisational needs, allows stakeholders to contribute their perspectives and expertise, and enables quick iterations and feedback.

Clearly defined tasks with timelines: This helps keep the data science team productive, on track, and able to deliver on the given timelines. The market moves fast and doesn’t wait.

Retrospectives and demos: Retrospectives help the team to improve with each sprint, and provide feedback and insight into pain points that should be improved on. Demos help the team to learn and get feedback from one another. If stakeholders are involved, demos also provide a view into what the data science team is working on.

What about aspects that don’t work well? And how can we get around them?

Difficulty with estimations: Data science problems tend to be more ill-defined, with a larger search space for solutions. Thus, estimations tend to be trickier, with a larger variance in error. One way around this is to set budgets in story points or man-days, and to time-box the experiments.

Rapidly changing scope and requirements: The rapidly evolving business environment may bring with it constantly changing organisational priorities. To mitigate this, hold periodic prioritisation sessions with stakeholders to ensure alignment. This also helps stakeholders better understand the overhead cost of frequent context switching.

Expectations for engineering-like deliverables after each sprint: Project managers and senior executives with an engineering background might expect working software with each sprint. This may require some engagement and education to bring about a change in mindset. While the outcomes from each sprint may not be working code, they are still valuable (e.g., experimental results, research findings, learnings, next steps).

Being too disciplined with timelines: A happy problem is being too efficient and aligned with business priorities. Nonetheless, a data science team should be working on innovation. To take a leaf out of Google’s book, a team can build in 20% innovation time. Innovation is essential for 10x improvements.

How to adapt Agile for Data Science

In light of the points discussed above, how can we apply Agile/Scrum to data science more effectively?

Here, I’ll share some frameworks/processes/ideas that worked well for my teams and me. Hopefully, they’ll be useful for you too. Namely, they are:

  • Time-boxed iterations
  • Starting with Planning and Prioritisation, Ending with Demo and Retrospective
  • Writing up projects before starting
  • Updated mindset to include innovation
