WiDS Posts | February 16, 2024

Three ways to enrich your data science career with the WiDS Datathon 2024 experience

By: Nithya Ramamoorthy
Analytics Lead, Mayo Clinic – Center for Digital Health

If you’re considering participating in the WiDS Datathon 2024 Challenge, you’re already one step ahead in using data to make a positive impact on society. Health inequity is a complex, multidimensional problem that affects women disproportionately. Much of the hope for the future lies in using data and population health analyses to improve health outcomes among women and other marginalized groups. In this article, I will share three ways WiDS Datathon participants can use this experience to progress and excel in their data science careers.

 

Ability to look beyond the question

Data literacy is a two-way street. Outside academic settings and datathons, data problems are rarely stated explicitly. When you’re working with people at the lower end of the data literacy spectrum, you will often have to look beyond their question and figure out what other insights your analysis can provide to help solve their problem. Your ability to look beyond the question and understand which problems the provided dataset can actually solve will go a long way toward producing impactful analysis.

The data exploration phase of the competition encourages an open-ended investigation of all characteristics of your independent variables. Exploratory analysis on rich datasets gives you much-needed practice in identifying relationships that offer insights beyond what the target variable can answer. Building on this phase of the Datathon, in your data science role you may also look for correlations among the given variables that are not directly related to the dependent variable. For example, in addition to predicting the time for a patient to receive their first treatment, you would also think about other aspects, such as the number of doctor visits required to treat certain types of cancer or what determines successful completion of the treatment protocol, to ultimately address the health equity question: “Is everyone getting the intervention they need to get treated for their cancer diagnosis?”
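As a rough illustration of looking beyond the target, here is a minimal Python sketch that ranks correlations among the non-target variables. The column names and values are made up for this example and are not taken from the Datathon dataset; the helper names pearson and pairwise_correlations are likewise my own.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns, exclude=()):
    """Correlate every pair of columns except those in `exclude`,
    sorted from strongest to weakest absolute correlation."""
    names = [name for name in columns if name not in exclude]
    pairs = {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(names, 2)
    }
    return sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data (assumed column names, invented values).
toy_columns = {
    "doctor_visits": [2, 4, 6, 8],
    "travel_distance_km": [5, 10, 15, 20],
    "age": [40, 60, 50, 70],
    "days_to_treatment": [30, 20, 25, 10],  # the target, left out below
}

ranked = pairwise_correlations(toy_columns, exclude=("days_to_treatment",))
for (a, b), r in ranked:
    print(f"{a} vs {b}: {r:.2f}")
```

A strong correlation here is only a prompt for a follow-up question, not a finding in itself. On a real dataset you would also want to look at categorical variables and missing values before drawing conclusions.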

Ability to look for biases in underlying data

The Datathon challenge comes with a fairly robust dataset covering demographics, diagnosis and treatment options, geographic and socioeconomic parameters, and insurance information for patients who were diagnosed with breast cancer. Compared to small-scale datasets, working with real-world datasets such as this one gives you an opportunity to check whether every group is well represented across the attributes provided. With enough practice on datasets like these, you will develop the skill to question data quality, seek ways to enrich the datasets before using them for modeling, and consistently improve based on model feedback.
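One simple way to start checking representation is to compare each group’s share in the data against a reference share, such as a census or registry figure. The sketch below assumes a hypothetical payer_type column, invented records and invented reference shares; the 5-point tolerance is an arbitrary choice for illustration.

```python
from collections import Counter


def representation_report(records, column, reference_shares, tolerance=0.05):
    """Compare each group's share in `records` with a reference share.

    Returns {group: (observed_share, reference_share, flagged)}, where
    `flagged` is True when the absolute gap exceeds `tolerance`.
    """
    counts = Counter(row[column] for row in records)
    total = sum(counts.values())
    report = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        flagged = abs(observed - expected) > tolerance
        report[group] = (observed, expected, flagged)
    return report


# Hypothetical toy records (assumed column name, invented values).
patients = (
    [{"payer_type": "commercial"}] * 6
    + [{"payer_type": "medicaid"}] * 3
    + [{"payer_type": "medicare"}] * 1
)

# Hypothetical reference shares for the population of interest.
reference = {"commercial": 0.5, "medicaid": 0.3, "medicare": 0.2}

report = representation_report(patients, "payer_type", reference)
for group, (observed, expected, flagged) in report.items():
    status = "check" if flagged else "ok"
    print(f"{group}: observed {observed:.0%}, reference {expected:.0%} ({status})")
```

A flagged group does not prove the data is biased, but it tells you where to look first, and whether enriching the dataset might be worth the effort before modeling.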

Get comfortable with ambiguity

Datathon challenges are specific and self-contained. With a clearly defined objective and a clean dataset, participants get the opportunity to hone their core modeling skills without having to worry about uncertain and ambiguous factors such as project scope, data collection, data quality and data preparation. Building up this muscle frees up brain space for the more time-consuming tasks that precede the modeling step in the real world. I’ve illustrated where these key skills come into play in the real world using a common strategy tool called the “Rumsfeld matrix.”

In essence, uncertainty and ambiguity are inevitable in the real world, but mastering the “known knowns” plays a critical role in closing the gap on the “known unknowns” and “unknown knowns” and in navigating ambiguity. Most data science roles require taking some responsibility for all four quadrants. The experience you gain in the stages after data exploration, namely model building, validation and performance tuning, will be a key differentiator. It will let you spend more time addressing “known unknowns”, such as identifying data enrichment opportunities and gaps, and “unknown knowns”, such as addressing systemic and sampling biases.

I hope these tips inspire more women to take part in the Datathon, use the experience as a stepping stone for launching their careers and, most importantly, have fun with it.

Join us for the WiDS Datathon 2024 competitions. Enjoy the WiDS Datathon!

About the author:

Nithya Ramamoorthy currently serves as an Analytics Lead at the Mayo Clinic – Center for Digital Health. She has more than a decade of experience in all things data, specifically in healthcare and consumer behavior. She holds a Master’s degree in Information Sciences and a Bachelor’s degree in Computer Science, along with several professional certifications. She strongly believes that data plays a crucial role in creating inclusive and equitable digital experiences rooted in empathy. She is passionate about diversifying and empowering data science talent, especially women, and she loves sharing the lessons and best practices from her career.