How Two Biologists Are Making Data Science Resources Accessible to Anyone Anywhere
The science of gathering, organizing, analyzing and visualizing large data sets for decision making has become a necessity, but many of the tools and resources it requires are still inaccessible to the great majority of the world’s population. Dr Sara El-Gebali and Nazeefa Fatima are working to make the most essential tools and resources accessible to anyone, regardless of where they are on the planet. With first-hand experience of Egypt, Pakistan, Sweden and other countries, the co-founders of OpenCIDER are well positioned to bridge some of the gaps in today’s data-driven digital divide.
Tell us about your journeys into data science.
Nazeefa Fatima: I have had a strong interest in bioinformatics since I was in primary class 6. Had bioinformatics not existed, I don’t think I would have ever pursued data science. No one in my high school ever expressed a desire to study bioinformatics, so I was definitely the odd one out. Unfortunately, bioinformatics wasn’t offered at the university where I did my undergraduate studies, so I ended up choosing the closest option I could find: medical genetics.
In my final undergrad year, I found a good supervisor who was working on phylogenetics, which practically requires bioinformatics. I enjoyed working on my undergraduate thesis because it gave me the freedom to work anywhere so long as I had a laptop — my colleagues would tease and ask if I was a “proper scientist” in the sense that I wasn’t ever seen in the labs wearing lab coats.
Lund University had a bioinformatics master’s degree program, which I got into, and that was when I started exploring large and varied data sets further. The skills I developed in data analysis for genomics and transcriptomics research during my grad school years prepared me for my work at the National Genomics Infrastructure in Uppsala, Sweden, which involved working with data from various sequencing technologies. I worked on different projects through research experiences that helped me build my interest in data science. This field is a huge one to explore and I learn something new every day.
Dr Sara El-Gebali: Unlike Nazeefa, I don’t have a technical background — I just do some basic programming. My background is in molecular biology with a master’s degree that focused on cancer research. After my master’s degree, I worked in various research labs on projects that were at the cutting edge, exploring new techniques and fields at the time.
One of those labs was in Bern, Switzerland, where I worked as a technician, studying the role of amino acid transporters in colon cancer progression. I was offered a PhD position at the University of Bern and continued working on the same project. I started working with microarrays and that was the point where I faced the challenge of handling big data. I realised I needed to acquire new skills.
After my PhD, I decided to continue working with those skills, not only to analyze data myself but also to train others and enable them to look at data differently. I emphasized the need to arrange and structure data so that it could be used and reused.
That was when I stumbled upon the open data and open science concepts. My passion grew rapidly and I started attending conferences, presenting, networking, listening, discussing and sharing ideas. I was gutted at the isolation and lack of inclusivity in most of the conferences — the speakers and attendees were just being recycled.
As a woman of colour who was not seeing herself represented among the attendees, I felt a sharp lack of diversity. That was exacerbated when I started teaching. I realised that what gets taught in university-level programs is actually not accessible to everybody. My question was “What do we do about this?”
How did OpenCIDER get started?
Dr Sara El-Gebali: It started off with a list I began creating when I noticed that the knowledge I was sharing in classrooms and labs was perfectly applicable in those environments but, once taken out of those spaces, had a high chance of being completely useless. My goal was to solve that problem, and I started listing basic dos and don’ts in computational training.
My list grew taller and wider when I met some of the Carpentry trainers — they gave me so much more to think about. How to schedule sessions, how to think about student allocations, how to ensure that minorities are included, and many others.
I created the first version of OpenCIDER, which was just a bunch of repositories on my GitHub. I enrolled in the eLife innovation leadership program, and that helped me a lot. It was 15 weeks of meetings, homework and mentorship, providing everything I needed to kick off the project. That was when I gave it a name and thought about how to present it properly.
What is OpenCIDER?
Nazeefa Fatima: I would say it is a collection of resources that help to promote knowledge sharing and encourage participation in open data and open science. Look at OpenCIDER as a one-stop-shop resource to help anyone whose work involves computational education, research, and training.
Dr Sara El-Gebali: For people who are setting up a data training event for the first time and want it to be an inclusive one, OpenCIDER provides guidelines that help them know what to look for. They might be interested in which tools they could use, which organizations might want to support them, and ways to adapt resources to suit the available bandwidth. OpenCIDER provides all of these, along with guidance on how communities centred on data and data management can be built and sustained in a healthy way.
OpenCIDER is also a place to understand what open data means and its importance.
Has OpenCIDER done any work in Ghana?
Dr Sara El-Gebali: We haven’t, but we are collaborating with Selassie David Opoku to learn more about his experiences from Ghana. We want to learn from experts and people in Ghana about their experiences and how they circumvent the challenges they encounter.
How can people in Ghana become a part of the OpenCIDER community?
Dr Sara El-Gebali: Come and talk to us. Tell us what challenges you face, what you need, and how we can all come together to talk about using open data for research.
How can people contribute to or support the work that you are doing?
Dr Sara El-Gebali: They can share their experiences. They can share how they teach computing and programming in their context. What are the challenges they face? How did they reach out to their audience? What impact did they get to make?
Any final words?
Dr Sara El-Gebali: It means a lot to me to help people gain new skills and knowledge. I wouldn’t be where I am today if not for the educational opportunities I had. I think education is an important factor in improving people’s lives. That is why I focus on inclusion in computational training and teaching.
Nazeefa Fatima: I would really love to see more people from low- and middle-income countries have the opportunity to learn more about open science and research. That helps us all to have a truly global community that is not restricted and offers everyone a chance.
Nominate someone for me to interview.
Angelique van Rensburg! She’s doing some amazing work with The Carpentries in South Africa, expanding computational training and digital skills.