Data from a Gender Perspective… How to Close the Data Gap between Genders


According to statistics, males make up 50.5% of the world’s population and females 49.5%. Despite this near parity, the balance is not reflected in the data that many fields around the world depend on. Among these fields are Artificial Intelligence (AI) systems, which increasingly affect all aspects of life. When AI algorithms are developed, women, i.e., half of the world’s population, are mostly ignored, and developers rely almost exclusively on data representing males. As a result, these systems inherit the bias of the data they depend on.

Data is a vital tool for creating a more just, inclusive, and sustainable world. But data can also reinforce injustice, oppression, and inequality when decision-makers ignore women and girls. This neglect may result from the lack or absence of information about women and girls’ lives and experiences. The gender data gap therefore becomes a problem in a world where data guides policy and decision making.

The data gap between the two sexes exists for several reasons. The first is the lack of resources required for gathering high-quality gender data, a problem that is especially severe in low-income countries, where the gap is more pronounced. Second, decision-makers do not treat gender data as a priority. Finally, data systems fail to record women and girls’ experiences because they do not collect data about critical aspects of women’s lives.

Statistics on the Data Gap between the Sexes

The demand for data science jobs is increasing day after day. However, women make up only 26% of data researchers in the United Kingdom. A 2020 survey by Boston Consulting Group found that women held between 15% and 22% of jobs related to data science. The gap is wider in senior leadership positions. Only 35% of women in the UK enroll in science, technology, engineering, and math (STEM) departments in higher education, and 50% of women working in technology leave their jobs by the age of 35.

According to Harnham’s 2020 report on diversity in data and analytics, the number of women working in jobs related to data science and digital analytics has increased noticeably, especially in the marketing and market data analysis sectors, where women make up 40% of workers. In the data and technology sector as a whole, however, women represent only 22% of workers.

Generally, only 30% of female students worldwide focus on subjects related to science, technology, engineering, and math during higher education. Additionally, 28% of women join Information and Communications Technology (ICT) programs, compared to 72% of men. These ratios differ considerably across regions and countries, and the gap is most severe in African countries.

Negative Impacts of the Data Gap between the Sexes

The choices data scientists make about how to measure, collect, organize, and analyze the data they use affect the insights they arrive at. These choices also open the way for bias. Data scientists, consciously or unconsciously, incorporate their values, interests, and life experiences into the data they work on. This shapes the results according to how they make sense of the world, so data and algorithms can end up being sets of values expressed in code. Women’s low participation in data science increases the risk of planning and implementing policies that harm women’s interests.

The book Invisible Women: Exposing Data Bias in a World Designed for Men points out several examples where data bias has harmed women and girls:

  • Crash Test Dummies: Historically, male bodies were used as the standard for designing test dummies for military technologies, because women were excluded from combat roles. This bias was not taken into account when similar dummies were developed for car safety tests, which is one of the reasons women are more likely than men to be severely injured in car accidents.
  • Recruitment Algorithms: When Amazon tried to design an AI system to guide recruitment decisions, the company used the resumes it had received over a decade as training data. Because most of these resumes came from male candidates, the system was biased against women and rated male candidates above female candidates. This led to many women being rejected for jobs they were fully qualified for. Amazon later addressed the problem and stopped using the system to evaluate job candidates.
  • Safe Public Toilets for Women: Women need, on average, twice as long as men to use the toilet, and they use toilets more frequently. However, one in three women worldwide struggles to find safe public toilets, which exposes them to health and safety hazards, including disease and sexual assault. Relying on fair and equitable data when deciding on the locations and designs of public toilets can help remedy some of these issues.
  • Personal Protective Equipment: PPE, such as lab coats and overalls, is designed to fit men’s bodies. As a result, only 10% of women working in the energy sector wear personal protective equipment designed for women.
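
The recruitment example above suggests a simple check that can be run before training on historical data: how much of the data comes from each group, and how often each group received the favorable outcome. The following is only a minimal sketch under stated assumptions. The field names `gender` and `shortlisted`, the helper functions, and the toy records are all hypothetical, made up for illustration, and do not describe Amazon’s actual data or system.

```python
from collections import Counter


def representation(records, key="gender"):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def selection_rates(records, key="gender", outcome="shortlisted"):
    """Return the fraction of favorable outcomes within each group."""
    totals = Counter()
    positives = Counter()
    for r in records:
        totals[r[key]] += 1
        if r[outcome]:
            positives[r[key]] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Made-up toy data, for illustration only (hypothetical field names).
resumes = (
    [{"gender": "male", "shortlisted": True}] * 6
    + [{"gender": "male", "shortlisted": False}] * 2
    + [{"gender": "female", "shortlisted": True}] * 1
    + [{"gender": "female", "shortlisted": False}] * 1
)

shares = representation(resumes)   # share of the data per group
rates = selection_rates(resumes)   # favorable-outcome rate per group
print(shares)
print(rates)
```

A large imbalance in either number does not prove that a model trained on the data will be biased, but it is a signal that the data deserves closer scrutiny before it is used to guide decisions.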

Why We Need Women in Data Science

The underrepresentation of women, girls, and other marginalized groups in data science and AI, together with the bias in data and algorithms caused by the data gap between the sexes, is not only a matter of social and economic justice but also an issue of diversity. Underrepresentation leads to bias being embedded and amplified in innovative technologies, which creates a dangerous feedback loop. A body of research documents that AI and machine learning systems can exhibit bias, and AI systems increasingly make news headlines because of their biased results.

Increasing women’s participation and addressing structural discrimination are necessary to ensure women’s right to have their points of view and priorities reflected in the analytics data scientists develop, the AI systems they build, and the research agendas they set.

Diverse work teams reduce the prediction errors of AI systems related to race and gender, which also improves the quality of the work. Additionally, team diversity helps increase returns on innovation, because people of different backgrounds and experiences approach problems in different ways and reach a variety of solutions, which increases the likelihood that one of them will be financially successful.

How the Data Gap Can Be Closed

Several major steps should be taken to remedy women’s low participation in data science. Policymakers should first ensure that women and girls obtain the basic skills required to engage with digital technology, such as reading, writing, and arithmetic.

By obtaining the required digital skills, women can also get some of the high-paying jobs offered by the digital economy, including jobs as data analysts and scientists, AI specialists, and software and machine learning developers. Women’s access to such skills, especially in low- and middle-income countries, is a response to the loss of traditional jobs to automation, as workers in those jobs are increasingly replaced by machines and algorithms. For example, 42% of jobs in Ghana are threatened by automation, compared to 6% in South Korea.

Many initiatives have worked to provide women with the digital skills required to get these jobs and to increase their participation in data science and AI, including:

  • Women in Data Science (WiDS): An initiative by Stanford University that has reached more than 100,000 women around the world who work in or are interested in data science. The initiative hosts a “Datathon” in which participants improve their data-related digital skills. It also offers a podcast series featuring pioneers of data science and encourages female high school students to pursue careers in data science, AI, and related fields after graduation.
  • In Bangladesh, Robi Axiata supports a governmental initiative that trains women and girls in rural areas in basic digital skills by providing access to the Internet, laptops, and specialized programs. This initiative has reached more than 63,000 women and girls and aims to reach 166,000 others by the end of 2022.
  • Technology Enabled Girl Ambassadors: A mobile phone research application that trains teenage girls in Nigeria, Malawi, Tanzania, Rwanda, India, and Bangladesh to conduct interviews and collect data from people in their neighborhoods.
  • Tanzania Data Lab has formed a partnership with the University of Dar es Salaam in Tanzania to launch the first master’s program in data science in eastern Africa. The first three cohorts of participating data scientists have included female experts in blockchain and AI.
  • Data Collaboratives for Local Impact (DCLI), in collaboration with Tanzania Data Lab, Vodacom, and UNICEF, has contributed to increasing women and girls’ data literacy by organizing programming and data visualization camps in Tanzania.
  • In Côte d’Ivoire, DCLI has financed the Des Chiffres et des Jeunes project, which trains young men and women in data science through a fellowship launched to empower women and close the data gap between the sexes.