3.3 Identifying Sex/Gender Bias in Research

The field of psychology has produced many studies with relevance to people’s lives, and results found using the scientific method have tremendous economic, political, and social influence (Hare-Mustin & Marecek, 1990). As you may recall from Chapter 1, Mamie and Kenneth Clark’s research was one factor that contributed to school segregation being deemed unconstitutional. Many people take research results that they read about in the media at face value, assuming that if something is published it must be “true.” However, much research, even when it’s published in peer-reviewed journals or summarized in popular magazines, contains hidden bias. This section will give you the tools to be a more critical consumer of that research and be better able to identify sources of potential bias.

Being a Critical Consumer of Research

Research is a valuable source of information, but no research is perfect. To be a critical consumer of research, you need to question how it was done and what else could be a factor in explaining the results.

Video courtesy of Roxanne Felig

Who Is the Researcher?

How can the identity of the researcher contribute to bias?

Many factors influence the research process, including who actually does the research. The phrase “the myth of the impartial researcher” refers to a common misperception—drawn from positivism—that people who conduct research are value-neutral and thoroughly objective. As discussed earlier, psychological theories start from personal observations or previously established theories (i.e., concepts based on previous researchers’ work, perhaps based on those researchers’ own observations). Therefore, if science is primarily conducted by certain types of people, it results in a narrow range of possible theories. Most senior authors of peer-reviewed publications are white, upper-class, cisgender men who are professors at top research universities (Cundiff, 2012; Eagly & Riger, 2014). Although there may be value in developing theories that stem from the life experiences of white, upper-class, cisgender men, it becomes a problem when that is the predominant worldview represented in psychological research.

In 1988, several feminist psychologists published “Guidelines for Avoiding Sexism in Psychological Research” (Denmark et al., 1988). They noted many examples of the predominance of white male researchers causing a bias in what is studied and how questions are asked. For example, they noted that leadership had traditionally been understood as dominance and that topics relating to white men (e.g., links between media and aggression) were seen as more important than topics relating to other groups of people (e.g., pregnancy, menopause). In 2015, the American Psychological Association published “Guidelines for Psychological Practice with Transgender and Gender Nonconforming People” and noted that because psychological research with trans people has been misused and misinterpreted, researchers should consider collaborative research models in which trans participants are involved in determining the types of questions asked, the methods used to answer those questions, and the interpretation of data (American Psychological Association, 2015).

What Is the Research Question?

How do research questions themselves contribute to bias?

Who the researcher is will influence not only the formation of theories but also the specific research questions asked. In discussing research about sex/gender differences, for example, psychologist Rhoda Unger (1979) argued that the very decision to study differences instead of similarities between women and men reinforced the status quo rather than allowing the discovery of anything new. It’s intriguing to consider how different the entire field might be if we changed the question to ask “How are women and men similar?” or if we didn’t assume a sex/gender binary in the first place.

Research questions may also be based on hidden assumptions. Let’s imagine, for example, a group of researchers exploring the effects of daycare on child development, with a specific interest in maternal working status. Their research question might be “Should mothers be employed out of the home early in their child’s life?” This question doesn’t include fathers or parents who don’t identify with labels such as “mother”/“father.” Hidden biases in this research question include, but aren’t limited to, the following assumptions:

that people identify within a sex/gender binary,
that children all have mothers,
that mothers are the ideal caretakers for children,
that we should question mothers’ employment status—but not that of other caretakers—during a child’s early years,
and that working or not working while raising children is a matter of choice rather than of necessity.

How else could the research question be phrased if some of these assumptions weren’t being made (Figure 3.3)? To give one example, what if the researchers asked “Should fathers be employed out of the home early in their child’s life?” How might the research findings differ with this framing?

A photo shows a male preschool teacher in a classroom interacting with a child at a kid’s table with other children writing. — Figure 3.3 Avoiding Bias in Research

Instead of asking only about the role of mothers, researchers who are interested in early childhood development might also ask questions about experiences in preschool, enrollment in enrichment programs, and the effects of interacting with adults outside children’s families.

Who Are the Research Participants?

How do the social identities of research participants contribute to bias?

Another area of potential bias is choice of participants. Psychological research has a long history of using male participants and then generalizing results to all humans (Eagly & Riger, 2014). For example, Carol Gilligan criticized fellow psychologist Lawrence Kohlberg because his theory of moral development was entirely based on interviews with wealthy white boys and men but attempted to explain human development (Gilligan, 1982).

Psychological researchers continue to be guilty of this tendency, which undermines generalizability, or the ability to use findings from a given study to explain phenomena that occur in the general population. Psychological samples tend to be WEIRD (Western, Educated, Industrialized, Rich, Democratic; Cheon et al., 2020; Matsick, Kruk, Oswald, & Palmer, 2021), which limits the generalizability of results. For one thing, the use of WEIRD samples can influence the very questions that are asked and deemed important, as discussed in the previous section. And notably, when WEIRD samples are used, it’s much less likely that the nature of the sample is reported in the title, implying that these samples represent and generalize to all people (Cheon et al., 2020). The titles of studies also rarely indicate when participants are exclusively men or exclusively white (Cundiff, 2012). Yet when participants deviate from the white male norm, researchers not only provide a rationale for their decision but also often signal it in the title of their article (Cundiff, 2012).

Recently, the number of female research participants has substantially increased—probably due to their overrepresentation in psychology undergraduate participant pools, which are a source of convenience samples for many psychologists (Cundiff, 2012). As a result, most of what we know about human behavior is now based on a very specific type of participant. She’s likely an undergraduate student at a large university who is enrolled in an Introduction to Psychology course (Figure 3.4). Such sampling bias overlooks the experiences of all other people. If you think back to other psychology courses you’ve taken, it might be interesting to recall how often your textbooks indicated who was included in the samples used in the studies described. If you knew that a study sample was based only on undergraduate women, would that change your perception of the results?

A photo shows a college classroom full of students, a professor giving a lecture, and three screens with lecture slides. — Figure 3.4 Convenience Samples

Students in college classes frequently participate in psychological research. What is learned from these participants? What is not explored when we rely on these types of convenience samples?

Sampling bias is also a challenge within the field of psychology of women and gender. In its early stages, feminist research didn’t fully address the diversity among women. Instead, it focused primarily on the experiences of white, well-educated, middle-class, heterosexual, cisgender women who were disproportionately members of Western, industrialized, rich, democratic societies (Henrich et al., 2010). In a field that was supposedly trying to undo bias, feminist researchers’ sole focus on such privileged women actually replicated power hierarchies. Moreover, contemporary psychological research has been criticized for conceptualizing people as unidimensional, such as when a study focuses on one dimension of identity (e.g., gender) but doesn’t acknowledge other aspects of participants’ identity (e.g., age, social class, race; McCormick-Huhn et al., 2019).

Bias also exists when women with socially marginalized identities are clumped into one social identity category—as when the term LGBTQ+ (lesbian, gay, bisexual, transgender, and queer) places all sexual minority individuals into one category, ignoring differences among them. For example, there’s tremendous diversity among bisexual women, but this will be overlooked if researchers don’t study bisexuality separately from other sexual orientations. Similarly, if a study groups the experiences of all women of color together in contrast to the experiences of white women, differences among Black women, Latine women, Indigenous women, and others will be missed. This type of failure to consider distinct groups on their own terms reinforces the idea that some people are the norm against which all others should be compared and creates a bias within research.

How Are the Variables Measured?

How does the operationalization of variables contribute to bias?

Another potential source of bias involves the way variables are defined and measured. For example, in the past, researchers measured aggression in mostly overt ways (e.g., physical and verbal assaults; Brown, 2005). Because they used this type of measure, their data showed that boys were more aggressive than girls. However, when researchers began to measure relational aggression (e.g., gossiping and socially isolating other people), a different picture emerged: Girls were aggressive too (Crick & Grotpeter, 1995). The next time you read a study, ask yourself how the different variables are being measured. Then ask whether that is the best, or most inclusive, way of measuring the variable. When we review the research on sex/gender similarities and differences below, you’ll see that the way variables are measured can profoundly affect the results.
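
The effect of operationalization on a group comparison can be sketched in a few lines of code. The scores below are made-up numbers chosen purely for illustration; they are not data from Crick and Grotpeter (1995) or any other study, and the names `children` and `group_means` are hypothetical. The sketch only shows that the same participants can look different or similar depending on which behaviors count as "aggression."

```python
from statistics import mean

# Hypothetical, made-up scores for illustration only (not real data).
# Each child has an overt-aggression score and a relational-aggression score.
children = [
    {"group": "boys", "overt": 6, "relational": 2},
    {"group": "boys", "overt": 5, "relational": 3},
    {"group": "boys", "overt": 7, "relational": 2},
    {"group": "girls", "overt": 2, "relational": 6},
    {"group": "girls", "overt": 3, "relational": 5},
    {"group": "girls", "overt": 2, "relational": 7},
]

def group_means(records, score_fn):
    """Average an aggression score per group, given a scoring rule."""
    by_group = {}
    for record in records:
        by_group.setdefault(record["group"], []).append(score_fn(record))
    return {group: mean(scores) for group, scores in by_group.items()}

# Operationalization A: "aggression" counts overt acts only.
overt_only = group_means(children, lambda r: r["overt"])

# Operationalization B: "aggression" counts overt plus relational acts.
combined = group_means(children, lambda r: r["overt"] + r["relational"])

print("overt only:", overt_only)
print("overt + relational:", combined)
```

With these invented numbers, the overt-only measure shows a gap between the groups, while the combined measure does not. Real data would rarely be this tidy, but the point stands: the definition of the variable is part of the finding.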

When psychological measures are developed, they’re usually tested on a sample to see if they are valid, which means that they’re measuring what the researchers want them to measure. However, because this testing is often done with WEIRD samples, many researchers limit participant selection to populations in which measures have been validated (Mena et al., 2019). Unfortunately, this means that many studies continue to center the experiences of people whose perspectives have already been deemed important to measure.

Although measures can be used on samples that differ from those used in their development, doing so presents problems and isn’t considered best practice. When measures that were developed with one group are used to assess different groups, it’s not always clear that the items or the entire measure have the same meaning for the different groups. This is considered a problem of construct equivalence: Is the idea being measured or understood in the same way among different groups? For example, social and political realities influence the meaning and expression of depression symptoms among Latine women, making it more likely they will report somatic symptoms (e.g., headaches, stomachaches) rather than explicit feelings of sadness (Santiago-Rivera et al., 2015). Psychological assessment tools that have been developed using white women will likely not reflect these cultural nuances and therefore will inaccurately record depressive symptoms among Latine women (Butcher et al., 2007).

Further, measures are problematic when they don’t take into account the impact of multiple aspects of identity (McCormick-Huhn et al., 2019). For example, there are measures to assess gender-based microaggression and measures to assess race-based microaggression. However, research didn’t adequately explore the lived experience of Black women until a new measure was developed that specifically measured microaggressions that happen to Black women (Lewis & Neville, 2015). By considering the ways in which race and gender work together to shape Black women’s experiences, researchers avoided intersectional invisibility, or the tendency to ignore a person’s experience because it doesn’t align with expectations connected to a single aspect of social identity (Sesko & Biernat, 2016).

Even asking demographic questions can contribute to bias. Historically, when researchers relied on surveys, they often included demographic questions about sex/gender with only two response options: female and male (American Psychological Association, 2015). This approach not only alienated trans and/or gender nonbinary participants but also made it impossible to develop important scientific knowledge about people who identified outside of a sex/gender binary (Institute of Medicine, 2011). Demographic questions also don’t typically take into account how participants’ identities, or the way they report them, may vary across time and situation (McCormick-Huhn et al., 2019). For example, in a study with trans youth of color, participants reported that they strategically changed their self-definitions of gender and race/ethnicity in different contexts to combat trans prejudice (Singh, 2013).

Since bias can occur in questions connected to identity, researchers should carefully consider what type of demographic questions they ask and why. Even when demographic questions are justified and inclusive, their placement in a study can contribute to bias. Therefore, one recommended practice is to place demographic questions at the end of a survey or ask them after the experimental manipulation and subsequent measurement take place so that identity isn’t cued in a way that may change how participants respond to other parts of the study (e.g., Fernandez et al., 2016). In Chapter 2, we talked about stereotype threat and how reminders of different aspects of one’s social identity can change performance on tasks such as test-taking. While it is often important to know who the participants in a given study were so readers can determine the extent to which those findings are generalizable, asking about identity at the start of a study can change the data collected in that study, and therefore the conclusions that researchers draw.
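
For researchers who build surveys programmatically, the placement recommendation above can be expressed as a simple ordering rule. This is a minimal sketch, not a feature of any particular survey platform: the block names, the `kind` labels, and the function `demographics_last` are all hypothetical.

```python
# Hypothetical survey blocks; the names and "kind" labels are illustrative only.
survey_blocks = [
    {"name": "demographics", "kind": "demographics"},
    {"name": "manipulation", "kind": "manipulation"},
    {"name": "outcome_measures", "kind": "measure"},
]

def demographics_last(blocks):
    """Keep the original order of blocks, but move demographic blocks to the end."""
    other = [b for b in blocks if b["kind"] != "demographics"]
    demographic = [b for b in blocks if b["kind"] == "demographics"]
    return other + demographic

ordered = demographics_last(survey_blocks)
print([b["name"] for b in ordered])
```

The rule leaves the manipulation and the outcome measures in their original sequence, so identity questions are asked only after the responses of interest have been recorded.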

What Kind of Data Are Collected, and How Are the Data Analyzed?

Why do some psychologists choose qualitative research methods, and how can the method used to analyze data contribute to bias?

Most psychologists, including many feminist psychologists, use quantitative methods and apply statistics to analyze their data. The use of statistics can also perpetuate bias because many statistical tests are designed to look for differences within a sample. Remember that a significant p-value when comparing groups means that there’s a significant difference, not a significant similarity. In fact, there are no tools to measure similarity (Nelson, 2015). When a study looking for sex/gender differences doesn’t find any, the findings are considered “non-significant” (i.e., no significant differences are found) and generally remain unpublished, leading to the file drawer problem (Rosenthal, 1979). In other words, studies that don’t find differences are often filed away and not discussed or distributed. For this reason, the field is biased toward finding and explaining difference.
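
The file drawer problem can be illustrated with a small simulation. The sketch below rests on several assumptions that are not from the text: two groups that truly do not differ, 30 hypothetical participants per group, an approximate cutoff of 2.0 on the t statistic standing in for p < .05, and a publication rule under which only "significant" studies are shared. All names and numbers are illustrative.

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the sketch is repeatable

N_STUDIES = 2000    # hypothetical number of studies that are run
N_PER_GROUP = 30    # hypothetical participants per group
T_CUTOFF = 2.0      # approximate two-sided cutoff for p < .05 at this sample size

def run_study():
    """Simulate one study comparing two groups that do NOT truly differ."""
    group_a = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    group_b = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent means.
    se = (variance(group_a) / N_PER_GROUP + variance(group_b) / N_PER_GROUP) ** 0.5
    return diff, diff / se  # observed difference and its t statistic

studies = [run_study() for _ in range(N_STUDIES)]

# Assumed publication rule: only "significant" studies leave the file drawer.
published = [diff for diff, t in studies if abs(t) > T_CUTOFF]

all_gap = mean(abs(diff) for diff, _ in studies)
published_gap = mean(abs(diff) for diff in published)

print(f"published: {len(published) / N_STUDIES:.1%} of studies")
print(f"average gap, all studies: {all_gap:.2f}")
print(f"average gap, published only: {published_gap:.2f}")
```

Because the groups are drawn from the same distribution, every "significant" result here is a chance finding. A reader who sees only the published studies would nonetheless encounter a literature made up entirely of group differences, and the differences reported would be larger on average than those across all the studies that were run.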

Because of these biases inherent in quantitative research, some feminist scholars have advocated for the use of another methodology. Qualitative methods produce descriptive data, with little attention to statistics. Qualitative researchers often rely on interviews, diaries, observations, and archival data. In these cases, data analysis involves identifying themes or patterns in participants’ responses in order to understand how participants interpret various aspects of their lives. Since themes emerge from the data, researchers are, theoretically, less likely to control or manipulate variables and are more flexible in making design changes. This methodology is not without critics, however, because the researchers identifying patterns in the data are likely influenced by their preexisting expectations in the same way that all researchers bring personal expectations to their work. Qualitative researchers, however, tend to be more open about their involvement with the research and the way that involvement can influence their findings—a topic we’ll return to later in this chapter.

In qualitative research, participants and researchers can develop close relationships—a very different dynamic from the detached relationships usually associated with quantitative methods. Feminist researchers are critical of the rigid separation of researchers from study participants, suggesting that it reflects power and control (Rutherford & Granek, 2010). In more traditional labs, researchers are considered experts who manipulate situations in order to study outcomes. This hierarchy may be problematic for collecting accurate data, and qualitative methods offer a valuable alternative. One method reflecting this collaboration of the researcher and the participants is participatory action research (PAR) (Figure 3.5). In PAR, participants are involved in the decision process during every stage of the research (Yost & Chmielewski, 2013). For example, participants might use cameras to document their day-to-day experiences and then collaborate with the researcher about how to use this information to develop research questions and to design a study. Researchers who conduct field research with American Indian and Alaska Native participants often utilize a similar approach and invite tribal members to contribute to research questions, methods, data management, and sharing of findings (Trimble & Morse, 2019). Such participation shapes the research so that communities’ unique histories, values, and worldviews are reflected in all aspects of the process.

A photo shows five adults seated in chairs arranged in a circle, interacting and smiling. — Figure 3.5 Participatory Action Research

When researchers involve participants in the design and execution of research rather than using them solely as sources of data, findings can be richer and interpretations more nuanced.

How Are the Results Interpreted?

In what ways can the interpretation of results be biased?

As noted above, the quantitative measures used in the sciences are designed to detect and report differences. When differences are found, they’re often seen as representing the inherent characteristics of the groups being studied—that is, as describing who people are (e.g., women are like this; men are like that) rather than what social identity distinctions do (Collins, 2015). Remember that aspects of social identity, such as sex/gender, race, class, and so forth, can’t be randomly assigned. Therefore, differences can’t be interpreted as resulting from those identity variables; at most, we can conclude that they are related (i.e., showing correlation rather than causation). Studies generally don’t explore what else is different between these groups of people in their social environments, such as inequities in resources and access to power, which may contribute to the differences that are identified (Settles et al., 2020).
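
The interpretation problem described above can be made concrete with a constructed example. In this sketch, the outcome is built to depend only on access to resources, never on group membership, and one group is assumed to have more access than the other. The groups "A" and "B", the scores, and the proportions are all invented for illustration.

```python
from statistics import mean

def make_people(group, n_high, n_low):
    """Build hypothetical records; the outcome depends ONLY on resource access."""
    people = []
    for has_resources, count in ((True, n_high), (False, n_low)):
        for _ in range(count):
            outcome = 50 + (10 if has_resources else 0)  # group plays no role here
            people.append({"group": group, "resources": has_resources, "outcome": outcome})
    return people

# Assumed inequity: group A has more access to resources than group B.
people = make_people("A", n_high=80, n_low=20) + make_people("B", n_high=20, n_low=80)

def avg_outcome(group, resources=None):
    """Average outcome for a group, optionally within one level of resource access."""
    rows = [p["outcome"] for p in people
            if p["group"] == group and (resources is None or p["resources"] == resources)]
    return mean(rows)

raw_gap = avg_outcome("A") - avg_outcome("B")
gap_with_resources = avg_outcome("A", True) - avg_outcome("B", True)
gap_without_resources = avg_outcome("A", False) - avg_outcome("B", False)

print("raw gap between groups:", raw_gap)
print("gap among people with resources:", gap_with_resources)
print("gap among people without resources:", gap_without_resources)
```

A simple comparison of the two groups shows a gap, yet the gap disappears once people with the same access to resources are compared. A study that measured only group membership would have no way to tell these two stories apart, which is why a group difference alone can't be read as a description of who people are.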

This problem arises even when taking a more intersectional approach and looking at several aspects of identity at a time. Perhaps a researcher wants to explore the interaction of sex/gender and race. One would expect this study to be undertaken using an intersectional lens—looking at two kinds of social identities at the same time. However, if the results are interpreted as being caused by the identity categories (e.g., white women are like this; Black women are like that), then we come up against the same interpretation problem of not accounting for power and social structures. This oversight has important implications for how people might understand study results. For example, in one study, researchers found that participants, particularly those who had been exposed to messages about how women should empower themselves at work, were more likely to attribute workplace inequity to problems or barriers within women rather than to inequities within the work environment (Kim et al., 2018).

Importantly, a study that includes participants with diverse social identities isn’t inherently intersectional. Knowing who is in a sample and whether that sample reflects the population it’s drawn from is important, but for the work to be truly intersectional, the researchers must discuss how participant identities interconnect with social power, access to resources, and so forth. These perspectives must be present in the research design for it to truly be a tool for social justice (Grzanka & Cole, 2022).

Where Are the Results Published?

How do the outlets where research is published contribute to bias?

Many people who publish research on sex/gender do so in one of the scientific journals specializing in that topic. For example, in a review of the literature in 2012, researchers found that 89% of articles published about sex and gender appeared exclusively in Psychology of Women Quarterly or in Sex Roles (Eagly et al., 2012). This means that broader, nonspecialist journals, as well as those focusing on other aspects of psychology, are less likely than these specialized publications to include feminist perspectives.

Further, according to feminist psychologist Stephanie Riger (1992), feminist research shouldn’t simply be used for the production of knowledge; it should also promote social action and social justice. If research is published only in academic journals, it won’t reach most of the population that would benefit from it. Feminist researchers prefer to “give research away.” Examples include sharing results with the general public through social media and presenting results in formal conversations with policy makers and legislative bodies, who often overlook the psychological needs of individuals with socially marginalized identities. These efforts are in distinct contrast to the role of the stereotypical detached and objective researcher. Indeed, feminist researchers become advocates as they share the results of their research and continue to ask questions in hopes of finding answers that can increase equity.

Another way to give research away is to talk about research results in the popular press. Unfortunately, as discussed earlier in this chapter, the popular press tends to oversimplify or misinterpret the findings of psychological research, particularly studies that confirm popularly held stereotypes (O’Connor & Joffe, 2014). When research is shared this way, most of the coverage exaggerates differences between groups of people (Fine, 2010; O’Connor & Joffe, 2014). Popular press stories rarely explain how the variables were measured, how the results were analyzed, or other factors that could help readers determine whether research bias was present. Readers may want to track down the original article, but it can be hard to do so.

A circular diagram describes open science. — Figure 3.6 Open Science

How could free and easily accessible research articles change the way the general public understands and interacts with science and scientific research?

Recently, feminist scholars have emphasized the importance of providing free and open access to research to help ensure equal access to knowledge (Siegel, Calogero, et al., 2021). The term open science refers to a range of practices that increase the accessibility, transparency, and reproducibility of data-driven science (Figure 3.6). Examples include posting preprints of research before publication and sharing data without participant names or identifying information to open-source repositories (Siegel, Calogero, et al., 2021). Many feminist researchers believe that the open-science movement enhances feminist psychology by promoting a more critical, inclusive, and transparent psychology (Matsick, Kruk, Oswald, & Palmer, 2021).

EMPOWERING OR OPPRESSING?

Quantitative and Qualitative Research

Historically, there has been a divide among social scientists regarding the use of quantitative methods versus qualitative methods. Feminist researchers have criticized users of quantitative methods for their presumptions of objectivity and their focus on pursuing universal “truths” that can be generalized to all people across contexts (Matsick, Kruk, Oswald, & Palmer, 2021). Instead, some feminist researchers advocate for qualitative methods that prioritize the subjective experiences of participants (Gervais et al., 2021). However, other feminist psychologists warn that a total rejection of quantitative methods would further marginalize feminist research and reinforce a dichotomy between quantitative and qualitative research methods (Westmarland, 2001). They believe that the method itself is not as critical as how the results are applied to improve the lives of women and girls (Tracy & Sorsoli, 2004).

Despite the appeal of qualitative methods among feminist researchers, most undergraduate psychology curriculums retain a strong focus on quantitative research methods, and students who are taught quantitative methods experience difficulty in later learning qualitative methods (Roberts & Castell, 2016). Further, qualitative articles are far less likely than quantitative ones to appear in psychology journals. In a review of all psychology journals listed with the academic database PsycINFO from 1960 to 2009, only 8.7% of articles were qualitative (Eagly et al., 2012). Interestingly, most were written and published outside the United States (e.g., in Canada, Australia, the United Kingdom), and qualitative methods were more likely to be used in research focusing on sex/gender (Eagly et al., 2012).

Because qualitative methods are less likely to be taught and studies using them are less likely to appear in mainstream psychology journals, their value and influence may not be substantial—particularly in political domains, where statistics can inspire and support the adoption of legislation and social policies (Westmarland, 2001). However, some researchers believe that if psychology departments more readily taught qualitative methods and journals more frequently published research using these methods, legislators and the general public might better understand the value of this approach. Further, it might reduce restrictive assumptions about what counts as science and, ultimately, result in better science.

What are your thoughts on this debate? Do the methods used to research topics contribute to empowering and/or oppressing people? Why or why not? Thinking specifically about qualitative methods: Have you taken a course that featured or taught qualitative methods? How often do you read (and understand) qualitative research? How might we encourage more mixed-methods approaches that include both qualitative and quantitative methods? Who would most benefit from research such as this, and why?

generalizability: The ability to use findings from a study to explain things that occur in the general population.
sampling bias: A bias that occurs when participants in a study do not adequately represent the population of interest.
qualitative methods: An approach to research that produces descriptive data with little attention to statistics.
open science: A range of practices that increase accessibility, transparency, and reproducibility of data-driven science.