Why representation and beauty culture matter

California state Sen. Holly Mitchell’s golden dreadlocks spiraled onto her shoulders in loose helixes as she stood in the Senate chamber and asked her colleagues to educate themselves about Black people’s hair.

It was April 22, 2019, and the Los Angeles Democrat was introducing legislation to extend legal protection to hair as essential to a person’s racial identity.

“We’re talking about hairstyles like mine, quite frankly, which would, without question, fit in an image of professionalism if bias and stereotypes were not involved,” said Mitchell, now a Los Angeles County supervisor.

Former California state Sen. Holly Mitchell authored the CROWN Act, making California the first state to ban discrimination against natural hair and protective styles such as locs, braids and twists in workplaces and public schools. Damian Dovarganes / Associated Press

SB 188, the Creating a Respectful and Open World for Natural Hair (CROWN) Act, passed soon thereafter, making California the first state to include hair in anti-discrimination law, specifically natural hair and protective hairstyles such as dreadlocks and braids.

Since then, incidents of hair discrimination — people being denied employment, being told to forfeit a wrestling match, being refused entrance to their own graduation — have gone viral and played a major role in the national examination of race and the call to make inclusion the standard in practice, rather than a platitude.

We will begin by examining the media’s role in perpetuating biases that inform people’s treatment and perceptions of themselves and one another. The Chronicle doesn’t stand outside this conversation. In its 150 years of existence, it has published depictions of hair that were driven by negative racial stereotypes.

The Chronicle analyzed about 10,000 images, spanning 20 years, from Vogue, one of the oldest U.S. fashion magazines. While the publication doesn’t aim to reflect most people’s lived experience, it attempts to depict the pinnacle of beauty. Like many forms of media, it tells people what they should want to look like.

The results were stark: Pixie cuts, ponytails and long, straight hair are represented far more frequently than wider, taller styles like afros.

Which hairstyles are most (and least) represented in the media?

It’s a simple question, but a challenging one to answer.

There are many ways we could have quantified how hair is represented in the media, from texture to style. But to analyze data from thousands of images, we focused on hair shape, which shows whether volume normally seen in textured hair is underrepresented across images.

We analyzed photos from Vogue (which has an archive dating back to 1892) largely because it allowed for an examination of modern depictions of hair going back to 2000. To detect the faces in each image and crop around them, we used automated face detection, described in the Methodology section below.

We used machine learning — a process often referred to as “artificial intelligence” — to determine which portion of an image was made up of hair. Without machine learning, converting more than 10,000 images to a hair shape would not have been feasible. The process of training machine learning “models” is not perfect; textured or printed backgrounds yielded poor results, for instance, as did low lighting and resolution.

This single image can tell you a lot about representation in media. It is the “average” of all the Vogue images we analyzed — the whiter the pixel, the more often images had hair in that spot. What we see is that many pictures have hair along the crown, and those with more hair tend to show long styles rather than wide or voluminous shapes.
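For the technically curious, here is a minimal sketch of how such a composite can be built (not our production code), assuming the per-image hair masks have been saved as same-size black-and-white PNG files; the directory and file names are placeholders:

```python
import numpy as np
from PIL import Image
from pathlib import Path

def average_hair_mask(mask_dir: str, size=(256, 256)) -> Image.Image:
    """Average many black-and-white hair masks into one grayscale image.

    Pixels that are hair in more of the masks come out brighter,
    mirroring the composite described above. Assumes every file in
    mask_dir is a mask where white = hair (the path is a placeholder).
    """
    paths = sorted(Path(mask_dir).glob("*.png"))
    total = np.zeros((size[1], size[0]), dtype=np.float64)  # (rows, cols)
    for p in paths:
        mask = Image.open(p).convert("L").resize(size)
        total += np.asarray(mask, dtype=np.float64) / 255.0
    mean = total / len(paths)  # per-pixel share of masks containing hair
    return Image.fromarray((mean * 255).astype(np.uint8))

# Usage (hypothetical directory):
# average_hair_mask("masks/").save("average_hair.png")
```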

Our analysis shows that, in the majority of images, hair takes up little of the frame. In other words, Vogue depicts very few images of big hair. Most of the images with little hair in the frame showed pixie cuts or long hair pulled back into a ponytail.

Chart: the share of each image taken up by hair. More than a third of the images fall within the first three bars; 29 images were at least 40 percent hair, and their bars are barely visible in the chart.

In addition to measuring how much hair an image contained, we looked at where in the image the hair was concentrated. Was the average location of hair near the top? The sides? This could tell us more about which types of hair were represented in Vogue’s photos.

We found that, when there is more hair in an image, it tends to sit toward the bottom of the frame. So when an image shows more hair than a pixie cut, that hair is more likely to be long rather than wide or vertical, the shapes we would expect for voluminous and textured hair.

While race (or any rough approximation of race) was not included in the analysis, our findings suggest natural Black hairstyles were far less represented than other hairstyles.

This analysis is just one step toward gaining a better understanding of diversity in media. We invite you to read more about our methodology below.

All studies of human experience should be done in inches, not miles, because there will always be limitations to the data and resources. Our analysis, for instance, was limited to hair size and did not include styles or textures. We consider the caveats for this analysis to be invitations for further exploration.

In future chapters, we will cover hair: the joys, hardships and everything in between. We’ll hear from entrepreneurs, educators, policymakers and people who have something to say about their hair. In particular, we will be asking folks what it will take to make our world one where no person is made to feel their hair is out of place.

Methodology

We started with a basic question: Which hairstyles are most (and least) represented in the media? It’s a simple quantitative question, but one rife with barriers. What dataset could answer that question? Was the data accessible and representative? What should we try to measure? Ultimately, we were concerned with what was doable and what we could exclude from the dataset without introducing bias to the analysis. We did not incorporate any analysis of race or gender represented in the images, but both warrant further research.

The data

For our set of images, we limited ourselves to the Vogue Archive from 2000 to the most recent content (which, at the time of analysis, was the April 2021 issue, featuring Selena Gomez in a fabulous off-the-shoulder floral dress lined with black fur). Vogue, of course, does not represent all media. But it is one of the most prolific fashion and beauty publications in the world. And it has an archive that most folks can access with an internet connection and a library card. We manually downloaded all content the Vogue Archive database indicated contained a photograph.

There are limitations here: For one, we aren’t capturing a lot of crucial content for understanding representation, such as advertisements. We were limited to covers, fashion shoots and articles. Second, because of constraints of ProQuest, the website we downloaded the data from, we only pulled the first page or double-page spread for each listed article. For a six-page article, for instance, we would, at most, grab images from the first third of the content.

Once we had all the data downloaded as PDFs, we used face detection (not the same thing as facial recognition!) to find faces on each page, crop around them and write the resulting images to PNG files.
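As a sketch of this step (not our production code), here is one way it could look using OpenCV’s bundled Haar-cascade face detector, assuming each PDF page has already been rendered to a PNG file; the padding factor and naming scheme are illustrative assumptions:

```python
import cv2
from pathlib import Path

# OpenCV's stock Haar-cascade face detector. Detection, not recognition:
# it finds where faces are, not whose they are.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def crop_faces(page_png: str, out_dir: str, pad: float = 1.5) -> None:
    """Detect faces on a rendered magazine page and save padded crops.

    The padding expands the detected box so the crop captures hair,
    not just the face; the factor here is our own guess.
    """
    img = cv2.imread(page_png)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for i, (x, y, w, h) in enumerate(faces):
        cx, cy = x + w // 2, y + h // 2
        half = int(max(w, h) * pad)
        x0, y0 = max(cx - half, 0), max(cy - half, 0)
        crop = img[y0 : cy + half, x0 : cx + half]
        out = Path(out_dir) / f"{Path(page_png).stem}_face{i}.png"
        cv2.imwrite(str(out), crop)
```

Haar cascades are fast but fairly crude; a modern detector would likely miss fewer faces, though the cropping logic would stay the same.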

The text preceding the full magazine pages and covers often contained the date of publication, but not always. We had hoped to do a more in-depth analysis of how representation changed over time, but weren’t able to because much of the content was listed without publication dates.

We excluded images that, once cropped around a detected face, were not high enough resolution for our machine learning model to accept. We also manually discarded any images where a face was improperly detected. We did not discard the many obvious duplicates: while they often reflected cross-listed content that shared an article page, images also sometimes appeared more than once because they had been republished. In the end, we had more than 11,000 images to analyze.

The model

In addition to compiling the dataset, we also had to train the machine learning model to identify hair within an image — a process known as “segmentation.” More specifically, the model accepted an image and returned a grayscale representation of the likelihood that a given pixel was hair — in the model output images below (often referred to as “labels” or “masks” in machine learning), the lighter the pixel, the more certain the model was that the pixel in the corresponding coordinate of the image was hair, and vice versa.

The next section is about to get even more technical, so strap in. If you have not engaged with machine learning before, some of the terms below may be unfamiliar to you.

For our analysis, we took inspiration from machine learning expert Elle O’Brien and her work in The Pudding’s The Big Data of Big Hair. Like O’Brien, we started with a U-NET model.
We trained the model on the Figaro1K dataset, which contains 1,050 images that researchers at the University of Brescia in Italy manually “labeled” (meaning they went through each image and created masks much like the black-and-white ones described above). Once the model had its initial training, we processed our own data and hand-selected the best output to retrain the original U-NET model, along with some images we labeled ourselves.
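For readers who want a concrete picture of the setup, here is a hedged sketch of what such a training loop can look like in PyTorch, using the off-the-shelf segmentation_models_pytorch package to stand in for a hand-built U-NET; the encoder, learning rate and epoch count are illustrative assumptions, not the values we used:

```python
import torch
import segmentation_models_pytorch as smp
from torch.utils.data import DataLoader

# A U-Net with one output channel: per-pixel "hair" logits.
# The encoder and hyperparameters below are assumptions for
# illustration, not values from our actual pipeline.
model = smp.Unet(encoder_name="resnet34", in_channels=3, classes=1)
loss_fn = torch.nn.BCEWithLogitsLoss()  # binary hair / not-hair per pixel
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train(loader: DataLoader, epochs: int = 20) -> None:
    """One round of training on (image, mask) batches.

    Run it first on Figaro1K, then again on the hand-selected and
    hand-labeled masks, as the retraining step above describes.
    Expects images of shape (B, 3, H, W) and masks of shape
    (B, 1, H, W) with values in {0, 1}.
    """
    model.train()
    for _ in range(epochs):
        for images, masks in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), masks.float())
            loss.backward()
            optimizer.step()
```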

After some flailing and futzing up a steep learning curve, we trained the final model used for analysis in this piece on 1,501 images. The model isn’t perfect: It struggles with low lighting and resolution, as well as with textures and some headwear.

Once the final model processed the dataset, we converted the grayscale images into binary labels using a process called “thresholding” — basically, if the model was at least 50% certain the pixel was hair, we cast the pixel to white; anything less than that, and we cast it to black. We then combed back through the masks, comparing them with the images and discarding any data where the hair identified was clearly from another person in the image or if the model was largely off-base (i.e. either many of the white pixels clearly weren’t hair in the image or many of the black pixels were obviously hair in the image). This process was a judgment call that may have introduced errors.
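The thresholding step itself is simple; a sketch, assuming the model’s grayscale output has been scaled to values between 0 and 1:

```python
import numpy as np

def threshold_mask(probs: np.ndarray, cutoff: float = 0.5) -> np.ndarray:
    """Cast a grayscale model output (values scaled to [0, 1]) to a
    binary mask: pixels the model is at least 50% certain are hair
    become white (1); everything else becomes black (0)."""
    return (probs >= cutoff).astype(np.uint8)
```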

The analysis

We then measured the data we’d extracted from the images. First, we calculated the percentage of each image that was made up of hair — a simple way to measure how often big hair was represented.

We also wanted to better understand the distribution of those hair pixels within the image, so we calculated the centroid (the point representing the mean x-value and mean y-value) of the hair shapes. We then drew a bounding box around the hair, allowing at least 98 pixels (1% of the average number of hair pixels per image) to fall outside its lines. This kept the box from treating a few stray pixels as the outermost edge of the hair. Even when the model miscategorized regions of an image, the bounding boxes were often robust to those errors and returned meaningful, accurate results.
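A sketch of these measurements, assuming each mask is a binary NumPy array with 1 marking hair; exactly how the 98-pixel tolerance should be split across the box’s sides isn’t spelled out above, so dividing it evenly among the four sides is an approximation:

```python
import numpy as np

def summarize_mask(mask: np.ndarray, tolerance: int = 98):
    """Measure a binary hair mask (1 = hair, 0 = not hair).

    Returns the share of the frame that is hair, the centroid of the
    hair pixels and a bounding box that leaves roughly `tolerance`
    outlying hair pixels outside it, so a few stray pixels can't drag
    the box to the edge of the frame.
    """
    share = mask.mean()                # fraction of pixels that are hair
    ys, xs = np.nonzero(mask)          # coordinates of every hair pixel
    centroid = (xs.mean(), ys.mean())  # mean x-value, mean y-value

    q = (tolerance / 4) / len(xs)      # ~tolerance/4 pixels past each side
    x0, x1 = np.quantile(xs, [q, 1 - q]).astype(int)
    y0, y1 = np.quantile(ys, [q, 1 - q]).astype(int)
    return share, centroid, (x0, y0, x1, y1)
```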
