Sunday, April 19, 2009

Cataloging & search algorithms:
the Amazon "scandal"

In Sunday's New York Times, Motoko Rich has an Ideas & Trends article entitled "Crowd Forms Against an Algorithm."

Rich's starting point is last week's sudden online controversy over a major "cataloging error" by Amazon.com, one which "caused thousands of books—a large proportion of them gay and lesbian themed—to lose their sales ranking, making them difficult to find in basic searches."

In fact, "more than 57,000 titles were affected, including [top-selling classics like] E.M. Forster's Maurice, the children's book Heather Has Two Mommies and False Colors, a gay historical romance by Alex Beecroft."

One focus of Rich's article is the massive, almost instantaneous reaction of Amazon's online critics, which ranged from accusations of deliberate, homophobic censorship to arguments that, "while Amazon's intentions may not have been overtly prejudiced, the assumptions underpinning the retailer's categorization of books were still suspect."

What most caught my attention, though, was Rich's additional focus on the fact that the traditional ethical challenges of cataloging are being compounded by today's complex algorithms for categorizing digital content.

"It wasn't the first time," Rich writes, "that a technological failure or an addled algorithm has spurred accusations of political or social bias. Nor is it likely to be the last. Cataloging by its very nature is an act involving human judgment, and as such has been a source of controversy at least since the Dewey Decimal System of the19th century" [emphasis added].

Some folks this past week tried to exonerate Amazon of any intentional discrimination, acknowledging, as did Ed Lazowska of the University of Washington, that there can be all sorts of "unintended consequences of either computer algorithms or human behavior."

As Rich explains, though, others, including some who would not accuse Amazon of outright bigotry, still hold Amazon accountable. Mary Hodder wrote in a post on TechCrunch.com:

The ethical issue with algorithms and information systems generally is that they make choices about what information to use, or display or hide, and this makes them very powerful.... These choices are never made in a vacuum and reflect both the conscious and subconscious assumptions and ideas of their creators.

Dewey is a case in point, says Rich. The DDS is heavily biased toward the dominant white, European-American, Christian culture in which it was birthed. "More than half of the 10 subcategories devoted to religion, for example, catalog Christian subjects."
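
To make Hodder's point concrete, here is a minimal, purely hypothetical sketch of how a single metadata flag applied at cataloging time can quietly drop titles from ranked search results. Nothing below is Amazon's actual code: the Book fields, the flagged-subject rule and the search function are all invented for illustration.

```python
# Hypothetical illustration only: how one cataloging flag can quietly hide
# titles from a "basic search." No names or rules here come from Amazon.

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    subjects: list                # subject headings assigned at cataloging time
    sales_rank: int               # lower number = better seller
    excluded_from_rank: bool = False   # the quiet, consequential flag

def catalog(book: Book, flagged_subjects: set) -> Book:
    # One rule buried in the cataloging pipeline: if any assigned subject
    # is on the flagged list, the title loses its sales ranking.
    if any(s in flagged_subjects for s in book.subjects):
        book.excluded_from_rank = True
    return book

def basic_search(query: str, items: list) -> list:
    # The search layer never revisits whether the flag was fair; it simply
    # filters on it and sorts by sales rank. The bias was decided upstream.
    hits = [b for b in items
            if query.lower() in b.title.lower() and not b.excluded_from_rank]
    return sorted(hits, key=lambda b: b.sales_rank)

books = [
    Book("Maurice", ["Gay fiction", "Classics"], 1200),
    Book("A Room with a View", ["Classics"], 800),
]

# If a whole subject heading is (mis)classified as restricted, every title
# under it vanishes from basic search; no one chose each book by hand.
cataloged = [catalog(b, {"Gay fiction"}) for b in books]
print([b.title for b in basic_search("a", cataloged)])
# -> ['A Room with a View']; "Maurice" is simply absent.
```

The point of the sketch is that the search layer looks neutral; the consequential choice was made upstream, in the metadata, which is exactly where Hodder locates the responsibility.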

Most of us 'Net-age library professionals have little direct input into the processes of subject classification and cataloging—either in the library world proper or in the commercially driven world of online search algorithms.

Nonetheless, this one brief article has reminded me that we at least have an ethical responsibility as professionals to watch for and, if necessary, counterbalance both the intentional and the accidental biases of the cataloging and search tools that we and our customers use.

This includes, by the way, being scrupulously watchful for our personal biases in listening to our customers' requests and conducting the reference interview. I don't mean just the obvious biases having to do with class, culture, race, religion, etc. I also mean those subconscious biases which can arise out of disinterest in, discomfort with, or outright distaste for certain subject areas which customers wish to explore.

For example, I've never had any interest in business. In fact, I grew up with a 1960s liberal kid's cynicism toward what I supposed were the motives of business people. That disinterest in and latent resentment toward the business world means not only that I have remained ignorant of the needs of customers with business questions, but also that I have avoided learning what resources would serve them.

Not very professional on my part. Certainly not good service to my customers.

Rich's article is worth reading in its entirety, and the issue of bias in categorizing and cataloging is an important one for library professionals to keep in plain sight.

One more demonstration that the mandate of the library is to protect everyone's access to all information.

Thanks,
Mike

