Sunday, November 7, 2021

Classification

I just finished reading Kate Crawford’s Atlas of AI. It is less about the nuts and bolts of A.I. (although there is some of that) and more about the ethics and implications of its current and future use. Why an atlas? So that one can both zoom out for the broad view and zoom in on some of the details, and see how they intertwine. While the book roams through several different aspects, the question of who benefits and who is being taken advantage of comes up repeatedly. They may be tech workers, Uber drivers, criminals whose mugshots are stored in a database, mining companies, and more. The first paragraph of the book’s concluding chapter, “Power”, is a good summary of its contents:


Artificial Intelligence is not an objective, universal, or neutral computational technique that makes determinations without human direction. Its systems are embedded in social, political, cultural, and economic worlds, shaped by humans, institutions, and imperatives that determine what they do and how they do it. They are designed to discriminate, to amplify hierarchies, and to encode narrow classifications. When applied in social contexts… they can reproduce, optimize, and amplify existing structural inequities. This is no accident: AI systems are built to see and intervene in the world in ways that primarily benefit the states, institutions, and corporations they serve. In this sense, AI systems are expressions of power that emerge from wider economic and political forces, created to increase profits and centralize control for those who wield them.



These last several weeks, the media has been shining a spotlight on Facebook and its practices. Crawford’s analysis is spot-on here, as is Zuboff’s, and that of many others who have sounded the alarm on the unregulated and extractive nature of A.I. and how it impinges on our lives whether we like it or not. We are entangled with machines and there’s no going back. As Roose warns, the question is whether we end up being machine-assisted or machine-managed. The latter is colonizing more territory every day in the name of techno-efficiency. I’m not going down that rabbit-hole in today’s post. Instead I want to focus on Crawford’s claim that A.I. encodes narrow classifications. This is what Chapter 4 of her book is about, and her main example is ImageNet.


How does one classify images to train an A.I.? Well, you need a training set: lots of pictures and lots of cheap manual labor (via Amazon Mechanical Turk, for instance) to attach labels to those pictures. Where do the labels come from? In ImageNet’s case, from WordNet, which organizes nouns into a particular taxonomy. The hierarchy is pre-built in a narrow way based on that classification system, and the training set further narrows the scope, not to mention the implicit human biases in the classification choices themselves.
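
As a rough illustration of what that pre-built hierarchy looks like (not a reconstruction of how ImageNet was actually assembled), here is a short Python sketch that walks WordNet’s hypernym chain for a noun using the NLTK library; the word “hammer” is just an arbitrary example:

# Sketch: walk WordNet's hypernym (is-a) chain for a noun with NLTK.
# Requires nltk to be installed and the 'wordnet' corpus downloaded once.
from nltk.corpus import wordnet as wn

synset = wn.synsets('hammer', pos=wn.NOUN)[0]   # first noun sense of "hammer"
chain = [synset]
while chain[-1].hypernyms():                    # climb toward the root ("entity")
    chain.append(chain[-1].hypernyms()[0])

print(" -> ".join(s.name().split('.')[0] for s in chain))
# e.g. hammer -> hand_tool -> tool -> implement -> ... -> entity

Every ImageNet label sits somewhere on a chain like this one, which is exactly the narrowness Crawford is pointing at.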


One of my research projects involves training a program to calculate some molecular properties. I built the training set and have been playing around with a standard classification system that subdivides larger molecules into smaller fragments. This isn’t going so well, so I’ve been dreaming up some alternative classification systems that may do a better job “taming” the data. I’m sure my biases are affecting things in some way although likely not to the extent of present machine-learning face-recognition systems. Reading Crawford’s chapter on classification didn’t impact my research so much as make me think about chemistry and how I teach it.
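
For the curious, here is a minimal sketch of what I mean by fragment-based classification, using RDKit’s BRICS decomposition as a stand-in; it is not the scheme from my project, and caffeine is just a placeholder molecule:

# Sketch: break a molecule into standard fragments with RDKit's BRICS rules.
# This stands in for the fragment scheme in my project, not the real thing.
from rdkit import Chem
from rdkit.Chem import BRICS

mol = Chem.MolFromSmiles('CN1C=NC2=C1C(=O)N(C)C(=O)N2C')  # caffeine, as a placeholder
fragments = BRICS.BRICSDecompose(mol)                      # set of fragment SMILES

for frag in sorted(fragments):
    print(frag)

Whatever fragmentation rules you pick, the choice itself is a classification decision, with all the narrowing that implies.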


As a so-called expert in chemistry, I have particular notions of how chemistry is organized; you might call these my “expert schema”. My goal is to help students move along the continuum from novice to expert by helping them build their schema. To do this, I have to classify things into categories so students can start to spot similarities and differences. The periodic table is an organizing schema of a sort. The division into elements that are metals or non-metals is a useful classifier. The model of atoms as balls connected by springs works well for covalent molecules with their directional bonds. I’ve used the general bond energy curve to help students think more broadly about how particles interact with each other. For that matter, helping students to think about energy requires a number of classifications.
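
As an aside, that general bond energy curve is easy to sketch numerically; a Morse potential with illustrative parameters (roughly H2-like, not fitted to anything) captures the shape I draw on the board:

# Sketch: a generic bond energy (Morse) curve, E(r) = De*(1 - exp(-a*(r - re)))**2 - De,
# with illustrative parameters only.
import numpy as np

De, a, re = 4.5, 1.9, 0.74   # well depth (eV), width parameter (1/Angstrom), equilibrium distance (Angstrom)
r = np.linspace(0.4, 4.0, 10)
E = De * (1 - np.exp(-a * (r - re)))**2 - De

for ri, Ei in zip(r, E):
    print(f"r = {ri:4.2f} A   E = {Ei:6.2f} eV")

The numbers matter less than the shape: a steep repulsive wall, a minimum at the equilibrium distance, and a long attractive tail heading toward zero as the bond breaks.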


Classification is by nature reductionist. This has its pros and cons. Reducing something that would otherwise seem very complicated, with all sorts of things going on, is very helpful to the novice learner. However, if the system isn’t merely complicated but complex, then simplifying things via classification may throw out the baby with the bathwater. And chemistry is complex, at least I think so. I hope that in the upper-division classes of our curriculum we guide students back through some of that complexity and help them enrich their simplistic schema. This is the challenge of teaching chemistry, or any other complex subject that one can’t simply soak up unconsciously by osmosis. It takes a lot of effort on the part of the student to replace their folk-science understanding with the strangeness of nature.


But perhaps how scientists look at the natural world is blinkered by the very classification schemes we devised so we could comprehend it, or at least think we comprehend it. And this reductionist approach may be part of why science seems both strange and alien, at least once you get into its intricacies. Conceptualization is tricky; I don’t quite understand how it works, and yet somehow I’ve acquired abstract ideas about chemistry that organize my knowledge. Somehow that passes for expertise; at least that’s how students classify me at the moment, as the one who knows the stuff. Do I really? I’m not so sure.
