Sometimes reading one thing on the web leads you to another
thing, and then you discover something humorous, well-written and very
thought-provoking. Yesterday, for me, that was Deep-Fried Data. I encourage you to read
the post in its entirety; it is the text version of a talk given to the Library
of Congress. The post has a link to a video if you prefer the audio-visual
experience. I personally prefer the text because I can chew over the arguments
slowly at my own leisure.
It’s a relatively long article (from a 15-20 minute talk),
but I’ll quote three short sections to whet your appetite.
“Today I'm here to talk to you about
machine learning. I'd rather you hear about it from me than from your friends
at school, or out on the street. Machine learning is like a deep-fat fryer. If
you’ve never deep-fried something before, you think to yourself: ‘This is
amazing! I bet this would work on anything!’ And it kind of does...”
“I find it helpful to think of
algorithms as a dim-witted but extremely industrious graduate student, whom you
don't fully trust. You want a concordance made? An index? You want them to go
through ten million photos and find every picture of a horse? Perfect... You
want them to draw conclusions on gender based on word use patterns? Or infer
social relationships from census data? Now you need some adult supervision in
the room.”
“People are pragmatic. In the absence
of meaningful protection, their approach to privacy becomes ‘click OK and
pray’. Every once in a while a spectacular hack shakes us up. But we have yet
to see a coordinated, tragic abuse of personal information. That doesn't mean
it won't happen. Remember that we live in a time when a spiritual successor to
fascism is on the ascendant in a number of Western democracies. The stakes are
high.”
The article is particularly thought-provoking as I have been
warily watching the rise of data analytics approaches to solving the problems
of higher education. When behemoth companies start plunking down gobs of money
into selling products and services to universities, the faculty should really
take notice. The calls for “data-driven” assessment that pervade our
institutions should make us pause and ask questions. What is this data for? How
is it chosen? What does it actually tell us? What is left out in the choice?
How is the data reduced into a digestible sound bite, often some numerical
value? Who owns the data? How would it drive decision-making and strategic
planning?
My ears now perk up when I hear the phrase “data-driven”.
It’s something like “best practices”, usually implying one best practice
determined by whoever is bandying the phrase about. As a scientist, I’m strongly in
favor of using data to support an argument, or to make a case. When I was
department chair, I would go to the administration with a data-driven argument,
accompanied by graphs and tables in a clear and pre-digested format to get what
I needed resource-wise. It’s an effective tactic given the way the winds have
been blowing in the increasingly all-administrative university. Put several of
these tactics together and you get a strategy. But is it a wise strategy?
With data science programs popping up all over (such as this one at the University of Illinois), fully online of course, and costing a
chunk, the lure of big-data jobs sings its siren song. As long as the corporate
world is infatuated with Big Data, there will be plenty of takers. I’ve never
taken any of these courses but I sincerely hope that students learn how to
interrogate themselves as they mine their data quarries, looking for the riches
hidden within. It is human nature to see patterns and weave them into a
narrative. But human choices are made. Every step of the way. Humans design the
algorithms. Choices are made in the underlying theoretical models. And as the
layers get deeper and more complex, we start to understand less and rely more
on the output happily provided by the black box.
After all, everything tastes good when it is deep-fried.