Wednesday, October 05, 2016

RubyConf 2015 - Keynote: Consequences of an Insightful Algorithm by Carina C. Zona


…a toolkit for empathetic coding, and we're gonna be delving into some really specific issues and examples of uncritical programming and the painful results of doing things in ways that were benignly intended. So I want to start off with a content warning, because I'm gonna be delving into examples that deal with a number of very sensitive topics: grief, PTSD, depression, miscarriage, infertility, sexual history, consent, doxxing, racial profiling, and the Holocaust. While none of those are the main topic, they are examples that are gonna come up, so anyone who feels a sudden need for coffee, please go ahead; I won't feel at all offended. That's about ten minutes in, so you've got some time to think about it.
Algorithms impose consequences on people all the time, and we're able to extract remarkably precise insights about an individual. But the question is: do we have a right to know what they didn't consent to share, even when they willingly share the data that leads us there? And how do we mitigate the unintended consequences of doing that?
Let's start by asking the very basic question: what is an algorithm? It's a step-by-step set of operations for predictably arriving at an outcome. That's a very generic definition, and of course usually when we talk about algorithms we're talking about those of computer science or mathematics, the patterns of instructions articulated in code or in mathematical formulas. But you can also think of algorithms in everyday life; they're patterns of instructions articulated in all sorts of ways, for example as recipes, or directions, or even a crochet pattern.
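In code, that generic definition can be as small as a few lines. Here's a tiny, made-up Ruby example of a step-by-step procedure that predictably arrives at an outcome; the prices and tax rate are purely illustrative.

# A step-by-step set of operations for predictably arriving at an outcome:
# given the same prices and tax rate, it always produces the same total.
def total_with_tax(prices, tax_rate: 0.08)
  subtotal = prices.sum            # step 1: add up the line items
  tax      = subtotal * tax_rate   # step 2: compute the tax
  (subtotal + tax).round(2)        # step 3: combine and round
end

puts total_with_tax([4.99, 12.50, 3.25])   # => 22.4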
Deep learning is the new hotness right now for data mining. Essentially it's algorithms for fast, trainable artificial neural networks. It's a branch of machine learning that's been around since the early eighties, but mostly it's been locked in academia, in part because of difficulties with scale. Very recently there have been a number of breakthroughs that have made it possible to finally put this into real production, so it's become realistically possible to extract really meaningful insights out of big data, and I'm talking about in production.
Another way to think of this is that deep learning relies on an artificial neural network's automatic discovery of patterns within a training data set, and those are applied to drawing intuitions about future inputs. Just in terms of process, what this means is that the inputs, what we call the training data, can be various things: an array of words, images, sounds, objects, concepts. And when I say an array, I mean something like a few terabytes. Once you have the input, execution is simply running a series of functions repeatedly on the array. Those iterations are referred to as layers, and with each layer it's getting more and more precise, more fine-grained; it's running each of those iterations on the previous set of findings, so it's really getting down to a very precise level of thousands of factors. And this is all without having to label the data or categorize it; you're just throwing the data at it. What that means is that deep learning is premised on a black box.
The network has drilled down to those tens or hundreds of thousands of factors, but they're really subtle, and it believes they have predictive value, but we don't know what it's basing that on at all.
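To make that idea of "a series of functions run repeatedly, in layers" a little more concrete, here is a toy Ruby sketch. It is not a real neural network, just an illustration of each layer refining the previous layer's output; every number and transformation is invented.

# Toy illustration of "layers": each layer is a function applied to the
# previous layer's findings, so the representation gets refined step by step.
# Real deep learning learns weight matrices over terabytes of data; this only
# mimics the shape of that process.
layers = [
  ->(xs) { xs.map { |x| x * 0.5 } },       # layer 1: scale the raw inputs
  ->(xs) { xs.map { |x| [x, 0].max } },    # layer 2: keep only positive signal
  ->(xs) { xs.each_slice(2).map(&:sum) }   # layer 3: combine neighboring features
]

input  = [0.2, -1.3, 0.8, 2.1]             # stand-in for a huge data array
output = layers.reduce(input) { |data, layer| layer.call(data) }
p output                                   # a smaller, more "refined" set of findings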
Right now this is a technology that's driving major advances in a variety of areas: medical diagnostics, pharmaceuticals, predictive text, voice-activated commands like Siri's, fraud detection, sentiment analysis, language translation, and even self-driving cars. More specifically, today we're going to look at some really concrete examples of those, including ad targeting, behavior prediction, image classification, and facial recognition. But you know, all this sounds a little abstract, so let's look at a really simple, concrete example that's kind of fun. This is MarI/O.
It taught itself how to play Super Mario World. It starts with absolutely no clue whatsoever, and I mean about its world, about even the concept of rules or gaming. All it does is manipulate numbers, and it notices that sometimes things happen; it notices that some of those things cumulatively produce outcomes. It learns movement and play via a purely self-directed training session, and it does this in those layers over 24 hours of experimentation, which leads it to identify patterns and use those patterns to predict insights, such that it is actually able to play the game.
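For a rough sense of how that kind of self-training works, here is a deliberately crude Ruby sketch of the loop: try a variation, keep it if it scores better, repeat. The real MarI/O evolves neural networks rather than action lists, and the scoring function here is a made-up stand-in for running the emulator and measuring progress.

ACTIONS = %i[left right jump run].freeze

def random_policy(length = 20)
  Array.new(length) { ACTIONS.sample }
end

def mutate(policy)
  copy = policy.dup
  copy[rand(copy.length)] = ACTIONS.sample   # tweak one step at random
  copy
end

# Hypothetical score; a real one would run the game and report distance travelled.
def score(policy)
  policy.each_cons(2).count { |a, b| a == :run && b == :jump }
end

best = random_policy
1_000.times do
  candidate = mutate(best)
  best = candidate if score(candidate) >= score(best)   # keep what does better
end
puts "best score after training: #{score(best)}"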
So, speaking of games, let's play one right now. It looks something like bingo, but it's called Data-Mining Fail. Insightful algorithms are full of pitfalls like these, so by looking at case studies we're able to explore some of the things on this board. Are you ready?
All right, here we go. The first one is Target. In the retail sector, the second trimester of pregnancy is referred to as the holy grail, and the reason is that it's one of the few times in our lives when all of our shopping habits suddenly come up for grabs again. Our buying loyalty, our store loyalty, our brand loyalty, everything we do in terms of spending, we're suddenly rethinking, and it's a great opportunity, obviously, to create a lifetime customer, potentially even a family of lifetime customers. That's valuable stuff. Most retailers are only able to use data to figure this out in the third trimester.
So one day Target had a few marketers who walked across the office and asked the programmers a really simple question: if we wanted to figure out whether a customer is pregnant, even if she doesn't want us to know, could you do that? It's a really interesting challenge, right? I mean, wouldn't you think about it? Wouldn't you want to experiment a bit? He actually did come up with an algorithm, and it turned out to be very reliable. They started sending out ads for maternity and infant care. One day a man comes into one of the stores and he's very angry. He's yelling: how dare you, you sent this to my teenage daughter, are you trying to tell her to have sex, are you trying to tell her to get pregnant? Now the store manager obviously is not in charge of the national mailers; nevertheless he apologized. Anyway, the man came back the next day and said: I owe you an apology. It turns out there were some things I didn't know about going on in my household. My daughter is pregnant.
Other people complained too, and Target took a lesson from that. They thought about it real hard, and they decided the best thing to do was to deceive and manipulate people. So they do the same ad targeting, but now they couch those ads among unrelated products. They don't care about those; they're still sending those same ads, but the customer doesn't know. So you might see something like, for instance, diapers and cologne in the same ad. And, hey, this is really great, because as long as the pregnant woman thinks she hasn't been spied on, as long as we don't spook her, it works.
Shutterfly was in a somewhat similar situation. They sent out emails saying: as a new parent, congratulations, it's time to send thank-you notes for your birth announcement. Not everyone who received those emails had actually had a baby. This was a little awkward, and although some people found it quite amusing, not everyone did. "Thanks, Shutterfly, for the congratulations on my new bundle of joy. I'm horribly infertile, but adopting a cat." "Lost the baby in November who would have been due this week. It was like hitting a wall all over again." Shutterfly responded that "the intent of this email was to target customers who recently had a baby." Well, duh.
That's not an apology; it's barely even an explanation. This caused real harm. A few months ago Mark Zuckerberg excitedly announced that he's going to be a father, and he wrote about a series of miscarriages that he and his wife dealt with as a couple. He said: you start imagining who they will become and dreaming of hopes for their future; you start making plans, and then they're gone. It's a lonely experience. Juxtaposition.
Facebook's Year in Review had been around for a few years. It was considered essentially beta, and it was mainly something that was self-selecting: you could pick through your posts from the past year and create a little sort of memoir. This past year they decided to do it algorithmically, and so your news feed would fill with the wonderful moments of your past year. What they failed to take into account is that our lives are constantly changing: the relationships we've had, the jobs we've had. Our memories don't necessarily stay the same; things that were joyous back then may not be now.
Eric Meyer coined the term "inadvertent algorithmic cruelty," and he defines it as the result of code that works in the overwhelming majority of cases but doesn't take other use cases into account. The reason he's the one naming this is that he's one of the people it happened to: "This is a picture of my daughter, who is dead, who died this year. The Year in Review comes up and it keeps coming up in my feed, rotating through different fun and fabulous backgrounds, as if celebrating her death, and there's no obvious way to stop it." Eric calls on us to increase awareness of, and consideration of, the failure modes, the edge cases, the worst-case scenarios. Obviously I hope to do that here today,
but I also would really like you to carry that forward to others. With that in mind, I'm gonna give you my first recommendation: be humble. We cannot intuit interior states, emotions, or private subjectivity. Not yet, anyway.
Eric's blog post was just last December, and it garnered a lot of attention, because, hey, he's Eric Meyer. It got attention within the industry, obviously; it also got it from the mainstream media. And there's really a question of how you avoid blindsiding someone with unpleasant stuff annually. Obviously Facebook must have done a bit of introspecting on this problem, because it's not easy. Four months later they introduced a change, and it's called On This Day. It's daily reminders: five years ago today you became Facebook friends with somebody, two years ago you went hiking, a year ago you did such-and-such. And I notice they say, you know, we care about you. The implication is: we get it this time. Here's a memory from three years ago, and we think you'll like it.
"On this day you posted a picture of your dog." "Thanks, Facebook, for picking today to hit me with this dumb feature and remind me that my dog died three years ago today." "Sometimes Facebook's On This Day sends me memories from high school, and you know, it's triggering. I did not enjoy high school. I want and need to forget." You do not get to decide which parts of my past I should keep fresh in my mind and which parts I walk away from. We as programmers have to learn from mistakes. We need to learn from ours; we need to learn from others'. We need to decide that "harmful" and "harmless" are not consequences that balance each other out.
So take Fitbit. When it started out, it was essentially gamifying various personal metrics: you know, how many jumping jacks you did, how far you walked, your weight loss. And as part of the game it also had another feature: it had a sex tracker, and it was public by default.
Let's talk about that for a minute.
Remember the generic definition of an algorithm: step-by-step operations leading to a predictable outcome. The algorithm here was treating all data the same; the outcome was making everything public, because we're all competing, right, on all our metrics. This is the fail: people didn't know this was public; there was no warning. Some data is different from other data. We don't get to just spill it all out in public as if all data were equal.
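One way to act on "not all data is equal" in code is to let visibility defaults depend on how sensitive a category of data is. A minimal sketch, with the category names and policy entirely hypothetical:

# Hypothetical defaults: sensitive categories stay private unless the user
# explicitly chooses otherwise; routine metrics can default to friends-only.
SENSITIVE_CATEGORIES = %i[sexual_activity health location weight].freeze

def default_visibility(category)
  SENSITIVE_CATEGORIES.include?(category) ? :private : :friends
end

def publish(metric, category:, user_choice: nil)
  { metric: metric, visibility: user_choice || default_visibility(category) }
end

p publish("10,000 steps", category: :steps)               # friends by default
p publish("activity logged", category: :sexual_activity)  # private by default
p publish("activity logged", category: :sexual_activity, user_choice: :public)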
All right. Obviously most of us need some sort of internal ops tools, right, for monitoring, performance tuning, business metrics. Fine. So Uber has one; it's called God View. Uber did not limit access to the admins or restrict it to operational use. Employees, including drivers, could freely identify any passenger and monitor that person's movements. Drivers had access to God View's records as well; even a job applicant was welcomed to access these private records. Managers felt free to abuse God View for altogether non-operational purposes, such as stalking celebrities' rides in real time and showing that off as party entertainment. Negligence is abuse of an algorithm.
So, you might remember a few years back the research group at OkCupid used to blog about things they were learning from aggregate trend data, and the blog focused on sharing insights into simple ways that you, the OkCupid user, could use the dating site well. That's its purpose. Uber's research team used to blog about its data too, with a crucial difference: it's not about improving the customer's experience of the service. In fact, if you look closely, Uber can and does track your one-night stands. This is purely invading people's privacy, not for business purposes but purely for the sake of judging and shaming people. That is not a predictable consequence of signing up for a rideshare service.
AdWords. A few years back there was a Harvard study that looked at two different sets of data, one on AdWords itself and one on a site hosting AdWords. Both had the same ad templates, and what the researchers did was throw some names at them: ones that were strongly correlated with black people and ones that were correlated with white people. They did searches for people with those actual first names, real people, with various last names, just to see what would come up. And what came up is this: a black-identifying name was 25 percent more likely to result in an ad implying that the person had an arrest record. In examples like this, the algorithm focuses on predicting what we click on. That's it, right? The real world is irrelevant.
AdWords' job is simply to figure out what makes us click, based on what it knows about us and on the activity of other people, what it knows about them, what it has observed. What we see in this is our collective bias being both reflected back to us and reinforced. Data is generated by people. It's not objective; it's constrained by our tunnel vision, it replicates our flaws, it exposes our preconceptions.
Twitter. "Accidental algorithmic run-in": a phrase that's obviously familiar, but just a little bit different from what Eric was talking about. Joanne McNeil coined this term. She didn't give it a formal definition, but it can be roughly summarized as: classifying people as similar, where careless prompts create scenarios that are harder to control and prepare for. Typically we're talking about some version of a recommendation engine, and essentially what it means is that you're trapped by a recommendation system that is determined to show you someone similar whom you'd actually rather avoid. It's a false positive that can't easily be detected algorithmically or corrected by the user.
Sometimes that similarity factor is pretty trivial, right: my boyfriend's ex. Sometimes the factor connecting you to the person is intensely upsetting. If you are stalked by a former coworker, Twitter may reinforce this connection algorithmically, boxing you into a past that you're trying to move on from. Your similarity score with your harasser will grow and grow with every person who follows them at Twitter's recommendation, and notice, just like AdWords, the algorithm doubles down on its false certainty with every action that third parties take. You're not in control of it. Similarity-action algorithms become, in effect, a proxy for harassers, and many of these systems don't give you any way to turn that off.
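One obvious mitigation is to let the user's own signals override the similarity score. This is a hypothetical sketch, not any platform's actual API: before a "you should follow" suggestion goes out, it is filtered against accounts the user has blocked, muted, or asked never to be shown.

Suggestion = Struct.new(:account_id, :similarity)

# Drop anyone the user has explicitly pushed away, no matter how "similar"
# the model thinks they are, then rank the rest.
def recommendations_for(user, candidates)
  suppressed = user[:blocked] + user[:muted] + user[:never_suggest]
  candidates.reject { |s| suppressed.include?(s.account_id) }
            .sort_by { |s| -s.similarity }
            .first(3)
end

user = { blocked: [42], muted: [], never_suggest: [7] }
candidates = [Suggestion.new(42, 0.97), Suggestion.new(7, 0.91), Suggestion.new(13, 0.40)]
p recommendations_for(user, candidates)   # the harasser never appears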
Google Photos. All right, so when facial recognition started becoming mainstream, we saw plenty of humorous examples like this one. It made lots of mistakes; early versions of iPhoto helpfully detected faces in your baked goods. That's great; it's a false positive, but it doesn't really matter. Here's another one; you probably remember this one from six months ago. This is Microsoft's how-old bot, and it's actually using deep learning itself to take facial recognition to the next level. It's drawing intuitions about age based on nothing but your photo; it's assigning tags for gender, also looking at nothing but that photo. And of course, inevitably, this is pretty new for them; it's gonna make mistakes along the way, and these ones are pretty harmless too, right? You're not really using them for anything. Not all false positives are funny like this, though,
such as the next one, which is also from just six months ago. Flickr classified this as children's playground equipment. This is Dachau, the concentration camp, with the infamous motto "work will set you free." Something else here: the gray tags are the ones the photographer manually added; the white tags are Flickr's. This is a consequence of algorithmic hubris: treating human understanding as irrelevant for machine intuition, treating data as inherently reliable, which, as we know, especially with something like this, it isn't. Flickr tagged this man as an animal; originally it also tagged him as an ape, and that's a comparison that we all know has a particularly troubling history in America. So let's be clear here: this is not a criticism of machine learning; this isn't about any one company or coder. We are all subject to the same pitfalls that they fell into. Google Photos, just a month after that last one: "You fucked up. My friend's not a gorilla." So we've seen this happen at least twice,
right? How does it happen? For one answer, you have to go all the way back to the nineteen-fifties. Kodak was developing color film, and they optimized for white skin; black skin was not a relevant market for them. So lab technicians every day had to calibrate equipment based on a card they called a Shirley card, after, I think, the original model, featuring a white-skinned woman in white accessories, making sure that this is where the detail was showing up.
Which means that our development processes have had to respond to this legacy. The data is generations old, but the tools used to make film, the science of it, are not racially neutral. We are throwing terabytes of data at these systems, and some of it is old; it carries this with it, and even today it has been carried over into digital sensors. Think about it: no one was going to allow an entirely different rendering of exposure. We would have said of these cameras, forget digital; they had to mimic the same kind of exposure that we were used to for decades. So here we have what is essentially sensors with no choice, and inadequate sensors contaminating data. Even now, black skin is not represented in detail the same way, and that's a hard problem to deal with, and that makes it really tempting to just avoid thinking about it at all.
Affirm: you might not be as familiar with them. They are a consumer lending company; they extend smallish amounts of credit for buying particular consumer goods, and they've also expanded now into loans for things like coding schools. Their application is super simple; they take just a few factors. They're looking at your name, your email, your mobile number, your birthday, and your social security number, and from that they go hunting. They sometimes ask for a bit more information, like, for instance, your LinkedIn or GitHub profile. Practically everyone here has one, right? Two percent of open-source contributors are women.
This immediately introduces bias, just by asking the question at all. Other real factors are looked at too, including how long it takes you to type, or how much you pause while reading the Terms of Service, which must be awkward for people like Stephen Hawking. These are algorithms for reinforcing privilege. Remember that an algorithm is just a procedure for reliably coming to an outcome. That reliability means it's up to us to take into account the impact the outcomes are gonna lead to. The outcome here is reliably identifying privileged people and reliably excluding most people who don't have an abundance of privilege. This isn't about their creditworthiness; it's about their ability to have access to the privilege of paying. It all comes from looking at a deluge of random data points and learning how to assign labels to them, based on factors like this.
Take the assumption about how long it takes you to read something. Well, you know, sometimes it's because you have a kid that you're chasing after. There are lots of reasons that have absolutely nothing to do with your creditworthiness, and yet you're being judged by it. Without comprehension of meaning and context, these biases are always gonna run rampant. The immense power of machine intuition is absolutely irreplaceable; I'm not questioning that. It's not a replacement for comprehension, though. Alan Turing reminds us that if a machine is expected to be infallible, it cannot also be intelligent.
So Affirm analyzes the applicant's social media accounts, including things like Facebook. So do some other companies. In 2012, Germany's biggest credit rating agency considered evaluating applicants based on their Facebook relationships. That's really weird; a Facebook friend is not necessarily at all the same as a friend-friend. And what about when the Facebook friend really is a friend? Facebook recently defended a patent that pushes further than Affirm does, making credit decisions about a person based on the unrelated credit history of their Facebook friends. This is an algorithm with the potential to deeply intrude on and alter a person's relationships, if you have to opt out of knowing them and interacting with them just to be able to get your own loan. It's about being financially shamed and punished by an algorithm. "It's important to maintain the discipline of not trying to explain too much," says the CEO of Affirm; adding human assumptions, he noted, could introduce bias into the data analysis.
But data is not objective. Data always has bias inherent in it, at minimum from how it was collected and how it was interpreted, and every one of those flaws and assumptions in that first training set, and in the original functions it was passed through, of course influences the algorithms that get generated, and thus the outcomes for the next data thrown at them.
Affirm says it assesses 70,000 factors, and no one knows for sure how many of those have the potential for discriminatory outcomes. How would anyone even know? It's not like someone can tell you what criteria led to a decision. This is completely different from the credit lending we do now. The choices the algorithm makes can only be seen from inside that black box, so I took a photo for you of the inside of a black box. Making lending decisions inside of a black box is not a radical new business model; it isn't disruptive, it's a regression. What it's disrupting is fairness and oversight.
Data always has underlying assumptions: about meaning, about accuracy, about the world in which that data was generated in the first place, about how code should assign meaning to it. Underlying assumptions influence outcomes and consequences. Right now we're in an arms race, a data-mining arms race. Major players are making big bets on deep learning and those opaque intuitions, and yeah, for the moment quality varies, like Microsoft's example, but we need to remember that deep learning is all about iteratively drawing predictive intuitions at extremely fine-grained levels. All of these things matter, which means the systems are growing both more precise in their correctness and more damaging in their wrongness, which presents a dilemma for us.
I really believe that we can flip the paradigm, because we do care about getting this stuff right; we do want to be empathetic coders. So the question becomes: how do we flip that paradigm? I'm gonna give you some starting points. One: consider decisions' potential impact on others. How might a false positive affect someone, such as those Shutterfly customers or those Twitter users? How might a false negative affect someone, like being denied a loan? How many other ways can an algorithm's intuition be superficially correct and yet deeply wrong in the human context, like that photo of Dachau, or Eric Meyer being reminded of his daughter? We can do things like project the likelihood of consequences to others and minimize negative consequences to others. And if you notice, I keep repeating "to others," because this is the kind of stuff that we're really good at thinking about for ourselves and our companies; we're not so good at thinking about the impact on the users. You can think about this much like the Hippocratic oath, doctors pledging to first do no harm.
Let's try that. We also need to be honest and trustworthy, and obviously we're going to try to do those things just because they're good, but in this case we also really need to be able to be trusted when we make mistakes, because we're definitely gonna make them. We need to be able to say: it was a mistake, it was an honest mistake, we will correct this, it will not happen again, we apologize, and be believed. This is the value of making sure that this is part of our development process in the first place.
That also means it's really important to build in recourse, for someone to easily correct things when we do reach a conclusion that's wrong. We should provide others with full disclosure of limitations and call attention to signs of risk and harm to those people. And we need to be visionaries about creating more ways to counteract biased data, biased analyses, biased impacts. We need to anticipate diverse ways to screw up. As long as the teams charged with defining data collection, use, and analysis are less diverse than the intended user base, we will keep failing them. Black people already knew that cameras were not so great for them; it's not a surprise. We must put decision-making authority in the hands of highly diverse teams.
Culture fit is the antithesis of diversity: it's superficial variation being allowed to exist while unique perspectives are suppressed, because the end point of culture fit is groupthink. And single-dimensional variety is not diversity either. Diversity is being wildly varied on as many dimensions as possible: different origins, different ages, different assumptions, differing experiences. Diversity is where there is no majority that you can identify.
We need to ask for permission, with the default being: no, you don't have permission. We need to focus on the many who are eagerly willing to share themselves and who are enthusiastic about giving consent. They want to be known; they want to be served better; we can serve them well. And you know, when I say permission, I don't mean adding to the Terms of Service and the mile-long privacy policy. It doesn't have to be harder, either. It could be something as simple as putting the algorithm in the hands of users, like letting Twitter users decide: they announce that you should really follow this person, and as long as we're doing stuff like Follow Friday, how hard is it for something like Twitter to simply extract that and turn it into a list of recommendations? We're already making these recommendations; the algorithm is just ignoring them.
It's things like simply offering a checkbox. A pregnant person is not, in fact, the only person who has a stake in the pregnancy, right? Here's one person shopping, but there are many others: there may be partners, grandparents, neighbors, friends who would happily buy stuff for that pregnant person and for that baby. I think offering a checkbox here, something like the sketch below, is really a whole lot better than trying to invade someone's privacy.
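As a sketch of what that checkbox could look like in code, here is a hypothetical gate that only runs baby-related targeting for people who explicitly opted in; the field names are invented, and the default is no permission, regardless of what the model predicts.

def show_baby_related_offers?(customer)
  customer[:opted_in_to_baby_offers] == true   # default answer is "no"
end

customers = [
  { name: "A", predicted_pregnant: true,  opted_in_to_baby_offers: false },
  { name: "B", predicted_pregnant: false, opted_in_to_baby_offers: true  }
]

customers.each do |c|
  puts "#{c[:name]}: #{show_baby_related_offers?(c) ? 'send offers' : 'do not target'}"
end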
We need to audit outcomes, because, as I mentioned: black box. What I mean by that is a technique that's most commonly used in things like auditing housing discrimination and job discrimination. The idea is really simple: you put in applications that are identical except varied on just one factor, for instance race or age or income, and look at the outcomes. If you get a different result, then you know there was discrimination based on that factor. You don't have to look at the algorithm; the outcome itself tells you what you need to know, and this is something that we're going to need to use a lot to check our work.
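Here is a minimal Ruby sketch of that paired-testing idea. The decide lambda is a stand-in for whatever opaque model is actually making the call; the audit only compares outcomes for applications that are identical except for one factor.

# Submit applications identical except for one factor and compare outcomes.
def audit(decide, base_application, factor, values)
  values.map { |value| [value, decide.call(base_application.merge(factor => value))] }.to_h
end

# Stand-in decision function; in practice this is the black box being audited.
decide = ->(app) { app[:income] > 30_000 && app[:age] < 60 ? :approved : :denied }

base     = { income: 45_000, age: 35, zip: "60601" }
outcomes = audit(decide, base, :age, [35, 65])
p outcomes                                             # => {35=>:approved, 65=>:denied}
puts "disparate outcome detected" if outcomes.values.uniq.size > 1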
Part of that means we need to commit to data transparency and algorithmic transparency, both of them. And I know you're thinking that this is a really hard conversation to have internally, and it feels like an unrealistic one too; many companies keep thinking that proprietary is the only way to win. You know, I think back: it was really not that long ago that we were fighting for open source, and we really pushed back on companies that insisted that proprietary was the only way to go, and we were right. As professionals we know that transparency is crucial for drawing insights that are genuine and useful, so please start those conversations. Argue for increasing transparency, because it's for the sake of a better product:
cleaner features, fewer bugs, stronger tests, users' trust, as we build stuff that matters. Someone put it really harshly, but she's right: if your product has to do with something that people are deeply affected by, either care about it, or quit and go live in a cave and stop hurting people. And it's so easy to end up unthinkingly building an app full of data-mining fails like these. Building differently requires of us awareness, critical thinking, and most of all deciding, as a whole team, to take a stance. Say: listen, here's the deal. We do not build things here without understanding the consequences to users. This is the way we work; this is our process. We're hired for more than just code. We're not code monkeys; we're hired as professionals to solve problems, to apply our expertise and judgment about how to solve problems.
Code is not what we do; code is how we enable it. Our role is to be opinionated about how to make code serve a problem space well. So when we're asked to write code that presumes to intuit people's internal lives and act on those assumptions, as professionals we're gonna have to be people's proxies, be their advocates. Say no, on their behalf, to using their data in ways they have not enthusiastically consented to. Say no to uncritically reproducing systems that were biased to begin with. Say no to writing code that imposes unauthorized consequences on their lives. Refuse to play along.