#differential_privacy
The Perils of Explainability
via Michelle.
This is a little confusing, so let’s go over this a little slowly.
The rise in the usage of black-box models and algorithms across all aspects of human life has raised concerns and red flags about their credibility and fairness, especially at the junction between the law and society. Concurrently, there has been a push to ensure that these algorithms are both fair and interpretable. These goals go hand in hand: it is easier to determine whether an interpretable model is fair. Of particular note (and relevance to this discussion) is the push from legal academics for some standard of explainability when using machine learning in the courtroom.
It’s important to note that this is nothing new. For the longest time, we have had statisticians as expert witnesses bring their wares into the courtroom, and I imagine on some occasions the methods they’ve used have been at the edge of interpretability.
So I think it's helpful to take a step back and ask: why are the warning bells only being rung now?
- I think part of it stems from the fact that ML/AI has successfully seeped into the mainstream (probably due to FAANG being such an integral part of our lives now). This isn't particularly interesting.
- More substantive is the fact that we, the people, have been sold this as some form of Artificial Intelligence, which comes parcelled with its own baggage of ideas. This makes the prospect of "having one's fate be decided by an AI" seem decidedly more nefarious than "used a regression model in the analysis."
- Perhaps the most reasonable (though not necessarily the most likely) reason is that the scope of problems tackled by our methods has grown, to the point where we’re now at the high table, so to speak, and making important decisions (and oftentimes being the arbiter, or a crucial component).
It seems to me that part of the issue here is the notion of autonomy, and where humans are in the chain of decision-making. I would expect that, provided there’s a human in the loop, then people will not be so worried.
At the outset, this feels like a very reasonable ask: in fact, it feels like a moral imperative for any black box used in deciding the fate of human lives to at least be explainable. Why do I say that? It somehow feels wrong for such decision-making to rest on the virtue of the machines. Perhaps this is two different biases coming into play.
Outline/Plan
- xAI, what it is, why people are pushing it
- what are the potential pitfalls?
- solutions?
- we should at least propose some solutions
- make it robust to such effects; perhaps that's the overarching theme in all this: we have all these potential feedback loops, so how do we go about making sure that our systems are robust to them?
- literature review (difficult)
Quantification (or is it Pontification)
In a similar spirit to #differential_privacy, let’s see if we can systemize or quantify this problem. It helps to think in terms of extremes: suppose everyone has access to the whole model, so they can reproduce all the results, and play around with tweaking the inputs. Two problems immediately come to mind:
- perhaps there are blindspots (a little similar to the spirit of #adversarial_training) in the model, allowing adversaries to create profiles whose model output would be deemed surprising
- while this is an important consideration, we leave this to those who work in that area of research to help robustify models.
- even with a completely robust model, without blindspots, one still has the potential problem of actors curating their covariates in such a way as to game the model.
- for instance, they might know how to answer the questionnaires as part of the COMPAS dataset to minimize their supposed probability of reoffending
- incentivized to move cities, have more (or fewer) children
- anti-causal, perverse incentives/selection
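To make the gaming concern concrete, here's a minimal sketch. The risk model, the feature names, and the weights are all hypothetical; the point is just that an actor who knows the weights can lower their score by adjusting only the mutable covariates:

```python
# Hypothetical linear risk score: higher means "higher risk". Feature
# names and weights are made up for illustration.
weights = {"age": -0.02, "prior_counts": 0.5, "books_per_week": -0.3}

def risk(x):
    return sum(weights[k] * v for k, v in x.items())

# An actor can only change some covariates; treat "age" as immutable.
MUTABLE = {"prior_counts", "books_per_week"}

def game(x, step=1.0):
    # Nudge each mutable covariate one step in the direction that lowers risk.
    gamed = dict(x)
    for k in MUTABLE:
        gamed[k] -= step if weights[k] > 0 else -step
    return gamed

x = {"age": 30, "prior_counts": 2, "books_per_week": 1}
print(risk(x), risk(game(x)))  # the gamed profile scores strictly lower
```

The immutable covariate is untouched, which is exactly why the mutability distinction matters: only the mutable ones give the actor any leverage.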
Let’s try to make this notion of gaming more formal:
- we can categorize/quantify the degree to which a covariate is game-able, or mutable (mutability, taking a leaf from CS). Obviously it's easier for everyone if we could do this automatically, but I feel it is important to distinguish things like race/gender from, say, how many books you read a night.
- having weighed each covariate by their mutability, then it depends on the explainable method
- we could adopt a sort of differential privacy framework, where we obfuscate any queries to the model with noise (i.e. if we don't want to reveal the actual model, but we don't mind letting people query it). In this case the mutability weights would somehow factor in, analogously to data privacy leakage, except that what leaks here is a game-able model.
- alternatively, when we create the surrogate models, we might want to make the approximation more vague for covariates with higher mutability
- the problem here is that unless it's specifically noted that this model is an approximation, people will still try to game it; they'll just not succeed (is that a problem? I can't tell).
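As a toy version of the query-obfuscation idea, here's a sketch where query answers get Laplace noise whose scale grows with the total mutability of the covariates. The model, the mutability weights, and the noise scaling rule are all made up; this is not calibrated differential privacy, just the shape of the mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-covariate mutability weights in [0, 1] (1 = trivially game-able).
mutability = {"age": 0.0, "prior_counts": 0.3, "books_per_week": 0.9}

def true_score(x):
    # Stand-in for the real model we don't want to reveal.
    return -0.02 * x["age"] + 0.5 * x["prior_counts"] - 0.3 * x["books_per_week"]

def noisy_query(x, base_scale=0.05):
    # Answer the query, but add Laplace noise whose scale grows with the
    # total mutability in play: cruder answers where gaming is easiest.
    scale = base_scale * (1.0 + sum(mutability.values()))
    return true_score(x) + rng.laplace(0.0, scale)

x = {"age": 30, "prior_counts": 2, "books_per_week": 1}
print(true_score(x), noisy_query(x))
```

Individual answers are unbiased but noisy, so a single query reveals little about the exact decision boundary; the open question from the note is how to set the scale per covariate rather than globally.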
Backlinks
- [[project-fairness]]
- The last factor, whether or not someone put a phone number down, is something that feels incredibly easy to game. And even in the last paragraph, a staffer encourages a felon to put down a phone number. This definitely rings of [[the-perils-of-explainability]]: if you give people the factors, then basically it’s no longer going to be meaningful.
Does Learning Require Memorization?
src: Feldman, Vitaly. 2019. "Does Learning Require Memorization? A Short Tale about a Long Tail." arXiv. http://arxiv.org/abs/1906.05271v3
A different take on the interpolation/memorization conundrum.
The key empirical fact that this paper rests on is this notion of data's long tail. This was all the rage back in the day, with books written about it, though that was more about #economics and how the internet was making it possible for things in the long tail to survive. Formally, we can break a class down into subpopulations (say, species, though they don't have to be some explicit human-defined category), corresponding to a mixture distribution with decaying coefficients. The point is that these coefficients follow a #power_law distribution.
Now consider a sample from this distribution (which will be our training data). You essentially have three regimes:
- Popular: there’s a lot of data here, you don’t need to memorize, as you can take advantage of the law of large numbers.
- Extreme Outliers: this is where the actual population itself is already incredibly rare, so it doesn’t really matter if you get these right, since these are so uncommon.
- Middle ground: here you might still only get one sample from a subpopulation, but it's just common enough (and there are enough such subpopulations) that you want to get it right. And since it's uncommon, you basically only have the one copy anyway, so your best choice is to memorize.
Key: a priori you don't know if a sample is an extreme outlier or from the middle ground, so you might as well just memorize. Really, what you don't mind is selective memorization, though it's probably too much work to maintain two regimes, so you just memorize everything.
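The three regimes are easy to see in simulation. Here's a sketch (the constants K, alpha, and n are arbitrary) that samples a training set from a power-law mixture over subpopulations and counts how many subpopulations land in each regime:

```python
import numpy as np

rng = np.random.default_rng(1)

# Mixture over K subpopulations with power-law (Zipf-like) coefficients.
K, alpha = 10_000, 1.5
freqs = 1.0 / np.arange(1, K + 1) ** alpha
freqs /= freqs.sum()

# Draw a training set and count how often each subpopulation shows up.
n = 50_000
counts = np.bincount(rng.choice(K, size=n, p=freqs), minlength=K)

popular = int((counts >= 10).sum())    # LLN regime: no need to memorize
singletons = int((counts == 1).sum())  # one copy each: memorizing is the only option
unseen = int((counts == 0).sum())      # the tail you never observed at all
print(popular, singletons, unseen)
```

The singleton band is exactly the ambiguous middle ground: from the learner's side, a singleton from a moderately common subpopulation looks identical to a singleton from a vanishingly rare one.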
My general feeling is that there is probably something here, but it feels a little too on the nose. It basically reduces the power of deep learning to learning subclasses well, when I think it's more about the amalgam of the whole thing.
Relation to DP
Here's an interesting relation to #differential_privacy. One of the motivations for this paper is that DP models (DP implies you can't memorize) fail to achieve SOTA results on the same problems as these memorizing solutions. If you look at how these DP models fail, you see that they fail on exactly the class of examples described here, i.e. they cannot memorize the tail of the mixture distribution. This is definitely something to keep in mind for [[project-interpolation]].