#law

The Perils of Explainability

via Michelle.

This is a little confusing, so let’s go over it slowly.

The rise in the usage of black-box models and algorithms across all aspects of human life has raised concerns and red flags about their credibility and fairness, especially at the junction between the law and society. Concurrently, there has been a push to ensure that these algorithms are both fair and interpretable.1 These goals go hand in hand: it is much easier to determine whether a model is fair when it is interpretable. Of particular note (and relevance to this discussion) is the push from legal academics for some standard of explainability when machine learning is used in the courtroom.

It’s important to note that this is nothing new. For the longest time, we have had statisticians as expert witnesses bring their wares into the courtroom, and I imagine on some occasions the methods they’ve used have been at the edge of interpretability.

So I think it’s helpful to take a step back and ask: why are the warning bells only being rung now?

  1. I think part of it stems from the fact that ML/AI has successfully seeped into the mainstream (probably due to FAANG being such an integral part of our lives now). This isn’t particularly interesting.
  2. More substantive is the fact that we, the people, have been sold this as some form of Artificial Intelligence, which comes parcelled with its baggage of ideas. This makes the prospect of “having one’s fate decided by an AI” seem decidedly more nefarious than “a regression model was used in the analysis.”
  3. Perhaps the most reasonable (though not necessarily the most likely) reason is that the scope of problems tackled by our methods has grown, to the point where we’re now at the high table, so to speak, and making important decisions (and oftentimes being the arbiter, or a crucial component).

It seems to me that part of the issue here is the notion of autonomy, and where humans sit in the chain of decision-making. I would expect that, provided there’s a human in the loop, people will not be so worried.

At the outset, this feels like a very reasonable ask: in fact, it feels like a moral imperative for any black box used in deciding the fate of human lives to at least be explainable. Why do I say that? It somehow feels wrong for such decision-making to rest on the virtue of the machines.2 Perhaps this is two different biases coming into play.

Outline/Plan

  • xAI, what it is, why people are pushing it
  • what are the potential pitfalls?
  • solutions?
    • we should at least propose some solutions
    • make it robust to such effects: perhaps that’s the overarching theme in all this. We have all these potential feedback loops; how do we go about making sure that our systems are robust to them?
  • literature review (difficult)

Quantification (or is it Pontification)

In a similar spirit to #differential_privacy, let’s see if we can systematize or quantify this problem. It helps to think in terms of extremes: suppose everyone has access to the whole model, so they can reproduce all the results and play around with tweaking the inputs. Two problems immediately come to mind:

  • perhaps there are blindspots (a little similar to the spirit of #adversarial_training) in the model, allowing adversaries to create profiles whose model output would be deemed surprising
    • while this is an important consideration, we leave this to those who work in that area of research to help robustify models.
  • even with a completely robust model, without blindspots, one still has the potential problem of actors curating their covariates in such a way as to game the model.
    • for instance, they might know how to answer the questionnaires as part of the COMPAS dataset to minimize their supposed probability of reoffending
    • incentivized to move cities, have more (or less) children
    • anti-causal, perverse incentives/selection
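To make the gaming concern concrete, here’s a minimal sketch (every weight and covariate name here is hypothetical, not drawn from COMPAS or any real instrument): an adversary with full model access greedily adjusts only the covariates they can actually change, leaving the immutable ones untouched.

```python
import math

# Hypothetical linear risk model; weights and covariates are made up.
WEIGHTS = {"prior_arrests": 0.8, "age": -0.03, "employed": -0.5, "moved_recently": 0.4}

def risk(x):
    """Logistic score standing in for a 'probability of reoffending'."""
    z = sum(WEIGHTS[k] * v for k, v in x.items())
    return 1 / (1 + math.exp(-z))

# Covariates the subject can plausibly change before assessment.
MUTABLE = {"employed": [0, 1], "moved_recently": [0, 1]}

def game(x):
    """Greedily set each mutable covariate to whatever minimises the score."""
    best = dict(x)
    for key, options in MUTABLE.items():
        best[key] = min(options, key=lambda v: risk({**best, key: v}))
    return best

subject = {"prior_arrests": 3, "age": 25, "employed": 0, "moved_recently": 1}
gamed = game(subject)
```

With full access the gamed score drops even though the immutable covariates are unchanged, and the same greedy search works against any model you release verbatim.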

Let’s try to make this notion of gaming more formal:

  • we can categorize/quantify the degree to which a covariate is game-able, or mutable (mutability, taking a leaf from CS). obviously it’s easier for everyone if we could do this automatically, but I feel it is important to distinguish things like race/gender from, say, how many books you read a night.
  • having weighed each covariate by their mutability, then it depends on the explainable method
    • we could adopt a sort of differential privacy framework, where we obfuscate any queries to the model with noise (i.e. if we don’t want to reveal the actual model, but we don’t mind allowing people to query it). In that case, the mutability weights would factor into the leakage budget, except here the leakage is of how game-able the model is rather than of private data.
    • alternatively, when we create the surrogate models, we might want to make the approximation more vague for covariates with higher mutability
      • the problem here is that unless it’s specifically noted that this model is an approximation, people will still try to game it; they’ll just not succeed (is that a problem? I can’t tell).
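A minimal sketch of the noisy-query idea (the mutability weights, and the choice to couple them to the noise scale, are my own assumptions rather than a worked-out mechanism): the more mutable the covariates a query touches, the more Laplace noise we add, in the spirit of sensitivity-scaled noise in differential privacy.

```python
import math
import random

random.seed(0)

# Hypothetical mutability weights in [0, 1]: higher = easier to game.
MUTABILITY = {"age": 0.1, "books_per_night": 0.9}

def model(x):
    """Stand-in black box we want people to query but not reverse-engineer."""
    return 0.02 * x["age"] - 0.3 * x["books_per_night"]

def laplace(scale):
    # Inverse-CDF sample from Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))

def noisy_query(x, epsilon=1.0):
    """Obfuscated query: noise scales with the most mutable covariate touched,
    analogous to sensitivity / epsilon in the Laplace mechanism."""
    scale = max(MUTABILITY[k] for k in x) / epsilon
    return model(x) + laplace(scale)
```

As with naive differential privacy, averaging many queries still recovers the true score, so a budget on repeated queries would be needed; but any single answer is too noisy to guide a fine-grained gaming strategy.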

Judicial Demand for xAI

src: paper, via Michelle.

For the [[project-fairness]], we have been thinking in terms of algorithms and how to design them to be better vanguards of such societal principles as equality. Something that I hadn’t thought about is the legal side of things; that is, what are the legal ramifications of introducing algorithms (and in the future, more powerful AI) into the legal process? A judge’s decision might rest on the results of an algorithm (e.g. criminal proceedings and granting bail), lawyers themselves might use algorithms to automate tasks (even the creation of bills), or the output of the algorithm in question might itself cause something prosecutable (automated drones, self-driving cars) – these are all different examples of how AI might be part of the process.

The COMPAS debacle is a concrete example, and we saw the defendant argue that “the court’s use of the risk assessment violated his due process rights, in part because he was not able to assess COMPAS’s accuracy”. Clearly this is getting into legal territory – namely, what are his due process rights? What is sufficient for him to assess the accuracy?

The paper takes the view that explainable AI (xAI) is the answer to all this, and that it should be the judges (via the process of common law) who ought to decide the kinds of xAI that should be required depending on the circumstances. I don’t have much to say about the latter aspect of the thesis, as IANAL. However, my feeling is that this is a much easier problem than the author makes it out to be: simply require that every algorithm be equipped with a full suite of xAI. It’s really not that difficult, and not much of an onus on the engineers. One could go so far as to require that every closed-source algorithm have an open-source alternative (or, as mentioned in the paper, I’m sure you can use existing legal structures involving confidential material).

This all feels like a red herring to me though, and sidesteps the crux of the problem, which goes back to what I’ve been working on: the actual equitability or fairness of these algorithms. The idea is that, if you have xAI and can see the inner workings, then you can catch it doing bad things (which I agree with); but if we’re going to be using algorithms, then we had better be using justified algorithms before we apply them in court, at which point the xAI part of things is moot. For instance, even if you were to show the defendant in the COMPAS example the full deconstructed analysis of the model, there is still no consensus, even among academics, about the fairness of the algorithm, so how is this going to help anyone?

Also, the first thing that comes to my mind whenever we talk about xAI is that humans often have a hard time actually explaining their own thought processes – and yet, that has rarely stopped anyone in court. So part of this, I feel, is more a faith-in-humanity-more-than-machines type of argument, which I’m fine with, as long as our prejudices are laid bare. Perhaps underlying that line of argument is the reasonableness or rationality of humans versus machines.1 This reminds me of how, for self-driving cars, the bar for safety is much higher before people feel comfortable. There’s this sort of weird disconnect between what we expect from AI and what we expect from our fellow humans.

Explainable AI

Explainability can come in various forms. The easiest, as they do not require opening the box, are wrapper-type methods that essentially try to describe the function being approximated in terms of something understandable, whether that be counterfactuals or English interpretations of the form of the function. Similarly, one can build surrogate models (e.g. decision trees, linear models) that approximate the function (though you obviously lose accuracy), providing a trade-off between fidelity and interpretability.2 This approach seems a little weird to me though (it feels a bit like Russian dolls): you’re basically trading (peeling) off complexity for simpler, more interpretable models, but at that point you’re comparing apples to oranges.
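As a sketch of the surrogate idea (the black-box function here is a made-up stand-in): probe the black box on sampled inputs, fit a plain linear model to its own predictions, and measure how faithfully the simple model reproduces the complex one.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    """Stand-in for an opaque learned model."""
    return np.tanh(2 * X[:, 0]) + 0.3 * X[:, 1] ** 2

# Probe the black box and fit a linear surrogate to its outputs.
X = rng.uniform(-1, 1, size=(500, 2))
y = black_box(X)
A = np.column_stack([X, np.ones(len(X))])        # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Fidelity of the surrogate: R^2 against the black box's own predictions.
resid = A @ coef - y
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
```

The R² here quantifies exactly the accuracy you peel off in exchange for interpretability: the surrogate’s coefficients are readable, but anything the linear form can’t express (the quadratic term, the saturation of the tanh) is silently lost.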

Here’s an [[idea]]: what if you could come up with something like the tangent plane, but an explainable plane, in that for every point estimate provided by the machine learning model, you can find a super simple, interpretable model that does a good, local job of explaining things for that particular defendant (it has to exactly predict what the ML model predicted, hence the parallel to the tangent plane). Actually, this is very similar to LIME (Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?” In KDD ’16: The 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44. New York, NY, USA: ACM).
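A toy version of the tangent-plane idea (black_box here is a hypothetical scoring function): finite differences give a linear model that agrees with the black box exactly at the defendant’s point and approximately in a neighbourhood around it, i.e. LIME in the limit of a vanishingly small neighbourhood.

```python
import math

def black_box(x):
    """Hypothetical opaque model mapping two covariates to a score."""
    return 1 / (1 + math.exp(-(0.8 * x[0] - 0.5 * x[1])))

def tangent_explanation(f, x, h=1e-5):
    """Local linear explanation: the intercept is f(x) itself (exact at the
    point), the coefficients are finite-difference slopes (the tangent plane)."""
    base = f(x)
    slopes = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += h
        slopes.append((f(bumped) - base) / h)
    def linear(xp):
        return base + sum(s * (xp[i] - x[i]) for i, s in enumerate(slopes))
    return linear, slopes

point = [1.0, 2.0]
explain, slopes = tangent_explanation(black_box, point)
```

Each slope reads directly as “increasing this covariate by one unit changes the score by roughly this much, for this defendant”, and by construction the explanation reproduces the model’s prediction at the point exactly.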

The other class of methods involves delving into the black box and providing method-specific interpretations of the actual learned model (e.g. for CNNs, you can look at pixel-level heat maps), or something more naive, like just providing the model as a reproducible instance.
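The occlusion trick behind those heat maps can be sketched in a few lines (the model and baseline value are made up): replace each input with a neutral baseline and record how much the score drops; for images the same procedure is run patch by patch to produce the pixel-level heat map.

```python
def black_box(x):
    """Hypothetical model: the first feature matters a lot, the third not at all."""
    return 2.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]

def occlusion_saliency(f, x, baseline=0.0):
    """Score change when each feature is 'occluded' (replaced by the baseline)."""
    base = f(x)
    return [base - f(x[:i] + [baseline] + x[i + 1:]) for i in range(len(x))]

saliency = occlusion_saliency(black_box, [1.0, 1.0, 1.0])
```

The resulting vector ranks features by how much the model actually leans on them, without needing any access to the model’s internals beyond the ability to query it.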