A couple of my friends were discussing the ideological leanings of the Supreme Court in light of Neil Gorsuch’s confirmation hearing, and they argued that despite the court having 4 career liberals and 4 career conservatives, the court actually leans left. Their reasoning is that both Roberts and Kennedy are so-called RINOs (Republicans in Name Only).
Kennedy was appointed during the Reagan administration while there was a Democratic Senate, so Reagan was arguably forced to pick a more moderate Republican as a compromise candidate. Chief Justice Roberts, on the other hand, cast the crucial swing vote in NFIB vs. Sebelius (the ACA ruling): the other 8 justices ruled along party lines, and Roberts voted with the liberal justices. Kennedy played the same role in US vs. Windsor (the DOMA ruling), although both men joined the conservative majority in Citizens United vs. FEC (Super PACs).
While researching my friends’ claims, I stumbled upon a pretty interesting dataset on a Wikipedia page detailing the leanings of SCOTUS justices across history. I thought it would be interesting to apply a more rigorous test to my friends’ assessment of the court’s positions with a little machine learning.
The table scores each justice on some of the most important issue areas the court rules on. Taken from the Wikipedia page:
- Criminal Procedure = A higher number means pro-defendant votes in cases involving the rights of persons accused of crime, except for the due process rights of prisoners.
- Civil Rights = A higher number means more votes permitting intervention on non-First Amendment freedom cases which pertain to classifications based on race (including Native Americans), age, indigence, voting, residence, military or handicapped status, sex, or alienage.
- First Amendment = A higher number reflects votes that advocate individual freedoms with regards to speech.
- Union = A higher number means pro-union votes in cases involving labor activity.
- Economic = A higher number means more votes against commercial business activity, plus litigation involving injured persons or things, employee actions concerning employers, zoning regulations, and governmental regulation of corruption other than that involving campaign spending.
- Federalism = A higher number means votes for a larger, more empowered government in conflicts between the federal and state governments, excluding those between state and federal courts, and those involving the priority of federal fiscal claims.
- Federal Taxes = A higher number means more votes widening the government’s ability to define and enforce tax concepts and policies in cases involving the Internal Revenue Code and related statutes.
However, as Thomas Sowell famously explained in Judge Robert Bork’s confirmation hearing, slapping numbers on sometimes subjective criteria like whether a decision is pro-union or pro-defendant does not make the underlying issue any more objective. So baked into these numbers is perhaps some implicit bias, but these definitions are fairly rigid and definitely make for a good starting point.
I treat each judge as a sample vector and each ruling issue as a feature. Since each cell in the matrix represents a percentage of votes, the features already share a common scale and the data requires little preprocessing. This is an unsupervised problem since the dataset doesn’t label party affiliation, so K-means with k=2 is a good candidate for deducing political party. I held out the currently sitting judges as a test set and trained on historical judges to see whether their leanings could be discerned without biasing the training with their own records. These are the resulting predictions:
|Justice|Predicted leaning|
|---|---|
|Fred M. Vinson|Conservative|
|Tom C. Clark|Conservative|
|John Marshall Harlan II|Conservative|
|William J. Brennan, Jr.|Liberal|
|Charles Evans Whittaker|Conservative|
|Warren E. Burger|Conservative|
|Lewis F. Powell, Jr.|Conservative|
|John Paul Stevens|Liberal|
|Sandra Day O’Connor|Conservative|
|Ruth Bader Ginsburg|Liberal|
Even with a fairly small amount of data, the model predicts all 8 of the current justices’ affiliations correctly, and the predictions for the judges it was trained on also match my intuitions.
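The train-on-historical, predict-on-sitting setup described above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the code behind the results: the names and scores are made-up placeholders, the deterministic initialization is my own choice, and mapping the higher-scoring cluster to "Liberal" is an assumption based on the feature definitions (higher numbers generally mean pro-defendant, pro-union and so on). A library implementation such as scikit-learn's `KMeans` would do the same job.

```python
import math

def dist(a, b):
    """Euclidean distance between two score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def kmeans2(points, iters=100):
    """Lloyd's algorithm with k=2."""
    # Deterministic init (an assumption, for reproducibility): start from the
    # points with the lowest and highest average score.
    ordered = sorted(points, key=lambda p: sum(p) / len(p))
    centroids = [ordered[0][:], ordered[-1][:]]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:  # assignments stopped changing
            break
        centroids = updated
    return centroids

# Hypothetical placeholder data: percentage scores on three issue areas.
train = {
    "Historical A": [75.0, 80.0, 70.0],
    "Historical B": [68.0, 72.0, 81.0],
    "Historical C": [82.0, 77.0, 74.0],
    "Historical D": [30.0, 35.0, 28.0],
    "Historical E": [25.0, 41.0, 33.0],
    "Historical F": [38.0, 29.0, 36.0],
}
test = {
    "Sitting X": [71.0, 69.0, 77.0],
    "Sitting Y": [33.0, 37.0, 30.0],
}

centroids = kmeans2(list(train.values()))

# Assumption: the cluster whose centroid has the higher scores is the liberal one.
liberal = max(range(2), key=lambda i: sum(centroids[i]))

def predict(scores):
    return "Liberal" if nearest(scores, centroids) == liberal else "Conservative"

predictions = {name: predict(scores) for name, scores in test.items()}
for name, label in predictions.items():
    print(name, label)
```

Because the held-out judges never enter `kmeans2`, their own votes cannot pull the centroids toward them, which is the point of the split.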
Here is a clearer visualization of the two clusters in the dataset:
I reduced the dimensionality of the dataset to two components using PCA for plotting purposes; together these two components explain 90% of the variance in the dataset. I also evaluated the clustering over a meshgrid to illustrate where the cluster boundary lies in this space.
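To make the "explained variance" figure concrete, here is a rough sketch of how such a ratio can be computed: each principal component's share is its eigenvalue of the covariance matrix divided by the total variance. The sketch uses power iteration with deflation in plain Python and made-up placeholder scores, so it is not the code or data behind the plot; in practice scikit-learn's `PCA` exposes the same quantity as `explained_variance_ratio_`.

```python
import math

def covariance(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenpair(m, iters=500):
    """Largest eigenvalue and its eigenvector, via power iteration."""
    d = len(m)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(a * b for a, b in zip(v, matvec(m, v)))  # Rayleigh quotient
    return value, v

def explained_variance_ratios(rows, k=2):
    """Share of total variance captured by each of the top k components."""
    cov = covariance(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))  # trace = total variance
    ratios = []
    for _ in range(k):
        value, v = top_eigenpair(cov)
        ratios.append(value / total)
        # Deflate: remove the component just found before looking for the next.
        cov = [[cov[i][j] - value * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return ratios

# Hypothetical placeholder scores (rows = judges, columns = issue areas).
scores = [
    [75.0, 80.0, 70.0],
    [68.0, 72.0, 81.0],
    [82.0, 77.0, 74.0],
    [30.0, 35.0, 28.0],
    [25.0, 41.0, 33.0],
    [38.0, 29.0, 36.0],
]

ratios = explained_variance_ratios(scores, k=2)
print("explained variance ratios:", ratios)
print("captured by two components:", sum(ratios))
```

A total close to 1 means a 2D plot loses little of the structure in the original feature space.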
The observation that immediately strikes me is that both Roberts and Kennedy are fairly distant from the hyperplane that separates these clusters. In fact, many former conservative judges sit closer to the frontier of the cluster than Roberts and Kennedy do, and interestingly, even the current liberal judges are closer to “moderate” than those two are. Looking purely at the judges’ track records in this dataset, it seems Roberts and Kennedy are firmly in red territory despite occasionally breaking with their peers.
I suspect that while Roberts has ruled with the liberals on a handful of high profile cases, he votes in line with his party more often on the cases that don’t get as much attention. Since all cases are given equal weight in the dataset, he is not highly distinguishable from his conservative colleagues.
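The "distance from the hyperplane" comparison can be made precise. With two K-means clusters, the boundary is the set of points equidistant from both centroids, which is the hyperplane through their midpoint and perpendicular to the line joining them. A minimal sketch, where the centroid coordinates and the sign convention (positive toward `c_pos`) are illustrative assumptions rather than values from the plot:

```python
import math

def signed_distance(point, c_neg, c_pos):
    """Signed distance from `point` to the 2-means boundary.

    Positive values fall on the `c_pos` side, negative on the `c_neg` side,
    and zero lies exactly on the boundary.
    """
    direction = [p - n for p, n in zip(c_pos, c_neg)]
    length = math.sqrt(sum(d * d for d in direction))
    midpoint = [(p + n) / 2 for p, n in zip(c_pos, c_neg)]
    return sum((x - m) * d for x, m, d in zip(point, midpoint, direction)) / length

# Hypothetical centroids in the 2D PCA space.
centroid_a = [30.0, 30.0]
centroid_b = [70.0, 70.0]

for judge in ([50.0, 50.0], [70.0, 70.0], [35.0, 20.0]):
    print(judge, round(signed_distance(judge, centroid_a, centroid_b), 2))
```

Sorting judges by this value gives a simple one-dimensional scale on which "firmly in red territory" corresponds to a large magnitude on the conservative side.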
As a textualist, Gorsuch will probably be fairly close to Scalia and Thomas on the plot, bringing the court back to the right, where it was before Scalia’s death. But I do see the point my friends make, because the high profile cases do matter, and even with Scalia on the bench, judicial activism was often the prevailing legal theory in the rulings that made the news. The dynamics of the bench are certainly up in the air these next 4 years, and it’s entirely possible that Trump may get to choose two more justices, with Ginsburg and Kennedy likely reaching the end of their careers.
Ideas for follow up analysis and future projects regarding SCOTUS:
- Biplot for feature contribution to PCA
- Either manually label the dataset or label it using the clustering output and construct an SVM classifier. Distance from the separating hyperplane can be a proxy score for party affiliation.
- Measure the bench’s average party affiliation over time
- Bench ruling co-occurrence matrix
- Semantic analysis on opinions
- Topic discovery via LDA
- Use this output to label opinions, amicus briefs, etc. and classify documents