Part II: Using Logistic Regression to Classify and Predict Supreme Court Cases

NOTE: This is part 2 of a 2-part series exploring the Supreme Court. The purposes of this mini-series are to identify trends in the Supreme Court using case data, and to consider whether cases can be predicted using variables that are understandable by the common person (such as myself). You can find part 1 here. Part 2 will use logistic regression models, applying LASSO, elastic net, and ridge regression techniques with 10-fold cross-validation on each. All errors are my own. To get the R code used for this analysis, reach out to me at ahegedus@hegedusanalytics.com. The data can be found in the database here, though I can also provide it on request.

The Theoretical Model

For this analysis, I will employ a logistic regression framework aiming to classify Supreme Court decisions as liberal or conservative. A basic theoretical model is as follows,

logit(P(decisionDirection = liberal)) = β₀ + β₁·certReason + β₂·issueArea + β₃·lcDispositionDirection + β₄·consjustices + β₅·libjustices

where decisionDirection indicates whether a case skews conservative or liberal; certReason indicates the stated reason the Court granted review of the case, if one was given; issueArea represents the area of law the case pertains to; lcDispositionDirection is the direction the lower court ruled, if applicable (conservative, liberal, or unclear); consjustices indicates the number of conservative justices on the Court at the time of arguments; and libjustices indicates the number of liberal justices on the Court at the time of arguments (1).

These variables were selected on the basis that they are mostly known prior to going into a Supreme Court case. By extension, this model would be able to predict the general political direction a Supreme Court decision would take (as defined by the Supreme Court Database here), allowing for a better, less biased understanding of how the Court rules, as well as assigning probabilities to a decision going one direction or the other.

Dataset

I utilized data from the Supreme Court Database for this research. The initial dataset was narrowed down to cases where John Roberts was Chief Justice, leaving 1348 cases (2). Cases where there was no clear direction (either conservative or liberal), or where direction was not applicable, were removed, leaving 1310 cases. NAs present in the data were handled differently for each column: certReason received a new factor level to indicate it was not applicable, while NAs in issueArea and lcDispositionDirection were reclassified to an "unclear" level.
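As a rough sketch of this preparation step in R (the file name and exact recodings are illustrative assumptions, not my exact script; the filter assumes the SCDB's documented coding of decisionDirection, with 1 = conservative and 2 = liberal):

```r
library(dplyr)

# Hedged sketch of the data preparation; "SCDB_case_centered.csv" is a
# placeholder for the case-centered SCDB export
scdb <- read.csv("SCDB_case_centered.csv") %>%
  filter(chief == "Roberts") %>%               # Roberts Court only (1348 cases)
  filter(decisionDirection %in% c(1, 2)) %>%   # drop unclear/NA directions (1310 remain)
  mutate(
    decisionDirection = factor(decisionDirection, levels = c(1, 2),
                               labels = c("conservative", "liberal")),
    # NAs get their own interpretable factor levels, per the handling above
    certReason             = factor(ifelse(is.na(certReason), "not applicable", certReason)),
    issueArea              = factor(ifelse(is.na(issueArea), "unclear", issueArea)),
    lcDispositionDirection = factor(ifelse(is.na(lcDispositionDirection), "unclear",
                                           lcDispositionDirection))
  )
```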

Models and Discussion of Results

For this exercise, I employ LASSO (alpha = 1), ridge (alpha = 0), and elastic net penalties (0 < alpha < 1, in increments of 0.1). These techniques shrink less important coefficients toward zero; ridge cannot shrink coefficients to exactly zero, while increasingly higher values of alpha allow more coefficients to reach exactly zero. After 10-fold cross-validation, these models are evaluated primarily on their mean test accuracy. A separate optimal lambda value was found for each alpha. A standard decision boundary of 0.5 is utilized (3). Coefficients for each model, along with mean training and test accuracy and the lambda used, can be found in the gallery below.
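A minimal sketch of this tuning loop, using R's glmnet package (the scdb data frame follows the preparation sketch above, with the consjustices and libjustices counts added as described in Appendix A; my full script also records training accuracy and fold-level rates):

```r
library(glmnet)

# Expand factors into dummy variables; drop the intercept column
x <- model.matrix(decisionDirection ~ certReason + issueArea +
                    lcDispositionDirection + consjustices + libjustices,
                  data = scdb)[, -1]
y <- scdb$decisionDirection

# For each alpha, 10-fold CV picks an optimal lambda by misclassification error
alphas <- seq(0, 1, by = 0.1)
cv_results <- do.call(rbind, lapply(alphas, function(a) {
  cv <- cv.glmnet(x, y, family = "binomial", alpha = a,
                  nfolds = 10, type.measure = "class")
  data.frame(alpha         = a,
             lambda        = cv$lambda.min,
             mean_accuracy = 1 - min(cv$cvm))  # cvm = mean CV misclassification error
}))
cv_results
```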

See below for the mean test accuracy at each value of alpha. Overall, an alpha of 0.7 performed best, with a test accuracy of around 66.03%. The test accuracies ranged from around 64.8% up to that maximum. I have intentionally zoomed in here to show the differences, but the overall mean accuracies are relatively similar.

Of the 1310 cases considered by the model, 699 went in a conservative direction and 611 in a liberal direction. Accordingly, always guessing that a case went in a conservative direction would be right about 53.4% of the time, while always guessing liberal would yield a 46.6% success rate. Therefore, a mean accuracy rate of about 66% is a substantial improvement over either baseline, and shows clearly that Supreme Court cases contain patterns that make them increasingly predictable.

The mean false positive rate (misclassified liberal cases) and mean false negative rate (misclassified conservative cases) were remarkably similar to each other across all values of alpha. However, the mean false positive rate was minimized at alpha = 0.6, while the mean false negative rate was minimized at alpha = 0.7. Again, I have zoomed in here, so note that the scale of the y-axis is very compressed.
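For readers who want to replicate these rates, here is a hedged sketch, assuming a hypothetical vector preds of out-of-fold predicted probabilities of a conservative ruling and the true labels in y:

```r
# Apply the 0.5 decision boundary, treating "conservative" as the positive class
pred_class <- ifelse(preds > 0.5, "conservative", "liberal")
conf <- table(predicted = pred_class, actual = y)

# Per the definitions above: FP = liberal case called conservative, and vice versa
fp_rate <- conf["conservative", "liberal"] / sum(conf[, "liberal"])
fn_rate <- conf["liberal", "conservative"] / sum(conf[, "conservative"])
```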

It is also worth considering which regressors fell away as alpha increased. With the ridge logistic regression, all regressors remained other than the issue factor regarding interstate commerce, though this is because of a feature in the data rather than the regression method. Of particular interest is the behavior of the conservative and liberal justice count predictors. By alpha = 0.2, the liberal justice predictor had been shrunk to zero, while the conservative justice predictor showed a log-odds decrease of about 0.04 by alpha = 1, both controlling for the other factors in the model. This is particularly intriguing in the context of judicial philosophy: it provides some additional evidence that the Supreme Court under Chief Justice Roberts generally follows a philosophy of judicial restraint, the idea that the Court should simply interpret the law rather than create new law, which does tend to skew conservative. Also, a liberal decision from the lower court was associated with an increased chance of the Supreme Court returning a conservative decision, controlling for the other factors in the model. This suggests the Court intervenes in lower-court decisions that disagree with its conservative majority, which may reflect an effort to stop judicial activism.
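A short sketch of how the surviving coefficients can be inspected at any given alpha, reusing the hypothetical x and y objects from the tuning sketch above:

```r
# Refit at a chosen alpha and list coefficients at the CV-selected lambda;
# rows printed as "." are exactly zero, i.e., predictors dropped by the penalty
fit <- cv.glmnet(x, y, family = "binomial", alpha = 0.2,
                 nfolds = 10, type.measure = "class")
coef(fit, s = "lambda.min")
```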

Limitations

This model does feature a handful of limitations. 10-fold cross-validation was employed to maximize test accuracy, but the random assignment of folds means that individual folds may have differing success rates. Still, given the margin over the baseline, it is safe to say that there is at least some improvement. Also, this model is meant to be understood by a common person, so an attorney with modeling knowledge would likely be able to perform better in this exercise given their additional knowledge of the law. Finally, it would be interesting to test this model on new court cases to see how it performs on data it was not trained on in any way.

Conclusions

In this blog post, we explored using varying levels of alpha in penalized logistic regression, including LASSO, ridge, and elastic net penalties, in order to predict the direction of a Supreme Court ruling. We found an improved correct classification rate at all levels of alpha compared to naive guessing. This work gives more evidence of judicial restraint as a philosophy of the Roberts Court and of an emphasis on stopping judicial activism. Further research into this topic may include increasing the breadth of cases considered beyond the Roberts Court and testing on current cases to assess the continued accuracy of the model.

Thank you for reading!

Citations linked on relevant lines. Thanks to:

  • Dr. Jeff Summerlin-Long for inspiring the idea

  • Dr. Christopher Handy and Dr. Andrii Babii for their teaching of these techniques

  • Dr. Rui Sun for teaching me to code initially (and for teaching modeling techniques)

  • The Mercer University Economics and Mathematics Departments

  • The University of North Carolina at Chapel Hill Economics and Public Policy departments

Notes

(1) These variables were created using the date of the argument, or date of decision if no oral arguments occurred. See Appendix A for a detailed discussion of assumptions behind these variables and argument for their validity.

(2) As the goal of this project is to classify modern Supreme Court cases, Roberts's tenure as Chief Justice served as a natural cut-off point. I may revisit this at another time using the full dataset if I can find an efficient way to replicate all variables, namely the consjustices and libjustices variables.

(3) This means that if P(decisionDirection = liberal) > 0.5 according to the model, it guesses that the decision went in a liberal direction, and the reverse holds for conservative predictions.

Appendix A

The consjustices and libjustices variables were created using the date of the argument for the case, or the date of decision if no oral arguments occurred. Using the tenure of each Justice, I was able to backtrack and roughly determine how many Justices sat on the Court at the time the case was argued, under the assumption that those same Justices would vote on the case. While a few cases may be slightly off here, that alone would not have been enough to give me pause. However, there remains a problem with cases that did not have an oral argument: they have no argument date. Given that this represented 123 of the 1310 cases, or about 9% of the dataset, it had to be addressed in some way. This is further cemented by the fact that the vast majority of cases are not ruled on "quickly": cases with oral arguments in the dataset generally took in excess of 30 days to receive an opinion.

Accordingly, I could not ignore this issue. I considered two approaches to solve this problem:

  1. Remove all 123 cases that did not have an oral argument from the dataset entirely.

  2. Substitute the date a decision was given for the date of oral arguments. This approach would come with the understanding that the Justice counts may be off (by no more than 1) for some number of cases.

While it was difficult to say which approach was correct, given that removing these cases would leave only 1187 in the dataset, it seemed preferable to find a fix for the missing data.

Option 2 does not necessarily mean all 123 cases would be skewed; in fact, many would not be an issue at all. Because some Justices stepped down only once a successor was confirmed, we can be sure of the exact number of Justices on the Court even when its composition changed. Consider when Justice Sandra Day O'Connor stepped down on January 31st, 2006, and was immediately replaced by Justice Samuel Alito: there was no gap during which a different number of Justices sat on the Court. Also, many of the gaps observed in the data fell during the Court's break, meaning few to no decisions were released over that time, leaving little concern that these have incorrect Justice counts. Accordingly, I considered the gaps between Justices on the Court, whether due to retirement or death, and highlighted any of the 123 cases that may be troublesome. These cases are highlighted in red below, with green representing cases at lower risk of incorrect Justice counts.

Note my use of "may." It is not a given that any of these cases actually have incorrect counts, and it is possible that cases outside this highlighting have slightly incorrect Justice counts. But of those that would be "higher-risk" under my criteria, there are 11 cases that could have trouble with the Justice count. Accordingly, I elected to leave all 123 cases in, replacing their date of argument with their date of decision.
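As a sketch of how this substitution and the Justice counts fit together (the tenure table and lean labels below are illustrative placeholders rather than my actual classifications; dateArgument and dateDecision are assumed to be parsed Date columns):

```r
# Hypothetical tenure table: one row per Justice, with an illustrative lean label
tenure <- data.frame(
  justice = c("Roberts", "Alito", "Sotomayor"),
  start   = as.Date(c("2005-09-29", "2006-01-31", "2009-08-08")),
  end     = as.Date(c(NA, NA, NA)),   # NA = still serving
  lean    = c("conservative", "conservative", "liberal")
)

# Count the Justices of a given lean who were seated on a reference date
count_justices <- function(ref_date, lean_label) {
  sum(tenure$lean == lean_label &
        tenure$start <= ref_date &
        (is.na(tenure$end) | tenure$end >= ref_date))
}

# Option 2: substitute the decision date when there was no oral argument
scdb$ref_date <- scdb$dateArgument
no_arg <- is.na(scdb$ref_date)
scdb$ref_date[no_arg] <- scdb$dateDecision[no_arg]

scdb$consjustices <- sapply(scdb$ref_date, count_justices, lean_label = "conservative")
scdb$libjustices  <- sapply(scdb$ref_date, count_justices, lean_label = "liberal")
```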
