I haven’t been surprised by the extensive discussion of the recent paper by Michael Simkovic and Frank McIntyre. The paper deserves attention from many readers. I have been surprised, however, by the number of scholars who endorse the paper–and even scorn skeptics–while acknowledging that they don’t understand the methods underlying Simkovic and McIntyre’s results. An empirical paper is only as good as its method; it’s essential for scholars to engage with that method.
I’ll discuss one methodological issue here: the small sample sizes underlying some of Simkovic and McIntyre’s results. Those sample sizes undercut the strength of some claims that Simkovic and McIntyre make in the current draft of the paper.
What Is the Sample in Simkovic & McIntyre?
Simkovic and McIntyre draw their data from the Survey of Income and Program Participation (SIPP), a very large survey of U.S. households. The authors, however, don’t use all of the data in the survey; they focus on (a) college graduates whose highest degree is the BA, and (b) JD graduates. SIPP provides a large sample of the former group: Each of the four panels yielded information on 6,238 to 9,359 college graduates, for a total of 31,556 BAs in the sample. (I obtained these numbers, as well as the ones for JD graduates, from Frank McIntyre. He and Mike Simkovic have been very gracious in answering my questions.)
The sample of JD graduates, however, is much smaller. Those totals range from 282 to 409 for the four panels, yielding a total of 1,342 law school graduates. That’s still a substantial sample size, but Simkovic and McIntyre need to examine subsets of the sample to support their analyses. To chart changes in the financial premium generated by a law degree, for example, they need to examine reported incomes for each of the sixteen years in the sample. Those small groupings generate the uncertainty I discuss here.
Confidence Intervals
Statisticians quantify the uncertainty that comes with sampling, especially sampling from small groups, by generating confidence intervals. The confidence interval, sometimes referred to as a “margin of error,” does two things. First, it reminds us that numbers plucked from samples are just estimates; they are not precise reflections of the underlying population. If we collect income data from 1,342 law school graduates, as SIPP did, we can then calculate the means, medians, and other statistics about those incomes. The median income for the 1,342 JDs in the Simkovic & McIntyre study, for example, was $82,400 in 2012 dollars. That doesn’t mean that the median income for all JDs was exactly $82,400; the sample offers an estimate.
Second, the confidence interval gives us a range in which the true number (the one for the underlying population) is likely to fall. The confidence interval for JD income, for example, might be plus-or-minus $5,000. If that were the confidence interval for the median given above, then we could be relatively sure that the true median lay somewhere between $77,400 and $87,400. ($5,000 is a ballpark estimate of the confidence interval, used here for illustrative purposes; it is not the precise interval.)
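For readers who want to see the mechanics, here is a minimal sketch in Python of how a confidence interval around a sample median can be estimated by bootstrapping. The incomes below are made up for illustration; they are not the SIPP records, and the resulting interval is not Simkovic and McIntyre’s.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical incomes for illustration only -- not the SIPP data.
incomes = rng.lognormal(mean=11.3, sigma=0.6, size=1342)

# Bootstrap: resample with replacement, recompute the median each time.
boot_medians = np.array([
    np.median(rng.choice(incomes, size=incomes.size, replace=True))
    for _ in range(5000)
])

# A 95% confidence interval from the middle 95% of bootstrap medians.
low, high = np.percentile(boot_medians, [2.5, 97.5])
print(f"sample median: {np.median(incomes):,.0f}")
print(f"95% confidence interval for the median: ({low:,.0f}, {high:,.0f})")
```

Whatever the exact numbers, the lesson is the same: a sample median comes with a range, not a single precise value.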
Small samples generate large confidence intervals, while larger samples produce smaller ones. That makes intuitive sense: the larger our sample, the more precisely it will reflect patterns in the underlying population. We have to exercise particular caution when interpreting small samples, because they are more likely to offer a distorted view of the population we’re trying to understand. Confidence intervals make sure we exercise that caution.
Our brains, unfortunately, are not wired for confidence intervals. When someone reports the estimate from a sample, we tend to focus on that particular reported number–while ignoring the confidence interval. Considering the confidence interval, however, is essential. If a political poll reports that Dewey is leading Truman, 51% to 49%, with a 3% margin of error, then the race is too close to call. Based on this poll, actual support for Dewey could be as low as 48% (3 points lower than the reported value) or as high as 54% (3 points higher than the reported value). Dewey might win decisively, the result might be a squeaker, or Truman might win.
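The arithmetic behind a poll’s margin of error is easy to sketch. Here I assume a poll of 1,000 respondents, a number chosen purely for illustration; the conventional 95% margin of error for a reported proportion of 51% then works out to about three points.

```python
import math

# Illustrative only: a two-way poll with an assumed sample size.
n = 1000   # assumed number of respondents
p = 0.51   # reported support for Dewey

# Conventional 95% margin of error for a proportion: 1.96 * sqrt(p(1-p)/n)
moe = 1.96 * math.sqrt(p * (1 - p) / n)
print(f"margin of error: {moe:.1%}")                       # about 3.1%
print(f"plausible range: {p - moe:.1%} to {p + moe:.1%}")  # roughly 48% to 54%
```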
Is the Earnings Premium Cyclical?
Now let’s look at Figure 5 in the Simkovic and McIntyre paper. This figure shows the earnings premium for a JD compared to a BA over a range of 16 years. The shape of the solid line is somewhat cyclical, leading to the Simkovic/McIntyre suggestion that “[t]he law degree earnings premium is cyclical,” together with their observation that recent changes in income levels are due to “ordinary cyclicality.” (pp. 49, 32)
But what lies behind that somewhat cyclical solid line in Figure 5? The line ties together sixteen points, each of which represents the estimated premium for a single year. Each point draws upon the incomes of a few hundred graduates, a relatively small group. Those small sample sizes produce relatively large confidence intervals around each estimate. Simkovic & McIntyre show those confidence intervals with dotted lines above and below the solid line. The estimated premium for 1996, for example, is about .54, but the confidence interval stretches from about .42 to about .66. We can be quite confident that JD graduates, on average, enjoyed a financial premium over BAs in 1996, but we’re much less certain about the size of the premium. The coefficient for this premium could be as low as .42 or as high as .66.
So what? As long as the premiums were positive, how much do we care about their size? Remember that Simkovic and McIntyre suggest that the earnings premium is cyclical. They rely on that cyclicality, in turn, to suggest that any recent downturns in earnings are part of an ordinary cycle.
The results reported in Figure 5, however, cannot confirm cyclicality. The specific estimates look cyclical, but the confidence intervals urge caution. Figure 5 shows those intervals as lines that parallel the estimated values, but the confidence intervals belong to each point–not to the line as a whole. The real premium for each year most likely falls somewhere within the confidence interval for each year, but we can’t say where.
Simkovic and McIntyre could supplement their analysis by testing the relationship among these estimates; it’s possible that, statistically, they could reject the hypothesis that the earnings premium was stable. They might even be able to establish cyclicality with more certainty. We can’t reach those conclusions from Figure 5 and the currently reported analyses, however; the confidence intervals are too wide to support a confident interpretation. All of the internet discussion of the cyclicality of the earnings premium has been premature.
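To see why the overlapping intervals matter, here is a rough sketch of the kind of year-to-year comparison I have in mind. The 1996 numbers come from my reading of Figure 5; the second year is hypothetical, and the sketch treats the two estimates as independent, which a pooled regression would not strictly justify.

```python
import math

def se_from_ci(low, high, z=1.96):
    """Recover an approximate standard error from a 95% confidence interval."""
    return (high - low) / (2 * z)

# Illustrative values read roughly off a figure like Figure 5 -- not exact.
est_a, se_a = 0.54, se_from_ci(0.42, 0.66)   # e.g., 1996
est_b, se_b = 0.64, se_from_ci(0.52, 0.76)   # a hypothetical later year

# Simple z-test for the difference between two estimates,
# treating them as independent for the sake of illustration.
z = (est_b - est_a) / math.sqrt(se_a**2 + se_b**2)
print(f"z = {z:.2f}")   # about 1.15, well below 1.96: not a significant difference
```

Even a ten-point swing in the estimated premium, in other words, could easily be sampling noise when each yearly estimate carries an interval this wide.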
Recent Graduates
Similar problems affect Simkovic and McIntyre’s statements about recent graduates. In Figure 6, they depict the earnings premium for law school graduates aged 25-29 in four different time periods. The gray bars show the estimated premium for each time period, with the vertical lines indicating the confidence interval. Notice how wide those confidence intervals are: The interval for 1996-1999 stretches from about 0.04 to about 0.54. The other periods show similarly extended intervals.
Those large confidence intervals reflect very small sample sizes. The 1996 panel offered income information on just sixteen JD graduates aged 25-29; the 2001 panel included twenty-five of those graduates; the 2004 panel, seventeen; and the 2008 panel, twenty-six. With such small samples, we have very little confidence (in both the everyday and statistical senses) that the premium estimates are correct.
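A quick sketch shows how sharply sample size drives the width of a confidence interval. The standard deviation here is an assumption chosen purely for illustration, not a figure from the paper; the point is the square-root scaling.

```python
import math

# The half-width of a 95% confidence interval for a mean shrinks with
# the square root of the sample size: 1.96 * sd / sqrt(n).
sd = 0.5   # assumed standard deviation of the individual-level premium
for n in (16, 25, 100, 1342):
    half_width = 1.96 * sd / math.sqrt(n)
    print(f"n = {n:5d}  ->  95% CI of roughly +/- {half_width:.2f}")
```

With only sixteen or twenty-five graduates in a cell, intervals as wide as those in Figure 6 are exactly what we should expect.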
It seems likely that the premium was positive throughout this period–although the very small sample sizes and possible bimodality of incomes could undermine even that conclusion. We can’t, however, say much more than that. If we take confidence intervals into account, the premium might have declined steadily throughout this period, from about 0.54 in the earliest period to 0.33 in the most recent one. Or it might have risen, from a very modest 0.05 in the first period to a robust 0.80 more recently. Again, we just don’t know.
It would be useful for Simkovic and McIntyre to acknowledge the small number of recent law school graduates in their sample; that would help ground readers in the data. When writing a paper like this, especially for an interdisciplinary audience, it’s difficult to anticipate what kind of information the audience may need. I’m surprised that so many legal scholars enthusiastically endorsed these results without noting the large confidence intervals.
Onward
There has been much talk during the last two weeks about Kardashians, charlatans, and even the Mafia. I’m not sure any legal academic leads quite that exciting a life; I know I don’t. As a professor who has taught Law and Social Science, I think the critics of the Simkovic/McIntyre paper raised many good questions. Empirical analyses need testing, and it is especially important to examine the assumptions that lie behind a quantitative study.
The questions weren’t all good. Nor, I’m afraid, were all of the questions I’ve heard about other papers over the years. That’s the nature of academic debate and refining hypotheses: sometimes we have to ask questions just to figure out what we don’t know.
Endorsements of the paper, similarly, spanned a spectrum. Some were thoughtful; others seemed reflexive. I was disappointed at how few of the paper’s supporters engaged fully with the paper’s method, asking questions like the ones I have raised about sample size and confidence intervals.
I hope to write a bit more on the Simkovic and McIntyre paper; there are more questions to raise about their conclusions. I may also try to offer some summaries of other research that has been done on the career paths of law school graduates and lawyers. We don’t have nearly enough research in the field, but there are some other studies worth knowing.
From time to time, I like to read real books instead of electronic ones. During a recent ramble through my law school’s library, I stumbled across an intriguing set of volumes: NALP employment reports from the late nineteen seventies. These books are so old that they still have those funny cards in the back. It was the content, though, that really took my breath away. During the 1970s, NALP manipulated data about law school career outcomes in a way that makes more contemporary methods look tame. Before I get to that, let me give you the background.
NALP compiled its first employment report for the Class of 1974. The data collection was fairly rudimentary. The association asked all ABA-accredited schools to submit basic data about their graduates, including the total number of class members, the number employed, and the number known to be still seeking work. This generated some pretty patchy statistics. Only 83 schools (out of about 156) participated in the original survey. Those schools graduated 17,188 JDs, but they reported employment data for just 13,250. More than a fifth of the graduates (22.9%) from this self-selected group of schools failed to share their employment status with the schools.
NALP’s early publications made no attempt to analyze this selection bias; the reports I’ve examined (for the Classes of 1977 and 1978) don’t even mention the possibility that graduates who neglect to report their employment status might differ from those who provide that information. The reports address the representativeness of participating schools, but in a comical manner. The reports divide the schools by institutional type (e.g., public or private) and geographic region, then present a cross-tabulation showing the number and percentage of schools participating in each category. For the Class of 1977, participation rates varied from 62.5% to 100%, but the report gleefully declares: “You will note the consistently high percentage of each type of institution, as well as the large number of schools sampled. I believe we can safely say that our study is, in fact, representative!” (p. 7)
Anyone with an elementary grasp of statistics knows that’s nonsense. The question isn’t whether the percentages were “high,” it’s how they varied across categories. Ironically, at the very time that NALP published the quoted language, I was taking a first-year elective on “Law and Social Science” at my law school. It’s galling that law schools weren’t practicing the quantitative basics that they were already teaching.
NALP quickly secured more participating schools, which mooted this particular example of bad statistics. By 1978, NALP was obtaining responses from 150 of the 167 ABA-approved law schools. Higher levels of school participation, however, did not solve the problem of missing graduates. For the Classes of 1974 through 1978, NALP was missing data on 19.4% to 23.7% of the graduates from reporting schools. Blithely ignoring those graduates, NALP calculated the employment rate each year simply by dividing the number of graduates who held any type of job by the number whose employment status was known. This misleading method, which NALP still uses today, yielded an impressive employment rate of 88.1% for the Class of 1974.
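The arithmetic is easy to reproduce. Using the figures above, and backing out the implied number of employed graduates (an approximation on my part), NALP’s method compares to an all-graduates rate like this:

```python
# Figures for the Class of 1974, as reported above.
graduates    = 17188   # JDs from the 83 participating schools
status_known = 13250   # graduates whose employment status was reported
employed     = round(0.881 * status_known)   # implied by the 88.1% rate (approximate)

# NALP's rate: employed graduates divided by those whose status was known.
nalp_rate = employed / status_known
# A more conservative rate: employed graduates divided by all graduates.
all_grad_rate = employed / graduates

print(f"NALP rate:                {nalp_rate:.1%}")      # about 88.1%
print(f"Rate over all graduates:  {all_grad_rate:.1%}")  # about 67.9%
```

The true employment rate for all graduates depends on how the nonreporting fifth fared, which is precisely what we don’t know.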
But even that wasn’t enough. Starting with the Class of 1975, NALP devised a truly ingenious way to raise employment rates: It excluded from its calculation any graduate who had secured neither a job nor bar admission by the spring following graduation. As NALP explained in the introduction to its 1977 report: “The employment market for new attorneys does not consist of all those that have graduated from ABA-approved law schools. In order for a person to practice law, there is a basic requirement of taking and passing a state bar examination. Those who do not take or do not pass the bar examination should therefore be excluded from the employment market….” (p. 1)
That would make sense if NALP had been measuring the percentage of bar-qualified graduates who obtained jobs. But here’s the kicker: At the same time that NALP excluded unemployed graduates who lacked bar admission from its calculation, it continued to include employed graduates who lacked bar admission. Many graduates in the latter category held jobs that we would call “JD Advantage” jobs today. NALP’s 1975 decision gave law schools credit for all graduates who found jobs that didn’t require a law license, while allowing them to disown (for reporting purposes) graduates who didn’t obtain a license and remained jobless.
I can’t think of a justification for that–other than raising the overall employment rate. Measure employment among all graduates, or measure it among all grads who have been admitted to the bar. You can’t use one criterion for employed graduates and a different one for unemployed graduates. Yet the “NALP Research Committee, upon consultation with executive committee members and many placement directors from throughout the country” endorsed this double standard. (id.)
And the trick worked. By counting graduates who didn’t pass the bar but nonetheless secured employment, while excluding those who didn’t take the bar and failed to get jobs, NALP produced a steady rise in JD employment rates: 88.1% in 1974 (under the original method), 91.6% in 1975, 92.5% in 1976, 93.6% in 1977, and a remarkable 94.2% in 1978. That 94.2% statistic ignored the 19.5% of graduates who didn’t report any employment status, plus another 3.7% who hadn’t been admitted to the bar and were known to be unemployed. But, whatever.
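A small worked example, with entirely hypothetical counts rather than NALP’s actual figures, shows how the 1975 rule inflates the rate:

```python
# Hypothetical cohort, for illustration only -- not NALP's actual counts.
employed_bar_admitted   = 800   # employed and admitted to the bar
employed_not_admitted   = 100   # employed without bar admission (JD Advantage-type jobs)
unemployed_bar_admitted = 60    # admitted to the bar but still seeking work
unemployed_not_admitted = 40    # neither a job nor bar admission

reporting = (employed_bar_admitted + employed_not_admitted
             + unemployed_bar_admitted + unemployed_not_admitted)

# Straightforward rate: all employed graduates / all graduates with known status.
straight_rate = (employed_bar_admitted + employed_not_admitted) / reporting

# NALP's post-1975 rule: drop the unemployed non-admits from the denominator,
# but keep the employed non-admits in the numerator.
nalp_rate = (employed_bar_admitted + employed_not_admitted) / (reporting - unemployed_not_admitted)

print(f"rate over all reporting graduates: {straight_rate:.1%}")  # 90.0%
print(f"rate under the 1975 exclusion:     {nalp_rate:.1%}")      # 93.8%
```

The only graduates removed from the calculation are ones guaranteed to drag the rate down, so the adjustment can never lower the reported number.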
NALP was very pleased with its innovation. The report for the Class of 1977 states: “This revised and more realistic picture of the employment market for newly graduated and qualified lawyers reveals that instead of facing unemployment, the prospects for employment within the first year of graduation are in fact better than before. Study of the profile also reveals that there has been an incremental increase in the number of graduates employed and a corresponding drop in unemployment during that same period.” (p. 21) Yup, unemployment rates will fall if you ignore those pesky graduates who neither found jobs nor got admitted to the bar–while continuing to count all of the JD Advantage jobs.
I don’t know when NALP abandoned this piece of data chicanery. My library didn’t order any of the NALP reports between 1979 and 1995, so I can’t trace the evolution of NALP’s reporting method. By 1996, NALP was no longer counting unlicensed grads with jobs while ignoring those without jobs. Someone helped them come to their senses.
Why bring this up now? In part, I’m startled by the sheer audacity of this data manipulation. Equally important, I think it’s essential for law schools to recognize our long history of distorting data about employment outcomes. During the early years of these reports, NALP didn’t even have a technical staff: these reports were written and vetted by placement directors from law schools. It’s a sorry history.