Monday, April 8, 2019

Have the Identification Police Become Overly Intrusive?

Every intro statistics class teaches "correlation is not causation"--that is, just because two patterns consistently move together (or consistently opposite), you can't jump to the conclusion that A causes B. It might instead be that B causes A, that some other factor C is affecting both A and B, or that--among all the millions of possible patterns you can put side-by-side--the correlation between this specific A and B is just a fluky coincidence.

As part of the "credibility revolution" in empirical economics, researchers in the last 20 years or so have become much more careful in thinking about what kind of study would demonstrate causality. One approach is to set up an experiment in which some people are randomly assigned to a certain program, while others are not: for example, here are discussions of experiments about the effectiveness of preschool, health insurance, and subsidized employment. Another approach is to look for real-world situations where some randomness exists, and then use that as a "natural experiment." As an example, I recently wrote about research on the effects of money bail which takes advantage of the fact that defendants are randomly assigned to judges, some of whom are tougher or more lenient in granting bail; thus, one can study the effects of bail based on this randomness. Or in certain cities, admission to oversubscribed charter schools uses a lottery, so some students are randomly in the school and others are not.

This search for an underlying random factor that allows a researcher to obtain an estimate of an underlying cause is called "identification." It's hard to overstate how much this change has affected empirical work in economics. Pretty much every published paper or seminar presentation has a discussion of the "identification strategy." If you present correlations without such a strategy, you need to be very explicit that you are not drawing any causal inferences, just describing some patterns in the data.

There's no real dispute that this greater thoughtfulness about how to infer causality is, overall, a good thing. However, one can question whether it has gone too far. Christopher J. Ruhm raised this question in his "Presidential Address: Shackling the Identification Police?" given to the Southern Economic Association last November. The talk doesn't seem to be freely available online, but it has now been published in the April 2019 issue of the Southern Economic Journal (85:4, pp. 1016–1026) and is also available as an NBER working paper.

There are two main sets of concerns about relying on experimental or natural sources of randomness as a way of addressing causality. One is that these approaches have issues of their own. For example, imagine a study where people volunteer to be in a program, and then are randomly assigned. It might easily be true that the volunteers are not a random sample of the entire population (after all, they are the ones with the connections to hear about the study and the motivation to apply), and so the results of a study based on such a group may not generalize to the population as a whole. Ruhm acknowledges these issues, but they are not his main focus.

Ruhm's concern is that when research economists obsess over the issue of identification and causality, they can end up focusing on small questions where they have a powerful argument for causality, but ignoring large questions where getting a nice dose of randomization so that causality can be inferred is difficult or even impossible. Ruhm writes:
I sent out the following query on social media (Facebook and Twitter) and email: “I would like to get your best examples of IMPORTANT microeconomic questions (in labor/health/public/environmental/education etc.) where clean identification is difficult or impossible to obtain.” Responses included the following.
  • Effects of trade liberalization on the distribution of real wages.
  • Contributions of location, preferences, local policy decisions, and luck to geographic differences in morbidity and mortality rates.
  • Effects of the school climate and work environment on teacher and student outcomes.
  • Importance of norms on firms’ wage setting.
  • Extent to which economic factors explain the rise in obesity.
  • Impact of family structure on child outcomes.
  • Effects of inequality, child abuse, and domestic violence on later life outcomes.
  • Social cost of a ton of SO2 emissions.
  • Effect of race on healthcare use.
  • Effect of climate change on agricultural productivity.
Ruhm argues that for a number of big-picture questions, an approach that starts by demanding a clear source of randomness for identification of a causal factor is going to be too limiting. It can look at slices of the problem, but not the problem as a whole. He writes (footnotes and citations omitted):
For a more concrete indication of the value and limitations of experimental and quasiexperimental approaches, consider the case of the fatal drug epidemic, which is possibly the most serious public health problem in the United States today. To provide brief background, the number of U.S. drug deaths increased from 16,849 in 1999 to 63,632 in 2016 and they have been the leading cause of injury deaths since 2009. The rise in overdose mortality is believed to have been initially fueled by enormous increases in the availability of prescription opioids, with more recent growth dominated by heroin and fentanyl. However, some researchers argue that the underlying causes are economic and social decline (rather than supply factors) that have particularly affected disadvantaged Americans. What role can different methodological approaches play in increasing our understanding of this issue? 
RCTs [randomized control trials] could be designed to test certain short-term interventions—such as comparing the efficacy of specific medication-assisted treatment options for drug addicts—but probably have limited broader applicability because randomization will not be practical for most potential policies and longer term effects will be difficult to evaluate. Quasi-experimental methods have provided useful information on specific interventions such as the effects of prescription drug monitoring programs and the effects of other policies, like the legalization of medical marijuana. However, the challenges of using these strategies should not be understated because the results often depend on precise characteristics of the policies and the timing of implementation, which may be difficult to ascertain in practice. Moreover, although the estimated policy impacts are often reasonably large, they are dwarfed by the overall increase in fatal drug overdoses.
Efforts to understand the root causes of the drug epidemic are therefore likely to be resistant to clean identification and instead require an “all of the above” approach using experimental and quasiexperimental methods where possible, but also the accumulation of evidence from a variety of data sources and techniques, including descriptive and regression analyses that in isolation may fail to meet desired standards of causal inference but, hopefully, can be combined with other investigations to provide a compelling preponderance of evidence.
The relationship between smoking and lung cancer provides a striking example of an important question that was “answered” using strategies that would be viewed as unacceptable today by the identification police. The understanding of tobacco use as a major causal factor was not based upon RCTs involving humans but rather resulted from the accretion of evidence from a wide variety of sources including: bench science, animal experiments, and epidemiological evidence from nonrandomized prospective and retrospective studies. Quasi-experimental evidence was eventually provided (e.g., from analyses of changes in tobacco taxes) but long after the question had been largely resolved. 
To summarize, clean identification strategies will frequently be extremely useful for examining the partial equilibrium effects of specific policies or outcomes—such as the effects of reducing class sizes from 30 to 20 students or the consequences of extreme deprivation in-utero—but will often be less successful at examining the big “what if ” questions related to root causes or effects of major changes in institutions or policies.
In summing up, Ruhm writes:
Have the identification police become too powerful? The answer to this question is subjective and open to debate. However, I believe that it is becoming increasingly difficult to publish research on significant questions that lack sufficiently clean identification and, conversely, that research using quasi-experimental and (particularly) experimental strategies yielding high confidence but on questions of limited importance are more often being published. In talking with PhD students, I hear about training that emphasizes the search for discontinuities and policy variations, rather than on seeking to answer questions of fundamental importance. At professional presentations, experienced economists sometimes mention “correlational” or “reduced-form” approaches with disdain, suggesting that such research has nothing to add to the canon of applied economics.
Thus, Ruhm is pointing to a tradeoff. Researchers would like to have a study with a strong and defensible methodology, and also a study that addresses a big and important question. Tackling a big question by looking at a bunch of correlations or other descriptive evidence is going to have some genuine limitations--but at least it's looking at fact patterns about a big question. Using a great methodology to tackle a small question will never provide more than a small answer--although there is of course a hope that if lots of researchers use great methods on small questions, the results may eventually form a body of evidence that supports broader conclusions. My own sense is that the subject of economics is hard enough to study that researchers should be willing to consider, with appropriate skepticism, a wide array of potential sources of insight.

Friday, April 5, 2019

Four Snapshots of China's Growth and Inequality

Here's China's share of world population and the global economy since the start of its economic reforms. Since 1978, China's share of world population has declined mildly from 23% to about 19%. In those same 40 years, China's share of world GDP has risen dramatically from 3% to about 20%.
Here's a sense of this economic growth on a per-adult basis. The vertical axis is in yuan, so US readers might want to divide by the exchange rate of roughly 6.5 yuan/dollar. But look at the annual growth rates of national income per adult--especially the average of 8.1% per year from 1998-2015.

These images are taken from an article by Thomas Piketty, Li Yang and Gabriel Zucman, "Income inequality is growing fast in China and making it look more like the US: Study provides the first systematic estimates of the level and structure of China’s national wealth since the beginning of market reforms," which appears at the LSE Business Review website (April 1, 2019). It's a preview of their forthcoming research article in the American Economic Review. The main focus of their research has been to look at income and wealth inequality--and in particular, data on patterns of wealth in China has been hard to find. 

Here's the pattern of China's national wealth over time, expressed as a share of national income. Wealth includes the value of companies, the value of the housing stock, and other assets. China's wealth as a share of national income has almost doubled since 1978. And all of the increase is a result of rising wealth by households, not government.
Finally, here's a figure showing China's shift in income inequality over time. China's economic growth has meant a larger share of income for the top 10%, and a falling share for the bottom 50%. 
The authors write: "To summarise, the level of inequality in China in the late 1970s used to be less than the European average – closer to those observed in the most egalitarian Nordic countries – but they are now approaching a level that is almost comparable with the USA." Of course, it's important to remember that in a Chinese economy that has been growing rapidly for decades, this doesn't mean that the bottom 50% have had stagnant growth in income or are actually worse off in absolute terms. It just means that the growth in incomes for the bottom half hasn't been as rapid as for the top 10%.

Thursday, April 4, 2019

"Bias Has Been Overestimated at the Expense of Noise:" Daniel Kahneman

Daniel Kahneman (Nobel 2002) is of course known for his extensive work on behavioral biases and how they affect economic decisions. He's now working on a new book, together with Olivier Sibony and Cass Sunstein, in which he focuses instead on the concept of "noise," and argues that the role of bias in judgment has been overestimated at the expense of noise.

Here's a précis of Kahneman's current thinking on this and other topics, drawn from an interview with Tyler Cowen (both video and a transcript are available at "Daniel Kahneman on Cutting Through the Noise," December 19, 2018).
KAHNEMAN: First of all, let me explain what I mean by noise. I mean, just randomness. And it’s true within individuals, but it’s especially true among individuals who are supposed to be interchangeable in, say, organizations. ...
I’ll tell you where the experiment from which my current fascination with noise arose. I was working with an insurance company, and we did a very standard experiment. They constructed cases, very routine, standard cases. Expensive cases — we’re not talking of insuring cars. We’re talking of insuring financial firms for risk of fraud.
So you have people who are specialists in this. This is what they do. Cases were constructed completely realistically, the kind of thing that people encounter every day. You have 50 people reading a case and putting a dollar value on it.
I could ask you, and I asked the executives in the firm, and it’s a number that just about everybody agrees. Suppose you take two people at random, two underwriters at random. You average the premium they set, you take the difference between them, and you divide the difference by the average.
By what percentage do people differ? Well, would you expect people to differ? And there is a common answer that you find, when I just talk to people and ask them, or the executives had the same answer. It’s somewhere around 10 percent. That’s what people expect to see in a well-run firm.
Now, what we found was 50 percent, 5–0, which, by the way, means that those underwriters were absolutely wasting their time, in the sense of assessing risk. So that’s noise, and you find variability across individuals, which is not supposed to exist.
And you find variability within individuals, depending morning, afternoon, hot, cold. A lot of things influence the way that people make judgments: whether they are full, or whether they’ve had lunch or haven’t had lunch affects the judges, and things like that.
Now, it’s hard to say what there is more of, noise or bias. But one thing is very certain — that bias has been overestimated at the expense of noise. Virtually all the literature and a lot of public conversation is about biases. But in fact, noise is, I think, extremely important, very prevalent.
There is an interesting fact — that noise and bias are independent sources of error, so that reducing either of them improves overall accuracy. There is room for . . . and the procedures by which you would reduce bias and reduce noise are not the same. So that’s what I’m fascinated by these days.
COWEN: Do you see the wisdom of crowds as a way of addressing noise in business firms? So you take all the auditors, and you somehow construct a weighted average? ...

KAHNEMAN: With respect to the underwriters, I would expect, certainly, that if you took 12 underwriters assessing the same risk, you would eliminate the noise. You would be left with bias, but you would eliminate one source of error, and the question is just price. Google, for example, when it hires people, they have a minimum of four individuals making independent assessments of each candidate. And that reduces the standard deviation of error at least by a factor of two.
COWEN: So is the business world, in general, adjusting for noise right now? Or only some highly successful firms?
KAHNEMAN: I don’t know enough about that. All I do know is that, when we pointed out the results, the bewildering results of the experiment on underwriters, and there was another unit — people who assess the size of claims. Again, actually, it’s more than 50 percent. Like 58 percent. The thing that was the most striking was that nobody in the organization had any idea that this was going on. It took people completely by surprise.
My guess now, that wherever people exercise judgment, there is noise. And, as a first rule, there is more noise than people expect, and there’s more noise than they can imagine because it’s very difficult to imagine that people have a very different opinion from yours when your opinion is right, which it is. ...
COWEN: If you’re called in by a CEO to give advice — and I think sometimes you are — how can I reduce the noise in my decisions, the decisions of the CEO, when there’s not a simple way to average? The firm doesn’t have a dozen CEOs. What’s your advice? ...
KAHNEMAN: [T]here is one thing that we know that improves the quality of judgment, I think. And this is to delay intuition. ... Delaying intuition until the facts are in, at hand, and looking at dimensions of the problem separately and independently is a better use of information.

The problem with intuition is that it forms very quickly, so that you need to have special procedures in place to control it except in those rare cases ...  where you have intuitive expertise. That’s true for athletes — they respond intuitively. It’s true for chess masters. It’s true for firefighters ... I don’t think CEOs encounter many problems where they have intuitive expertise. They haven’t had the opportunity to acquire it, so they better slow down. ... It’s not so much a matter of time because you don’t want people to get paralyzed by analysis. But it’s a matter of planning how you’re going to make the decision, and making it in stages, and not acting without an intuitive certainty that you are doing the right thing. But just delay it until all the information is available.
COWEN: And does noise play any useful roles, either in businesses or in broader society? Or is it just a cost we would like to minimize?
KAHNEMAN: There is one condition under which noise is very useful. If there is a selection process, evolution works on noise. You have random variation and then selection. But when there is no selection, noise is just a cost. ... Bias and noise do not cover the universe. There are other categories.
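To make the noise/bias distinction a bit more concrete, here is a minimal simulation sketch of the underwriter thought experiment. The numbers are my own illustrative choices, not the insurance firm's data: a shared bias shifts every quote in the same direction, idiosyncratic noise scatters the quotes around that (biased) center, and averaging several independent judgments shrinks the noise--roughly by the square root of the number of judges, which is the logic behind Kahneman's remark about four independent interviewers--while leaving the bias untouched.

```python
# Illustrative simulation of noise vs. bias in underwriting quotes (made-up numbers).
import numpy as np

rng = np.random.default_rng(0)

true_premium = 100_000   # hypothetical "correct" premium for a case
shared_bias = 5_000      # systematic error common to all underwriters
noise_sd = 25_000        # idiosyncratic, person-specific error

def quotes(n):
    """Simulate n underwriters independently quoting the same case."""
    return true_premium + shared_bias + rng.normal(0, noise_sd, size=n)

# Kahneman's noise metric: take two underwriters at random, divide the absolute
# difference between their quotes by the average of the two quotes.
pairs = quotes(2 * 100_000).reshape(-1, 2)
rel_diff = np.abs(pairs[:, 0] - pairs[:, 1]) / pairs.mean(axis=1)
print(f"typical relative difference between two quotes: {np.median(rel_diff):.0%}")

# Averaging independent judgments shrinks the noise by about sqrt(n),
# but the shared bias is untouched.
for n in (1, 4, 12):
    panel_means = quotes(n * 20_000).reshape(-1, n).mean(axis=1)
    print(f"panel of {n:2d}: spread (sd) = {panel_means.std():,.0f}, "
          f"average error vs. truth = {panel_means.mean() - true_premium:,.0f}")
```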

Replacing LIBOR: An International Overview

LIBOR stands for "London Interbank Offered Rate." For a long time, it was probably the most common benchmark interest rate in the world--that is, it was built into trillions of dollars' worth of loans and financial contracts, so that if the LIBOR interest rate went up or down, the contract would adjust accordingly.
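To see what "adjust accordingly" means in practice, here's a stylized sketch of a floating-rate interest calculation. The numbers are made up, and real benchmark-linked contracts often compound a daily rate over the interest period rather than using a single quote, but even this simple version shows why a small move in the benchmark ripples through every contract tied to it.

```python
# Stylized floating-rate interest payment: notional * (benchmark + spread) * days/360.
notional = 10_000_000        # hypothetical loan size, in dollars
spread = 0.015               # 150 basis points over the benchmark rate
days = 90                    # length of the interest period

for benchmark in (0.024, 0.025):   # benchmark moves by 10 basis points
    payment = notional * (benchmark + spread) * days / 360
    print(f"benchmark at {benchmark:.2%}: quarterly interest = ${payment:,.0f}")
```

A 10 basis point move in the benchmark changes the quarterly payment on this one hypothetical $10 million loan by about $2,500; multiply that across trillions of dollars of contracts and it is clear why even tiny manipulations of LIBOR were worth money to traders.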

However, a huge scandal erupted back in 2010. Turns out that the LIBOR was not based on actual market transactions; instead, LIBOR was based on a survey in which someone at a bank gave a guess on what interest rate their bank would be charged if the bank wanted to borrow short-term from another bank on a given morning, in a particular currency. A few of the people responding to the survey were intentionally giving answers that pulled LIBOR up just a tiny bit one day, or pulled it down a tiny bit another day. Given that the LIBOR was linked to trillions of dollars in financial contracts, market traders who knew in advance about these shifts could and did reap fraudulent profits.

LIBOR tightened up its survey methods. But it clearly made sense to shift away from using a benchmark interest rate based on a survey, and instead to use one based on an actual market for short-term low-risk borrowing. Various committees formed to consider options. As I noted here about six weeks ago, the US is switching from LIBOR to SOFR--the Secured Overnight Financing Rate. I wrote: "It refers to the cost of borrowing which is extremely safe, because the borrowing is only overnight, and there are Treasury securities used as collateral for the borrowing. The SOFR rate is based on a market with about $800 billion in daily transactions, and this kind of overnight borrowing doesn't just include banks, but covers a wider range of financial institutions. The New York Fed publishes the SOFR rate every morning at 8 eastern time."

But what about the switch away from LIBOR in the rest of the world? Andreas Schrimpf and Vladyslav Sushko describe what's happening in "Beyond LIBOR: a primer on the new benchmark rates," which appears in the March 2019 issue of the BIS Quarterly Review (pp. 29-52). Here's a table showing the alternative risk-free rate (RFR) benchmarks being used with other currencies.
There are several big issues ahead in this area. One is that the LIBOR is actually going to be discontinued in 2021, so any loan or financial contract with a benchmark rate will have to migrate to something else. There will be literally trillions of dollars of contracts that need to shift in this way. Moreover, the LIBOR debacle has made a lot of financial industry participants think more carefully about exactly what benchmark interest rate may be appropriate in any given contract--for example, an appropriate benchmark might include not only an overnight risk-free rate, but also some built-in adjustment for other kinds of risks, including risks over different periods of time or risks at the firm or industry level.

For most of us, discussions of benchmark interest rates have a high MEGO (My Eyes Glaze Over) factor. But when I think in terms of trillions of dollars of loans and financial contracts around the world, all being adjusted in ways that are thoughtful but untested, I find it easier to pay attention to the subject.

Wednesday, April 3, 2019

Some LGBT Economics in High Income Countries

Every few years, the OECD puts out its Society at a Glance report. The later chapters offer comparisons across high-income countries on a variety of economic, demographic, health, education, and social variables. For the 2019 edition, the first chapter is on the more specific topic, "The LGBT challenge: How to better include sexual and gender minorities?" Here, I'll focus on labor market issues, but there is more in the chapter on other issues.

This figure summarizes the results of 46 studies of differences in employment and wages for LGBT people across the OECD countries. The usual method in these studies is to adjust for lots of observable factors: age, education, race/ethnicity, children in the household, hours worked, occupation/industry, location (like urban or rural), and so on. The horizontal axis shows various groups. The figure then shows the gap that remains after adjusting for these factors, in employment rates, labor earnings, and the extent to which this group is found in high managerial roles.
A common pattern in these studies is that gaps look worse for men than for women. Indeed, LGB women, or just lesbians as a group, have positive employment and labor earnings gaps compared to the rest of the population.

As social scientists have long recognized, this kind of "gap" study suggests the presence of discrimination, but it doesn't prove that discrimination exists, nor--just as important--does it point to the main locus of any discrimination. The "gap" doesn't measure anything directly: it's just what is left over after accounting for the other factors on which data existed.
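For readers who want to see the mechanics, here is a minimal sketch of what such an adjusted-gap regression looks like. The dataset, variable names, and controls are hypothetical, chosen to mirror the kinds of adjustments listed above; the coefficient on the group indicator is the "gap"--whatever difference remains after the controls, which is a description of the data rather than direct proof of discrimination.

```python
# Stylized "gap" regression: log earnings on an LGBT indicator plus observable controls.
# The data file and variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")   # hypothetical individual-level survey data

model = smf.ols(
    "log_earnings ~ lgbt + age + I(age**2) + C(education) + C(occupation)"
    " + C(region) + hours_worked + has_children",
    data=df,
).fit(cov_type="HC1")            # heteroskedasticity-robust standard errors

# The coefficient on `lgbt` is the conditional earnings gap that remains
# after the listed controls.
print(model.params["lgbt"], model.bse["lgbt"])
```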

Even if the gap does result from discrimination, a "gap" study is uninformative about whether the main force of that discrimination hits early in life, perhaps in treatment in schools and families, or whether it's mainly because of discrimination by employers later in life.  The OECD study offers a useful example: "For instance, the fact that lesbians and gay men in Sweden display lower employment rates in regions with more hostile attitudes toward homosexuals may simply reflect that more productive lesbians and gay men are more likely to move out of regions showing low acceptance of homosexuality."

Thus, it's common to complement "gap" studies with other methods that provide more direct evidence of discriminatory behavior. For example, in a "correspondence" study, the researcher sends out job applications to real job ads. The applications are functionally identical, except that some of the applications include information that could lead an employer to infer sexual orientation or gender identity--say, listing a membership in a certain volunteer organization, or giving the name of a job candidate's partner in a way that is likely to lead to inferences about the sex of that partner. The OECD describes the results of 13 studies across 10 countries taking this approach:
Homosexual female and male applicants are 1.5 times less likely to be invited to a job interview when sexual orientation is conveyed through their volunteer engagement or work experience in a gay and/or lesbian organisation. By contrast, insisting on the family prospects of female fictitious candidates by signalling homosexuality through the sex of the candidate’s partner leads to the virtual disappearance of hiring discrimination against lesbians. This pattern could reflect that employers attach a lower risk of maternity to lesbians relative to heterosexual women and are therefore less inclined to discriminate against them ...
Correspondence studies have also been done in the market for rental housing--that is, applying for apartments rather than jobs.
In the rental housing market, correspondence studies show that homosexual couples get fewer responses and invitations to showings from the landlords than heterosexual couples, a result mainly driven by male same-sex partners – see Ahmed, Andersson, & Hammarstedt (2008[38]) and Ahmed & Hammarstedt (2009[39]) in Sweden; Lauster & Easterbrook (2011[40]) in Canada; U.S. Department of Housing and Urban Development (2013[41]) in the United States and Koehler, Harley, & Menzies (2018[24]) in Serbia. In Serbia, for instance, almost one in five (18%) of same-sex couples were refused rental of an apartment by the landlord, while none of the opposite-sex couples were. This average result masks strong disparities by gender: 29% of male same-sex couples were rejected, as opposed to only 8% of female same-sex couples. The absence (or lower magnitude) of discrimination against female same-sex couples could flow from landlords’ well documented preference for female rather than male tenants (Ahmed, Andersson and Hammarstedt, 2008[38]). In this setting, the benefit of having two women as tenants could counterbalance the perceived cost of renting to a lesbian couple.

Other experiments create situations in which someone appears to need help in some way: asking passers-by for money, or using a "wrong number" or "lost letter" approach.

"In the United Kingdom, various experiments have also involved actors wearing a T-shirt with either a pro-gay slogan or without any slogan. These actors approach passers-by asking them to provide change. The findings point to less help provided to the ostensibly pro-gay person."

In the "wrong-number" approach, households get a call from someone who says their car has broken down, they are at a payphone, they are out of change, and now they have called a wrong number. They ask the person receiving the call to place a call to their boyfriend or girlfriend. Those who ask for a call to be placed to someone of the opposite sex are more likely to get help than those who ask for a call to be placed to someone of the same sex.

In the “lost-letter technique,” a number of unmailed letters, with addresses and stamps, are dropped in city streets. Some of the letters are addressed to LGBT organizations; some are not. Those that are not are more likely to be dropped in the mail by whoever finds them.

For most noneconomists, discrimination is just morally wrong, and that's enough. Economics can add that discrimination also leads to underuse and misallocation of society's human resources, which imposes costs on the economy as a whole.

Tuesday, April 2, 2019

The Unfairness of Money Bail

About 40 years ago, when I was a junior on the high school debate team, we argued for the abolition of the money bail system. Like many positions taken by high school juniors in debate tournaments, our arguments were sweeping and simplistic. But we were correct in recognizing that there are real problems with money bail.

As one example, 14 elected prosecutors wrote a joint letter to New York state lawmakers on March 6, 2019.  The prosecutors  wrote:
We support ending money bail because safety, not wealth, should be the defining feature of the pretrial justice system. Three out of every four people in New York cannot afford to pay the bail amount that the judge sets at their arraignment. That means many people are jailed simply because they are too poor to purchase their freedom. … The only people who should be detained pretrial are those who a judge finds pose a specific, clear and credible threat to the physical safety of the community, or who are a risk of intentionally evading their court dates. Jails across New York frequently are over-capacity, and they are filled with people who do not need to be there. … Research shows that people who spend even a short period in jail, as opposed to being released pretrial, are more likely to commit a future crime. This makes sense. Jail is traumatizing. Jobs are lost. Families can’t pay rent. For reasons big and small, people who are away from their family, their job, and their community become more vulnerable and less stable.

Patrick Liu, Ryan Nunn, and Jay Shambaugh provide a useful backgrounder on this subject in "The Economics of Bail and Pretrial Detention," written for the Hamilton Project at the Brookings Institution (December 2018). Will Dobbie and Crystal Yang have now offered "Proposals for Improving the U.S. Pretrial System," written as a Hamilton Project policy proposal (March 2019). Here's an overview comment from the conclusion of the Liu, Nunn, and Shambaugh paper:
"Bail has been a growing part of the criminal justice system. Nonfinancial release has been shrinking, and more and more defendants are using commercial bonds as a way to secure their release while awaiting trial. Bail can make it more likely that defendants will reappear in court, and as such reduce costs for the criminal justice system. There are, however, extensive costs. Beyond the direct costs of posting the bail, either from paying a fee or having to liquidate assets, widespread use of bail has meant that many people are incarcerated because they are unable to post bail.

"Nearly half a million people are in jail at any given time without having been convicted of a crime. The overwhelming majority of these people are eligible to be released—that is, a judge has deemed that they are safe to be released—but are unable to raise the funds for their release. The impact of monetary bail falls disproportionately on those who are low-income, cannot post bail out of liquid assets, and thus often remain in jail for extended periods. Furthermore, as a growing body of literature has shown, the assignment of financial bail increases the likelihood of conviction due to guilty pleas, and the costs—to both individuals and society— from extra convictions can be quite high."
Let's spell some of this out more explicitly.

Nearly half a million people are incarcerated on any given day without having been convicted of a crime. Add it all up, and over 10 million people during a given year are locked up without being convicted of anything. Roughly one-quarter of all inmates in state and local jails have not been convicted. Here's a figure from Liu, Nunn, and Shambaugh:


In the last few decades, the use of money bail has been rising. As Dobbie and Yang write (figures and references omitted):
The high rate of pretrial detention in the United States in recent years is largely due to the increasing use of monetary or cash bail—release conditional on a financial payment—and the corresponding decreasing use of release on recognizance (ROR), a form of release conditional only on one’s promise to return to the court. The share of defendants assigned monetary bail exceeded 40 percent in 2009 in the set of 40 populous U.S. counties where detailed data are available, an 11 percentage point–increase from 1990. The fraction of defendants released on their own recognizance decreased by about 13 percentage points over the same period in these counties, with only 14 percent of defendants being released with no conditions in 2009. The widespread use of monetary bail directly leads to high pretrial detention rates in most jurisdictions because many defendants are unable or unwilling to pay even relatively small monetary bail amounts. In New York City, for example, an estimated 46 percent of all misdemeanor defendants and 30 percent of all felony defendants were detained prior to trial in 2013 because they were unable or unwilling to post bail set at $500 or less.
The time that accused people spend in pretrial detention can be significant. Liu, Nunn, and Shambaugh write:
[T]he amount of time that a person is detained if they are unable to afford bail is substantial, ranging from 50 to 200 days, depending on the felony offense. The pretrial detention period is also growing ... From 1990 to 2009, the median duration of pretrial detention increased for every offense, ranging from an increase of 34 percent for burglary to 104 percent for rape. ... Even for durations that are relatively short—for example, 54 days for those accused of a driving-related felony—pretrial detention represents a nearly two-month period during which individuals are separated from their families and financial hardships are exacerbated. Moreover, the typical wait until trial is much longer in some places than others (e.g., 200 days in one sample of Pennsylvania counties).
Dobbie and Yang also point out that when it comes to international comparisons, the US locks up accused people before trial at a much higher rate than other countries. This figure shows the number of people detained pre-trial per 100,000 population. The US is the tallest bar on the far right.

Sorting out the costs and benefits of different levels of pretrial detention isn't easy. The direct costs of holding people in jails and prisons are straightforward. But how many of those accused people would not have appeared before the court? If they did not appear for court, how would the cost of finding them have compared to the cost of locking them up for days--and in some cases for weeks or even months? What are the additional costs of being locked up in terms of loss of employment opportunities, or stresses on families? How many of those detained would have committed crimes if not detained? (And how comfortable are we as a society with locking people up not because they have been convicted of a crime, but because we suspect they might commit a crime in the future?) When thinking about conditions of pretrial release, judges are supposed to take all of this into account: for example, the presumption that an accused person is innocent, the risk of the person not showing up for trial and the costs of finding them, the risk of the person committing another crime if they are not detained, the person's social ties to the community, and the person's economic ability to put up a monetary bond.

To figure out the effects of different methods of pretrial detention, a social scientist might ideally like to take a large pool of people accused of crimes and conduct a randomized experiment, in which some randomly get offered differing levels of money bail, some are released on their own recognizance, and we see what happens. While it would be grossly inappropriate for the justice system to plan to operate in this way, it turns out that this randomized experiment is being conducted by reality.

Decisions about whether to offer bail, or at what level, are not made consistently across the judicial system. When there are multiple judges in a given court, some will tend to be tougher in granting bail and some will be easier, so whether defendants like it or not, they are living in a randomized experiment depending on the judge to whom they are randomly assigned. In addition, the evidence shows that even the same judge will not always treat accused people with seemingly identical characteristics in the same way, which adds another element of randomness. Thus, research in this area can start by setting aside those who are essentially always granted bail or essentially never granted bail, and instead focus on those with seemingly identical characteristics who are more-or-less randomly granted bail in some cases but not in others.
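Here is a minimal sketch of how that randomness gets used--a two-stage setup in the spirit of this literature, not anyone's actual code, with hypothetical data and variable names. The assigned judge's overall tendency to detain (computed leaving out the defendant's own case) serves as an instrument for whether a particular defendant is detained pretrial.

```python
# Sketch of a judge-leniency instrumental-variables design (hypothetical data).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pretrial_cases.csv")   # hypothetical case-level data

# Leave-one-out leniency: each judge's detention rate over their *other* cases,
# so a defendant's own outcome doesn't enter the instrument.
g = df.groupby("judge_id")["detained"]
df["judge_leniency"] = (g.transform("sum") - df["detained"]) / (g.transform("count") - 1)

# First stage: does the assigned judge's leniency predict this defendant's detention?
first = smf.ols("detained ~ judge_leniency + C(court) + C(year)", data=df).fit()
df["detained_hat"] = first.fittedvalues

# Second stage: an outcome (say, employment two years later) on predicted detention.
# Doing 2SLS by hand like this gives the right point estimate but not the right
# standard errors; a dedicated IV routine would correct them.
second = smf.ols("employed_2yr ~ detained_hat + C(court) + C(year)", data=df).fit()
print("estimated effect of pretrial detention:", second.params["detained_hat"])
```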

Dobbie and Yang have been among the leading researchers in this area, and they describe the results of this research in their paper.  For example:

Those who are detained pretrial are more likely to be found guilty, mainly because those who are detained pretrial are more likely to take a plea bargain--which may include credit for time already served. Pretrial detention clearly reduces the risk of pretrial flight and pretrial crime, but at least in some studies, greater exposure to jail time before the trial is associated with a rise in posttrial crime. Defendants who because of the randomness in the system are released pretrial, rather than being held pretrial, are more likely to have income and to be employed 2-4 years later. In some jurisdictions, the randomness in the process of granting bail takes the form of racial disparities.

For those of us who aren't ready to take the plunge and eliminate the money bail system altogether, what might we do to move in the direction of reducing the use of money bail and rationalizing the system? Dobbie and Yang offer some proposals based on the existing research.

Some are pretty simple. When defendants are released on their own recognizance before trial, set up a system of text message or emails to remind them of their court date. For low-risk crimes, make greater use of writing citations, rather than arresting people, and when people are arrested, lean toward releasing them on their own recognizance. For those defendants where a higher degree of monitoring seems appropriate, make greater use of electronic or personal monitoring.

Some more complex proposals involve machine learning. It's now possible to plug the data on the characteristics of those who get bail, or are released on their own recognizance, into a computer algorithm, which can look for patterns in those who are more or less likely to flee before trial, or more or less likely to commit crimes. The feedback based on such studies can then be turned over to judges, so they have a systematic sense of how they ruled in past similar cases, and how they compare with how other judges have ruled in similar cases. It's easy to feel queasy about this approach. Are we going to let the results of computer number-crunching play a substantial role in whether people are granted bail? But computer number-crunching may have greater clarity and consistency in its decisions than at least some judges, and could help produce results that let more people out before trial while also leading to lower pretrial flight risk and crime. Pilot tests along these lines in jurisdictions willing to give it a try seem warranted.
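As a rough illustration of what such a tool involves--with hypothetical data and column names, and nothing like the careful construction, validation, and fairness auditing a real pretrial risk-assessment tool would require--the core step is just training a model on past cases and reporting a predicted risk for new ones.

```python
# Stylized pretrial risk-prediction sketch (hypothetical data and columns).
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("past_cases.csv")       # hypothetical historical case data
features = ["age", "charge_severity", "prior_failures", "prior_arrests"]
X, y = df[features], df["failed_to_appear"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)
print("out-of-sample AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# For a new defendant, the tool reports a predicted flight risk that the judge
# weighs alongside everything else -- it informs the decision rather than making it.
new_case = pd.DataFrame([{"age": 28, "charge_severity": 2,
                          "prior_failures": 0, "prior_arrests": 1}])
print("predicted probability of failure to appear:", model.predict_proba(new_case)[0, 1])
```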


Monday, April 1, 2019

Federal Employee Pay: A Trial Balloon

"The Federal Government is the Nation’s largest employer, and its footprint is global. The total workforce comprises approximately 2.1 million non-postal civilian workers and 1.4 million active duty military, as well as approximately one million military reserve personnel serving throughout the country and the world. The postal workforce includes an additional 500,000 employees. Approximately 85 percent of the Federal workforce, or 1.7 million people, live outside of the Washington, D.C., metropolitan area. Notably, an even larger “indirect” workforce carries out much of the work paid for by Federal funds. This includes Federal contractors and State, local, and nonprofit employees whose jobs are funded by Federal contracts, grants and transfer payments."

This reminder is from Chapter 5 of the Analytical Perspectives volume published with the proposed FY 2020 budget from the Trump Administration. In any given year, a lot of what is in the Analytical Perspectives volume is just an update of the previous year. But the topic of federal employee pay gets more than a quick update; it's an announcement that the Trump administration plans to push on the topic of federal employee pay in the next couple of years. 

Here are some background figures from the budget documents. The first two figures compare education levels for federal workers and private-sector workers, and how they have evolved over time. The first figure shows that the share of federal workers with at least a master's degree has roughly doubled from 15% to 30% since 1990. In the private sector, the share of workers with a master's degree is less than half this level, although also rising over time.

The reverse pattern holds for workers with a high school degree or less. This group was 30% of the federal workforce in 1990, but is now about half that level. For all firms in the private sector, 50% of workers had a high school degree or less in 1990, and it's now about 40%.


The patterns suggest a real disjunction between federal and private-sector workers. For at least some readers, it may come as a surprise to recognize that 40% of private-sector workers in the US have a high school degree or less. What seems like a real-world solution, or a useful process of paperwork and forms, might look rather different to members of these two workforces.

Federal workers tend to be significantly older, too. 
Comparing the compensation of federal and private-sector workers isn't straightforward. A full analysis would need to take into account take-home pay, benefits, differences in skill levels, and the likelihood of being fired or being able to stay on the job as long as you want. The budget document points to a study by the Congressional Budget Office:
A Congressional Budget Office (CBO) report issued in April 2017 found that, based on observable characteristics, Federal employees on average received a combined 17 percent higher wage and benefits package than the private sector average over the 2011-2015 period. The difference is overwhelmingly on the benefits side. CBO found that Federal employees receive on average 47 percent higher benefits and 3 percent higher wages than counterparts in the private sector. In CBO’s analysis, these differences reflect higher Federal compensation paid to individuals with a bachelor’s degree or less, with Federal employees with professional degrees undercompensated relative to private sector peers.
This general pattern--that wages for federal employees are similar to the private sector, given education level, while benefits are higher for government workers--goes back a few years: for example, I laid out the pattern in a blog post in 2012. The budget reproduces a chart from the 2017 CBO study. It separates the workforce into five groups by education level. For each group, the left-hand bar shows wages and benefits for federal workers, while the right-hand bar shows wages and benefits for private-sector workers. Again, wages are fairly similar, but retirement and health benefits are clearly better for the federal workers.
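As a rough back-of-the-envelope check on how those pieces fit together (using my own weights, not CBO's): if benefits make up something like 30 percent of total private-sector compensation, then a 3 percent wage gap and a 47 percent benefits gap combine to roughly 0.7 × 3% + 0.3 × 47% ≈ 16%, in the neighborhood of CBO's combined 17 percent figure.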
The budget document discusses a variety of changes to federal pay: having employees contribute more to their retirement benefits; fewer days off for federal employees, but more flexibility in which days can be taken off; fewer across-the-board pay increases, and more merit increases; greater hiring of "term" federal employees who spend a few years in the government before heading back to the private sector; and more. These kinds of proposals for adjusting federal employee pay are fairly common, but they often tend to end up on that long list of perhaps-useful-but-not-necessarily-right-now topics that never quite make it into law.