Saturday, April 20, 2019

One Case for Keeping "Statistical Significance:" Beats the Alternatives

I wrote a few weeks back that the American Statistical Association has published a special issue of its journal, the American Statistician, with a lead article proposing the abolition of "statistical significance" ("Time to Abolish `Statistical Significance'"?). John Ioannidis has estimated that 90% of medical research is statistically flawed, so one might expect him to be among the harsher critics of statistical significance.  But in the Journal of the American Medical Association, he goes the other way in "The Importance of Predefined Rules and Prespecified Statistical Analyses: Do Not Abandon Significance" (April 4, 2019). Here are a few of his themes:

The result of statistical research is often a yes-or-no outcome. Should a medical treatment be approved or not? Should a certain program or policy be expanded or cut? Should one potential effect be studied more, or should it be ruled out as a cause? Thus, while it's fine for researchers to emphasize that all results come with a degree of uncertainty, at some point it's necessary to decide how both research and real-world applications of that research should proceed. Ioannidis writes:
Changing the approach to defining statistical and clinical significance has some merits; for example, embracing uncertainty, avoiding hyped claims with weak statistical support, and recognizing that “statistical significance” is often poorly understood. However, technical matters of abandoning statistical methods may require further thought and debate. Behind the so-called war on significance lie fundamental issues about the conduct and interpretation of research that extend beyond (mis)interpretation of statistical significance. These issues include what effect sizes should be of interest, how to replicate or refute research findings, and how to decide and act based on evidence. Inferences are unavoidably dichotomous—yes or no—in many scientific fields ranging from particle physics to agnostic omics analyses (ie, massive testing of millions of biological features without any a priori preference that one feature is likely to be more important than others) and to medicine. Dichotomous decisions are the rule in medicine and public health interventions. An intervention, such as a new drug, will either be licensed or not and will either be used or not.
Yes, statistical significance has a number of problems. It would be foolish to rely on it exclusively. But what will be used instead? And will it be better or worse as a way of making such decisions? No method of making such decisions is proof against bias. Ioannidis writes: 
Many fields of investigation (ranging from bench studies and animal experiments to observational population studies and even clinical trials) have major gaps in the ways they conduct, analyze, and report studies and lack protection from bias. Instead of trying to fix what is lacking and set better and clearer rules, one reaction is to overturn the tables and abolish any gatekeeping rules (such as removing the term statistical significance). However, potential for falsification is a prerequisite for science. Fields that obstinately resist refutation can hide behind the abolition of statistical significance but risk becoming self-ostracized from the remit of science. Significance (not just statistical) is essential both for science and for science-based action, and some filtering process is useful to avoid drowning in noise.
Ioannidis argues that the removal of statistical significance will tend to make things harder to rule out, because those who wish to believe something is true will find it easier to make that argument. Or more precisely: 
Some skeptics maintain that there are few actionable effects and remain reluctant to endorse belabored policies and useless (or even harmful) interventions without very strong evidence. Conversely, some enthusiasts express concern about inaction, advocate for more policy, or think that new medications are not licensed quickly enough. Some scientists may be skeptical about some research questions and enthusiastic about others. The suggestion to abandon statistical significance espouses the perspective of enthusiasts: it raises concerns about unwarranted statements of “no difference” and unwarranted claims of refutation but does not address unwarranted claims of “difference” and unwarranted denial of refutation.
The case for not treating statistical significance as the primary goal of an analysis seems to me ironclad. The case is strong for putting less emphasis on statistical significance and correspondingly more emphasis on issues like what data is used, the accuracy of data measurement, how the measurement corresponds to theory, the potential importance of a result, what factors may be confounding the analysis, and others. But the case for eliminating statistical significance from the language of research altogether, with the possibility that it will be replaced by an even squishier and more subjective decision process, is a harder one to make.



Friday, April 19, 2019

When Did the Blacksmiths Disappear?

In 1840, Henry Wadsworth Longfellow published a poem called "The Village Blacksmith." In my humble economist-opinion, not his best work. But the opening four lines are very nice:
Under a spreading chestnut tree
The village smithy stands;
The smith, a mighty man is he,
With large and sinewy hands ...
When I was a little boy and my father read the poem to me, I remember the pleasure in his voice at that word "sinewy." Indeed, there was a time when every town of even modest size had at least one blacksmith. When did they go away?
Jeremy Atack and Robert A. Margo answer the question in "Gallman revisited: blacksmithing and American manufacturing, 1850–1870," published earlier this year in Cliometrica (2019, 13: 1-23).  Gallman is an earlier writer who classified blacksmiths as a service industry, while Atack and Margo argue that they instead should be treated as an early form of manufacturing. From a modern view, given the concerns over how technology may affect current and future jobs, blacksmiths offer an example of how a prominent industry of skilled workers went away. Atack and Margo write:
The village blacksmith was a common sight in early nineteenth-century American communities, along with cobblers, shoemakers, grist mill operators, and other artisans. Blacksmiths made goods from wrought iron or steel. This metal was heated in a forge until pliant enough to be worked with hand tools, such as a hammer, chisel, and an anvil. Others also worked with metal but what distinguished blacksmiths was their abilities to fashion a wide range of products from start to finish and even change the properties of the metal by activities such as tempering, as well as repair broken objects. ...

Blacksmiths produced a wide range of products and supplied important services to the nineteenth-century economy. In particular, they produced horseshoes and often acted as farriers, shoeing horses, mules, and oxen. This was a crucial service in an economy where these animals provided most of the draft power on the farm and in transportation and carriage. The village blacksmith also produced a wide range of goods from agricultural implements to pots and pans, grilles, weapons, tools, and carriage wheels among many other items familiar and unfamiliar to a modern audience—a range of activities largely hidden behind their generic occupational title.
Blacksmithing was a sufficiently important activity to qualify as a separate industrial category in the nineteenth-century US manufacturing censuses, alongside more familiar industries as boots and shoes, flour milling, textiles, and clock making. The 1860 manufacturing census, for example, enumerated 7504 blacksmith shops employing 15,720 workers ... —in terms of the number of establishments, the fourth most common activity behind lumber milling, flour milling, and shoemaking. ...
As ‘‘jacks-of-all-trades,’’ they [blacksmiths] were generally masters of none (except for their service activities). Moreover, the historical record reveals that several of those who managed to achieve mastery moved on to become specialized manufacturers of that specific product. Such specialized producers had higher productivity levels than those calling themselves blacksmiths producing the same goods, explaining changes in industry mix and the decline of the blacksmith in manufacturing. ...

Consider the goods produced historically by blacksmiths, such as plows. Over time, blacksmiths produced fewer and fewer of these, concentrating instead on services like shoeing horses or repairs. But even controlling for this, only the most productive of blacksmiths (or else those whose market was protected from competition in some way) survived—a selection effect. On the goods side of the market, production shifted toward establishments that were sufficiently productive that they could specialize in a particular ‘‘industry,’’ such as John Deere in the agricultural implements industry. As this industry grew, it drew in workers—some of whom in an earlier era might have opened their own blacksmith shops, but most of whom now worked on the factory floor, perhaps doing some of the same tasks by hand that blacksmiths had done earlier but otherwise performing entirely novel tasks, because the production process was increasingly mechanized. On average, such workers in the specialized industry were more productive than the ‘‘jack-of-all-trades,’’ the blacksmith, had been formerly. The village smithy could and did produce rakes and hoes, but the village smithy eventually and increasingly gave way to businesses like (John) Deere and Company who did it better. ...
During the first half of the nineteenth century, blacksmiths were ubiquitous in the USA, but by the end of the century they were no longer sufficiently numerous or important goods producers to qualify as a separate industry in the manufacturing census.

Thursday, April 18, 2019

When US Market Access is No Longer a Trump Card

When the US economy was a larger share of the world economy, access to the US market meant more. For example, World Bank statistics say that the US economy was 40% of the entire world economy in 1960, but is now about 24%. The main source of growth in the world economy for the foreseeable future will be in emerging markets.

For a sense of the shift, consider this figure from chapter 4 of the most recent World Economic Outlook report, published by the IMF (April 2019). The lines in the figure show the trade flows between countries that are at least 1% of total world GDP. The size of the dots for each country is proportionate to the country's GDP.

In 1995, you can see international trade revolving around the United States, with another hub of trade happening in Europe and a third hub focused around Japan.  Trade between the US and China shows up on the figure, but China did not have trade flows greater than 1% of world GDP with any country other than the US.

The picture is rather different in 2015. The US remains an international hub for trade. Germany remains a hub as well, although fewer of its trade flows now exceed 1% of world GDP. And China has clearly become a hub of central importance in Asia.

The patterns of trade have also shifted toward greater use of global value chains--that is, intermediate products that are shipped across national borders at least once, and often multiple times, before they become final products. Here's the overall pattern since 1995 of falling tariffs and rising participation in global value chains for the world economy as a whole.

Several decades ago, emerging markets around the world worried about having access to selling in US and European markets, and this market access could be used by the US and European nations as a bargaining chip in economic treaties and more broadly in international relations. Looking ahead, US production is now more tied into global value chains, and the long-term growth of US manufacturing is going to rely more heavily on sales to markets outside the United States.

For example, consider concerns about the future of the US car industry: as of 2015, the US produced about 7% of the world's cars and about 22% of the world's trucks. The future growth of car consumption is going to be primarily outside the US economy. For the health and long-term growth of the US car business, the possibility of unfair imports into the US economy matters a lot less than the access of US car producers to selling in the rest of the world economy.

The interconnectedness of global value chains means that General Motors already produces more cars in China than it does in the United States. In fact, sales of US multinationals now producing in China are already twice as high as exports from the US to China. Again, the long-term health of many US manufacturers is going to be based on their ability to participate in international value chains and in overseas production.

Although what caught my eye in this chapter of the World Economic Outlook report was the shifting patterns of world trade, the main emphases of the chapter are on other themes that will come as no surprise to faithful readers of this blog.  One main theme is that shifts in bilateral and overall trade deficits are the result of macroeconomic factors, not the outcome of trade negotiations, a theme I've harped on here (for example, here, here, and here).

The IMF report also offers calculations that higher tariffs between the US and China will cause economic losses for both sides. From the IMF report:
US–China trade, which falls by 25–30 percent in the short term (GIMF) and somewhere between 30 percent and 70 percent over the long term, depending on the model and the direction of trade. The decrease in external demand leads to a decline in total exports and in GDP in both countries. Annual real GDP losses range from –0.3 percent to –0.6 percent for the United States and from –0.5 percent to –1.5 percent for China ... Finally, although the US–China bilateral trade deficit is reduced, there is no economically significant change in each country’s multilateral trade balance.
Some advocates of higher tariffs take comfort in noting that the estimated losses to China's economy are bigger than the losses to the US economy.  Yes, but it's losses all around! As the 21st century economy evolves, the most important issues for US producers are going to involve their ability to compete in unfettered ways in the increasingly important markets outside the US.

Tuesday, April 16, 2019

The Utterly Predictable Problem of Long-Run US Budget Deficits

For anyone who can do arithmetic, it did not come as a surprise that the "baby boom generation," born from 1946 up through the early 1960s, started turning 65 in 2010. Here's the pattern over time of the "Daily Average Number of People Turning 65." The jump of the boomer generation is marked.

Because two major federal spending programs are focused on older Americans--Social Security and Medicare--it has been utterly predictable for several decades that the long-run federal budget situation would come under strain at about this time. That figure comes from a report by the US Government Accountability Office, "The Nation’s Fiscal Health: Action Is Needed to Address the Federal Government’s Fiscal Future" (April 2019).

Here's a breakdown of the GAO predictions on federal spending for the next 30 years. The "all else" category bumps up about 0.6% of GDP during this time, and normal politics could deal with that easily enough. Social Security spending is slated to bump up about 1% of GDP, which is a bigger problem. Still, some mixture of limits on benefits (like a later retirement age) and a modest bump in the payroll tax rate could address this. Indeed, it seems to me an indictment of the US political class, from both parties, that no forward-looking politician has built a movement around steps to "save Social Security."
But the projected rise in government health care spending of 3.2% of GDP is a challenge that no one seems to know how to fix. It's a combination of the rising share of older people--in particular, the rising share of the very old, who are more likely to need nursing home and Alzheimer's care--and an overall trend toward higher per person spending on health care. As I've noted before, every dollar of government health care spending represents both care for a patient and income for a provider, and both groups will fight hard against cutbacks.

Meanwhile, this rise in spending, coupled with the assumption that tax revenues remain on their current trajectory, means higher government deficits and borrowing. Over time, this also means higher government interest payments. So if we lack the ability to control the rise in deficits, interest payments soar. In 2018, about 7.9% of federal spending was interest payments on past borrowing. By 2048, on these projections, about 22% of all federal spending will be interest payments on past debt. Of course, this also means that finding ways to reduce the deficit with spending and tax changes has the added benefit of avoiding this soaring rise in interest payments.

As a side note, I thought a figure in the GAO report showing who holds US debt was interesting.  The report notes:
Domestic investors—consisting of domestic private investors, the Federal Reserve, and state and local governments—accounted for about 60 percent of federal debt held by the public as of June 2018, while international investors accounted for the remaining 40 percent. International investors include both private investors and foreign official institutions, such as central banks and national government-owned investment funds. Central banks hold foreign currency reserves to maintain exchange rates or to facilitate trade. Therefore, demand for foreign currency reserves can affect overall demand for U.S. Treasury securities. An economy open to international investment, such as the United States, can essentially borrow the surplus of savings of other countries to finance more investment than U.S. national saving would permit. The flow of foreign capital into the United States has gone into a variety of assets, including Treasury securities, corporate securities, and direct investment.


The arguments for restraint on federal borrowing are fairly well-known. There's the issue that too much debt means less ability to respond to a future recession, or some other crisis. There's an issue that high levels of federal borrowing soak up investment capital that might have been used more productively elsewhere in the economy. There's a concern that federal borrowing is financing a fundamental shift in the nature of the federal government: it used to be that most of government spending was about making investments in the future--infrastructure, research and development, education, and so on--but over the decades it has become more and more about cutting checks for immediate consumption in spending programs.

But at the moment, I won't argue for this case in any detail, or offer a list of policy options. Those who want to come to grips with the arguments should look at William Gale's just-published book, Fiscal Therapy: Curing America's Debt Addiction and Investing in the Future. For an overview of Gale's thinking, a useful starting point is the essay "Fiscal therapy: 12 framing facts and what they mean" (Brookings Institution, April 3, 2019).

Of course, I personally like some of Gale's proposals better than others. But overall, what I really like is that he takes the issue of rising government debt in the long run seriously, and takes the responsibility of offering policy advice seriously. He doesn't wave his hands and assume that faster economic growth will bring in additional waves of tax revenue; or that "taxing the rich" is a magic elixir; or that the proportion of older people who are working will spike upward; or that we can ignore the deficit for five or ten or 15 years before taking some steps; or that government spending restraint needs nothing more than avoiding duplication, waste, and fraud. He uses mainstream estimates and makes concrete suggestions. For an outside analysis of his recommendations, John Ricco, Rich Prisinzano, and Sophie Shin run his proposals through the Penn Wharton Budget Model in "Analysis of Fiscal Therapy: Conventional and Dynamic Estimates" (April 9, 2019).

As a reminder, here's the pattern of US debt as a share of GDP since 1790. Through most of US history, the jumps in federal debt were about wars: the Revolutionary War, the Civil War, World Wars I and II. There's also a jump in the 1930s as part of fighting the Great Depression, and a more recent jump that was part of fighting the Great Recession.
But the aging of the US population and rising health care spending are going to take federal spending and deficits to new (peacetime) levels. On the present path, we will surpass the previous high of federal debt at 106% of GDP in about 15-20 years. It would be imprudent to wait and see what happens.

Monday, April 15, 2019

US Attitudes Toward Federal Taxes: A Rising Share of "About Right"

The Gallup Poll has been asking Americans since the 1950s whether they think that the income tax they pay is "too high" or "about right." The figure shows the responses over time including the 2019 poll, taken in early April. (The "don't know" and the "too low" answers are both quite small, and are not shown on the figure.)
Figure: Americans’ opinions of the federal income tax they pay, since 1956.
What's interesting to me here is that from the late 1960s up through the 1990s, a healthy majority of Americans consistently viewed their income taxes as too high. But since around 2000, the gap has been much narrower. Indeed, the share of Americans saying that their income taxes are "about right" has been at its highest historical level in 2018 and 2019.

In more detailed questions, the Gallup poll also asks about different income groups: whether they are paying too much, a fair amount, or too little in federal taxes. Unsurprisingly, the general sentiment is that those with upper income levels could pay more. But there are also some patterns in the results I wouldn't necessarily have expected. Here are the results for what people think upper-income people are paying in federal taxes (for readability, I omitted the 2-4% in the "don't know" column):
What jumps out at me is that the proportion saying that upper-income people are paying a fair share of federal taxes was at its highest ever in 2019, while the share saying that upper-income people are paying "too little" was at its lowest ever. Yes, there's still a majority saying that upper-income people are paying "too little." But it's a shrinking majority. Given trends toward greater inequality of incomes over the 25 years shown in the table, I wouldn't have expected that pattern.

Here's the pattern for federal taxes paid by middle-income people:
Here, you can see the pattern from the figure above: that is, the share saying that middle-income people pay too much is falling, while the share saying that they pay their fair share seems to have risen over time.

For the share paid by lower-income people, the pattern looks like this:
Overall, the most common answer is that lower-income people pay "too much" in federal taxes. But it's interesting that the share saying that lower-income people pay "too little" is higher than the share saying that middle-income people pay "too little." The share saying that lower-income people pay "too little" in federal taxes was especially high in 2014 and 2015, although it has dropped off a little since then.

Of course, questions about whether a tax code is fair are going to be influenced by political partisanship. For example, here are some poll responses from the Pew Research Center. Their polling shows that the share of Americans thinking the federal tax system is very or moderately fair rose slightly in the last year. However, this modest overall rise is the result of Republicans being much more willing to say the tax system is fair than at any time in the last 20 years, and Democrats being much less willing to say so.
Figure: Widest partisan gap in views of fairness of the tax system in at least two decades.

As I've pointed out in the past, poll responses on economic questions like whether free trade offers benefits are also influenced dramatically by political preferences. During the past couple of years, as President Trump has inveighed against international trade and called for protectionism, Democrats have suddenly become much more positive about trade. One suspects that this pattern emerged more from anti-Trump feeling than from increased time spent reading economics textbooks.

Still, it's interesting to me that a plurality of Americans now see the federal income tax as "about right," while the proportion saying that upper-income people pay too little is down, the proportion saying that middle-income people pay a fair share is up, and the proportion saying that lower-income people pay too little has risen. Perhaps politicians who call for cutting taxes, or for dramatic tax increases, are refighting battles from the 1990s that are of less relevance to current voters.

Saturday, April 13, 2019

The Captain Swing Riots: Workers and Threshing Machines in the 1830s

"Between the summer of 1830 and the summer of 1832, riots swept through the English countryside. Over no more than two years, 3,000 riots broke out – by far the largest case of popular unrest in England since 1700. During the riots, rural laborers burned down farmhouses, expelled overseers of the poor and sent threatening letters to landlords and farmers signed by the mythical character known as Captain Swing. Most of all, workers attacked and destroyed threshing machines."

Bruno Caprettini and Joachim Voth provide a readable overview of their research on the riots in "Rage against the machines: New technology and violent unrest in industrializing England," written as a Policy Brief for the UBS International Center of Economics in Society (2018, #2). They write:
"Threshing machines were used to thresh grain, especially wheat. Until the end of the 1700s, threshing grain was done manually and it was the principal form of employment in the countryside during the winter months. Starting from the Napoleonic Wars (1803-1815), threshing machines spread across England, replacing workers. Horse-driven or water-powered threshers could finish in a matter of weeks a task that would have normally kept workers busy for months. Their use arguably depressed the wages of rural workers." 
Here's a figure showing locations of the Captain Swing riots: 

The authors collect evidence about where threshing machines were being adopted based on newspaper advertisements for the sale of farms--which listed threshing machines at the farm as well as other property included with the sale. They show a correlation between the presence of more threshing machines and rioting. But as always, correlation doesn't necessarily  mean causation. For example, perhaps areas where local workers were already more rebellious and uncooperative were more likely to adopt threshing machines, and the riots that followed only show why local farmers didn't want to deal with their local workers. 

Thus, the authors also collect evidence on which areas had especially good soil for wheat, which makes using a thresher more likely, and which areas had water power available to run threshers. It turns out that these are also the areas where threshers were more likely to be adopted. So a more plausible explanation seems to be that the new technology was adopted where it was most likely to be effective, not because of pre-existing local stroppiness.

The Captain Swing riots are thus one more example, and an especially vivid one, of how new technologies that cause a lot of people to lose a way of earning income can be highly disruptive. The authors write: "The results suggest that in one of the most dramatic cases of labor unrest in recent history, labor-saving technology played a key role. While the past may not be an accurate guide to future upheavals, evidence from the days of Captain Swing serve as a reminder of how disruptive new, labor-saving technologies can be in economic, social and political terms."

Friday, April 12, 2019

Building Worker Skills in a Time of Rapid Technological Change

I'm congenitally suspicious of "this time is different" arguments, which often seem very quick to toss out historical experience for the sake of a lively narrative. So when I find myself in discussions of whether the present wave of technological change is unprecedented or unique, I often end up making the argument that while the new technologies are obviously different in their specifics from older technologies, the fact of technology leading to very dramatic disruptions of labor markets is not at all new. To me, the more interesting questions are how the economy, government, and society react to that ongoing pattern of technological change.

Conor  McKay, Ethan Pollack, and Alastair Fitzpayne offer a useful broad overview of these issues in "Automation and a Changing Economy," a two-part report written for the Aspen Institute Future of Work Initiative (April 2019). The first volume focuses on the theme "A Case for Action," with background on how technological change and automation has affected labor markets over time, while the second volume is "Polices for Shared Prosperity," with a list of policy options.

It may turn out to be true that the current wave of technological innovation is uniquely different in some ways. (It's very hard to disprove that something might happen!)  But it's worth taking a moment to acknowledge that technologies of the past severely disrupted the US labor market, too. For example, here's a figure showing shifts in the pattern of US jobs over time: the dramatic rise in white-collar jobs, with falls in other areas.

And of course if one goes beyond broad skill categories and looks in more detail at jobs, the necessary skill mix has been changing quite substantially as well. Remember that in the 1970s, word-processing was mostly on typewriters; in the 1980s, written communications involved mail, photocopying, and sometimes fax machines; in the 1990s, no one carried a smartphone. It's not just changes in information technologies and the web, either. Workers across manufacturing and services jobs have had to learn how to use new generations of physical equipment as well.

Of course, we can noodle back and forth over how new technologies might have bigger effects on labor markets. The report has some discussion of these issues, and dinner parties for economists have been built on less. But the lesson I'd take away, to quote from the report, is: "Automation need not be any more disruptive in the future than it has been in the past to warrant increased policy intervention."

One key issue in navigating technological change is how workers can obtain the skills that employers want. And here a problem emerges, which is that although employers were a primary source of such training in the past, they have backed away from this role. The report notes (footnotes omitted):
Employers traditionally have been the largest source of funding for workforce training, but businesses are training fewer workers than in the past. From 1996 to 2008, the percentage of workers receiving employer-sponsored or on-the-job training fell 42 percent and 36 percent, respectively. This decline was widespread across industries, occupations, and demographic groups. ...  More recent data on employer-provided training has been mixed. Data from the Society for Human Resource Management suggests that employer-provided tuition assistance has been falling in recent years, from 66 percent of surveyed businesses offering tuition assistance benefits in 2008 down to 53 percent in 2017. Meanwhile, data from the Association for Training & Development suggests that employer training investments have been roughly flat over the last decade. ...
[A]s unions have lost power and membership, ... businesses have had a freer hand to hire already trained external candidates, often leading to fewer within-firm career pathways and  higher turnover. ...
Public sector investment has declined, too. For example, WIOA Title I state grants, which fund the core of the federal workforce development system, have been cut by over 40 percent since 2001. The program is currently underfunded by $367 million relative to its authorized levels. Government spending on training and other programs  to help workers navigate job transitions is now just 0.1 percent of GDP, lower than all other OECD countries except for Mexico and Chile, and less than half of what it was 30 years ago.
Why have employers backed away from providing training? The report notes:
Without intervention, business investment in workers may continue to decline. In a recent Accenture survey of 1,200 CEOs and other top executives, 74 percent said that  they plan to use artificial intelligence to automate tasks in their workplace over the next three years. Yet only three percent reported planning to significantly increase investments in training over the same time period.
In part, the decline in employer-provided training can be explained by changes in the employer-employee relationship over the past forty years. ... If businesses plan to retain employees over a long period, they will benefit more directly from their training investments. But as relationships between workers and businesses become less stable and short-term, businesses have a difficult time capturing the return on their training investments. The result is less investment in training even as the workforce requires greater access to skills training.
Recent legislation could accelerate this trend. Businesses often have to choose between using workers or machines to accomplish a task. The 2017 Tax Cuts and Jobs Act allows businesses to immediately expense the full cost of equipment purchases—including automation technology—rather than deduct the cost of the equipment over a period of time. By reducing the after-tax cost of investing in physical capital but not providing a similar benefit for investments in human capital, the legislation may further shift business priorities away from worker training.
There are a number of ways one might seek to rebuild connections between employers and job training. The report suggests an employer tax break for spending on employee training, similar to the tax break now given for investing in research and development. A complementary approach would be a dramatic expansion of the community college system, which has the advantage that it can train workers for a locally prominent industry with multiple employers. Yet another approach is a considerable expansion of apprenticeships. Still another would be much greater support for "active labor market policies" that assist workers with job search and training.

A lot of the concern over adapting to technological change, and whether the economy is providing "good jobs" or devolving toward alternative "gig jobs," seems to me rooted in concerns about the kind of attachment that exists between workers and employers.  It relates to the extent that workers feel engaged with their jobs, and to whether the worker and employer both have a plausible expectation that the job relationship is likely to persist for a time--allowing both of them to invest in acquiring skills with the possibility (or likelihood?) of a lasting connection in mind. 

Ultimately, it will matter whether employers view their employees as imperfect robots, always on the verge of being replaced when the better robots eventually arrive, or whether they view their employees as worth investing in. It's the difference between automation replacing workers and automation complementing them.

Wednesday, April 10, 2019

Interview with Preston McAfee: Economists and Tech Companies

David A. Price interviews R. Preston McAfee in the most recent issue of Econ Focus from the Federal Reserve Bank of Richmond (Fourth Quarter 2018, pp. 18-23). From the introduction to the interview:
"Following a quarter-century career in academia at the California Institute of Technology, the University of Texas, and other universities, McAfee was among the first economists to move from academia to a major technology firm when he joined Yahoo in 2007 as chief economist. Many of the younger economists he recruited to Yahoo are now prominent in the technology sector. He moved to Google in 2012 as director of strategic technologies; in 2014, he joined Microsoft, where he served as chief economist until last year. McAfee combined his leadership roles in the industry with continued research, including on the economics of pricing, auctions, antitrust, and digital advertising. He is also an inventor or co-inventor on 11 patents in such wide-ranging areas as search engine advertising, automatically organizing collections of digital photographs, and adding user-defined gestures to mobile devices. While McAfee was still a professor in the 1990s, he and two Stanford University economists, Paul Milgrom and Robert Wilson, designed the first Federal Communications Commission auctions of spectrum."
Here are some comments from the interview that especially caught my eye--although the whole interview is worth reading:

On the antitrust and competition issues with the FAANG companies: 
Of course, a lot of the discussion today is focused on FAANG — Facebook, Apple, Amazon, Netflix, and Google. ... First, let's be clear about what Facebook and Google monopolize: digital advertising. The accurate phrase is "exercise market power," rather than monopolize, but life is short. Both companies give away their consumer product; the product they sell is advertising. While digital advertising is probably a market for antitrust purposes, it is not in the top 10 social issues we face and possibly not in the top thousand. Indeed, insofar as advertising is bad for consumers, monopolization, by increasing the price of advertising, does a social good. 
Amazon is in several businesses. In retail, Walmart's revenue is still twice Amazon's. In cloud services, Amazon invented the market and faces stiff competition from Microsoft and Google and some competition from others. In streaming video, they face competition from Netflix, Hulu, and the verticals like Disney and CBS. Moreover, there is a lot of great content being created; I conclude that Netflix's and Amazon's entry into content creation has been fantastic for the consumer. ...
That leaves Apple, and the two places where I think we have a serious tech antitrust problem. We have become dependent on our phones, and Apple does a lot of things to lock in its users. The iMessage program and FaceTime are designed to force people into the Apple ecosystem. Also, Apple's app store is wielded strategically to lock in users (apps aren't portable), to prevent competition with Apple services, and to prevent apps that would facilitate a move to Android. My concern is that phones, on which we are incredibly dependent, are dominated by two firms that don't compete very strongly. While Android is clearly much more open than Apple, and has competing handset suppliers, consumers face switching costs that render them effectively monopolized. ...
The second place I'm worried about significant monopolization is Internet service. In many places, broadband service is effectively monopolized. For instance, I have only one company that can deliver what anyone would reasonably describe as broadband to my house. The FCC says I have two, but one of these companies does not actually come to my street. I'm worried about that because I think broadband is a utility. You can't be an informed voter, you can't shop online, and you probably can't get through high school without decent Internet service today. So that's become a utility in the same way that electricity was in the 1950s. Our response to electricity was we either did municipal electricity or we did regulation of private provision. Either one of those works. That's what we need to do for broadband.
Using "double machine-learning" to separate seasonal and  price effects
Like most computer firms, Microsoft runs sales on its Surface computers during back-to-school and the December holidays, which are also the periods when demand is highest. As a result, it is challenging to disentangle the effects of the price change from the seasonal change since the two are so closely correlated. My team at Microsoft developed and continues to use a technology to do exactly that and it works well. This technology is called "double ML," double machine learning, meaning it uses machine learning not once but twice.
This technique was originally created by some academic economists. Of course, as with everything that's created by academic economists, including me, when you go to apply it, it doesn't quite work. It almost works, but it doesn't quite work, so you have to change it to suit the circumstances.
What we do is first we build a model of ourselves, of how we set our prices. So our first model is going to not predict demand; it's just going to predict what decision-makers were doing in the past. It incorporates everything we know: prices of competing products, news stories, and lots of other data. That's the first ML. We're not predicting what demand or sales will look like, we're just modeling how we behaved in the past. Then we look at deviations between what happened in the market and what the model says we would have done. For instance, if it predicted we would charge $1,110, but we actually charged $1,000, that $110 difference is an experiment. Those instances are like controlled experiments, and we use them in the second process of machine learning to predict the actual demand. In practice, this has worked astoundingly well.
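For readers who want to see the mechanics, the procedure McAfee describes is close in spirit to the "double/debiased machine learning" estimator from the academic literature: model the decision, model the outcome, and regress residual on residual. Here is a minimal sketch in Python with simulated data--the variable names, models, and numbers are all hypothetical, and it skips the cross-fitting step that a careful implementation would include.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical observable context: seasonality, competitor prices, promotions, etc.
X = rng.normal(size=(n, 5))

# Simulated data-generating process (for illustration only): price mostly follows
# the context, plus idiosyncratic deviations; sales depend on context and on price.
price = 1000 + 50 * X[:, 0] - 30 * X[:, 1] + rng.normal(scale=40, size=n)
sales = 200 - 0.1 * price + 20 * X[:, 0] + rng.normal(scale=10, size=n)

# First ML model: predict the firm's own pricing decisions from the context.
price_model = GradientBoostingRegressor().fit(X, price)
price_residual = price - price_model.predict(X)    # the "accidental experiments"

# Second ML model: predict sales from the context alone, then residualize.
sales_model = GradientBoostingRegressor().fit(X, sales)
sales_residual = sales - sales_model.predict(X)

# Regress residual on residual to estimate the price effect net of seasonality.
# (A careful implementation would use cross-fitting rather than in-sample residuals.)
effect = LinearRegression().fit(price_residual.reshape(-1, 1), sales_residual)
print(f"Estimated price effect on sales: {effect.coef_[0]:.3f} (simulated truth: -0.1)")
```

The point of the exercise is just that the residual-on-residual regression recovers the price effect even though price and demand both move with the seasonal context.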
On the power of AI
AI is going to create lots of opportunities for firms in every industry. By AI, I mean machine learning, usually machine learning that has access to large volumes of data, which enables it to be very clever. 
We're going to see changes everywhere: from L'Oréal giving teenagers advice about what makeup works best for them to airplane design to logistics, everywhere you look within the economy.
Take agriculture. With AI, you can start spot-treating farms for insect infestation if you can detect insect infestations, rather than what we do today, which is spread the treatment broadly. With that ability to finely target, you may be able to reduce pesticides to 1 percent of what you're currently using, yet still make them more effective than they are today and have them not deteriorate so rapidly in terms of the bugs evolving around them.
For a recent overview of "Economists (and Economics) in Tech Companies," interested readers may want to check the article by Susan Athey and Michael Luca in the Winter 2019 issue of the Journal of Economic Perspectives (33:1, pp. 209-30).

Tuesday, April 9, 2019

Snapshots of Trade Imbalances: US in Global Context

A substantial amount of the discussion of international trade issues starts from the premise that the United States has huge trade deficits and China has huge trade surpluses. But what if only half of that premise is true? Here are a couple of tables on trade balances that I've clipped out of the IMF's World Economic Outlook for April 2019. One shows national trade deficits and surpluses in dollars; the other shows them as a share of the nation's GDP. Of course, the 2019 figures are projections.

The US trade deficit is large, both in absolute dollars (-$469 billion in 2018) and as a share of GDP (-2.3% of GDP). Indeed, the imposition of tariffs by the Trump administration in 2018 is projected to be followed by a larger US trade deficit in 2019--which would tend to confirm the standard lesson that trade deficits and surpluses result from underlying macroeconomic patterns of  domestic consumption, saving, and investment, not from trade agreements.

But China's trade surplus is not especially large, at about $49 billion in 2018, which is 0.4% of China's GDP. Indeed the IMF is projecting that China's small trade surpluses will turn into trade deficits by 2024. As China's population ages, it has been shifting toward becoming a higher-consumption society, and its trade surpluses have fallen accordingly.

What other countries have large trade deficits, like the US? And if it's not China, what countries have the large trade surpluses?

The US has by far the biggest trade deficit in absolute terms. When measured as a size of its economy, however, the US trade deficit is smaller than the trade deficits in the United Kingdom, Canada, South Africa (or the nations of sub-Saharan Africa as a group) and India.

When it comes to trade surpluses, the surpluses of Germany (+$403 billion), Japan (+$173 billion), Russia (+$114 billion), and Italy (+$53 billion) all outstripped China's (+$49 billion) in absolute terms in 2018. These economies are also smaller than China's, so as a percentage of GDP their trade surpluses are larger than China's as well. In addition, the combined trade surplus of the "other advanced economies" is large. This group is made up of the advanced economies outside the Group of Seven countries listed in the table and outside the euro area, so examples would include Korea, Australia, Norway, Sweden, Taiwan, and others.


One final note: Measures of international trade flows are imperfect. An obvious illustration of this point is that the world balance of trade must always be zero, by definition, because exports from any one location are imports for some other location. However, these tables show the world as having an overall trade surplus projected at $154 billion in 2019.

Monday, April 8, 2019

Have the Identification Police Become Overly Intrusive?

Every intro statistics class teaches "correlation is not causation"--that is, just because two patterns consistently move together (or consistently opposite), you can't jump to the conclusion that A causes B. Perhaps B causes A, perhaps some alternative factor C is affecting both A and B, or perhaps, among all the millions of possible patterns you could put side-by-side, the correlation between this specific A and B is just a fluky coincidence.

As part of the "credibility revolution" in empirical economics, researchers in the last 20 years or so have become much more careful in thinking about what kind of a study would demonstrate causality. For example, one approach is to set up an experiment in which some people are randomly assigned to a certain program, while others are not. For example, here are discussions of experiments about the effectiveness of preschool, health insurance, and subsidized employment. Another approach is to look for real-world situations where some randomness exists, and then use that as a "natural experiment." As an example, I recently wrote about research on the effects of money bail which take advantage of the fact that defendants are randomly assigned to judges, some of who are tougher or more lenient in granting bail. Or in certain cities, admission to oversubscribed charter schools uses a lottery, so some students are randomly in the school and others are not. Thus, one can study the effects of bail based on this randomness.

This search for an underlying random factor that allows a researcher to obtain an estimate of an underlying cause is called "identification." It's hard to overstate how much this change has affected empirical work in economics. Pretty much every published paper or seminar presentation has a discussion of the "identification strategy." If you present correlations without such a strategy, you need to be very explicit that you are not drawing any causal inferences, just describing some patterns in the data.

There's not any dispute that this greater thoughtfulness about how to infer causality is overall a good thing. However, one can question whether it has gone too far. Christopher J. Ruhm raised this question in his "Presidential Address: Shackling the Identification Police?" given to the Southern Economic Association last November. The talk doesn't seem to be freely available online, but it has now been published in the April 2019 issue of the Southern Economic Journal (85:4, pp. 1016–1026) and is also available as an NBER working paper.

There are two main sets of concerns about the focus on looking for sources of experimental or natural randomness as a way of addressing issues about causality. One is that these approaches have issues of their own. For example, imagine a study where people volunteer to be in a program and then are randomly assigned. It might easily be true that the volunteers are not a random sample of the entire population (after all, they are the ones with connections to hear about the study and motivation to apply), and so the results of a study based on such a group may not generalize to the population as a whole. Ruhm acknowledges these issues, but they are not his main focus.

Ruhm's concern is that when research economists obsess over the issue of identification and causality, they can end up focusing on small questions where they have a powerful argument for causality, but ignoring large questions where getting a nice dose of randomization so that causality can be inferred is difficult or even impossible. Ruhm writes:
I sent out the following query on social media (Facebook and Twitter) and email: “I would like to get your best examples of IMPORTANT microeconomic questions (in labor/health/public/environmental/education etc.) where clean identification is difficult or impossible to obtain.” Responses included the following.
  • Effects of trade liberalization on the distribution of real wages.
  • Contributions of location, preferences, local policy decisions, and luck to geographic differences in morbidity and mortality rates.
  • Effects of the school climate and work environment on teacher and student outcomes.
  • Importance of norms on firms’ wage setting.
  • Extent to which economic factors explain the rise in obesity.
  • Impact of family structure on child outcomes.
  • Effects of inequality, child abuse, and domestic violence on later life outcomes.
  • Social cost of a ton of SO2 emissions.
  • Effect of race on healthcare use.
  • Effect of climate change on agricultural productivity.
Ruhm argues that for a number of big picture questions, an approach which starts by demanding a nice clear source of randomness for clear identification of a causal factor is going to be too limiting. It can look at slices of the problem, but not the problem as a whole. He writes (footnotes and citations omitted):
For a more concrete indication of the value and limitations of experimental and quasiexperimental approaches, consider the case of the fatal drug epidemic, which is possibly the most serious public health problem in the United States today. To provide brief background, the number of U.S. drug deaths increased from 16,849 in 1999 to 63,632 in 2016 and they have been the leading cause of injury deaths since 2009. The rise in overdose mortality is believed to have been initially fueled by enormous increases in the availability of prescription opioids, with more recent growth dominated by heroin and fentanyl. However, some researchers argue that the underlying causes are economic and social decline (rather than supply factors) that have particularly affected disadvantaged Americans. What role can different methodological approaches play in increasing our understanding of this issue? 
RCTs [randomized control trials] could be designed to test certain short-term interventions—such as comparing the efficacy of specific medication-assisted treatment options for drug addicts—but probably have limited broader applicability because randomization will not be practical for most potential policies and longer term effects will be difficult to evaluate. Quasi-experimental methods have provided useful information on specific interventions, such as the effects of prescription drug monitoring programs and of policies like the legalization of medical marijuana. However, the challenges of using these strategies should not be understated because the results often depend on precise characteristics of the policies and the timing of implementation, which may be difficult to ascertain in practice. Moreover, although the estimated policy impacts are often reasonably large, they are dwarfed by the overall increase in fatal drug overdoses. 
Efforts to understand the root causes of the drug epidemic are therefore likely to be resistant to clean identification and instead require an “all of the above” approach using experimental and quasiexperimental methods where possible, but also the accumulation of evidence from a variety of data sources and techniques, including descriptive and regression analyses that in isolation may fail to meet desired standards of causal inference but, hopefully, can be combined with other investigations to provide a compelling preponderance of evidence. 
The relationship between smoking and lung cancer provides a striking example of an important question that was “answered” using strategies that would be viewed as unacceptable today by the identification police. The understanding of tobacco use as a major causal factor was not based upon RCTs involving humans but rather resulted from the accretion of evidence from a wide variety of sources including: bench science, animal experiments, and epidemiological evidence from nonrandomized prospective and retrospective studies. Quasi-experimental evidence was eventually provided (e.g., from analyses of changes in tobacco taxes) but long after the question had been largely resolved. 
To summarize, clean identification strategies will frequently be extremely useful for examining the partial equilibrium effects of specific policies or outcomes—such as the effects of reducing class sizes from 30 to 20 students or the consequences of extreme deprivation in-utero—but will often be less successful at examining the big “what if ” questions related to root causes or effects of major changes in institutions or policies.
In summing up, Ruhm writes:
Have the identification police become too powerful? The answer to this question is subjective and open to debate. However, I believe that it is becoming increasingly difficult to publish research on significant questions that lack sufficiently clean identification and, conversely, that research using quasi-experimental and (particularly) experimental strategies yielding high confidence but on questions of limited importance are more often being published. In talking with PhD students, I hear about training that emphasizes the search for discontinuities and policy variations, rather than on seeking to answer questions of fundamental importance. At professional presentations, experienced economists sometimes mention “correlational” or “reduced-form” approaches with disdain, suggesting that such research has nothing to add to the canon of applied economics.
Thus, Ruhm is pointing to a tradeoff. Researchers would like to have a study with a strong and defensible methodology, and also a study that addresses a big and important question. Tackling a big question by looking at a bunch of correlations or other descriptive evidence is going to have some genuine limitations--but at least it's looking at fact patterns about a big question. Using a great methodology to tackle a small question will never provide more than a small answer--although there is of course a hope that if lots of researchers use great methods on small questions, the results may eventually form a body of evidence that supports broader conclusions. My own sense is that the subject of economics is hard enough to study that researchers should be willing to consider, with appropriate skepticism, a wide array of potential sources of insight.

Friday, April 5, 2019

Four Snapshots of China's Growth and Inequality

Here's China's share of world population and the global economy since the start of its economic reforms. Since 1978, China's share of world population has declined mildly from 23% to about 19%. In those same 40 years, China's share of world GDP has risen dramatically from 3% to about 20%.
Here's a sense of this economic growth on a per adult basis. The vertical axis is in yuan, so US readers might want to divide by the exchange rate of roughly 6.5 yuan/dollar. But look at the annual growth rates of national income per adult--especially the average of 8.1% per year from 1998-2015.
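As a quick back-of-the-envelope check on what that growth rate implies, compounding 8.1% per year over the 17 years from 1998 to 2015 multiplies income per adult by roughly 3.8 times:

```python
# Compounding the 8.1% average annual growth rate quoted above over 1998-2015.
years = 2015 - 1998              # 17 years
multiple = 1.081 ** years
print(f"Cumulative multiple over {years} years: {multiple:.1f}x")   # about 3.8x
```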

These images are taken from an article by Thomas Piketty, Li Yang and Gabriel Zucman, "Income inequality is growing fast in China and making it look more like the US: Study provides the first systematic estimates of the level and structure of China’s national wealth since the beginning of market reforms," which appears at the LSE Business Review website (April 1, 2019). It's a preview of their forthcoming research article in the American Economic Review. The main focus of their research has been to look at income and wealth inequality--and in particular, data on patterns of wealth in China has been hard to find. 

Here's the pattern of China's national wealth over time, expressed as a share of national income. Wealth includes the value of companies, the value of the housing stock, and other assets. China's wealth as a share of national income has almost doubled since 1978. And all of the increase is a result of rising wealth held by households, not government. 
Finally, here's a figure showing China's shift in income inequality over time. China's economic growth has meant a larger share of income for the top 10%, and a falling share for the bottom 50%. 
The authors write: "To summarise, the level of inequality in China in the late 1970s used to be less than the European average – closer to those observed in the most egalitarian Nordic countries – but they are now approaching a level that is almost comparable with the USA." Of course, it's important to remember that in a Chinese economy that has been growing rapidly for decades, this doesn't mean that the bottom 50% have had stagnant growth in income or are actually worse off in absolute terms. It just means that the growth in incomes for the bottom half hasn't been as rapid as for the top 10%. 

Thursday, April 4, 2019

"Bias Has Been Overestimated at the Expense of Noise:" Daniel Kahneman

Daniel Kahneman (Nobel 2002) is of course known for his extensive work on behavioral biases and how they affect economic decisions. He's now working on a new book, together with Olivier Sibony and Cass Sunstein, in which he focuses instead on the concept of "noise," and argues that bias has been overestimated at the expense of noise.

Here's a précis of Kahneman's current thinking on this and other topics, drawn from an interview with Tyler Cowen (both video and a transcript are available at "Daniel Kahneman on Cutting Through the Noise," December 19, 2018).
KAHNEMAN: First of all, let me explain what I mean by noise. I mean, just randomness. And it’s true within individuals, but it’s especially true among individuals who are supposed to be interchangeable in, say, organizations. ...
I’ll tell you where the experiment from which my current fascination with noise arose. I was working with an insurance company, and we did a very standard experiment. They constructed cases, very routine, standard cases. Expensive cases — we’re not talking of insuring cars. We’re talking of insuring financial firms for risk of fraud.
So you have people who are specialists in this. This is what they do. Cases were constructed completely realistically, the kind of thing that people encounter every day. You have 50 people reading a case and putting a dollar value on it.
I could ask you, and I asked the executives in the firm, and it’s a number that just about everybody agrees. Suppose you take two people at random, two underwriters at random. You average the premium they set, you take the difference between them, and you divide the difference by the average.
By what percentage do people differ? Well, would you expect people to differ? And there is a common answer that you find, when I just talk to people and ask them, or the executives had the same answer. It’s somewhere around 10 percent. That’s what people expect to see in a well-run firm.
Now, what we found was 50 percent, 5–0, which, by the way, means that those underwriters were absolutely wasting their time, in the sense of assessing risk. So that’s noise, and you find variability across individuals, which is not supposed to exist.
And you find variability within individuals, depending morning, afternoon, hot, cold. A lot of things influence the way that people make judgments: whether they are full, or whether they’ve had lunch or haven’t had lunch affects the judges, and things like that.
Now, it’s hard to say what there is more of, noise or bias. But one thing is very certain — that bias has been overestimated at the expense of noise. Virtually all the literature and a lot of public conversation is about biases. But in fact, noise is, I think, extremely important, very prevalent.
There is an interesting fact — that noise and bias are independent sources of error, so that reducing either of them improves overall accuracy. There is room for . . . and the procedures by which you would reduce bias and reduce noise are not the same. So that’s what I’m fascinated by these days.
COWEN: Do you see the wisdom of crowds as a way of addressing noise in business firms? So you take all the auditors, and you somehow construct a weighted average? ...

KAHNEMAN: With respect to the underwriters, I would expect, certainly, that if you took 12 underwriters assessing the same risk, you would eliminate the noise. You would be left with bias, but you would eliminate one source of error, and the question is just price. Google, for example, when it hires people, they have a minimum of four individuals making independent assessments of each candidate. And that reduces the standard deviation of error at least by a factor of two.
COWEN: So is the business world, in general, adjusting for noise right now? Or only some highly successful firms?
KAHNEMAN: I don’t know enough about that. All I do know is that, when we pointed out the results, the bewildering results of the experiment on underwriters, and there was another unit — people who assess the size of claims. Again, actually, it’s more than 50 percent. Like 58 percent. The thing that was the most striking was that nobody in the organization had any idea that this was going on. It took people completely by surprise.
My guess now, that wherever people exercise judgment, there is noise. And, as a first rule, there is more noise than people expect, and there’s more noise than they can imagine because it’s very difficult to imagine that people have a very different opinion from yours when your opinion is right, which it is. ...
COWEN: If you’re called in by a CEO to give advice — and I think sometimes you are — how can I reduce the noise in my decisions, the decisions of the CEO, when there’s not a simple way to average? The firm doesn’t have a dozen CEOs. What’s your advice? ...
KAHNEMAN: [T]here is one thing that we know that improves the quality of judgment, I think. And this is to delay intuition. ... Delaying intuition until the facts are in, at hand, and looking at dimensions of the problem separately and independently is a better use of information.

The problem with intuition is that it forms very quickly, so that you need to have special procedures in place to control it except in those rare cases ...  where you have intuitive expertise. That’s true for athletes — they respond intuitively. It’s true for chess masters. It’s true for firefighters ... I don’t think CEOs encounter many problems where they have intuitive expertise. They haven’t had the opportunity to acquire it, so they better slow down. ... It’s not so much a matter of time because you don’t want people to get paralyzed by analysis. But it’s a matter of planning how you’re going to make the decision, and making it in stages, and not acting without an intuitive certainty that you are doing the right thing. But just delay it until all the information is available.
COWEN: And does noise play any useful roles, either in businesses or in broader society? Or is it just a cost we would like to minimize?
KAHNEMAN: There is one condition under which noise is very useful. If there is a selection process, evolution works on noise. You have random variation and then selection. But when there is no selection, noise is just a cost. ... Bias and noise do not cover the universe. There are other categories.
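To make the arithmetic of Kahneman's noise measure concrete, here is a minimal simulation sketch in Python. The premium level and the spread of the hypothetical underwriters' quotes are invented and not calibrated to the 50% figure in the interview; only the definition of the measure (the absolute difference between two randomly chosen judgments, divided by their average) and the square-root-of-n logic behind averaging several independent assessments follow the discussion above.

import random
import statistics

random.seed(0)
true_premium = 10_000
# 50 hypothetical underwriters whose quotes scatter around a common value.
quotes = [random.gauss(true_premium, 2_500) for _ in range(50)]

def pairwise_noise(quotes, n_pairs=10_000):
    """Average relative difference between two randomly chosen quotes."""
    ratios = []
    for _ in range(n_pairs):
        a, b = random.sample(quotes, 2)
        ratios.append(abs(a - b) / ((a + b) / 2))
    return statistics.mean(ratios)

print(f"Typical difference between two random underwriters: {pairwise_noise(quotes):.0%}")

# Averaging independent judgments shrinks noise-driven error by roughly sqrt(n),
# which is the logic behind the "factor of two" for four independent assessments.
for n in (1, 4, 12):
    means = [statistics.mean(random.gauss(true_premium, 2_500) for _ in range(n))
             for _ in range(5_000)]
    print(f"n = {n:2d} assessors: std dev of the averaged judgment is roughly {statistics.stdev(means):,.0f}")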

Replacing LIBOR: An International Overview

LIBOR stands for "London Interbank Offered Rate." For a long time, it was probably the most common benchmark interest rate in the world--that is, it was built into trillions of dollars' worth of loans and financial contracts, so that if the LIBOR interest rate went up or down, the contract would adjust accordingly.

However, a huge scandal erupted back in 2010. Turns out that the LIBOR was not based on actual market transactions; instead, LIBOR was based on a survey in which someone at a bank gave a guess on what interest rate their bank would be charged if the bank wanted to borrow short-term from another bank on a given morning, in a particular currency. A few of the people responding to the survey were intentionally giving answers that pulled LIBOR up just a tiny bit one day, or pulled it down a tiny bit another day. Given that the LIBOR was linked to trillions of dollars in financial contracts, market traders who knew in advance about these shifts could and did reap fraudulent profits.

LIBOR tightened up its survey methods. But it clearly made sense to shift away from using a benchmark interest rate based on a survey, and instead to use one based on an actual market for short-term low-risk borrowing. Various committees formed to consider options. As I noted here about six weeks ago, the US is switching from LIBOR to SOFR--the Secured Overnight Financing Rate. I wrote: "It refers to the cost of borrowing which is extremely safe, because the borrowing is only overnight, and there are Treasury securities used as collateral for the borrowing. The SOFR rate is based on a market with about $800 billion in daily transactions, and this kind of overnight borrowing doesn't just include banks, but covers a wider range of financial institutions. The New York Fed publishes the SOFR rate every morning at 8 eastern time."
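To see in miniature how a benchmark rate like LIBOR or SOFR feeds through into a contract, here is a minimal sketch; the loan balance, the contractual spread, and the benchmark readings are all invented for illustration.

def quarterly_interest(balance, benchmark_rate, spread):
    """Interest due for one quarter when the contract pays benchmark + spread."""
    annual_rate = benchmark_rate + spread
    return balance * annual_rate / 4

balance = 1_000_000      # hypothetical loan principal
spread = 0.015           # hypothetical contractual spread of 150 basis points

# If the benchmark moves up by 50 basis points, the payment adjusts automatically.
for benchmark in (0.020, 0.025):
    payment = quarterly_interest(balance, benchmark, spread)
    print(f"benchmark at {benchmark:.2%} -> quarterly interest of ${payment:,.0f}")

Real SOFR-linked contracts typically compound a daily overnight rate over the interest period rather than using a single quarterly reading, which is one of the transition details the financial industry has had to work out.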

But what about the switch away from LIBOR in the rest of the world? Andreas Schrimpf and Vladyslav Sushko describe what's happening in "Beyond LIBOR: a primer on the new benchmark rates," which appears in the March 2019 issue of the BIS Quarterly Review (pp. 29-52). Here's a table showing the alternative risk-free rate (RFR) benchmarks being used with other currencies.
There are several big issues ahead in this area. One is that LIBOR is actually going to be discontinued in 2021, so any loan or financial contract that uses LIBOR as its benchmark rate will have to migrate to something else. There will be literally trillions of dollars of contracts that need to shift in this way. Moreover, the LIBOR debacle has made a lot of financial industry participants think more carefully about exactly what benchmark interest rate may be appropriate in any given contract--for example, an appropriate benchmark might include not only an overnight risk-free rate, but also some built-in adjustment for other kinds of risks, including risks over different periods of time or risks at the firm or industry level.

For most of us, discussions of benchmark interest rates have a high MEGO (My Eyes Glaze Over) factor. But when I think in terms of trillions of dollars of loans and financial contracts around the world, all being adjusted in ways that are thoughtful but untested, I find it easier to pay attention to the subject.

Wednesday, April 3, 2019

Some LGBT Economics in High Income Countries

Every few years, the OECD puts out its Society at a Glance report. The later chapters offer comparisons across high-income countries on a variety of economic, demographic, health, education, and social variables. For the 2019 edition, the first chapter is on the more specific topic, "The LGBT challenge: How to better include sexual and gender minorities?" Here, I'll focus on labor market issues, but there is more in the chapter on other issues.

This figure summarizes the results of 46 studies of differences in employment and wages for LGBT people across the OECD countries. The usual method in these studies is to adjust for lots of observable factors: age, education, race/ethnicity, children in the household, hours worked, occupation/industry, location (like urban or rural), and so on. The horizontal axis shows various groups. The figure then shows the gap that remains after adjusting for these factors, in employment rates, labor earnings, and the extent to which each group is found in high managerial roles.
A common pattern in these studies is that the gaps look worse for men than for women. Indeed, LGB women, or just lesbians as a group, have positive employment and labor earnings gaps compared to the rest of the population.

As social scientists have long recognized, this kind of "gap" study suggests the presence of discrimination, but it neither proves that discrimination exists nor, just as important, points to the main locus of discrimination. The "gap" doesn't measure anything directly: it's just what is left over after accounting for the other factors on which data existed.
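For readers who want to see mechanically what "the gap that remains after adjusting" means, here is a minimal sketch on simulated data. Every number in it is invented, including the assumed earnings gap and the assumed schooling difference; the point is only that the reported gap is the coefficient on the group indicator once the controls are included in the regression.

import numpy as np

rng = np.random.default_rng(0)
n = 5_000
lgb = rng.binomial(1, 0.05, n)                 # group indicator (hypothetical 5% share)
age = rng.uniform(25, 60, n)
education = rng.normal(14, 2, n) + 1.0 * lgb   # assume, for illustration, a schooling difference

# Simulated log earnings: returns to age and education, plus an assumed -8% gap.
log_wage = 1.0 + 0.01 * age + 0.08 * education - 0.08 * lgb + rng.normal(0, 0.3, n)

# "Raw" gap: simple difference in means, before any adjustment.
raw_gap = log_wage[lgb == 1].mean() - log_wage[lgb == 0].mean()

# "Adjusted" gap: coefficient on the indicator in an OLS regression with controls.
X = np.column_stack([np.ones(n), lgb, age, education])
coefs, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
adjusted_gap = coefs[1]

print(f"raw gap in log earnings:      {raw_gap:+.3f}")
print(f"adjusted gap (with controls): {adjusted_gap:+.3f}")

As the discussion above stresses, the leftover coefficient is consistent with discrimination but does not, by itself, establish it or say where it occurs.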

Even if the gap does result from discrimination, a "gap" study is uninformative about whether the main force of that discrimination hits early in life, perhaps in treatment in schools and families, or whether it's mainly because of discrimination by employers later in life.  The OECD study offers a useful example: "For instance, the fact that lesbians and gay men in Sweden display lower employment rates in regions with more hostile attitudes toward homosexuals may simply reflect that more productive lesbians and gay men are more likely to move out of regions showing low acceptance of homosexuality."

Thus, it's common to complement "gap" studies with other methods that provide more direct evidence of discriminatory behavior. For example, in a "correspondence" study, the researcher sends out job applications in response to real job ads. The applications are functionally identical, except that some of them include information that could lead an employer to infer sexual orientation or gender identity--say, listing a membership in a certain volunteer organization, or giving the name of a job candidate's partner in a way that is likely to lead to inferences about the sex of that partner. The OECD describes the results of 13 studies across 10 countries taking this approach:
Homosexual female and male applicants are 1.5 times less likely to be invited to a job interview when sexual orientation is conveyed through their volunteer engagement or work experience in a gay and/or lesbian organisation. By contrast, insisting on the family prospects of female fictitious candidates by signalling homosexuality through the sex of the candidate’s partner leads to the virtual disappearance of hiring discrimination against lesbians. This pattern could reflect that employers attach a lower risk of maternity to lesbians relative to heterosexual women and are therefore less inclined to discriminate against them ...
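The phrase "1.5 times less likely" is easiest to read as a ratio of callback rates. Here is a minimal illustration with invented counts; only the ratio interpretation follows the quote above.

# Invented numbers, used only to illustrate the ratio interpretation.
control_callbacks, control_applications = 150, 1_000    # applications with no orientation signal
signal_callbacks, signal_applications = 100, 1_000      # applications signalling orientation

control_rate = control_callbacks / control_applications   # 15%
signal_rate = signal_callbacks / signal_applications      # 10%
print(f"callback rates: {control_rate:.0%} vs {signal_rate:.0%}, "
      f"a ratio of {control_rate / signal_rate:.1f}")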
Correspondence studies have also been done in the market for rental housing--that is, applying for apartments rather than jobs.
In the rental housing market, correspondence studies show that homosexual couples get fewer responses and invitations to showings from the landlords than heterosexual couples, a result mainly driven by male same-sex partners – see Ahmed, Andersson, & Hammarstedt (2008[38]) and Ahmed & Hammarstedt (2009[39]) in Sweden; Lauster & Easterbrook (2011[40]) in Canada; U.S. Department of Housing and Urban Development (2013[41]) in the United States and Koehler, Harley, & Menzies (2018[24]) in Serbia. In Serbia, for instance, almost one in five (18%) of same-sex couples were refused rental of an apartment by the landlord, while none of the opposite-sex couples were. This average result masks strong disparities by gender: 29% of male same-sex couples were rejected, as opposed to only 8% of female same-sex couples. The absence (or lower magnitude) of discrimination against female same-sex couples could flow from landlords’ well documented preference for female rather than male tenants (Ahmed, Andersson and Hammarstedt, 2008[38]). In this setting, the benefit of having two women as tenants could counterbalance the perceived cost of renting to a lesbian couple.

Other experiments create situations in which people need help in some way: asking passers-by for money, or using a "wrong number" or "lost letter" approach.

"In the United Kingdom, various experiments have also involved actors wearing a T-shirt with either a pro-gay slogan or without any slogan. These actors approach passers-by asking them to provide change. The findings point to less help provided to the ostensibly pro-gay person."

In the "wrong-number" approach, households get a call from someone who says their car has broken down, they are at a payphone, they are out of change, and they have now called a wrong number. They ask the person receiving the call to place a call to their boyfriend or girlfriend. Those who ask for a call to be placed to someone of the opposite sex are more likely to get help than those who ask for a call to be placed to someone of the same sex.

In the “lost-letter technique,” a number of unmailed letters, with addresses and stamps, are dropped in city streets. Some of the letters are addressed to LGBT organizations; some are not. Those that are not are more likely to be dropped in the mail by whoever finds them.

For most noneconomists, discrimination is just morally wrong, and that's enough. Economics can add that discrimination also leads to underuse and misallocation of society's human resources, which imposes costs on the economy as a whole.

Tuesday, April 2, 2019

The Unfairness of Money Bail

About 40 years ago, when I was a junior on the high school debate team, we argued for the abolition of the money bail system. Like many positions taken by high school juniors in debate tournaments, our arguments were sweeping and simplistic. But we were correct in recognizing that there are real problems with money bail.

As one example, 14 elected prosecutors wrote a joint letter to New York state lawmakers on March 6, 2019.  The prosecutors  wrote:
We support ending money bail because safety, not wealth, should be the defining feature of the pretrial justice system. Three out of every four people in New York cannot afford to pay the bail amount that the judge sets at their arraignment. That means many people are jailed simply because they are too poor to purchase their freedom. … The only people who should be detained pretrial are those who a judge finds pose a specific, clear and credible threat to the physical safety of the community, or who are a risk of intentionally evading their court dates. Jails across New York frequently are over-capacity, and they are filled with people who do not need to be there. … Research shows that people who spend even a short period in jail, as opposed to being released pretrial, are more likely to commit a future crime. This makes sense. Jail is traumatizing. Jobs are lost. Families can’t pay rent. For reasons big and small, people who are away from their family, their job, and their community become more vulnerable and less stable.

Patrick Liu, Ryan Nunn, and Jay Shambaugh provide a useful backgrounder on this subject in "The Economics of Bail and Pretrial Detention," written for the Hamilton Project at the Brookings Institution (December 2018). Will Dobbie and Crystal Yang have now offered "Proposals for Improving the U.S. Pretrial System," written as a Hamilton Project policy proposal (March 2019). Here's an overview comment from the conclusion of the Liu, Nunn, and Shambaugh paper:
"Bail has been a growing part of the criminal justice system. Nonfinancial release has been shrinking, and more and more defendants are using commercial bonds as a way to secure their release while awaiting trial. Bail can make it more likely that defendants will reappear in court, and as such reduce costs for the criminal justice system. There are, however, extensive costs. Beyond the direct costs of posting the bail, either from paying a fee or having to liquidate assets, widespread use of bail has meant that many people are incarcerated because they are unable to post bail.

"Nearly half a million people are in jail at any given time without having been convicted of a crime. The overwhelming majority of these people are eligible to be released—that is, a judge has deemed that they are safe to be released—but are unable to raise the funds for their release. The impact of monetary bail falls disproportionately on those who are low-income, cannot post bail out of liquid assets, and thus often remain in jail for extended periods. Furthermore, as a growing body of literature has shown, the assignment of financial bail increases the likelihood of conviction due to guilty pleas, and the costs—to both individuals and society— from extra convictions can be quite high."
Let's spell some of this out more explicitly.

Nearly half a million people are incarcerated on any given day without having been convicted of a crime. Add it all up, and over 10 million people during a given year are locked up without being convicted of anything. Roughly one-quarter of all inmates in state and local jails have not been convicted. Here's a figure from Liu, Nunn, and Shambaugh:


In the last few decades, the use of money bail has been rising. As Dobbie and Yang write (figures and references omitted):
The high rate of pretrial detention in the United States in recent years is largely due to the increasing use of monetary or cash bail—release conditional on a financial payment—and the corresponding decreasing use of release on recognizance (ROR), a form of release conditional only on one’s promise to return to the court. The share of defendants assigned monetary bail exceeded 40 percent in 2009 in the set of 40 populous U.S. counties where detailed data are available, an 11 percentage point–increase from 1990. The fraction of defendants released on their own recognizance decreased by about 13 percentage points over the same period in these counties, with only 14 percent of defendants being released with no conditions in 2009. The widespread use of monetary bail directly leads to high pretrial detention rates in most jurisdictions because many defendants are unable or unwilling to pay even relatively small monetary bail amounts. In New York City, for example, an estimated 46 percent of all misdemeanor defendants and 30 percent of all felony defendants were detained prior to trial in 2013 because they were unable or unwilling to post bail set at $500 or less.
The time that accused people spend in pretrial detention can be significant. Liu, Nunn, and Shambaugh write:
[T]he amount of time that a person is detained if they are unable to afford bail is substantial, ranging from 50 to 200 days, depending on the felony offense. The pretrial detention period is also growing ... From 1990 to 2009, the median duration of pretrial detention increased for every offense, ranging from an increase of 34 percent for burglary to 104 percent for rape. ... Even for durations that are relatively short—for example, 54 days for those accused of a driving-related felony—pretrial detention represents a nearly two-month period during which individuals are separated from their families and financial hardships are exacerbated. Moreover, the typical wait until trial is much longer in some places than others (e.g., 200 days in one sample of Pennsylvania counties).
Dobbie and Yang also point out that when it comes to international comparisons, the US locks up accused people before trial at a much higher rate than other countries. This figure shows the number of people detained pretrial per 100,000 population. The US is the tallest bar on the far right.

Sorting out the costs and benefits of different levels of pretrial detention isn't easy. The direct costs of holding people in jails and prisons are straightforward. But how many of those accused people would not have appeared before the court? If they did not appear for court, how would the cost of finding them have compared to the cost of locking them up for days--and in some cases for weeks or even months? What are the additional costs of being locked up in terms of loss of employment opportunities, or stresses on families? How many of those would have committed crimes if not detained? (And how comfortable are we as a society with locking people up not because they have been convicted of a crime, but because we suspect they might commit a crime in the future?) When thinking about conditions of pretrial release, judges are supposed to take all of this into account: for example, the presumption that an accused person is innocent, the risk of the person not showing up for trial and the costs of finding them, the risk of the person committing another crime if they are not detained, the person's social ties to the community, and the person's economic ability to put up a monetary bond.

To figure out the effects of different methods of pretrial detention, a social scientist might ideally like to take a large pool of people accused of crimes and conduct a randomized experiment, in which some randomly get offered differing levels of money bail, some are released on their own recognizance, and we see what happens. While it would be grossly inappropriate for the justice system to plan to operate in this way, it turns out that this randomized experiment is being conducted by reality.

Decisions about whether to offer bail, or at what level, are not made consistently across the judicial system. When there are multiple judges in a given court, some will tend to be tougher in granting bail and some will be easier, so whether defendants like it or not, they are living in a randomized experiment depending on the judge to whom they are randomly assigned. In addition, the evidence shows that even the same judge will not always treat accused people with seemingly identical characteristics in the same way, which adds another element of randomness. Thus, research in this area can start by setting aside those who are essentially always granted bail or essentially never granted bail, and instead focus on those with seemingly identical characteristics who are more-or-less randomly granted bail in some cases but not in others.
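Here is a minimal sketch, on simulated data, of the logic that paragraph describes: defendants are assigned to judges more or less at random, judges differ in how readily they release, and that variation can be used to estimate the effect of release itself. Every number below is invented, and the actual studies use more careful instrumental-variables versions of this comparison.

import numpy as np

rng = np.random.default_rng(1)
n_judges, n_cases = 20, 20_000

leniency = rng.uniform(0.4, 0.8, n_judges)        # each judge's propensity to release
judge = rng.integers(0, n_judges, n_cases)         # cases assigned to judges at random
released = rng.random(n_cases) < leniency[judge]   # the release decision

# Assume, purely for illustration, that pretrial release raises later employment by 10 points.
employed = rng.random(n_cases) < (0.40 + 0.10 * released)

# Reduced form: defendants who happened to draw a more lenient judge do better on average.
lenient_judge = leniency[judge] > np.median(leniency)
reduced_form = employed[lenient_judge].mean() - employed[~lenient_judge].mean()

# First stage: how much more often the lenient judges actually release.
first_stage = released[lenient_judge].mean() - released[~lenient_judge].mean()

print(f"reduced form (employment gap by judge type): {reduced_form:+.3f}")
print(f"first stage (release gap by judge type):     {first_stage:+.3f}")
print(f"implied effect of release on employment:     {reduced_form / first_stage:+.3f}")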

Dobbie and Yang have been among the leading researchers in this area, and they describe the results of this research in their paper.  For example:

Those who are detained pretrial are more likely to be found guilty, mainly because those who are detained pretrial are more likely to take a plea bargain--which may include credit for time already served. Pretrial detention clearly reduces the risk of pretrial flight and pretrial crime, but at least in some studies, greater exposure to jail time before the trial is associated with a rise in posttrial crime. Defendants who, because of the randomness in the system, are released pretrial rather than being held are more likely to have income and to be employed 2-4 years later. In some jurisdictions, the process of granting bail also shows racial disparities.

For those of us who aren't ready to take the plunge and eliminate the money bail system altogether, what might we do to move in the direction of reducing the use of money bail and rationalizing the system? Dobbie and Yang offer some proposals based on the existing research.

Some are pretty simple. When defendants are released on their own recognizance before trial, set up a system of text messages or emails to remind them of their court date. For low-risk crimes, make greater use of issuing citations rather than making arrests, and when people are arrested, lean toward releasing them on their own recognizance. For those defendants where a higher degree of monitoring seems appropriate, make greater use of electronic or personal monitoring.

Some more complex proposals involve machine learning. It's now possible to plug data on the characteristics of those who get bail, or are released on their own recognizance, into a computer algorithm, which can look for patterns in who is more or less likely to flee before trial, or more or less likely to commit crimes. The feedback based on these studies can then be turned over to judges, so they have a systematic sense of how they ruled in past similar cases, and how their rulings compare with those of other judges in similar cases. It's easy to feel queasy about this approach. Are we going to let the results of computer number-crunching play a substantial role in whether people are granted bail? But computer number-crunching may offer greater clarity and consistency than at least some judges, and could help produce results that let more people out before trial while also leading to lower rates of pretrial flight and crime. Pilot tests along these lines, in jurisdictions willing to give it a try, seem warranted.
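As one illustration of what such a tool involves, here is a minimal sketch that trains a simple risk model on simulated data. The features, the coefficients that generate the outcomes, and the two hypothetical defendants at the end are all invented; real pretrial risk tools are built from actual case records, validated and audited far more carefully, and, as discussed above, raise their own fairness questions.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 10_000
age = rng.uniform(18, 70, n)
prior_arrests = rng.poisson(1.5, n)
prior_ftas = rng.poisson(0.3, n)                 # prior failures to appear

# Invented data-generating process for failure to appear (FTA) if released.
logit = -2.0 - 0.02 * (age - 18) + 0.3 * prior_arrests + 0.8 * prior_ftas
fta = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([age, prior_arrests, prior_ftas])
model = LogisticRegression().fit(X, fta)

# Predicted risk for two hypothetical defendants a judge might see.
new_cases = np.array([[22, 4, 2],                # younger, several priors, prior FTAs
                      [55, 0, 0]])               # older, no priors
for features, risk in zip(new_cases, model.predict_proba(new_cases)[:, 1]):
    print(f"features {features} -> predicted failure-to-appear risk of {risk:.1%}")

In the spirit of the paragraph above, output like this would be presented to judges as feedback alongside their own track record, not as an automatic decision.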