Thursday, August 13, 2020

Sounding an Alarm on Data Mining

Back in 2013, someone going under the nom de plume of Economist Hulk wrote on Twitter (all caps, natch): "WHEN FACTS CHANGE, HULK SMASH FACTS UNTIL THEY FIT HIS PRE-CONCEIVED THEORY. HULK CALL THIS ‘ECONOMETRICS’."

Gordon Tullock offered an aphorism in a similar spirit, which he attributed to verbal comments from Ronald Coase ("A Comment on Daniel Klein's 'A Plea to Economists Who Favor Liberty,'" Eastern Economic Journal, Spring 2001, 27:2, pp. 203-207). Tullock wrote: "As Ronald Coase says, 'if you torture the data long enough it will confess.'"

Ronald Coase (Nobel 1991) put the point just a little differently in a 1981 lecture, "How Should Economists Choose?", while attributing a similar point to Thomas Kuhn. Coase wrote (footnotes omitted): 
In a talk I gave at the University of Virginia in the early 1960s ... I said that if you torture the data enough, nature will always confess, a saying which, in a somewhat altered form, has taken its place in the statistical literature. Kuhn puts the point more elegantly and makes the process sound more like a seduction: "nature undoubtedly responds to the theoretical predispositions with which she is approached by the measuring scientist." I observed that a failure to get an exact fit between the theory and the quantitative results is not generally treated as calling for the abandonment of the theory but the discrepancies are put on one side as something calling for further study. Kuhn says this: "Isolated discrepancies ... occur so regularly that no scientist could bring his research problems to an end if he paused for many of them. In any case, experience has repeatedly shown that in overwhelming proportion, these discrepancies disappear upon closer scrutiny." Because of this, Kuhn argues that "the efficient procedure" is to ignore them, a conclusion economists will find it easy to accept.
Gary Smith offers an overview of what these issues are all about in "Data Mining Fool's Gold" (Journal of Information Technology, posted "Online First" as a forthcoming paper on May 22, 2020, subscription needed for access). Smith offers what he calls "the paradox of big data":
It is tempting to believe that patterns are unusual and their discovery meaningful; in large data sets, patterns are inevitable and generally meaningless. ... Data-mining algorithms—often operating under the label artificial intelligence—are now widely used to discover statistical patterns. However, in large data sets streaks, clusters, correlations, and other patterns are the norm, not the exception. While data mining might discover a useful relationship, the number of possible patterns that can be spotted relative to the number that are genuinely useful has grown exponentially—which means that the chances that a discovered pattern is useful is rapidly approaching zero. This is the paradox of big data:
It would seem that having data for a large number of variables will help us find more reliable patterns; however, the more variables we consider, the less likely it is that what we find will be useful.
Along with useful background discussion, Smith offers some vivid examples of data mining gone astray. For instance, Smith put a data-mining algorithm to work on President Donald Trump's tweets in the first three years of his term. He found: 
It turned out that the S&P 500 index of stock prices is predicted to be 97 points higher 2 days after a one-standard-deviation increase in Trump’s use of the word president, ... my 10-fold cross-validation data-mining algorithm discovered that the low temperature in Moscow is predicted to be 3.30°F higher 4 days after a one-standard-deviation increase in Trump’s use of the word ever, and that the low temperature in Pyongyang is predicted to be 4.65°F lower 5 days after a one-standard-deviation increase in the use of the word wall. ... I considered the proverbial price of tea in China. I could not find daily data on tea prices in China, so I used the daily stock prices of Urban Tea, a tea product distributer headquartered in Changsha City, Hunan Province, China, with retail stores in Changsha and Shaoyang that sell tea and tea-based beverages. The data-mining algorithm found that Urban Tea’s stock price is predicted to fall 4 days after Trump used the word with more frequently.
Indeed, Smith created a random variable, put it into the data-mining algorithm, and found: 
The data-mining algorithm found that a one-standard deviation increase in Trump’s use of the word democrat had a strong positive correlation with the value of this random variable 5 days later. The intended lessons are how easy it is for data-mining algorithms to find transitory patterns and how tempting it is to think up explanations after the fact. ... That is the nature of the beast we call data mining: seek and ye shall find.
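Smith's point, and the "paradox of big data" above, are easy to reproduce. Here is a minimal simulation sketch (my own illustration, not code from Smith's paper): mine enough purely random "predictor" series against a purely random target, and a seemingly impressive correlation will always turn up.

```python
# A minimal sketch (not Smith's code): mining many noise series against a noise target.
import numpy as np

rng = np.random.default_rng(0)
n_days, n_words = 750, 500            # roughly 3 years of daily data, 500 candidate "word" series
target = rng.normal(size=n_days)      # purely random target (a stand-in for a market or weather series)
candidates = rng.normal(size=(n_words, n_days))   # purely random candidate predictors

corrs = np.array([np.corrcoef(c, target)[0, 1] for c in candidates])
print(f"strongest correlation found: {np.abs(corrs).max():.2f}")
# With 500 noise series and n = 750, the best |correlation| typically lands around 0.12-0.15,
# far beyond the ~0.07 threshold for conventional "statistical significance" at this sample size,
# even though every relationship here is meaningless by construction.
```

The more candidate series you add, the more impressive the best spurious correlation becomes, which is exactly the paradox Smith describes.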
At this point, a common response is something like: "Well, of course it's possible to find correlations that don't mean anything. What about the correlations that do mean something?" The response is true enough in some sense: perhaps in looking at a bucket-full of correlations, one of them may suggest a theory that can be tested in various ways to see if it has lasting power. But also notice that when you start making judgments that certain statistical findings "mean something" and other statistical findings derived in exactly the same way do not "mean something," you aren't actually doing statistics any more. The data isn't telling you the answer: you are deciding on other grounds what the answers are likely to be. 

Smith's example about Trump's tweets is not just a hypothetical example. Several studies have tried to determine whether Trump's tweets--or a selection of Google search terms, or other big data sets--move the stock market in some systematic way. 

Two other examples from Smith involve investment companies called Equabot and Voleon. They were going to run their investments with data-mining tools, not human intuition. But a funny thing happened on the way to the profits: returns at both companies somewhat underperformed the S&P 500. One plausible explanation is that the data-mining algorithms kept finding correlations that weren't lasting or valid, so when the algorithm made investments on the basis of those correlations, it was either investing mostly randomly, like an index fund, or even counterproductively. 

Or you may remember that about 10 years back, Google Flu Trends sought to predict flu outbreaks by looking at Google searches. The early evidence looked promising: "However, after issuing its report, Google Flu Trends over-estimated the number of flu cases for 100 of the next 108 weeks, by an average of nearly 100% (Lazer et al., 2014). Google Flu Trends no longer makes flu predictions." As another example, a British insurance company in 2016 decided to base its car insurance rates on data mined from Facebook posts, and found that it was charging different rates according to whether people were more likely to mention Michael Jordan or Leonard Cohen, a process "which humans would recognize as ripe with errors and biases." 

I do not mean to imply that using algorithms to search systematically through big data is not useful, just that the results must be interpreted with care and precision. Otherwise, ridiculousness can result. For some discussions of how big data and machine learning can be used in economics, starting points from the journal that I edit include: 

Tuesday, August 11, 2020

The Rise of the US Social Insurance State

One sometimes hears a reference to "the rise of the welfare state," but for the US in the 20th century, the pattern is more accurately described as a "rise of the social insurance state"--which isn't quite the same thing. Price V. Fishback uses this insight as a stepping-stone in "Social Insurance and Public Assistance in the Twentieth-Century United States," which he delivered as the Presidential Address to the Economic History Association last year and which has now been published in the Journal of Economic History (June 2020, 80:2, pp. 311-350). Fishback's argument unfolds in three parts: 

First, what many people call the rise in the welfare state in the United States might better be described as the “Rise of the Social Insurance State.”  “Rise of the Welfare State” could be seen as a rise in public assistance  transfers to the poor. Yet, such transfers have not been very large relative to gross domestic product (GDP). A very large share of the expansion in social welfare spending has come from social insurance that covers people throughout the income distribution. The social insurance programs were much easier to sell politically and carried much less social stigma because participants felt that they or their employers had paid for their benefits up front.
Here's an illustrative figure. Social insurance spending in the 20th century goes from 1% of GDP to 10%. If one includes the rise in private health insurance and private pensions--both encouraged by the government through favorable tax treatment and friendly regulations--the rise of the social insurance state would look even bigger. In contrast, work relief (during the Great Depression) and public assistance programs for the poor (in the later 20th century) are much smaller.  


Fishback's second point suggests that total US spending on "social welfare" categories is similar to that of European countries--the US just has more of this spending happening in the private sector. He writes: 
Second, America is commonly perceived to spend much less on social welfare than many European countries. This perception arises because most comparisons focus on gross public social welfare spending. In fact, after taking into account taxation, public mandates, and private spending, the United States in the late twentieth century spent a higher share on combined private and net public social welfare relative to GDP than did most advanced economies. Americans just did it differently because the governments operated a safety net system that relied to a much greater extent on private insurance and pensions and taxed lower income people less heavily.
Fishback points out that in OECD statistics, "government social welfare expenditures include old-age pensions, survivor benefits (not from private life insurance), incapacity-related aid, health expenditures, aid to families, unemployment benefits, income maintenance, government job training, and housing subsidies." But in the US, of course, a large share of health expenditures and old-age pensions happens through the private sector. When you include both public and private spending in these areas, total US spending looks much more similar to countries like Germany, Sweden, and France. 

This point is well-taken. But of course, if total US spending on these social welfare categories is close to that of these other countries, even if lagging a bit behind, and US spending on health care as a share of GDP is much bigger than in these other countries, then it follows that US social welfare spending in the non-health categories must be lagging well behind. 
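The arithmetic behind that inference is simple. Here is a back-of-the-envelope sketch with illustrative round numbers (my own, not Fishback's estimates):

```python
# Illustrative round numbers (not Fishback's estimates), in percent of GDP.
total_social_us = 30.0        # combined public + private social welfare spending, US (hypothetical)
total_social_peer = 30.0      # a peer country with roughly the same combined total (hypothetical)

health_us = 17.0              # US health spending share of GDP (roughly the right order of magnitude)
health_peer = 11.0            # peer-country health spending share (hypothetical)

print("US non-health social spending:  ", total_social_us - health_us, "% of GDP")      # 13.0
print("Peer non-health social spending:", total_social_peer - health_peer, "% of GDP")  # 19.0
# If the totals are about equal but the US health component is much larger,
# the US non-health categories must be correspondingly smaller.
```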

Finally, Fishback sets out to disentangle some of the complexity that arises because many US social welfare programs are run through the states, which can have different rules. Fishback writes: 

Third, the United States has 51 or more social welfare systems that set target minimum benefits for public assistance for the elderly and families in poverty and maximum benefits for social insurance programs. ... [T]he spending is designed to fill the gap between a target level of living and the household’s resources. ... I develop estimates of these target benefits for each state for several years in the twentieth century and compare them to several measures of economic well-being: the national poverty line before and after adjusting for cross-sectional differences in cost of living, state  manufacturing earnings and benefits, state per capita incomes, and per capita GDPs in countries around the world. The comparisons show that the public assistance targets have been more generous to the elderly poor than to poor families. 

In broad terms, Social Security and Medicare--both of which can be categorized as "social insurance" programs--have been a substantial boost for the elderly poor. However, the social insurance and public assistance programs aimed at families--for example, cash welfare payments, unemployment insurance,  and workman's compensation--pay much less in the United States than in many other countries. 

Monday, August 10, 2020

Randomness in the Location of Tech Clusters

It seems as if every city wants to be the center of a tech cluster. When Amazon announced a couple of years back that it was taking bids from cities for the location of a second headquarters, 238 cities applied. But other than hoping Amazon falls into your lap, how could a city create conditions for a tech cluster to emerge? And given that most cities are not going to become tech clusters, what's the next-best strategy? William R. Kerr and Frederic Robert-Nicoud discuss what we know about this subject in "Tech Clusters" (Journal of Economic Perspectives, Summer 2020, 34:3, pp. 50-76). 

What are the current US tech clusters? Looking at a variety of definitions, they write: 
Six cities appear to qualify under any aggregation scheme: San Francisco, Boston, Seattle, San Diego, Denver, and Austin all rank among top 15 locations for venture capital and for patents (scale) and hold shares for venture capital, patents, employment in R&D-intensive sectors, and employment in digital-connected occupations that exceed their population shares (density). They also pass our highly rigorous “sniff test”—that is, they just make sense. Washington, Minneapolis-St. Paul, and Raleigh-Durham would join the list if relaxing the expectation that the share of venture investment exceed population share (which is hard due to the very high concentration in San Francisco). New York and Los Angeles are more ambiguous: they hold large venture capital markets (and venture investors frequently declare them leading tech clusters), but their patents and employment shares in key industries and fields are somewhat less than their population shares. Were we to disaggregate these huge metro areas, we would likely identify a sub-region that would independently fit on this short list by still holding sufficient scale and yet having achieved a more recognizable density. Said differently, there is surely a part of New York and Los Angeles that would be stand-alone equal to or greater than Austin (for example, Egan et al. 2017). Chicago’s activity is mostly equal to its population share or less.
What about trying to create a new tech cluster? In looking at why current tech clusters are located where they are, it becomes clear that there is often a large amount of chance involved--that is, decisions made by a few key actors at key moments, sometimes for family reasons, about where an "anchor firm" for the tech cluster would end up being located. For example, before Bill Gates and Paul Allen moved to Seattle, they were living in Albuquerque where Microsoft was based. Kerr and Robert-Nicoud write:  
In most accounts of the origin of tech clusters, such as Klepper’s (2010, 2016) comparisons of Detroit and Silicon Valley, emphasis is given to the initial placement of a few important firms and the spinoff companies they subsequently generate. This outsized influence for anchor firms generates ample room for random influences on the early location decisions vital to a future cluster. For example, William Shockley, who shared a Nobel Prize in Physics for his work on semiconductors and transistors, moved to the San Francisco area to be near his ailing mother. Later, the spinoffs from his firm Shockley Semiconductors included Intel and AMD. Similarly, Moretti (2012) describes how personal factors led Bill Gates and Paul Allen to move Microsoft from Albuquerque to Seattle, their hometown. At the time, Albuquerque was considered the better place to live, was favored by most of Microsoft’s early employees, and was the location of many early clients. Yet proximity to family won out, and this decision has reverberated well beyond Microsoft’s direct employment. The agglomeration advantages sparked by Microsoft have attracted countless other tech firms to Seattle, including Jeff Bezos relocating from New York City to Seattle when he founded Amazon. Had Gates and Allen not moved home to Seattle, Albuquerque might be home to two of America’s three most valued companies in 2020.

A similar and related randomness arises due to the often-serendipitous nature of breakthrough discoveries and their outsized subsequent importance. Zucker, Darby, and Brewer (1998) show that the location of biotech industry follows the positioning of star scientists in the nascent field, and the surging prominence of Toronto for artificial intelligence harkens back to the choice of some key early researchers to locate there, well before the field became so prominent. Duranton (2007) formalizes how random breakthroughs could lead to shifts in the leadership of cities for a tech field or industry, such as the migration of semiconductors from Boston to Silicon Valley. ...
It seems important to note that the process of becoming a tech center doesn't necessarily favor the biggest cities. Indeed, mid-sized cities may have certain advantages. 
While random sparks play a role, the same breakthroughs often occur contemporaneously in two or more locations (Ganguli, Lin, and Reynolds 2019). Accordingly, a new line of work considers the factors that shape which location emerges as the winner. Duran and Nanda (2019), for example, study the widespread experimentation during the late 1890s and early 1900s as local automobile assemblers learned about the fit between this emerging industry and their city. Despite having fewer entrants initially, activity coalesced in smaller cities—Cleveland, Indianapolis, St. Louis, and Detroit—with Detroit being the ultimate winner by the late 1920s. The smaller city advantage may have been due to the higher physical proximity of relevant stakeholders, allowing for easier experimentation, prototyping, and circulation of ideas. So long as smaller cities had sufficient local input supplies, they may have provided more attention and financial support to the new technology compared to larger markets and fostered relational contracts.
But other than hoping that a rising mega-entrepreneur has relatives already living in your town, what steps might a city that wants to be a tech center realistically take? The perhaps uncomfortable but realistic advice is to lower your expectations. If there were a cookbook recipe for becoming a tech center, it would be happening in a lot more places. However, a city can try to be what the research calls a "nursery city," friendly to innovation and tech companies; it can build up its local university; and it can work to create ties to several different tech centers. Kerr and Robert-Nicoud write: 
The unique origin of each existing tech cluster suggests future efforts to seed from scratch are likely to be similarly frustrating. Instead, a better return is likely to come from efforts to reduce the local costs of experimentation with ideas (Kerr, Nanda, and Rhodes-Kropf 2014), alongside the provision of a good quality of life. There is also likely a role for cities that have developed a position in an emerging sector, even if by random accident due to family ties, to increase the odds they are favored in the shakeout process. Such support is more likely to work if it is broad-based to a sector and avoids attempting to “pick winners” by targeting individual companies. Other cities can take the strategy of increasing their connectivity to leading centers via remote work. Tulsa Remote pays qualified workers with remote jobs $10,000 to move to Tulsa, Oklahoma, and similar programs are popping up elsewhere. Rather than seeking to “become the next Silicon Valley,” these efforts focus on connecting with the existing hotspots and being an attractive alternative with a lower cost of living.

Beyond anchor firms, universities also feature prominently in the history of tech clusters, both for the United States and globally (Markusen 1996; Dittmar and Meisenzahl 2020). Under the guidance of Fred Terman, Stanford University fostered a strong relationship with the growing tech community, such as the 1948 creation of the Stanford Industrial Park that would house 11,000 workers from leading tech firms by the 1960s. ... Hausman (2012) documents how university innovation fosters local industry growth, and these spillovers can attenuate rapidly (see also Andersson, Quigley, and Wilhelmsson 2009; Kantor and Whalley 2014). ... These historical examples are starting to provide insight that will advance our theory on tech clusters. Duranton and Puga (2001) model a system of cities in which new industries are emerging in large and diverse “nursery” cities.
Perhaps the true challenge for local policymakers is that having your local economy participate in tech cluster jobs and growth is not just a matter of attracting branches of a couple of big firms, and it's not about being hyperspecialized in producing a particular product. Instead, it's about providing an economic environment conducive to breeding a stream of innovations, along with a regulatory context where those innovations can be turned into local companies.

Sunday, August 9, 2020

The Most Romantic Sentences Ever About Government Statistics?

Arunabh Ghosh offers a candidate for the role of most romantic sentence ever about statistical tables. It's at the start of his article "Counting China: By rejecting sampling in favour of exhaustive enumeration, communist China’s dream of total information became a nightmare" (Aeon, July 23, 2020). Try not to get teary as you read it. Ghosh writes: 
Sometime in the fall of 1955, a Chinese statistical worker by the name of Feng Jixi penned what might well be the most romantic sentence ever written about statistical work. ‘Every time I complete a statistical table,’ Feng wrote:
my happiness is like that of a peasant on his field catching sight of a golden ear of wheat, my excitement like that of a steelworker observing molten steel emerging from a Martin furnace, [and] my elation like that of an artist completing a beautiful painting.

A Martin furnace is a kind of open-hearth furnace where impurities like carbon are burnt out of pig iron as part of the process of producing steel. If you can suggest a more poetic example of love for government statistics, please pass it along. While not quite as emotionally resonant, Ghosh offers a second candidate for romantic comments about economic planning statistics: 

These numbers, the poet and author Ba Jin gushed:
gather the sentiments of 600 million people, they also embody their common aspirations and are their signpost [pointing to the future]. With them, correctly, step by step, we shall arrive on the road to [building a] socialist society. They are like a bright lamp, illuminating the hearts of 600 million.
The numbers about which Ba waxed poetic were those related to planning and economic management.
But love for statistics is not always requited, and Ghosh offers some useful reflections based on China's evolving experience with government statistics in the 1950s and 1960s. In the 1950s, for example, China's government statisticians rejected the idea of probability or randomness. After all, the government would use statistics to announce an economic plan, and the plan would be achieved. No uncertainty!
In a speech in 1951, Li Fuchun, one of a handful of technocratically minded leaders, summarily dismissed the utility of Nationalist-era statistics, branding them an Anglo-American bourgeois conceit, unsuitable for ‘managing and supervising the country’. New China needed a new kind of statistics, he declared. ... With their sights set and rightful purpose claimed, Chinese statisticians proceeded to interpret Marxism’s explicit teleology as grounds to reject the existence of chance and probability in the social world. In their eyes, there was nothing uncertain about mankind’s march towards socialism and, eventually, communism. What role, then, could probability or randomness play in the study of social affairs?
The implications for statistical methods were profound. In rejecting probability, and the larger area of mathematical statistics within which it belonged, China’s statisticians discarded a large array of techniques, none more critical than the era’s newest and most exciting fact-generating technology – large-scale random sampling. Instead, they decided that the only correct way to ascertain social facts was to count them exhaustively. Only in this way could extensive, complete and objective knowledge be generated. 
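To see what was being given up, here is a minimal sketch (my own, with hypothetical numbers) of the statistical logic behind the random sampling that China's statisticians rejected: a modest random sample pins down a population share quite precisely, with no need to count 600 million people one by one.

```python
# A minimal sketch (hypothetical numbers) of what rejecting random sampling gives up.
import numpy as np
from math import sqrt

rng = np.random.default_rng(1)
true_share = 0.37       # hypothetical: share of a 600-million-person population with some trait
n_sample = 10_000

# Drawing 10,000 people at random from 600 million is, for practical purposes,
# 10,000 independent Bernoulli(true_share) draws.
sample = rng.random(n_sample) < true_share
estimate = sample.mean()
std_error = sqrt(true_share * (1 - true_share) / n_sample)   # about 0.005

print(f"true share: {true_share:.3f}  sample estimate: {estimate:.3f}  std. error: {std_error:.3f}")
# The sample estimate is typically within about one percentage point of the truth,
# which is why large-scale random sampling was the era's "most exciting fact-generating technology."
```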
Enormous efforts were made to manage China's economy in a direct way with the use of government statistics. The economy was divided into 13 sectors. Information was to flow from villages to counties to provincial bureaus to the national government, and then plans would flow back down through this hierarchy. In the mid-1950s, more than 200,000 statistics workers were spread across 750,000 villages and 2,200 counties. Unsurprisingly, this system was plagued by slow turnaround times, estimates from lower levels that couldn't be reconciled at higher levels, problems where those lower in the pyramid did not produce as they were told, and over-optimistic numbers. 
[T]he dream of total information, so alluring as an ideal, was a nightmare in practice. Every level of the statistical system contributed to the overproduction of data. In a system that valued the production of material goods above all else, the only way a white-collar service such as statistics could draw attention to itself was by claiming, as Feng did, that statistical tables were a material contribution to the economy, just like wheat and steel. With the production of tables so incentivised, the entire system responded with gusto to produce them. Soon, there were so many reports circulating that it was impossible to keep track of them. Internal memoranda bemoaned the chaos, but it was a pithy four-character phrase that truly captured the exasperation. Translated, it reads: ‘Useless at the moment of creation!’
By the late 1950s, both statistical sampling and exhaustive enumeration were out of favor. Instead, government statistics were to rely on case studies.
In the world of data collection, the Great Leap Forward marked a turn away from exhaustive enumeration and the adoption, instead, of decentralised and ethnographic methods. A tract from 1927 on rural investigation, authored by Mao, became the new methodological model for data collection. True knowledge could be gained only by a detailed, in-person investigation, not through vast exhaustive surveys nor through randomised sampling. The shift left the statistical apparatus with no reliable means to check its own data.
In the right hands, case studies can be an extraordinarily valuable approach in the social sciences, because you can drill down into the kinds of specific details that can be of key importance, but are often not covered in broad cross-industry or cross-economy statistical work. In the wrong hands, "case studies" is a polite name for propaganda. 

After this reflection on alternative visions of statistics, Ghosh offers an epigram for those who produce and use statistics: "[A]ll data are biased, but ... not all biases are the same ..."

Saturday, August 8, 2020

COVID: Perception and Numeracy

Kekst CNC is a "communications strategy" firm that has been doing a series of public opinion polls about COVID-19. Their "COVID-19 Opinion Tracker: Fourth Edition" report surveyed 1,000 adults per country in several countries during the period July 10-15, 2020. The report includes questions about beliefs on how long the virus will last, how local and national government is performing, how business is performing, feelings about returning to work and wearing masks, and other topics. 

One result that caught my eye was that they asked people to estimate the number of people in their country who had contracted COVID-19 and the number who had died from it. Here are the results for the UK (blue), the US (yellow), Germany (pink), Sweden (green), and France (purple). 

The first question about the number of COVID cases is ambiguous. Kekst CNC points out that the estimates are far higher than the number of confirmed cases according to national health authorities: for example, public perception is 20 times as high as confirmed cases in the US, and 46 times as high as confirmed cases in Germany and France. The estimates of cases are probably high, but one might argue that testing has been limited, and the number of actual cases is likely to be well above the "confirmed" number. But there's less wiggle room and more exaggeration for the question about the number of COVID deaths. Here the estimates are at least 100 times as high as the number of actual deaths in these five countries, with public opinion 225 times too high in the US and 300 times too high in Germany. 

When you point out that people are wildly wrong in their numerical estimates of cases and especially deaths from COVID-19, a  common answer is something like "you are just trying to minimize the problem." Look, I understand that in this 21st-century, social-media world, there is a widespread view that if you aren't exaggerating, you aren't serious. I hope you will forgive me for being lousy at that form of "persuasion." (Also, if people estimated that half or two-thirds of the total population had already died, would that make them even more "serious"?)

My own belief is that this poll confirms that many people are innumerate about basic percentages. In the US, for example, saying that 9% of the existing population has already died of COVID-19 would imply roughly 29 million deaths so far. It would imply that one out of every eleven people in the country had already died from COVID-19 in mid-July. Given that in a big country, some places will inevitably have more severe outbreaks than others, there would need to be news stories where, say, half the people in certain neighborhood or small towns had died of COVID-19. 
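The arithmetic is straightforward (using a US population of roughly 328 million in 2020):

```python
# Checking the implication of a "9% of the population has died" estimate.
us_population = 328_000_000       # US population, roughly, in 2020
perceived_death_share = 0.09      # the survey's average estimate for the US

print(f"implied deaths: {us_population * perceived_death_share:,.0f}")   # about 29.5 million
print(f"about 1 out of every {1 / perceived_death_share:.0f} people")    # about 1 out of every 11
```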

It's hard to get revved up about innumeracy. But if you have little feel for what kinds of numbers are even plausible in a given situation, you are more likely to end up as a sucker for a glib talker with a scenario to sell.  

Friday, August 7, 2020

When Colleges Went to Online Instruction: Some Patterns

Almost all the US colleges and universities closed down their in-person operations and went to online education in March and April. But they didn't all do so at exactly the same time, nor do they all have the same plans for this fall.  In "Tracking Campus Responses to the COVID-19 Pandemic," Christopher R. Marsicano, Kathleen M. Felten, Luis S. Toledo, and Madeline M. Buitendorp (Davidson College Educational Studies Working Paper No. 1,  April 2020) look at the shutdown pattern. They have also set up a College Crisis Initiative website that, among other tasks, keeps track of plans of colleges and universities for next fall. 

These maps from the working paper show the transition to online instruction in March 2020. Yellow dots mean that a college announced a transition to online instruction; purple dots mean the transition was implemented. 

What's the overall pattern? From the abstract: "Between March 1st and April 4th, over 1,400 colleges and universities closed their doors and transitioned to online instruction. This paper uses a novel dataset and draws upon theories of institutional isomorphism to descriptively examine the trends in how higher education institutions responded to the Coronavirus pandemic. It finds little difference in institutional response based on campus infrastructure including, residence hall capacity, hospital affiliation, and medical degree offerings. There is some suggestive evidence, however, that institutions may have responded to external coercive isomorphic pressures from state governments and may have relied on a heuristic of peer institution closures to inform their decisions." 

In other words, the timing by which schools moved to online education seems to have been affected in some cases by pressure from state government or because they were following peer institutions. One suspects that similar dynamics--a mixture of evidence, political pressure, and following the crowd--will govern the move back from online education as well. Here are the current plans for fall, updated through Friday, August 7, at the "Dashboard" website of the College Crisis Initiative.

What's especially interesting to me is how different the plans are across institutions, even though students have actually started moving into dormitories at some institutions and are packing their bags for a move in the next week or two at other places. California is primarily green--that is, primarily or fully online. The middle of the country has a lot of pink, for primarily or fully in person. I'm sure colleges and universities are watching each other closely, both for whether there will be a widespread on-campus outbreak of COVID-19, and also the willingness of students to deal with in-person restrictions or online learning. 

Thursday, August 6, 2020

The World Economy Through a PPP Lens

When comparing the size of economies, you need to use an exchange rate to convert the GDP of one country, measured in its own currency, into terms that can be compared with the GDP of another country. But what exchange rate should you use? 

An obvious choice is the market exchange rate. An equally obvious problem is that exchange rates fluctuate. If the exchange rate of a country strengthens by 10% in a certain month, that of course doesn't mean that the standard of living for people in that country rose by 10% during that month. When you compare economies using a market exchange rate, it's a little like measuring with a ruler that grows and shrinks without warning. 

There's also a more subtle problem with using market exchange rates to compare economies. Say that in one country, housing or health care or higher education is much cheaper than in some other country. As a concrete example, one often reads about moving or retiring to some other country with a much lower cost of living--where you can buy that lovely home for so much less than in the United States. If you just look at the total GDP of that lower-price country, converted at a market exchange rate, you would miss that, because of the lower price level, the GDP actually represents a larger quantity of goods and services being consumed than the number suggests. 

A common alternative is "purchasing power parity" exchange rates. As the name implies, these exchange rates are calculated to reflect the purchasing power of a currency within its own country. This is not a new idea: the first academic research using a PPP exchange rate was published in 1940. But in recent decades, the standard source for calculating PPP exchange rates has been the International Comparison Program at the World Bank, which has published its report "Purchasing Power Parities and the Size of World Economies: Results from the 2017 International Comparison Program" (May 2020).

It is typically true that lower-income countries have cheaper goods and services. Thus,  when you look at the purchasing power of their currency in their own country, it tends to be greater than the market exchange rate would imply. Here's a figure showing price levels across countries. The horizontal axis shows per capita GDP, so higher-income countries are on the right. The vertical axis shows the price level across countries. The figure helps explain why places like Mexico and Thailand are such popular tourist and even retirement destinations for people from high-income countries: your income buys more in those countries. 
Essentially, a PPP exchange rate adjusts for these differences in buying power. As a result, using PPP exchange rates makes GDP for lower- and middle-income countries look larger. As the report notes: "In 2017, global output, when measured by purchasing power parities (PPPs), was $119,547 billion, compared with $79,715 billion, when measured by market exchange rates. ... In 2017 lower-middle-income economies contributed around 16 percent to PPP-based global GDP, while upper-middle-income economies contributed 34 percent. At the same time, high-income economies contributed 49 percent. In terms of market exchange rates, these shares were 8 percent, 28 percent, and 64 percent, respectively." 
If the comparison is done using a PPP exchange rate, the economy of China was bigger than that of the United States in 2017. 
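Here is a minimal sketch of the mechanics with made-up numbers (mine, not the ICP's): the same local-currency GDP looks larger when converted at a PPP rate than at a market rate whenever local prices are lower than the market exchange rate implies.

```python
# Hypothetical "Country A": GDP in its own currency, converted two ways.
gdp_local = 90_000      # GDP in billions of local currency units (made-up)
market_rate = 7.0       # local currency units per US dollar at the market exchange rate (made-up)
ppp_rate = 4.0          # local currency units with the purchasing power of one US dollar (made-up)

gdp_at_market = gdp_local / market_rate   # about 12,857 billion "market-rate dollars"
gdp_at_ppp = gdp_local / ppp_rate         # 22,500 billion "PPP dollars"

print(f"GDP at market exchange rate: {gdp_at_market:,.0f}")
print(f"GDP at PPP exchange rate:    {gdp_at_ppp:,.0f}")
# Because prices in Country A are lower (PPP rate < market rate), its PPP-based GDP
# is larger -- which is why the PPP lens raises the measured share of lower- and
# middle-income countries in world output.
```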

Here's an overview of the global economy measured in per capita GDP. The vertical axis measures the share of global population--and population is shown by the height of each bar. Thus, the bars for China, India, and the United States are especially tall. The horizontal width of the bar measures per capita GDP, using a PPP comparison. The light-blue lines on the left show 2011; the darker-blue bars on the right show 2017. You can see a number of modest changes in this six-year period: the rise of China and India, Turkey moving ahead of the Russian Federation in per capita GDP, and others.  
Two additional thoughts are worth passing along. First, the PPP exchange rates are explicitly intended to compare GDP, per capita GDP, and similar measures across countries, to make that useful adjustment for different price levels across countries. For other uses, they may not be appropriate. The report notes: "ICP PPPs are designed specifically for international comparisons of GDP. They are not designed for comparisons of monetary flows or trade flows. International comparisons of flows—such as development aid, foreign direct investment, migrants’ remittances, or imports and exports of goods and services—should be made with market exchange rates, not with PPPs."

Second, it may have occurred to you that measuring price levels in a comparable way across all the countries of the world, in a way that adjusts for differences in quality and availability of various goods and services, is a Herculean task. There's a reason why the estimated PPP exchange rates for 2017 are being published in 2020--it takes time to put all this together. The report describes the methodology in some detail, but there is room for skepticism. Indeed, back in 2010 Angus Deaton devoted his Presidential Address to the American Economic Association (freely available on-line here) to detailing the "weak theoretical and empirical foundations" of such measurements. But for anyone who has read this far, it will come as no surprise that imperfect economic statistics can still be useful when applied with context and caution.