Sunday, August 9, 2020

The Most Romantic Sentences Ever About Government Statistics?

Arunabh Ghosh offers a candidate for the role of most romantic sentence ever about statistical tables. It's at the start of his article "Counting China: By rejecting sampling in favour of exhaustive enumeration, communist China’s dream of total information became a nightmare" (Aeon, July 23, 2020). Try not to get teary as you read it. Ghosh writes: 
Sometime in the fall of 1955, a Chinese statistical worker by the name of Feng Jixi penned what might well be the most romantic sentence ever written about statistical work. ‘Every time I complete a statistical table,’ Feng wrote:
my happiness is like that of a peasant on his field catching sight of a golden ear of wheat, my excitement like that of a steelworker observing molten steel emerging from a Martin furnace, [and] my elation like that of an artist completing a beautiful painting.

A Martin furnace is a kind of open-hearth furnace where impurities like carbon are burnt out of pig iron as part of the process of producing steel. If you can suggest a more poetic example of love for government statistics, please pass it along. While not quite as emotionally resonant, Ghosh offers a second candidate for romantic comments about economic planning statistics: 

These numbers, the poet and author Ba Jin gushed:
gather the sentiments of 600 million people, they also embody their common aspirations and are their signpost [pointing to the future]. With them, correctly, step by step, we shall arrive on the road to [building a] socialist society. They are like a bright lamp, illuminating the hearts of 600 million.
The numbers about which Ba waxed poetic were those related to planning and economic management.
But love for statistics is not always requited, and Ghosh offers some useful reflections based on China's evolving experience with government statistics in the 1950s and 1960s. In the 1950s, for example, China's government statisticians rejected the idea of probability or randomness. After all, the government would use statistics to announce an economic plan, and the plan would be achieved. No uncertainty!
In a speech in 1951, Li Fuchun, one of a handful of technocratically minded leaders, summarily dismissed the utility of Nationalist-era statistics, branding them an Anglo-American bourgeois conceit, unsuitable for ‘managing and supervising the country’. New China needed a new kind of statistics, he declared. .... With their sights set and rightful purpose claimed, Chinese statisticians proceeded to interpret Marxism’s explicit teleology as grounds to reject the existence of chance and probability in the social world. In their eyes, there was nothing uncertain about mankind’s march towards socialism and, eventually, communism. What role, then, could probability or randomness play in the study of social affairs?
The implications for statistical methods were profound. In rejecting probability, and the larger area of mathematical statistics within which it belonged, China’s statisticians discarded a large array of techniques, none more critical than the era’s newest and most exciting fact-generating technology – large-scale random sampling. Instead, they decided that the only correct way to ascertain social facts was to count them exhaustively. Only in this way could extensive, complete and objective knowledge be generated. 
Enormous efforts were made to manage China's economy in a direct way with the use of government statistics. The economy was divided into 13 sectors. Information was to flow from villages to counties to provincial bureaus to the national government, and then plans would flow back down through this hierarchy. In the mid-1950s, more than 200,000 statistics workers were spread across 750,000 villages and 2,200 counties. Unsurprisingly, this system was plagued by slow turnaround times, estimates from lower levels that couldn't be reconciled at higher levels, problems where those lower in the pyramid did not produce as they were told, and over-optimistic numbers.
[T]he dream of total information, so alluring as an ideal, was a nightmare in practice. Every level of the statistical system contributed to the overproduction of data. In a system that valued the production of material goods above all else, the only way a white-collar service such as statistics could draw attention to itself was by claiming, as Feng did, that statistical tables were a material contribution to the economy, just like wheat and steel. With the production of tables so incentivised, the entire system responded with gusto to produce them. Soon, there were so many reports circulating that it was impossible to keep track of them. Internal memoranda bemoaned the chaos, but it was a pithy four-character phrase that truly captured the exasperation. Translated, it reads: ‘Useless at the moment of creation!’
By the late 1950s, both statistical sampling and exhaustive enumeration were out of favor. Instead, government statistics were to rely on case studies.
In the world of data collection, the Great Leap Forward marked a turn away from exhaustive enumeration and the adoption, instead, of decentralised and ethnographic methods. A tract from 1927 on rural investigation, authored by Mao, became the new methodological model for data collection. True knowledge could be gained only by a detailed, in-person investigation, not through vast exhaustive surveys nor through randomised sampling. The shift left the statistical apparatus with no reliable means to check its own data.
In the right hands, case studies can be an extraordinarily valuable approach in the social sciences, because you can drill down into the kinds of specific details that can be of key importance, but are often not covered in broad cross-industry or cross-economy statistical work. In the wrong hands, "case studies" is a polite name for propaganda.

After this reflection on alternative visions of statistics, Ghosh offers an epigram for those who produce and use statistics: "[A]ll data are biased, but ... not all biases are the same ..."