All the news that's fit to parse
I’ve uploaded a new working paper called “Patterns of Panic: Financial Crisis Language in Historical Newspapers”, available here. In the paper, I analyze the titles of five major US newspapers going back to the 19th century to construct a new indicator of financial stress:
Newspaper | Since | Titles |
---|---|---|
Chicago Tribune | 1853 | 9.0m |
Boston Globe | 1872 | 6.7m |
Washington Post | 1877 | 7.7m |
Los Angeles Times | 1881 | 7.9m |
Wall Street Journal | 1889 | 3.9m |
For the paper, I looked at a whole lot of trends in newspaper language. That produced some cool figures that don’t fit in the paper, so I’m putting them here instead. Each figure shows the number of titles per quarter that contain a given word, averaged across the five newspapers.
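A rough sketch of how such a series could be computed is below. It is not the code used for the paper: the input layout (tuples of newspaper, date, and title), the whole-word matching, and the function names are all assumptions made for illustration. It also treats a newspaper with no matching titles in a quarter as zero, which ignores the fact that the papers start in different years.

```python
import re
from collections import defaultdict
from datetime import date


def quarter_of(d):
    """Return (year, quarter) for a date, with quarter in 1..4."""
    return (d.year, (d.month - 1) // 3 + 1)


def titles_per_quarter(titles, word):
    """Count titles per (newspaper, quarter) that contain `word` as a whole word.

    `titles` is an iterable of (newspaper, date, title) tuples (assumed layout).
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    counts = defaultdict(int)
    for paper, day, title in titles:
        if pattern.search(title):
            counts[(paper, quarter_of(day))] += 1
    return counts


def average_across_papers(counts, papers):
    """Average the per-quarter counts over the listed papers (missing counts as 0)."""
    quarters = sorted({q for _, q in counts})
    return {
        q: sum(counts.get((p, q), 0) for p in papers) / len(papers)
        for q in quarters
    }


# Invented toy titles, only to show the mechanics.
sample = [
    ("Chicago Tribune", date(1918, 11, 12), "War Ends as Armistice Is Signed"),
    ("Boston Globe", date(1918, 11, 12), "Peace at Last"),
    ("Boston Globe", date(1918, 12, 1), "After the War: What Next?"),
    ("Chicago Tribune", date(1919, 1, 5), "Warsaw Awaits News"),
]
papers = ["Chicago Tribune", "Boston Globe"]
war = average_across_papers(titles_per_quarter(sample, "war"), papers)
print(war)
```

Matching on word boundaries keeps “Warsaw” from counting as “war”. Whether the paper matches whole words or substrings is not stated here, so treat that choice as a guess.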
Here is the figure for the words “war” and “peace” (the gray bars are the world wars):
The “war” series jumps, unsurprisingly, around the major wars. Besides the world wars, the American Civil War (starting 1861) stands out. The series reaches a new high in the first quarter of 1863, when Lincoln signed the Emancipation Proclamation. The usage of “peace” has a correlation of 0.51 with “war”.
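For readers who want to reproduce a number like this, here is a minimal sketch of a correlation between two quarterly series. It assumes the figure quoted above is a plain Pearson correlation of the two series over the same quarters, which the post does not spell out. The function name and the toy numbers are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented quarterly counts, not data from the paper.
series_a = [10.0, 40.0, 15.0, 60.0, 20.0]
series_b = [5.0, 12.0, 9.0, 20.0, 4.0]
print(round(pearson(series_a, series_b), 2))
```

Both series have to be aligned on the same quarters before the call. Quarters in which one word never appears should enter as zeros rather than be dropped.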
The next figure shows the occurrences of the names of U.S. presidents:
“lincoln” spikes first in the third quarter of 1858 during the Lincoln–Douglas debates, then again during the presidential election of 1860, and finally when the Confederate States surrendered in 1865. For “roosevelt” and “bush”, two presidents shared a surname. The name “kennedy” appears during John F. Kennedy’s presidency and then spikes again in the first quarter of 1980, when Ted Kennedy tried to run for president.
And, staying within my home country:
The fake Hitler diaries were published by the magazine Stern in 1983, which probably explains the spike in that year. The 1912 spike in “Brandt” might be due to the founding of the German company of the same name in that year.
The next figure shows the occurrence of “earthquake(s)” in titles.
Large spikes occur around the following earthquakes:
- Djijelli (1856)
- Hayward (1868)
- Charleston (1886)
- San Francisco (1906)
- Kantō (1923)
- San Fernando (1971)
- Loma Prieta (1989)
- Sichuan (2008)
- Haiti (2010)
- Nepal (2015)
In 1994, the Northridge earthquake devastated Los Angeles. The big spike in this series is driven by reporting in the Los Angeles Times, where 501 of 27,669 titles (about 1.8%) in the first quarter of 1994 contained the term “earthquake(s)”.
Here are the terms “soviet” and “russia”:
The series for “russia” spikes during the Russo-Japanese War and at the beginning and end of World War I. The word “soviet” appears for the first time during the Russian Revolution. The two words move together for three decades, but their paths diverge after 1960. The Cold War pitted the West against the Soviet empire, and the importance of stand-alone “russia” declined. This changed in the autumn of 1991, when the Soviet Union dissolved and nation states, including Russia, took its place.
Check out the paper, if you like. I’d be grateful for any comments you might have.