“ChinAI #137: Year 3 of ChinAI: Reflections on the Newsworthiness of Machine Translation”, Jeffrey Ding, 2021-04-05:

What is newsworthy? This question should haunt everyone with a platform.

Last month, Stanford HAI published the AI Index Report 2021, a 222-page report on the state of AI, put together by an all-star team supported by a lot of data and strong connections to technical experts. What was newsworthy in this report? According to The Verge, “Artificial intelligence research continues to grow as China overtakes US in AI journal citations.” In fact, the article takes its cue from what the report authors themselves deemed important, given that “China overtakes the US in AI journal citations” features as one of the report’s 9 key takeaways.

Dig deeper into the data, however, and you’ll uncover alternative takeaways. Look at the cross-national statistics on average field-weighted citation impact (FWCI) of AI authors, for example, which give a sense of the quality of the average AI publication from a region. Interestingly enough, the US actually increased its relative lead in FWCI over China over the past couple of years. According to the 2019 version of the AI Index, the FWCI of US publications was about 1.3× greater than China’s; in 2021, that gap has widened to almost 3× greater (p. 24).

So, working off the same materials as released in the AI Index, here’s another way one could have distilled key takeaways: “The US increases its lead over China in average impact of AI publications.” Or, if you wanted to be cheeky: “China lags behind Turkey in average impact of AI publications.” Just as newsworthy, in my opinion.

However, what I found most newsworthy about the AI Index went beyond horse-race reporting about “who’s winning the AI race‽” Instead, I was most intrigued by the rise of commercially available machine translation (MT) systems, covered on page 64. According to data from Intento, a startup that assesses MT services, there are now 28 cloud MT systems with pre-trained models that are commercially available—an increase from just 8 in 2017. But wait…there’s more: Intento also reports an incredible spike in MT language coverage, with 16,000+ language pairs supported by at least one MT provider (slide 33 of Intento’s “State of Machine Translation” report).

…Somehow, these incredible advances in translation are not relevant to the effect of AI on U.S.-China relations, at least based on existing discussions. Compare the complete dearth of Twitter discussions centered on the following keywords: U.S., China, and “machine translation” against what you get when you replace “machine translation” with “facial recognition.” Consider another reference point, the recently published 756-page report by the National Security Commission on Artificial Intelligence (NSCAI). 62 of those pages mention the word “weapon” at least once. Only 9 pages mention the word “translation”, and most do not substantively discuss translation (e.g., the word appears in a bibliographic reference for a translated text).

Yet, I could make a convincing case that translation is more important than targeting for US national security. Think about the potential of improved translation capabilities for the intelligence community. Another obvious vector is the effect of translation on diplomacy.


Top 9 Takeaways:

  1. AI investment in drug design and discovery increased substantially: “Drugs, Cancer, Molecular, Drug Discovery” received the greatest amount of private AI investment in 2020, with more than $13.8 billion, 4.5× higher than in 2019.

  2. The industry shift continues: In 2019, 65% of graduating North American PhDs in AI went into industry—up from 44.4% in 2010, highlighting the greater role industry has begun to play in AI development.

  3. Generative everything: AI systems can now compose text, audio, and images to a sufficiently high standard that humans have a hard time telling the difference between synthetic and non-synthetic outputs for some constrained applications of the technology.

  4. AI has a diversity challenge: In 2019, 45% of new US resident AI PhD graduates were white; by comparison, 2.4% were African American and 3.2% were Hispanic.

  5. China overtakes the US in AI journal citations: After surpassing the US in the total number of journal publications several years ago, China now also leads in journal citations; however, the US has consistently (and substantially) more AI conference papers (which are also more heavily cited) than China over the last decade.

  6. The majority of the US AI PhD grads are from abroad—and they’re staying in the US: The percentage of international students among new AI PhDs in North America continued to rise in 2019, to 64.3%—a 4.3% increase from 2018. Among foreign graduates, 81.8% stayed in the United States and 8.6% have taken jobs outside the United States.

  7. Surveillance technologies are fast, cheap, and increasingly ubiquitous: The technologies necessary for large-scale surveillance are rapidly maturing, with techniques for image classification, face recognition, video analysis, and voice identification all seeing substantial progress in 2020.

  8. AI ethics lacks benchmarks and consensus: Though a number of groups are producing a range of qualitative or normative outputs in the AI ethics domain, the field generally lacks benchmarks that can be used to measure or assess the relationship between broader societal discussions about technology development and the development of the technology itself. Furthermore, researchers and civil society view AI ethics as more important than industrial organizations do.

  9. AI has gained the attention of the US Congress: The 116th Congress is the most AI-focused congressional session in history, with the number of mentions of AI in the congressional record more than triple that of the 115th Congress.