The world of AI research is in shambles. From the academics prioritizing easy-to-monetize schemes over breaking novel ground, to the Silicon Valley elite using the threat of job loss to encourage corporate-friendly hypotheses, the system is a broken mess.
And Google deserves the lion’s share of the blame.
How it started
There were approximately 85,000 research papers published globally on the subject of AI / ML in the year 2000. Fast-forward to 2021 and there were nearly twice as many published in the US alone.
To say there’s been an explosion in the field would be a massive understatement. This influx of researchers and new ideas has led to deep learning becoming one of the world’s most important technologies.
Between 2014 and 2021 big tech all but abandoned its “web first” and “mobile first” principles to adopt “AI first” strategies.
Now, in 2022, AI developers and researchers are in higher demand (and command higher salaries) than nearly anyone else in tech outside of the C-suite.
But this sort of unfettered growth also has a dark side. In the scramble to meet the market demand for deep learning-based products and services, the field’s become as cutthroat and fickle as professional sports.
In the past few years, we’ve seen the “GANfather,” Ian Goodfellow, jump ship from Google to Apple, Timnit Gebru and others get fired from Google for dissenting opinions on the efficacy of research, and a virtual torrent of dubious AI papers somehow manage to clear peer review.
The flood of talent that arrived in the wake of the deep learning explosion also brought a mudslide of bad research, fraud, and corporate greed along with it.
How it’s going
Google, more than any other company, bears responsibility for the modern AI paradigm. That means we need to give big G full marks for bringing natural language processing and image recognition to the masses.
It also means we can credit Google with creating the researcher-eat-researcher environment that has some college students and their big-tech-partnered professors treating research papers as little more than bait for venture capitalists and corporate headhunters.
At the top, Google has shown its willingness to hire the world’s most talented researchers. And it’s also demonstrated numerous times that it’ll fire them in a heartbeat if they do not toe the company line.
The company made headlines around the globe after firing Timnit Gebru, a researcher it’d hired to help lead its AI ethics division, in December of 2020. Just a few months later it fired another member of the team, Margaret Mitchell.
Google maintains that the researchers’ work was not up to spec, but both women and numerous supporters claim the firings only occurred after they brought up ethical concerns over research the company’s AI boss, Jeff Dean, had signed off on.
It’s now barely over a year later and history is repeating itself. Google fired another world-renowned AI researcher, Satrajit Chatterjee, after he led a team of scientists in challenging another paper Dean had signed off on.
The mudslide effect
At the top, this means the competition for high-paying jobs is fierce. And the hunt for the next talented researcher or developer begins earlier than ever.
Students working towards advanced degrees in the fields of machine learning and AI, who eventually want to work outside of academia, are expected to author or co-author research papers that demonstrate their talent.
Unfortunately, the pipeline from academia to big tech or the VC-led startup world is littered with crappy papers written by students whose entire bent is writing algorithms that can be monetized.
A quick Google Scholar search for “natural language processing,” for example, shows nearly a million hits. Many of the papers listed have hundreds or thousands of citations.
On the surface, this would indicate that NLP is a thriving subset of machine learning research that’s gained attention from researchers around the globe.
In fact, searches for “artificial neural network,” “computer vision,” and “reinforcement learning” all brought up a similar glut of results.
Unfortunately, a significant portion of AI and ML research is either intentionally fraudulent or full of bad science.
What may have worked well in the past is quickly becoming a potentially outdated mode of communicating research.
The Guardian’s Stuart Ritchie recently penned an article wondering if we should do away with research papers altogether. According to Ritchie, science’s problems are baked in pretty deep:
This system comes with big problems. Chief among them is the issue of publication bias: reviewers and editors are more likely to give a scientific paper a good write-up and publish it in their journal if it reports positive or exciting results. So scientists go to great lengths to hype up their studies, lean on their analyses so they produce “better” results, and sometimes even commit fraud in order to impress those all-important gatekeepers. This drastically distorts our view of what really went on.
The problem is that the gatekeepers everyone is trying to impress tend to hold the keys to students’ future employment and academics’ admission into prestigious journals or conferences. Researchers who fail to gain their approval do so at their own peril.
And, even if a paper manages to make it through peer review, there’s no guarantee the people pushing things through aren’t asleep at the switch.
That’s why Guillaume Cabanac, an associate professor of computer science at the University of Toulouse, created a project called the Problematic Paper Screener (PPS).
The PPS uses automation to flag papers containing potentially problematic code, math, or verbiage. In the spirit of science and fairness, Cabanac ensures every paper that’s flagged gets a manual review from humans. But the job’s likely too big for a handful of humans to do in their spare time.
According to a report from Spectrum News, there are a lot of problematic papers out there. And the majority have to do with machine learning and AI:
The screener deemed about 7,650 problematic studies, including more than 6,000 for having tortured phrases. Most papers containing tortured phrases seem to come from the fields of machine learning, artificial intelligence and engineering.
Tortured phrases are terms that raise red flags to researchers because they use odd, unfamiliar wording to describe a process or concept that already has a well-established name.
For example, terms such as “counterfeit neural” or “man-made neural” could indicate that bad actors ran previous work through a thesaurus plug-in in an attempt to get away with plagiarizing it.
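To make the idea concrete, here is a minimal sketch of how an automated screen for tortured phrases might work. It is an illustration only, not the Problematic Paper Screener's actual code: the lookup table, the function name, and the assumption that both phrases stand in for "artificial neural" are ours. Only the two flagged phrases come from the examples above.

```python
import re

# Hypothetical lookup table: tortured phrase -> the established term it
# probably replaces. The mapping to "artificial neural" is an assumption.
TORTURED_PHRASES = {
    "counterfeit neural": "artificial neural",
    "man-made neural": "artificial neural",
}


def flag_tortured_phrases(text):
    """Return (phrase, likely_original, offset) tuples, ordered by offset."""
    hits = []
    for phrase, original in TORTURED_PHRASES.items():
        # Match whole words only, ignoring case.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((phrase, original, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "We train a counterfeit neural network and a Man-Made neural system."
for phrase, original, offset in flag_tortured_phrases(sample):
    print(f"offset {offset}: '{phrase}' may stand in for '{original}'")
```

A match here is only a signal, not a verdict. A phrase list like this cannot tell plagiarism from clumsy translation, which is why flagged papers still need a human to look at them.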
While Google cannot be blamed for everything untoward in the fields of machine learning and AI, it has played an outsized role in the devolution of peer-reviewed research.
This is not to say that Google does not also support and prop up the scientific community through open-source, financial aid, and research support. And we’re certainly not trying to imply that everyone studying AI is just out to make a quick buck.
But the system’s set up to encourage the monetization of algorithms first, and to further the field second. In order for this to change, big tech and academia both need to commit to wholesale reform in how research is presented and reviewed.
Currently, there is no widely recognized third-party verification authority for papers. The peer-review system is more like an honor code than a set of agreed-upon principles followed by institutions.
However, there is precedent for the establishment and operation of an oversight committee with the reach, influence, and expertise to govern across academic boundaries: the NCAA.
If we can unify a fair-competition system for thousands of amateur athletics programs, it’s a safe bet we could form a governing body to establish guidelines for academic research and review.
And, as far as Google goes, there’s a better-than-nil chance that CEO Sundar Pichai’s going to find himself summoned before Congress again if the company continues to fire the researchers it hires to oversee its ethical AI programs.
US capitalism means a business is typically free to hire and fire whomever it wants, but shareholders and workers have rights too.
Eventually, Google’s going to have to commit to ethical research or it’ll find itself unable to compete with the companies and organizations willing to.