Baidu, the Chinese technology giant specializing in Internet-related services and artificial intelligence (AI), has overtaken rivals Google and Microsoft in an ongoing AI competition designed to help machines better understand human language.
Baidu achieved the highest ever score in the General Language Understanding Evaluation (GLUE) – a widely accepted benchmark for training, evaluating, and analyzing AI language understanding systems.
GLUE consists of nine sentence- or sentence-pair language understanding tests built on established datasets and selected to cover a diverse range of dataset sizes, text genres, and degrees of difficulty. The average person scores about 87 points out of a hundred on the GLUE scale.
Baidu used its own AI model, called ERNIE (which stands for “Enhanced Representation through kNowledge IntEgration”), which was first developed for the Chinese language. With ERNIE, the company has become the first team to surpass 90 on GLUE and now tops a leaderboard dominated by U.S. tech firms and universities.
In addition, ERNIE became one of only 10 AI systems to beat the average human score of 87.1 on the GLUE benchmark.
Baidu’s ERNIE was inspired by Google’s BERT (Bidirectional Encoder Representations from Transformers), which was created in late 2018. Both models predict and interpret the meaning of a word by considering the context that appears before and after it in a sentence all at once.
This is done using a technique called “masking”, in which words in a sentence are randomly hidden and the model learns to predict them from the surrounding context.
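The masking idea can be illustrated with a minimal Python sketch. This is not Baidu's or Google's code: the function name `mask_tokens`, the `[MASK]` placeholder string, and the masking probability are assumptions chosen for illustration, and a real system would feed the masked sequence to a neural network that is trained to recover the hidden words.

```python
import random

MASK = "[MASK]"  # placeholder string; an assumption for this sketch


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly hide individual tokens.

    Returns the masked sequence and a dict mapping each hidden
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)
    masked = []
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # what the model should recover
        else:
            masked.append(tok)
    return masked, targets


sentence = "the cat sat on the mat".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3, seed=0)
print(masked)
print(targets)
```

The model sees only the masked sequence; because the words on both sides of each hidden position remain visible, it can use context before and after the gap to make its prediction.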
However, Baidu researchers realized that they needed to make changes because of the differences between the Chinese and English languages. They built ERNIE for Chinese first, training it to predict sets of missing words in Chinese, and then applied the same method to English.
Baidu “researchers trained ERNIE on a new version of masking that hides strings of characters rather than single ones. They also trained it to distinguish between meaningful and random strings so it could mask the right character combinations accordingly,” wrote MIT Technology Review, which first reported on the research.
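The difference between hiding single characters and hiding whole strings can be sketched in a few lines of Python. This is an illustrative sketch only, not ERNIE's implementation: the function name `mask_spans`, the `[MASK]` placeholder, and the hand-written span boundaries are all assumptions. In particular, the article says ERNIE learned to tell meaningful strings from random ones, whereas here the meaningful spans are simply supplied by hand.

```python
import random

MASK = "[MASK]"  # placeholder string; an assumption for this sketch


def mask_spans(chars, spans, mask_prob=0.5, seed=None):
    """Hide whole multi-character units instead of single characters.

    chars: list of characters.
    spans: list of (start, end) half-open index pairs marking units
           treated as meaningful (hand-written here, not learned).
    Returns the masked sequence and a dict mapping each hidden span
    to the original string the model would have to predict.
    """
    rng = random.Random(seed)
    masked = list(chars)
    targets = {}
    for start, end in spans:
        if rng.random() < mask_prob:
            targets[(start, end)] = "".join(chars[start:end])
            for i in range(start, end):
                masked[i] = MASK  # every character in the unit is hidden
    return masked, targets


# "Harbin is the capital of Heilongjiang."
chars = list("哈尔滨是黑龙江的省会")
# Hypothetical segmentation: 哈尔滨 (Harbin), 黑龙江 (Heilongjiang), 省会 (capital)
spans = [(0, 3), (4, 7), (8, 10)]

masked, targets = mask_spans(chars, spans, mask_prob=0.5, seed=0)
print(masked)
print(targets)
```

Because a unit is either fully hidden or fully visible, the model cannot guess a missing character just from the other characters in the same word; it has to rely on the rest of the sentence.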
“When we first started this work, we were thinking specifically about certain characteristics of the Chinese language. But we quickly discovered that it was applicable beyond that,” said Hao Tian, chief architect of Baidu Research.
This method made the algorithm even stronger at understanding English, enabling ERNIE to achieve the highest GLUE score yet.
Baidu researchers are planning to present a detailed paper on how ERNIE was trained for the language test at the Association for the Advancement of Artificial Intelligence conference next year.