The American Crossword Puzzle Tournament, or ACPT, annually gathers contestants from around the world to tackle eight original crossword puzzles. This year, the top ACPT performer was not a human but an artificial intelligence system built in part by a team of UC Berkeley researchers.
According to Will Shortz, crossword editor for The New York Times and founder and director of the ACPT, this year’s tournament was held virtually and had a record number of 1,287 contestants. While the tournament was built for human competitors, the AI system that “won,” named Dr.Fill, competed unofficially, Shortz noted.
Dr.Fill solved the final puzzle in 49 seconds, according to a Berkeley Engineering press release, more than two minutes faster than the top human contestant.
“Crosswords are generally considered to be a uniquely human activity, because they involve hard-to-program elements like real-world human language and knowledge,” Shortz said in an email. “Dr.Fill’s performance is a tribute to human ingenuity in developing AI.”
Dr.Fill’s success is attributed to a collaboration between Matthew Ginsberg, the original creator of Dr.Fill, and a team of UC Berkeley researchers from the Berkeley Natural Language Processing, or NLP, Group, according to the press release.
According to Dan Klein, campus professor of electrical engineering and computer sciences, the UC Berkeley team included himself; campus doctoral students Nicholas Tomlin, Eric Wallace and Kevin Yang; and campus undergraduate students Eshaan Pathak and Albert Xu.
Tomlin had been working on building an automated crossword puzzle solver since last year. He pitched the idea to researchers at the NLP Group, and the group began working on the Berkeley Crossword Solver, or BCS, an AI system that focuses on reading crossword clues and generating possible answers, according to Tomlin.
“There are two phases to solving a crossword,” Klein said. “First, you have to brainstorm possible answers for each clue. The second phase is to take all those possible answers and figure out which ones go together on the grid.”
According to Klein, the BCS focuses on the first phase of crossword solving via a machine-learning model. The BCS was trained on approximately 6.5 million pairs of clues and answers, Klein noted. By solving several example crossword puzzles, the BCS learns and updates its algorithm as it makes mistakes.
The second phase of crossword solving is managed by the existing Dr.Fill system, which identifies the best guesses to place on the crossword grid, according to Tomlin. Klein added that because the BCS also computes the probabilities that each of the possible answers is correct, Dr.Fill can clearly weigh its options when determining the correct grid layout.
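The two phases Klein and Tomlin describe can be illustrated with a toy sketch in Python. This is not the actual BCS or Dr.Fill code: the 2-by-2 grid, the slot names, the candidate words, the probabilities and the `fill_grid` function are all made up for illustration, and the brute-force search below stands in for whatever method Dr.Fill really uses. The sketch assumes phase one has already produced a probability for each candidate answer, and it then picks the most probable combination whose crossing letters agree.

```python
import itertools
import math

# Hypothetical phase-one output: candidate answers per slot, each with
# a probability of being correct. These numbers are invented.
candidates = {
    "1A": {"AT": 0.6, "IT": 0.4},
    "2A": {"NO": 0.7, "SO": 0.3},
    "1D": {"IN": 0.6, "AS": 0.3, "ON": 0.1},
    "2D": {"TO": 0.9, "GO": 0.1},
}

# Hypothetical 2x2 grid: each slot lists the (row, col) cells it covers.
slots = {
    "1A": [(0, 0), (0, 1)],
    "2A": [(1, 0), (1, 1)],
    "1D": [(0, 0), (1, 0)],
    "2D": [(0, 1), (1, 1)],
}


def fill_grid(candidates, slots):
    """Phase two (toy version): try every combination of candidates and
    keep the consistent fill with the highest total log-probability."""
    names = list(slots)
    best_score, best_fill = -math.inf, None
    for words in itertools.product(*(candidates[n] for n in names)):
        grid = {}
        consistent = True
        for name, word in zip(names, words):
            for cell, letter in zip(slots[name], word):
                # A crossing cell must hold the same letter in both words.
                if grid.setdefault(cell, letter) != letter:
                    consistent = False
                    break
            if not consistent:
                break
        if not consistent:
            continue
        score = sum(math.log(candidates[n][w]) for n, w in zip(names, words))
        if score > best_score:
            best_score, best_fill = score, dict(zip(names, words))
    return best_fill


best = fill_grid(candidates, slots)
print(best)
```

In this made-up example, "AT" is the single most likely answer for 1-Across, but the grid only fits together well with "IT," which shows why having probabilities for every candidate helps the grid-filling phase weigh its options. Trying every combination would be far too slow on a full-size puzzle, so a real solver needs a smarter search.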
Klein noted that Dr.Fill’s performance in the ACPT is an “exciting milestone” for the field of natural language processing.
“If you had asked me two weeks before the competition, I would have told you that we probably weren’t going to win this year,” Tomlin said in an email. “I’m still amazed that we won.”