As soon as Tom Smith got his hands on Codex — a new artificial intelligence technology that writes its own computer programs — he gave it a job interview.
He asked if it could tackle the “coding challenges” that programmers often face when interviewing for big-money jobs at Silicon Valley companies like Google and Facebook. Could it write a program that replaces all the spaces in a sentence with dashes? Even better, could it write one that identifies invalid ZIP codes?
It did both instantly, before completing several other tasks. “These are problems that would be tough for a lot of humans to solve, myself included, and it would type out the response in two seconds,” said Mr. Smith, a seasoned programmer who oversees an A.I. start-up called Gado Images. “It was spooky to watch.”
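For readers curious what such interview questions look like in practice, here is a minimal sketch of the two challenges in Python. It is not the code Codex produced for Mr. Smith, which the article does not show. The function names are hypothetical, and the definition of a "valid" ZIP code (five digits, optionally followed by a hyphen and four more) is an assumption about format only. It does not check whether a code is actually in use.

```python
import re


def spaces_to_dashes(sentence: str) -> str:
    """Replace every space in the sentence with a dash."""
    return sentence.replace(" ", "-")


# Assumption: a ZIP code is "valid" if it is five digits, optionally
# followed by a hyphen and four more digits (the ZIP+4 form).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def is_valid_zip(code: str) -> bool:
    """Return True if the whole string matches the assumed ZIP format."""
    return ZIP_PATTERN.fullmatch(code) is not None


if __name__ == "__main__":
    print(spaces_to_dashes("replace all the spaces"))
    for candidate in ["94103", "94103-1234", "9410", "ABCDE"]:
        print(candidate, is_valid_zip(candidate))
```

Even a task this small leaves a judgment call to the programmer, in this case what "invalid" should mean.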
Codex seemed like a technology that would soon replace human workers. As Mr. Smith continued testing the system, he realized that its skills extended well beyond a knack for answering canned interview questions. It could even translate from one programming language to another.
Yet after several weeks working with this new technology, Mr. Smith believes it poses no threat to professional coders. In fact, like many other experts, he sees it as a tool that will end up boosting human productivity. It may even help a whole new generation of people learn the art of computers, by showing them how to write simple pieces of code, almost like a personal tutor.
“This is a tool that can make a coder’s life a lot easier,” Mr. Smith said.
About four years ago, researchers at labs like OpenAI started designing neural networks that analyzed enormous amounts of prose, including thousands of digital books, Wikipedia articles and all sorts of other text posted to the internet.
By pinpointing patterns in all that text, the networks learned to predict the next word in a sequence. When someone typed a few words into these “universal language models,” they could complete the thought with entire paragraphs. In this way, one system — an OpenAI creation called GPT-3 — could write its own Twitter posts, speeches, poetry and news articles.
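The idea of predicting the next word can be illustrated with a toy that is far simpler than the neural networks described here. The sketch below counts which word most often follows each word in a sample text and uses those counts to guess what comes next. It is an assumption-laden illustration, not how GPT-3 works internally, and the function names and sample sentence are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, how often every other word follows it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat slept"
    model = train_bigrams(sample)
    print(predict_next(model, "the"))
    print(predict_next(model, "zebra"))
```

A system like GPT-3 learns far richer patterns from vastly more text, but the task it is trained on is the same: given what came before, guess what comes next.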
Much to the surprise of even the researchers who built the system, it could even write its own computer programs, though they were short and simple. Apparently, it had learned from an untold number of programs posted to the internet. So OpenAI went a step further, training a new system — Codex — on an enormous array of both prose and code.
The result is a system that understands both prose and code — to a point. You can ask, in plain English, for snow falling on a black background, and it will give you code that creates a virtual snowstorm. If you ask for a blue bouncing ball, it will give you that, too.
“You can tell it to do something, and it will do it,” said Ania Kubow, another programmer who has used the technology.
Codex can generate programs in 12 computer languages and even translate between them. But it often makes mistakes, and though its skills are impressive, it can’t reason like a human. It can recognize or mimic what it has seen in the past, but it is not nimble enough to think on its own.
Sometimes, the programs generated by Codex do not run. Or they contain security flaws. Or they come nowhere close to what you want them to do. OpenAI estimates that Codex produces the right code 37 percent of the time.
When Mr. Smith used the system as part of a “beta” test program this summer, the code it produced was impressive. But sometimes, it worked only if he made a tiny change, like tweaking a command to suit his particular software setup or adding a digital code needed for access to the internet service it was trying to query.
In other words, Codex was truly useful only to an experienced programmer.
But it could help programmers do their everyday work a lot faster. It could help them find the basic building blocks they needed or point them toward new ideas. Using the technology, GitHub, a popular online service for programmers, now offers Copilot, a tool that suggests your next line of code, much the way “autocomplete” tools suggest the next word when you type texts or emails.
“It is a way of getting code written without having to write as much code,” said Jeremy Howard, who founded the artificial intelligence lab Fast.ai and helped create the language technology that OpenAI’s work is based on. “It is not always correct, but it is just close enough.”
Mr. Howard and others believe Codex could also help novices learn to code. It is particularly good at generating simple programs from brief English descriptions. And it works in the other direction, too, by explaining complex code in plain English. Some, including Joel Hellermark, an entrepreneur in Sweden, are already trying to transform the system into a teaching tool.
The rest of the A.I. landscape looks similar. Robots are increasingly powerful. So are chatbots designed for online conversation. DeepMind, an A.I. lab in London, recently built a system that instantly identifies the shape of proteins in the human body, which is a key part of designing new medicines and vaccines. That task once took scientists days or even years. But those systems replace only a small part of what human experts can do.
In the few areas where new machines can instantly replace workers, the jobs involved are typically ones the market is slow to fill. Robots, for instance, are increasingly useful inside shipping centers, which are expanding and struggling to find the workers needed to keep pace.
With his start-up, Gado Images, Mr. Smith set out to build a system that could automatically sort through the photo archives of newspapers and libraries, resurfacing forgotten images, automatically writing captions and tags and sharing the photos with other publications and businesses. But the technology could handle only part of the job.
It could sift through a vast photo archive faster than humans, identifying the kinds of images that might be useful and taking a stab at captions. But finding the best and most important photos and properly tagging them still required a seasoned archivist.
“We thought these tools were going to completely remove the need for humans, but what we learned after many years was that this wasn’t really possible — you still needed a skilled human to review the output,” Mr. Smith said. “The technology gets things wrong. And it can be biased. You still need a person to review what it has done and decide what is good and what is not.”
Codex extends what a machine can do, but it is another indication that the technology works best with humans at the controls.
“A.I. is not playing out like anyone expected,” said Greg Brockman, the chief technology officer of OpenAI. “It felt like it was going to do this job and that job, and everyone was trying to figure out which one would go first. Instead, it is replacing no jobs. But it is taking away the drudge work from all of them at once.”