GitHub Copilot ("we are getting there")
github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer/
"Today, we are launching a technical preview of GitHub Copilot, a new AI pair programmer that helps you write better code. GitHub Copilot draws context from the code you’re working on, suggesting whole lines or entire functions. It helps you quickly discover alternative ways to solve problems, write tests, and explore new APIs without having to tediously tailor a search for answers on the internet. As you type, it adapts to the way you write code—to help you complete your work faster.
"Today, we are launching a technical preview of GitHub Copilot, a new AI pair programmer that helps you write better code. GitHub Copilot draws context from the code you’re working on, suggesting whole lines or entire functions. It helps you quickly discover alternative ways to solve problems, write tests, and explore new APIs without having to tediously tailor a search for answers on the internet. As you type, it adapts to the way you write code—to help you complete your work faster.
Developed in collaboration with OpenAI, GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI. OpenAI Codex has broad knowledge of how people use code and is significantly more capable than GPT-3 in code generation, in part, because it was trained on a data set that includes a much larger concentration of public source code. GitHub Copilot works with a broad set of frameworks and languages, but this technical preview works especially well for Python, JavaScript, TypeScript, Ruby and Go."
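To make the "suggesting whole lines or entire functions" part concrete, here is a purely illustrative Python sketch of the interaction (the function name, docstring, and suggested body are my own example, not taken from GitHub's post): the developer types a signature and a docstring, and the tool proposes a body.

from datetime import date

# Typed by the developer: the signature plus a docstring.
def days_between(d1: str, d2: str) -> int:
    """Return the absolute number of days between two ISO dates, e.g. '2021-06-29'."""
    # A plausible Copilot-style completion for the body:
    return abs((date.fromisoformat(d2) - date.fromisoformat(d1)).days)

print(days_between("2021-06-29", "2021-07-07"))  # 8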
If you use Visual Studio Code often, it might make sense to try to sign up for the technical preview phase...
no subject
"Evaluating Large Language Models Trained on Code", https://arxiv.org/abs/2107.03374
"We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics."
no subject
So this is actually a huge step forward.
It's a smaller model than GPT-3, with 12 billion parameters (so only twice as large as the open-source GPT-J from the EleutherAI grassroots organization).
The list of authors is horribly long even by modern AI standards: 53 authors, if I counted right, so they really went inclusive and listed everyone who contributed in any way. The first 6 are marked "equal contribution"; I don't really know any of them, although I do recognize some of the more senior authors.
no subject
Yes, thanks; I noticed that too.
no subject
https://twitter.com/Miles_Brundage/status/1412934622082596865
https://twitter.com/iScienceLuvr/status/1412941729829822464