Open-source AI functionality provided by the Copilot Chat extension · Issue #249031 · microsoft/vscode
Our blog post outlines our motivation to open-source the client-side code of our AI features in VS Code. We also compiled FAQs.
Goals
- Open source is only useful if you can participate in the development process of the AI features. We need a development story that lets you make code changes and debug AI interactions end-to-end. You need to be able to run the AI test suites. Since all AI features are powered by models, you also need access to models during development.
- Once the code is open source, we'll re-evaluate how we split the functionality between VS Code Core, built-in extension(s), and the Chat extension. We want to improve the user experience and simplify our architecture and build processes.
Approach
We'll first open-source the GitHub Copilot Chat extension. To do so, we need to:
- Ensure code compliance
- Define the strategy for service access
- Define how to run tests
- Define OSS builds
- Consolidate issue management
Note: Today, Next Edit Suggestions (NES) functionality is separate from code completions. NES is implemented in the Chat extension, while completions are implemented in the GitHub Copilot completions extension. We have concrete plans to bring NES and completions together in the Chat extension. Therefore, at the moment, we don't have concrete plans to open-source the Copilot completions extension.
Compliance Review
- We need to review every file in the Copilot Chat extension for compliance. This includes adding copyrights and removing references to internal processes, IP, and issues. This is particularly important for our test suite that contains test cases created with information from private issues.
- After the review, we'll move the Chat extension code to a new repository without its history, which avoids the need to review thousands of commits.
Service Access
- The Chat extension is powered by the GitHub Copilot service, which provides access to general-purpose and custom models, embeddings computation, and semantic code search of GitHub repositories.
- To talk to the GitHub Copilot service, the Chat extension uses CAPI (the GitHub Copilot API). Just like our other production services, for example the settings sync service, the Copilot service will remain closed source, and its usage will continue to be regulated by its service license.
- To debug AI interactions, you need to be able to run Code-OSS with the Chat extension installed. Normally, Code-OSS does not have access to production services. This is not a handicap for non-AI features, but AI features are useless without model access. Our current thinking is that we'll provide a closed-source, licensed npm module that provides CAPI access and that you can choose to install into the codebase before launching Code-OSS. Alternatively, you can use bring your own key (BYOK) without CAPI for limited scenarios.
Tests
- We built a test infrastructure that deals with the stochastic nature of LLMs and makes heavy use of caching LLM responses for given prompts. If you make a code change that results in a prompt change for a specific scenario, you want to issue LLM requests only for the changed prompts and use the cached LLM responses in all other cases. The cache is implemented using Redis. We need to allow read-only access to the Redis cache, which, in Microsoft terminology, makes the Redis cache a production service. We therefore need to go through the process of creating a new production service.
- We need to investigate if we can use PR submissions for cache baseline updates.
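The caching idea described above can be illustrated with a minimal sketch. This is not the extension's actual test infrastructure: `LlmRequest`, `CachedLlm`, and the model callback are hypothetical names, an in-memory `Map` stands in for the Redis cache, and the model call is synchronous to keep the example short. It only shows that a response is looked up by its exact prompt, so an unchanged prompt is served from the cache and only a changed prompt reaches the model.

```typescript
// Hypothetical request shape; the extension's real types are not shown in this issue.
interface LlmRequest {
  model: string;
  prompt: string;
}

// Hypothetical stand-in for a real (asynchronous) model call.
type LlmCall = (request: LlmRequest) => string;

class CachedLlm {
  // In-memory stand-in for the Redis cache mentioned above.
  private readonly cache = new Map<string, string>();
  private readonly callModel: LlmCall;
  public misses = 0;

  constructor(callModel: LlmCall) {
    this.callModel = callModel;
  }

  // The key covers everything that influences the response,
  // so any change to the prompt produces a different key.
  private keyFor(request: LlmRequest): string {
    return JSON.stringify([request.model, request.prompt]);
  }

  complete(request: LlmRequest): string {
    const key = this.keyFor(request);
    const cached = this.cache.get(key);
    if (cached !== undefined) {
      return cached; // unchanged prompt: reuse the stored response
    }
    this.misses++;
    const response = this.callModel(request); // changed prompt: ask the model
    this.cache.set(key, response);
    return response;
  }
}

// Usage: count how often the fake model is actually invoked.
let modelCalls = 0;
const llm = new CachedLlm((request) => {
  modelCalls++;
  return `response to: ${request.prompt}`;
});

const first = llm.complete({ model: "example-model", prompt: "Explain this function" });
const second = llm.complete({ model: "example-model", prompt: "Explain this function" });
const third = llm.complete({ model: "example-model", prompt: "Explain this function briefly" });
```

In this sketch the second call is a cache hit and the third call, whose prompt differs, is the only additional model request. A contributor with read-only cache access would hit the same kind of lookup but could not write new entries, which is why baseline updates need a separate path.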
Builds
- We'll need to define what PR builds look like for the Chat extension.
Issues
- Today, issues for AI features are in three different repositories: microsoft/vscode, microsoft/vscode-copilot-release, and the private repository we use(d) for developing the Chat extension.
- Going forward, all client issues should be in microsoft/vscode.
- We'll move only select issues from the private repo into the public repo.
- We'll archive/lock the microsoft/vscode-copilot-release repo, so that no new issues can be created there and existing issues are locked. The issues will continue to be accessible.
- We need a clearer separation of client issues from service issues. We have a large number of service issues, particularly in the microsoft/vscode-copilot-release repo, that are not actionable and have no clear path to closure.