
Local-only Copilot — How to run LLMs locally on your code

It's no secret by now that the advent of Large Language Models (LLMs), and GitHub's Copilot in particular, has drastically changed how we approach writing code in a remarkably short amount of time.

With this new technology, however, comes some nervousness for businesses and privacy-oriented folks: sharing every keystroke you write with a handful of powerful companies doesn't sit well with confidentiality or non-disclosure policies.

Fortunately, it doesn't need to be this way.

Setup

llm-ls

llm-ls is a language server for LLMs and, being a Hugging Face product, at the moment it only supports models hosted on Hugging Face. The beauty of open-source licenses, however, is that if you need a feature you're welcome to take the existing code and adapt it to your needs.

So, I've created a fork that lets you make requests to other providers through built-in adapters. One of these providers is Ollama, which runs models locally.

Soon my fork should be part of an llm-ls release, but for now, if you want to use it with Ollama, you'll need to clone or fork mine and build it.

git clone git@github.com:noahbald/llm-ls.git
cd llm-ls
git checkout feat/client-custom-data-support
cargo build

Ollama

Ollama is quite straightforward to install and use, which is certainly part of its charm.
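Installation is usually a one-liner. On Linux, for example, the official install script does the job (the URL below is what worked for me; check ollama.com if it has moved, and use the installers there for macOS or Windows):

curl -fsSL https://ollama.com/install.sh | sh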

If you're going to be running this on your laptop while you code, be aware that installing and running these models tends to take up quite a lot of resources. I've been using starcoder:1b as my model; the 1b refers to how many parameters the model has (one billion). Generally, the lower the number, the fewer resources the model requires, at the cost of quality of results.

I find 1b to be perfectly adequate, and if you have a machine with 8GB, 16GB, or 32GB of memory and a decent GPU, you could bump your model up to 7b, 13b, or 34b parameters.
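If you want to try a model out before wiring it into your editor, you can pull and prompt it straight from the terminal. Treat the tags below as examples of what the Ollama library offers rather than a definitive list:

ollama pull starcoder:1b
ollama run starcoder:1b "Write a TypeScript function that debounces another function"
# Plenty of memory and a decent GPU? Try a larger variant
ollama pull codellama:7b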

Alternatively, you could run Ollama on a self-hosted server if you don't want to use up your local resources.
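As a rough sketch, assuming a server on your network called ollama-box (the hostname is a placeholder; 11434 is Ollama's default port):

# On the server: listen on all interfaces rather than just localhost
OLLAMA_HOST=0.0.0.0 ollama serve
# From your machine: confirm the server responds, then point your editor tooling at it
curl http://ollama-box:11434/api/generate -d '{"model": "starcoder:1b", "prompt": "def fizzbuzz("}'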

Using it in your editor

I use Neovim and already have a fork of llm.nvim up that should also be merged into the main repo soon. At the moment it won't work quite right out of the box: you'll need to rename the options being sent to llm-ls from snake_case to camelCase.

There should be options for VS Code and IntelliJ up soon enough as well.

Mocking other Copilot Features

Copilot provides more than just code generation, and with Ollama we can achieve most of its other features from the terminal.

We can generate code from outside our editor:

ollama run codellama:7b "In tsx, create a new button component" > button.tsx

At-directives can be replaced with their terminal equivalents:

ollama run codellama:7b "$(sed -n '99,109p' api.ts)\n How can I update this code to include credentials"

Diffs can be passed to Ollama to write pull requests for us:

ollama run codellama:7b "$(git diff)\nFor this given diff, please write a description for a pull request"

There are probably also plugins for your editor that streamline these requests to Ollama for you.

Notes

Hopefully, by the time you're reading this, my fork is part of a release of llm-ls and of your editor's respective plugin. If not, you can still take advantage of my work.

  • Clone my fork of llm-ls as linked earlier
  • Run cargo build and you should see a binary called "llm-ls" created somewhere in the target/ directory (see the sketch below)
  • Install my fork of your editor's plugin for llm-ls, and configure it so that the binary path points to your local build of llm-ls
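As a quick sanity check for those first two steps: with a default (debug) build the binary lands under target/debug, and the find command below will locate it wherever cargo put it.

cargo build
# Locate the freshly built language server binary
find target -type f -name llm-ls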