🔑 Getting started

Chroma is a database for building AI applications with embeddings. It comes with everything you need to get started built in, and runs on your machine. A hosted version is coming soon!

1. Install

npm install --save chromadb # yarn add chromadb

The JavaScript client talks to a Chroma backend server. To use the Chroma CLI and run that server, you will also need to install the Chroma Python package.

pip install chromadb

Alternatively, you can use a Docker container to run the Chroma backend server.

2. Get the Chroma Client

Start the Chroma backend server:

chroma run --path /db_path

Then create a client which connects to it:

// CJS (use either this line or the ESM import below, not both)
const { ChromaClient } = require("chromadb");

// ESM
import { ChromaClient } from "chromadb";

const client = new ChromaClient();
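Before moving on, it can help to confirm that the client can actually reach the server. The sketch below assumes the client object exposes an async `heartbeat()` method, as the `chromadb` JavaScript client does at the time of writing. `checkConnection` is a hypothetical helper name, not part of Chroma.

```javascript
// Hypothetical helper: resolves to true if the Chroma server answers a heartbeat.
// Assumes `client` has an async heartbeat() method (as ChromaClient does).
async function checkConnection(client) {
  try {
    await client.heartbeat();
    return true;
  } catch (err) {
    // Typically means the server started with `chroma run` is not reachable.
    console.error("Could not reach the Chroma server:", err.message);
    return false;
  }
}

// Usage with a real client (requires a running server):
//   const { ChromaClient } = require("chromadb");
//   const ok = await checkConnection(new ChromaClient());
```

If this resolves to `false`, check that the server from the previous step is still running before creating collections.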

3. Create a collection

Collections are where you'll store your embeddings, documents, and any additional metadata. You create a collection by giving it a name and, optionally, an embedding function.

For this example, we want to generate embeddings from text. OpenAI's ada-002 model is a popular choice and only needs a quick signup. Grab your API key and come back. Chroma's JavaScript API can run in the browser or server-side, but OpenAI's cannot, so run this example server-side.

Caution

Please take steps to secure your API key when interacting with frontend systems.

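// CJS (use either this line or the ESM import below, not both)
const { OpenAIEmbeddingFunction } = require("chromadb");

// ESM
import { OpenAIEmbeddingFunction } from "chromadb";

// Embedding function that calls OpenAI with your API key
const embedder = new OpenAIEmbeddingFunction({
  openai_api_key: "your_api_key",
});

// Create the collection and attach the embedding function to it
const collection = await client.createCollection({
  name: "my_collection",
  embeddingFunction: embedder,
});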
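One way to act on the caution above is to keep the key out of source code and read it from the environment on the server. This is a minimal sketch, not something Chroma requires: `requireEnv` is a hypothetical helper, and `OPENAI_API_KEY` is an assumed variable name.

```javascript
// Hypothetical helper: read a required secret from the environment,
// failing fast with a clear message when it is missing or blank.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (OPENAI_API_KEY is an assumed variable name):
//   const embedder = new OpenAIEmbeddingFunction({
//     openai_api_key: requireEnv("OPENAI_API_KEY"),
//   });
```

Failing at startup is usually easier to debug than a rejected embedding request later on.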

4. Add some text documents to the collection

Chroma will store your text, and handle tokenization, embedding, and indexing automatically.

await collection.add({
  ids: ["id1", "id2"],
  metadatas: [{ source: "my_source" }, { source: "my_source" }],
  documents: ["This is a document", "This is another document"],
});

If you have already generated embeddings yourself, you can load them directly in:

await collection.add({
  ids: ["id1", "id2"],
  embeddings: [
    [1.2, 2.3, 4.5],
    [6.7, 8.2, 9.2],
  ],
  metadatas: [{ source: "my_source" }, { source: "my_source" }],
  documents: ["This is a document", "This is another document"],
});
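When you build these payloads programmatically, the parallel arrays are easy to get out of step. The sketch below is a hypothetical client-side pre-check, not part of Chroma (which performs its own validation). It assumes ids should be unique, that each optional field needs one entry per id, and that all embeddings share one length.

```javascript
// Hypothetical pre-check for an add() payload. Returns a list of problems;
// an empty list means the payload looks consistent.
function validateAddPayload({ ids, embeddings, metadatas, documents }) {
  if (!Array.isArray(ids) || ids.length === 0) {
    return ["ids must be a non-empty array"];
  }
  const errors = [];
  if (new Set(ids).size !== ids.length) {
    errors.push("ids must be unique");
  }
  // Every optional field, when present, needs one entry per id.
  const fields = { embeddings, metadatas, documents };
  for (const [name, field] of Object.entries(fields)) {
    if (Array.isArray(field) && field.length !== ids.length) {
      errors.push(`${name} has ${field.length} entries, expected ${ids.length}`);
    }
  }
  // Assumed rule: all embeddings have the same number of dimensions.
  if (Array.isArray(embeddings) && embeddings.length > 0) {
    const dim = embeddings[0].length;
    if (embeddings.some((e) => e.length !== dim)) {
      errors.push("embeddings must all have the same length");
    }
  }
  return errors;
}
```

Calling `validateAddPayload(payload)` before `collection.add(payload)` lets you report every mismatch at once instead of handling a single server-side error.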

5. Query the collection

You can query the collection with a list of query texts, and Chroma will return the n most similar results. It's that easy!

const results = await collection.query({
  nResults: 2,
  queryTexts: ["This is a query document"],
});
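The `results` object groups hits per query text. The sketch below assumes that `ids`, `documents`, `metadatas`, and `distances` are each arrays containing one inner array per query text, and that some of them may be absent. `hitsForQuery` is a hypothetical helper that flattens the hits for one query into plain objects.

```javascript
// Hypothetical helper: collect the hits for one query text.
// Assumes each result field holds one inner array per query text.
function hitsForQuery(results, queryIndex = 0) {
  const ids = results.ids[queryIndex] ?? [];
  return ids.map((id, i) => ({
    id,
    // Optional fields fall back to null when they were not returned.
    document: results.documents?.[queryIndex]?.[i] ?? null,
    metadata: results.metadatas?.[queryIndex]?.[i] ?? null,
    distance: results.distances?.[queryIndex]?.[i] ?? null,
  }));
}

// Usage with the `results` from the query above:
//   for (const hit of hitsForQuery(results)) {
//     console.log(hit.id, hit.distance, hit.document);
//   }
```

Because `queryTexts` is a list, pass `queryIndex` to read the hits for the second or later query text.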

Find chromadb on npm.

📚 Next steps

  • Chroma is designed to be simple enough to get started with quickly and flexible enough to meet many use cases. You can use your own embedding models, query Chroma with your own embeddings, and filter on metadata. To learn more about Chroma, check out the Usage Guide and API Reference.
  • Chroma is integrated in LangChain (Python and JS), making it easy to build AI applications with Chroma. Check out the integrations page to learn more.
  • You can deploy a persistent instance of Chroma to an external server, to make it easier to work on larger projects or with a team.
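As a taste of the first item above, here is a minimal sketch of querying with your own embedding while filtering on metadata. It assumes `collection.query()` accepts `queryEmbeddings`, `nResults`, and a `where` filter, as the `chromadb` JavaScript client does at the time of writing. `queryBySource` is a hypothetical wrapper, and the `source` key matches the metadata used earlier in this guide.

```javascript
// Hypothetical wrapper: query by a precomputed embedding and keep only
// items whose metadata `source` equals the given value.
async function queryBySource(collection, embedding, source, nResults = 2) {
  return collection.query({
    queryEmbeddings: [embedding], // one embedding per query
    nResults,
    where: { source }, // metadata filter
  });
}

// Usage with the collection created earlier:
//   const results = await queryBySource(collection, [1.2, 2.3, 4.5], "my_source");
```

The query embedding should have the same number of dimensions as the embeddings stored in the collection.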

Coming Soon

  • A hosted version of Chroma, with an easy-to-use web UI and API
  • Multiple datatypes, including images, audio, video, and more