Embeddings.js — Simple Text Embeddings library for Node.js
Embeddings.js is a simple way to get text embeddings in Node.js. Embeddings are useful for text similarity search using a vector database.
await embeddings("Hello World!"); // embedding array
- Easy to use
- Works with any vector database
- Supports multiple embedding models with the same simple interface
  - Local with Xenova/all-MiniLM-L6-v2
  - OpenAI with text-embedding-ada-002
  - Mistral with mistral-embed
- Caches embeddings
- MIT license
Install
npm install @themaximalist/embeddings.js
To use local embeddings, also install the Transformers.js package, which runs the local model:
npm install @xenova/transformers
Configure
Embeddings.js works out of the box with local embeddings, but if you use the OpenAI or Mistral embeddings you’ll need an API key in your environment.
export OPENAI_API_KEY=<your-openai-api-key>
export MISTRAL_API_KEY=<your-mistral-api-key>
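A missing key is easier to diagnose before a request is made. The following is a minimal sketch of such a check, not part of Embeddings.js: the helper name `missingApiKey` and the lookup table are hypothetical, and the variable names `OPENAI_API_KEY` and `MISTRAL_API_KEY` are assumed.

```javascript
// Hypothetical helper, not part of Embeddings.js.
// Maps each remote service to the environment variable assumed to hold its key.
const API_KEY_VARS = {
    openai: "OPENAI_API_KEY",
    mistral: "MISTRAL_API_KEY",
};

// Returns the name of the missing variable, or null when nothing is missing.
// Services without an entry (such as local embeddings) need no key.
function missingApiKey(service, env = process.env) {
    const name = API_KEY_VARS[service];
    if (!name) return null;
    return env[name] ? null : name;
}

const missing = missingApiKey("openai");
if (missing) {
    console.warn(`Set ${missing} before requesting openai embeddings`);
}
```

Passing `env` as a parameter keeps the check easy to test without touching the real environment.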
Usage
Using Embeddings.js is as simple as calling a function with any string.
import embeddings from "@themaximalist/embeddings.js";
// defaults to local embeddings
const embedding = await embeddings("Hello World!");
// 384 dimension embedding array
Switching embedding models is easy:
// openai
const embedding = await embeddings("Hello World", {
    service: "openai"
}); // 1536 dimension embedding array
// mistral
const embedding = await embeddings("Hello World", {
    service: "mistral"
}); // 1024 dimension embedding array
Cache
Embeddings.js caches by default, but you can disable it by passing cache: false as an option.
// don't cache (on by default)
const embedding = await embeddings("Hello World!", {
    cache: false
});
The cache file is written to .embeddings.cache.json. You can also delete this file to reset the cache.
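To illustrate what caching buys you, here is a rough sketch of a cache wrapper. It is not the Embeddings.js implementation: `withCache`, `cacheKey`, and `fakeEmbed` are hypothetical names, and keying entries by service, model, and input is an assumption made so that embeddings from different models are never mixed up.

```javascript
// Hypothetical sketch of a cache wrapper; not the Embeddings.js implementation.
// The key combines service, model, and input so different models never collide.
function cacheKey(input, service, model) {
    return JSON.stringify([service, model, input]);
}

// `store` is a plain object, so it could be persisted with JSON.stringify.
function withCache(embed, store = {}) {
    return async function cachedEmbed(input, options = {}) {
        const { service = "transformers", model = null, cache = true } = options;
        if (!cache) return embed(input, options); // cache: false bypasses the store
        const key = cacheKey(input, service, model);
        if (!(key in store)) {
            store[key] = await embed(input, options);
        }
        return store[key];
    };
}

// Stand-in embedding function that counts how often it is called.
let calls = 0;
const fakeEmbed = async (input) => {
    calls += 1;
    return [input.length, 0.5];
};

const cached = withCache(fakeEmbed);
cached("Hello World!")
    .then(() => cached("Hello World!"))
    .then(() => console.log(calls)); // 1: the second call is served from the store
```

The repeated call never reaches the embedding function, which is the point of caching when the underlying call is a paid API request or a slow local model run.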
API
The Embeddings.js API is a simple function you call with your text, with an optional config object.
await embeddings(
    input, // Text input to compute embeddings
    {
        service: "openai", // Embedding service
        model: "text-embedding-ada-002", // Embedding model
        cache: true, // Cache embeddings
        cache_file: ".embeddings.cache.json", // Cache file
    }
);
Options
- service <string>: Embedding service provider. Default is transformers, a local embedding provider.
- model <string>: Embedding service model. Default is Xenova/all-MiniLM-L6-v2, a local embedding model. If no model is provided, it will use the default for the selected service.
- cache <bool>: Cache embeddings. Default is true.
- cache_file <string>: Cache file. Default is .embeddings.cache.json.
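The defaults described above can be sketched as a small resolver. This is illustrative only and not the library's code: `resolveOptions` and `DEFAULT_MODELS` are hypothetical names, and the per-service models are taken from the feature list in this README.

```javascript
// Hypothetical helper illustrating the documented defaults; not library code.
const DEFAULT_MODELS = {
    transformers: "Xenova/all-MiniLM-L6-v2",
    openai: "text-embedding-ada-002",
    mistral: "mistral-embed",
};

function resolveOptions(options = {}) {
    const service = options.service ?? "transformers";
    return {
        service,
        // When no model is given, fall back to the default for the service.
        model: options.model ?? DEFAULT_MODELS[service],
        cache: options.cache ?? true,
        cache_file: options.cache_file ?? ".embeddings.cache.json",
    };
}

console.log(resolveOptions({ service: "mistral" }));
```

Using `??` rather than `||` matters for `cache`: an explicit `cache: false` must be respected instead of being replaced by the default.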
Response
Embeddings.js returns a float[], an array of floating-point numbers.
[ -0.011776604689657688, 0.024298833683133125, 0.0012317118234932423, ... ]
The length of the array is the number of dimensions of the embedding. When performing text similarity, you’ll want to know the dimensions of your embeddings to use them in a vector database.
Embedding Dimensions
- Local: 384
- OpenAI: 1536
- Mistral: 1024
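Because a vector database index is typically created with a fixed dimension, it can help to check an embedding's length before storing it. A minimal sketch, assuming the dimensions listed above apply to each service's default model; `assertDimensions` and `DIMENSIONS` are hypothetical names, not part of Embeddings.js.

```javascript
// Hypothetical guard; dimensions assumed to match each service's default model.
const DIMENSIONS = { transformers: 384, openai: 1536, mistral: 1024 };

// Returns the embedding unchanged when its length matches, otherwise throws.
function assertDimensions(embedding, service) {
    const expected = DIMENSIONS[service];
    if (expected === undefined) {
        throw new Error(`Unknown service: ${service}`);
    }
    if (!Array.isArray(embedding) || embedding.length !== expected) {
        throw new Error(
            `Expected ${expected} dimensions for ${service}, got ${embedding?.length}`
        );
    }
    return embedding;
}

// A zero-filled array stands in for a real local embedding here.
const placeholder = new Array(384).fill(0);
assertDimensions(placeholder, "transformers");
```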
The Embeddings.js API ensures you have a simple way to use embeddings from multiple providers.
Debug
Embeddings.js uses the debug npm module with the embeddings.js namespace.
View debug logs by setting the DEBUG environment variable.
> DEBUG=embeddings.js* node src/get_embeddings.js
# debug logs
Vector Database
Embeddings can be used in any vector database like Pinecone, Chroma, PG Vector, etc…
For a local vector database that runs in-memory and uses Embeddings.js internally, check out VectorDB.js.
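To make the idea of similarity search concrete, here is a small sketch of what a vector database does at its core: rank stored embeddings by cosine similarity to a query embedding. It does not describe how VectorDB.js or any particular database works internally. `cosineSimilarity` and `nearest` are hypothetical helpers, and the three-dimensional vectors are toy stand-ins for real embeddings.

```javascript
// Cosine similarity between two equal-length embedding arrays.
// Note: a zero vector yields NaN because its norm is 0.
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error("Embeddings must have the same dimensions");
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: score every entry and sort by descending similarity.
function nearest(query, entries) {
    return entries
        .map(({ text, embedding }) => ({ text, score: cosineSimilarity(query, embedding) }))
        .sort((x, y) => y.score - x.score);
}

// Toy vectors; real embeddings would have 384, 1536, or 1024 dimensions.
const entries = [
    { text: "red", embedding: [1, 0, 0] },
    { text: "green", embedding: [0, 1, 0] },
];
console.log(nearest([0.9, 0.1, 0], entries)[0].text); // "red"
```

A brute-force scan like this is fine for small collections. Dedicated vector databases add indexing so the search stays fast as the collection grows.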
Projects
Embeddings.js
is currently used in the following
projects:
- AI.js — simple AI library
- VectorDB.js — local text similarity search
- HyperType — knowledge graph toolkit
- HyperTyper — multidimensional mind mapping
License
MIT
Author
Created by The Maximalist, see our open-source projects.