EmbedJs
EmbedJs is an open source framework for personalizing LLM responses: a toolkit for building powerful Retrieval-Augmented Generation (RAG) and Large Language Model (LLM) applications with ease in Node.js.
It segments data into manageable chunks, generates relevant embeddings, and stores them in a vector database for optimized retrieval. It enables users to extract contextual information, find precise answers, or engage in interactive chat conversations, all tailored to their own data.
Here's an example of how easy it is to get started -
const ragApplication = await new RAGApplicationBuilder()
    .addLoader({ type: 'YoutubeSearch', youtubeSearchString: 'Tesla cars' })
    .addLoader('https://en.wikipedia.org/wiki/Tesla,_Inc.')
    .addLoader('https://tesla-info.com/sitemap.xml')
    .setVectorDb(new LanceDb({ path: '.db' }))
    .build();
That's it. Now you can ask questions -
console.log(await ragApplication.query('Give me the history of Tesla?'));
Features
- Supports all popular large language models - paid and open source.
- Supports many vector databases, including self-hosted and cloud variants.
- Loads different kinds of unstructured data and comes built in with several loaders that make this easy.
- Supports several cache options that can greatly improve the performance of your RAG applications in production.
- Exposes a simple and highly configurable API that allows both a quick launch and deep customizability.
- Can be used just as an embedding engine or as a full blown chat API with history.
Quick note
The author(s) are looking to add core maintainers for this open source project. Reach out on LinkedIn if you are interested. If you want to contribute in general - create issues on GitHub or send in PRs.
Contents
- EmbedJs
- Contents
- Getting started
- Loaders supported
- LLMs
- Embedding models
- Vector databases supported
- Caches
- Conversation history
- Langsmith Integration
- Sample projects
- Contributors
Getting started
Installation
You can install the library via NPM or Yarn
npm i @llm-tools/embedjs
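Or with Yarn -
yarn add @llm-tools/embedjs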
Note: The library uses the newer ES6 modules and import syntax.
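For example, in an ESM project you can import the builder used throughout this README directly from the package. This is a minimal sketch, assuming RAGApplicationBuilder is exported from the same @llm-tools/embedjs entry point as the loaders -
// minimal ESM import sketch (export path assumed from the examples in this README)
import { RAGApplicationBuilder } from '@llm-tools/embedjs';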
Usage
To configure a new EmbedJs application, you need to do three steps -
1. Pick an LLM
The library supports several LLMs. Activate one by following the instructions in the LLM section.
const ragApplication = await new RAGApplicationBuilder()
    .setModel(new HuggingFace({ modelName: 'mistralai/Mixtral-8x7B-v0.1' }))
    ...
Note: To use the library only for embeddings and not instantiate an LLM, you can pass the string NO_MODEL to the setModel function here. This disables the query function, but you can still get the embeddings with the getContext method.
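For instance, a minimal embeddings-only setup (a sketch that combines the NO_MODEL option with the LanceDb example from the introduction above) could look like -
const ragApplication = await new RAGApplicationBuilder()
    .setModel('NO_MODEL')
    .setVectorDb(new LanceDb({ path: '.db' }))
    .build();

// query() is disabled in this mode, but getContext() still returns the relevant chunks
const context = await ragApplication.getContext('What is Tesla?');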
2. Pick a Vector database
The library supports several vector databases. Enable one by following the instructions in the Vector Databases section.
.setVectorDb(new PineconeDb({ projectName: 'test', namespace: 'dev' }))
3. Load some data
The library supports several kinds of loaders. You can use zero, one or many kinds of loaders together to import custom knowledge. Read the loaders section to learn more about the different supported loaders.
.addLoader(new YoutubeSearchLoader({ searchString: 'Tesla cars' }))
.addLoader(new SitemapLoader({ url: 'https://tesla-info.com/sitemap.xml' }))
.build();
That's it! Now that you have your instance of RAGApplication, you can use it to query against the loaded data sets, like so -
await ragApplication.query('What is Tesla?');
Temperature
The temperature is a number between 0 and 1. It governs the randomness and creativity of the LLM responses. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. You can alter it by -
await new RAGApplicationBuilder()
.setTemperature(0.1)
NOTE: The default value is 0.1, which makes the LLM responses very precise.
Search results count
This is the number of documents to aim for when retrieving results from the vector database. A high number of results might mean there is more non-relevant data in the context. A low number might mean none of the relevant documents are retrieved. You need to set the number that works best for you. The parameter can be altered by -
await new RAGApplicationBuilder()
.setSearchResultCount(10)
NOTE: The default value is 7.
It is important to note that the library does not simply dump all contextual document chunks into the prompt. It sends them to the model, marking them as context documents, and the chunks still count toward the token limit.
When the number of documents fetched leads to a request above the token limit, the library uses the following strategy -
It runs a preprocessing step to select relevant sections from each document until the total number of tokens is less than the maximum number of tokens allowed by the model. It then uses the transformed documents as context to answer the question.
Customize the prompt
LLMs need some care. They are notorious for inventing responses when they don't know the answer. Keeping this in mind, the library automatically adds a wrapper to all user queries. The default prompt is -
Use all the provided context to answer the query at the end. Answer in full. If you don't know the answer, just say that you don't know, don't try to make up an answer. Query: {0}
The placeholder {0} is replaced with the input query. In some cases, you may want to customize this prompt. This can be done with ease by -
await new RAGApplicationBuilder()
.setQueryTemplate('My own query template')
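For example, assuming your custom template should also contain the {0} placeholder so the user query is substituted in (an assumption based on the default prompt above), a sketch could look like -
await new RAGApplicationBuilder()
    // hypothetical template text; {0} is assumed to be replaced with the user query
    .setQueryTemplate('Use the context to answer concisely. If the context is insufficient, say so. Query: {0}')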
Get context (dry run)
During development, you may want to test the performance and quality of the Loaders you have enabled without making any LLM calls. You can do this by using the getContext method -
await ragApplication.getContext('What is Steve Jobs?')
Delete loader
You can remove the embeddings added from a specific loader by calling the deleteLoader method with the uniqueId of the loader.
await ragApplication.deleteLoader('uniqueId...', true)
Get count of embedded chunks
You can fetch the count of embeddings stored in your vector database at any time by calling the getEmbeddingsCount method -
await ragApplication.getEmbeddingsCount()
Remove all embeddings / reset
You can remove all stored embeddings in the vectorDb using the deleteAllEmbeddings method -
await ragApplication.deleteAllEmbeddings(true)
Set cut-off for relevance
The library can filter out embeddings returned from a vector store that have a low relevance score to the query being asked. To do this, set the cut-off value using the setEmbeddingRelevanceCutOff method -
await ragApplication.setEmbeddingRelevanceCutOff(0.23)
Add new loaders later
You can add new loaders at any point dynamically (even after calling the build function on RAGApplicationBuilder). To do this, simply call the addLoader method -
await ragApplication.addLoader(new YoutubeLoader({ videoIdOrUrl: 'pQiT2U5E9tI' }));
Note: Do not forget to await dynamically added loaders so the load completes before you make queries against the new data.
Loader inference
You can add most loaders by passing a string to the addLoader or the addLoaders methods. The value can be a URL, a file path, JSON or a youtube video id. The library will infer the type of content and invoke the appropriate loader automatically.
await ragApplication.addLoader('pQiT2U5E9tI'); // invokes the YouTube loader
await ragApplication.addLoader('https://lamport.azurewebsites.net/pubs/paxos-simple.pdf'); // invokes the PDF loader
Note: If you pass the path to a local directory, every file in that directory is recursively added (including subfolders)!
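For example, assuming ./docs is a local folder on disk (a hypothetical path), the whole folder can be added in one call -
await ragApplication.addLoader('./docs'); // hypothetical local directory; every file inside is added recursively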
Loaders supported
Loaders take a specific input format, process it and create chunks of the data. You can import all the loaders from the path @llm-tools/embedjs. Currently, the library supports the following formats -
Youtube video
To add any youtube video to your app, use YoutubeLoader.
.addLoader(new YoutubeLoader({ videoIdOrUrl: 'w2KbwC-s7pY' }))
Youtube channel
To add all videos in a youtube channel, use YoutubeChannelLoader.
.addLoader(new YoutubeChannelLoader({ youtubeChannelId: '...' }))
Youtube search
To do a general youtube search and add the popular search results, use YoutubeSearchLoader.
.addLoader(new YoutubeSearchLoader({ youtubeSearchString: '...' }))
PDF file
To add a pdf file, use PdfLoader. You can add a local file -
.addLoader(new PdfLoader({ filePathOrUrl: path.resolve('paxos-simple.pdf') }))
Or, you can add a remote file -
.addLoader(new PdfLoader({ url: 'https://lamport.azurewebsites.net/pubs/paxos-simple.pdf' }))
Note: Currently there is no support for PDF forms and password-protected documents.
Docx file
To add a docx file, use DocxLoader. You can add a local file -
.addLoader(new DocxLoader({ filePathOrUrl: path.resolve('paxos.docx') }))
Or, you can add a remote file -
.addLoader(new DocxLoader({ filePathOrUrl: 'https://xxx' }))
Excel file
To add an Excel (xlsx) file, use ExcelLoader. You can add a local file -
.addLoader(new ExcelLoader({ filePathOrUrl: path.resolve('numbers.xlsx') }))
Or, you can add a remote file -
.addLoader(new ExcelLoader({ filePathOrUrl: 'https://xxx' }))
Powerpoint file
To add a PowerPoint (pptx) file, use PptLoader. You can add a local file -
.addLoader(new PptLoader({ filePathOrUrl: path.resolve('wow.pptx') }))
Or, you can add a remote file -
.addLoader(new PptLoader({ filePathOrUrl: 'https://xxx' }))
Web page
To add a web page, use WebLoader.
.addLoader(new WebLoader({ urlOrContent: 'https://en.wikipedia.org/wiki/Formula_One' }))
Confluence
To add a confluence space, use ConfluenceLoader.
.addLoader(new ConfluenceLoader({ spaceNames: ['...'] }))
You also need to set the following environment variables -
CONFLUENCE_BASE_URL=<your space base url>
CONFLUENCE_USER_NAME=<your email id or username>
CONFLUENCE_API_TOKEN=<your personal or bot access token>
Note: The confluence space name is the value you see in the URL on the space overview page /wiki/spaces/{{ space name }}/overview.
Sitemap
To add an XML sitemap, use SitemapLoader.
.addLoader(new SitemapLoader({ url: '...' }))
This will load all URLs in a sitemap via the WebLoader.
Text
To supply your own text, use TextLoader.
.addLoader(new TextLoader({ text: 'The best company name for a company making colorful socks is MrSocks' }))
Note: Feel free to add your custom text without worrying about duplication. The library will chunk, cache and update the vector databases without creating duplicates.
Json
To add a parsed JavaScript object to your embeddings, use JsonLoader. The library will not parse a string to JSON on its own, but once this is done, it can be ingested easily.
.addLoader(new JsonLoader({ object: { key: value, ... } }))
Note: If you want to restrict the keys that get added to the vectorDb in a dynamically obtained object, you can use the pickKeysForEmbedding optional parameter in the JsonLoader constructor.
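If your data arrives as a JSON string (for example from an HTTP response), parse it yourself before handing the resulting object to JsonLoader. A minimal sketch, with a placeholder URL -
const raw = await fetch('https://example.com/data.json').then((res) => res.text()); // placeholder URL
await ragApplication.addLoader(new JsonLoader({ object: JSON.parse(raw) }));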
Csv
To add a Csv file (or URL) to your embeddings, use CsvLoader. The library will parse the Csv and add each row to its vector database.
.addLoader(new CsvLoader({ filePathOrUrl: '...' }))
Note: You can control how the CsvLoader parses the file in great detail by passing in the optional csvParseOptions constructor parameter.
Add a custom loader
You can pass along a custom loader to the addLoader method by extending and implementing the abstract class BaseLoader. Here's what that would look like -
class CustomLoader extends BaseLoader<{ customChunkMetadata: string }> {
    constructor() {
        // a unique id that identifies this loader instance
        super('uniqueId');
    }

    // an async generator that yields the chunks produced by this loader
    async *getChunks() {
        throw new Error('Method not implemented.');
    }
}
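Once implemented, the custom loader is passed to the addLoader method just like any built-in loader -
await ragApplication.addLoader(new CustomLoader());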
We really encourage you to send in a PR to this library if you are implementing a common loader.