
Create vector embeddings

To use Aerospike Vector Search (AVS), you must build an application that generates vector embeddings. This page outlines some general approaches for generating vector embeddings using Python and a machine learning model.

Generate embeddings using a hosted service

Using a hosted embedding service like OpenAI's offers ease of use, quick deployment, and access to state-of-the-art models without significant infrastructure investment. The provider handles scaling, updates, and maintenance, letting you focus on application development rather than managing model infrastructure.

  1. Install the OpenAI Python client library if you haven’t already:

    Terminal window
    pip install openai
  2. Use the following Python code to generate a vector embedding:

    from openai import OpenAI
    # Create a client with your OpenAI API key
    client = OpenAI(api_key="your-api-key-here")
    # Define the text chunk for which you want to generate an embedding
    text_chunk = "OpenAI's GPT-4 is a powerful language model capable of performing a wide range of natural language processing tasks."
    # Generate the embedding
    response = client.embeddings.create(
        input=text_chunk,
        model="text-embedding-ada-002"
    )
    # Extract the embedding vector (a list of 1536 floats for this model)
    embedding_vector = response.data[0].embedding
    # Print the embedding vector
    print(embedding_vector)
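
The embeddings endpoint also accepts a list of inputs, so you can embed many chunks in a single request. The following is a minimal sketch reusing the client from the step above; the sample chunks are placeholders for your own document text:

    # Placeholder chunks; in practice these come from your documents
    chunks = [
        "Aerospike Vector Search indexes high-dimensional vectors.",
        "Vector embeddings represent text as points in a vector space.",
    ]
    # One request returns one embedding per input, in the same order
    response = client.embeddings.create(
        input=chunks,
        model="text-embedding-ada-002"
    )
    vectors = [item.embedding for item in response.data]
    print(len(vectors), len(vectors[0]))  # number of vectors, their dimensionality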

Self-host an open-source model

Self-hosting a machine learning model offers enhanced data privacy, security, and control over the environment, making it easier to comply with regulatory requirements and to optimize performance. It can also be more cost-effective in high-usage scenarios, eliminating dependence on third-party providers and reducing latency.

The following example shows how to generate a vector embedding from a chunk of text using the LLaMA model. You can use the generated vector for downstream tasks such as similarity search and other vector computations, as the sketch after these steps shows.

  1. Install the required libraries:

    Terminal window
    pip install transformers
    pip install torch
  2. Use the following Python code to generate a vector embedding:

    from transformers import AutoTokenizer, AutoModel
    import torch
    # Load the LLaMA model and tokenizer
    model_name = "facebook/llama-7b"  # Replace with the correct model name
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    # Define the text chunk for which you want to generate an embedding
    text_chunk = "LLaMA is a powerful language model capable of performing a wide range of natural language processing tasks."
    # Tokenize the text chunk
    inputs = tokenizer(text_chunk, return_tensors="pt")
    # Generate the embeddings without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    # The per-token embeddings are in the 'last_hidden_state' tensor
    embeddings = outputs.last_hidden_state
    # Average the token embeddings to get a single vector representation
    embedding_vector = torch.mean(embeddings, dim=1).squeeze().numpy()
    # Print the embedding vector
    print(embedding_vector)
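
AVS performs similarity search at scale, but the underlying comparison is simple. The following is a minimal numpy sketch of cosine similarity between two embeddings; the helper function and toy vectors are illustrative, not part of AVS or the code above:

    import numpy as np

    def cosine_similarity(a, b):
        # Dot product divided by the product of the vector norms
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Toy vectors standing in for embeddings generated as above;
    # values close to 1.0 indicate semantically similar text
    vec_a = np.array([0.1, 0.3, 0.5])
    vec_b = np.array([0.2, 0.25, 0.55])
    print(cosine_similarity(vec_a, vec_b))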
