Azure AI Search: How to Optimize Your Searches for Your RAG Applications

How does Azure AI Search work?

Azure AI Search is an information retrieval service that uses AI to improve the relevance and accuracy of results.

Azure AI Search allows you to index information using vectors, which means it can search not only by keywords but also by concepts and relationships between terms. This is especially useful for Retrieval-Augmented Generation (RAG) applications.

Example of an Indexing Process

What Are Vector Embeddings?

A vector embedding is a numerical representation of an object, such as text or an image, that captures its semantic features. In the context of Azure AI Search, embeddings allow the search engine to better understand the meaning and relationship between different terms.

Example:

"Dog" => [0.1, 0.2, 0.3, ...]
"Animal" => [0.4, 0.5, 0.6, ...]

In this example (the values are illustrative), the vectors for "Dog" and "Animal" are close in the vector space, indicating that they are semantically related. This allows Azure AI Search to find relevant results even if the exact words do not match.
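As a rough illustration of what "close in the vector space" means, the sketch below ranks a few made-up three-dimensional vectors by their cosine similarity to "Dog" (the metric is covered later in this article). The numbers are invented for the example, real embeddings have hundreds or thousands of dimensions, and `rank_by_similarity` is a hypothetical helper, not part of any SDK.

```python
import math

# Toy 3-dimensional "embeddings"; the values are invented for illustration only.
TOY_VECTORS = {
    "Dog": [0.9, 0.8, 0.1],
    "Animal": [0.8, 0.9, 0.2],
    "Invoice": [0.1, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_term, vectors):
    """Return (term, similarity) pairs for every other term, most similar first."""
    query = vectors[query_term]
    scored = [
        (term, cosine(query, vec))
        for term, vec in vectors.items()
        if term != query_term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for term, score in rank_by_similarity("Dog", TOY_VECTORS):
    print(f"{term}: {score:.3f}")
```

With these toy values, "Animal" ranks well above "Invoice", which is the behavior a vector index relies on when it retrieves nearest neighbors.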

Generating Vector Embeddings with OpenAI

Models for Embedding Generation

text-embedding-ada-002: A model optimized for embedding tasks, offering a good balance between speed and accuracy.
text-embedding-3-small: A small-sized model, ideal for applications that require fast embedding generation at a lower cost.
text-embedding-3-large: A large-sized model that provides higher accuracy in embedding generation, but at a higher cost and slower response time.
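The model choice also determines the size of the vector field declared in the index. The sketch below records the default output dimensions of the three models as documented by OpenAI; `default_dimensions` is a hypothetical helper name used only for this example.

```python
# Default output dimensions of the OpenAI embedding models listed above.
DEFAULT_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def default_dimensions(model_name: str) -> int:
    """Return the default embedding length for a known model name."""
    try:
        return DEFAULT_DIMENSIONS[model_name]
    except KeyError:
        raise ValueError(f"Unknown embedding model: {model_name}") from None


print(default_dimensions("text-embedding-3-small"))
```

The returned value is what the vector field's dimension setting in the index has to match; a mismatch between the model output and the index definition causes document uploads to fail.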

Example of Embedding Generation

import os
import dotenv
import openai

dotenv.load_dotenv()

openai_client = openai.OpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"]
)
MODEL_NAME = "text-embedding-3-small"

content_input = "Resume: Lionel Messi. Argentine footballer, considered one of the best football players of all time."

embeddings_response = openai_client.embeddings.create(
    model=MODEL_NAME,
    input=content_input,
)
embedding = embeddings_response.data[0].embedding

print(len(embedding))
print(embedding)

1536
[0.01875680685043335, -0.01207759603857994, -0.014209441840648651,...]

Comparing Similarity Between Vectors

To compare two vectors, you can use a similarity metric such as cosine similarity. This metric is the cosine of the angle between the two vectors, so it tells you how closely they point in the same direction regardless of their length. Values near 1 mean the vectors are very similar; values near 0 mean they are unrelated.

For example, suppose we want to compare the embedding of the text generated earlier with that of a query phrase such as "Diego Zumárraga Mera." We can compute the similarity between the two embeddings returned by OpenAI.
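For reference, the standard definition for two vectors a and b of length n is shown below; cosine distance is simply one minus the similarity.

```latex
\[
\operatorname{sim}(\mathbf{a}, \mathbf{b}) = \cos\theta
= \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
= \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}},
\qquad
d_{\cos}(\mathbf{a}, \mathbf{b}) = 1 - \operatorname{sim}(\mathbf{a}, \mathbf{b})
\]
```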

First, generate the embeddings for both phrases:

content_input = "Resume: Lionel Messi. Argentine footballer, considered one of the best football players of all time."

embeddings_response = openai_client.embeddings.create(
    model=MODEL_NAME,
    input=content_input,
)
content_embedding = embeddings_response.data[0].embedding
print(content_embedding)

embeddings_response = openai_client.embeddings.create(
    model=MODEL_NAME,
    input="Diego Zumárraga Mera",
)
query_embedding1 = embeddings_response.data[0].embedding
print(query_embedding1)

Then, calculate the cosine similarity between the two vectors:

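def cosine_similarity(v1, v2):
    """Return the cosine of the angle between two equal-length vectors."""
    # Dot product of the two vectors.
    dot_product = sum(a * b for a, b in zip(v1, v2))

    # Product of the two Euclidean norms: sqrt(|v1|^2 * |v2|^2).
    magnitude = (sum(a**2 for a in v1) * sum(b**2 for b in v2)) ** 0.5

    return dot_product / magnitude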

# Compare the two vectors
similarity = cosine_similarity(query_embedding1, content_embedding)
print(f"Similarity: {similarity:.4f}")

Similarity: 0.1799

As we can see, the similarity between the two vectors is approximately 0.1799, indicating that they are not very semantically similar.

Now let's change the query phrase to "Summarize the resume of Diego Zumárraga Mera" and recalculate the similarity:

embeddings_response = openai_client.embeddings.create(
    model=MODEL_NAME,
    input="Summarize the resume of Diego Zumárraga Mera",
)
query_embedding2 = embeddings_response.data[0].embedding
print(query_embedding2)

# Compare the two vectors
similarity = cosine_similarity(query_embedding2, content_embedding)
print(f"Similarity: {similarity:.4f}")

Similarity: 0.3429

Full code in the notebook: VectorEmbeddings.ipynb

The similarity almost doubled, even though the content is still Lionel Messi's resume and the query asks about a different person. The likely cause is the shared word "resume": it pulls the query embedding toward any text that contains it.

Conclusion

If we build a RAG solution that indexes and searches documents such as resumes, it may struggle to find a specific resume when the query includes common words like "Resume." Azure AI Search may return documents that share that word but are not actually relevant to the query.

Solution: Configure Semantic Search in Azure AI Search

To improve the relevance of results, we can configure semantic search in Azure AI Search to take into account other index features such as the document title or additional keywords.

Configuring Semantic Search

When creating an index in Azure AI Search, we can create two fields to use as title and keywords. These fields can then be used in the semanticSearch configuration to improve the relevance of results.

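from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchableField,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SemanticConfiguration,
    SemanticField,
    SemanticPrioritizedFields,
    SemanticSearch,
    SimpleField,
    VectorSearch,
    VectorSearchProfile,
)

def create_index_definition(name: str) -> SearchIndex:
    """
    Returns an Azure AI Search index definition with the given name.
    The index includes vector search with the default HNSW algorithm
    and a semantic configuration over title, content, and keywords.
    """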
    # The fields we want to index. The "embedding" field is a vector field that will
    # be used for vector search.
    fields=[
            SimpleField(name="id", type=SearchFieldDataType.String, key=True),
            SearchableField(name="title", type=SearchFieldDataType.String, searchable=True, filterable=True, facetable=True),
            SearchableField(name="content", type=SearchFieldDataType.String, searchable=True),
            SearchableField(name="keywords", type=SearchFieldDataType.String, searchable=True, filterable=True, facetable=True),
            SearchField(
                name="embedding",
                type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True,
                # Size of the vector created by the text-embedding-3-small model.
                vector_search_dimensions=1536,
                vector_search_profile_name="myHnswProfile",
            ),
        ]

    # The "content" field should be prioritized for semantic ranking.
    semantic_config = SemanticConfiguration(
        name="default",
        prioritized_fields=SemanticPrioritizedFields(
            content_fields=[SemanticField(field_name="content")],
            title_field=SemanticField(field_name="title"),
            keywords_fields=[SemanticField(field_name="keywords")],
        ),
    )

    # For vector search, we want to use the HNSW (Hierarchical Navigable Small World)
    # algorithm (a type of approximate nearest neighbor search algorithm) with cosine
    # distance.
    vector_search = VectorSearch(
        algorithms=[
            HnswAlgorithmConfiguration(
                name="myHnsw")
        ],
        profiles=[
            VectorSearchProfile(
                name="myHnswProfile",
                algorithm_configuration_name="myHnsw",
            )
        ]
    )

    # Create the semantic settings with the configuration
    semantic_search = SemanticSearch(configurations=[semantic_config])

    # Create the search index.
    index = SearchIndex(
        name=name,
        fields=fields,
        semantic_search=semantic_search,
        vector_search=vector_search,
    )

    return index
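For readers who define indexes through the REST API or the portal's JSON editor rather than the Python SDK, the same semantic configuration might look like the dictionary below. This is a sketch that assumes the property names of the stable REST API (2023-11-01 and later); check the API version you target before relying on it.

```python
import json

# Semantic configuration as it would appear under the "semantic" key of an
# index definition sent to the REST API. Field names ("title", "content",
# "keywords") match the index fields defined above.
semantic_settings = {
    "configurations": [
        {
            "name": "default",
            "prioritizedFields": {
                "titleField": {"fieldName": "title"},
                "prioritizedContentFields": [{"fieldName": "content"}],
                "prioritizedKeywordsFields": [{"fieldName": "keywords"}],
            },
        }
    ]
}

print(json.dumps({"semantic": semantic_settings}, indent=2))
```

Only one title field is allowed per configuration, while content and keywords accept ordered lists, which is why the first is a single object and the other two are arrays.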

When indexing a document, we must make sure to populate the title and keywords fields. In these fields, avoid common, repetitive phrases such as "Resume" or "User Manual," and prefer keywords that are specific to the document's content.

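from typing import Any, Dict, List

def gen_index_document() -> List[Dict[str, Any]]:
    # Embedding client (same endpoint and token as in the earlier example).
    openai_client = openai.OpenAI(
        base_url="https://models.inference.ai.azure.com",
        api_key=os.environ["GITHUB_TOKEN"],
    )
    MODEL_NAME = "text-embedding-3-small"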

    content_input = [{
        "Title": "Lionel Messi",
        "Content": "Resume: Lionel Messi. Argentine footballer, considered one of the best football players of all time."
    },
    {
        "Title": "Diego Zumárraga Mera",
        "Content": "Resume: Diego Zumárraga Mera. Software engineer with experience in web and mobile application development, passionate about artificial intelligence and machine learning."
    }]
    
    items = []
    for ix, item in enumerate(content_input):
        content = item["Content"]
        print(f"Processing item {ix}: {content}")
        embeddings_response = openai_client.embeddings.create(
            model=MODEL_NAME,
            input=content,
        )
        embedding = embeddings_response.data[0].embedding
        print(len(embedding))
        print(embedding)

        items.append({
            "id": f"doc-{ix}",
            "title": item["Title"],
            "content": content,
            "keywords": item["Title"].replace(" ", ", "), # split the title into keywords
            "embedding": embedding
        })

    return items

Full code in the notebook: AzureAISearchIndex.ipynb
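One way to keep generic phrases out of the keywords field is to filter them when the keyword string is built. The sketch below is a minimal example under stated assumptions: `GENERIC_TERMS` and `build_keywords` are hypothetical names, and the term list is a placeholder you would tune for your own corpus.

```python
# Hypothetical list of generic terms that appear in many documents and
# therefore do not help tell one document from another.
GENERIC_TERMS = {"resume", "cv", "user", "manual", "summary"}


def build_keywords(title: str, extra_terms=()) -> str:
    """Return a comma-separated keyword string without generic terms."""
    keywords = []
    seen = set()
    for term in [*title.split(), *extra_terms]:
        cleaned = term.strip(" ,.:;")  # drop surrounding punctuation
        key = cleaned.lower()
        if cleaned and key not in GENERIC_TERMS and key not in seen:
            keywords.append(cleaned)
            seen.add(key)
    return ", ".join(keywords)


print(build_keywords("Resume: Diego Zumárraga Mera", ["software engineer"]))
```

The result can be assigned to the "keywords" field of each document in place of the plain title split.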

Semantic Search with Azure AI Search

To test semantic search, we can search for the phrase "Summarize the resume of Diego Zumárraga Mera" and see how Azure AI Search can return relevant results.

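from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import (
    QueryAnswerType,
    QueryCaptionType,
    QueryType,
    VectorizedQuery,
)

query_input = "Summarize the resume of Diego Zumárraga Mera"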
embeddings_response = openai_client.embeddings.create(
    model=MODEL_NAME,
    input=query_input,
)
query_embedding = embeddings_response.data[0].embedding
print(query_embedding)

item = {
    "query": query_input,
    "embedding": query_embedding,
}

search_client = SearchClient(
    endpoint=os.environ["AZURE_AI_SEARCH_ENDPOINT"],
    index_name="my-index",
    credential=DefaultAzureCredential(),
)

results = []
vector_query = VectorizedQuery(
    vector=item["embedding"], k_nearest_neighbors=3, fields="embedding"
)
result = search_client.search(
    search_text=item["query"],
    vector_queries=[vector_query],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="default",
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=2,
)

Now let's look at the semantic answers returned for the query:

answers = result.get_answers()
if answers:
    print("Answers:")
    for answer in answers:
        print(f"Answer: {answer.text}")
        print(f"Confidence: {answer.score}")

Answers:
Answer: Resume: Diego Zumárraga Mera. Software engineer with experience in web and mobile application development, passionate about artificial intelligence and machine learning.
Confidence: 0.9860000014305115

Full code in the notebook: AzureAISearchQuery.ipynb

As we can see, Azure AI Search has returned a relevant result for the query, even though the query includes the common phrase "Resume". This is because we have configured semantic search to take into account the document's title and keywords, which improves the relevance of the results.
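Extractive answers are only returned when the service finds a passage that reads like an answer, so it can help to inspect the ranked documents as well. The sketch below assumes each hit behaves like a dictionary exposing "@search.score" and "@search.reranker_score", as the Python SDK does for semantic queries. `summarize_hits`, the threshold, and the sample scores are hypothetical and only illustrate the idea.

```python
def summarize_hits(results, min_reranker_score=0.0):
    """Return (title, score, reranker_score) tuples for hits above a threshold."""
    rows = []
    for doc in results:
        reranker = doc.get("@search.reranker_score")
        # Skip hits that were not semantically reranked or scored too low.
        if reranker is None or reranker < min_reranker_score:
            continue
        rows.append((doc.get("title"), doc.get("@search.score"), reranker))
    return rows


# Stand-in data shaped like search hits; the scores are invented for the example.
sample_hits = [
    {"title": "Diego Zumárraga Mera", "@search.score": 0.03, "@search.reranker_score": 3.1},
    {"title": "Lionel Messi", "@search.score": 0.02, "@search.reranker_score": 0.9},
]

for title, score, reranker in summarize_hits(sample_hits, min_reranker_score=1.0):
    print(f"{title}: score={score}, reranker_score={reranker}")
```

In a RAG pipeline, a cutoff like this is one way to avoid passing weakly related documents to the language model; the right threshold depends on your data.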
