Databricks Certified Generative AI Engineer Associate Questions and Answers

Question 1

A Generative AI Engineer is building a system which will answer questions on the latest stock news articles.

Which will NOT help with ensuring the outputs are relevant to financial news?

Options:

Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.

Increase the compute to improve processing speed of questions to allow greater relevancy analysis

Implement a profanity filter to screen out offensive language

Incorporate manual reviews to correct any problematic outputs prior to sending to the users

Question 2

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

Options:

Number of customer inquiries processed per unit of time

Energy usage per query

Final perplexity scores for the training of the model

HuggingFace Leaderboard values for the base LLM

Question 3

A Generative AI Engineer is working with a retail company that wants to enhance its customer experience by automatically handling common customer inquiries. They are working on an LLM-powered AI solution that should improve response times while maintaining a personalized interaction. They want to define the appropriate input and LLM task to do this.

Which input/output pair will do this?

Options:

Input: Customer reviews; Output: Group the reviews by users and aggregate per-user average rating, then respond

Input: Customer service chat logs; Output: Group the chat logs by users, followed by summarizing each user's interactions, then respond

Input: Customer service chat logs; Output: Find the answers to similar questions and respond with a summary

Input: Customer reviews; Output: Classify review sentiment

Answer:

C

Explanation:

The task described in the question involves enhancing customer experience by automatically handling common customer inquiries using an LLM-powered AI solution. This requires the system to process input data (customer inquiries) and generate personalized, relevant responses efficiently. Let’s evaluate the options step-by-step in the context of Databricks Generative AI Engineer principles, which emphasize leveraging LLMs for tasks like question answering, summarization, and retrieval-augmented generation (RAG).

Option A: Input: Customer reviews; Output: Group the reviews by users and aggregate per-user average rating, then respond

This option focuses on analyzing customer reviews to compute average ratings per user. While this might be useful for sentiment analysis or user profiling, it does not directly address the goal of handling common customer inquiries or improving response times for personalized interactions. Customer reviews are typically feedback data, not real-time inquiries requiring immediate responses.

Databricks Reference: Databricks documentation on LLMs (e.g., "Building LLM Applications with Databricks") emphasizes that LLMs excel at tasks like question answering and conversational responses, not just aggregation or statistical analysis of reviews.

Option B: Input: Customer service chat logs; Output: Group the chat logs by users, followed by summarizing each user's interactions, then respond

This option uses chat logs as input, which aligns with customer service scenarios. However, the output—grouping by users and summarizing interactions—focuses on user-specific summaries rather than directly addressing inquiries. While summarization is an LLM capability, this approach lacks the specificity of finding answers to common questions, which is central to the problem.

Databricks Reference: Per Databricks’ "Generative AI Cookbook," LLMs can summarize text, but for customer service, the emphasis is on retrieval and response generation (e.g., RAG workflows) rather than user interaction summaries alone.

Option C: Input: Customer service chat logs; Output: Find the answers to similar questions and respond with a summary

This option uses chat logs (real customer inquiries) as input and tasks the LLM with identifying answers to similar questions, then providing a summarized response. This directly aligns with the goal of handling common inquiries efficiently while maintaining personalization (by referencing past interactions or similar cases). It leverages LLM capabilities like semantic search, retrieval, and response generation, which are core to Databricks’ LLM workflows.

Databricks Reference: From Databricks documentation ("Building LLM-Powered Applications," 2023), an exact extract states: "For customer support use cases, LLMs can be used to retrieve relevant answers from historical data like chat logs and generate concise, contextually appropriate responses." This matches Option C’s approach of finding answers and summarizing them.

Option D: Input: Customer reviews; Output: Classify review sentiment

This option focuses on sentiment classification of reviews, which is a valid LLM task but unrelated to handling customer inquiries or improving response times in a conversational context. It’s more suited for feedback analysis than real-time customer service.

Databricks Reference: Databricks’ "Generative AI Engineer Guide" notes that sentiment analysis is a common LLM task, but it’s not highlighted for real-time conversational applications like customer support.

Conclusion: Option C is the best fit because it uses relevant input (chat logs) and defines an LLM task (finding answers and summarizing) that meets the requirements of improving response times and maintaining personalized interaction. This aligns with Databricks’ recommended practices for LLM-powered customer service solutions, such as retrieval-augmented generation (RAG) workflows.

Question 4

A Generative AI Engineer is helping a cinema extend its website's chat bot to be able to respond to questions about specific showtimes for movies currently playing at their local theater. They already have the location of the user provided by location services to their agent, and a Delta table which is continually updated with the latest showtime information by location. They want to implement this new capability in their RAG application.

Which option will do this with the least effort and in the most performant way?

Options:

Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation.

Query the Delta table directly via a SQL query constructed from the user's input using a text-to-SQL LLM in the agent logic / tool implementation.

Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation.

Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation.

Answer:

A

Explanation:

The task is to extend a cinema chatbot to provide movie showtime information using a RAG application, leveraging user location and a continuously updated Delta table, with minimal effort and high performance. Let’s evaluate the options.

Option A: Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation

Databricks Feature Serving provides low-latency access to real-time data from Delta tables via an online store. Syncing the Delta table to a Feature Serving Endpoint allows the chatbot to query showtimes efficiently, integrating seamlessly into the RAG agent’s tool logic. This leverages Databricks’ native infrastructure, minimizing effort and ensuring performance.

Databricks Reference: "Feature Serving Endpoints provide real-time access to Delta table data with low latency, ideal for production systems" ("Databricks Feature Engineering Guide," 2023).

Option B: Query the Delta table directly via a SQL query constructed from the user's input using a text-to-SQL LLM in the agent logic / tool

Using a text-to-SQL LLM to generate queries adds complexity (e.g., ensuring accurate SQL generation) and latency (LLM inference + SQL execution). While feasible, it’s less performant and requires more effort than a pre-built serving solution.

Databricks Reference: "Direct SQL queries are flexible but may introduce overhead in real-time applications" ("Building LLM Applications with Databricks").

Option C: Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation

Converting structured Delta table data (e.g., showtimes) into text, embedding it, and using vector search is inefficient for structured lookups. It’s effort-intensive (preprocessing, embedding) and less precise than direct queries, undermining performance.

Databricks Reference: "Vector search excels for unstructured data, not structured tabular lookups" ("Databricks Vector Search Documentation").

Option D: Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation

Exporting to an external database (e.g., MySQL) adds setup effort (workflow, external DB management) and latency (periodic updates vs. real-time). It’s less performant and more complex than using Databricks’ native tools.

Databricks Reference: "Avoid external systems when Delta tables provide real-time data natively" ("Databricks Workflows Guide").

Conclusion: Option A minimizes effort by using Databricks Feature Serving for real-time, low-latency access to the Delta table, ensuring high performance in a production-ready RAG chatbot.
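The lookup described in Option A can be sketched as a small agent tool. This is a minimal illustration under stated assumptions, not a definitive implementation: the workspace host, the endpoint name `showtimes-endpoint`, the key column `location_id`, and the response key `outputs` are all hypothetical placeholders, and the HTTP call is injected as a function so the sketch runs without network access.

```python
import json
from typing import Callable, Dict, List

# Hypothetical values for illustration only.
WORKSPACE_HOST = "https://example.cloud.databricks.com"
ENDPOINT_NAME = "showtimes-endpoint"


def build_lookup_request(host: str, endpoint: str, location_id: str) -> Dict[str, str]:
    """Build the URL and JSON body for a single primary-key lookup."""
    return {
        "url": f"{host.rstrip('/')}/serving-endpoints/{endpoint}/invocations",
        # Assumed request shape: one record per lookup key.
        "body": json.dumps({"dataframe_records": [{"location_id": location_id}]}),
    }


def showtimes_tool(location_id: str, send: Callable[[str, str], dict]) -> List[dict]:
    """Agent tool: fetch current showtimes for the user's location.

    `send(url, body)` performs the authenticated POST; injecting it keeps
    the tool testable offline.
    """
    request = build_lookup_request(WORKSPACE_HOST, ENDPOINT_NAME, location_id)
    response = send(request["url"], request["body"])
    return response.get("outputs", [])


def fake_send(url: str, body: str) -> dict:
    """Stand-in for the real HTTP call; echoes the requested key back."""
    key = json.loads(body)["dataframe_records"][0]["location_id"]
    return {"outputs": [{"location_id": key, "movie": "Example Movie", "showtime": "19:30"}]}


result = showtimes_tool("theater-42", fake_send)
```

In a real deployment, `send` would wrap an authenticated HTTP client; the agent only sees a function that maps a location to showtime rows.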

Question 5

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

Options:

Directly modify the LLM’s internal architecture to include preprocessing steps

It is better not to introduce custom code to preprocess prompts as the LLM has not been trained with examples of the preprocessed prompts

Rather than preprocessing prompts, it’s more effective to postprocess the LLM outputs to align the outputs to desired outcomes

Write an MLflow PyFunc model that has a separate function to process the prompts

Question 6

A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message:

“Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.”

Which framework type should be implemented to solve this?

Options:

Safety Guardrail

Security Guardrail

Contextual Guardrail

Compliance Guardrail

Question 7

A Generative AI Engineer has created a RAG application which can help employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working with some positive feedback from internal company testers. Now the Generative AI Engineer wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.

How should the Generative AI Engineer evaluate the system?

Options:

Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.

Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built-in evaluation metrics to perform the evaluation on the retrieval and generation components.

Benchmark multiple LLMs with the same data and pick the best LLM for the job.

Use an LLM-as-a-judge to evaluate the quality of the final answers generated.

Question 8

A Generative AI Engineer is building an LLM-based application that has an important transcription (speech-to-text) task. Speed is essential for the success of the application.

Which open Generative AI models should be used?

Options:

Llama-2-70b-chat-hf

MPT-30B-Instruct

DBRX

whisper-large-v3 (1.6B)

Answer:

D

Explanation:

The task requires an open generative AI model for a transcription (speech-to-text) task where speed is essential. Let’s assess the options based on their suitability for transcription and performance characteristics, referencing Databricks’ approach to model selection.

Option A: Llama-2-70b-chat-hf

Llama-2 is a text-based LLM optimized for chat and text generation, not speech-to-text. It lacks transcription capabilities.

Databricks Reference: "Llama models are designed for natural language generation, not audio processing" ("Databricks Model Catalog").

Option B: MPT-30B-Instruct

MPT-30B is another text-based LLM focused on instruction-following and text generation, not transcription. It’s irrelevant for speech-to-text tasks.

Databricks Reference: No specific mention, but MPT is categorized under text LLMs in Databricks’ ecosystem, not audio models.

Option C: DBRX

DBRX, developed by Databricks, is a powerful text-based LLM for general-purpose generation. It doesn’t natively support speech-to-text and isn’t optimized for transcription.

Databricks Reference: "DBRX excels at text generation and reasoning tasks" ("Introducing DBRX," 2024), with no mention of audio capabilities.

Option D: whisper-large-v3 (1.6B)

Whisper, developed by OpenAI, is an open-source model specifically designed for speech-to-text transcription. The “large-v3” variant (1.6 billion parameters) balances accuracy and efficiency, with optimizations for speed via quantization or deployment on GPUs—key for the application’s requirements.

Databricks Reference: "For audio transcription, models like Whisper are recommended for their speed and accuracy" ("Generative AI Cookbook," 2023). Databricks supports Whisper integration in its MLflow or Lakehouse workflows.

Conclusion: Only D, whisper-large-v3, is a speech-to-text model, making it the sole suitable choice. Its design prioritizes transcription, and its efficiency (e.g., via optimized inference) meets the speed requirement, aligning with Databricks’ model deployment best practices.

Question 9

A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least cost and least development effort and have it operate at the lowest cost possible.

Which combination of chaining components and configuration meets these requirements?

Options:

For the application a prompt, a retriever, and an LLM are required. The retriever output is inserted into the prompt which is given to the LLM to generate answers.

The LLM needs to be frequently retrained with the new documents in order to provide the most up-to-date answers.

For the question-answering application, prompt engineering and an LLM are required to generate answers.

For the application a prompt, an agent and a fine-tuned LLM are required. The agent is used by the LLM to retrieve relevant content that is inserted into the prompt which is given to the LLM to generate answers.

Question 10

A Generative AI Engineer at an automotive company would like to build a question-answering chatbot for customers to inquire about their vehicles. They have a database containing various documents of different vehicle makes, their hardware parts, and common maintenance information.

Which of the following components will NOT be useful in building such a chatbot?

Options:

Response-generating LLM

Invite users to submit long, rather than concise, questions

Vector database

Embedding model

Answer:

B

Explanation:

The task involves building a question-answering chatbot for an automotive company using a database of vehicle-related documents. The chatbot must efficiently process customer inquiries and provide accurate responses. Let’s evaluate each component to determine which is not useful, per Databricks Generative AI Engineer principles.

Option A: Response-generating LLM

An LLM is essential for generating natural language responses to customer queries based on retrieved information. This is a core component of any chatbot.

Databricks Reference: "The response-generating LLM processes retrieved context to produce coherent answers" ("Building LLM Applications with Databricks," 2023).

Option B: Invite users to submit long, rather than concise, questions

Encouraging long questions is a user interaction design choice, not a technical component of the chatbot’s architecture. Moreover, long, verbose questions can complicate intent detection and retrieval, reducing efficiency and accuracy—counter to best practices for chatbot design. Concise questions are typically preferred for clarity and performance.

Databricks Reference: While not explicitly stated, Databricks’ "Generative AI Cookbook" emphasizes efficient query processing, implying that simpler, focused inputs improve LLM performance. Inviting long questions doesn’t align with this.

Option C: Vector database

A vector database stores embeddings of the vehicle documents, enabling fast retrieval of relevant information via semantic search. This is critical for a question-answering system with a large document corpus.

Databricks Reference: "Vector databases enable scalable retrieval of context from large datasets" ("Databricks Generative AI Engineer Guide").

Option D: Embedding model

An embedding model converts text (documents and queries) into vector representations for similarity search. It’s a foundational component for retrieval-augmented generation (RAG) in chatbots.

Databricks Reference: "Embedding models transform text into vectors, facilitating efficient matching of queries to documents" ("Building LLM-Powered Applications").

Conclusion: Option B is not a useful component in building the chatbot. It’s a user-facing suggestion rather than a technical building block, and it could even degrade performance by introducing unnecessary complexity. Options A, C, and D are all integral to a Databricks-aligned chatbot architecture.
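To make the roles of the three useful components concrete, here is a toy, dependency-free sketch. It is only an illustration under loud assumptions: the "embedding model" is a bag-of-words counter, the "vector database" is an in-memory list, and `generate` merely assembles the prompt a response-generating LLM would receive. All names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Toy vector database: stores (document, embedding) pairs in memory."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, Counter]] = []

    def add(self, doc: str) -> None:
        self.rows.append((doc, embed(doc)))

    def search(self, query: str, k: int = 1) -> List[str]:
        query_vec = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def generate(question: str, context: List[str]) -> str:
    """Stand-in for the response-generating LLM: returns the prompt it would be given."""
    return f"Context: {' '.join(context)}\nQuestion: {question}"


store = VectorStore()
store.add("Replace the cabin air filter every 15000 miles")
store.add("The sedan uses a 12 volt battery")

question = "how often should I replace the air filter"
top = store.search(question)
prompt = generate(question, top)
```

A production system would swap each toy piece for a real embedding model, a managed vector index, and an LLM call, but the data flow stays the same.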

Question 11

A Generative AI Engineer is developing an LLM application that users can use to generate personalized birthday poems based on their names.

Which technique would be most effective in safeguarding the application, given the potential for malicious user inputs?

Options:

Implement a safety filter that detects any harmful inputs and ask the LLM to respond that it is unable to assist

Reduce the time that the users can interact with the LLM

Ask the LLM to remind the user that the input is malicious but continue the conversation with the user

Increase the amount of compute that powers the LLM to process input faster

Question 12

A Generative AI Engineer has a provisioned throughput model serving endpoint as part of a RAG application and would like to monitor the serving endpoint’s incoming requests and outgoing responses. The current approach is to include a micro-service in between the endpoint and the user interface to write logs to a remote server.

Which Databricks feature should they use instead which will perform the same task?

Options:

Vector Search

Lakeview

DBSQL

Inference Tables

Question 13

A Generative AI Engineer is tasked with developing a RAG application that will help a small internal group of experts at their company answer specific questions, augmented by an internal knowledge base. They want the best possible quality in the answers, and neither latency nor throughput is a huge concern given that the user group is small and they’re willing to wait for the best answer. The topics are sensitive in nature and the data is highly confidential and so, due to regulatory requirements, none of the information is allowed to be transmitted to third parties.

Which model meets all the Generative AI Engineer’s needs in this situation?

Options:

Dolly 1.5B

OpenAI GPT-4

BGE-large

Llama2-70B

Question 14

A Generative AI Engineer wants their fine-tuned LLMs in their prod Databricks workspace available for testing in their dev workspace as well. All of their workspaces are Unity Catalog enabled and they are currently logging their models into the Model Registry in MLflow.

What is the most cost-effective and secure option for the Generative AI Engineer to accomplish their goal?

Options:

Use an external model registry which can be accessed from all workspaces

Set up a script to export the model from prod and import it to dev.

Set up a duplicate training pipeline in dev, so that an identical model is available in dev.

Use MLflow to log the model directly into Unity Catalog, and enable READ access in the dev workspace to the model.

Answer:

D

Explanation:

The goal is to make fine-tuned LLMs from a production (prod) Databricks workspace available for testing in a development (dev) workspace, leveraging Unity Catalog and MLflow, while ensuring cost-effectiveness and security. Let’s analyze the options.

Option A: Use an external model registry which can be accessed from all workspaces

An external registry adds cost (e.g., hosting fees) and complexity (e.g., integration, security configurations) outside Databricks’ native ecosystem, reducing security compared to Unity Catalog’s governance.

Databricks Reference: "Unity Catalog provides a centralized, secure model registry within Databricks" ("Unity Catalog Documentation," 2023).

Option B: Set up a script to export the model from prod and import it to dev

Export/import scripts require manual effort, storage for model artifacts, and repeated execution, increasing operational cost and risk (e.g., version mismatches, unsecured transfers). It’s less efficient than a native solution.

Databricks Reference: Manual processes are discouraged when Unity Catalog offers built-in sharing: "Avoid redundant workflows with Unity Catalog’s cross-workspace access" ("MLflow with Unity Catalog").

Option C: Set up a duplicate training pipeline in dev, so that an identical model is available in dev

Duplicating the training pipeline doubles compute and storage costs, as it retrains the model from scratch. It’s neither cost-effective nor necessary when the prod model can be reused securely.

Databricks Reference: "Re-running training is resource-intensive; leverage existing models where possible" ("Generative AI Engineer Guide").

Option D: Use MLflow to log the model directly into Unity Catalog, and enable READ access in the dev workspace to the model

Unity Catalog, integrated with MLflow, allows models logged in prod to be centrally managed and accessed across workspaces with fine-grained permissions (e.g., READ for dev). This is cost-effective (no extra infrastructure or retraining) and secure (governed by Databricks’ access controls).

Databricks Reference: "Log models to Unity Catalog via MLflow, then grant access to other workspaces securely" ("MLflow Model Registry with Unity Catalog," 2023).

Conclusion: Option D leverages Databricks’ native tools (MLflow and Unity Catalog) for a seamless, cost-effective, and secure solution, avoiding external systems, manual scripts, or redundant training.

Question 15

A Generative AI Engineer is developing a RAG system for their company to perform internal document Q&A for structured HR policies, but the answers returned are frequently incomplete and unstructured. It seems that the retriever is not returning all relevant context. The Generative AI Engineer has experimented with different embedding and response-generating LLMs, but that did not improve results.

Which TWO options could be used to improve the response quality?

Choose 2 answers

Options:

Add the section header as a prefix to chunks

Increase the document chunk size

Split the document by sentence

Use a larger embedding model

Fine-tune the response generation model

Answer:

A, B

Explanation:

The problem describes a Retrieval-Augmented Generation (RAG) system for HR policy Q&A where responses are incomplete and unstructured due to the retriever failing to return sufficient context. The engineer has already tried different embedding and response-generating LLMs without success, suggesting the issue lies in the retrieval process—specifically, how documents are chunked and indexed. Let’s evaluate the options.

Option A: Add the section header as a prefix to chunks

Adding section headers provides additional context to each chunk, helping the retriever understand the chunk’s relevance within the document structure (e.g., “Leave Policy: Annual Leave” vs. just “Annual Leave”). This can improve retrieval precision for structured HR policies.

Databricks Reference: "Metadata, such as section headers, can be appended to chunks to enhance retrieval accuracy in RAG systems" ("Databricks Generative AI Cookbook," 2023).

Option B: Increase the document chunk size

Larger chunks include more context per retrieval, reducing the chance of missing relevant information split across smaller chunks. For structured HR policies, this can ensure entire sections or rules are retrieved together.

Databricks Reference: "Increasing chunk size can improve context completeness, though it may trade off with retrieval specificity" ("Building LLM Applications with Databricks").

Option C: Split the document by sentence

Splitting by sentence creates very small chunks, which could exacerbate the problem by fragmenting context further. This is likely why the current system fails—it retrieves incomplete snippets rather than cohesive policy sections.

Databricks Reference: No specific extract opposes this, but the emphasis on context completeness in RAG suggests smaller chunks worsen incomplete responses.

Option D: Use a larger embedding model

A larger embedding model might improve vector quality, but the question states that experimenting with different embedding models didn’t help. This suggests the issue isn’t embedding quality but rather chunking/retrieval strategy.

Databricks Reference: Embedding models are critical, but not the focus when retrieval context is the bottleneck.

Option E: Fine-tune the response generation model

Fine-tuning the LLM could improve response coherence, but if the retriever doesn’t provide complete context, the LLM can’t generate full answers. The root issue is retrieval, not generation.

Databricks Reference: Fine-tuning is recommended for domain-specific generation, not retrieval fixes ("Generative AI Engineer Guide").

Conclusion: Options A and B address the retrieval issue directly by enhancing chunk context—either through metadata (A) or size (B)—aligning with Databricks’ RAG optimization strategies. C would worsen the problem, while D and E don’t target the root cause given prior experimentation.
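The two chunking changes from Options A and B can be sketched together. This is a minimal illustration, assuming the document has already been parsed into (header, body) pairs; the function name, the word-based chunk size, and the sample policy text are hypothetical.

```python
from typing import List, Tuple


def chunk_with_headers(sections: List[Tuple[str, str]], max_words: int) -> List[str]:
    """Split each section body into chunks of at most `max_words` words,
    prefixing every chunk with its section header."""
    chunks: List[str] = []
    for header, body in sections:
        words = body.split()
        for start in range(0, len(words), max_words):
            piece = " ".join(words[start:start + max_words])
            chunks.append(f"{header}: {piece}")
    return chunks


# Hypothetical sample of structured HR policy text.
policy = [
    ("Leave Policy", "Employees accrue annual leave monthly. Unused leave carries over for one year."),
    ("Expenses", "Submit receipts within thirty days."),
]

small = chunk_with_headers(policy, max_words=5)   # fragments the leave rules across chunks
large = chunk_with_headers(policy, max_words=50)  # keeps each section in one chunk
```

With the small chunk size, the carry-over rule is split away from the accrual rule; with the larger size, each policy section is retrieved as a whole, and the header prefix tells the retriever which policy a chunk belongs to.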

Question 16

A Generative AI Engineer is building a production-ready LLM system which replies directly to customers. The solution makes use of the Foundation Model API via provisioned throughput. They are concerned that the LLM could potentially respond in a toxic or otherwise unsafe way. They also wish to perform this with the least amount of effort.

Which approach will do this?

Options:

Host Llama Guard on Foundation Model API and use it to detect unsafe responses

Add some LLM calls to their chain to detect unsafe content before returning text

Add a regex expression on inputs and outputs to detect unsafe responses.

Ask users to report unsafe responses

Answer:

A

Explanation:

The task is to prevent toxic or unsafe responses in an LLM system using the Foundation Model API with minimal effort. Let’s assess the options.

Option A: Host Llama Guard on Foundation Model API and use it to detect unsafe responses

Llama Guard is a safety-focused model designed to detect toxic or unsafe content. Hosting it via the Foundation Model API (a Databricks service) integrates seamlessly with the existing system, requiring minimal setup (just deployment and a check step), and leverages provisioned throughput for performance.

Databricks Reference: "Foundation Model API supports hosting safety models like Llama Guard to filter outputs efficiently" ("Foundation Model API Documentation," 2023).

Option B: Add some LLM calls to their chain to detect unsafe content before returning text

Using additional LLM calls (e.g., prompting an LLM to classify toxicity) increases latency, complexity, and effort (crafting prompts, chaining logic), and lacks the specificity of a dedicated safety model.

Databricks Reference: "Ad-hoc LLM checks are less efficient than purpose-built safety solutions" ("Building LLM Applications with Databricks").

Option C: Add a regex expression on inputs and outputs to detect unsafe responses

Regex can catch simple patterns (e.g., profanity) but fails for nuanced toxicity (e.g., sarcasm, context-dependent harm), requiring significant manual effort to maintain and update rules.

Databricks Reference: "Regex-based filtering is limited for complex safety needs" ("Generative AI Cookbook").

Option D: Ask users to report unsafe responses

User reporting is reactive, not preventive, and places burden on users rather than the system. It doesn’t limit unsafe outputs proactively and requires additional effort for feedback handling.

Databricks Reference: "Proactive guardrails are preferred over user-driven monitoring" ("Databricks Generative AI Engineer Guide").

Conclusion: Option A (Llama Guard on Foundation Model API) is the least-effort, most effective approach, leveraging Databricks’ infrastructure for seamless safety integration.
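The "deployment plus a check step" idea can be sketched as a thin wrapper around the generation call. This is an illustrative sketch, not a Databricks API: `generate` and `classify` are injected callables standing in for the serving endpoints, the fallback message is hypothetical, and the stub classifier only mimics the convention that a Llama Guard style model answers with a verdict beginning with "safe" or "unsafe".

```python
from typing import Callable

# Hypothetical fallback message returned when a draft is flagged.
FALLBACK = "Sorry, I cannot help with that request."


def guarded_reply(
    prompt: str,
    generate: Callable[[str], str],
    classify: Callable[[str], str],
) -> str:
    """Generate a draft, then return it only if the safety model says it is safe."""
    draft = generate(prompt)
    verdict = classify(draft).strip().lower()
    return draft if verdict.startswith("safe") else FALLBACK


def stub_classifier(text: str) -> str:
    """Stand-in for a hosted safety model; flags one hard-coded insult."""
    return "unsafe\nexample-category" if "idiot" in text.lower() else "safe"


ok = guarded_reply("Where is my order?", lambda p: "Your order ships tomorrow.", stub_classifier)
blocked = guarded_reply("Where is my order?", lambda p: "Figure it out, idiot.", stub_classifier)
```

The same wrapper could also screen the user's input before generation by calling `classify` on the prompt first.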

Question 17

A Generative AI Engineer is setting up a Databricks Vector Search that will look up news articles by topic within 10 days of the date specified. An example query might be "Tell me about monster truck news around January 5th 1992". They want to do this with the least amount of effort.

How can they set up their Vector Search index to support this use case?

Options:

Split articles by 10 day blocks and return the block closest to the query.

Include metadata columns for article date and topic to support metadata filtering.

Pass the query directly to the vector search index and return the best articles.

Create separate indexes by topic and add a classifier model to appropriately pick the best index.

Answer:

B

Explanation:

The task is to set up a Databricks Vector Search index for news articles, supporting queries like “monster truck news around January 5th, 1992,” with minimal effort. The index must filter by topic and a 10-day date range. Let’s evaluate the options.

Option A: Split articles by 10-day blocks and return the block closest to the query

Pre-splitting articles into 10-day blocks requires significant preprocessing and index management (e.g., one index per block). It’s effort-intensive and inflexible for dynamic date ranges.

Databricks Reference: "Static partitioning increases setup complexity; metadata filtering is preferred" ("Databricks Vector Search Documentation").

Option B: Include metadata columns for article date and topic to support metadata filtering

Adding date and topic as metadata in the Vector Search index allows dynamic filtering (e.g., date within 10 days of the query date, topic = “monster truck”) at query time. This leverages Databricks’ built-in metadata filtering, minimizing setup effort.

Databricks Reference: "Vector Search supports metadata filtering on columns like date or category for precise retrieval with minimal preprocessing" ("Vector Search Guide," 2023).

Option C: Pass the query directly to the vector search index and return the best articles

Passing the full query (e.g., “Tell me about monster truck news around January 5th, 1992”) to Vector Search relies solely on embeddings, ignoring structured filtering for date and topic. This risks inaccurate results without explicit range logic.

Databricks Reference: "Pure vector similarity may not handle temporal or categorical constraints effectively" ("Building LLM Applications with Databricks").

Option D: Create separate indexes by topic and add a classifier model to appropriately pick the best index

Separate indexes per topic plus a classifier model adds significant complexity (index creation, model training, maintenance), far exceeding “least effort.” It’s overkill for this use case.

Databricks Reference: "Multiple indexes increase overhead; single-index with metadata is simpler" ("Databricks Vector Search Documentation").

Conclusion: Option B is the simplest and most effective solution, using metadata filtering in a single Vector Search index to handle date ranges and topics, aligning with Databricks’ emphasis on efficient, low-effort setups.
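To make the metadata-filtering idea in Option B concrete, here is a minimal sketch of how the 10-day window could be turned into a filter at query time. The column names `topic` and `article_date`, the helper `build_filters`, and the assumption that the topic and date have already been extracted from the user's question are all hypothetical. The operator-suffix keys (`"column >="`) are modeled on the dictionary-style filters described in the Databricks Vector Search documentation; treat the exact syntax as an assumption to check against the current docs.

```python
from datetime import date, timedelta


def build_filters(topic: str, target: date, window_days: int = 10) -> dict:
    """Build a metadata filter for one topic and a symmetric date window.

    Column names ("topic", "article_date") are illustrative assumptions,
    not names taken from the question.
    """
    start = target - timedelta(days=window_days)
    end = target + timedelta(days=window_days)
    return {
        "topic": topic,                         # exact match on the topic column
        "article_date >=": start.isoformat(),   # lower bound of the window
        "article_date <=": end.isoformat(),     # upper bound of the window
    }


# Example: "monster truck news around January 5th 1992"
filters = build_filters("monster trucks", date(1992, 1, 5))
print(filters)
```

A dictionary like this would be passed alongside the query text when searching the index, so the similarity ranking only considers articles that already satisfy the topic and date constraints. No extra indexes or pre-split blocks are needed.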

Question 18

A Generative AI Engineer is building a Generative AI system that suggests the best matched employee team member to newly scoped projects. The team member is selected from a very large team. The match should be based upon project date availability and how well their employee profile matches the project scope. Both the employee profile and project scope are unstructured text.

How should the Generative AI Engineer architect their system?

Options:

Create a tool for finding available team members given project dates. Embed all project scopes into a vector store, perform a retrieval using team member profiles to find the best team member.

Create a tool for finding team member availability given project dates, and another tool that uses an LLM to extract keywords from project scopes. Iterate through available team members’ profiles and perform keyword matching to find the best available team member.

Create a tool to find available team members given project dates. Create a second tool that can calculate a similarity score for a combination of team member profile and the project scope. Iterate through the team members and rank by best score to select a team member.

Create a tool for finding available team members given project dates. Embed team profiles into a vector store and use the project scope and filtering to perform retrieval to find the available best matched team members.

Answer: D

Explanation:

Problem Context: The problem involves matching team members to new projects based on two main factors:

Availability: Ensure the team members are available during the project dates.

Profile-Project Match: Use the employee profiles (unstructured text) to find the best match for a project’s scope (also unstructured text).

The two main inputs are the employee profiles and project scopes, both of which are unstructured. This means traditional rule-based systems (e.g., simple keyword matching) would be inefficient, especially when working with large datasets.

Explanation of Options: Let's break down the provided options to understand why D is the most optimal answer.

Option A suggests embedding project scopes into a vector store and then performing retrieval using team member profiles. While embedding project scopes into a vector store is a valid technique, it skips an important detail: the focus should primarily be on embedding employee profiles because we're matching the profiles to a new project, not the other way around.

Option B involves using a large language model (LLM) to extract keywords from the project scope and perform keyword matching on employee profiles. While LLMs can help with keyword extraction, this approach is too simplistic and doesn’t leverage advanced retrieval techniques like vector embeddings, which can handle the nuanced and rich semantics of unstructured data. This approach may miss out on subtle but important similarities.

Option C suggests calculating a similarity score between each team member's profile and project scope. While this is a good idea, it doesn’t specify how to handle the unstructured nature of data efficiently. Iterating through each member’s profile individually could be computationally expensive in large teams. It also lacks the mention of using a vector store or an efficient retrieval mechanism.

Option D is the correct approach. Here’s why:

Embedding team profiles into a vector store: Using a vector store allows for efficient similarity searches on unstructured data. Embedding the team member profiles into vectors captures their semantics in a way that is far more flexible than keyword-based matching.

Using project scope for retrieval: Instead of matching keywords, this approach suggests using vector embeddings and similarity search algorithms (e.g., cosine similarity) to find the team members whose profiles most closely align with the project scope.

Filtering based on availability: Once the best-matched candidates are retrieved based on profile similarity, filtering them by availability ensures that the system provides a practically useful result.

This method efficiently handles large-scale datasets by leveraging vector embeddings and similarity search techniques, both of which are fundamental tools in Generative AI engineering for handling unstructured text.

Technical References:

Vector embeddings: In this approach, the unstructured text (employee profiles and project scopes) is converted into high-dimensional vectors using pretrained models (e.g., BERT, Sentence-BERT, or custom embeddings). These embeddings capture the semantic meaning of the text, making it easier to perform similarity-based retrieval.

Vector stores: Solutions like FAISS or Milvus allow storing and retrieving large numbers of vector embeddings quickly. This is critical when working with large teams where querying through individual profiles sequentially would be inefficient.

LLM Integration: Large language models can assist in generating embeddings for both employee profiles and project scopes. They can also assist in fine-tuning similarity measures, ensuring that the retrieval system captures the nuances of the text data.

Filtering: After retrieving the most similar profiles based on the project scope, filtering based on availability ensures that only team members who are free for the project are considered.

This system is scalable, efficient, and makes use of the latest techniques in Generative AI, such as vector embeddings and semantic search.
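The Option D architecture can be sketched in a few lines. This is only an illustration under stated assumptions: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory `index` dictionary stands in for a vector store, and the names `best_matches`, `profiles`, and `available` are hypothetical. What it shows is the shape of the flow: embed the profiles once, then use the project scope as the query while restricting results to available team members.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical employee profiles, embedded once up front (the "vector store").
profiles = {
    "alice": "python data pipelines spark",
    "bob": "frontend react typescript design",
    "carol": "spark streaming data engineering python",
}
index = {name: embed(text) for name, text in profiles.items()}


def best_matches(project_scope: str, available: set, k: int = 1) -> list:
    # Query with the project scope, keeping only available team members.
    query = embed(project_scope)
    scored = [
        (cosine(query, vec), name)
        for name, vec in index.items()
        if name in available  # availability acts as a metadata filter
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


scope = "build spark data pipelines in python"
print(best_matches(scope, available={"bob", "carol"}))
```

In this toy data, alice is the closest textual match, but she is excluded by the availability filter, so the best available candidate is returned instead. In a production setting the linear scan over `index` would be replaced by the vector store's own approximate nearest-neighbor search.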

