Skip to content

Latest commit

 

History

History
139 lines (78 loc) · 5.27 KB

lab_3.md

File metadata and controls

139 lines (78 loc) · 5.27 KB

Create and Deploy a OpenAI Pipeline

Overview

In this lab, you will use unstructured data files like contract documents, leases, user manuals etc. to be analyzed using OCR, Azure Cognitive Search Semantic Search feature and the capabilities of Azure Open AI Large Language Models to summerize key information after converting them into more structured index files. For this lab, you will use the dataset provided at Lab3 Sample Data.

Goal

  • How to leverage the Large Language Model(LLM) from GPT-3 to extract concise summary from a subset of huge document repository using OpenAI and Azure Semantic Search

Pre-requisites

  • The accelerator is deployed and ready in the resource group
  • You have access to sample data to test OpenAI

Instructions

Step 1a - Create a OpenAI Generic Pipeline

Step 1b - Get Sample Configurations from GPT-3 Playground

At this stage, we select the model we want to use and the feature we want to leverage. In this case we will be using the Davinci model and the Summerize feature. The playground brings in a sample on the editor. Select the content of the 'Conversation' section and replace with ${document} to ensure the dynamic content is used on runtime. After that click on 'View Code' on top right.

On the pop up, there will be drop down menu where by default 'Python' will be selected. Please change that to 'json' and Copy the code snippet.

Go back to the BPA tab and replace the default text on the Generic OpenAI component opened earlier with the copied text.

That completes the pipeline

Step 2 - Ingest Data for the pipeline

There are 2 options for ingesting the data for the pipeline:

  • Use single file upload for smaller files [smaller than 4 pages]
  • Use the split document option to split the larger documents and upload the individual split files to the pipeline.

Step 3 - Configure Search

  1. The get to Search Service. To view the results, go to portal.azure.com (Azure Portal) again in your browser and get to the resource group like we did earlier in Step 1. There, in the resource group, click on the resource that is of type Search Service.

    searchservicetype.png

  2. Click on Import Data. selectimportdata.png

  3. Select Azure Cosmos DB from the dropdown in datasource. selectazurecosmosdb.png

  4. Provide a name for datasource and click on Choose an existing connection for Connection String. Here the Azure CosmosDB resource created as a part of BPA accelerator already setup will be one of the sources you can choose from. selectcosmosdb.png

  5. Keep the default for Managed identity Authentication, which is None. For Databases and Collection use the dropdown to select the same name as the Cosmos DB you selected at step 15.

  6. Under Query, use the following Query. The pipeline should match the pipeline name you used in step 3

    SELECT * FROM c WHERE c.pipeline = 'YOUR-PIPELINE-NAME' AND c._ts > @HighWaterMark

  7. Click Next: Add cognitive skills (Optional). This validates and creates the index schema.

  8. In the next Screen(Add cognitive skills (Optional)), Click Skip to: Customize Target Index, customizetargetindex.png

  9. Provide a name for the Index and click on Next: Create an indexer indexname.png

  10. Provide a name for the indexer and click Submit

    createindexer.png

  11. You will get a notification that the import is successfully configured

Step 4 - Configure Semantic Search

  1. You can select Semantic Search Plan

  2. Go back to your search index and configure the Semantic Configuration

  3. Select the Semantic Configuration and click on Create new.

    On the pop up do the following:

    • Give a name to the Semantic Search Config
    • Select the Title field and select 'filename'
    • Select the 'content' field and any other relevant fields for Content Fields
    • Select Save

Step 5 - Perform Semantic Search

  1. Now, go back to the accelerator url that you retreived from Step 1 and click on Sample Search Application.

    This opens the same search application

    • Enable Symantic Search
    • Select the relevant Symantic Search Config
    • Enable Open AI Summary and Answer
    • Provide a search query based on your document, like:
      • 'What is RMA?'
      • 'What are the Cleaning instructions?'