[Draft] Labeled Training Data Support #56290
changjian-wang wants to merge 9 commits into cu_sdk/ga
added 9 commits
February 13, 2026 13:53
- Add LabeledDataKnowledgeSource customization with single-param (Uri) constructor
- Add Sample16_CreateAnalyzerWithLabels.cs aligned with Java SDK pattern
- Add Sample16_CreateAnalyzerWithLabels.md documentation
- Add receipt label files (receipt1/receipt2 with .labels.json and .result.json)
- Rename SampleFiles to sample_files for consistency
- Align environment variables with Java SDK (CONTENTUNDERSTANDING_* prefix)
- Update test-resources.bicep and test-resources-post.ps1 output names
- Update all appsettings.json files with new env var names
- Update API listing files with new constructor
- Update README.md with Sample16 references

- Add Azure.Storage.Blobs dependency to test project
- Add CONTENTUNDERSTANDING_TRAINING_DATA_STORAGE_ACCOUNT and CONTAINER env vars
- Auto-generate User Delegation SAS URL when SAS URL not set but account/container provided
- Update Sample16 .cs with fallback SAS generation logic
- Update Sample16 .md documentation with Option A/B pattern

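The fallback described in this commit (use a SAS URL if one is set, otherwise generate a User Delegation SAS from the account and container names) could look roughly like the sketch below. It is not the code from Sample16: the class and method names are hypothetical, the read and list permissions and the one-hour lifetime are assumptions, and it presumes the caller's identity is allowed to request a user delegation key on the storage account.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Identity;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Sas;

// Hypothetical helper, not part of the SDK or of Sample16.
public static class TrainingDataSasSketch
{
    // Returns the explicit SAS URL when present; otherwise builds one
    // from the storage account and container names.
    public static async Task<Uri> ResolveContainerUriAsync(
        string? sasUrl, string? accountName, string? containerName)
    {
        if (!string.IsNullOrWhiteSpace(sasUrl))
        {
            return new Uri(sasUrl);
        }

        if (string.IsNullOrWhiteSpace(accountName) || string.IsNullOrWhiteSpace(containerName))
        {
            throw new InvalidOperationException(
                "Provide either a SAS URL or both a storage account and a container name.");
        }

        return await GenerateUserDelegationSasUriAsync(accountName, containerName);
    }

    private static async Task<Uri> GenerateUserDelegationSasUriAsync(
        string accountName, string containerName)
    {
        var serviceUri = new Uri($"https://{accountName}.blob.core.windows.net");
        var serviceClient = new BlobServiceClient(serviceUri, new DefaultAzureCredential());

        // Assumption: a short-lived token with a small allowance for clock skew.
        DateTimeOffset startsOn = DateTimeOffset.UtcNow.AddMinutes(-5);
        DateTimeOffset expiresOn = DateTimeOffset.UtcNow.AddHours(1);

        UserDelegationKey key =
            (await serviceClient.GetUserDelegationKeyAsync(startsOn, expiresOn)).Value;

        var sasBuilder = new BlobSasBuilder
        {
            BlobContainerName = containerName,
            Resource = "c", // "c" scopes the SAS to the whole container
            StartsOn = startsOn,
            ExpiresOn = expiresOn,
        };

        // Assumption: read and list are enough for the service to fetch training files.
        sasBuilder.SetPermissions(
            BlobContainerSasPermissions.Read | BlobContainerSasPermissions.List);

        var uriBuilder = new BlobUriBuilder(
            serviceClient.GetBlobContainerClient(containerName).Uri)
        {
            Sas = sasBuilder.ToSasQueryParameters(key, accountName),
        };

        return uriBuilder.ToUri();
    }
}
```

Taking the three values as parameters keeps the sketch independent of the exact environment variable names, which are only partly spelled out in the commit message.
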
- Extract BuildReceiptFieldSchema() and wrap in Snippet region
- Shorten SNIPPET SAS block by calling GenerateUserDelegationSasUrlAsync
- Add Snippet region around GenerateUserDelegationSasUrlAsync
- Add Assertion region for test assertions (consistent with other samples)
- Add DeleteAnalyzerWithLabels snippet with #if SNIPPET/#else pattern
- Consolidate test infrastructure with clear section separator
- Update .md to reference 4 separate snippets for better docs structure

- Add UploadTrainingDataAsync helper: uploads local receipt_labels/ files to container
- Option B now auto-uploads before generating SAS URL (no manual upload needed)
- Add upload snippet to .md documentation
- Update XML doc comments to reflect new auto-upload behavior

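As an illustration of the auto-upload step, a helper that pushes every file in a local folder to a blob container might be sketched as follows. This is not the UploadTrainingDataAsync helper from the commit: the class and method names are hypothetical, the flat (non-recursive) folder layout is an assumption, and existing blobs are overwritten by choice.

```csharp
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

// Hypothetical helper, not the one added in this pull request.
public static class TrainingDataUploadSketch
{
    // Uploads each file directly under localDirectory, using the file name
    // as the blob name. Returns the number of files uploaded.
    public static async Task<int> UploadDirectoryAsync(
        BlobContainerClient container, string localDirectory)
    {
        int uploaded = 0;

        foreach (string filePath in Directory.EnumerateFiles(localDirectory))
        {
            string blobName = Path.GetFileName(filePath);

            // overwrite: true lets the sample be rerun without manual cleanup.
            await container.GetBlobClient(blobName).UploadAsync(filePath, overwrite: true);
            uploaded++;
        }

        return uploaded;
    }
}

// Possible usage, with placeholder account and container names and
// DefaultAzureCredential from the Azure.Identity package:
//   var container = new BlobContainerClient(
//       new Uri("https://myaccount.blob.core.windows.net/training-data"),
//       new DefaultAzureCredential());
//   int count = await TrainingDataUploadSketch.UploadDirectoryAsync(container, "receipt_labels");
```

Uploading before the SAS URL is generated means the read-only token never needs write permission on the container.
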
…rocess with labeled training data, update variable names, and enhance instructions for Azure Blob Storage setup.
…ield schema verification details
Add unit tests to achieve >=80% coverage on all custom code files:
- ContentFieldExtensionsTest.cs: 22 tests for Value property switch branches covering all ContentField subtypes (String, Number, Integer, Date, Time, Boolean, Object, Array, Json, Unknown/default)
- AudioVisualContentDeserializationTest.cs: 16 tests for custom DeserializeAudioVisualContent covering KeyFrameTimesMs casing variants, null values, round-trip unknown properties, empty/multiple items
- ArrayFieldExtensionsTest.cs: 12 tests for Count property, indexer happy paths, ArgumentOutOfRangeException paths, and nested ObjectField arrays
- ContentUnderstandingClientTest.cs: 6 protocol method tests with MockTransport covering OperationWithId wrapping for sync/async Analyze/AnalyzeBinary including WaitUntil.Completed branch

Coverage results (all custom code files):
- ContentField.Extensions.cs: 100% (was 53.8%)
- ArrayField.Extensions.cs: 100% (was 71.4%)
- AudioVisualContent.Customizations.cs: 98.7% (was 77.9%)
- ContentUnderstandingClient.Customizations.cs: 83.3% (was 78.8%)
- OperationWithId.cs: 90.6% (was 81.3%)
- All others: 100% or 83.3% (unchanged)

Added 9 new unit tests and simplified OperationWithId to achieve 100% line coverage across all 10 custom code files:

Tests added to AudioVisualContentDeserializationTest.cs:
- Deserialize_NullTopLevelElement_ReturnsNull: covers null JSON guard

Tests added to ContentUnderstandingClientTest.cs:
- AnalyzeAsync_Protocol_WaitUntilCompleted: async Analyze WaitForCompletionAsync
- AnalyzeBinaryAsync_Protocol_WaitUntilCompleted: async AnalyzeBinary WaitForCompletionAsync
- Analyze_Protocol_ThrowsOnTransportError: sync Analyze catch block
- AnalyzeBinary_Protocol_ThrowsOnTransportError: sync AnalyzeBinary catch block
- AnalyzeAsync_Protocol_ThrowsOnTransportError: async Analyze catch block
- AnalyzeBinaryAsync_Protocol_ThrowsOnTransportError: async AnalyzeBinary catch block
- Analyze_Protocol_InvalidOperationLocation_ThrowsOnIdAccess: OperationWithId fallback path
- GetAnalyzer_Protocol_WithRequestContext_CoversRequestContextParse: non-null RequestContext

Source changes:
- OperationWithId.cs: merged nested if conditions into single guard clause to eliminate dead code (segments.Length is never 0 for valid absolute URIs)

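To illustrate the guard-clause simplification, here is a minimal standalone sketch of pulling an operation ID out of an Operation-Location style URL with one combined check. It is not the code in OperationWithId.cs: the class and method names are hypothetical, returning null for an unusable value is an assumption about the fallback, and the sketch only accepts http(s) URLs, whose path is always at least "/" so that Segments is never empty.

```csharp
using System;

// Hypothetical illustration, not the SDK's OperationWithId implementation.
public static class OperationIdSketch
{
    // Returns the last path segment of an absolute http(s) URL, or null
    // when the value is missing, malformed, or has no trailing segment.
    public static string? TryGetOperationId(string? operationLocation)
    {
        // Single guard clause instead of nested ifs.
        if (string.IsNullOrEmpty(operationLocation)
            || !Uri.TryCreate(operationLocation, UriKind.Absolute, out Uri? uri)
            || (uri.Scheme != Uri.UriSchemeHttps && uri.Scheme != Uri.UriSchemeHttp))
        {
            return null;
        }

        // For http(s) URLs the path is at least "/", so Segments has
        // at least one entry and no separate length check is needed.
        string[] segments = uri.Segments;
        string last = segments[segments.Length - 1].TrimEnd('/');

        return last.Length == 0 ? null : Uri.UnescapeDataString(last);
    }
}

// Example with a made-up URL: the query string is not part of Segments,
// so the call below returns "abc123".
//   OperationIdSketch.TryGetOperationId("https://example.invalid/results/abc123?api-version=1");
```
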
@copilot how to trigger CI check?
@changjian-wang I've opened a new pull request, #56291, to work on those changes. Once the pull request is ready, I'll request review from you.
This pull request adds support for creating custom analyzers from labeled training data stored in Azure Blob Storage, enhancing the ability to build more accurate field extraction models. It introduces a new sample demonstrating this workflow, updates the documentation to guide users through the process, and exposes a new constructor for the `LabeledDataKnowledgeSource` class to simplify usage. It also makes a minor improvement to the operation status parsing logic.

Labeled Training Data Support
- Added a new sample (`Sample16_CreateAnalyzerWithLabels.md`) that demonstrates how to create a custom analyzer using labeled training data from Azure Blob Storage, including setup instructions, code snippets, and helper methods for uploading and accessing training data.
- Updated the README (`Azure.AI.ContentUnderstanding/README.md`) to document the new labeled training data capability and reference the new sample.

API and SDK Enhancements
- Added a constructor for `LabeledDataKnowledgeSource` that accepts only a container URL, making it easier to instantiate when a file list path is not needed. This is implemented across all supported target frameworks and in a new partial class for customizations.

Other Changes
- Updated the `assets.json` tag to reflect the new build.
- Made the handling of the `Operation-Location` header more robust in `OperationWithId.cs`.
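
For orientation, using the new single-parameter constructor might look like the fragment below. Only the existence of a constructor taking a `Uri` comes from this pull request; the namespace is assumed from the package name, the container URL is a placeholder, and how the resulting object is attached to an analyzer definition is left out because it is not described here.

```csharp
using System;
using Azure.AI.ContentUnderstanding; // assumed namespace, based on the package name

// Placeholder container URL; a real one would carry a valid SAS token.
var containerUri = new Uri(
    "https://myaccount.blob.core.windows.net/training-data?sv=placeholder");

// Single-parameter (Uri) constructor added by this pull request;
// no file list path is supplied.
var knowledgeSource = new LabeledDataKnowledgeSource(containerUri);
```
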