Custom skills in Azure AI Search (formerly Azure Cognitive Search) extend the built-in cognitive capabilities by allowing you to integrate your own processing logic into the enrichment pipeline. These skills enable you to perform specialized transformations, entity extraction, or analysis that the default skillset does not provide.
To implement custom skills, you typically create an Azure Function or a web API endpoint that accepts JSON input and returns JSON output in a specific format. The custom skill must conform to the Web API custom skill interface, which requires handling batch requests containing records with unique identifiers and data payloads.
The input schema includes a 'values' array containing records, each with a 'recordId' and 'data' object. Your custom logic processes this data and returns results in the same structure, including any warnings or errors encountered during processing.
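As a minimal sketch of that request and response shape, the following Python builds a response that mirrors the incoming 'values' array. The function names (process_batch, to_upper) and the 'text' and 'upperText' field names are hypothetical and chosen only for illustration.

```python
import json


def to_upper(data: dict) -> dict:
    """Hypothetical enrichment: upper-case the 'text' input."""
    return {"upperText": data.get("text", "").upper()}


def process_batch(body: dict) -> dict:
    """Return one output record per input record, keyed by the same recordId."""
    results = []
    for record in body.get("values", []):
        results.append({
            "recordId": record["recordId"],  # echoed back unchanged
            "data": to_upper(record.get("data", {})),
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


# Sample request in the shape the skill receives.
request_body = {
    "values": [
        {"recordId": "1", "data": {"text": "hello"}},
        {"recordId": "2", "data": {"text": "world"}},
    ]
}

response_body = process_batch(request_body)
print(json.dumps(response_body, indent=2))
```

In a real deployment this logic would sit inside an HTTP handler; the sketch leaves out the hosting layer so the contract itself stays visible.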
When including custom skills in your skillset definition, you specify the skill type as 'WebApiSkill' and configure several properties: the 'uri' pointing to your endpoint, 'httpMethod' (usually POST), 'timeout' for request duration, 'batchSize' for processing efficiency, and 'degreeOfParallelism' for concurrent requests.
You must define 'inputs' that map enrichment tree nodes to your skill's expected parameters and 'outputs' that name the results your skill returns. Input sources are paths rooted at '/document' (for example, '/document/content'), and each output is added to the enrichment tree under the skill's context.
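A skill definition along these lines can be sketched as a Python dictionary that serializes to the JSON used in a skillset. The skill name, the placeholder URI, and the 'text' and 'upperText' parameter names are assumptions for illustration, not values from any real deployment.

```python
import json

# Sketch of a WebApiSkill entry for a skillset's "skills" array.
custom_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "name": "my-custom-skill",                      # hypothetical name
    "uri": "https://example.invalid/api/enrich",    # placeholder endpoint
    "httpMethod": "POST",
    "timeout": "PT30S",                             # duration string for 30 seconds
    "batchSize": 10,
    "degreeOfParallelism": 2,
    "context": "/document",
    # Map an enrichment tree node to the parameter the endpoint expects.
    "inputs": [{"name": "text", "source": "/document/content"}],
    # Name the result; it is added to the tree under the context node.
    "outputs": [{"name": "upperText", "targetName": "upperText"}],
}

print(json.dumps(custom_skill, indent=2))
```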
Authentication options include API keys passed via headers or Azure Active Directory tokens for enhanced security. You can also configure retry policies for resilience.
Common use cases for custom skills include proprietary entity recognition, sentiment analysis in specific domains, language translation using external services, document classification using trained machine learning models, and data validation or transformation logic unique to your business requirements.
Proper error handling ensures the indexer continues processing even when individual records fail, maintaining pipeline reliability.
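One way to keep a single bad record from failing the whole batch is to wrap the per-record work in a try/except and report the failure in that record's 'errors' array. This is a sketch under assumptions: the enrich function and its 'text' and 'length' fields are hypothetical.

```python
def enrich(data: dict) -> dict:
    """Hypothetical enrichment that fails when 'text' is missing."""
    text = data["text"]  # raises KeyError if the input is absent
    return {"length": len(text)}


def process_with_isolation(body: dict) -> dict:
    """Process each record independently so one failure does not abort the batch."""
    results = []
    for record in body.get("values", []):
        entry = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        try:
            entry["data"] = enrich(record.get("data", {}))
        except Exception as exc:
            # Report the problem on this record only.
            entry["errors"].append({"message": f"Could not process record: {exc!r}"})
        results.append(entry)
    return {"values": results}


batch = {
    "values": [
        {"recordId": "a", "data": {"text": "hello"}},
        {"recordId": "b", "data": {}},  # missing 'text'
    ]
}
outcome = process_with_isolation(batch)
print(outcome)
```

The second record comes back with an error message and empty data, while the first is enriched normally.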
Implementing Custom Skills in Azure AI Search
Why Custom Skills Are Important
Custom skills extend Azure AI Search's built-in cognitive capabilities, allowing you to incorporate specialized AI processing that isn't available out-of-the-box. They enable organizations to leverage proprietary models, call external APIs, or implement unique business logic within their search enrichment pipeline. This flexibility is crucial for scenarios requiring domain-specific entity extraction, custom classification, or integration with third-party services.
What Are Custom Skills?
Custom skills are web API endpoints that you create and host to perform specific enrichment tasks during the indexing process. They integrate into a skillset alongside built-in cognitive skills and follow a standardized input/output contract. Azure AI Search calls your custom skill endpoint, passes document data, receives enriched results, and incorporates them into the search index.
How Custom Skills Work
The process follows these steps:
1. Create a Web API - Develop an HTTP endpoint (commonly using Azure Functions) that accepts JSON input and returns JSON output
2. Implement the Contract - Your API must accept a JSON body containing a values array whose records each have a recordId and data, and return a values array whose records each contain recordId, data, errors, and warnings
3. Define the Custom Skill in Skillset - Use the WebApiSkill type with properties including uri, httpMethod, httpHeaders, inputs, and outputs
4. Map Outputs - Connect skill outputs to index fields through output field mappings
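Step 4 can be sketched as the outputFieldMappings section of an indexer definition, shown here as a Python dictionary. The resource names and the 'upperText' field are hypothetical; the helper simply checks that every mapping reads from a path in the enrichment tree.

```python
# Sketch of an indexer definition that routes an enriched node to an index field.
indexer = {
    "name": "my-indexer",               # hypothetical resource names
    "dataSourceName": "my-datasource",
    "targetIndexName": "my-index",
    "skillsetName": "my-skillset",
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/upperText",  # node produced by the skill
            "targetFieldName": "upperText",            # field in the search index
        }
    ],
}


def unrooted_mappings(definition: dict) -> list:
    """Return mappings whose source is not a path under /document."""
    return [
        m for m in definition.get("outputFieldMappings", [])
        if not m["sourceFieldName"].startswith("/document")
    ]


print(unrooted_mappings(indexer))
```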
Key Configuration Properties
- @odata.type: Must be #Microsoft.Skills.Custom.WebApiSkill
- uri: The endpoint URL of your custom skill
- batchSize: Number of records sent per request (default 1000)
- degreeOfParallelism: Number of concurrent calls (default 5, minimum 1, maximum 10)
- timeout: Request timeout, expressed as an XSD dayTimeDuration such as PT60S (default 30 seconds)
- authResourceId: For managed identity authentication
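The tuning properties can be assembled and sanity-checked with a small helper like the one below. The helper name is hypothetical, and the ranges it enforces (timeout of 1 to 230 seconds, degreeOfParallelism of 1 to 10) reflect the Web API skill reference as understood at the time of writing; confirm them against the current documentation before relying on them.

```python
def tuning_properties(timeout_seconds: int, batch_size: int,
                      degree_of_parallelism: int) -> dict:
    """Build the optional tuning properties of a WebApiSkill definition."""
    # Assumed limits; verify against the current service documentation.
    if not 1 <= timeout_seconds <= 230:
        raise ValueError("timeout must be between 1 and 230 seconds")
    if batch_size < 1:
        raise ValueError("batchSize must be at least 1")
    if not 1 <= degree_of_parallelism <= 10:
        raise ValueError("degreeOfParallelism must be between 1 and 10")
    return {
        "timeout": f"PT{timeout_seconds}S",  # duration string, e.g. PT60S
        "batchSize": batch_size,
        "degreeOfParallelism": degree_of_parallelism,
    }


print(tuning_properties(60, 10, 2))
```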
Authentication Options
Custom skills support multiple authentication methods: API keys passed in headers, managed identity authentication using authResourceId, or no authentication for testing purposes.
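The two secured options can be sketched as small helpers that add the relevant property to a skill definition. The helper names and the placeholder endpoint are hypothetical. The 'x-functions-key' header applies when the endpoint is an Azure Function protected by a function key, and the 'api://<appId>' form for authResourceId is an assumption to confirm against the current documentation.

```python
base_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "uri": "https://example.invalid/api/enrich",  # placeholder endpoint
    "httpMethod": "POST",
}


def with_api_key(skill: dict, header_name: str, key: str) -> dict:
    """Return a copy of the skill that sends a key in a request header."""
    secured = dict(skill)
    secured["httpHeaders"] = {**skill.get("httpHeaders", {}), header_name: key}
    return secured


def with_managed_identity(skill: dict, app_id: str) -> dict:
    """Return a copy of the skill that authenticates with a managed identity."""
    secured = dict(skill)
    secured["authResourceId"] = f"api://{app_id}"  # assumed identifier format
    return secured


key_skill = with_api_key(base_skill, "x-functions-key", "<function-key>")
identity_skill = with_managed_identity(base_skill, "<app-id>")
print(key_skill["httpHeaders"])
print(identity_skill["authResourceId"])
```

Both helpers return copies, so the unauthenticated base definition stays available for local testing.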
Exam Tips: Answering Questions on Custom Skills
Focus on the JSON Contract: Questions often test your knowledge of the required input/output format. Remember that each record must have a recordId that matches between request and response.
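A quick way to check the recordId rule while testing an endpoint is to compare the identifiers sent with the identifiers returned. This is an illustrative sketch; the function name and sample payloads are hypothetical.

```python
def record_id_mismatches(request_body: dict, response_body: dict) -> tuple:
    """Return (missing, unexpected) recordIds between a request and its response."""
    sent = {r["recordId"] for r in request_body.get("values", [])}
    returned = {r["recordId"] for r in response_body.get("values", [])}
    missing = sent - returned       # sent but never answered
    unexpected = returned - sent    # answered but never sent
    return missing, unexpected


sample_request = {"values": [{"recordId": "1", "data": {}}, {"recordId": "2", "data": {}}]}
sample_response = {"values": [{"recordId": "1", "data": {}}, {"recordId": "9", "data": {}}]}

missing_ids, unexpected_ids = record_id_mismatches(sample_request, sample_response)
print(missing_ids, unexpected_ids)
```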
Know the WebApiSkill Properties: Be familiar with mandatory properties like uri, inputs, and outputs, and understand optional ones like batchSize and timeout.
Understand Error Handling: Custom skills should return errors and warnings per record rather than failing the entire batch. Questions may present scenarios about graceful degradation.
Azure Functions Integration: Many questions assume Azure Functions as the hosting platform. Know that HTTP-triggered functions with an anonymous or function-level authorization level are commonly used.
Performance Considerations: Questions may ask about optimizing throughput using batchSize and degreeOfParallelism settings.
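The interaction between the two settings can be illustrated with simple arithmetic. The model below is a deliberate simplification: it assumes every request carries a full batch and that all parallel calls finish together, which real indexer runs will not do exactly. The function name is hypothetical.

```python
import math


def request_plan(record_count: int, batch_size: int,
                 degree_of_parallelism: int) -> dict:
    """Estimate request counts for a given batchSize and degreeOfParallelism."""
    requests = math.ceil(record_count / batch_size)
    return {
        "requests": requests,
        # Upper bound on records the endpoint handles at the same moment.
        "max_records_in_flight": batch_size * degree_of_parallelism,
        # Rounds of parallel calls under the simplified model.
        "waves": math.ceil(requests / degree_of_parallelism),
    }


print(request_plan(1000, 10, 5))
# {'requests': 100, 'max_records_in_flight': 50, 'waves': 20}
```

Raising batchSize reduces the number of requests but increases the work per call, so it has to be balanced against the skill's timeout.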
Security Scenarios: Expect questions about securing custom skill endpoints using API keys in headers or managed identity authentication.
Context vs. Data: Remember that the context property in skill definitions determines at what level the skill operates (document level or per item in a collection).
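The effect of context can be sketched by comparing a document-level skill with one that runs once per item in a collection. The '/document/pages/*' collection and the 'category' and 'pageCategory' names are hypothetical; the helper shows where each output would land in the enrichment tree.

```python
# Runs once per document.
document_level = {
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "category", "targetName": "category"}],
}

# Runs once per element of the (hypothetical) pages collection.
per_page = {
    "context": "/document/pages/*",
    "inputs": [{"name": "text", "source": "/document/pages/*"}],
    "outputs": [{"name": "category", "targetName": "pageCategory"}],
}


def output_paths(skill: dict) -> list:
    """Return the enrichment tree paths where a skill's outputs are written."""
    return [
        f"{skill['context']}/{o.get('targetName', o['name'])}"
        for o in skill["outputs"]
    ]


print(output_paths(document_level))  # ['/document/category']
print(output_paths(per_page))        # ['/document/pages/*/pageCategory']
```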
Common Trap: Don't confuse custom skills with built-in skills. A custom Web API skill uses the WebApiSkill type and requires an external endpoint that you host.