- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: None
- Component/s: AI Nuxeo Services
A subscriber recently encountered the following error in the ComprehendService:

text$aws.textKeyphrase$enrichment.inPool-00,in:2,inCheckpoint:2,out:0,lastRead:1682387158467,lastTimer:0,wm:220513687014735873,loop:3,checkpoint
com.amazonaws.services.comprehend.model.TextSizeLimitExceededException: Input text size exceeds limit. Max length of request text allowed is 100000 bytes while in this request the text size is 132253 bytes (Service: AmazonComprehend; Status Code: 400; Error Code: TextSizeLimitExceededException; Request ID: 59bc926e-aa59-4c08-8fd1-e720f2f3c7e9; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879) ~[aws-java-sdk-core-1.12.261.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418) ~[aws-java-sdk-core-1.12.261.jar:?]
    ... org.nuxeo.ai.comprehend.ComprehendServiceImpl.extractKeyphrase(ComprehendServiceImpl.java:84) ~[nuxeo-ai-aws-core-2.7.13.jar:?]
    at ... org.nuxeo.ai.enrichment.EnrichingStreamProcessor$EnrichmentComputation.callProvider(EnrichingStreamProcessor.java:212) ~[nuxeo-ai-core-2.7.13.jar:?]
    at org.nuxeo.ai.enrichment.EnrichingStreamProcessor$EnrichmentComputation.processRecord(EnrichingStreamProcessor.java:147) ~[nuxeo-ai-core-2.7.13.jar:?]
    at org.nuxeo.lib.stream.computation.log.ComputationRunner.lambda$processRecordWithRetry$10(ComputationRunner.java:417) ~[nuxeo-stream-10.10-HF67.jar:?]

More context can be seen here: https://jira.nuxeo.com/browse/SUPINT-2236
This relates to quotas for Amazon Comprehend.
The quotas can be found here: https://docs.aws.amazon.com/comprehend/latest/dg/guidelines-and-limits.html
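As detailed on that page, key phrase extraction has a maximum document size of 100 KB. The failing request contained 132253 bytes of text, which is over the 100000-byte limit reported in the exception.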
AICORE-616: Resolve error in ComprehendService, skipping analysis beyond quotas
ComprehendService uses Amazon Comprehend, which enforces quotas when detecting entities, key phrases, dominant languages, sentiment, targeted sentiment, and syntax. Previously, analyzing text larger than the applicable quota failed with a TextSizeLimitExceededException. This change skips Amazon Comprehend analysis when the text is larger than the applicable quota. In the future, we may implement another solution.
The guidelines and quotas for Amazon Comprehend can be found here: https://docs.aws.amazon.com/comprehend/latest/dg/guidelines-and-limits.html
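
The skip behavior described above could look roughly like the following. This is a minimal sketch, not the actual nuxeo-ai code: the class ComprehendSizeGuard, its method names, and the MAX_KEYPHRASE_BYTES constant are all hypothetical, the 100000-byte value is taken from the exception message in this ticket, and it assumes the limit is measured on the UTF-8 encoded text. The real call to Amazon Comprehend is represented by a plain Function so that the example stays self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of nuxeo-ai or the AWS SDK. */
final class ComprehendSizeGuard {

    /** Assumed key phrase limit, copied from the exception message (100000 bytes). */
    static final int MAX_KEYPHRASE_BYTES = 100_000;

    private ComprehendSizeGuard() {
    }

    /** Size of the text once encoded as UTF-8 (assumed to be what the quota counts). */
    static int utf8Size(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True when the text is non-null and its UTF-8 size does not exceed maxBytes. */
    static boolean withinQuota(String text, int maxBytes) {
        return text != null && utf8Size(text) <= maxBytes;
    }

    /**
     * Runs the analyzer only when the text fits within the quota.
     * Returns an empty Optional when the analysis is skipped.
     */
    static <R> Optional<R> analyzeIfWithinQuota(String text, int maxBytes,
            Function<String, R> analyzer) {
        if (!withinQuota(text, maxBytes)) {
            return Optional.empty(); // skip instead of triggering a 400 from the service
        }
        return Optional.ofNullable(analyzer.apply(text));
    }
}
```

A caller would pass the real Comprehend invocation as the analyzer and treat an empty result as "no enrichment for this record". Checking the encoded byte length rather than String.length() matters because non-ASCII characters take more than one byte in UTF-8.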
- is related to: AICORE-617 Forward integrate nuxeo-ai LTS 2019 changes to master for LTS 2021 (Resolved)