From 9527749f50e3d3c6f52a44ba814c8b1182fdac31 Mon Sep 17 00:00:00 2001
From: "mergify[bot]" <37929162+mergify[bot]@users.noreply.github.com>
Date: Tue, 30 Jan 2024 09:19:03 +0100
Subject: [PATCH] Suggest chunking for large ELSER fields (#2660) (#2662)

(cherry picked from commit f4dacc9dd2b116377ceea3c2707ad1f97356f582)

Co-authored-by: Sean Story <sean.j.story@gmail.com>
---
 docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
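
Reviewer note (not part of the commit): the chunking suggestion in the hunk
below could be sketched roughly as follows. This is a minimal illustration
under stated assumptions, not ELSER's own method. The `chunk_text` helper and
the `max_words` and `overlap` parameters are hypothetical. Whitespace-separated
words stand in for model tokens, and because one word can map to several
tokens, the default of 300 words is an arbitrary conservative guess, not a
documented value.

```python
def chunk_text(text: str, max_words: int = 300, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most max_words whitespace-separated words.

    Assumption: word count is only a rough proxy for the model's token count,
    so max_words should stay well below the 512-token limit.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text,
        # so an overlapping tail is not emitted as a redundant chunk.
        if start + max_words >= len(words):
            break
    return chunks
```

Each returned string could then be stored as one value of a multi-valued
field, so that every value stays under the limit. A small overlap may help
preserve context across chunk boundaries, at the cost of more values to
process at ingest time.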

diff --git a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
index 49973223b..ddfc41275 100644
--- a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
+++ b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
@@ -397,9 +397,11 @@ image::images/ml-nlp-elser-v2-test.png[alt="Testing ELSER",align="center"]
 * ELSER works best on small-to-medium sized fields that contain natural 
 language. For connector or web crawler use cases, this aligns best with fields 
 like _title_, _description_, _summary_, or _abstract_. As ELSER encodes the 
-first 512 tokens of a field, it may not be as good a match for `body_content` on 
-web crawler documents, or body fields resulting from extracting text from office 
-documents with connectors.
+first 512 tokens of a field, it may not return results that are as relevant for
+large fields, such as `body_content` on web crawler documents, or body fields
+resulting from extracting text from office documents with connectors. For larger
+fields like these, consider "chunking" the content into multiple values, where
+each chunk is under 512 tokens.
 * Larger documents take longer at ingestion time, and {infer} time per 
 document also increases the more fields in a document that need to be processed.
 * The more fields your pipeline has to perform inference on, the longer it takes 
@@ -510,4 +512,4 @@ image::images/ml-nlp-elser-v2-opt-bm-results.png[alt="ELSER V2 optimized benchma
 respectively 14 docs/s and 16 docs/s, indicating a performance improvement due 
 to virtual cores of 12%.
 
-image::images/ml-nlp-elser-v2-cp-bm-results.png[alt="ELSER V2 cross-platform benchmarks",align="center"]
\ No newline at end of file
+image::images/ml-nlp-elser-v2-cp-bm-results.png[alt="ELSER V2 cross-platform benchmarks",align="center"]