FSCrawler jobs with an Elasticsearch pipeline setting fail with: Can't find stored field name to check existing filenames in path [/]. Please set store: true on field [file.filename] #1238
When we run FSCrawler jobs with an Elasticsearch pipeline added to the _settings.yaml file, we get errors like the one below.

16:25:25,859 WARN [f.p.e.c.f.FsParserAbstract] Can't find stored field name to check existing filenames in path [/mnt/folder1/document/files/qp]. Please set store: true on field [file.filename]

We need the pipeline to replace part of the file.url field's content. We need the search results to display the file.url content differently.

When we add the pipeline setting to the job's settings file, the index is created, but sometimes the total document count doesn't match the contents of the folder. Crawls of smaller content finish, but crawls of larger content error out, stop crawling, and stop adding documents to the Elasticsearch index. We also noticed that even when a crawl completes, adding new documents to the crawl location or removing documents doesn't update the index.

FYI: The folder locations (the paths where the crawled content lives) are mounted shares on a Windows operating system; I'm not sure if that's causing an issue. The FSCrawler job is running on a RHEL 8 server.

Here's the Elasticsearch pipeline script:
PUT _ingest/pipeline/fscrawler

We added the pipeline settings to the jobs:

Here's the index field mapping generated by FSCrawler:
{

Here's the _settings.yaml file:
name: "qp"
ocr:
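
The pipeline body itself did not survive in this post, so the following is only an illustrative sketch of the kind of pipeline described above: one that rewrites part of file.url with Elasticsearch's gsub processor. The pattern, the replacement, and the sample URL are hypothetical placeholders, not the values actually used here. The script builds the JSON body and previews the rewrite locally with Python's re module, which only approximates the Java regex engine Elasticsearch uses.

```python
import json
import re

# Hypothetical values: the original post does not show the real pattern
# or replacement, so these are placeholders for illustration only.
OLD_PREFIX = r"^file:///mnt/folder1"      # assumed start of file.url
NEW_PREFIX = "https://docs.example.com"   # assumed replacement text


def build_pipeline(pattern, replacement):
    """Return an ingest pipeline body with one gsub processor on file.url."""
    return {
        "description": "Rewrite part of file.url (illustrative sketch)",
        "processors": [
            {
                "gsub": {
                    "field": "file.url",
                    "pattern": pattern,
                    "replacement": replacement,
                    # Skip documents that have no file.url instead of failing.
                    "ignore_missing": True,
                }
            }
        ],
    }


def preview(url, pattern, replacement):
    """Approximate locally what the gsub processor would do to one value."""
    return re.sub(pattern, replacement, url)


pipeline = build_pipeline(OLD_PREFIX, NEW_PREFIX)
pipeline_json = json.dumps(pipeline, indent=2)

# Print the body so it can be pasted after the PUT request line.
print(pipeline_json)
print(preview("file:///mnt/folder1/document/files/qp/a.pdf", OLD_PREFIX, NEW_PREFIX))
```

The printed JSON could be used as the body of PUT _ingest/pipeline/fscrawler, and the job would then point at that pipeline id through the pipeline setting under elasticsearch in _settings.yaml. Previewing the rewrite on a few real file.url values first may help confirm the pattern matches before a full crawl.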
Replies: 1 comment
Duplicate of #1240