Allow doc-values only search on ip fields #82929

Merged
merged 2 commits into from
Jan 25, 2022
4 changes: 2 additions & 2 deletions docs/reference/mapping/params/doc-values.asciidoc
@@ -17,8 +17,8 @@ makes this data access pattern possible. They store the same values as the
 sorting and aggregations. Doc values are supported on almost all field types,
 with the __notable exception of `text` and `annotated_text` fields__.
 
-<<number,Numeric types>>, <<date,date types>>, the <<boolean,boolean type>>
-and the <<keyword,keyword type>>
+<<number,Numeric types>>, <<date,date types>>, the <<boolean,boolean type>>,
+the <<ip,ip type>> and the <<keyword,keyword type>>
 can also be queried using term or range-based queries
 when they are not <<mapping-index,indexed>> but only have doc values enabled.
 Query performance on doc values is much slower than on index structures, but
5 changes: 4 additions & 1 deletion docs/reference/mapping/types/ip.asciidoc
@@ -57,7 +57,10 @@ The following parameters are accepted by `ip` fields:
 
 <<mapping-index,`index`>>::
 
-Should the field be searchable? Accepts `true` (default) and `false`.
+Should the field be quickly searchable? Accepts `true` (default) and
+`false`. Fields that only have <<doc-values,`doc_values`>>
+enabled can still be queried using term or range-based queries,
+albeit slower.
 
 <<null-value,`null_value`>>::
 
4 changes: 2 additions & 2 deletions docs/reference/query-dsl.asciidoc
@@ -33,8 +33,8 @@ the stability of the cluster. Those queries can be categorised as follows:
 
 * Queries that need to do linear scans to identify matches:
 ** <<query-dsl-script-query,`script` queries>>
-** queries on <<number,numeric>>, <<date,date>>, <<boolean,boolean>>, or <<keyword,keyword>> fields that are not indexed
-but have <<doc-values,doc values>> enabled
+** queries on <<number,numeric>>, <<date,date>>, <<boolean,boolean>>, <<ip,ip>> or <<keyword,keyword>> fields
+that are not indexed but have <<doc-values,doc values>> enabled
 
 * Queries that have a high up-front cost:
 ** <<query-dsl-fuzzy-query,`fuzzy` queries>> (except on

@@ -92,6 +92,9 @@ setup:
               non_indexed_boolean:
                 type: boolean
                 index: false
+              non_indexed_ip:
+                type: ip
+                index: false
               geo:
                 type: keyword
               object:
@@ -255,6 +258,18 @@ setup:
 
   - match: {fields.non_indexed_boolean.boolean.searchable: true}
 
+---
+"Field caps for ip field with only doc values":
+  - skip:
+      version: " - 8.0.99"
+      reason: "doc values search was added in 8.1.0"
+  - do:
+      field_caps:
+        index: 'test1,test2,test3'
+        fields: non_indexed_ip
+
+  - match: {fields.non_indexed_ip.ip.searchable: true}
+
 ---
 "Get object and nested field caps":
 

@@ -42,6 +42,9 @@ setup:
               boolean:
                 type: boolean
                 index: false
+              ip:
+                type: ip
+                index: false
 
   - do:
       index:
@@ -58,6 +61,7 @@ setup:
           date: "2017/01/01"
           keyword: "key1"
           boolean: "false"
+          ip: "192.168.0.1"
 
   - do:
       index:
@@ -74,6 +78,7 @@ setup:
           date: "2017/01/02"
           keyword: "key2"
           boolean: "true"
+          ip: "192.168.0.2"
 
   - do:
       indices.refresh: {}
@@ -284,3 +289,30 @@ setup:
         index: test
         body: { query: { range: { boolean: { gte: "false" } } } }
   - length: { hits.hits: 2 }
+
+---
+"Test match query on ip field where only doc values are enabled":
+
+  - do:
+      search:
+        index: test
+        body: { query: { match: { ip: { query: "192.168.0.1" } } } }
+  - length: { hits.hits: 1 }
+
+---
+"Test terms query on ip field where only doc values are enabled":
+
+  - do:
+      search:
+        index: test
+        body: { query: { terms: { ip: [ "192.168.0.1", "192.168.0.2" ] } } }
+  - length: { hits.hits: 2 }
+
+---
+"Test range query on ip field where only doc values are enabled":
+
+  - do:
+      search:
+        index: test
+        body: { query: { range: { ip: { gte: "192.168.0.1" } } } }
+  - length: { hits.hits: 2 }
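
Note on the tests above (illustrative, not part of the change): they query an `ip` field that has doc values but no index, so each query has to be answered by checking every document's stored value. A minimal, self-contained Java sketch of that idea follows. The class `DocValuesIpScanSketch` and its methods are hypothetical names, the 16-byte IPv4-mapped encoding is an assumption of the sketch, and the terms case is modelled here as a union of exact scans, which may differ from the query Elasticsearch actually builds.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Elasticsearch or Lucene.
final class DocValuesIpScanSketch {

    // Largest representable address, used as the open upper bound of a "gte" range.
    static final String MAX_IP = "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff";

    // Encode an IP literal as 16 bytes. IPv4 is widened to an IPv4-mapped IPv6
    // address (an assumption of this sketch) so all values share one sort order.
    // Only IP literals are passed in here; getByName would also resolve host names.
    static byte[] encode(String ip) {
        byte[] raw;
        try {
            raw = InetAddress.getByName(ip).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP address: " + ip, e);
        }
        if (raw.length == 16) {
            return raw;
        }
        byte[] mapped = new byte[16];
        mapped[10] = (byte) 0xff;
        mapped[11] = (byte) 0xff;
        System.arraycopy(raw, 0, mapped, 12, 4);
        return mapped;
    }

    // Linear scan: visit every document and keep the ids whose value lies in [lower, upper].
    static List<Integer> scan(String[] docValues, String lower, String upper) {
        byte[] lo = encode(lower);
        byte[] hi = encode(upper);
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < docValues.length; doc++) {
            byte[] value = encode(docValues[doc]);
            if (Arrays.compareUnsigned(value, lo) >= 0 && Arrays.compareUnsigned(value, hi) <= 0) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same two addresses as the documents indexed in the test setup.
        String[] docs = { "192.168.0.1", "192.168.0.2" };

        // Exact match, expressed as a range whose bounds are equal.
        System.out.println(scan(docs, "192.168.0.1", "192.168.0.1"));

        // Terms, modelled as the union of one exact scan per term.
        TreeSet<Integer> terms = new TreeSet<>(scan(docs, "192.168.0.1", "192.168.0.1"));
        terms.addAll(scan(docs, "192.168.0.2", "192.168.0.2"));
        System.out.println(terms);

        // Range with only a lower bound (gte).
        System.out.println(scan(docs, "192.168.0.1", MAX_IP));
    }
}
```

The cost of each call grows with the number of documents, which is the trade-off the documentation change describes as "albeit slower".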

@@ -14,6 +14,7 @@
 import org.apache.lucene.document.StoredField;
 import org.apache.lucene.index.LeafReaderContext;
 import org.apache.lucene.search.MatchNoDocsQuery;
+import org.apache.lucene.search.PointRangeQuery;
 import org.apache.lucene.search.Query;
 import org.apache.lucene.util.BytesRef;
 import org.elasticsearch.Version;
@@ -204,14 +205,27 @@ public IpFieldType(
         }
 
         public IpFieldType(String name) {
-            this(name, true, false, true, null, null, Collections.emptyMap(), false);
+            this(name, true, true);
         }
+
+        public IpFieldType(String name, boolean isIndexed) {
+            this(name, isIndexed, true);
+        }
+
+        public IpFieldType(String name, boolean isIndexed, boolean hasDocValues) {
+            this(name, isIndexed, false, hasDocValues, null, null, Collections.emptyMap(), false);
+        }
 
         @Override
         public String typeName() {
             return CONTENT_TYPE;
         }
 
+        @Override
+        public boolean isSearchable() {
+            return isIndexed() || hasDocValues();
+        }
+
         @Override
         public boolean mayExistInIndex(SearchExecutionContext context) {
             return context.fieldExistsInIndex(name());
@@ -252,25 +266,48 @@ protected Object parseSourceValue(Object value) {
 
         @Override
         public Query termQuery(Object value, @Nullable SearchExecutionContext context) {
-            failIfNotIndexed();
+            failIfNotIndexedNorDocValuesFallback(context);
+            Query query;
             if (value instanceof InetAddress) {
-                return InetAddressPoint.newExactQuery(name(), (InetAddress) value);
+                query = InetAddressPoint.newExactQuery(name(), (InetAddress) value);
             } else {
                 if (value instanceof BytesRef) {
                     value = ((BytesRef) value).utf8ToString();
                 }
                 String term = value.toString();
                 if (term.contains("/")) {
                     final Tuple<InetAddress, Integer> cidr = InetAddresses.parseCidr(term);
-                    return InetAddressPoint.newPrefixQuery(name(), cidr.v1(), cidr.v2());
+                    query = InetAddressPoint.newPrefixQuery(name(), cidr.v1(), cidr.v2());
+                } else {
+                    InetAddress address = InetAddresses.forString(term);
+                    query = InetAddressPoint.newExactQuery(name(), address);
                 }
-                InetAddress address = InetAddresses.forString(term);
-                return InetAddressPoint.newExactQuery(name(), address);
             }
+            if (isIndexed()) {
+                return query;
+            } else {
+                return convertToDocValuesQuery(query);
+            }
         }
 
+        static Query convertToDocValuesQuery(Query query) {
+            assert query instanceof PointRangeQuery;
+            PointRangeQuery pointRangeQuery = (PointRangeQuery) query;
+            return SortedSetDocValuesField.newSlowRangeQuery(
+                pointRangeQuery.getField(),
+                new BytesRef(pointRangeQuery.getLowerPoint()),
+                new BytesRef(pointRangeQuery.getUpperPoint()),
+                true,
+                true
+            );
+        }
+
         @Override
         public Query termsQuery(Collection<?> values, SearchExecutionContext context) {
+            failIfNotIndexedNorDocValuesFallback(context);
+            if (isIndexed() == false) {
+                return super.termsQuery(values, context);
+            }
             InetAddress[] addresses = new InetAddress[values.size()];
             int i = 0;
             for (Object value : values) {
@@ -301,14 +338,15 @@ public Query rangeQuery(
             boolean includeUpper,
             SearchExecutionContext context
         ) {
-            failIfNotIndexed();
-            return rangeQuery(
-                lowerTerm,
-                upperTerm,
-                includeLower,
-                includeUpper,
-                (lower, upper) -> InetAddressPoint.newRangeQuery(name(), lower, upper)
-            );
+            failIfNotIndexedNorDocValuesFallback(context);
+            return rangeQuery(lowerTerm, upperTerm, includeLower, includeUpper, (lower, upper) -> {
+                Query query = InetAddressPoint.newRangeQuery(name(), lower, upper);
+                if (isIndexed()) {
+                    return query;
+                } else {
+                    return convertToDocValuesQuery(query);
+                }
+            });
         }
 
         /**
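
Note on `convertToDocValuesQuery` (illustrative, not part of the change): the method asserts that every query it receives is a `PointRangeQuery` and reuses its lower and upper points as inclusive bounds. As far as I understand Lucene's `InetAddressPoint`, that holds because exact and prefix (CIDR) queries are themselves expressed as ranges: an exact match is a range whose bounds are equal, and a prefix is the range from "host bits all zero" to "host bits all one". The sketch below shows that reduction using only the JDK. `CidrToRangeSketch` and its methods are hypothetical names, and the 16-byte IPv4-mapped encoding is an assumption of the sketch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical illustration only; this is not the Lucene or Elasticsearch implementation.
final class CidrToRangeSketch {

    // Widen a 4-byte IPv4 address to a 16-byte IPv4-mapped IPv6 address (assumption of this sketch).
    static byte[] to16(byte[] raw) {
        if (raw.length == 16) {
            return raw;
        }
        byte[] mapped = new byte[16];
        mapped[10] = (byte) 0xff;
        mapped[11] = (byte) 0xff;
        System.arraycopy(raw, 0, mapped, 12, 4);
        return mapped;
    }

    // Turn "a.b.c.d" or "a.b.c.d/len" into inclusive {lower, upper} bounds.
    // The prefix length counts bits of the address as written (0-32 for IPv4).
    static byte[][] bounds(String term) {
        int slash = term.indexOf('/');
        String host = slash < 0 ? term : term.substring(0, slash);
        byte[] lower;
        try {
            lower = InetAddress.getByName(host).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP address: " + host, e);
        }
        byte[] upper = lower.clone();
        int totalBits = 8 * lower.length;
        int prefixLength = slash < 0 ? totalBits : Integer.parseInt(term.substring(slash + 1));
        if (prefixLength < 0 || prefixLength > totalBits) {
            throw new IllegalArgumentException("bad prefix length in: " + term);
        }
        // Clear the host bits for the lower bound and set them for the upper bound.
        for (int bit = prefixLength; bit < totalBits; bit++) {
            int mask = 1 << (7 - (bit & 7));
            lower[bit >> 3] &= ~mask;
            upper[bit >> 3] |= mask;
        }
        return new byte[][] { to16(lower), to16(upper) };
    }

    // Inclusive range check on the unsigned byte order of the encoded address.
    static boolean matches(String ip, byte[][] bounds) {
        byte[] value = bounds(ip)[0]; // no prefix given, so lower == upper == the address itself
        return Arrays.compareUnsigned(value, bounds[0]) >= 0
            && Arrays.compareUnsigned(value, bounds[1]) <= 0;
    }

    public static void main(String[] args) {
        byte[][] net = bounds("192.168.0.0/24");
        System.out.println(matches("192.168.0.1", net)); // inside the /24
        System.out.println(matches("192.168.1.1", net)); // outside the /24
    }
}
```

Because both query shapes reduce to one pair of bounds, a single conversion routine can serve the term, prefix and range paths, which appears to be the design choice the diff relies on.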