-
Notifications
You must be signed in to change notification settings - Fork 46
ConjureCollections uses forEach where possible #2524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Generate changelog in
|
Preconditions.checkNotNull(elementsToAdd, "elementsToAdd cannot be null") | ||
.forEach(addTo::add); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the default forEach
on Iterable does
default void forEach(Consumer<? super T> action) {
Objects.requireNonNull(action);
for (T t : this) {
action.accept(t);
}
}
So you know for which iterable types you expect this to be an improvement? (I've found a couple, for it's not always implemented differently)
(I still think it's better because in the worst case, it will be the same, but I don't believe this will necessarily net us benefits in most cases)
List<T> arrayList = newList(iterable); | ||
for (T item : arrayList) { | ||
Preconditions.checkNotNull(item, "iterable cannot contain null elements"); | ||
} | ||
|
||
arrayList.forEach(ConjureCollections::checkNotNullElement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: ArrayList overrides forEach and does not allocate an iterator, so even though we already loop over the contents in newList(iterable)
, in the worst case, this won't allocate two iterators
if (iterable instanceof Collection) { | ||
return new ArrayList<>((Collection<T>) iterable); | ||
if (iterable instanceof Collection<? extends T> collection) { | ||
return new ArrayList<>(collection); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but more for my understanding/knowledge, since we're looking at optimizing allocations: From what I can tell, this might allocate the underlying array twice?
This does
public ArrayList(Collection<? extends E> c) {
Object[] a = c.toArray();
if ((size = a.length) != 0) {
if (c.getClass() == ArrayList.class) {
elementData = a;
} else {
elementData = Arrays.copyOf(a, size, Object[].class);
}
} else {
// replace with empty array.
elementData = EMPTY_ELEMENTDATA;
}
}
and Collection#toArray
says
The returned array will be "safe" in that no references to it are maintained by this collection. (In other words, this method must allocate a new array even if this collection is backed by an array). The caller is thus free to modify the returned array.
Unfortunately, it seems like addAll
also does the same 🤔 (so using new ArrayList(collection.size())
and addAll
wouldn't help either)
public boolean addAll(Collection<? extends E> c) {
Object[] a = c.toArray();
modCount++;
int numNew = a.length;
if (numNew == 0)
return false;
Object[] elementData;
final int s;
if (numNew > (elementData = this.elementData).length - (s = size))
elementData = grow(s + numNew);
System.arraycopy(a, 0, elementData, s, numNew);
size = s + numNew;
return true;
}
Curious if you have any knowledge about this and why we would be allocating the same array twice
Preconditions.checkNotNull(item, "iterable cannot contain null elements"); | ||
} | ||
|
||
set.forEach(ConjureCollections::checkNotNullElement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fwiw, LinkedHashSet does not seem to override Iterable's default forEach implementation, so we'll still create the iterator here afaict
addTo.addAll(new AbstractCollection<>() { | ||
@Override | ||
public Iterator<T> iterator() { | ||
return new NonNullIterator<>(collection.iterator()); | ||
} | ||
|
||
@Override | ||
public Object[] toArray() { | ||
Object[] array = collection.toArray(); | ||
for (Object element : array) { | ||
checkNotNullElement(element); | ||
} | ||
return array; | ||
} | ||
|
||
@Override | ||
public int size() { | ||
return collection.size(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an ideal world I think we'd implement this within specialized non-null collections to avoid the new anonymous wrapper allocation on a per-call basis. Getting there would be a little tricky, because this API is fairly specialized, however it's in the internal
package so we can make some assumptions about how it's used.
Probably not a substantial optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose another option is to make the assumption that toArray
will be used, and explicitly handle that ourselves:
T[] values = elementsToAdd.toArray();
addTo.ensureCapacity[IfPossible](values.length);
for (T element : values) { // no more iterator allocation because we're iterating over an array
Preconditions.checkNotNull(element, "elementsToAdd cannot contain null elements");
addTo.add(element);
}
Before this PR
ConjureCollections
used iterator based for loopsAfter this PR
==COMMIT_MSG==
ConjureCollections uses
Iterable#forEach
where possible to minimize allocations and speed up iteration over collections.==COMMIT_MSG==
Possible downsides?