# Summary

- Add additional iterator-like View objects to collections.
- Views provide a composable mechanism for in-place observation and mutation of a
+ Add additional iterator-like Entry objects to collections.
+ Entries provide a composable mechanism for in-place observation and mutation of a
single element in the collection, without having to "re-find" the element multiple times.
This deprecates several "internal mutation" methods like hashmap's `find_or_insert_with`.

# Motivation

As we approach 1.0, we'd like to normalize the standard APIs to be consistent, composable,
and simple. However, this currently stands in opposition to manipulating the collections in
- an *efficient* manner. For instance, if we wish to build an accumulating map on top of one
- of our concrete maps, we need to distinguish between the case when the element we're inserting
+ an *efficient* manner. For instance, if one wishes to build an accumulating map on top of one
+ of the concrete maps, they need to distinguish between the case when the element they're inserting
is *already* in the map, and when it's *not*. One way to do this is the following:

```
@@ -27,11 +27,11 @@ if map.contains_key(&key) {
}
```

- However, this requires us to search for `key` *twice* on every operation.
- We might be able to squeeze out the `update` re-do by matching on the result
+ However, this searches for `key` *twice* on every operation.
+ The second search can be squeezed out of the `update` re-do by matching on the result
of `find_mut`, but the `insert` case will always require a re-search.

- To solve this problem, we have an ad-hoc mix of "internal mutation" methods which
+ To solve this problem, Rust currently has an ad-hoc mix of "internal mutation" methods which
take multiple values or closures for the collection to use contextually. Hashmap in particular
has the following methods:

@@ -50,111 +50,120 @@ the same value (even though only one will ever be called). `find_with_or_insert_
is also actually performing the role of `insert_with_or_update_with`,
suggesting that these aren't well understood.

- Rust has been in this position before: internal iteration. Internal iteration was (I'm told)
+ Rust has been in this position before: internal iteration. Internal iteration was (author's note: I'm told)
confusing and complicated. However the solution was simple: external iteration. You get
all the benefits of internal iteration, but with a much simpler interface, and greater
- composability. Thus, we propose the same solution to the internal mutation problem.
+ composability. Thus, this RFC proposes the same solution to the internal mutation problem.

# Detailed design

- A fully tested "proof of concept" draft of this design has been implemented on top of pczarn's
- pending hashmap PR, as hashmap seems to be the worst offender, while still being easy
- to work with. You can
- [read the diff here](https://github.com/Gankro/rust/commit/6d6804a6d16b13d07934f0a217a3562384e55612).
+ A fully tested "proof of concept" draft of this design has been implemented on top of hashmap,
+ as it seems to be the worst offender, while still being easy to work with. You can
+ [read the diff here](https://github.com/Gankro/rust/commit/39a1fa7c7362a3e22e59ab6601ac09475daff39b).

- We replace all the internal mutation methods with a single method on a collection: `view`.
- The signature of `view` will depend on the specific collection, but generally it will be similar to
- the signature for searching in that structure. `view` will in turn return a `View` object, which
+ All the internal mutation methods are replaced with a single method on a collection: `entry`.
+ The signature of `entry` will depend on the specific collection, but generally it will be similar to
+ the signature for searching in that structure. `entry` will in turn return an `Entry` object, which
captures the *state* of a completed search, and allows mutation of the area.

For convenience, we will use the hashmap draft as an example.

```
- pub fn view<'a>(&'a mut self, key: K) -> Entry<'a, K, V>;
+ /// Get an Entry for where the given key would be inserted in the map
+ pub fn entry<'a>(&'a mut self, key: K) -> Entry<'a, K, V>;
+
+ /// A view into a single occupied location in a HashMap
+ pub struct OccupiedEntry<'a, K, V>{ ... }
+
+ /// A view into a single empty location in a HashMap
+ pub struct VacantEntry<'a, K, V>{ ... }
+
+ /// A view into a single location in a HashMap
+ pub enum Entry<'a, K, V> {
+     /// An occupied Entry
+     Occupied(OccupiedEntry<'a, K, V>),
+     /// A vacant Entry
+     Vacant(VacantEntry<'a, K, V>),
+ }
```

Of course, the real meat of the API is in the View's interface (impl details removed):

```
- impl<'a, K, V> Entry<'a, K, V> {
-     /// Get a reference to the value at the Entry's location
-     pub fn get(&self) -> Option<&V>;
+ impl<'a, K, V> OccupiedEntry<'a, K, V> {
+     /// Get a reference to the value of this Entry
+     pub fn get(&self) -> &V;

-     /// Get a mutable reference to the value at the Entry's location
-     pub fn get_mut(&mut self) -> Option<&mut V>;
+     /// Get a mutable reference to the value of this Entry
+     pub fn get_mut(&mut self) -> &mut V;

-     /// Get a reference to the key at the Entry's location
-     pub fn get_key(&self) -> Option<&K>;
+     /// Set the value stored in this Entry
+     pub fn set(mut self, value: V) -> V;

-     /// Return whether the Entry's location contains anything
-     pub fn is_empty(&self) -> bool;
-
-     /// Get a reference to the Entry's key
-     pub fn key(&self) -> &K;
-
-     /// Set the key and value of the location pointed to by the Entry, and return any old
-     /// key and value that might have been there
-     pub fn set(self, value: V) -> Option<(K, V)>;
+     /// Take the value stored in this Entry
+     pub fn take(self) -> V;
+ }

-     /// Retrieve the Entry's key
-     pub fn into_key(self) -> K;
+ impl<'a, K, V> VacantEntry<'a, K, V> {
+     /// Set the value stored in this Entry
+     pub fn set(self, value: V);
}
```

There are definitely some strange things here, so let's discuss the reasoning!

- First, `view` takes a `key` by value, because we observe that this is how all the internal mutation
- methods work. Further, taking the `key` up-front allows us to avoid *validating* provided keys if
- we require an owned `key` later. This key is effectively a *guarantor* of the view.
- To compensate, we provide an `into_key` method which converts the entry back into its guarantor.
- We also provide a `key` method for getting an immutable reference to the guarantor, in case its
- value effects any computations one wishes to perform.
+ First, `entry` takes a `key` by value, because this is the observed behaviour of the internal mutation
+ methods. Further, taking the `key` up-front allows implementations to avoid *validating* provided keys if
+ they require an owned `key` later for insertion. This key is effectively a *guarantor* of the entry.
+
+ Taking the key by-value might change once collections reform lands, and Borrow and ToOwned are available.
+ For now, it's an acceptable solution, because in particular, the primary use case of this functionality
+ is when you're *not sure* if you need to insert, in which case you should be prepared to insert.
+ Otherwise, `find_mut` is likely sufficient.

- Taking the key by-value might change once associated types land,
- and we successfully tackle the "equiv" problem. For now, it's an acceptable solution in our mind.
- In particular, the primary use case of this functionality is when you're *not sure* if you need to
- insert, in which case you should be prepared to insert. Otherwise, `find_mut` is likely sufficient.
+ The result is actually an enum, that will either be Occupied or Vacant. These two variants correspond
+ to concrete types for when the key matched something in the map, and when the key didn't, respectively.

- Next, we provide a nice simple suite of "standard" methods:
- `get`, `get_mut`, `get_key`, and `is_empty`.
- These do exactly what you would expect, and allow you to query the view to see if it is logically
- empty, and if not, what it contains.
+ If there isn't a match, the user has exactly one option: insert a value using `set`, which will also insert
+ the guarantor, and destroy the Entry. This is to avoid the costs of maintaining the structure, which
+ otherwise isn't particularly interesting anymore.

- Finally, we provide a `set` method which inserts the provided value using the guarantor key,
- and yields the old key-value pair if it existed. Note that `set` consumes the View, because
- we lose the guarantor, and the collection might have to shift around a lot to compensate.
- Maintaining the entry after an insertion would add significant cost and complexity for no
- clear gain.
+ If there is a match, a more robust set of options is provided. `get` and `get_mut` provide access to the
+ value found in the location. `set` behaves as the vacant variant, but also yields the old value. `take`
+ simply removes the found value, and destroys the entry for similar reasons as `set`.
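
To make the "captured search state" idea concrete, here is a minimal sketch of the proposed API shape on a toy association list rather than on the real hashmap. `VecMap`, its `items` field, and the saved `index` are hypothetical names chosen for illustration; the actual draft stores hashmap-specific search state instead.

```rust
use std::mem;

/// Toy association-list map (hypothetical; stands in for a real hashmap).
pub struct VecMap<K, V> {
    items: Vec<(K, V)>,
}

/// A completed search that found the key: remembers where it was found.
pub struct OccupiedEntry<'a, K, V> {
    items: &'a mut Vec<(K, V)>,
    index: usize,
}

/// A completed search that found nothing: holds the owned key (the guarantor).
pub struct VacantEntry<'a, K, V> {
    items: &'a mut Vec<(K, V)>,
    key: K,
}

pub enum Entry<'a, K, V> {
    Occupied(OccupiedEntry<'a, K, V>),
    Vacant(VacantEntry<'a, K, V>),
}

impl<K: PartialEq, V> VecMap<K, V> {
    pub fn new() -> Self {
        VecMap { items: Vec::new() }
    }

    /// Search once, then hand the result of that search to the caller.
    pub fn entry(&mut self, key: K) -> Entry<'_, K, V> {
        match self.items.iter().position(|(k, _)| *k == key) {
            Some(index) => Entry::Occupied(OccupiedEntry { items: &mut self.items, index }),
            None => Entry::Vacant(VacantEntry { items: &mut self.items, key }),
        }
    }
}

impl<'a, K, V> OccupiedEntry<'a, K, V> {
    pub fn get(&self) -> &V {
        &self.items[self.index].1
    }

    pub fn get_mut(&mut self) -> &mut V {
        &mut self.items[self.index].1
    }

    /// Replace the value and yield the old one; consumes the entry.
    pub fn set(self, value: V) -> V {
        mem::replace(&mut self.items[self.index].1, value)
    }

    /// Remove the pair and yield its value; consumes the entry.
    pub fn take(self) -> V {
        self.items.swap_remove(self.index).1
    }
}

impl<'a, K, V> VacantEntry<'a, K, V> {
    /// Insert the guarantor key together with the value; consumes the entry.
    pub fn set(self, value: V) {
        self.items.push((self.key, value));
    }
}
```

None of the entry methods searches again: the occupied case reuses the saved index, and the vacant case already owns the key it needs for insertion.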

- Let's look at how we now `insert_or_update`:
+ Let's look at how one now writes `insert_or_update`:

```
- let mut view = map.view(key);
- if !view.is_empty() {
-     let v = view.get_mut().unwrap();
-     let new_v = *v + 1;
-     *v = new_v;
- } else {
-     view.set(1);
+ match map.entry(key) {
+     Occupied(mut entry) => {
+         let v = entry.get_mut();
+         let new_v = *v + 1;
+         *v = new_v;
+     }
+     Vacant(entry) => {
+         entry.set(1);
+     }
}
```

- We can now write our "intuitive" inefficient code, but it is now as efficient as the complex
+ One can now write something equivalent to the "intuitive" inefficient code, but it is now as efficient as the complex
`insert_or_update` methods. In fact, this matches so closely to the inefficient manipulation
- that users could reasonable ignore views *until performance becomes an issue*. At which point
- it's an almost trivial migration. We also don't need closures to dance around the fact that we
- want to avoid generating some values unless we have to, because that falls naturally out of our
+ that users could reasonably ignore Entries *until performance becomes an issue*. At which point
+ it's an almost trivial migration. Closures also aren't needed to dance around the fact that one may
+ want to avoid generating some values unless they have to, because that falls naturally out of
normal control flow.
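
For comparison, a runnable version of the same accumulating pattern can be sketched against the `Entry` type in today's `std::collections::hash_map`. Note the assumptions: the standard library names the insertion method `insert` where this draft calls it `set`, and `count_words` is a hypothetical helper written only for this example.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Count occurrences of each word with a single lookup per word.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    for &word in words {
        match counts.entry(word) {
            // Key already present: bump the stored value in place.
            Entry::Occupied(mut entry) => {
                *entry.get_mut() += 1;
            }
            // Key absent: the entry inserts its own key with the value.
            Entry::Vacant(entry) => {
                entry.insert(1);
            }
        }
    }
    counts
}
```

The initial value is only constructed in the vacant arm, so no closure is needed to defer it.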

- If you look at the actual patch that does this, you'll see that Entry itself is exceptional
+ If you look at the actual patch that does this, you'll see that Entry itself is exceptionally
simple to implement. Most of the logic is trivial. The biggest amount of work was just
capturing the search state correctly, and even that was mostly a cut-and-paste job.

- With Views, we also open up the gate for... *adaptors*!
- You really want `insert_or_update`? We can provide that for you! Generically!
- However we believe such discussion is out-of-scope for this RFC. Adaptors can
- be tackled in a back-compat manner after this has landed, and we have a better sense
- of what we need or want.
+ With Entries, the gate is also opened for... *adaptors*!
+ Really want `insert_or_update` back? That can be written on top of this generically with ease.
+ However, such discussion is out-of-scope for this RFC. Adaptors can
+ be tackled in a back-compat manner after this has landed, and usage is observed. Also, this
+ proposal does not provide any generic trait for Entries, preferring concrete implementations for
+ the time-being.
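
As a rough illustration of how such a helper might be layered on an entry API, here is a sketch written against the standard library's present-day `HashMap`. The function name and signature are hypothetical and not part of this proposal, and the standard library's `insert` plays the role of this draft's `set`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical adaptor: insert `default` if `key` is absent, otherwise
/// apply `update` to the existing value. The key is searched for once.
fn insert_or_update_with<K, V, F>(map: &mut HashMap<K, V>, key: K, default: V, update: F)
where
    K: Eq + Hash,
    F: FnOnce(&mut V),
{
    match map.entry(key) {
        Entry::Occupied(mut entry) => update(entry.get_mut()),
        Entry::Vacant(entry) => {
            entry.insert(default);
        }
    }
}
```

Unlike the original internal mutation methods, this sketch does not return a mutable reference to the stored value.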
# Drawbacks
@@ -180,21 +189,17 @@ However, preventing invalidation would be more expensive, and it's not clear tha
cursor semantics would make sense on e.g. a HashMap, as you can't insert *any* key
in *any* location.

- # Unresolved questions
-
- One thing omitted from the design was a "take" method on the Entry. The reason for this
- is primarily that this doesn't seem to be a thing people are interested in having for
- internal manipulation. However, it also just would have meant more complexity, especially
- if it *didn't* consume the View. Do we want this functionality?
+ * This RFC originally [proposed a design without enums that was substantially more complex]
+ (https://github.com/Gankro/rust/commit/6d6804a6d16b13d07934f0a217a3562384e55612).
+ However it had some interesting ideas about Key manipulation, so we mention it here for
+ historical purposes.

+ # Unresolved questions
The internal mutation methods cannot actually be implemented in terms of the View, because
they return a mutable reference at the end, and there's no way to do that with the current
View design. However, it's not clear why this is done by them. We believe it's simply to
validate what the method *actually did*. If this is the case, then Views make this functionality
- obsolete. However, if this is *still* desirable, we could tweak `set` to do this as well.
- Do we want this functionality?
-
- Do we want to introduce a proper standard trait, or keep it all concrete and ad-hoc for a while
- to figure out what does and doesn't work?
+ obsolete. However, if this is *still* desirable, `set` could be tweaked to do this as well.
+ However for some structures it may incur additional cost. Is this desirable functionality?

Naming bikesheds!