Commit 1947104

A new stack-based vector
1 parent 1aa4590 commit 1947104

text/2978-stack_based_vec.md

Lines changed: 281 additions & 0 deletions
@@ -0,0 +1,281 @@
- Feature Name: `stack_based_vec`
- Start Date: 2020-09-27
- RFC PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This RFC, which depends on and takes advantage of the upcoming stabilization of constant generics (`min_const_generics`), proposes a new "growable" vector named `ArrayVec` that manages stack memory and can be seen as an alternative to `alloc::vec::Vec<T>`, the built-in structure that handles heap-allocated memory.

# Motivation
[motivation]: #motivation

`core::collections::ArrayVec<T, N>` has several use cases and is important enough to warrant inclusion in the standard library.

### Unification

Many different crates try to do roughly the same thing; a centralized implementation would end the current fragmentation.

### Optimization

Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that would otherwise have to call an allocator. Some resource-constrained embedded devices can also benefit from it.

### Building block

Just like `Vec`, `ArrayVec` is a primitive vector that higher-level structures can use as a building block, for example a stack-based matrix or binary heap.

### Useful in the real world

`arrayvec` is one of the most downloaded crates on `crates.io` and is used by thousands of projects, including rustc itself.


# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

`ArrayVec` is a container that encapsulates fixed-size buffers.

```rust
let mut v: ArrayVec<i32, 4> = ArrayVec::new();
let _ = v.push(1);
let _ = v.push(2);

assert_eq!(v.len(), 2);
assert_eq!(v[0], 1);

assert_eq!(v.pop(), Some(2));
assert_eq!(v.len(), 1);

v[0] = 7;
assert_eq!(v[0], 7);

v.extend([1, 2, 3].iter().copied());

for element in &v {
    println!("{}", element);
}
assert_eq!(v, [7, 1, 2, 3]);
```

Instead of relying on a heap allocator, stack memory is claimed and released on demand in last-in-first-out (LIFO) order, following the call flow of the program. `ArrayVec` takes advantage of this predictable behavior to reserve an exact number of uninitialized bytes up front; these bytes form a buffer into which elements can be added dynamically.

```rust
// `array_vec` can store up to 64 elements
let mut array_vec: ArrayVec<i32, 64> = ArrayVec::new();
```

Of course, fixed buffers are inflexible: unlike `Vec`, the underlying capacity cannot grow at run-time, so there will never be more than 64 elements in the example above.

```rust
// This vector has a capacity of 0, so it cannot store anything at all
let mut array_vec: ArrayVec<i32, 0> = ArrayVec::new();
let push_result = array_vec.push(1);
// Oops... The push operation wasn't successful
assert!(push_result.is_err());
```

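The failed `push` above hands the rejected element back to the caller instead of panicking. The following is a minimal sketch of how such a buffer could behave; it is a simplified stand-in written for illustration, not the implementation proposed by this RFC. It assumes a `[MaybeUninit<T>; N]` layout (the reference-level struct uses `MaybeUninit<[T; N]>`) and covers only `new`, `len`, `push`, and `pop`.

```rust
use core::mem::MaybeUninit;

/// Hypothetical, simplified stand-in for the proposed `ArrayVec`.
pub struct ArrayVec<T, const N: usize> {
    // Slots at indices `0..len` are initialized; the rest are not.
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> ArrayVec<T, N> {
    pub fn new() -> Self {
        Self {
            // Every slot starts out uninitialized; no element is constructed.
            data: core::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Appends `element`, or gives it back if the buffer is full.
    pub fn push(&mut self, element: T) -> Result<(), T> {
        if self.len == N {
            return Err(element);
        }
        self.data[self.len] = MaybeUninit::new(element);
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the last element, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: the slot at the old `len - 1` was initialized by `push`,
        // and `len` was decremented first, so it is never read again.
        Some(unsafe { self.data[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for ArrayVec<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.data[..self.len] {
            // SAFETY: the first `len` slots are initialized.
            unsafe { slot.assume_init_drop() };
        }
    }
}
```

Returning `Err(element)` lets the caller decide what to do with a value that did not fit, for example by falling back to a heap-allocated `Vec`.
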
A good question is: Should I use `core::collections::ArrayVec<T, N>` or `alloc::vec::Vec<T>`? Well, `Vec` is already good enough for most situations, while stack allocation usually shines for small sizes.

* Do you have a known upper bound?

* How much memory are you going to allocate for your program? The default values of `RUST_MIN_STACK` or `ulimit -s` might not be enough.

* Are you using nested `Vec`s? `Vec<ArrayVec<T, N>>` might be better than `Vec<Vec<T>>`.

Each use case is different and should be weighed individually. When in doubt, stick with `Vec`.

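As a rough aid for the stack-size question in the list above, the footprint of an inline buffer can be computed with `size_of`, and a thread can be given a larger stack through `std::thread::Builder::stack_size`. This is only a sketch: a plain array stands in for `ArrayVec`, and the helper names `inline_buffer_bytes`, `run_with_stack`, and `sum_large_buffer` are hypothetical.

```rust
use std::mem::size_of;
use std::thread;

/// Bytes occupied by an inline buffer of `N` elements of type `T`.
pub fn inline_buffer_bytes<T, const N: usize>() -> usize {
    size_of::<[T; N]>()
}

/// Runs `work` on a new thread whose stack is `stack_bytes` large.
pub fn run_with_stack<R: Send + 'static>(
    stack_bytes: usize,
    work: impl FnOnce() -> R + Send + 'static,
) -> R {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(work)
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

/// Sums a 1 MiB stack buffer on a thread with a 4 MiB stack.
pub fn sum_large_buffer() -> u64 {
    // 256 Ki elements of `u32` occupy exactly 1 MiB.
    const N: usize = 256 * 1024;
    run_with_stack(4 * 1024 * 1024, || {
        let buffer = [1u32; N];
        buffer.iter().map(|&x| u64::from(x)).sum::<u64>()
    })
}
```

Comparing `inline_buffer_bytes` against the available stack before choosing `N` is one way to decide whether a fixed buffer is appropriate or whether `Vec` is the safer choice.
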
For a more technical overview, take a look at the following operations:

```rust
// `array_vec` reserves 2048 bits (32 * 64) of memory up front, enough to store
// up to 64 `i32` values.
let mut array_vec: ArrayVec<i32, 64> = ArrayVec::new();

// Although reserved, there isn't anything explicitly stored yet
assert_eq!(array_vec.len(), 0);

// Initializes the first 32 bits with the value `1`, i.e. the bits
// 00000000 00000000 00000000 00000001
let _ = array_vec.push(1);

// Our vector memory is now split into a 32/2016 pair of initialized and
// uninitialized bits respectively
assert_eq!(array_vec.len(), 1);
```

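The bit counts quoted in the comments above can be checked with `size_of`. The sketch below assumes the buffer is laid out as `N` slots of `MaybeUninit<T>`, which has the same size as `T`; the helper names `buffer_bits` and `split_bits` are hypothetical.

```rust
use core::mem::{size_of, MaybeUninit};

/// Total size, in bits, of an inline buffer of `N` elements of type `T`.
pub const fn buffer_bits<T, const N: usize>() -> usize {
    size_of::<[MaybeUninit<T>; N]>() * 8
}

/// Returns `(initialized_bits, uninitialized_bits)` for a buffer that
/// currently holds `len` elements. Assumes `len <= N`.
pub const fn split_bits<T, const N: usize>(len: usize) -> (usize, usize) {
    let element_bits = size_of::<T>() * 8;
    let initialized = len * element_bits;
    (initialized, buffer_bits::<T, N>() - initialized)
}
```

For `i32` and `N = 64` this gives 2048 bits in total, and `(32, 2016)` after a single `push`.
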
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`ArrayVec` is a contiguous memory block in which elements can be collected and is therefore a collection by definition. Even though `core::collections` doesn't exist yet, it is the most natural module for it.

The API basically mimics most of the current `Vec` surface with some tweaks to manage capacity.

Notably, these tweaked methods are checked (out-of-bound inputs or invalid capacity) versions of some well-known functions like `push` that will return `Result` instead of panicking at run-time. Since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler is likely to remove most of the conditional bounds checks.

```rust
// Please bear in mind that these methods are simply suggestions. Discussions about the
// API should probably take place elsewhere.

pub struct ArrayVec<T, const N: usize> {
    data: MaybeUninit<[T; N]>,
    len: usize,
}

impl<T, const N: usize> ArrayVec<T, N> {
    // Constructors

    pub const fn from_array(array: [T; N]) -> Self;

    pub const fn from_array_and_len(array: [T; N], len: usize) -> Self;

    pub const fn new() -> Self;

    // Methods

    pub const fn as_mut_ptr(&mut self) -> *mut T;

    pub const fn as_mut_slice(&mut self) -> &mut [T];

    pub const fn as_ptr(&self) -> *const T;

    pub const fn as_slice(&self) -> &[T];

    pub const fn capacity(&self) -> usize;

    pub fn clear(&mut self);

    pub fn dedup(&mut self)
    where
        T: PartialEq;

    pub fn dedup_by<F>(&mut self, same_bucket: F)
    where
        F: FnMut(&mut T, &mut T) -> bool;

    pub fn dedup_by_key<F, K>(&mut self, mut key: F)
    where
        F: FnMut(&mut T) -> K,
        K: PartialEq<K>;

    pub fn drain<R>(&mut self, range: R) -> Option<Drain<'_, T, N>>
    where
        R: RangeBounds<usize>;

    pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]>
    where
        T: Clone;

    pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]>
    where
        T: Copy;

    pub fn insert(&mut self, idx: usize, element: T) -> Result<(), T>;

    pub const fn is_empty(&self) -> bool;

    pub const fn len(&self) -> usize;

    pub fn pop(&mut self) -> Option<T>;

    pub fn push(&mut self, element: T) -> Result<(), T>;

    pub fn remove(&mut self, idx: usize) -> Option<T>;

    pub fn retain<F>(&mut self, mut f: F)
    where
        F: FnMut(&mut T) -> bool;

    pub fn splice<I, R>(&mut self, range: R, replace_with: I) -> Option<Splice<'_, I::IntoIter, N>>
    where
        I: IntoIterator<Item = T>,
        R: RangeBounds<usize>;

    pub fn split_off(&mut self, at: usize) -> Option<Self>;

    pub fn swap_remove(&mut self, idx: usize) -> Option<T>;

    pub fn truncate(&mut self, len: usize);
}
```

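To make the checked style of the signatures above more concrete, here is a sketch of `insert` and `remove` that report failure through their return values instead of panicking. The type `CheckedVec` is hypothetical and is backed by `[Option<T>; N]` so that no `unsafe` is needed. The exact failure conditions (an out-of-bounds index or a full buffer returns the element in `Err`; an out-of-bounds `remove` returns `None`) are assumptions about the intended semantics, not something this RFC specifies.

```rust
/// Hypothetical fixed-capacity vector used only to illustrate checked methods.
pub struct CheckedVec<T, const N: usize> {
    // Slots at indices `0..len` are `Some`; the rest are `None`.
    slots: [Option<T>; N],
    len: usize,
}

impl<T, const N: usize> CheckedVec<T, N> {
    pub fn new() -> Self {
        Self { slots: core::array::from_fn(|_| None), len: 0 }
    }

    /// Inserts `element` at `idx`, shifting later elements to the right.
    /// Gives the element back if `idx > len` or the buffer is full.
    pub fn insert(&mut self, idx: usize, element: T) -> Result<(), T> {
        if idx > self.len || self.len == N {
            return Err(element);
        }
        self.slots[self.len] = Some(element);
        // Rotate the new element from the end of the live range into `idx`.
        self.slots[idx..=self.len].rotate_right(1);
        self.len += 1;
        Ok(())
    }

    /// Removes the element at `idx`, shifting later elements to the left.
    /// Returns `None` if `idx` is out of bounds.
    pub fn remove(&mut self, idx: usize) -> Option<T> {
        if idx >= self.len {
            return None;
        }
        let removed = self.slots[idx].take();
        // Move the emptied slot to the end of the live range.
        self.slots[idx..self.len].rotate_left(1);
        self.len -= 1;
        removed
    }

    /// Iterates over the stored elements in order.
    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        self.slots[..self.len].iter().flatten()
    }
}
```

Because both failure modes are encoded in the return type, callers are forced to consider the full-buffer case at every call site rather than discovering it through a panic.
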
Methods that are meaningless for a fixed capacity, unstable, or deprecated, such as `reserve` or `drain_filter`, weren't considered. A concrete implementation is available at https://github.com/c410-f3r/stack-based-vec.

# Drawbacks
[drawbacks]: #drawbacks

### Additional complexity

New and existing users are likely to find it difficult to differentiate the purpose of each vector type, especially those without a theoretical background in memory management.

### The current ecosystem is fine

`ArrayVec` might be overkill in certain situations. If someone wants to use stack memory in a specific application, then it is just a matter of grabbing the appropriate crate.

# Prior art
[prior-art]: #prior-art

These are the best-known structures:

* `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for implementations but lacks support for arbitrary sizes.
* `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes without a nightly compiler.
* `staticvec::StaticVec`: Uses unstable constant generics for arrays of arbitrary sizes.
* `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes but requires `T: Default` for safety reasons.

As shown, no implementation stands out from the others because all of them share roughly the same purpose and functionality. Notably, constant generics make it possible to create an efficient and unified approach for arbitrary array sizes.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

### Nomenclature

`ArrayVec` will conflict with `arrayvec::ArrayVec` and `tinyvec::ArrayVec`.

### Prelude

Should it be included in the prelude?

### Macros

```rust
// Instance with 1i32, 2i32 and 3i32
let _: ArrayVec<i32, 33> = array_vec![1, 2, 3];

// Instance with 1i32 and 1i32
let _: ArrayVec<i32, 64> = array_vec![1; 2];
```

# Future possibilities
[future-possibilities]: #future-possibilities

### Dynamic array

A hybrid approach between heap and stack memory could also be provided natively in the future.

```rust
pub struct DynVec<T, const N: usize> {
    // Hides internal implementation
    data: DynVecData<T, N>,
}

impl<T, const N: usize> DynVec<T, N> {
    // Much of the `Vec` API goes here
}

// This is just an example. `Vec<T>` could be a `Box` and the `enum` a `union`.
enum DynVecData<T, const N: usize> {
    Heap(Vec<T>),
    Inline(ArrayVec<T, N>),
}
```

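The hybrid idea can be sketched as follows: elements live inline until the fixed capacity is exceeded, at which point everything moves to the heap. This is only an illustration under simplifying assumptions. It collapses the wrapper struct and the data enum above into a single hypothetical enum, stores inline elements as `[Option<T>; N]` instead of an `ArrayVec`, and implements only `push`, `len`, `get`, and a hypothetical `is_inline` helper.

```rust
/// Hypothetical hybrid vector: inline storage first, heap storage on overflow.
pub enum DynVec<T, const N: usize> {
    Inline { slots: [Option<T>; N], len: usize },
    Heap(Vec<T>),
}

impl<T, const N: usize> DynVec<T, N> {
    pub fn new() -> Self {
        DynVec::Inline { slots: core::array::from_fn(|_| None), len: 0 }
    }

    /// Appends `element`, spilling to the heap once the inline buffer is full.
    pub fn push(&mut self, element: T) {
        match self {
            DynVec::Heap(vec) => vec.push(element),
            DynVec::Inline { slots, len } if *len < N => {
                slots[*len] = Some(element);
                *len += 1;
            }
            DynVec::Inline { slots, .. } => {
                // Inline storage is full: move every element to a `Vec`.
                let mut vec: Vec<T> =
                    slots.iter_mut().filter_map(|slot| slot.take()).collect();
                vec.push(element);
                *self = DynVec::Heap(vec);
            }
        }
    }

    pub fn len(&self) -> usize {
        match self {
            DynVec::Inline { len, .. } => *len,
            DynVec::Heap(vec) => vec.len(),
        }
    }

    pub fn get(&self, idx: usize) -> Option<&T> {
        match self {
            DynVec::Inline { slots, len } if idx < *len => slots[idx].as_ref(),
            DynVec::Inline { .. } => None,
            DynVec::Heap(vec) => vec.get(idx),
        }
    }

    /// Reports whether the elements are still stored inline.
    pub fn is_inline(&self) -> bool {
        matches!(self, DynVec::Inline { .. })
    }
}
```

Unlike `ArrayVec::push`, this `push` cannot fail for lack of capacity, which is the trade-off the hybrid design makes in exchange for a possible heap allocation.
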
The above description is very similar to what `smallvec` already does.

### Generic collections and generic strings

Many structures that use `alloc::vec::Vec` as the underlying storage can also use stack or hybrid memory. For example, a hypothetical `GenericString<S>`, where `S` is the storage, could be specialized into:

```rust
type DynString<const N: usize> = GenericString<DynVec<u8, N>>;
type HeapString = GenericString<Vec<u8>>;
type StackString<const N: usize> = GenericString<ArrayVec<u8, N>>;
```
