Source Code Party - Container Performance
A lovely night.
Wanna dance the Waltz with this beauty?
🕺: 😍 yes, please
💃: Do you wanna dance with me?
🕺: 😍 of course
💃: Men are sometimes liars.
🕺: I am the only compatible partner for you.
💃: Humans can’t be trusted, only the source code won’t lie!
🤖: Yes? I’m always online for you, my dear.
Prelude to the Waltz: Why this?
std::vector | std::deque | std::list | |
---|---|---|---|
Indexed Access | Super Fast | Fast | Impossible |
Insert/remove front | Slow | Fast | Fast |
Insert/remove back | Super Fast | Very Fast | Fast |
Insert/remove elsewhere | Slow | Fast | Very Fast |
Memory | Low | High | High |
Splicing/Joining | Slow | Very Slow | Fast |
Stability * | Poor | Very Poor | Good |
from some page here
Should I look at the source code of Container?
3 Sources
Indexed Access
Vector: Super Fast
Line 3042:
template <class _Allocator>
typename vector<bool, _Allocator>::reference
vector<bool, _Allocator>::at(size_type __n)
{
if (__n >= size())
this->__throw_out_of_range();
return (*this)[__n];
}
As shown in the list, to have an indexed access to an element is just like to retrieve an item in an array.
Deque: Fast
template <class _Tp, class _Allocator>
inline
typename deque<_Tp, _Allocator>::reference
deque<_Tp, _Allocator>::at(size_type __i)
{
if (__i >= __base::size())
__base::__throw_out_of_range();
size_type __p = __base::__start_ + __i;
return *(*(__base::__map_.begin() + __p / __base::__block_size) + __p % __base::__block_size);
}
Seems that Deque is using a combination of small continuous blocks, so if we wanna get a value, we need to find the place of the small continuous block, then get the value. It looks a bit complicated than Vector, so it’s slower.
List: Impossible !
Ahaha 😀, look at the source code, no such function. lol.
Push Back <- adds an element to the end
Vector: Super Fast
Line 3060:
template <class _Allocator>
void
vector<bool, _Allocator>::push_back(const value_type& __x)
{
if (this->__size_ == this->capacity())
reserve(__recommend(this->__size_ + 1));
++this->__size_;
back() = __x;
}
I am curious what is so-called __recommend
, which is also mentioned in the cs106L lecture note, so I dig in.
__recommend
: if the new size is larger than the max size the system can give, throw error. If current the capacity (the size of the allocated memory) is larger than or equal to half of the max capacity, then return the max capacity; If not, return 2 x current capacity, or the new size, depends on which is larger.
template <class _Tp, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
typename vector<_Tp, _Allocator>::size_type
vector<_Tp, _Allocator>::__recommend(size_type __new_size) const
{
const size_type __ms = max_size();
if (__new_size > __ms)
this->__throw_length_error(); // boundary check
const size_type __cap = capacity();
if (__cap >= __ms / 2)
return __ms;
return _VSTD::max<size_type>(2*__cap, __new_size);
}
max_size: roughly understand this is to get the max size a vector can be allocated. Ignore the deeper details.
template <class _Tp, class _Allocator>
typename vector<_Tp, _Allocator>::size_type
vector<_Tp, _Allocator>::max_size() const _NOEXCEPT
{
return _VSTD::min<size_type>(__alloc_traits::max_size(this->__alloc()),
numeric_limits<difference_type>::max());
}
capacity: get the size of the allocated memory
size_type capacity() const _NOEXCEPT
{return static_cast<size_type>(__end_cap() - __begin_);}
end_cap: returns the pointer to the end of the allocated memory, different from the end pointer
pointer& __end_cap() _NOEXCEPT
{return __end_cap_.first();} // __compressed_pair<pointer, allocator_type> __end_cap_;
back(): access the last element, return a reference
_LIBCPP_INLINE_VISIBILITY reference back()
{return __make_ref(__size_ - 1);}
Now, I come back to this code: when the size is equal to the allocated memory, means it’s full, then the system reserves a larger recommended size to this vector, get the new size, and then assign the address of the new value to the last element’s reference(alias). No wonder it’s super fast.
template <class _Allocator>
void
vector<bool, _Allocator>::push_back(const value_type& __x)
{
if (this->__size_ == this->capacity())
reserve(__recommend(this->__size_ + 1));
++this->__size_;
back() = __x;
}
Deque: Very Fast
template <class _Tp, class _Allocator>
void
deque<_Tp, _Allocator>::push_back(const value_type& __v)
{
allocator_type& __a = __base::__alloc(); // allocate a new space
if (__back_spare() == 0)
__add_back_capacity(); // make sure got a back_spare for the new space
// __back_spare() >= 1
__alloc_traits::construct(__a, _VSTD::addressof(*__base::end()), __v); // educated guess, link the new allocated space to the last pointer of the base, put the new value v inside the allocated space
++__base::size(); // increase the base size
}
As seen from the code, seems no copying options, just allocated new space and add new value, so it’s very fast.
List: Fast
template <class _Tp, class _Alloc>
void
list<_Tp, _Alloc>::push_back(const value_type& __x)
{
__node_allocator& __na = base::__node_alloc();
__hold_pointer __hold = __allocate_node(__na);
__node_alloc_traits::construct(__na, _VSTD::addressof(__hold->__value_), __x);
__link_nodes_at_back(__hold.get()->__as_link(), __hold.get()->__as_link());
++base::__sz();
__hold.release();
}
Obviously 😅, it’s fast, but slower than deque and vector.
Insertion
There are many types of insertion:
iterator insert(const_iterator position, const value_type& x);
iterator insert(const_iterator position, value_type&& x);
iterator insert(const_iterator position, size_type n, const value_type& x);
template <class InputIterator>
iterator insert(const_iterator position, InputIterator first, InputIterator last);
iterator insert(const_iterator position, initializer_list<value_type> il);
for Insertion, just take a look at the simplest one: inserts value before pos (position).
Vector
front: slow; elsewhere: slow; end: super fast; memory: low
Pre
To read the Insert function of Vector, needs to understand the following concepts first:
template< class BidirIt1, class BidirIt2 >
BidirIt2 copy_backward( BidirIt1 first, BidirIt1 last, BidirIt2 d_last );
- first, last: the range of the elements to copy from
- d_last: the end of the destination range
-
A const_iterator is an iterator that points to const value (like a const T* pointer), use
*
to visit the element that the iterator points to. -
__const_iterator_cast
iterator __const_iterator_cast(const_iterator __p) _NOEXCEPT
{return begin() + (__p - cbegin());} //starting point + diff
cbegin():
const_iterator cbegin() const _NOEXCEPT
{return begin();}
template< class InputIt, class OutputIt >
OutputIt copy( InputIt first, InputIt last, OutputIt d_first );
Copies the elements in the range, defined by [first, last), to another range beginning at d_first.
Code
Then we can look at the code of vector
:
template <class _Allocator>
typename vector<bool, _Allocator>::iterator
vector<bool, _Allocator>::insert(const_iterator __position, const value_type& __x)
{
iterator __r;
if (size() < capacity()) // if allocated memory is enough
{
const_iterator __old_end = end();
++__size_; // increase the size
_VSTD::copy_backward(__position, __old_end, end()); // copy the second-half of the vector in a backward manner starting from the end of current size, and the first half remains the same
__r = __const_iterator_cast(__position); // get the iterator version of the const iterator
}
else
{
vector __v(__alloc()); // allocate some new spaces for a temporary vector
__v.reserve(__recommend(__size_ + 1)); // reserve recommended new space for it
__v.__size_ = __size_ + 1; // calculate the size
__r = _VSTD::copy(cbegin(), __position, __v.begin()); // copy the first part of the original vector(until pos) from the original location to the starting point of the temp vector's space
_VSTD::copy_backward(__position, cend(), __v.end()); // copy the second part backwards from the end of the temp space
swap(__v); // swap the original v's content with the temp __v's content
}
*__r = __x; // Make the iterator's pointer pointing to the new value's address
return __r; // return the iterator of the inserted value
}
So I guess, inserting in front/elsewhere is slow because _VSTD::copy
is slow!
Based on cpp, the behaviour of copy is:
template<class InputIterator, class OutputIterator>
OutputIterator copy (InputIterator first, InputIterator last, OutputIterator result)
{
while (first!=last) {
*result = *first;
++result;
++first;
}
return result;
}
The time complexity seems to be O(n). But if the element to add is the last element, we don’t need to copy the whole array, since the copy_backward
is used, so just appending the new element to the end, so it’s very fast.
Deque
front: slow; elsewhere: fast; end: very fast; memory: high
Doc iterator insert( iterator pos, const T& value );
I feel sleepy, this part is the DIY part for you =) To found that front
is slow, and elsewhere
is fast!
template <class _Tp, class _Allocator>
typename deque<_Tp, _Allocator>::iterator
deque<_Tp, _Allocator>::insert(const_iterator __p, value_type&& __v)
{
size_type __pos = __p - __base::begin();
size_type __to_end = __base::size() - __pos;
allocator_type& __a = __base::__alloc();
if (__pos < __to_end)
{ // insert by shifting things backward
if (__front_spare() == 0)
__add_front_capacity();
// __front_spare() >= 1
if (__pos == 0)
{
__alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), _VSTD::move(__v));
--__base::__start_;
++__base::size();
}
else
{
iterator __b = __base::begin();
iterator __bm1 = _VSTD::prev(__b);
__alloc_traits::construct(__a, _VSTD::addressof(*__bm1), _VSTD::move(*__b));
--__base::__start_;
++__base::size();
if (__pos > 1)
__b = _VSTD::move(_VSTD::next(__b), __b + __pos, __b);
*__b = _VSTD::move(__v);
}
}
else
{ // insert by shifting things forward
if (__back_spare() == 0)
__add_back_capacity();
// __back_capacity >= 1
size_type __de = __base::size() - __pos;
if (__de == 0)
{
__alloc_traits::construct(__a, _VSTD::addressof(*__base::end()), _VSTD::move(__v));
++__base::size();
}
else
{
iterator __e = __base::end();
iterator __em1 = _VSTD::prev(__e);
__alloc_traits::construct(__a, _VSTD::addressof(*__e), _VSTD::move(*__em1));
++__base::size();
if (__de > 1)
__e = _VSTD::move_backward(__e - __de, __em1, __e);
*--__e = _VSTD::move(__v);
}
}
return __base::begin() + __pos;
}
List
front: fast; elsewhere: very fast; end: fast; memory: high
Doc iterator insert( iterator pos, const T& value );
template <class _Tp, class _Alloc>
typename list<_Tp, _Alloc>::iterator
list<_Tp, _Alloc>::insert(const_iterator __p, const value_type& __x)
{
#if _LIBCPP_DEBUG_LEVEL == 2
_LIBCPP_ASSERT(__get_const_db()->__find_c_from_i(&__p) == this,
"list::insert(iterator, x) called with an iterator not"
" referring to this list");
#endif
__node_allocator& __na = base::__node_alloc();
__hold_pointer __hold = __allocate_node(__na);
__node_alloc_traits::construct(__na, _VSTD::addressof(__hold->__value_), __x);
__link_nodes(__p.__ptr_, __hold->__as_link(), __hold->__as_link());
++base::__sz();
#if _LIBCPP_DEBUG_LEVEL == 2
return iterator(__hold.release()->__as_link(), this);
#else
return iterator(__hold.release()->__as_link());
#endif
}
Basically, constructing a new node and put in, so it’s very fast compared with others!
Source Code Party is over…
💃: Good night~