πSTL in CPP ( Part I)
STANDARD TEMPLATE LIBRARY (INTRO, VECTOR, LIST) - 2023
Standard Template Library (STL)
The Standard Template Library (STL) is the heart of the C++ standard library. There is no official definition of the STL, however, generally accepted definition may be this: The STL is the parts of C++ Standard Library what work with iterators.
So, it includes the containers, part of the iostream libraries, function objects, and algorithms.
The STL is an example of generic programming. While the object-oriented programming concentrates on the data aspect of programming, generic programming concentrates on algorithms.
A goal of generic programming is to write code that is independent of data types. Templates are the C++ tools for creating generic program. Templates let us define a function or class in terms of a generic type. The STL goes further by providing a generic representation of algorithms.
STL Component
STL has three key components - containers (popular templatized data structures), iterators and algorithms.
Containers Containers manage collection of element. To meet different needs, the STL provides different kinds of containers.
Sequential containers These are ordered collections in which every element has a certain position. This position depends on the time and place of the insertion, but it is independent of the value of the element. These include vector, deque, and list.
Associative containers These are sorted collections in which the actual position of an element depends on its value due to a sorting criterion. These include set, multiset, map, and multimap.
Iterators Iterators are objects that can navigate over elements. An iterator represents a certain position in a container. Iterators are divided into five groups, based on the operations they support.
Input iterators These are read-only iterators where each iterated location may be read only once. The most common manifestation of input iterator is istream_iterators.
Output iterators These are write-only iterators where each iterated location may be read only once. The most common manifestation of output iterator is ostream_iterator.
Forward iterators These have the capabilities of both input and output iterators, but they can read or write a single location repeatedly. They don't support operator--, so they can move only forward.
Bidirectional iterators These are just like forward iterators, except they can go backward as well as forward. The standard associative containers all offer bidirectional iterators.
Random access iterators These do everything bidirectional iterators do, but they also offer iterator arithmetic, i.e., the ability to jump forward or backward in a single step. vector, string, and deque each provide random access iterators.
Algorithms The STL provides several standard algorithms for the processing of elements of collections. They can search, sort, modify, or simply use the element for different purpose. Algorithms use iterators. So, an algorithm has to be written only once to work with arbitrary containers because the iterator interface for iterators is common for all container types.
Vectors
A vector manages its elements in a dynamic array. It enables random access, which means we can access each element directly with index.
Appending and removing elements at the end of the array is very fast. But inserting an element in the middle or at the beginning of the array takes time because all the following elements should be moved to make room for it while maintaining the order.
Deques
Double-ended-queue (deque) is a dynamic array that is implemented so that it can grow in both directions. So, inserting element at the end and at the beginning is fast. Inserting and elements in the middle, however, takes time because element must be moved.
Lists
A list is implemented as a doubly linked list of element. In other words, each element in a list has its own segment of memory and refers to its predecessor and its successor. Lists do not provide random access. General access to an arbitrary element takes linear time and this is a lot worse than vectors and deques.
The advantage of a list is that the insertion or removal of an element is fast at any position. Only the links must be changed. This implies that moving an element in the middle of a list is very fast compared with moving an element in a vector or a deque.
Vector
Initializing Vectors
Using the member function
push_back()
pop_back()
size()
clear()
empty()
Output is:
The push_back()
adds a new element to the end of a vector. As a result, Scientist[0] is now equal to "James Maxwell", and Scientist[3] is now equal to "Louis Pasteur".
The pop_back()
removes the last element of a vector and reduces the vector size by one. In the example, the size of "Scientist" is reduced from 4 to 3.
The clear()
member function removes all of the elements of a vector and sets its size to 0.
The empty()
returns "true" if the vector is empty, otherwise, it returns "false."
Iterators
We can declare iterator as following: container type::iterator new_iterator
Iterators are values that identify an element in a container. We can access/change the value of the element using iterator.
We can also declare another iterator:
This constant iterator is just like a regular iterator except that we can think of a constant iterator as providing read-only access. The iterator itself, however, can change. In other words, we can move iter all around the vector. But we can't change the value of any of the elements through iter.
begin() and end()
In the above example, we assign the return value of Scientist.begin() to iter. The begin() returns an iterator that refers to a container's first element. So, the statement assigns an iterator that refers to the first element of Scientist to iter. Then, while looping through the elements, we test the return value of Scientist.end() against iter to make sure the two are equal.
The end() returns an iterator one past the last element in a container. This means the loop will continue until iter has looped through all of the elements in Scientist. In the action statement in the loop, ++iter, increments iter, which traverse it to the next element in the vector. Depending upon the iterator, we can perform other mathematical operations on iterators to traverse them around a container.
In the body of the loop, we send *iter to cout. By placing the dereference operator(*) in front of iter, we display the value of the element to which the iterator refers.
insert() and erase()
We can insert an item, for example, an the beginning:
not or we can remove an item from the list, an element not at the end but from the middle:
Vector - Performance
Vectors grow dynamically, and every vector has a specific size. When we add a new element to a vector, the computer reallocates memory and may even copy all of the vector elements into this new memory, and this can cause a performance hit.
capacity()
The capacity() returns the capacity of a vector (the number of elements that a vector can hold before a program must allocate more memory for it). So, a vector's capacity is not the same thing as its size which is the number of elements a vector currently holds. In short, capacity() is the size of the container and the size() is the currently filled level. The capacity() is always equal to or larger than the size. The difference between them is the number of elements that we can add to the vector before the array under the hood needs to be reallocated.
reserve()
Before we look into the reserve() we need to know what's happening whenever a vector needs more space. It's doing similar to realloc operation. New memory allocation, copy from the old to the new, destruct old objects, deallocate old memory, invalidation of iterators. It's expensive!
The reserve() increases the capacity of a vector to the number supplied as an argument. The reserve() gives us control over when a reallocation of additional memory occurs.
By using reserve() to keep a vector's capacity large enough for our purposes, we can delay memory reallocation.
According to Scott Meyers, the following code requires 2-18 reallocations:
So, he suggested we should use resever() to reduce the costs:
insert() and erase()
Adding or removing an element from the end of a vector using push_back() or pop_back() is extremely efficient. Adding or removing an element at any other element in a vector, however, can require more work because we may have to move multiple elements to accommodate the insertion or deletion. With large vectors this can cause a performance hit.
Vector - Matrix
Vector - Matrix Initialization
Here is another example of vector of vector, 3x2, matrix initialization:
Output is:
Actually, similar initialization is used for a C++ Starter Package of Google AI Ants 2011:
Matrix
We can make a matrix with vector:
Output from the run is:
Matrix multiplication
Here is an example of multi-dimensional vectors used to calculate matrix multiplication:
My vector::reserve(), capacity(), resize(), and push_back()
The vector class in this section is not the stl's vector, but implementation wise, it should be similar, and we get some insight how it works.
When we change sizes, the most basic operation is vector::reserve(). This operation adds space for new elements, and its code looks like this;
We are not initializing the elements of the reserved space, and push_back() will do it. Here, we are just reserving space. So, we can get the information about the amount of available space by using:
If v is a vector, then the space for our new vector element when we use with out push_back() will be:
That's the currently available size for new elements without reallocation.
If we utilize the reserve(), our resize() fairly straightforward: make the vector have the requested size of elements and initialize each new element with 0.0.
So, the push_back() looks like this:
vector::operator=()
The operator=() implemented above may be the simplest:
Allocate memory for a copy.
Copy the elements/
Free the old memory allocation.
Update sz, elem, space.
Queue with Vector
The code below implements queue using vector. It's a contrived one. The vector contains objects which have only one integer member.
Vector vs List
vector
Contiguous memory.
Pre-allocates space for future elements, so extra space may be required.
Unlike a list where additional space for a pointer is needed, each element only requires the space for the element type itself.
Can re-allocate memory for the entire vector at any time that we add an element.
Insertions at the end are constant, but insertions elsewhere are a costly O(n).
Erasing an element at the end of the vector is constant time, but for the other locations it's O(n).
We can randomly access its elements.
Iterators, pointers, and references are invalidated if we add or remove elements to or from the vector.
So, after the *it == 5, this may crash.
We can easily get at the underlying array if we need an array of the elements:
vector is very similar to the array, so the following line is true:
list
Non-contiguous memory.
No pre-allocated memory. The memory overhead for the list itself is constant.
Each element requires extra space for the node which holds the element, including pointers to the next and previous elements in the list.
Never has to re-allocate memory for the whole list just because we add an element.
Insertions and erasures are cheap no matter where in the list they occur.
It's cheap to combine lists with splicing.
We cannot randomly access elements, so getting at a particular element in the list can be expensive.
Iterators remain valid even when we add or remove elements from the list.
If we need an array of the elements, we'll have to create a new one and add them all to it, since there is no underlying array.
Vectors with Iterator and Algorithms(STL)
The following code examples shows how the vector works with iterator and algorithms such as find() and for_each(). It also demonstrates the characteristics of string class as part of STL.
Coding itself is almost self explanatory:
The output is:
We could have done more to the output such as put spaces between appropriate words and convert characters to lower/upper case. Yet, it demonstrates enough use of several features that STL provides.
In the following Chapters, we'll look at Associative Containers, iterators, and algorithms in detail.
List
Erasing elements in a loop
It's a little bit tricky to erase elements from the list within a loop. Here is a sample:
The iterator returned from the line:
should points to the next element, Visual C++ seems it an additional step to increment the iterator:
But gcc can do the job without the additional line.
Types defined for STL containers
All STL containers define the types. In this list, we use X to represent a container type, such as vector<double>, and T for the type stored in the container, such as double.
X::value_type - T, the element type. T is the type we get when the iterator is dereferenced. For sets and multisets, it is constant. For maps and multimaps, it is pair<const key-type, mapped-type>.
X::reference - T&.
X::const_reference - const T&.
X::iterator - iterator type pointing to T, behaves like T*.
X::const_iterator - iterator type pointing to const T, behaves like const T*.
X::difference_type - signed integral type used to represent the distance from one iterator to another.
X::size_type - unsigned integral type size_type can represent size of data objects, such as number of elements, and subscripts.
Last updated