2025-12-13 10:14

Tags: [[programming]], [[study]]

Understanding C++ Sequence Containers Through Interview Lenses

C++ sequence containers often show up in interviews, not only as API trivia but as a test of how well you understand memory layout, complexity, and practical trade‑offs. This post summarizes key concepts about std::array, std::vector, std::deque, std::list, and some special cases like std::vector<bool> from an “interview answer” perspective.


std::array vs std::vector: Fixed vs Dynamic, Both Contiguous

Both std::array<T, N> and std::vector<T> provide contiguous storage and O(1) random access, but they differ in lifetime and size flexibility:

  • std::array<T, N>

    • Fixed size known at compile time.
    • No dynamic growth: no push_back, no reserve.
    • Very low overhead and predictable layout: often used in performance‑critical or embedded code.
    • Complexity for element access is O(1); there is no complexity notion for resizing because you cannot resize.
  • std::vector<T>

    • Dynamic size at runtime.
    • Contiguous memory, O(1) index access.
    • Supports push_back, insert, erase, reserve, resize.
    • push_back is amortized O(1): most insertions are cheap, but occasionally a reallocation and full copy/move of existing elements happens (O(n)) when capacity is exceeded.

Interview takeaway:

  • When size is fixed and known at compile time → mention std::array.
  • When size grows/shrinks dynamically with good cache locality and random access → default to std::vector.

How std::vector Grows: The Apartment Moving Analogy

A useful way to explain std::vector’s internal growth is the “apartment moving” analogy:

  • Imagine you rent an apartment (a contiguous block of memory) with a few extra rooms (capacity > size).
  • When you add people (push_back) and still have free rooms, they just move into an empty room → O(1).
  • When all rooms are full and someone new arrives, you rent a bigger apartment (often about twice as many rooms), move everyone over (copy/move all elements), then give up the old one → that one insertion costs O(n).
  • Over many insertions, the total cost of these occasional “moves” spreads out, giving an amortized O(1) cost per push_back.

This explains:

  • Why push_back is usually fast but can occasionally be expensive.
  • Why reserve(n) can significantly improve performance when you know (even roughly) how many elements you will insert.

reserve vs resize: Capacity vs Size

A frequent source of confusion is the difference between reserve and resize on std::vector:

  • reserve(n)

    • Increases capacity to at least n.
    • Does not change size.
    • No elements are constructed or destroyed; it only allocates storage.
    • Useful for avoiding repeated reallocations when you know how many elements you will push back.
  • resize(n)

    • Changes the size to n.
    • If n is larger than current size:
      • New elements are default‑constructed or value‑initialized.
    • If n is smaller:
      • Extra elements are destroyed.
    • May allocate more capacity if needed.

Quick mental model:


std::vector<int> v;
v.reserve(100); // size = 0, capacity >= 100
v.resize(100);  // size = 100, capacity >= 100, 100 ints constructed`

Deque vs List vs Vector: Memory Layout and Indexing

A key insight you solidified: deque and list are both non‑contiguous, but only deque supports O(1) indexing.

std::deque

  • Internally implemented as multiple fixed‑size blocks (chunks) plus an array of pointers to these blocks.

  • For index i, the implementation computes:

    • which block contains it, and
    • the offset within that block.
  • That extra indirection lets deque provide O(1) operator[] and random access iterators, even though the entire structure is not a single contiguous array.

  • Strength: efficient push/pop at both ends, acceptable random access, but slightly weaker cache locality than vector.

std::list

  • Doubly linked list: each node holds one element plus pointers to prev and next.
  • No concept of index: there is no operator[].
  • To get to the k‑th element, you must iterate k steps → O(k).
  • Provides O(1) insert/erase given an iterator to the position, but poor cache locality and higher per‑element overhead.

Interview answer pattern:

  • deque is non‑contiguous but still supports O(1) random access via an internal block map. list is a linked list and does not provide indexing at all; accessing the k‑th element is O(k) by traversal.”

Complexity Cheat Sheet (Core Cases)

You already refined these answers; here they are in clean form:

  • std::vector

    • Index access: O(1)
    • push_back: amortized O(1), worst O(n) with reallocation
    • Insert/erase in the middle: O(n)
  • std::array

    • Index access: O(1)
    • No dynamic insert/erase; size fixed.
  • std::list

    • Insert/erase at known position (iterator): O(1)
    • Access k‑th element: O(k)
  • std::deque

    • Index access: O(1)
    • Push/pop front/back: amortized O(1)

This vocabulary is exactly what interviewers look for: O(1), O(n), amortized O(1), and a brief justification.


How std::vector<bool> Breaks the Usual Rules

std::vector<bool> is notoriously considered a “bad design” because it tries to compress booleans into bits, which breaks the usual reference and pointer semantics.

Why it cannot return bool&

Instead of storing one bool per byte (or more), vector<bool> packs multiple booleans into one word as bits:

  • There is no real “bool object” at a unique address per element.
  • Each element is just one bit inside a larger integer.
  • Therefore, operator[] cannot return bool&; there is no standalone bool to reference.

Instead, it returns a proxy object (e.g. vector<bool>::reference):

  • This proxy remembers which bit in which word corresponds to the element.
  • It overloads conversion to bool to read the bit, and assignment to update the bit.

That proxy type:

  • Does not behave exactly like bool&.
  • You cannot take a bool* from it (&v[0] is not a bool*).
  • Can break generic code that assumes references and pointers to elements behave like normal T& and T*.

Why iterators and references are “non‑standard‑feeling”

For typical vector<T>:

  • T& ref = v[0]; works.
  • T* ptr = &v[0]; works.
  • Iterators behave almost like raw pointers.

For vector<bool>:

  • bool& ref = v[0]; does not compile (there is no bool&).
  • bool* ptr = &v[0]; also does not work.
  • Some template code that expects a true reference or pointer to bool simply fails.

Practical advice and interview line:

vector<bool> is a specialized bit‑packed container that doesn’t provide real bool& references. It returns a proxy type instead, which can break generic code and pointer/reference assumptions. That’s why many guidelines recommend avoiding vector<bool> and using alternatives.”


Alternatives to std::vector<bool>: When to Use What

You now have a clear decision tree for vector<bool> replacements:

1. std::bitset<N> – Compile‑time fixed size

Use when:

  • The number of bits is known at compile time.
  • You want very compact storage and rich bitwise operations.

Example:


#include <bitset>
#include <cstddef>

std::bitset<128> flags;
flags.set(3);       // set bit 3
flags.reset(3);     // clear bit 3
bool b = flags.test(3);

Pros:

  • Very memory efficient.
  • Rich operations (&, |, ^, shifts).

Cons:

  • Size is a template parameter; cannot change at runtime.

2. std::deque<bool> – Runtime size, true bools

Use when:

  • Size changes at runtime.
  • You want real bool references and pointers (generic code compatibility).
  • You don’t strictly need contiguous memory.

Example:


#include <deque>

std::deque<bool> flags;
flags.push_back(true);
flags.push_back(false);

bool& r = flags[0];  // real bool&
r = false; 

bool* p = &flags[1]; // real bool* 
*p = true;

Pros:

  • No weird proxy type: bool& and bool* behave normally.
  • Good for generic code that expects “normal” container semantics.

Cons:

  • Not contiguous; slightly less cache‑friendly than vector<bool> or a custom packed structure.

3. std::vector<unsigned char> – Custom bit‑packing, contiguous

Use when:

  • You need contiguous memory (e.g. for IO, SIMD, or tight cache behavior).
  • You’re willing to manage bit packing yourself.
  • You want predictable semantics and full control.

Minimal bit‑vector wrapper:


#include <vector>
#include <cstddef>

class BitVector {
	std::vector<unsigned char> data;
	std::size_t nbits;
public:
	explicit BitVector(std::size_t n)
		: data((n + 7) / 8, 0), nbits(n) {}
	void set(std::size_t i) {
		data[i / 8] |=  (1u << (i % 8));
	}
    void reset(std::size_t i) {
	    data[i / 8] &= ~(1u << (i % 8));
    }
    bool test(std::size_t i) const {
	    return (data[i / 8] >> (i % 8)) & 1u;
    }
};

Pros:

  • Full control over layout and semantics.
  • Works like a normal vector for generic code (because element type is unsigned char).

Cons:

  • More implementation effort; manual bit operations.

push_back vs emplace_back: Construction Cost and Temporaries

Your refined understanding here is also a strong interview answer.

push_back

  • Semantic: “Take an existing object (or temporary), and copy/move it into the container”.
  • Two main use cases:

std::vector<MyType> v;

// 1) existing object MyType obj(1, 2);
v.push_back(obj);             // copies obj
v.push_back(std::move(obj));  // moves obj (if move ctor exists)

// 2) temporary object
v.push_back(MyType(1, 2));    // constructs a temporary, then moves/copies into v

Key point: push_back always assumes you already have a MyType object (even if it is a temporary) and then creates a copy/move of it inside the vector.

emplace_back

  • Semantic: “Construct the element in place inside the container using constructor arguments.”

std::vector<MyType> v;
v.emplace_back(1, 2); // constructs MyType(1, 2) directly in the vector's storage

Benefits:

  • Avoids creating a separate temporary MyType when possible.
  • Reduces copy/move overhead, especially for heavy or move‑only types.

Interview phrasing:

push_back inserts an already‑constructed object (or a temporary) by copying or moving it into the container. emplace_back forwards constructor arguments and constructs the object directly in place, which can avoid extra temporaries and be more efficient.”


Complexity and Cache: vector vs set/map

Another key insight you consolidated is about time complexity plus cache behavior:

  • Sorted std::vector + binary search

    • Build cost: sort once → O(n log n).
    • Lookups: O(log n) with excellent cache locality (contiguous memory).
    • Ideal when: lots of lookups, infrequent insertions.
  • std::set / std::map (balanced tree)

    • Each insertion, deletion, lookup: O(log n).
    • Nodes scattered in memory → poorer cache behavior.
    • Ideal when: many insertions/deletions interleaved with lookups, and you need order maintained automatically.

Strong interview soundbite:

“Both sorted vector (with binary search) and set offer O(log n) lookups, but vector often wins in practice because contiguous memory gives much better cache locality, as long as insertions are rare.”


Final Interview‑Ready Summary

If this blog post had to be condensed into a spoken answer:

  • Prefer std::vector as the default dynamic container: contiguous, cache‑friendly, amortized O(1) push_back.
  • Use std::array for fixed‑size collections with compile‑time length.
  • Use std::deque when you need frequent push/pop at both ends with O(1) access via operator[], even though it’s not fully contiguous.
  • Use std::list only when you truly need O(1) insert/erase given an iterator and don’t care about random access or cache locality.
  • Be able to explain reserve vs resize and amortized O(1) growth via the “reallocation and move” story.
  • Avoid std::vector<bool> in generic code; instead, pick bitset (fixed size), deque<bool> (runtime size, real bools), or a custom bit‑packed vector<unsigned char> depending on requirements.
  • Highlight cache locality and real‑world performance when comparing vector to tree‑based structures like set and map.

This combination of complexity, memory‑layout intuition, and practical trade‑off language is exactly what strong C++ interviews look for.