Master
Strings
A complete revision system covering everything from memory layout to advanced sliding window patterns. Built for interviews, designed for clarity.
Introduction to Strings
A string is a sequence of characters stored contiguously in memory. In C++, strings can be represented in two fundamentally different ways — the old C-style approach and the modern C++ std::string class.
Unlike arrays of integers, strings carry the implicit convention that they represent human-readable text. Every character maps to an ASCII value — a critical insight for hashing and frequency-count problems.
ASCII Insight: Lowercase letters 'a'–'z' map to ASCII values 97–122. The difference ch - 'a' gives you an index 0–25, which is the foundation of character frequency arrays.
Memory layout for "hello":
C-strings require a null terminator '\0' — its absence causes undefined behaviour in strlen/strcpy. std::string manages this internally and tracks size and capacity separately for amortised O(1) appends.
char str[] = "hello";
Array of chars, null-terminated. Manual memory management. Functions: strlen, strcpy, strcmp. Prone to buffer overflow bugs. Used in competitive programming for speed.
string str = "hello";
Dynamic, safe, rich API. Automatically handles memory. Supports +, ==, range-based for, iterators. The correct choice for interviews and real code.
// C-style string
char cstr[] = "hello"; // stored as h,e,l,l,o,\0
int len = strlen(cstr); // O(n) — scans until \0
// C++ std::string
string s = "hello";
string s2(5, 'a'); // "aaaaa" — fill constructor
string s3 = s.substr(1, 3); // "ell" — O(k) copy
// Traversal — multiple ways
for (char c : s) cout << c; // range-for
for (int i = 0; i < s.size(); i++) cout << s[i];
// Conversion
string ns = to_string(42); // int → string
int ni = stoi("42"); // string → int
Basic Operations
Traversal is O(n). Under the hood, s[i] is a direct memory access via pointer arithmetic — same as arrays. The range-for compiles to iterator-based traversal, equally fast.
string s = "hello";
// Method 1: index-based
for (int i = 0; i < (int)s.size(); i++) {
cout << s[i] << " ";
}
// Method 2: range-based (preferred)
for (char c : s) cout << c << " ";
// Method 3: iterator
for (auto it = s.begin(); it != s.end(); it++)
cout << *it;
Cast s.size() to int when comparing with signed integers. size() returns size_t (unsigned) — i < s.size() with a negative i causes wrap-around bugs.
cin >> s reads until whitespace. For multi-word input, use getline — but after reading an integer with cin, the newline stays in the buffer. Always cin.ignore() before getline.
string s;
cin >> s; // reads "hello" (stops at space)
string line;
getline(cin, line); // reads full line with spaces
// When mixing cin and getline:
int n; cin >> n;
cin.ignore(); // flush newline from buffer
getline(cin, line); // now safe to read line
s.size() and s.length() are identical for std::string — both return the stored length in O(1) from an internal field. strlen() for C-strings is O(n) — it scans until '\0'.
string s = "hello";
cout << s.size(); // 5 — O(1)
cout << s.length(); // 5 — O(1), identical
char cs[] = "hello";
cout << strlen(cs); // 5 — O(n)! scans each char
// Empty check
if (s.empty()) { ... } // prefer over s.size() == 0
The + operator creates a new string each time — O(n). Repeated concatenation in a loop becomes O(n²). Use += or append() for in-place modification — amortised O(1) per character.
string a = "hello", b = " world";
string c = a + b; // creates NEW string — O(n)
a += b; // in-place — amortised O(1)
a.append(b); // same as += for strings
a += '!'; // append single char
// ❌ BAD — O(n²) in a loop
string result = "";
for (char c : chars) result = result + c;
// ✅ GOOD — O(n)
string result = "";
for (char c : chars) result += c;
Comparison operators work lexicographically — character by character via ASCII values. == checks length first (short-circuit), then each char. All comparisons are O(n) in the worst case.
string a = "apple", b = "banana";
if (a == b) { ... } // false — lexicographic equality
if (a < b) { ... } // true — 'a' < 'b'
// compare() returns: <0, 0, or >0
int r = a.compare(b); // negative → a comes first
// Sorting strings uses lexicographic order by default
vector<string> v = {"cat", "apple", "bat"};
sort(v.begin(), v.end()); // apple, bat, cat
Important String Patterns
When to use: Any problem where you process characters from both ends simultaneously — palindrome check, reversing, comparing mirrored positions. The key insight: you meet in the middle after n/2 iterations.
🔁 Palindrome Check
bool isPalindrome(string& s) {
int l = 0, r = s.size() - 1;
while (l < r) {
if (s[l] != s[r])
return false;
l++; r--;
}
return true;
}
// "racecar" → true
// "hello" → false
🔄 Reverse a String
void reverseStr(string& s) {
int l = 0, r = s.size() - 1;
while (l < r) {
swap(s[l], s[r]);
l++; r--;
}
}
// Or simply: reverse(s.begin(), s.end());
// Both are O(n/2) = O(n)
Core Insight: Since there are only 26 lowercase letters (or 128 ASCII chars), a fixed-size array acts as a perfect hash map. freq[c - 'a'] maps 'a'→0, 'b'→1, … 'z'→25. Building this array is O(n), and querying any char is O(1).
📝 Character Count
string s = "hello";
int freq[26] = {0}; // all zeros
for (char c : s)
freq[c - 'a']++;
// freq['h'-'a'] = 1
// freq['l'-'a'] = 2
// freq['e'-'a'] = 1
for (int i = 0; i < 26; i++)
if (freq[i])
cout << (char)('a'+i)
<< ": " << freq[i] << "\n";
🔀 Anagram Detection
bool isAnagram(string& a, string& b) {
if (a.size() != b.size()) return false;
int freq[26] = {0};
for (int i=0; i<a.size(); i++) {
freq[a[i]-'a']++;
freq[b[i]-'a']--;
}
for (int x : freq)
if (x != 0) return false;
return true;
}
// "listen" & "silent" → true
Why it works: Instead of checking all O(n²) substrings, we maintain a window with two pointers. We expand right always, and shrink left only when a constraint is violated. Each element is added and removed at most once — giving O(n) total.
Problem: Longest Substring Without Repeating Characters
int lengthOfLongestSubstring(string s) {
unordered_map<char, int> lastSeen; // char → last index
int left = 0, maxLen = 0;
for (int right = 0; right < s.size(); right++) {
char c = s[right];
// If char seen & is inside our window
if (lastSeen.count(c) && lastSeen[c] >= left) {
left = lastSeen[c] + 1; // shrink window
}
lastSeen[c] = right; // update position
maxLen = max(maxLen, right - left + 1);
}
return maxLen;
}
// "abcabcbb" → 3 ("abc")
// "pwwkew" → 3 ("wke")
Problem: Minimum Window Substring (Conceptual)
Strategy: Build frequency maps for both strings. Expand right until window contains all required chars. Then shrink left while it still satisfies the condition. Track the minimum valid window size throughout. Time: O(n+m).
string minWindow(string s, string t) {
unordered_map<char,int> need, have;
for (char c : t) need[c]++;
int formed=0, required=need.size();
int l=0, minLen=INT_MAX, start=0;
for (int r=0; r<s.size(); r++) {
have[s[r]]++;
// Check if this char satisfies its need
if (need.count(s[r]) && have[s[r]]==need[s[r]])
formed++;
while (formed == required) { // valid window
if (r-l+1 < minLen) { minLen=r-l+1; start=l; }
have[s[l]]--;
if (need.count(s[l]) && have[s[l]]<need[s[l]])
formed--;
l++;
}
}
return minLen==INT_MAX ? "" : s.substr(start, minLen);
}
Important Algorithms
Intuition: Naïve pattern matching retries from scratch on mismatch — O(nm). KMP pre-processes the pattern to build a failure function (LPS array) that stores the longest proper prefix which is also a suffix. On mismatch, instead of restarting, we jump to the position the LPS array tells us — avoiding redundant comparisons.
// Step 1: Build LPS (Longest Prefix Suffix) array
vector<int> buildLPS(string& pat) {
int m = pat.size();
vector<int> lps(m, 0);
int len = 0, i = 1;
while (i < m) {
if (pat[i] == pat[len]) lps[i++] = ++len;
else if (len) len = lps[len-1]; // jump
else lps[i++] = 0;
}
return lps;
}
// Step 2: Pattern search using LPS
void KMPSearch(string& txt, string& pat) {
int n=txt.size(), m=pat.size();
auto lps = buildLPS(pat);
int i=0, j=0;
while (i < n) {
if (txt[i]==pat[j]) { i++; j++; }
if (j==m) {
cout << "Found at: " << i-j << "\n";
j = lps[j-1];
} else if (i<n && txt[i]!=pat[j]) {
if (j) j = lps[j-1];
else i++;
}
}
}
Intuition: Compute a hash for the pattern. Use a rolling hash to slide over the text — removing the leftmost character and adding the next one in O(1). Only compare characters when hashes match. Worst case O(nm) on hash collisions, but average O(n+m).
void rabinKarp(string& txt, string& pat) {
const int BASE=31, MOD=1e9+7;
int n=txt.size(), m=pat.size();
// Compute pattern hash & first window hash
long long patHash=0, winHash=0, power=1;
for (int i=0; i<m; i++) {
patHash = (patHash*BASE + pat[i]) % MOD;
winHash = (winHash*BASE + txt[i]) % MOD;
if (i) power = power*BASE % MOD;
}
for (int i=0; i<=n-m; i++) {
if (winHash==patHash && txt.substr(i,m)==pat)
cout << "Match at: " << i << "\n";
if (i < n-m) // rolling hash
winHash = (winHash - txt[i]*power%MOD + MOD)
*BASE % MOD + txt[i+m];
}
}
Intuition: Concatenate pattern + '$' + text. Build the Z-array in O(n). Any position where z[i] == len(pattern) is a match. The '$' separator prevents overlap between the pattern and text.
vector<int> buildZ(string& s) {
int n=s.size(); vector<int> z(n, 0);
z[0]=n;
int l=0, r=0;
for (int i=1; i<n; i++) {
if (i<r) z[i]=min(r-i, z[i-l]);
while (i+z[i]<n && s[z[i]]==s[i+z[i]]) z[i]++;
if (i+z[i]>r) { l=i; r=i+z[i]; }
}
return z;
}
void zSearch(string& txt, string& pat) {
string concat = pat + "$" + txt;
auto z = buildZ(concat);
for (int i=pat.size()+1; i<concat.size(); i++)
if (z[i]==(int)pat.size())
cout << "Found at: " << i-pat.size()-1 << "\n";
}
STL String Functions
The C++ Standard Library provides a rich set of string utilities. Understanding what happens internally helps you reason about time complexity and avoid surprises.
| Function | Signature | Complexity | Internal Behaviour |
|---|---|---|---|
substr() |
s.substr(pos, len) |
O(len) | Allocates a new string and copies len characters. Expensive in tight loops — avoid inside O(n) iteration. |
find() |
s.find(pattern) |
O(n·m) | Naive search by default. Returns string::npos if not found. Use KMP or Rabin-Karp for repeated searches. |
stoi() |
stoi(str, &pos) |
O(n) | Parses string to int. Optional second arg stores chars consumed. Throws invalid_argument on failure. Use try-catch. |
to_string() |
to_string(num) |
O(digits) | Converts numeric types to string. Equivalent to sprintf internally. Allocates a new string object. |
reverse() |
reverse(s.begin(), s.end()) |
O(n/2) | In-place swap using two pointers under the hood. No extra allocation. Works on any bidirectional iterator. |
sort() |
sort(s.begin(), s.end()) |
O(n log n) | Sorts characters by ASCII value. Quick way to check anagram: sort both strings and compare. |
erase() |
s.erase(pos, len) |
O(n) | Removes characters and shifts remainder. Very expensive in a loop — prefer building a new string. |
replace() |
s.replace(pos, len, str) |
O(n) | Replaces a range with another string. May cause reallocation if new string is larger than capacity. |
string s = "hello world";
// substr — extract "world"
string w = s.substr(6, 5); // "world"
// find — locate "world"
int pos = s.find("world"); // 6
if (pos == string::npos) // not found
// stoi — parse "42" → 42
int num = stoi("42");
// to_string — 3.14 → "3.14"
string pi = to_string(3.14);
// reverse in-place
reverse(s.begin(), s.end()); // "dlrow olleh"
// Check anagram via sort
string a="listen", b="silent";
sort(a.begin(),a.end());
sort(b.begin(),b.end());
bool anagram = (a == b); // true
Important Problems
Common Mistakes
s.size() returns an unsigned type. Comparing i < s.size() - 1 when s is empty causes underflow to a huge number, not -1. Always cast: (int)s.size() or use s.empty() guard first.s.erase() or s.insert() inside a loop invalidates iterators and shifts indices. Build a result string separately and return it, or use a two-pointer in-place approach only when safe.result = result + c inside a loop creates a new O(n) string each iteration. Total cost: O(n²). Use result += c for amortised O(1) per append, or use stringstream for complex formatting.s.substr(l, r-l+1) is O(window size). Calling it inside every iteration of a sliding window loop makes the total complexity O(n²). Store the window's start/length; extract only the final answer.freq[c - 'a'] when the string may contain uppercase, digits, or spaces causes negative array indices and undefined behaviour. Always validate the character range or use a 128-element ASCII array.s.find() returns size_t (unsigned). Checking pos == -1 will never be true. Always check pos == string::npos. Alternatively, store in int after explicit cast.Edge Cases
Interview Tip: After solving the main logic, always run through these edge cases mentally. Interviewers often ask "what if the string is empty?" immediately after you write your solution.
if (s.empty()) return .... Two-pointer: l < r fails immediately for empty strings — correct behaviour. Sliding window: loop never executes.result.reserve(n) prevents repeated reallocations.tolower(c) during traversal. "Hello" and "hello" are NOT anagrams of each other unless case-insensitive. Palindrome problems often say "ignore case".// Always add these guards at the top of string functions
if (s.empty()) return 0; // or "", or false
if (s.size() == 1) return true; // single char is palindrome
// Normalise case before processing
string norm;
for (char c : s)
if (isalnum(c)) norm += tolower(c);
// Safe char-to-index with bounds check
auto idx = [](char c) { return c - 'a'; }; // only for lowercase
auto idxASCII = [](char c) { return (int)c; }; // for all chars
Time Complexity Insights
Unlike arrays of integers, string operations inherently depend on the length of the string. Even "simple" operations like equality check, hashing, and copying require touching every character. This is because strings are variable-length data — the CPU cannot know the result without reading all n bytes.
| Operation | Complexity | Reason / Notes |
|---|---|---|
| s[i] — random access | O(1) |
Direct pointer arithmetic, like arrays |
| s.size() / s.length() | O(1) |
Stored as a field — doesn't scan the string |
| s += c (single char) | O(1) amortised |
Doubling strategy; occasional O(n) realloc |
| s + t (string concat) | O(n + m) |
Creates new string, copies both |
| s == t (equality) | O(n) |
Compares char-by-char after length check |
| s.find(pattern) | O(n·m) |
Naïve — use KMP or Z for O(n+m) |
| s.substr(pos, len) | O(len) |
Allocates and copies len characters |
| reverse(begin, end) | O(n/2) |
n/2 swaps — still O(n) asymptotically |
| sort(begin, end) | O(n log n) |
std::sort — introsort (quicksort + heapsort) |
| Sliding window | O(n) |
Each char added/removed at most once |
| Nested loop on string | O(n²) |
Every pair of positions — avoid when possible |
O(n²) Danger Zone: Substring extraction inside a loop, naive pattern matching, nested comparisons. Look for these patterns in your code and replace with sliding window or hashing.
O(n) Techniques: Sliding window, frequency arrays, two-pointer, KMP/Z for matching. These are the gold standard for competitive programming and interviews.
Quick Revision Cheat Sheet
- Two Pointer: converge from both ends — palindrome, reverse
- Freq Array:
freq[c-'a']++— O(1) lookup per char - Sliding Window: expand right, shrink left on violation
- Sort-based key: anagram grouping, canonical form
- Rolling Hash: Rabin-Karp for O(n) average matching
- KMP LPS: skip redundant comparisons on mismatch
- Z-Array: longest prefix matching at each position
reverse(s.begin(), s.end())— in-place O(n)sort(s.begin(), s.end())— anagram keys.find(p) != string::npos— check existencetolower(c), isalnum(c)— normalisationstoi(s), to_string(n)— type conversion
- Access s[i] → O(1)
- s.size() → O(1)
- s += c → O(1) amort.
- s + t → O(n+m)
- s.find(p) → O(n·m) naive
- KMP / Z → O(n+m)
- Sliding Window → O(n)
- sort string → O(n log n)
- ☐ Empty string — return base case
- ☐ Single character — trivially valid
- ☐ All same chars — verify loop behaviour
- ☐ Mixed case — normalise with tolower()
- ☐ Spaces / special chars — isalnum() guard
- ☐ s.size() unsigned — cast before subtraction
- ☐ find() → check against string::npos
// ═══════════════════════════════════════════
// String Problem Starter Template
// ═══════════════════════════════════════════
#include <bits/stdc++.h>
using namespace std;
void solve(string& s) {
int n = s.size();
if (n == 0) return; // ① empty guard
// ② Frequency array (lowercase only)
int freq[26] = {0};
for (char c : s) freq[c-'a']++;
// ③ Two pointer skeleton
int l = 0, r = n - 1;
while (l < r) {
// process s[l] and s[r]
l++; r--;
}
// ④ Sliding window skeleton
unordered_map<char,int> win;
int left = 0, best = 0;
for (int right = 0; right < n; right++) {
win[s[right]]++;
while (/* constraint violated */ false) {
win[s[left]]--;
if (!win[s[left]]) win.erase(s[left]);
left++;
}
best = max(best, right - left + 1);
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
string s; cin >> s;
solve(s);
return 0;
}