Data Structures and Algorithms: From Zero to Hero
Why DSA?
You know Python. You can write elegant one-liners. But when the interviewer asks, "What's the time complexity of your solution?" — you need to answer without hesitation. DSA is the language of technical interviews. It's how you reason about whether your code is fast enough and whether it uses too much memory.
This is my study note. Same philosophy as the Python Crash Course — self-serving, concise, and written so I can review it quickly before the exam.
Complexity
Complexity measures how an algorithm's resource usage grows as the input size grows. It's not about how fast your computer is — it's about how your code scales.
Types of Complexity
- Time — How many operations does your code perform as input grows?
- Space — How much memory does your code use as input grows?
- Other Resources — Network calls, graphics rendering, hardware (printers, CPUs, sensors, etc.). These matter in real systems, but for coding interviews, we focus on time and space.
Big O Notation
Big O describes the upper bound — the worst-case growth rate of your algorithm. When someone asks "What's the Big O?", they're asking: as the input gets really large, how does the number of operations grow?
The Rules
- Drop constants. O(2n) is just O(n). Constants don't matter at scale.
- Drop lower-order terms. O(n² + n) is just O(n²). The dominant term wins.
- Worst case matters. Unless stated otherwise, Big O means worst case.
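A minimal sketch of both rules (function names are mine, for illustration only): two separate passes over the same list are still O(n), and an O(n) term next to an O(n²) term disappears.

```python
def two_passes(lst):
    # Two full passes: 2n operations, but the constant 2 is dropped -> O(n).
    maximum = max(lst)   # O(n)
    minimum = min(lst)   # O(n)
    return maximum, minimum

def pass_plus_nested(lst):
    # O(n² + n) = O(n²): the nested pairwise loop dominates the single scan.
    total = sum(lst)                             # O(n) — lower-order term
    pairs = [(a, b) for a in lst for b in lst]   # O(n²) — dominant term
    return total, len(pairs)
```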
Common Complexities (Fastest to Slowest)
O(1) — Constant — Dictionary lookup, array index access
O(log n) — Logarithmic — Binary search
O(n) — Linear — Loop through a list
O(n log n) — Linearithmic — Efficient sorting (merge sort, timsort)
O(n²) — Quadratic — Nested loops (bubble sort)
O(2ⁿ) — Exponential — Recursive fibonacci (naive)
O(n!) — Factorial — Permutations, brute-force TSP
Visualizing the Growth
Think of it this way. If n = 1,000:
| Big O | Operations | Verdict |
|---|---|---|
| O(1) | 1 | Instant |
| O(log n) | ~10 | Instant |
| O(n) | 1,000 | Fast |
| O(n log n) | ~10,000 | Fine |
| O(n²) | 1,000,000 | Slow |
| O(2ⁿ) | good luck | Heat death of the universe |
Time Complexity in Practice
Let's see Big O in actual Python code.
O(1) — Constant Time
def get_first(lst):
    return lst[0]

# Doesn't matter if the list has 10 or 10 million elements.
# Accessing by index is always one operation.

d = {"key": "value"}
d["key"]  # O(1) — hash table lookup

O(log n) — Logarithmic Time
def binary_search(lst, target):
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if lst[mid] == target:
            return mid
        elif lst[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Each step cuts the search space in half.
# 1,000 elements? ~10 steps. 1,000,000 elements? ~20 steps.

O(n) — Linear Time
def find_max(lst):
    maximum = lst[0]
    for item in lst:  # Visits every element once
        if item > maximum:
            maximum = item
    return maximum

# Double the input, double the work. That's linear.

O(n log n) — Linearithmic Time
sorted_list = sorted(my_list)  # Python's Timsort is O(n log n)

# This is the best you can do for comparison-based sorting.
# If an interviewer asks you to sort, this is your floor.

O(n²) — Quadratic Time
def bubble_sort(lst):
    n = len(lst)
    for i in range(n):
        for j in range(0, n - i - 1):  # Nested loop
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
    return lst

# Nested loops over the same data = O(n²).
# This is the red flag in interviews. Avoid if possible.

O(2ⁿ) — Exponential Time
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Each call branches into two more calls.
# fib(40) takes billions of operations. Don't do this.
# Fix: use memoization (dynamic programming) to make it O(n).

Space Complexity
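The memoization fix can be sketched with the standard library's functools.lru_cache: each distinct fib(k) is computed once and cached, so the whole computation becomes O(n) time (and O(n) space for the cache).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Same recursive structure as before, but repeated calls
    # are served from the cache instead of branching again.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

# fib(40) is now instant instead of billions of operations.
```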
Same idea, but for memory. How much extra memory does your algorithm use?
O(1) Space — In-Place
def swap(lst, i, j):
    lst[i], lst[j] = lst[j], lst[i]

# Uses no extra memory beyond the input. This is O(1) space.

O(n) Space
def double_all(lst):
    return [x * 2 for x in lst]

# Creates a new list the same size as the input. O(n) space.

O(n) Space — Hash Map
def has_duplicates(lst):
    seen = set()
    for item in lst:
        if item in seen:
            return True
        seen.add(item)
    return False

# The set can grow up to n elements. O(n) space.
# But the time is O(n) instead of O(n²). Classic trade-off.

The Time-Space Trade-off
This is the single most important concept for interviews:
You can often trade space for time. Use more memory to make things faster.
| Approach | Time | Space | Example |
|---|---|---|---|
| Brute force nested loops | O(n²) | O(1) | Check all pairs for duplicates |
| Hash set | O(n) | O(n) | Store seen elements in a set |
| Sort first, then scan | O(n log n) | O(1)* | Sort, then check adjacent pairs |
*Depends on sorting algorithm. Python's Timsort uses O(n) space internally.
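The three rows of the table can be sketched as three versions of the same duplicate check (function names are mine, not from any library):

```python
def has_dup_brute(lst):
    # O(n²) time, O(1) space: compare every pair.
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j]:
                return True
    return False

def has_dup_set(lst):
    # O(n) time, O(n) space: remember what we've seen.
    seen = set()
    for x in lst:
        if x in seen:
            return True
        seen.add(x)
    return False

def has_dup_sorted(lst):
    # O(n log n) time (sorting dominates): duplicates become adjacent.
    s = sorted(lst)
    return any(s[i] == s[i + 1] for i in range(len(s) - 1))
```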
When the interviewer says "Can you optimize this?" — they usually mean: use a hash map or sort the data first.
How to Analyze Complexity
When you see code, ask yourself:
- Is there a loop? → At least O(n)
- Is the loop nested? → Multiply. Two nested loops over n = O(n²)
- Does it divide the input each step? → O(log n)
- Does it loop AND divide? → O(n log n)
- Does it branch recursively? → Could be exponential
Quick Practice
# What's the time complexity?

# 1.
for i in range(n):  # O(n)
    print(i)

# 2.
for i in range(n):  # O(n²)
    for j in range(n):
        print(i, j)

# 3.
for i in range(n):  # O(n) — inner loop is constant (10 iterations)
    for j in range(10):
        print(i, j)

# 4.
i = n
while i > 0:  # O(log n) — halving each time
    print(i)
    i //= 2

# 5.
for i in range(n):  # O(n log n)
    j = n
    while j > 0:
        j //= 2

Big O, Big Omega, Big Theta
For completeness — you'll see these in courses, but interviews almost always use Big O:
| Notation | Meaning | Think of it as |
|---|---|---|
| O (Big O) | Upper bound — worst case | "At most this bad" |
| Ω (Big Omega) | Lower bound — best case | "At least this good" |
| Θ (Big Theta) | Tight bound — upper and lower bounds coincide | "Exactly this rate" |
Example: Linear search in an unsorted list:
- Best case Ω(1): target is the first element
- Worst case O(n): target is the last element (or not there)
- Average case Θ(n/2) = Θ(n): drop the constant
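A plain linear search makes the three bounds concrete (a sketch, not library code): the loop exits after one step in the best case and runs all n steps in the worst case.

```python
def linear_search(lst, target):
    for i, item in enumerate(lst):
        if item == target:
            return i   # best case Ω(1): target at index 0, one step
    return -1          # worst case O(n): target last or absent, n steps

# linear_search([7, 2, 5], 7) stops immediately.
# linear_search([7, 2, 5], 9) scans the whole list.
```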
In interviews, just use Big O. Nobody will quiz you on Omega and Theta — but it's good to know they exist.
Python-Specific Complexity Cheat Sheet
Know these by heart. The interviewer won't ask directly, but you need this to analyze your own solutions:
| Operation | Time Complexity |
|---|---|
| list[i] | O(1) |
| list.append(x) | O(1) amortized |
| list.insert(0, x) | O(n) — shifts everything |
| list.pop() | O(1) |
| list.pop(0) | O(n) — shifts everything |
| x in list | O(n) — linear scan |
| list.sort() | O(n log n) |
| dict[key] | O(1) average |
| key in dict | O(1) average |
| set.add(x) | O(1) average |
| x in set | O(1) average |
| collections.deque.appendleft(x) | O(1) |
| collections.deque.popleft() | O(1) |
| heapq.heappush() | O(log n) |
| heapq.heappop() | O(log n) |
Key takeaway: If you need fast lookups, use a dict or set, not a list. If you need fast inserts/removes from both ends, use deque, not a list.
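A quick sketch of the takeaway in code: a set for O(1) average membership tests, and a deque for O(1) operations at both ends.

```python
from collections import deque

# Fast membership: set is O(1) average, the same test on a list is O(n).
items = {2, 4, 6}
print(4 in items)    # True

# Fast operations at both ends: deque is O(1) on either side.
q = deque([2, 3])
q.appendleft(1)      # O(1); list.insert(0, x) would be O(n)
q.append(4)          # O(1)
print(q.popleft())   # 1 — O(1); list.pop(0) would be O(n)
print(list(q))       # [2, 3, 4]
```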
Next up: Arrays, Linked Lists, and the data structures that show up in every coding interview.