CS 351: Analysis of Algorithms
What We’ll Discuss This Week
We already have MergeSort - why do we need QuickSort?
Professional Opinion: QuickSort appears on most computer scientists’ “top 10 algorithms” lists
Practical Advantages: QuickSort runs in place (it needs only O(1) extra memory, versus MergeSort's O(n)) and is very fast in practice.
Aesthetic Appeal: the algorithm is short and elegant, and its analysis is a showcase for the power of randomization.
Problem Definition: given an array of n numbers in arbitrary order, output the same numbers sorted in increasing order.
Example:
Input: [3, 8, 2, 5, 1, 4, 7, 6]
Output: [1, 2, 3, 4, 5, 6, 7, 8]
Assumption: Distinct elements (no duplicates) for simplicity
Checkpoint 1: Can you think of any advantages an in-place sorting algorithm might have over one that requires additional memory?
QuickSort is built around partial sorting - partitioning an array around a “pivot element”
Two-Step Process:
Step 1: Choose a pivot element
Step 2: Rearrange the array so that everything less than the pivot comes before it and everything greater comes after it
Input Array:
[3, 8, 2, 5, 1, 4, 7, 6]
Choose pivot = 3 (first element)
After partitioning:
[2, 1, 3, 6, 7, 4, 5, 8]
       ↑
     pivot
Key Insight: Elements before pivot don’t need to be in sorted order relative to each other, and neither do elements after pivot!
1. Fast Implementation: partitioning can be done in linear time, with no extra memory (Section 5.2).
2. Significant Progress: after partitioning, the pivot sits in its final sorted position, and the problem splits into two smaller subproblems that can be solved independently.
Checkpoint 2: In the partitioned array [2, 1, 3, 6, 7, 4, 5, 8], what can you say about the final position of the element 3 in the fully sorted array?
Visual representation:
[< p]  [p]  [> p]
  ↓     ↓     ↓
 sort  done  sort
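To make this picture concrete, here is a minimal sketch of the recursive structure in Python. It assumes a partition helper (sketched in Section 5.2 below) that rearranges a[lo..hi] and returns the pivot's final index; this is an illustration, not necessarily the exact version used in lecture.

```python
def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place: partition, then recurse on both sides."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:               # 0 or 1 elements: nothing to do
        return
    p = partition(a, lo, hi)   # pivot lands at its final index p
    quicksort(a, lo, p - 1)    # sort the elements smaller than the pivot
    quicksort(a, p + 1, hi)    # sort the elements larger than the pivot
```

Note that there is no combine step: once both recursive calls return, the whole array is sorted.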
Order of Operations Difference:
MergeSort: recurse first, then do the work (merging the two sorted halves).
QuickSort: do the work first (partitioning), then recurse; there is no combine step at the end.
Historical Note: QuickSort was invented by Tony Hoare in 1959 when he was just 25 years old. He later won the Turing Award in 1980.
Three Main Questions:
Section 5.2: How do we implement the partitioning subroutine?
Section 5.3: How should we choose the pivot element?
Sections 5.4-5.5: What’s the running time of QuickSort?
Checkpoint 3: Before we dive into implementation details, predict: what do you think would be the worst way to choose a pivot element?
Simple approach using extra memory: scan the array once, copying elements smaller than the pivot into one buffer and larger elements into another, then concatenate (sketched below).
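A minimal sketch of this out-of-place approach (assuming distinct elements, as stated earlier; the function name is ours):

```python
def partition_with_extra_memory(a, pivot_index):
    """Partition using O(n) extra memory: smaller elements, pivot, larger elements."""
    pivot = a[pivot_index]
    smaller = [x for x in a if x < pivot]   # one pass for the "< pivot" bucket
    larger = [x for x in a if x > pivot]    # one pass for the "> pivot" bucket
    return smaller + [pivot] + larger       # allocates a whole new array
```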
Problem: Uses O(n) extra memory.
Goal: Implement partitioning in place, with O(1) extra memory.
Key Challenge: Rearrange elements without an extra array.
Solution Approach: scan the array once, maintaining the invariant that the already-examined elements are split into a "< pivot" region followed by a "> pivot" region; each newly examined small element is swapped into place (see the sketch after Checkpoint 4).
Memory Usage: Only constant extra space for a couple of array indices and the swaps themselves.
Checkpoint 4: Can you sketch how you might use two pointers (one from left, one from right) to partition an array in place?
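Here is one common in-place scheme that realizes the two-pointer idea, with a scanning pointer j and a boundary pointer i both moving left to right, and the pivot parked at the front of the subarray. This is a sketch of a standard scheme, not necessarily the exact version presented in lecture.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo], in place; return the
    pivot's final index.

    Loop invariant: a[lo+1 .. i-1] < pivot and a[i .. j-1] > pivot.
    Uses O(1) extra memory: just the indices i, j and the swaps.
    """
    pivot = a[lo]
    i = lo + 1                       # boundary between "< pivot" and "> pivot"
    for j in range(lo + 1, hi + 1):  # j scans the unexamined elements
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]  # grow the "< pivot" region by one
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]  # put the pivot between the two regions
    return i - 1
```

On [3, 8, 2, 5, 1, 4, 7, 6] with pivot 3 this run produces [1, 2, 3, 5, 8, 4, 7, 6], not the exact arrangement shown earlier; many different outputs count as correct partitions, and the pivot 3 still lands at its final index.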
Algorithm Correctness: independent of pivot choice.
Algorithm Performance: heavily dependent on pivot choice.
Common Strategies: always pick the first element, always pick the last element, pick uniformly at random, or pick the median of a small sample.
When things go wrong:
Example: Sorted array [1,2,3,4,5,6,7,8] with first-element pivot
Pick 1: [1] [2,3,4,5,6,7,8] (unbalanced!)
Pick 2: [2] [3,4,5,6,7,8] (still unbalanced!)
...
Total work: n + (n−1) + … + 1 = O(n²) comparisons.
When things go perfectly:
Example: Each partition splits roughly in half
Pick median: [smaller half] [median] [larger half]
Recursively: both halves ≈ n/2 size
Depth: log n levels
Work per level: O(n)
Total: O(n log n)
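For reference, the perfect-split case can also be written as a recurrence; it is the same one MergeSort satisfies, so the same levels-times-work-per-level argument gives the bound:

```latex
T(n) \;=\; 2\,T\!\left(\tfrac{n}{2}\right) + O(n)
\qquad\Longrightarrow\qquad
T(n) \;=\; O(n \log n)
```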
Checkpoint 5: If we have an 8-element array and always achieve perfect splits, how many levels of recursion will we have?
The Magic Solution: Choose the pivot uniformly at random
Why randomization works: no fixed input can consistently produce bad splits; most pivot choices (every element between the 25th and 75th percentiles, i.e., half of them) split the array reasonably well.
Real-world impact: Randomized QuickSort is a “for-free primitive”
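A minimal sketch of the random pivot choice, reusing the in-place partition helper from above (assuming Python's standard random module):

```python
import random

def random_partition(a, lo, hi):
    """Pick a pivot index uniformly at random, then partition in place."""
    k = random.randint(lo, hi)   # every element of a[lo..hi] equally likely
    a[lo], a[k] = a[k], a[lo]    # move the chosen pivot to the front
    return partition(a, lo, hi)  # the in-place partition sketched earlier
```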
Worst Case: O(n²)
Best Case: O(n log n)
Average Case: O(n log n)
Why randomized QuickSort works well:
Good Split Probability: with probability 1/2, a random pivot gives a 25%-75% split or better.
Progress Accumulation: roughly log n good splits suffice to shrink every subproblem to constant size, and good splits keep arriving at a steady rate.
Remarkable Fact: No matter what the input array is, the expected running time of randomized QuickSort is O(n log n); the average is over the algorithm's own random pivot choices, not over assumptions about the input.
Checkpoint 6: If we always get a 25%-75% split, what’s the maximum depth of recursion for an array of size n?
Real-world impact: Understanding QuickSort's performance helps us predict running times on real workloads and explains why it is the default sort in so many libraries.
The practical question: “If I have 1 million records to sort, how long will QuickSort take?”
What makes QuickSort slow or fast? Essentially the number of comparisons it makes: all of the other work (swaps, index bookkeeping) is proportional to the comparison count.
Bottom line: Count comparisons → predict performance
Imagine tracking every pair of elements:
Think of your array like a tournament bracket. Every element might get “matched up” against every other element at some point.
Example with array [4, 1, 3, 2]: there are (4 × 3)/2 = 6 pairs of elements that could ever be compared: {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}.
Our strategy: Instead of asking “How many total comparisons?”, ask “Will these two specific elements get compared?”
Why this is smart: it replaces one hard global question with many easy yes/no questions, each with a probability we can actually compute; linearity of expectation then adds the answers back up.
Checkpoint 7a: In a dating app with 100 users, how many possible “matches” (pairs) could there be? This is similar to counting element pairs in QuickSort.
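To make this precise (a standard derivation, included for reference): write z₁ < z₂ < … < zₙ for the elements in sorted order, and let X_ij indicate whether z_i and z_j are ever compared; there are n(n−1)/2 such pairs in total. The total comparison count C then decomposes pair by pair:

```latex
C \;=\; \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}
\qquad\Longrightarrow\qquad
\mathbb{E}[C] \;=\; \sum_{i<j} \Pr[\,z_i \text{ and } z_j \text{ are compared}\,]
```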
The key insight: Two elements get compared only under specific circumstances.
Let’s use a concrete example:
Array after sorting would be: [1, 2, 3, 4, 5, 6, 7, 8]
Question: When do elements 2 and 7 get compared?
Scenario 1: Element 2 becomes a pivot first (among 2 through 7): then 7 is still in the same subarray and gets compared directly against the pivot 2. (Compared.)
Scenario 2: Element 7 becomes a pivot first: symmetrically, 2 gets compared against the pivot 7. (Compared.)
Scenario 3: Element 4 (or 3, 5, or 6) becomes a pivot first: 2 is sent to the left side and 7 to the right side, and the two are never compared afterward. (Never compared.)
Real-world analogy: Think of QuickSort like organizing a library
The library scenario: to shelve books, you pick one book as a divider; everything sent to its left never needs to be compared against anything sent to its right, because the divider has permanently separated them.
Back to our example with elements 2 and 7: they are compared if and only if one of them is the first pivot chosen from the set {2, 3, 4, 5, 6, 7}.
The general rule: two elements get compared only if one of them becomes a pivot before any element whose value lies between them is chosen.
Checkpoint 7b: You’re organizing a bookshelf with books numbered 10, 20, 30, 40, 50. If book 30 is chosen as the separator first, will books 10 and 50 ever be compared later?
Let’s make this practical: What are the chances elements 2 and 7 get compared?
The contestants: whichever of {2, 3, 4, 5, 6, 7} is chosen as a pivot first decides the outcome (elements outside this range cannot separate 2 from 7).
Winning scenarios: Element 2 or 7 picked first: 2 of the 6 contestants.
Losing scenarios: Element 3, 4, 5, or 6 picked first: 4 of the 6 contestants.
So Pr[2 and 7 are compared] = 2/6 = 1/3.
Generic formula: for elements i positions apart in sorted order, there are i + 1 contestants, so Pr[compared] = 2/(i + 1).
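In symbols, using the sorted-order notation z₁ < … < zₙ from above:

```latex
\Pr[\,z_i, z_j \text{ compared}\,] \;=\; \frac{2}{\,j - i + 1\,}
\qquad\text{e.g. } \Pr[2, 7 \text{ compared}] \;=\; \frac{2}{7 - 2 + 1} \;=\; \frac{1}{3}
```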
Performance prediction: Now we can estimate QuickSort's behavior.
For any array size n: the expected number of comparisons is the sum of 2/(j − i + 1) over all pairs, which works out to roughly 2n·ln(n) (see the calculation below).
What log(n) means in practice: it grows extremely slowly; ln(10,000) ≈ 9 and ln(1,000,000) ≈ 14.
Real performance: sorting a million records therefore costs on the order of 2 × 10⁶ × 14 ≈ 28 million comparisons, entirely practical on modern hardware.
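The calculation (standard, included for reference): group the pairs by their gap, then bound the resulting harmonic sum by a logarithm:

```latex
\mathbb{E}[C] \;=\; \sum_{i<j} \frac{2}{j-i+1}
\;\le\; 2n \sum_{k=2}^{n} \frac{1}{k}
\;\le\; 2n \ln n
\;=\; O(n \log n)
```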
Checkpoint 7c: If QuickSort makes about 2n×log(n) comparisons, roughly how many comparisons would you expect for an array of 10,000 elements? (Hint: here log means the natural log, and ln(10,000) ≈ 9)
What we discovered: each pair of elements is compared with probability exactly 2/(gap + 1), and summing over all pairs gives about 2n·ln(n) expected comparisons.
Why this analysis matters: the O(n log n) guarantee holds for every input; it needs no assumptions about the data, only the algorithm's own coin flips.
Practical takeaway: When you see QuickSort in a library or system, you now know why it's there: it's not just fast, it's provably fast on average!
The Bottom Line: QuickSort’s genius isn’t just in its simplicity, but in how randomization naturally prevents the worst-case scenarios from happening often. Mathematics proves what programmers observe: it just works well in practice!
Practical Improvement: Choose the pivot more carefully
Algorithm: a common rule is median-of-three: look at the first, middle, and last elements of the subarray and use the median of those three values as the pivot.
Example: Array [8,3,2,5,1,4,7,6]: first = 8, middle = 5, last = 6; the median of {8, 5, 6} is 6, so 6 becomes the pivot.
Benefits: Much better performance on nearly-sorted arrays, where first-element pivots are disastrous.
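A minimal sketch of this rule, assuming the median-of-three interpretation above (the function name is ours):

```python
def median_of_three_index(a, lo, hi):
    """Return the index of the median of a[lo], a[mid], a[hi]."""
    mid = (lo + hi) // 2
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return trio[1][1]   # index of the middle value of the three

# For [8, 3, 2, 5, 1, 4, 7, 6]: the candidates are 8, 5, 6, so the pivot is 6.
```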
Fundamental Question: Can we sort faster than O(n log n)?
Theorem 5.5: No comparison-based sorting algorithm has worst-case running time better than O(n log n).
Proof Idea: any comparison-based sort must behave differently on each of the n! possible input orderings; an execution that makes k comparisons has at most 2^k distinct outcomes, so we need 2^k ≥ n!, i.e., k ≥ log₂(n!) = Ω(n log n).
Implication: QuickSort (and MergeSort) are asymptotically optimal!
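The key arithmetic step, for reference (using n! ≥ (n/2)^(n/2), since at least half of the factors of n! are at least n/2):

```latex
2^{k} \;\ge\; n!
\;\Longrightarrow\;
k \;\ge\; \log_2 (n!) \;\ge\; \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right)
\;=\; \tfrac{n}{2}\,\log_2 \tfrac{n}{2}
\;=\; \Omega(n \log n)
```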
Checkpoint 8: Can you think of sorting algorithms that might beat the O(n log n) lower bound? What would they need to do differently?
Algorithm Design: partition around a pivot, then recurse on both sides; no combine step needed.
Performance: O(n log n) expected time on every input, sorting in place with O(1) extra memory.
Practical Impact: a "for-free primitive" found in standard libraries and systems everywhere.
Algorithmic Elegance: the whole algorithm is a pivot choice, one linear scan, and two recursive calls.
Mathematical Beauty: the entire analysis follows from one fact: each pair is compared with probability 2/(gap + 1).
Historical Significance: invented by Tony Hoare in 1959 and still in everyday use.
Final Checkpoint: Now that you understand QuickSort, can you explain why it’s often preferred over MergeSort in practice, despite having the same average-case complexity?
Implement and Experiment: Try implementing QuickSort with different pivot strategies: first element, last element, uniformly random, and median-of-three (a starter sketch follows below).
Compare: Count comparisons on various input types: already sorted, reverse sorted, random, and nearly sorted.
Observe: How does pivot choice affect performance in practice?
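One possible starting point for the experiment, written as a hypothetical harness (all names are ours; the pivot strategies and the in-place partition are the ones discussed above):

```python
import random
import sys

sys.setrecursionlimit(10_000)  # first-element pivots on sorted input recurse ~n deep

def quicksort_count(values, choose_pivot):
    """Sort a copy of `values`; return the number of pivot comparisons made.

    `choose_pivot(a, lo, hi)` must return an index in [lo, hi].
    """
    a = list(values)
    comparisons = 0

    def sort(lo, hi):
        nonlocal comparisons
        if lo >= hi:
            return
        k = choose_pivot(a, lo, hi)
        a[lo], a[k] = a[k], a[lo]          # park the chosen pivot at the front
        pivot, i = a[lo], lo + 1
        for j in range(lo + 1, hi + 1):    # in-place partition, as in Section 5.2
            comparisons += 1               # one comparison against the pivot
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[lo], a[i - 1] = a[i - 1], a[lo]  # pivot's final index is i - 1
        sort(lo, i - 2)
        sort(i, hi)

    sort(0, len(a) - 1)
    assert a == sorted(values)             # sanity check
    return comparisons

strategies = {
    "first":  lambda a, lo, hi: lo,
    "random": lambda a, lo, hi: random.randint(lo, hi),
}
inputs = {
    "sorted":   list(range(1000)),
    "reversed": list(range(1000, 0, -1)),
    "shuffled": random.sample(range(1000), 1000),
}
for sname, strat in strategies.items():
    for iname, data in inputs.items():
        print(f"{sname:6s} {iname:8s} {quicksort_count(data, strat):8d}")
```

With n = 1000, expect the first-element strategy to make on the order of n²/2 ≈ 500,000 comparisons on the sorted and reversed inputs, while the random strategy stays near 2n·ln(n) ≈ 14,000 on all three.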
What we covered: the partition subroutine (Section 5.2), pivot-selection strategies (Section 5.3), the randomized O(n log n) analysis (Sections 5.4-5.5), and the Ω(n log n) lower bound for comparison-based sorting.
Next steps: work through the implementation exercises above to see the analysis play out in practice.