Some theory
Bubble sort is one of the most commonly known sorting algorithms. The idea is simple: scan a list of items (say integers) sequentially, from left to right, and compare consecutive pairs of elements starting at index 0.
Example:
my_numbers = [92,11,45,2234,0,7,65]
# 92 is index 0 and the consecutive pairs are
# (92,11), (11,45), (45,2234) and so on ...First we compare elements (list[0],list[1]) then (list[1],list[2]) then (list[2],list[3]) and so on until we reach the end of the list.
When comparing, we check if element i is greater than element i + 1. If it is, we swap the two elements and move on to the next pair. If not, it means the pair is already sorted, so we still move on to the next pair.
Example:
my_numbers = [92,11,45,2234,0,7,65]
# Let's compare my_numbers[0] and my_numbers[1]
if my_numbers[0] > my_numbers[1]:
swap(my_numbers[0], my_numbers[1])
print(my_numbers) # [11, 92, 45, 2234, 0, 7, 65]This process has to be repeated for however many items are on the list. So if the list holds 9 items, it means we need to loop through it 9 times at most. But what if our original list is partially sorted? We might not need 9 passes through the list.
One way for us to know that the list is fully sorted is if we have made no swaps during our pass. For that, we need a variable to keep track of how many swaps were made during a pass.
Example:
my_numbers = [92,11,45,2234,0,7,65]
# Elements (0,1) are compared and swapped. List is now 11,92,45,2234,0,7,65
# Elements (1,2) are compared and swapped. List now 11,45,92,2234,0,7,65
# Elements (2,3) are compared and not swapped. List remains the same.
# Elements (3,4) are compared and swapped. List is now 11,45,92,0,2234,0,7,65
# Elements (4,5) are compared and swapped. List is now 11,45,92,0,7,2234,65
# Elements (5,6) are compared and swapped. List is now 11,45,92,0,7,65,2234
# This represents one unique pass through the list.Notice how after each pass the highest value number is pushed at len(list) - 1.
Some code
Let's look at how to implement Bubble Sort in Python:
def bubble_sort(some_list):
is_sorted = False
while not is_sorted:
is_sorted = True
for i in range(0, len(some_list) - 1):
if some_list[i] > some_list[i + 1]:
some_list[i], some_list[i+1] = some_list[i+1], some_list[i]
is_sorted = FalseThis works fine and it will sort any list you throw at it. However we can slightly optimise it: We know that, after each pass the highest value element is guaranteed to be sorted and placed at len(some_list) - 1. Because of this, for each subsequent pass, we can stop comparing the last sorted item. instead of comparing pairs that we know are already sorted. This is what it looks like:
def bubble_sort(some_list):
is_sorted = False
last_sorted_item = len(some_list) - 1
while not is_sorted:
is_sorted = True
for i in range(0, last_sorted_item):
if some_list[i] > some_list[i + 1]:
some_list[i], some_list[i+1] = some_list[i+1], some_list[i]
is_sorted = False
last_sorted_item -= 1After each pass through the loop, we know the right side of the list is sorted so we decrement the value of last_sorted_item. What this means is that the 1st pass will loop from 0 to len(some_list) -1, the second time, it will be from 0 to len(some_list) - 2 and so on ...
Time complexity
The rate of growth of this algorithm is quadratic. Expressed as O(n^2) in "big-oh" notation.
def bubble_sort(some_list):
is_sorted = False # time here is constant
last_sorted_item = len(some_list) - 1
while not is_sorted: # We go through this first loop n times
is_sorted = True
for i in range(0, last_sorted_item): # we go through this loop n-1 times
if some_list[i] > some_list[i + 1]:
# execution here is constant
some_list[i], some_list[i+1] = some_list[i+1], some_list[i]
is_sorted = False
last_sorted_item -= 1 # constant timeIt's O(n^2) because, for each pass through the loop n times, we loop n times through the consecutive pairs. It's not a very efficient algorithm when used on large samples of data. It should only be used if you have a specific case on a small data set.
Next in the series is QuickSort, another interesting and more efficient sorting algorithm. As always, if you have questions, comments or if you spotted a typo or a mistake, please feel free to let me know on Twitter, I'm @aaqaishtyaq and always happy to help!