This document contains the instructions for
lab number 9 in CSC 148 H1.
To earn your lab mark, you must actively participate in the lab.
We mark you in order to ensure a serious attempt at learning, NOT to make careful critical judgments on the results of your work.
We will use the same general rules as for the first lab (including pair programming). See the instructions at the beginning of Lab 1 to refresh your memory.
This lab has you compare several searching and sorting algorithms.
You will use the module
cProfile to investigate
how the various functions spend their execution time.
Some algorithms are so fast that the output of the
does not provide enough significant digits to tell the time!
In those cases, we rely on the
to estimate the total time taken.
Do not be surprised if sometimes the output of the profiler is not what you expect! Think about all the various factors that affect the profiler. Always think about better ways of performing analysis of code performance. Feel free to tweak the code to try to come up with more sensible results.
In order for the
cProfile module to work correctly under Wing,
you must execute your code in DEBUG mode — in other words,
click on the "Debug" button instead of the "Run" button.
This is due to some unknown compatibility issue...
s1" and who will be student "
s2" for this lab.
s1drives and student
timeitmodules to investigate linear search, sentinel search, and binary search.
search.py", and "
test_search.pyto record the timing results for all three searching algorithms. (Remember to run the code in DEBUG mode under Wing.) From these results, we want you to rank the searching algorithms according to speed. For those algorithms that are very close in timing, use
cProfileto identify which operations make the difference.
When you are done, show your work to your TA and switch roles.
s2drives and student
sortmethod, for comparison purposes. (Remember to run the code in DEBUG mode under Wing.)
sort.py", and "
Selection sort works by repeatedly selecting the smallest remaining item and putting it where it belongs.
When you profile selection sort, you'll discover that a lot of time is spent
Fix this by introducing a temporary variable in the main while loop and
re-run the profiling.
You'll notice that
len is still being called a lot;
find out where and use the same trick to avoid calling
len so much.
How much time did you just save?
Insertion sort works by repeatedly inserting the next item
where it belongs in the sorted items at the front of the list.
There are two versions: one manually moves items using a loop, and
the other relies on Python's
Why do you think Python's list code is so much faster?
Of selection sort and insertion sort, which is faster?
Why do you think this is?
Bubblesort works by repeatedly scanning the entire list and swapping items that are out of order. One consequence of bubblesort is that, on the first scan, the largest item will end up at the end of the list no matter where that item was before the first scan. Given what we've learned from timing selection and insertion sort, how do you think bubblesort will perform?
There are two versions. The second one has a check to see whether any items have been swapped on the last scan, and if not it stops early (in that case, no items were out of order). How much of a difference does it make to exit early? Is it noticeable? Once you've done the bubblesort timing, figure out which is faster and why.
Mergesort is different: it splits the list in half,
sorts the two halves, and then merges the two sorted halves.
There are two versions: the first one uses a helper function
_mergesort_1 that returns a new sorted list
(and thus only replaces the items in the original list once,
when the helper function exits), and
the second one uses a helper function
that sorts the list between two indices and continually updates
the original list.
Which do think is faster, and why?
Quicksort works by partitioning the list into two halves:
those items less than the first item, and those greater than the first item.
For example, if the list is
[5, 1, 7, 3, 9, 12],
then the helper function
will rearrange the list into this:
[3, 1, 5, 9, 12, 7] — notice that the 5 is now in the right place.
Then the left and right sections are sorted using quicksort.
How fast is this?
Is quicksort faster on nearly-sorted lists or on random data?
There are two versions of quicksort. The second one uses indices to sort the list in-place, without making copies of each sublist. How much difference does this make?
How much of a difference does it make to use Python's built-in sort? Why do you think it is so much faster?
The results hold for randomized input.
Investigate whether there is an input that leads to fast or slow times:
try both nearly-sorted input and reverse-sorted input.
(See the comments in
test_sort.py to find out
how to generate these other kinds of inputs.)
When you are done, show your work to your TA. Then, please stick around to help other students in your lab section!