Loops

KEY TERMS

  • Loop: a block of code that runs repeatedly, either a set number of times or once for each element in a collection

  • For loop: a loop that uses the keyword for and repeats a block of code for each element in a data structure

  • Iterate: to step through the elements of a collection one at a time, running the same code for each

Introduction

Many times, in programming and in life, we do repetitive tasks. For instance, folding laundry requires the following steps: picking up a piece of laundry -> folding it -> putting it in a pile with others of its type. Let's say we have a load of laundry to fold, load = [shirt1, shirt2, pants1, pants2, jacket1, pants3]. It's a pretty small load, but describing the folding process in pseudocode would still take up a lot of space:

pick_up(shirt1)
fold(shirt1)
put_in_correct_pile(shirt1)
pick_up(shirt2)
fold(shirt2)
put_in_correct_pile(shirt2)
pick_up(pants1)
...
...

The purpose of loops is to avoid writing out this kind of repetition. While there are multiple types of loops, we most commonly use for loops in data science applications.

For Loops

For loops repeat a chunk of code a specified number of times or once for each element in a collection. The basic format of a for loop is:

for x in [ ]:
    do_something(x)

Here, the loop takes each element from the list [ ], sets it equal to a variable (in this case x), and does something to that element.
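As a small concrete illustration of this pattern, the sketch below loops over a made-up list and collects the square of each element. The list values and the names numbers and squares are assumptions chosen only for illustration.

```python
# Hypothetical data, chosen only for illustration.
numbers = [2, 5, 7]

squares = []
for x in numbers:
    # On each pass through the loop, x holds the next element of the list.
    squares.append(x * x)

print(squares)  # [4, 25, 49]
```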

Now that we know what a for loop looks like, let's use it to simplify our task of folding laundry. We go through the same three steps for each item in our load of laundry, so we can put those three steps in the body of our loop! Then, let's make sure that those three steps are applied to each individual element of our load. This would look like:

load = [shirt1, shirt2, pants1, pants2, jacket1, pants3]  # repeated from above
for item in load:
    pick_up(item)
    fold(item)
    put_in_correct_pile(item)

This code is so much easier to read than before! Being able to go through all the items in load is powerful because we don't need to know in advance how many times the loop will run (that is, the number of items we need to fold). We only need to know that the three tasks are done for every item of clothing.
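The loop above is pseudocode, since names like shirt1 and pick_up are never defined. One possible runnable version is sketched below. It assumes each item is represented by a string, and the three helper functions are hypothetical stand-ins that only print or record what they would do.

```python
# Hypothetical stand-ins for the three pseudocode steps.
piles = {}  # maps a clothing type to the list of folded items of that type

def pick_up(item):
    print("picking up", item)

def fold(item):
    print("folding", item)

def put_in_correct_pile(item):
    # Assumption: an item's type is its name without the trailing digits,
    # e.g. "shirt1" -> "shirt".
    kind = item.rstrip("0123456789")
    piles.setdefault(kind, []).append(item)

load = ["shirt1", "shirt2", "pants1", "pants2", "jacket1", "pants3"]
for item in load:
    pick_up(item)
    fold(item)
    put_in_correct_pile(item)

print(piles)
```

The loop body is unchanged from the pseudocode; only the definitions around it were added so that it can run.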

Apply to Statistics

For loops are especially helpful when we have lists of numbers. If you were asked to find the sum of the numbers 1 to 20, adding them up by hand would be tedious. Let's try finding the sum using for loops!

We want to iterate through every number from 1 through 20. Fortunately, there is a built-in Python function, range(start, stop), that returns a sequence of integers from the start value up to, but not including, the stop value. In our case, range(1, 21) gives us all of the numbers we want to add up. A first step is to loop through this range.

for n in range(1, 21):
    pass  # placeholder: this is where we will add up n
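To see which values n takes on, one quick check is to collect them into a list and look at both ends. The name values is an assumption used only for illustration.

```python
values = []
for n in range(1, 21):
    values.append(n)

print(values[0])    # 1
print(values[-1])   # 20
print(len(values))  # 20
```

The last value is 20 rather than 21 because range leaves out the stop value.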

Each time through the loop, n takes the next value in the range, and we want to add it to the total. Let's create a variable that holds our sum before we start adding numbers. We'll start at 0.

total = 0
for n in range(1, 21):
    total = total + n

Try to understand each line of code and verify that it makes sense. If the range function is not given a start value, it starts at 0. Since adding 0 to the total changes nothing, we can simplify our code a bit without changing the final value of total.

total = 0
for n in range(21):
    total = total + n
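As a quick check that both versions agree, the sketch below runs each loop and compares the results with Python's built-in sum function. The variable names total_from_one and total_from_zero are assumptions introduced only to keep the two results apart.

```python
# Version that starts the range at 1.
total_from_one = 0
for n in range(1, 21):
    total_from_one = total_from_one + n

# Version that lets range start at its default of 0.
total_from_zero = 0
for n in range(21):
    total_from_zero = total_from_zero + n

print(total_from_one)     # 210
print(total_from_zero)    # 210
print(sum(range(1, 21)))  # 210
```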

Summary

  1. Python for loops have the following form: for _ in [ ]: # do something with _

    Here, the for loop can complete an action on every element in a list.

  2. For loops can also repeat a task a set number of times. If you want to do something 5 times, use the range function: for n in range(5): # do something

  3. Loops allow us to write shorter code. If you find yourself repeating yourself or copy-pasting, use a loop!

  4. Loops are especially helpful in data science because they allow us to go through lists, tables, and other forms of datasets.
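The two forms from the summary can be sketched side by side as follows. The sample values in heights_cm and the names used here are assumptions made up for illustration, not data from this lesson.

```python
# Form 1: act on every element of a list (hypothetical sample values).
heights_cm = [150, 160, 170]
total = 0
for h in heights_cm:
    total = total + h
mean_height = total / len(heights_cm)
print(mean_height)  # 160.0

# Form 2: repeat a task a set number of times.
greetings = []
for n in range(5):
    greetings.append("hello")
print(len(greetings))  # 5
```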
