Opportunity Through Data Textbook

Loops

KEY TERMS

  • Loop: a block of code that is run repeatedly

  • For loop: a loop that uses the keyword for and repeats a block of code for each element in a data structure

  • Iterate: to step through the elements of a data structure one at a time, executing the same code for each element

Introduction

Many times, in programming and in life, we do repetitive tasks. For instance, folding laundry requires the following steps: picking up a piece of laundry -> folding it -> putting it in a pile with others of its type. Let's say we have a load of laundry to fold, load = [shirt1, shirt2, pants1, pants2, jacket1, pants3]. It's a pretty small load, but describing the folding process through pseudocode would still take up a lot of space:

pick_up(shirt1)
fold(shirt1)
put_in_correct_pile(shirt1)
pick_up(shirt2)
fold(shirt2)
put_in_correct_pile(shirt2)
pick_up(pants1)
...
...

The purpose of loops is to avoid this kind of repetition in our code. While there are multiple types of loops, we most commonly use for loops in data science applications.

For Loops

For loops repeat a chunk of code a specified number of times or once for each element in a collection. The basic format of a for loop is:

for x in [ ]:
    do_something(x)

Here, the loop takes each element from the list [ ], sets it equal to a variable (in this case x), and does something to that element.
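As a small illustration of this format, here is a minimal sketch. The list colors and the choice of upper-casing each element are made up for this example; any list and any action would work the same way.

```python
# Hypothetical example list; it is not part of the laundry example.
colors = ["red", "green", "blue"]

shouted = []
for x in colors:
    # On each pass, x is set to the next element of colors.
    shouted.append(x.upper())

print(shouted)  # ['RED', 'GREEN', 'BLUE']
```

The indented lines form the body of the loop, and they run once for each element in the list.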

Now that we know what a for loop looks like, let's use it to simplify our task of folding laundry. We go through the same three steps for each item in our load of laundry, so we can put those three steps in the body of our loop! Then, let's make sure that those three steps are applied to the individual elements of our load. This would look like:

load = [shirt1, shirt2, pants1, pants2, jacket1, pants3] #repeated from above
for item in load:
    pick_up(item)
    fold(item)
    put_in_correct_pile(item)

This code is so much easier to read than before! Looping over the items in load is powerful because we don't need to know in advance how many times the loop will run (that is, the number of items we need to fold); we know that the three tasks are done for every item of clothing.
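The laundry code is pseudocode, so it will not run as written. One way to turn it into something runnable is sketched below. It makes several assumptions for illustration: each item is a string, the three steps are collapsed into a single hypothetical function, and an item's type is taken to be its name without the trailing number.

```python
load = ["shirt1", "shirt2", "pants1", "pants2", "jacket1", "pants3"]
piles = {}  # maps a clothing type to the list of folded items of that type

def put_in_correct_pile(item):
    # Assumed rule: the type is the item name with trailing digits removed.
    kind = item.rstrip("0123456789")
    piles.setdefault(kind, []).append(item)

for item in load:
    put_in_correct_pile(item)

print(piles)
# {'shirt': ['shirt1', 'shirt2'], 'pants': ['pants1', 'pants2', 'pants3'], 'jacket': ['jacket1']}
```

Adding more items to load would not require any change to the loop itself.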

Apply to Statistics

For loops are especially helpful when we have lists of numbers. If you were asked to find the sum of the numbers 1 to 20, adding them up by hand would take a while. Let's try finding the sum using a for loop!

We would want to iterate through every number from 1 through 20. Fortunately, there is a Python function, range(start, stop), that produces a sequence of integers from start up to, but not including, stop. In our case, range(1, 21) gives us all of the numbers we want to add up. A first approach would be to start by looping through this range.
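To see exactly which numbers range(1, 21) produces, we can convert it to a list. This is only a quick check; the variable name numbers is chosen for this example.

```python
numbers = list(range(1, 21))

print(numbers)       # the integers 1 through 20
print(numbers[0])    # 1  (the start value is included)
print(numbers[-1])   # 20 (the stop value, 21, is not included)
print(len(numbers))  # 20
```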

for n in range(1, 21):
    pass  # placeholder so the loop runs; this is where we will add up n

Each time the n variable changes, we want to add it to the total. Let's create a variable that holds our sum before we start adding numbers. We'll start at 0.

total = 0
for n in range(1, 21):
    total = total + n

Try to understand each line of code and verify that it makes sense. If the range function is not supplied with a start value, it starts at 0. Since adding 0 does not change a sum, we can simplify our code a bit to look like this without changing our final total.

total = 0
for n in range(21):
    total = total + n
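As a sanity check, the sketch below runs both versions of the loop and compares them with Python's built-in sum function. The name total_from_zero is introduced here only to keep the two results separate.

```python
# Version 1: start the range at 1
total = 0
for n in range(1, 21):
    total = total + n

# Version 2: let the range start at 0 by default
total_from_zero = 0
for n in range(21):
    total_from_zero = total_from_zero + n

print(total)                       # 210
print(total == total_from_zero)    # True
print(total == sum(range(1, 21)))  # True
```

Both loops give 210, which matches the built-in result.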

Summary

  1. Python for loops have the following form: for _ in [ ]: # do something with _

    Here, the for loop can complete an action on every element in a list.

  2. for loops can also repeat a task for a set number of times. If you want to do something 5 times, use the range function: for n in range(5): # do something

  3. Loops allow us to write shorter code. If you find yourself repeating yourself or copy-pasting, use a loop!

  4. Loops are especially helpful in data science because they allow us to go through lists, tables, and other forms of datasets.


Last updated 5 years ago
