Tables

How do we store large amounts of data?

KEY TERMS

  • Table: a way to organize data using rows and columns

  • Row: each horizontal line in a table is a row (also known as an entry)

  • Column: each vertical line in a table is a column

  • Attribute: a characteristic of an entry that describes that particular entry. Each attribute corresponds to a column

Now that we know more about data types in Python and representing sequences of items, let’s see what they can be used for!

In data science, we are concerned with data as well as its organization. We want to have well-organized data that’s simple to read and understand. A common way to organize data is by using a table.

What is a table? Essentially, a table is a way to organize data in rows and columns. Rows run horizontally and columns run vertically. Across the top of a table, you’ll see the column labels, or the column names. Column labels are usually attributes that describe something about every entry. For example, if we decided to put the data from our phone_numbers dictionary into a table, it would look something like this:

Name       Phone Number
"Sam"      3431234098
"Daisy"    5672349876
"John"     8907654321

Looking at the rows of our table, we find that each row represents one of our friends. In tables, each row is an entry. Moreover, we learn two things about each friend: their name and their phone number, which are the friend's attributes. In tables, each entry is described by multiple attributes.

Looking at the columns of our table, we notice that every column has a label and values. Each column label, like 'Name' or 'Phone Number', is associated with the list of values in that column. In fact, every column label could be considered a key mapped to a list of values. This means we can represent a table using a dictionary with key: value pairs!

# this is the phone_numbers dictionary we've seen before
>>> phone_numbers
{'Sam': 3431234098, 'Daisy': 5672349876, 'John': 8907654321}

# this is how we might organize the same information in a table
>>> phone_numbers_table = {'Name': ['Sam', 'Daisy', 'John'],
...                        'Phone Number': [3431234098, 5672349876, 8907654321]}

Notice that phone_numbers and phone_numbers_table look very different. Both are dictionaries and contain exactly the same information, but we've changed how the data is organized. Instead of each key: value pair representing one friend (e.g. 'Sam': 3431234098), each key: value pair now represents one column of the table.
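To make the link between the two layouts concrete, here is a minimal sketch of building the table from the original dictionary. It relies on the fact that .keys() and .values() walk a dictionary in the same order, so each name stays lined up with its number. The printed order shown assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
# the phone_numbers dictionary we've seen before
phone_numbers = {'Sam': 3431234098, 'Daisy': 5672349876, 'John': 8907654321}

# the keys become the 'Name' column and the values become the 'Phone Number' column
phone_numbers_table = {
    'Name': list(phone_numbers.keys()),
    'Phone Number': list(phone_numbers.values()),
}

print(phone_numbers_table)
# {'Name': ['Sam', 'Daisy', 'John'], 'Phone Number': [3431234098, 5672349876, 8907654321]}
```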

Suppose we want a list of all the friends to call tonight. How would you get it from the table? (Hint: a list of friends is a list of names.) What about a list of all phone numbers?

Now, if we wanted to find Daisy's phone number, how would we read the phone_numbers_table dictionary? To read a row in a table, we look across the columns. For example, reading the first row of a table means reading the first item in every column. We see that 'Daisy' is the second element in the column 'Name'. Therefore, the corresponding phone number can be found in the second element of the 'Phone Number' column.

# the values in the 'Phone Number' column of the table are associated with the key 'Phone Number'
# we've assigned numbers to the list of values in the 'Phone Number' column
>>> numbers = phone_numbers_table['Phone Number']
>>> numbers
[3431234098, 5672349876, 8907654321]

# Daisy is the second element of the 'Name' column
# so her corresponding phone number is also the second element of the 
# 'Phone Number' column.
# !! Remember that lists are zero indexed, 
# !! so the second element is at index 1
>>> numbers[1]
5672349876

# the following statement returns the same thing
>>> phone_numbers_table['Phone Number'][1]
5672349876
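The lookup above works because we already knew Daisy's position. As a rough sketch, the same idea can be wrapped in two small helpers. The names get_row and lookup_number are hypothetical, chosen here for illustration. The list method .index() returns the position of the first matching item, which we then reuse in the other column.

```python
phone_numbers_table = {'Name': ['Sam', 'Daisy', 'John'],
                       'Phone Number': [3431234098, 5672349876, 8907654321]}

def get_row(table, index):
    """Read one row: take the item at `index` from every column."""
    return {label: values[index] for label, values in table.items()}

def lookup_number(table, name):
    """Find the row whose 'Name' matches, then read its 'Phone Number'."""
    index = table['Name'].index(name)  # raises ValueError if the name is absent
    return table['Phone Number'][index]

daisy_row = get_row(phone_numbers_table, 1)
daisy_number = lookup_number(phone_numbers_table, 'Daisy')

print(daisy_row)     # {'Name': 'Daisy', 'Phone Number': 5672349876}
print(daisy_number)  # 5672349876
```

Note that .index() only finds the first match, so this sketch assumes every name in the table is unique.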

You meet a new friend, Mike, while waiting in line for Peet's Coffee, and you want to add him to your table. How would you go about adding him to phone_numbers_table?

One way to add a new entry to the table is to use the list method .append() on each column:

# here's Mike
>>> new_name = 'Mike'
>>> new_number = 5558801916

# let's add the new name to the list of names
>>> phone_numbers_table['Name'].append(new_name)

# now let's add the new number to the list of numbers
>>> phone_numbers_table['Phone Number'].append(new_number)

# now phone_numbers_table has a new entry
>>> phone_numbers_table
{'Name': ['Sam', 'Daisy', 'John', 'Mike'], 
'Phone Number': [3431234098, 5672349876, 8907654321, 5558801916]}
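Appending to each column by hand works, but it is easy to forget a column and leave the table with columns of different lengths. One possible way to guard against that is a small helper that adds a whole row at once. The name add_row and its error message are hypothetical, not part of Python.

```python
phone_numbers_table = {'Name': ['Sam', 'Daisy', 'John'],
                       'Phone Number': [3431234098, 5672349876, 8907654321]}

def add_row(table, row):
    """Append one entry; `row` maps every column label to a value."""
    # refuse rows that are missing a column or have an unknown one
    if set(row) != set(table):
        raise ValueError('row must have exactly the same labels as the table')
    for label in table:
        table[label].append(row[label])

add_row(phone_numbers_table, {'Name': 'Mike', 'Phone Number': 5558801916})
print(phone_numbers_table)
```

Because every column receives exactly one new value, all columns stay the same length after the call.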

Now you've realized that your once-friend John actually hates Peet's Coffee, and you never want to call him again. To remove his entry, we can use the del statement (short for delete) on each column.

# now we want to remove the entry for John
# John is described by the third item in each column

# let's remove John from the list of names
>>> del phone_numbers_table['Name'][2]

# now let's remove John's number from the list of numbers
>>> del phone_numbers_table['Phone Number'][2]

# now a row has been removed from phone_numbers_table
>>> phone_numbers_table
{'Name': ['Sam', 'Daisy', 'Mike'], 
'Phone Number': [3431234098, 5672349876, 5558801916]}
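Hard-coding the index 2 only works while John stays in the third row. A minimal sketch of a more general approach looks up the row by name first and then deletes that position from every column. The helper name remove_row is hypothetical, and the sketch assumes names are unique.

```python
phone_numbers_table = {'Name': ['Sam', 'Daisy', 'John', 'Mike'],
                       'Phone Number': [3431234098, 5672349876, 8907654321, 5558801916]}

def remove_row(table, index):
    """Delete the item at `index` from every column."""
    for values in table.values():
        del values[index]

# find John's row instead of assuming where it is
john_index = phone_numbers_table['Name'].index('John')
remove_row(phone_numbers_table, john_index)

print(phone_numbers_table)
```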

Summary

  • Tables are a way to organize data in rows and columns. Each row represents one entry. Each column represents an attribute of those entries. Every column has a column label.

  • A table can be represented by a dictionary. Each key: value pair represents one column where the key is the column label and the value is the list of values in that column.
