Slope and Distance

How do we find connections between different values? This subsection serves as a refresher on concepts you might or might not remember from algebra, covering slope and the distance formula.

Key Terms:

Coordinate Plane: a grid with an x-axis and a y-axis on which points can be plotted

Slopes

Slope is a number that indicates the direction and steepness of something. We usually use slope to describe the steepness of a line seen on a coordinate plane.

Why is the idea of slope important? We will see that prediction and estimation make up a large portion of Data Science applications. For example, if we use $x$ versus $y$ grams of some medicine called A, how many people will be healed? We can use this idea of slope to define a relationship between two variables ($x$ and $y$).

We use slope to estimate regressions (which we will learn more about in a future chapter) and to predict results based on certain settings. For example, we might want to predict people’s weights given their heights. As we discussed earlier, we can use slope to define a relationship between two variables. In this case, the heights (which are given/known) are the $x$ values. We can then set up a regression model to predict the dependent variable ($y$), which represents the weight. It is called dependent because the value that you put in for $x$ affects the value of $y$. In this way, there is a relationship between the two variables, and we will soon see that this relationship is defined by the slope.

Slope Formula

Now that we know what slope means, how can we calculate it? The formula below gives the slope between two points, $(x_1, y_1)$ and $(x_2, y_2)$; you may have heard of it as "rise over run." Think of it as calculating the steepness of a line if you had two points and drew a line connecting them.

$$\text{slope} = \frac{\text{rise}}{\text{run}} = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$$

The equation above gives us a fraction, but we should always try to simplify it. For example, instead of saying that we have a slope of $\frac{6}{2}$, we should further simplify that to $3$.
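As a minimal sketch of the formula above, the rise-over-run calculation can be written as a short Python function. The name slope and the use of the standard-library Fraction type (which reduces fractions to lowest terms automatically) are choices made for this illustration, and the sketch assumes whole-number coordinates.

```python
from fractions import Fraction

def slope(x1, y1, x2, y2):
    """Return the slope of the line through (x1, y1) and (x2, y2)."""
    rise = y2 - y1  # change in y
    run = x2 - x1   # change in x
    if run == 0:
        # A vertical line has no defined slope (we cannot divide by zero).
        raise ValueError("slope is undefined for a vertical line")
    # Fraction simplifies automatically, so a rise of 6 over a run of 2 becomes 3.
    return Fraction(rise, run)

print(slope(0, 0, 2, 6))  # rise 6, run 2 -> prints 3
print(slope(1, 2, 5, 4))  # rise 2, run 4 -> prints 1/2
```

Swapping the order of the two points gives the same slope, because both the rise and the run change sign.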

With a larger positive slope, $y$ increases more quickly as $x$ increases, whereas with a smaller positive slope, $y$ increases more slowly. Mirroring that, with a smaller negative slope (remember: smaller negative numbers are further from 0), $y$ decreases more quickly, whereas with a larger negative slope (closer to 0), $y$ decreases more slowly.
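To make this concrete, here is a small sketch that shows how much the output changes when the input moves one unit to the right. The slope values and the helper name change_in_y are picked only for illustration, and each line is assumed to pass through the origin.

```python
# Slopes chosen only for illustration.
slopes = [3, 0.5, -0.5, -3]

def change_in_y(m, x_start, x_end):
    """Change in y along the line y = m * x as x goes from x_start to x_end."""
    return m * x_end - m * x_start

for m in slopes:
    # Moving one unit to the right changes y by exactly the slope.
    print(f"slope {m}: y changes by {change_in_y(m, 0, 1)}")
```

The steepest lines here are the ones with slopes 3 and -3: one rises by 3 per step and the other falls by 3 per step.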

Review of Functions and Equations

Functions, as we introduced in the programming section, are very similar to equations. Think back to our definitions of functions. When you input a value into a function, you get one output. This is the same for an equation! Equations are simply mathematical functions that deal with numbers.

We can also represent mathematical equations through code! Let's take the equation $y = x + 2$. How can we write a function that takes in an input ($x$) and returns an output ($y$) that is equal to the input value $+\ 2$?

def add_two(x):
    return x + 2

Don't worry too much about how to write the function. Rather, focus on what the function is doing. Now, if we were to run this program, we can see what our function would output.

>>> add_two(4)
6
>>> add_two(3)
5

Notice that when we plug in $3$ as the $x$-value to the equation $y = x + 2$, we will always get $y = 5$ as the output. In this equation, different inputs always give different outputs. But equations can also produce the same output for completely different input values. Try to think of an example of an equation that does this.

One possible example is $y = x^2$: both $x = -2$ and $x = 2$ have $y = 4$ as the output. What about $y = 0 \cdot x$? Notice that for this function, no matter what value of $x$ you plug in, you will get $y = 0$ as the output.
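These two equations can be written as Python functions in the same style as add_two. The names square and times_zero are made up for this sketch.

```python
def square(x):
    # y = x^2
    return x ** 2

def times_zero(x):
    # y = 0 * x
    return 0 * x

print(square(-2), square(2))            # prints: 4 4
print(times_zero(7), times_zero(-100))  # prints: 0 0
```

Two different inputs give the same output in both cases, which never happens with add_two.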

Slope-Intercept Formula

The slope-intercept formula might sound fancy, but it's just another way for us to represent the same equations we've seen so far. In the following equation, $y$ is the output number that comes from plugging in an $x$ value, multiplying that by our slope $m$, and shifting that by $b$ units on the y-axis. $b$ is also known as the y-intercept, where the line hits the y-axis.

$$y = mx + b$$

By plugging in values to this formula, we can see that for $y = 3x + 2$, if we plug in $0$ as our $x$ value, the corresponding $y$ value is $2$. Likewise, $x = 1$ results in $y = 5$. Below is a table that shows the corresponding values for this equation. Notice that this equation goes on infinitely, but we've only shown 5 inputs of $x$ in the table. This means that if you plugged in a really big or really small number to the equation (say 1000000000000), you would still get an output value. There are no restrictions for where the equation stops! However, some equations do have certain restrictions, called bounds. If any of these values are confusing, try calculating one by hand or drawing out what the line should look like.

| $x$ Value | $y$ Value |
| --- | --- |
| -2 | -4 |
| -1 | -1 |
| 0 | 2 |
| 1 | 5 |
| 2 | 8 |
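The values in this table can be reproduced with a few lines of Python. This is only a sketch: the function name line and its default arguments (a slope of 3 and an intercept of 2, matching $y = 3x + 2$) are assumptions made for the example.

```python
def line(x, m=3, b=2):
    """Return y = m * x + b; the defaults match the example y = 3x + 2."""
    return m * x + b

# Print each x value next to its y value, one table row per line.
for x in [-2, -1, 0, 1, 2]:
    print(x, line(x))

# The equation has no bounds, so a very large input still gives an output.
print(line(1000000000000))
```

Changing m or b when calling line lets you build the same kind of table for any other line.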

Distance

Distance is a mathematical calculation of the space between two (coordinate) points. Let's say we drew a line connecting two points. With slope, we were calculating the steepness of the line. But with distance, we're calculating how long the line is.

Although there are many different ways to calculate distance, we mostly will use the “Euclidean Distance,” which is just a fancy term for the type of distance you are probably most familiar with.

Why does this matter? We will use distance in countless applications throughout data science. Most likely, we will use the distance formula to find the closest points in K-nearest neighbors.

Formulas and Explanations

To calculate the (shortest) distance between two coordinate points $(x_1, y_1)$ and $(x_2, y_2)$, we use the following formula:

$$\text{distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

While this formula may look complicated, it comes from a well-known mathematical result known as the Pythagorean Theorem ($a^2 + b^2 = c^2$). This theorem is used to find the length of one side of a right triangle, given the other two. Finding the length of a side of a triangle is the same as finding the distance between two points, since the side of a triangle is just a line segment. In our case, the horizontal difference $x_2 - x_1$ and the vertical difference $y_2 - y_1$ are the two legs ($a$ and $b$), and the distance between the two points is the hypotenuse ($c$). Here is a graphical interpretation:

Sanity check: Distance should never be negative! (Why?)
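As a worked example of the distance formula, take the points $(1, 2)$ and $(4, 6)$, which are chosen here only for illustration:

$$
\text{distance} = \sqrt{(4 - 1)^2 + (6 - 2)^2} = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5
$$

The two legs of the right triangle have lengths 3 and 4, and the hypotenuse (the distance between the points) has length 5.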

Here's how we can calculate the distance between two points using a function in Python!

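import math

def distance(x1, x2, y1, y2):
    # Note the argument order: both x-values first, then both y-values.
    x_squared = (x2 - x1) ** 2  # squared horizontal difference
    y_squared = (y2 - y1) ** 2  # squared vertical difference
    return math.sqrt(x_squared + y_squared)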

This might look complicated, but let's break it down. The variable x_squared represents $(x_2 - x_1)^2$, and y_squared represents $(y_2 - y_1)^2$. The two star symbols (**) represent the power operator in Python, so 3**2 just means 3 squared. Then, we take the square root of the two variables added together. math.sqrt() is a ready-made function (much like the built-in functions we saw earlier) from Python's standard math module, which has to be imported with import math before use. It computes the square root of something for us, so that we don't have to do all of that math manually.
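Here is a short usage sketch for the distance function. The function is repeated so that the example runs on its own, and the points $(0, 0)$ and $(3, 4)$ are chosen only for illustration. The last line uses math.hypot from the standard library as a cross-check.

```python
import math

def distance(x1, x2, y1, y2):
    # Same argument order as the function above: both x-values, then both y-values.
    x_squared = (x2 - x1) ** 2
    y_squared = (y2 - y1) ** 2
    return math.sqrt(x_squared + y_squared)

# Points (0, 0) and (3, 4): the legs are 3 and 4, so the distance is 5.
print(distance(0, 3, 0, 4))      # 5.0

# Swapping the two points gives the same answer, because the differences are squared.
print(distance(3, 0, 4, 0))      # 5.0

# math.hypot computes the same quantity from the two differences.
print(math.hypot(3 - 0, 4 - 0))  # 5.0
```

Because of the argument order, it is easy to pass the coordinates of one point together by mistake, so double-check that the two x-values come first.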

Practice Problems

  1. Try to explain distance and slope to a friend or peer.

  2. Given $y = -2x + 3$, complete this table:

    | $x$ Value | $y$ Value |
    | --- | --- |
    | -2 | |
    | -1 | |
    | 0 | |
    | 1 | |
    | 2 | |

  3. Find the slope between $(3, -9)$ and $(-5, 5)$, then find where the line through them intersects the $y$-axis, and use these results to construct a slope-intercept equation. Repeat for $(1, 2)$ and $(3, 4)$.

  4. True or False: The line through $(2, 1), (-4, 13)$ and the line through $(-11, 1), (1, -19)$ have the same slope.

  5. True or False: The line through $(-1, -2), (1, 2)$ and the line through $(0, 4), (-2, 0)$ have the same slope and intercept. (Draw out these two lines. Do you see anything interesting?)

  6. What is the distance between $(3, -9)$ and $(-5, 5)$?

  7. Which two points are closest in: $(-1, -2), (2, -3), (1, 2), (-2, 3)$?
