GoSkills

Plotting Density Plots

About this lesson

In this video, we discuss density plots, kernel density estimation (KDE) plots, and how to create them.

Exercise files

Download this lesson’s related exercise files.

Plotting Density Plots.docx
57.2 KB
Plotting Density Plots - Solution.docx
55.4 KB

Quick reference

Plotting Density Plots

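In pandas, kernel density estimation (KDE) plots and density plots are the same thing: plot.density is an alias for plot.kde. Both draw a smooth curve that estimates the probability density function of a random variable, one curve per numeric column.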
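To make "estimating the probability density function" more concrete, here is a minimal sketch that computes a Gaussian KDE directly with SciPy, the library pandas relies on for these plots. The sample data and variable names are hypothetical and not part of the lesson.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical sample data (not from the lesson): 200 values around 50
rng = np.random.default_rng(0)
samples = rng.normal(loc=50, scale=10, size=200)

# Fit a Gaussian KDE; the bandwidth is chosen automatically
kde = gaussian_kde(samples)

# Evaluate the estimated density on an evenly spaced grid
grid = np.linspace(samples.min() - 30, samples.max() + 30, 500)
density = kde(grid)

# A density is never negative and its total area is approximately 1
area = float(np.sum(density) * (grid[1] - grid[0]))
print(round(area, 2))
```

The curve a KDE plot draws is this `density` array plotted against `grid`, which is why the y-axis shows density rather than counts.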

When to use

Use these methods when you want to see how the values in each column are distributed as a smooth curve, similar to a histogram but without bins.

Instructions

The first method of creating a kernel density plot is:
   my_df.plot(kind='kde')

The second method of creating a kernel density plot is:
   my_df.plot.kde()

KDE plots require SciPy; without it, the plot call fails with an error saying there is no module named scipy. From the terminal, run:
   pip install scipy
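As a rough end-to-end sketch of the two methods, the example below builds a small DataFrame and plots it both ways. The random data is made up for illustration; only the column names and the variable name my_df follow the lesson.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import numpy as np
import pandas as pd

# Hypothetical data: 50 rows for each of the four days used in the lesson
rng = np.random.default_rng(42)
my_df = pd.DataFrame(
    rng.normal(loc=100, scale=15, size=(50, 4)),
    columns=["Monday", "Tuesday", "Wednesday", "Thursday"],
)

# Method 1: pass the plot type through the kind argument
ax1 = my_df.plot(kind="kde")

# Method 2: call the kde accessor method directly
ax2 = my_df.plot.kde()

# Each call returns a matplotlib Axes with one curve per column
print(len(ax1.get_lines()), len(ax2.get_lines()))
```

Both calls produce the same curves, so the choice between them is a matter of style.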

 

Hints & tips

  • my_df.plot(kind='kde')
  • my_df.plot.kde()
  • pip install scipy
  • 00:05 Okay, in the last video, we looked at hex plots.
  • 00:07 In this video, we want to look at density plots and kernel density estimation plots.
  • 00:12 Basically the same thing, although slightly different.
  • 00:14 So let's just jump right in here.
  • 00:17 Let's go my_df.plot.
  • 00:20 And for kind we want, let's start out with kernel density,
  • 00:26 so kde, kernel density estimation.
  • 00:29 And if we run this, we're going to get an error.
  • 00:31 So if we come down here and look, we see, No module named scipy.
  • 00:34 So we actually need to install SciPy, Scientific Python, in order to use this.
  • 00:40 So we can do that very quickly.
  • 00:42 Just head back over to our terminal,
  • 00:44 Ctrl+C to break out of the Jupyter Notebook.
  • 00:47 And we can just pip install scipy, all one word,
  • 00:52 and it should just take a second.
  • 00:55 And now we can run our Jupyter Notebook again, and it pops back on.
  • 01:03 So I'm just going to close this.
  • 01:04 Now we can reload our current notebook, and we're still in our plot notebook.
  • 01:09 And we're going to need to come up here and Shift+Enter to run these three cells
  • 01:13 again, just to get this stuff in the current memory,
  • 01:17 since we had to restart our notebook.
  • 01:19 And now if we come down here, my_df.plot,
  • 01:24 and then, kind='kde'.
  • 01:28 And now we get a kernel density estimation for each of our columns,
  • 01:33 Monday, Tuesday, Wednesday, and Thursday.
  • 01:36 And a KDE is sort of like a histogram, but it's not quite the same.
  • 01:41 In fact, we can come down here and we can go my_df.plot.kde.
  • 01:47 And then Shift+Enter, to see exactly what the documentation says.
  • 01:52 And in statistics, kernel density estimation, KDE, is a nonparametric way to
  • 01:56 estimate the probability density function, PDF, of a random variable.
  • 02:01 The function uses Gaussian kernels and
  • 02:03 includes automatic bandwidth determination.
  • 02:06 And we come through here, there's not much really to see.
  • 02:07 It says some SciPy stuff, and really not much to learn there.
  • 02:15 But we can do some different things here.
  • 02:18 We can call alpha=0.4.
  • 02:22 I'll spell alpha right.
  • 02:24 To make these a little bit transparent if we want to see them better or
  • 02:31 whatever, we can do linewidth=2, or
  • 02:36 5 to make them fatter if we want, and all the things.
  • 02:41 And just like KDE, we can also do a density plot.
  • 02:48 And it's slightly different here.
  • 02:50 And we can come down here, so my_df.plot.density, and run this.
  • 02:57 We can Shift+Tab to read about this if we want to.
  • 03:02 And we get the same kernel density estimation spiel.
  • 03:07 These are basically the same thing.
  • 03:09 Okay, so we went through a lot of these plots very, very quickly, and
  • 03:13 we didn't get into great detail about them.
  • 03:15 But that's not really what I wanted to do in this course.
  • 03:18 I don't want to go into great detail.
  • 03:20 I just want to show you what's available and sort of introduce you to
  • 03:24 each of these things, all of these plots, sort of slowly.
  • 03:27 Because in data analysis with Python, it's easy to get overwhelmed.
  • 03:31 So I really just want you to become familiar with all these plots.
  • 03:34 Play around with them a little bit, get used to sort of the attributes that you
  • 03:38 can play with, changing the line width, changing the colors,
  • 03:41 just passing basic data into them, seeing the results.
  • 03:44 Because in the future, you'll learn how to use them in more advanced ways.
  • 03:46 But first, you have to become familiar with them.
  • 03:48 And that's really what I want you to get out of this course,
  • 03:51 just a basic familiarity with the different plotting that you can
  • 03:54 do with pandas using these visualization techniques of basic plots.
  • 03:59 So we looked at histograms, we looked at area plots.
  • 04:01 We looked at bar plots, line plots, scatter plots, box plots,
  • 04:05 I love the box plots, hex plots, and density plots.
  • 04:08 And you've got a nice little toolkit now to work from going forward.
  • 04:12 So that's all for this section.
  • 04:14 In the next section, we're going to start to look at linear regression with
  • 04:18 scikit-learn, talk about what linear regression is, how to train models and
  • 04:22 do fun stuff like that.
  • 04:24 And that'll be starting in the next video.
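
The styling options mentioned in the transcript (alpha for transparency, linewidth for thicker lines) and the density method can be sketched as follows. This is a minimal example under assumptions: the data is random and hypothetical, and only the column names and my_df come from the lesson.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import numpy as np
import pandas as pd

# Hypothetical data with the lesson's four day columns
rng = np.random.default_rng(7)
my_df = pd.DataFrame(
    rng.normal(size=(50, 4)),
    columns=["Monday", "Tuesday", "Wednesday", "Thursday"],
)

# Semi-transparent, thicker curves; extra keyword arguments go to matplotlib
ax_kde = my_df.plot.kde(alpha=0.4, linewidth=5)

# plot.density accepts the same arguments and draws the same curves
ax_density = my_df.plot.density(alpha=0.4, linewidth=5)

first_line = ax_kde.get_lines()[0]
print(first_line.get_alpha(), first_line.get_linewidth())
```

Lower alpha values help when the curves overlap, since each column's line stays visible underneath the others.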


© 2023 GoSkills Ltd. Skills for career advancement