About this lesson
In this video, we begin discussing data visualization, and install matplotlib to help us create histograms.
Exercise files
Download this lesson’s related exercise files.
Plotting Histograms (57.1 KB)
Plotting Histograms - Solution (56.7 KB)
Quick reference
Plotting Histograms
We can use matplotlib to plot charts and graphs, including a histogram.
When to use
Use this to create a histogram.
Instructions
To create a histogram, call the .hist() method on a column of your DataFrame:
my_df['Wed'].hist()
To change the number of bins in the histogram:
my_df['Wed'].hist(bins=20)
You can also use the .plot() method and set kind to "hist":
my_df['Wed'].plot(kind="hist")
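
Putting the calls above together, here is a minimal sketch that should run as a plain script. The column names other than 'Wed', the random seed, and the Agg backend are assumptions made for illustration; in a Jupyter notebook you would use %matplotlib inline instead of selecting a backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for running outside a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable

# 100 rows x 4 columns of random values, as in the lesson.
# 'Mon', 'Tue', and 'Thu' are assumed names; only 'Wed' appears in the reference.
my_df = pd.DataFrame(np.random.randn(100, 4), columns=["Mon", "Tue", "Wed", "Thu"])

# Method 1: Series.hist() with a custom number of bins.
fig1, ax1 = plt.subplots()
my_df["Wed"].hist(bins=20, ax=ax1)

# Method 2: Series.plot() with kind="hist" and the same number of bins.
fig2, ax2 = plt.subplots()
my_df["Wed"].plot(kind="hist", bins=20, ax=ax2)

# In a notebook the charts render inline; in a script, call
# fig1.savefig(...) or plt.show() (with an interactive backend) to see them.
```

Passing ax= keeps each histogram on its own figure; without it, repeated calls in a single script may draw onto the same axes.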
Hints & tips
- my_df['Wed'].hist()
- Change Bins: my_df['Wed'].hist(bins=20)
- Other Method: my_df['Wed'].plot(kind="hist")
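
To see what changing bins does, the sketch below draws the same column three times with the bin counts tried in the video (20, 50, and 100). The Series here is stand-in random data, and the seed, figure size, and Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; use %matplotlib inline in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1)  # assumption: fixed seed for repeatability
wed = pd.Series(np.random.randn(100), name="Wed")  # stand-in for my_df['Wed']

bin_counts = [20, 50, 100]
fig, axes = plt.subplots(1, len(bin_counts), figsize=(12, 3))

# Draw one histogram per bin count, side by side.
for ax, n_bins in zip(axes, bin_counts):
    wed.hist(bins=n_bins, ax=ax)
    ax.set_title(f"bins={n_bins}")

# Each histogram draws one bar (patch) per bin, including empty bins.
bars_drawn = [len(ax.patches) for ax in axes]
print(bars_drawn)
```

With only 100 values, more bins means fewer values per bar, so the chart gets spikier rather than smoother.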
- 00:06 All right, in this section, we want to start doing some visualization with
- 00:09 pandas, and charting graphs, and things like that.
- 00:12 So pandas has built-in visualization, and
- 00:14 we're going to be looking at that in the next few videos.
- 00:18 But it's built on top of something called Matplotlib.
- 00:21 So we actually need to pip install that from our terminal.
- 00:25 So I know it's been a while since we looked at our terminal here, but
- 00:28 come back to your terminal.
- 00:29 Hit Ctrl+C to break out of this.
- 00:32 And let me just clear the screen here.
- 00:34 And we're still in our c/data directory.
- 00:37 We still have our virtual environment turned on.
- 00:38 So to install this, we just go, pip install matplotlib.
- 00:48 And boom, that should do it.
- 00:49 Now let's run our Jupyter notebook again.
- 00:54 Because when we turned it off, it stopped running.
- 00:57 So now we have to run it again.
- 01:01 And here we are.
- 01:02 Now, this is our old notebook right here.
- 01:03 Let's create a new one for this new section,
- 01:05 since we're going to be doing slightly different things.
- 01:07 So click on New, we want a new Python 3 notebook.
- 01:11 And let's save this as plot, because we're going to be plotting some things.
- 01:18 And you can see our old one,
- 01:21 the connection failed because we turned it off to install matplotlib.
- 01:25 But now let's just grab each of these things, and copy and
- 01:29 paste them into our new notebook.
- 01:35 So copy, paste.
- 01:42 Copy, paste, and we also need to add one more thing.
- 01:45 So we're going to be creating charts and graphs.
- 01:48 And in order to do these inline in our Jupyter notebook,
- 01:50 we need to make a little notation here.
- 01:52 So we need the percentage sign and then, matplotlib inline.
- 01:58 So if we Shift+Enter to run all of these things, now we're ready to go.
- 02:02 So let's create a new DataFrame.
- 02:04 And let's just stick with this, and let's just call it my_df.
- 02:09 And that's going to be pandas dot DataFrame.
- 02:12 Now let's create some random data for this, so randn.
- 02:16 Let's go 100 rows and 4 columns.
- 02:20 And we can designate the columns as, what?
- 02:25 Let's just stick with our sort of Monday, Tuesday,
- 02:31 Wednesday, Thursday.
- 02:36 So let's take a look at this, my_df, and we get this big, huge thing.
- 02:43 And we've got 100 rows, from 0 to 99.
- 02:46 It sort of puts some dots here, so it's not all on the screen.
- 02:51 And so let's start out by creating a histogram.
- 02:55 Histogram's a common sort of plot that you're going to want to do.
- 02:58 So all we have to do is, my_df, and then what do we want to chart?
- 03:03 Let's chart Wednesday, so we would just call Wednesday.
- 03:08 And then you could just call .hist, and it's a function.
- 03:11 And if we run this, boom, we get a histogram.
- 03:14 Now, this is kind of blocky and chunky.
- 03:16 Maybe we want to break it down into smaller bins.
- 03:21 So we can designate that right here, bins equals, I don't know, let's say 50.
- 03:25 And it kind of breaks it up a little bit more.
- 03:28 We could go bins=100, if we wanted to really stretch it out.
- 03:32 We could go bins=20, maybe that's a little better.
- 03:35 And it's just that easy.
- 03:36 So this is plotting with pandas, very simple and kind of cool.
- 03:41 Now, there's another way we could do histograms.
- 03:44 We could go my_df.
- 03:48 And then let's say, again, we want Wednesday.
- 03:51 Let's mix it up, let's go Monday.
- 03:52 Now, we can go .plot.
- 03:54 And then inside of here, we can designate the kind of plotting we want to do.
- 04:00 And we can set the kind to hist, for histogram.
- 04:01 And notice this way, we don't have those grid bars.
- 04:05 And just like before, you could pass in arguments here.
- 04:08 So we could go bins equals, I don't know, let's say 30,
- 04:13 and it's kind of interesting.
- 04:14 Now, randn draws these values from a standard normal distribution, so
- 04:18 even with only 100 rows it looks sort of normal, I guess, normal distribution.
- 04:23 A little bit.
- 04:24 So that's how you do histograms with Matplotlib.
- 04:26 And really, that's how you do a lot of this charting stuff.
- 04:29 So really, really easy to use this stuff.
- 04:32 And like I said, this is built on top of Matplotlib, so that's cool.
- 04:34 So in the next video, we'll look at area plots.
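
The video notes that .plot(kind="hist") shows no grid lines while .hist() does. As a rough sketch, both methods accept a grid argument, so either look can be had from either call. The column names, seed, figure size, and Agg backend below are assumptions for illustration, and the default behavior described in the comments assumes matplotlib's default style.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; use %matplotlib inline in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(2)  # assumption: fixed seed for repeatability
my_df = pd.DataFrame(np.random.randn(100, 4), columns=["Mon", "Tue", "Wed", "Thu"])

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 3))

# .hist() draws grid lines by default; grid=False turns them off.
my_df["Mon"].hist(bins=30, grid=False, ax=ax_left)
ax_left.set_title(".hist(grid=False)")

# .plot(kind="hist") leaves grid lines off by default; grid=True turns them on.
my_df["Mon"].plot(kind="hist", bins=30, grid=True, ax=ax_right)
ax_right.set_title('.plot(kind="hist", grid=True)')
```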