Basic Excel Business Analytics #28: Power Query: Import Multiple Text Files & Build Grade Dashboard


Welcome to Highline BI348
class video number 28. If you want to download this
Excel file BI348 chapter 2.5 Import 01 text files– and it’s
important for this chapter, there are a lot of
files to download. So this is the
example, Import 01. Or you want to download the
important zipped folder, because that zip folder has
our source files to import. You can click on the
link below the video. Now whether or not you go to
the class website or the link below the video, these will
be the files to download. Now we actually want
to go over and look at the finished
result. We are going to build a dashboard like this. Here’s some grade data analysis. We have the classes Business
135, a business math class, and Business 210,
a statistics class. And we want to be able to
click on any particular year and see our mean standard
deviation and the distribution that is shown
through a histogram for any particular year. So that’s the end result.
But here’s the deal. If we go look at our source
data– and here it is, the Import 01 text files,
and we’re going to look at our start files. For this first example,
we have text files 1988 to 1990 and 1990 to 2000. If we open these up, they’re
just text files, tab delimited. That means each one of
the columns or fields are separated like tabs. So how do we get
that into Excel? Well, here’s the cool thing. Here is a folder– and I’m
going to click the Back button– that folder, we’re going
to tell Power Query just to look at this folder,
and whatever text files are in there will
automatically be imported, cleaned, and then dumped
in our final report. Not only that, but when
we later dump new files into this folder,
the whole process will automatically update. So let’s go to our Start file. All right, here it
is, Power Query. And in the Get External
Data– and we’re going to look at a
few different options from this over the
next few videos– we’re going from file in
this amazing From folder. It’s going to ask us to
browse, so we browse. I’m browsing, and when I
get to my final folder– there it is– I click OK. Now I click OK. And here is a bunch
of information of the files in that folder. Before we do anything, I want
to come over here and give this query a good name, Get
Text Grade Data and enter. Now the first
thing we want to do is look through these
columns, and here it is. It shows me the
extension for each one of the files in that folder. Any time I click one
of these drop down arrows, just like in
Excel, it will give me a unique list of options
that I can filter out. And I don’t want the CSV files,
I only want the TXT files. And click OK. So it’s filtered that. And look over here. It’s memorizing the steps. All of the data transformation
and cleaning steps that we do in Power Query
will be memorized over here. Now, we don’t need
any of this data. We only need the Content
column, which has the data. So I’m going to right
click the Content field and check that out,
Remove Other Columns. Instantly they’re gone. Now the double
downward pointing arrow with the little line, when I
click that, it will expand. Now notice over here it
did a bunch of steps here, including importing–
it called it a CSV, but this is really text. And notice on that
step right there, it actually treated
the field name from the text file
not as the field name but actually has a record. And look at this. It automatically knew
to promote the headers– so we have the proper field
names– and change the type. So this is text, and in
the Home ribbon transform you can see text. This one is decimal number,
and this one is text. So that did a bunch
of steps for us. Anytime we import from
a folder many files, we actually want to make
sure that these field names in some of the other files
didn’t get stuck as a record. And one way to do
this is to find which column has the fewest
number of unique records. And I know Course
ID only has two, so I’m going to
click the drop down. This is just like
the filter in Excel. It will give us a unique list. Now since power query
can import a lot of data, you always want to come down
here and click Load More if you see that,
because that forces it to look through more records. And notice, somewhere down
there in that second file, it did not promote that header. So we have to uncheck this. Course ID will be unchecked. And remember, if
we see Course ID, there will be a decimal
grade and an academic year. Those are the field
names from the text file. So by filtering that out, it
will filter out that record that’s not really a record. When I click OK,
boom, there it is. All of our steps are completed. We have a proper name,
come up to Close and Load. Close and Load just
saves it on a new sheet. We don’t want that. We always want to go
to Close and Load two. We have the choice
here, a table, and then we could
put it somewhere. Only create a connection, we’ll
do that next video, in unison with Add this Data
to the Data model. But here we want a table. I want to say on the
existing sheet, A1 is fine, so now I click Load. And over here, we have
almost 4,000 records loaded. Now I’m going to come down
here and double click and call this Grade Data and Enter. So now we have
our imported data. And we want to
build our dashboard. Now if we go look at our
end result, remember, we have these three
charts and this one slice. So there’s actually
going to be some behind the scenes
pivot table reports that are fueling these charts
and governed by this slicer. Now what is a dashboard? A dashboard is a data
visualization tool that illustrates multiple
metrics or KPIs– Key Performance Indicators–
and automatically updates as new data becomes available. So if we come back over here,
our key performance metric for each class is going to
be mean, standard deviation, and the distribution. If I click through
these and get over here, I notice that in ’88,
’89, Business 135 was over 3.0 grade point average. But for some reason,
in 2005, 2.3. And if we look down here,
this is a histogram. And these columns
tell us the frequency. And boom, look at that. That went way down. But it mostly looks like it
was because of these zeros. Now that’s used
in our end result. And let’s take a
quick look at what makes a good data dashboard
or effective dashboard– presents timely summary
data or metrics/KPIs; metrics/KPIs should be useful
for the decision maker; dashboard should inform
rather than overwhelm– So if we look over here, we don’t
have lots of other minutia, it’s pretty straightforward. We have the mean, the
standard deviation, and the distribution–
and should call attention to unusual metrics/KPIs that
require attention or are of interest, like that
column with the zeros. All right, let’s go back
to our awesome Import from Power Query. Now the first pivot
table we’re going to need is mean and standard deviation
for each one of our courses. So I click in a
single cell, Alt-N-V. And notice that the name
of that table range– remember, that’s the output
from Power Query– it took the name of our query and
actually, when it imported it on to this sheet, it created
an official Excel table. And that’s the name of it. Existing sheet, I’m
going to put it in E1. None of this stuff
is really going to be part of our
finished report. But it will be governing how
the charts and the slicers move. Now I want course ID down here. Decimal grade, I’m
going to drag it twice. We’re going to do our mean
and standard deviation. Right click, Summarize values by
average for mean, right click. Summarize values by– and I
don’t see standard deviation– More Options. One of our 11 functions,
StDev, and click OK. Now I’m going to go ahead
and just type names up here. So this will be Class, tab. This will be Mean, tab. And this will be,
click right here, F-2, I’m going to leave it StDev
for standard deviation. Now I’m going to highlight E to
G and change the column width. You know, I actually don’t
really need to do that. Because really,
again, this is going to be represented by a chart. Right click and
Number Formatting. Number, two decimals
is fine, click OK. That number formatting
from this pivot table will show up over
there in our chart. Number formatting, same thing. Number, two decimals, click OK. Now I can build my chart. And notice the categories
are Categorical Data. So I’ll make a column chart,
insert over here column. That’s looking pretty good. I’m going to get rid of my
pivot chart fields, close that. Right click, Hide
all Field buttons. I want to click on the
legend, Control-1 to open up the task pane. Wow, it looks like it’s
way off to the side. Actually, I’m going
to say at the bottom. We’ll put this on the bottom. I definitely want to see
the numbers at the top, so data labels. And unlike the dashboard we
looked at just a second ago, I left some numbers
here and here. That is chart junk, so
I’m going to delete that. Now I do you want a title. And since we’re ultimately
going to have a slicer based on the academic year, I’m
going to go ahead and add– pull this out of the way
and add a slicer here. Analyze. Up in the filter
group, Insert Slicer, Academic Year and click OK. Slicer Tools Options,
I don’t know how many. Let’s change it to four. We’ll leave it there right now. And when we add a slicer
like this, it is filtering. So when I click 1988 to ’89,
looks like the chart’s working fine. We’ve got the mean and
the standard deviation. But I want to make a label. And I don’t know how
to create a label and connect it to the slicer. So I’m actually going to
click back in the pivot table. And I closed the
Field list, so I’m going to say Show Field list. And the way I’m
going to do it is I’m going to drag the academic
year down to filter. Now really, this is
doing the same thing. But watch this. When I change the filter in the
slicer, boom, it changes there. Because that’s in the cell,
I can make a chart title link to that cell. So now I’ll close this. And remember, all of this is
kind of behind the scenes, so I can do stuff where I want. I’m going to make my
title equals– and I’m going to say academic
year– and in double quotes space equals sign
space end double quote ampersand, and
boom, click on that. And there you go. So if I slice this or filter
it, there is my label. Pull this off to the side,
let’s our chart back over. I’m going to click
the plus chart title. With the solid line, I’m
going to say equal sign, shoots me up there. And I’m going to click on
that cell with that new label, and Enter. Now let’s see if it’s working. If I click 1991 to 1992,
that is looking awesome. I’m going to control and role. We’re going to make a
few more pivot tables. Pull this down over here. And we’ll connect the slicer
to all the pivot tables. So click back in the
data set, Alt-N-V. And I’m going to put this on
the existing sheet location E11, and click OK. Now this is going
to be our histogram. I’m going to take the
actual decimal grade and get a unique list
for decimal grades. I’m not going to group it, I’m
going to leave it like that. Drag it down here, it
will default to sum. Right click. Summarize values by, and
we’re doing frequency, so we’re going to use count. And now I’m going to
want to filter each one of these reports by course ID. And this one is going
to be Business 135. And click OK. Now let’s change the labels. Grade, tab,
Frequency, and enter. Now I want to make it a label
similar to what we did up here. And I can see I want a little
bit more room, so I’m actually going to highlight the entire
pivot table, point to the edge. And when I see my
move cursor, I’m going to click and drag down to. And right here, I’m going
to make a chart label. And watch this, I’m
going to say, hey, that filter right
there, Business 135, and join it to double quotes,
comma, space, end double quotes and join it to this
label up here, and enter. So now our chart title will say
Business 135, Academic Year, based on whatever is there,
based on the filter, based on the slicer. Now I can click in a single cell
and make my chart histogram. Column, we’re going
to choose that one. Control and roll,
get rid of this. Right click, Hide
all field buttons. I don’t need this
legend right here. I definitely want that
with a solid line. Equal sign shoots me up there. I’m going to click in
E8, and look at that. Now I click on the column,
control-1, gap width. Maybe I can do some formatting. You can do whichever
formatting you want. Vary colors by
point, solid line, close that for the time being. Plus symbol, because I
definitely want data labels at the top, it looks
like I’m going to want those aligned differently. So control-1, I just closed it. And you might have
to click through, but that’s going to
be under Properties. Remember, when you’re
learning these, it just requires that
you click through until you find what you want. Text direction, I want to click
Rotate all text to 70, well, that’s kind of off to the side. But the third one there. Click OK. I definitely want
some axis titles. Right there I’m just going
to move the chart down, because I can’t quite get it. Equal sign, click
on Frequency, enter. I click on this equal
sign, grade, and enter. There’s some chart
junk, I delete. I may or may not want lines,
I don’t think I do, delete. Now control and roll. I have two charts and a slicer. And I can test this right now. Click on any one of
these and look at that. Now notice when I do that,
the data isn’t changing. And it’s because this
pivot table is not connected to this slicer. So I would like to right click
the slicer, Report Connections. And look at that, all the
pivot tables available, I can just check them. Now we’re going to do one
more pivot table and chart, so I’ll have to come
back and do that again. But when I click OK,
that is beautiful. Our data dashboard is
starting to take shape. All right, so we have
our second pivot table and our second chart. I want to select all three
charts and the slicer. I’m going to hold
the Control key and click on each
one of these objects. Control-X to cut. Come over to Grade Dashboard,
a sheet I inserted and named. And I’m going to Control-V. Now
we can click outside and then click in move each one of these. And now I’m going
to grab the slicer. And there’s a great trick. You can hold the Alt-key. And once you’re holding
Alt and moving it, it’s jumping and fitting. And I’m going to try
and jump and fit it to the B column, row 2. Now I want to see
the size of these, so I’m going to go
up to Slicer options. And I’d like to make each
one of these 3 inches, enter, by 5 inches, enter. Click outside here. I’m going to do the same
trick with the Alt-key. I want to click and drag and
try and get it right to the J. And this is a chart, so I’m
going to come up to Format. There’s 3 and 5, enter. Let’s try that here too. Click and hold Alt. I’m going
to drag it to right below 17 and on the J. Format. It’s got a 3 and a 5, enter. And over here, I will click–
hold Alt and click and drag. It looks like B and 17. Format 3 and 5 and enter. Then I’m going to highlight
all of these cells add a color, a background
color, something super light. And actually maybe I will
right click 2 and insert. And then I want to
type a title up here. It looks like this
slicer I’m going to hold Alt and click and drag. There we go. And now I have my Data
Dashboard emerging. If I click on each one
of these, I can clearly see everything is moving. That is absolutely amazing. Now we have our Control-S,
our data dashboard and all of our
pivot tables running those charts and that slicer. And then here’s our data set. And I want to go over,
and we have some file, if we come over to
our Windows Explorer. These were the two
original text files. And we have a Folder Start file. But when you downloaded, it’s
Import 01 Text CSV files. I’m going to go to New. And now I’m going
to highlight both of these, Control-C to copy. Going up here to
Import and then start. Control-V. There are four text files. And if we look up here, we told
Power Query to look in start files and find extensions .txt. That was part of
our memorized query. So when we come over here– and
I’m going to Control down now– so right now there’s
about 3,000, Control-Home. Right click Refresh, and
it’s working, there it goes. Control-down arrow. We can see we now
have 9,000 records. Instead of refreshing
each one of these, I’m going to go up
to the Data tab. And under Refresh All, I’m
going to say Refresh All. There’s actually a
keyboard, Control-Alt-F5. And hopefully now, when we
go over to our dashboard, you’ve got to be kidding me. That is amazing. I can go all the way back to the
academic year ’88 to ’89, 2001. I can click on 2005. Wow, look at that. So if I’m looking
through this, it looks like– here’s
all these grades, boom. Not very many zeros down here. And then all of a sudden, right
here, look at all those zeros. So now the next step, this
is part of the analysis. We’re analyzing this. We might want to go
in, and my suspicion is right around 2005, 2006, is
when their online classes first began. And so maybe there
was some problems with online– understanding
how to do online. So we might go back to
our original dataset and see if we could
get a field that says is this in class or online. Oh, man, but that is amazing. Look at that. A Data Dashboard. We have our slicer,
our three charts– one on categorical data,
the actual category of the class and
some calculations, mean, and standard deviation
and our frequency distribution for continuous quantitative
data in our histogram, all based on some pivot tables
and a Power Query import. Now when we come back
in our next video, we’re going to see
something similar. But this time, we’re going
to import multiple Excel files using Power Query. All right, we’ll
see you next video.

17 thoughts on “Basic Excel Business Analytics #28: Power Query: Import Multiple Text Files & Build Grade Dashboard

  1. Great video !!!!!!!! Maybe just as an interesting fact, When I was doing a dashboard I had pivot tables coming from worksheets and powerpivot as well. The pivot tables from different sources could not be connected to the same slicer

  2. I didn't think 'refresh all' updated pivot tables unless it was the one selected???… Is it that you clicked the sub-menu 'refresh all' instead of the big button or a newer behavior or am I just crazy???

  3. @ around 12:30 the trick with the diagram title from a cell doesn't work for me 🙁 I have Excel 365, does this make a difference? I put the equal sign into the diagram title field but when I click it just selects the cell.

  4. Many thanks Mike, you are amazing!
    Can you give me any advice if I want to use median instead of mean?

  5. great video!  Do you have any more dashboard creation vids? I did a search but this is all I found. Thanks for your response

Leave a Reply

Your email address will not be published. Required fields are marked *