>> Hey, friends. I'm Scott Hanselman,
and it's Azure Friday. I'm here with Christian Wade.
We are going to be talking about Azure Analysis Services.
Some new features like AutoScale.
Thanks for coming back.
>> Thanks, Scott.
I'm excited to be here.
Thank you very much for having me back.
>> Absolutely. My pleasure. So, last time you showed me
some cool things that you did with
a huge data set of taxi data.
>> Yes.
>> I mean, you could use that again.
>> Yes. So, what I'm going to do is,
I'm going to show the user experience using
the taxi data set, just very briefly,
to remind the viewers what Azure Analysis Services is for.
And then we'll lead through to scale-out.
But before we get there,
we'll look at diagnostic logging
in order to figure out that we need scale out.
>> And I would say as someone who would
be creating these kinds of reports,
I would prefer to think
about scaling as little as possible.
It sounds like that's your goal as well.
>> Absolutely. So, we'll start
here with the user experience.
This is the new taxi data set.
Each row in this trip table represents
an individual taxi ride across a ten year period.
This trip count measure gives
us the count of all the rides.
I'm going to drag it onto the canvas and, bang.
This is over 10 billion rows.
This is about nine terabytes of data.
Last time I came here, it was two billion,
now it's 10 billion.
Every time I come here,
it's going to exponentially increase.
Anyway, so 10 billion,
nine terabytes of data, instant gratification.
If I want to see revenue per trip, for example,
and then break it out by zip code,
I just get instant response times.
It's so seamless.
And then, of course, we can visualize this on a map.
Latitude, longitude, and revenue.
Just spread this out a little bit bigger.
And there we go,
there we have all the pickup locations in New York.
Just instant, click it,
click it, drag it, drop it.
Direct analysis of a massive data set.
>> But it seems impossibly fast.
And you're saying that we might need to scale this.
It's already 10 billion rows.
It's already super fast.
>> Yes. So, there are some cases
where you might want to scale out to
replicas to support
maybe hundreds or thousands of concurrent users,
for example, or while we're
loading these massive data sets.
We want to take that sort of out of
the query pool so that we don't affect users.
>> Okay. So, then the need for scale is
less about the size of the data set,
10 billion or 20 billion rows,
and more about the number of
people who are interested in that data set.
>> Absolutely.
>> So, right now, there's one person, you,
looking at this data set on your servers.
>> Yes.
>> But if you run a company that has a bunch of analysts,
you might have 1,000 people looking at the data set.
>> That's exactly right.
And as you saw, the queries return very fast.
So, we actually support a lot of
concurrent users given the queries are so fast.
But there will obviously
be a threshold somewhere when you have
advanced analysts submitting advanced queries.
>> Okay. So, then it is identifying
that threshold and then dealing with it gracefully,
that is the point.
>> That is the point.
>> All right.
>> Absolutely. So, I'm going to switch over to
a diagnostic log and show you how we
would set up diagnostic logging,
to help us figure out that we need scale-out.
So, here, I'm in
the Azure Analysis Services server in the Azure portal.
And I'm going to scroll down
to the diagnostic logs section,
and I can simply click add diagnostic setting.
I already created one earlier.
>> It's interesting to point out, if I may,
that the diagnostic logs pane looks very familiar.
It's the one that I use when I do websites.
>> It's exactly the same one.
So, we get all of
the Azure diagnostic logging data across
the customer architecture on
Azure in one place, pre-integrated.
This was one of the real challenges
when doing this on-premises.
It wasn't even integrated
with the performance counter data,
let alone the data from
the rest of the custom architecture.
>> So, because Azure has
diagnostic logging, for lack of a better word,
as a service,
you, the Analysis Services team
in the Cloud, were able to enlist in their service.
>> That's what we did.
>> And since I already know how to use
diagnostic logging, I know how to do it now.
>> Absolutely. And it's like we're
taking all the credit and this was all pre-built for us.
All we did was hook into the pipeline and that was it.
>> And then I as the developer,
get to take credit as well
because a new LEGO brick has shown up.
>> A new LEGO brick has arrived.
>> Very cool.
>> So, here we are, and we can now
output to storage accounts,
as you and others are already familiar with;
to Event Hubs for wide integration with big data systems.
I like Log Analytics a lot. That's what I'm using.
So, I've hooked it up to Log Analytics,
I'll show that in just a moment.
And we can output engine events.
So, these are the most useful SQL Server
Analysis Services extended events.
Service events are things like
scale up, scale down, pause, and resume.
And metrics gives us the performance counter data, right?
And it's integrated in one place.
Now, I'll quickly point out that this was
quite a lot of work to set up on-premises traditionally.
You would have to use SQL Server
extended event session management to output to
an XEL file, which was a proprietary binary format.
You'd use special system functions to
access that data in SQL Server,
which would output it in XML format.
You would do complex parsing of it. It was a lot of work.
And it wasn't pre-integrated as we said.
So, here, this is much simpler.
>> I was talking with a friend recently
about the idea that the Cloud
has taken the second page of my resume,
and turned it into a checkbox.
>> Absolutely.
That's what it feels like.
Everything is just a little checkbox.
>> So, I can either embrace it
or I can feel bad about it.
But I choose to embrace it.
So, you click that button,
page two of your resume is gone.
>> You can embrace it.
Just click the checkbox and take the credit.
>> There you go. Nothing wrong with that.
>> Absolutely.
So, I've hooked this up.
Now, I'm going to switch over
to Log Analytics which we can access here.
It's just a resource here in the Azure portal,
as you can see.
And I've already selected my work space,
I've clicked on Log Search,
and then here, I get all of the events.
So, Log Analytics has its own query language which
generates a query for me on the fly as
I interactively use the user interface.
So, now, I'm filtering on all of the events
that are either the engine events or the service events.
And one of the things I really like about this is that
we can save our own queries.
So, for example, I've got a bunch of queries here.
Some scale-up queries,
some slow queries being submitted to the server.
But the thing that probably unlocks this
the most for business intelligence professionals,
because really, I'm touching just the tip of the iceberg.
There's a whole Operations Management Suite, et cetera.
But this little Power BI button allows me to download
a simple little text file with
an M expression that I can copy and
paste into Power BI desktop,
and now, I'm really at home
as a business intelligence professional.
So here, I've brought some events through.
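Once those records are exported, whether through the generated M expression or an Event Hub, they are just rows with a category column, which is what the Log Analytics query on screen is filtering on. A minimal sketch of the same filter in Python, with the record layout assumed for illustration rather than taken from the actual Azure schema:

```python
# Sketch: keep only engine and service events from exported diagnostic
# log records, mirroring the Log Analytics filter shown in the episode.
# The record shape below is an illustrative assumption, not the exact
# Azure Analysis Services log schema.

records = [
    {"Category": "Engine",  "OperationName": "QueryEnd",     "DurationMs": 420},
    {"Category": "Service", "OperationName": "ResumeServer", "DurationMs": 0},
    {"Category": "Metrics", "OperationName": "qpu_metric",   "DurationMs": 0},
]

wanted = {"Engine", "Service"}
events = [r for r in records if r["Category"] in wanted]

for r in events:
    print(r["Category"], r["OperationName"])
```

The same shape of filter works whatever tool consumes the export; Power BI Desktop is just doing this declaratively.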
>> So, I think we just had an inception moment.
>> Absolutely.
>> Are you using Power BI desktop
to manage your Power BI desktop?
>> Of course. Absolutely.
>> I should have known.
>> Absolutely.
>> So, it's just another business intelligence system.
I want to know which users are using my system.
What data are they using?
What's the health of my system?
There's so much insight I can gain about my system.
>> What better tool for a data analyst to use?
>> It fits the bill perfectly.
Absolutely. So, this is
an S4 server, which has 400 query processing units (QPUs),
which is an abstraction, of course.
It has up to 100 gigabytes of memory.
So, you can see it's doing okay on the memory.
It doesn't actually cross 70 gigabytes, but the
QPUs are maxed out throughout
pretty much this whole hour of sample data.
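The pattern Christian is describing, QPUs pegged at capacity while memory has headroom, is easy to check programmatically once the metrics are exported. A minimal sketch, assuming each sample is a (QPU, memory in GB) pair and using the S4 limits mentioned above; the thresholds are illustrative, not official guidance:

```python
# Sketch: flag a server as QPU-bound from exported metric samples.
# Assumes (qpu_used, memory_gb_used) tuples; thresholds are illustrative.

QPU_CAPACITY = 400    # S4 tier, per the episode
MEMORY_CAP_GB = 100

def is_qpu_bound(samples, qpu_frac=0.95, busy_frac=0.8):
    """True if QPUs sit near capacity for most samples while memory has headroom."""
    if not samples:
        return False
    busy = sum(1 for qpu, _ in samples if qpu >= qpu_frac * QPU_CAPACITY)
    mem_peak = max(mem for _, mem in samples)
    return busy / len(samples) >= busy_frac and mem_peak < MEMORY_CAP_GB

# One hour of five-minute samples: QPUs maxed, memory under 70 GB,
# matching the behaviour described above.
hour = [(400, 65), (398, 68), (400, 62), (400, 66), (399, 61),
        (400, 64), (400, 69), (397, 63), (400, 67), (400, 60),
        (400, 66), (400, 65)]
print(is_qpu_bound(hour))  # True for this sample
```

A server that trips this check with memory to spare is the shape of workload where adding query replicas, rather than a bigger tier, tends to be the right lever.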
And if we come down here,
we can see that I've got processing operations,
which are data refresh operations, right?
So, I've got these massive data sets
that I have to refresh.
And these are happening at
the same time as long-running queries.
These are long-running queries,
and there's a point on the line
for when the query finishes, right?
So, that's suggesting that there is some contention between
data refresh and long-running query operations.
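Spotting that contention in the exported logs comes down to finding queries whose time window overlaps a refresh operation's time window. A small sketch under assumed event shapes, using plain interval overlap on start/end times:

```python
# Sketch: find queries that ran concurrently with a data-refresh
# operation, from (start_seconds, end_seconds) intervals. The event
# shape is assumed for illustration.

def overlaps(a_start, a_end, b_start, b_end):
    """Two half-open intervals overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end

refreshes = [(0, 600)]               # a refresh running from 0s to 600s
queries = [(120, 300), (700, 720)]   # one query inside the refresh window

contended = [(qs, qe) for qs, qe in queries
             if any(overlaps(rs, re, qs, qe) for rs, re in refreshes)]
print(contended)  # only the (120, 300) query overlaps the refresh
```

Queries that show up in a list like this are the ones at risk of the lock-timeout errors discussed next.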
And so, if we come down to this one,
we can see the actual text of the logs,
we can actually look at the queries themselves.
And this query here happens to make no sense whatsoever.
This is another thing: you can
monitor the queries being submitted to your server.
If there's some crazy query
that really doesn't make any sense,
you can figure out who was submitting these queries.
Actually, in this particular case,
this user didn't know what he was doing,
but in other cases, you could figure it out.
>> So, if I said, "Oh,
this is an order n squared thing. Stop doing that."
>> Absolutely. Who's submitting
this dumb query on my server?
Oh, actually, no.
That user doesn't know exactly what he's doing.
Anyway, the point being that you have
all of the information here.
And I've also got an error event here.
So, this error is a timeout caused specifically
by contention between
data refresh operations and queries.
I've got a long-running query,
the data refresh operation wants to take a lock.
It waits 30 seconds,
and then it issues an error to the query.
So, in other words, this server
looks like a good candidate for scale-out.
>> I mean, it seems like there's a number of reasons,
and I'm speaking from a place of naivete,
but it just seems to me that the QPUs are maxed out there.
>> There's something going on.
>> Contention as well.
>> Absolutely.
And this will help you get to the bottom of it.
>> Let me guess. A check box.
>> A checkbox indeed.
Of course. If it's not a checkbox, it's a slider bar.
>> It's a slider bar. I accept
the slider bar as the answer to this question.
>> All right. So, in fact, let's take a look.
So, here we go back
to the Azure Analysis Services resource,
and now, we'll come up to the scale section.
I'll quickly point out that we can, of course,
scale up and down the SKU.
So, if I need some extra horsepower,
maybe for just half an hour while I'm
loading a massive data
set at three o'clock in the morning, I can get it.
When I'm done, I scale it back down again.
And the promise of the Cloud,
that I only pay for what I actually
require, is a beautiful thing, right?
And the same applies to not only scale up and down,
but also to scale out.
So, for scale out,
I can click here on replicas,
and guess what, it's just a simple slider bar.
I can choose up to seven read-only replicas.
I can also choose to separate
out processing from the query pool, which means
that I can perform those data refresh
operations without impacting queries.
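How many replicas to pick is workload-dependent, but a back-of-envelope starting point can come straight from the concurrency numbers discussed earlier. A hedged sketch; the per-replica capacity figure is purely illustrative and should be replaced with measurements from your own diagnostic logs, while the cap of seven comes from the slider shown here:

```python
import math

MAX_REPLICAS = 7  # current cap on the portal's replica slider

def suggested_replicas(peak_concurrent_queries, queries_per_replica):
    """Rough starting point for the replica count; tune against real metrics.
    The primary server also serves queries (unless processing is separated
    out), so one server's worth of capacity is subtracted from the total."""
    needed = math.ceil(peak_concurrent_queries / queries_per_replica)
    return min(max(needed - 1, 0), MAX_REPLICAS)

# e.g. ~1,000 analysts at peak, assuming 150 concurrent queries per
# replica (an illustrative number, not a published figure)
print(suggested_replicas(1000, 150))
```

The real decision loop is the one the episode walks through: watch QPU saturation and contention in the logs, move the slider, and re-measure.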
>> And in this case, that appears
to be the primary issue, is that right?
>> That is the primary issue in this particular case.
Yes. And I'll just point out that this, kind of like
the diagnostic logging, was
extremely expensive and time-consuming to do on-premises.
Customers would have to set up
load balancers, set up virtual directories
and sets of servers, not to mention
the data copying to the replicas,
the data storage costs for large amounts of data.
This is now a simple slider bar.
It's just a prime example of how
the Cloud is making our customers' lives so much easier.
>> Is this available now?
>> This is available now.
It's generally available.
>> Fantastic. So, somebody out there
who's running analysis services just went to bed,
they woke up, and then there's
now a checkbox and a slider bar.
>> There's a checkbox and a slider bar, and
their life is much simpler.
>> Absolutely brilliant.
I am learning all about scaling up and
out Azure Analysis Services here on Azure Friday.