Azure Arc for data services, including SQL and PostgreSQL (Microsoft Ignite)


(energetic music) (audience applauds) – Hello, everyone, and welcome back to Microsoft Mechanics Live! Coming up, we’re gonna take a deep dive on Azure data services as
part of the new Azure Arc, which provides a central way to provision and then manage your databases, whether they sit in your data center, in Azure, or in other clouds, for both SQL as well as PostgreSQL. Now we’re gonna show you how it takes advantage of container-based architectures with Kubernetes for instant
database provisioning, updating and elastic
as well as hyperscale. So, please join me in
welcoming back to the show, Travis Wright from SQL engineering! (audience applauds) – Thanks, Jeremy, it’s great to be back. – Thank you! So, we’ve just released
the private preview of Azure data services in Azure Arc. Now, what’s the significance
of what we’re doing here? And what are we really solving for? – Yes, this is a problem
that’s very familiar to anybody that’s working with databases
in a large organization. The variety and volume of data that you’re dealing with is just
constantly increasing, and it can be pretty overwhelming. And this data is oftentimes sprawled across many different locations, whether it’s in the cloud
or in your own data center. You’ve got a different
variety of database engines you’re working with and
somehow you need to manage all of this and keep it
all secure all the time. So let’s take a peek now
at what that looks like with Azure Arc helping
you out to do all this. So here I am in Azure Data Studio. And Azure Data Studio now
has this new panel here for Azure and I can connect in using my Azure Active Directory credentials and I can see this subscription here, and then inside of this subscription I can see my resources
that are up in Azure, like this SQL database instance here, this SQL Server running
on a VM in Azure here, an Azure SQL Database
Managed Instance here, and now with Azure Arc, I can manage my Postgres instances and
SQL Managed Instances running in my data center or in another public
cloud all in one place. So up here I’ve connected
to this demo instance right here and we can
just right click on this and run a new query and we’ll
run SELECT @@VERSION. That’s sort of a pretty
common query to see what you’re working with here. And let’s look at this. This is something that
might be sort of unexpected for somebody running a
SELECT @@VERSION query against the SQL instance
running on-premises. – [Jeremy] So what does this
mean then, Microsoft Azure SQL Database Managed
Instance - Azure Arc? – Yeah, so this is our new service that’s a different way for customers to get our Azure SQL
Database Managed Instance Service and run it wherever they want, whether that’s in their own infrastructure or in other public clouds and it provides built-in management capabilities
and continuous updates so that you’re always up to date and you never run out of support. – [Jeremy] So you’re able to run this then on any infrastructure, right? – Yeah, exactly. Let’s take a look at how that works. So this works by leveraging
containers in Kubernetes. First of all, you would
deploy some infrastructure and this could be your own infrastructure in your own data center where you purchase some OEM hardware, for example, or it could be rented infrastructure that you’re getting from a
public cloud provider. It doesn’t have to be
any particular special type of hardware, but we are working with our OEM partners, like
HPE, Lenovo and Dell, to have reference architectures designed for supporting these types
of data service workloads. Next, after you’ve done
that, you’ll define your persistent storage layer which is, of course, very important for data, and then on top of that you’ll deploy your Kubernetes distribution of choice. Once you have that laid down, then you can use Azure Data Studio
to pull down and deploy the data controller and control plane pods that then run on top of Kubernetes. And these pods provide
all the different services like identity and telemetry data collection, and they hook up to Azure to provide connectivity for consuming services like Azure Backup or
Azure Monitor, which we’ll see here in a moment, and
once you’ve done that, you’re then ready to start
using Azure Data Studio, the Azure Portal or the CLI to start to provision and manage your resources. – Okay, so everything then is wired up. Can you show us then how the provisioning would work to an existing cluster? – Yeah, absolutely. So let’s pop here into Azure Portal. There’s lots of different ways
to provision, but I’m gonna show you the example here
using the Azure portal. So I’ll just go into the Marketplace, just like I would go into the Marketplace to provision any type
of resource in Azure, and we’ll just do a
search for Azure Arc here, and we can see the two
different data services we have available today,
Azure SQL Database Managed Instance, and Azure
Database for Postgres. We’ll just click on the
Managed Instance offer here. Click on Create and this will take us into a form that looks
very much like the form that you would use to provision a managed instance into Azure itself. In this case, I’m going to
choose my subscription, a resource group and then I’ll provide an instance name for my instance. I’ll choose an Azure location to store the metadata about this resource, and then this is the only thing that’s really any different now is that there’s this new Arc location prompt here, and this prompts me to choose
where I want to deploy, whether that’s into some
other public cloud like AWS or I can deploy into my own
data center here in Orlando. Then we just provide the usual credentials to be the administrator on the machine, and we click Review
and create, and this is now going to be a deployment
experience which is just like how you provision
any type of resource into Azure except with
one key difference here which is we’re not deploying
into the Azure infrastructure. We’re actually deploying
into my data center back in the lab in Redmond. So, now that I’ve done
this, let’s go ahead and go take a look at what’s
actually happened here. So I’ll run a kubectl command here to show me this pod that’s now running on my Kubernetes cluster in my lab. So this shows me that this pod has now been deployed and I can
run another command here, kubectl get service, that will show me the connection endpoint
that I need to connect to. So this is being exposed on port 12521. So we can copy that and we can pop over here now into Azure
Data Studio and create a new connection to this service here. So, I’m gonna connect to
this same IP address here as some of the other ones
I’ve been doing before, 23.9, provide my connection
credentials here, and we’ll call this one demo2 and connect. So this shows you how easy it is to go up to the Azure portal, provision an instance into my infrastructure
from the Azure Portal and then quickly connect to
it from Azure Data Studio. – Very cool, so now you’ve got some nodes provisioned, the pods are provisioned via that resource manager
template from Azure. What can we do next now
that everything’s kind of up and running and talking
back and we can write to it? – Yeah, so at this point,
this is really just, in a lot of ways just like
any other resource in Azure, I can go and look at this resource. I can view the overview
properties about it. I can view the activity happening on it. I can control access to
it, tag it and so on, but my favorite feature I wanna talk about today is advanced data security, and advanced data security
provides two different services. The first one here is
vulnerability assessment and vulnerability assessment allows you to take these policies that are created by our Azure security
team that help you detect any sort of potential
vulnerabilities in your systems. So this is things like
port 1433 being enabled or maybe the default SA login’s enabled, or you have a weak
password, things like that, it’ll go through and scan your system and make suggestions on how to improve your security posture. And the second service down here is advanced threat protection, and advanced threat protection is again, like a set of policies that are curated
by our Azure security team that help you identify potential threats. It could be maybe somebody’s
logged in recently that hasn’t logged in for
six months or something, or maybe it’s that we’ve had a bunch of failed login attempts
and we wanna notify people at these email addresses
that we configure here about those potential threats, and to turn this on I
just toggle this button, I click save and now these policies are being downloaded and
applied against my Azure SQL database instances running
in my infrastructure, whether that’s in my data center, or in another public cloud, and this is the kind of thing that
used to only be available to your data services
running in Azure itself, but now you can run this
against your data services wherever they happen to be. – Alright, so you’re mentioning things like identities and permissions. How does that work then here? Does it use all the different providers? – Yeah, absolutely. So, you would use Azure
role-based access control, just like you would for
your Azure data services running in Azure, to
control who has access to these resources and
who can do what to them, and we’ll be using Azure Active Directory, on-premises Active
Directory and SQL logins to control who has permission to log into these data services
and then, in addition to that, you can use Azure
policy to define policies that define your desired
configuration states so you can enforce things
like naming conventions, tagging or quotas. – And this is part of that big story in terms of Azure Arc, you can actually instead of just monitor,
you can write back. So all these policies,
tagging and all that stuff is writing back to the pods in this case that you have there,
but you mentioned though then like any other data
service I think in Azure, there’s probably something
with high availability. Is this something that
we can achieve also, even if it’s in
my own data center? – Yeah, absolutely. Of course, high availability
is super critical to data. So we’ve got multiple layers
of redundancy built in. In a lot of ways it’s
just like the Azure SQL database service that you get in Azure, except that now it’s being
delivered on-premises in your own data center,
and so we provide a lot of the same capabilities,
like Always On availability groups, log shipping and backup and restore. – Alright, so why don’t
we move on to servicing ’cause I think that’s
part of where that kind of the clustering and the magic comes. How does servicing work? – Yeah, absolutely. So, let’s take a look at
just sort of how this works. First of all, when you deploy an Azure SQL Database Managed Instance
you have an option to deploy in a highly
available configuration like you see here on the screen, and when it comes time to apply an update, we’ll automate the process
of this by first upgrading a secondary, and then failing
over to that secondary. Once that’s been done,
then we’ll go through one at a time and upgrade the other pods within the availability group, and throughout this entire process, the load balancers that sit in front of this are directing the traffic to the right nodes at the right time to make sure that your applications continuously have
connectivity to that database. In addition to that, you can take each of your databases and set them to a desired compatibility level, and that makes sure that your applications perform the same way
regardless of the fact that the binaries are being updated behind the scenes, and we’ll be doing all of this automatically. We can define maintenance windows, or you can create your own maintenance windows and adjust them if you want, or we can let you do the updates manually,
but basically you’re in control of how this updating works. – Okay, so once
everything’s up and running, servicing’s kind of configured, all the maintenance
windows are configured, then how do we monitor
the service to make sure everything’s running
to our specifications? – Yeah, exactly, so we want to provide a local monitoring
capability so we have agents that are continuously
collecting this telemetry data across all of your data services instances as well as the
infrastructure and providing a unified view of that,
like in this dashboard that you see here where we can
see some SQL statistics here like transactions per second,
batch requests per second or wait statistics and
this allows me to just keep an eye on things locally,
even if I’m disconnected from Azure, but then if I want to I can optionally choose to send this monitoring data up to Azure monitor. So let’s take a look at
what that might look like. So I’ll just search for this same instance up in Azure, click on this here, and up here in Azure, you can see that we’re sending that telemetry data up and right here on the overview blade I can see statistics like what
my CPU utilization is, and we’ve got things
for memory and so on, and we’ll continue to build this out, and that allows you to have a consistent experience for how you monitor things in your own environment, but then you can also view all that data
up in the Azure Portal. – Okay, so in Azure today
we’ve got, beyond SQL, Postgres as well as MySQL. Is there a plan to bring those to Azure Arc as well? – Yeah, so we’re starting with Azure SQL Database Managed Instance
and Postgres Hyperscale, and then over time we’ll add additional data services based on customer demand. We’re not just bringing
sort of any Postgres, we’re actually bringing a special version of Postgres called Postgres Hyperscale. – Okay, so this is all
based on the Hyperscale tech, I think, from Citus
that Microsoft acquired about a year ago, a little
earlier this year, I should say. – Yeah, exactly, so that
was a little bit earlier this year, but the thing
about that is that it is open source Postgres,
but it’s a special extension that we include with
it that allows Postgres to scale out across
multiple nodes to give you really unprecedented query performance by spreading that out
across a big set of compute. So, well, you know, let’s
maybe take a look at that. – [Jeremy] Yeah, why
don’t we see how you would actually Hyperscale on prem ’cause that’s something I think that’s even new to me. – [Travis] Right, so,
provided that you have the right infrastructure,
obviously you need the capacity to do this, but here I’ve got an instance that’s running
and this particular instance has four nodes currently. This is an Azure Arc instance,
you can see that up here. I can click on configure over here and I can take this node and I can go from four nodes up to
50 nodes by 64 cores. So if my math’s right, that’s 3200 cores. – [Jeremy] And your math’s
actually pretty good this early on a Tuesday morning, but why would I need
something that powerful? – Yeah, so we’ve got Black Friday coming up, for example, surface devices are on sale I hear, so let’s sort of, in anticipation of that demand, we want to scale up for something like this and we can have that
additional compute capacity available and then once
the Black Friday period is over, we can scale back down so we’re not unnecessarily
consuming those resources. – [Jeremy] Alright, so
why don’t we go for it. Why don’t we scale this up
and see what that looks like? – [Travis] Alright, so
we’re gonna hit save here and then what we wanna do
is we wanna see and monitor how this data is being
rebalanced across those new nodes that have been added
into this environment. So, here we can see the
shard rebalancing view and this is how it kicked off. We can see here’s the original four nodes and how the shards are distributed
across those four nodes and you can see that each
of these shards is moving one at a time from one
of the original nodes to one of these new nodes that I’ve added for capacity purposes here,
and this whole process happens seamlessly behind the scenes and your applications
continuously have connectivity to that database without any downtime. – And the really cool thing here is if you’ve got queries that were underway during the rebalancing,
those continue to run even after the rebalancing.
And these flashing boxes, I’d just look at that all day actually. It’s pretty mesmerizing to see all of these things happen in real time. Now, is the same experience true for things like SQL as well as Postgres, or is it a little bit different? – Yeah, so let’s nerd
out a little bit here and drop into the CLI and
let’s run azdata -h. Now azdata is our CLI
tool for managing all of these data services and you can see that we have two command
groups, the Postgres command group and the SQL command group right there side by side and they have basically the same set of commands. So I can consistently manage both my SQL and my Postgres as well
as future data services right here from the CLI and
automate lots of things. So let’s run an azdata
postgres list to see the postgres instances I have
running in this environment. We got just this one right here and we can run an azdata postgres describe of that particular instance and we can see all the details about this
particular instance here and we can see that
currently this has four pods running and that’s kind of
the current state of things, but if I do azdata postgres edit on this instance, this
allows me to be prompted for how many nodes I
want to be scaled up to. – So can you scale it up to a hundred? – Easy, tiger! Geez. (laughter) I don’t even know how many
cores that is but let’s just start with four, okay, I don’t know how much capacity we have
in this environment here, but let’s go up to four there and if we do another describe on this,
we’ll see that there’s now going to be some additional nodes that are gonna be added into this, this is like a pending
scale operation here, and this will automatically scale this up and rebalance that data for me. – Very cool. So you’ve shown
us quite a few capabilities in terms of how scaling and management will work, but can you take us back
then to the bigger picture for managing databases, really anywhere on any infrastructure wherever it is. – Yeah, absolutely, so lets pop over here to the new Azure Arc
dashboard in the Azure Portal and this shows this sort
of world view of all the different Arc locations that I have, whether that’s Arc locations that are in my own data center or
in other public clouds like AWS, and it really
gives you kind of a visual on how we’re really taking
Azure data services anywhere. – Thanks, Travis, great
overview of how you can actually use Azure Arc
for your data services in Azure and really to bring
it to any infrastructure and keep it highly available, scale it, manage it, monitor it, all those things. Where can people go to learn more? – Yeah, so, please sign
up for the private preview and to be notified
about more information as it becomes available
at aka.ms/AzureArcData. – Thanks again, and of
course, keep watching Microsoft Mechanics for
the latest tech updates across Microsoft. That’s all
the time we have for this show. Thanks for watching and
we’ll see you next time. (audience clapping) (upbeat music)
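
The servicing sequence Travis describes in the show (upgrade one secondary, fail over to it, then upgrade the remaining pods one at a time) can be modeled in a few lines of Python. This is only a rough sketch under stated assumptions: the Replica class, the rolling_upgrade function, the pod names and the version strings are all hypothetical, made up for illustration, and are not part of Azure Arc, azdata or any Microsoft tooling. It ignores the load balancers and only tracks the order of operations.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical stand-in for one pod in an availability group.
    name: str
    version: str
    is_primary: bool = False


def rolling_upgrade(replicas, new_version):
    """Upgrade one secondary, fail over to it, then upgrade the rest one at a time.

    Returns the ordered list of steps so the sequence can be inspected.
    """
    steps = []
    primary = next(r for r in replicas if r.is_primary)
    secondaries = [r for r in replicas if not r.is_primary]
    if not secondaries:
        raise ValueError("a highly available configuration needs at least one secondary")

    # Step 1: upgrade a secondary while the primary keeps serving traffic.
    first = secondaries[0]
    first.version = new_version
    steps.append(f"upgrade {first.name}")

    # Step 2: fail over so the upgraded secondary becomes the primary.
    primary.is_primary = False
    first.is_primary = True
    steps.append(f"failover {primary.name} -> {first.name}")

    # Step 3: upgrade every replica still on the old version, one at a time.
    for replica in replicas:
        if replica.version != new_version:
            replica.version = new_version
            steps.append(f"upgrade {replica.name}")
    return steps


# Illustrative three-replica group; names and versions are invented.
replicas = [
    Replica("sql-0", "v1", is_primary=True),
    Replica("sql-1", "v1"),
    Replica("sql-2", "v1"),
]
for step in rolling_upgrade(replicas, "v2"):
    print(step)
```

The point of the ordering is that the old primary is only touched after a fully upgraded replica has taken over, which is what lets applications keep their connection to the database during the update.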

1 Comment

  1. That is super cool of course,
    But how you actually let Azure know where this DataCenter is located? And can we manage non-kubernetes resources?
