Cloud Networking for the Hybrid Enterprise (Cloud Next ’19)


ZACH SEILS: Welcome, everyone. Good afternoon. Thanks for joining
our session today. This session is cloud networking
for the hybrid enterprise. My name is Zach Seils. I’m here with my
colleague Matt Nowina. Matt and I are both
networking specialists in the customer
engineering organization within Google Cloud. So we work on a daily basis
with customers such as yourself to help kind of onboard
them, understand networking within Google Cloud, how to best take advantage of our networking features, and, more specific to this session, how to best integrate with your existing on-premises networking environments. We do have the Dory Q&A
enabled for this session. So if you have the Next app,
you can click on Dory Q&A and pop questions into the app. We want to make sure we have
enough time for all the content in the presentation. So we’re not going
to do live Q&A today. But the Dory is open through
the end of the month, so through April 25. So if you have questions
during the session, if you have questions after the
session when you get back home, pop them into the app. Matt and I look forward to
interfacing with you online, and we may even blog about
a few of the questions. We’ll see what happens. So a quick summary of
what we’re going to cover in the next 50 or so minutes. I’m going to start
out by talking about common networking concepts that everyone is familiar with from your on-premises networking environment, and how those
map into the Google Cloud. So if you have kind of direct
correlation between things, I’ll make the
connections for you. And I’ll really highlight
how some of those things are different in the
cloud versus what you may be familiar
with in terms of your on-premise network. Next I’m going to
hand it over to Matt. Matt is going to talk about
some common enterprise designs and some recommendations
for how you can approach those, talking in a little bit more
detail about the features that I’ve talked about, how
you actually leverage those based on our recommendations. And then finally, we’re
going to come back together and we’re going to talk
about some best practices. So what we typically
recommend to customers as they start their journey
towards Google Cloud. All right, so let’s
jump right in. Again, I’m going to start with
kind of some common networking concepts, and I’ll
map those into what their equivalents are inside of
the Google Cloud environment. This is the current state
of Google’s global backbone network. So we have our own
private backbone network. It’s one of the largest private
backbone networks in the world. One of the unique benefits
of cloud networking is that within a few minutes,
through configuration that you define, this basically
becomes your infrastructure. So you can extend your
on-premise network environment literally across the
globe based on which cloud regions you deploy compute
and other workloads into. And that’s really the premise
of this presentation, which is how do you extend your
on-premise networking environment with things
that you’re familiar with, but do so in a way that’s
scalable, manageable, and hopefully with as little
design and operational burden as possible. So again, essentially, this is
basically your backbone network completely under your control. Let’s start with some very
familiar kind of constructs. So your on-premises data
center environment undoubtedly is made up of physical
network devices. These devices are
commonly purpose specific. So whether we’re talking about
a core switch, or an access switch, or maybe a
firewall, whether it’s physical or virtual, these
devices in your network are often put together in
a very specific topology based on those capabilities. And where those
devices are placed and how you connect them
is very relevant to how the network performs and
what capabilities it has. Now data center topologies
have evolved over time. You had your traditional three-tier kind of core, distribution, and access networks. And you had this resurgence recently. You have Clos-based fabrics where you have spine and leaf networks. But generally speaking, in your
on-premise network, topology, at least the physical topology,
is generally fairly static. So it doesn’t change that much. So we have the concept
here of the device. We have the concept of placement
being very relevant, important in terms of what those
capabilities are, and how they apply to the
things that you’re actually connecting to the network. Next, we have
different capabilities for how you virtualize that
physical infrastructure. So for example, you have
virtual LANs, or VLANs, that allow you to provide
virtual segmentation within your physical topology. You can even have isolation from
an IP forwarding perspective. So for example, you can
have VRFs, Virtual Routing and Forwarding instances,
and historically there's been some provider-specific implementations of these capabilities. So for example, maybe Virtual
Data Centers, or VDCs, and various different types
of overlay and underlay technologies. The premise here
is really the same. How do I basically
virtualize and isolate different capabilities on
top of the same physical infrastructure? So here, again, we have the
virtualization concepts– VLANs, IP forwarding domains,
like VRFs, and sometimes Virtual Data Centers, or VDCs. Moving on, we have what I refer
to as kind of network identity. So when you talk about how you
identify systems, particularly from a network perspective,
it’s no surprise we start talking about IP
addressing and subnets, and these kind of exist
pervasively everywhere. They’re in the public
internet, they’re in your private network, they
obviously exist in the cloud. But what I’m specifically
talking about in this context is the importance of what
an IP address actually means about the identity of a system. Like, I know these
ranges of IP addresses are my database servers, or these ranges of IP addresses are the DMZ where I
want to kind of apply a certain type of
traffic controls. And placement is also
important here, right? Oftentimes where something
connects to the network says a lot about what it is and
implies what types of access it has or it will allow. And much like there’s
a low rate of change with physical networks, there’s
typically a pretty low rate of change with the number
of subnets and VLANs and IP addressing schemes
that you configure in your on-premise network. I mean, certainly they
change and expand over time, but they’re generally
not changing at a high rate of volume
on a daily or weekly basis. Now, if we’ve gotten this far
and you're thinking, great, Zach, IP addresses and subnets. I'm super happy I
came to your session. Just be patient please. I promise we’re
actually going to help you think about these constructs
in a different way in the cloud that hopefully makes your
cloud journey a little bit more simple from a
networking perspective. Next, we have various different
types of traffic controls from a data center perspective. So you have firewalls. These again can be physical
or virtual devices. You have network-based
access control lists. You may apply these at
various points in the network. So these are more like
very network-centric. You create these based on IP
addresses, subnets, interfaces, et cetera. Sometimes you have
host-based filtering through agents or other
things you actually install on the endpoints. And finally, more recently,
you have kind of I think an emergence of what I
call drop by default fabrics or networks, where
multiple hosts are connected to the same network. And whereas they may
have historically been able to automatically
talk to each other if they could discover
each other, that’s no longer the case. These networks or
fabrics actually drop most unicast
traffic by default, which requires that
you must explicitly configure what communication
is allowed across that network. So I’ve got here a set of
controls for how I actually identify and filter traffic. So that’s where we kind
of came from, right? We have the devices. We have the importance of
placement of those devices based on their functionality. We have the network-based
virtualization technologies. We talked about VLANs. We talked about VRFs. We talked about access
control capabilities. And so, now let's start to map
these into their equivalents inside of Google Cloud. So the first one is
really the device. What does a device mean? What is the equivalent of
your on-premise network switch in the Google
Cloud environment? And there really isn’t one. So this is funny, for the
first one there’s really not a direct equivalent. And this is really a
byproduct of the fact that most everything inside of
the Google Cloud environment is heavily software-driven
and globally distributed. So there's really no analog for, I have this network switching device, what is its equivalent in the cloud? It just doesn't exist. Now that being
said, we do actually surface some
capabilities as devices. This is primarily from a
configuration and management perspective. So for example, when
you’re setting up routing between your network
and the Google Cloud network, you configure that routing using
something called Cloud Router. Now Cloud Router is presented
to you as a thing you configure, but it’s not actually a device. Behind the scenes it’s
actually a distributed set of software processes that
live within a particular cloud region. So you’re not actually
configuring a device. You’re basically
providing a configuration that we then program these
software processes with and distribute them
across the infrastructure.
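As a rough sketch of what that configuration step can look like through the Compute API, using the google-api-python-client library; the project, region, network, and ASN values here are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# "Creating" a Cloud Router is really just submitting configuration;
# Google programs it into distributed software processes in the region.
router = {
    'name': 'onprem-router',              # hypothetical name
    'network': 'global/networks/my-vpc',  # hypothetical VPC
    'bgp': {'asn': 64512},                # private ASN for the cloud side
}

compute.routers().insert(
    project='my-project', region='us-central1', body=router,
).execute()
```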
In a similar fashion, you have the VPC firewall, the native firewall built
into the virtual private cloud in Google Cloud. We present this to you from
a configuration perspective as a single global
list of rules. So it’s very simple
to configure, you can kind of see everything
in one list or one page. But in reality,
what we’re doing is we’re taking those rules
that you configure, and we’re actually pushing
them down and programming them at the host level on
the individual machines where your virtual
machines are scheduled. So really no equivalent
from a device perspective, but we do kind of
manifest that in a way that’s convenient for you
for both configuration and management. Next is mapping for network. So we have our
physical network, we have virtual networks in
our on-premise environment. What’s the equivalent
in Google Cloud? And this is the VPC,
Virtual Private Cloud. This is our global
VPC, and it’s really kind of a direct correlation
with your IP forwarding domain on-prem. A lot of customers have
just a single IP forwarding domain, which is you have one
routing space, one IP space, and there’s a lot of
similarities between that and the VPC in the cloud. So for example, the IP
addressing inside of a VPC and inside one of your
forwarding domains on-prem has to be unique. You generally avoid
duplicate IP addressing. Likewise, you don’t generally
automatically connect two networks together without
some explicit configuration. So in your on-prem
environment, you may establish connections
between networks. You may establish BGP routing
protocols between networks to actually explicitly
make communication happen. Very similar in the
VPC environment– VPCs are isolated
in that they do not communicate with other
VPCs inside the cloud unless you do something
explicitly to make that happen. One interesting
thing about the VPC is it’s a global
construct in Google Cloud. And what that means is
that the scope of the VPC, how big of a geographic
area it actually covers, is completely dependent
on how you deploy workloads in the cloud. So for example, if you start
out by deploying your workloads in one cloud region, that’s
really the scope of your VPC. But as soon as you start
deploying workloads in other cloud regions
across the globe, the network's scope immediately expands to include those regions as well. And so this is actually
pretty unique to Google Cloud, and it’s actually a really,
really nice implementation because it allows you to
basically instantly grow your network in global
scope, kind of going back to that global backbone
slide that I presented at the beginning, with a single
routing and firewall policy. So your network, your
VRFs on-prem, they map into VPCs in the
cloud environment.
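A minimal sketch of that growth pattern, assuming the google-api-python-client Compute API, with made-up project, network, and CIDR values; a real script would also wait for each operation to finish:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')
project = 'my-project'  # hypothetical

# A custom-mode VPC is a global construct with no region of its own.
compute.networks().insert(project=project, body={
    'name': 'global-vpc',
    'autoCreateSubnetworks': False,
}).execute()

# Subnets are regional; adding one in a new region is all it takes
# to extend the VPC's geographic scope.
for region, cidr in [('us-central1', '10.0.0.0/20'),
                     ('europe-west1', '10.0.16.0/20')]:
    compute.subnetworks().insert(project=project, region=region, body={
        'name': 'subnet-' + region,
        'network': 'projects/%s/global/networks/global-vpc' % project,
        'ipCidrRange': cidr,
    }).execute()
```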
Next, we have the concept of VLANs. So we only use VLANs in a very
limited fashion in the cloud. We don’t use them actually
within the data center switching fabric, meaning you
don’t put your virtual machines inside of a specific VLAN. We only use VLANs to actually
virtualize the connectivity back to your
on-premise environment. This is specifically
with a product we have called Cloud
Interconnect that allows you to have
high speed, low latency access from your network
to Google’s network. But you can actually create
multiple virtual links across those physical
circuits and terminate those in different VPCs
in different locations across the world.
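A hedged sketch of creating one such virtual link, a VLAN attachment, over a dedicated interconnect; all resource names here are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# One VLAN attachment: a virtual link over the physical circuit,
# terminating on a Cloud Router in a specific region and VPC.
compute.interconnectAttachments().insert(
    project='my-project', region='us-central1',
    body={
        'name': 'attach-prod-us',
        'type': 'DEDICATED',
        'interconnect':
            'projects/my-project/global/interconnects/my-interconnect',
        'router':
            'projects/my-project/regions/us-central1/routers/onprem-router',
    },
).execute()
```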
That's it. There's no VLANs anywhere else in our environment. There's no VLAN numbering. You don't need to think
about making sure that you put access lists or map
interfaces to a certain VLAN. That just doesn’t exist in
the cloud environment today. Next, we have subnets. So show of hands,
who in the room remembers deploying
workloads in Google Cloud before we had
VPCs and subnets? Anyone? A few people. So it’s true. It used to be one
big flat IP space across that entire global
backbone I mentioned. But as customers
started connecting to our environment in more
different geographic locations, we needed the ability
to provide them with optimized access
from a routing perspective to the closest
cloud region where they’re deploying workloads. And that’s really the
purpose of the subnet inside of Google Cloud. I would advise you
to think about it as less of an
isolation mechanism, less of a segmentation
mechanism, and more of an identity
regionally for instances that you deploy in
certain cloud regions. So for example, you
can deploy resources in a cloud region
on a single subnet so long as that
subnet is big enough from an IP addressing
perspective to handle all of your workloads. So all of your
Kubernetes clusters, all of your virtual
machines, et cetera. That’s really the
genesis, that's the primary purpose of subnets. They're really not intended to
be kind of a first-class entity to identify systems. It’s about efficient routing
between your environment and our environment. So your subnets, they map
into subnets in Google Cloud, but the purpose of
Google Cloud subnets is really about regional
identity or regional proximity of resources. Then we have IP addresses. So our IP addresses
within your VPC, they're regional constructs. So you create a subnet
within a particular region, and it has a unique set of
IP addresses for that region. We also have public
IP addresses that are used for our global
load balancers that are internet-facing. For the regional
IP addresses, these are by default automatically
managed by the cloud. So we automatically allocate
IP addresses to machines, we automatically
allocate IP addresses to Kubernetes pods, et cetera. So I want to actually
ask you to do something. As you kind of go
through this session, I want you to think about what
if you don’t have the ability to explicitly define the
IP address for a given virtual machine? This is already very
common in Kubernetes. You can’t specify that
a Kubernetes pod has an explicit static IP address. Assume you can’t do that
with a virtual machine. How are you going to
handle forwarding? How are you going to
handle access control, micro segmentation? We’re going to talk about some
of these things a little bit later. And finally, the equivalent
for the firewalls in your environment or in
general access control, we have a number of products. These are, again,
globally distributed, and they really kind of
span layer three controls, so building access controls
based on IP addressing, all the way up to
layer seven controls. So does this user, who’s
part of this group, have the right access
to actually interface with this cloud API? So there’s a complete
spectrum of access controls that cover a lot
of different parts of the stack from the
network all the way up to the application level. So again, this is
important when we start talking about
segmentation and the ability to kind of abstract how
we identify and control a system, separate from
where it’s actually placed in the network. And so with that, I’m
going to turn it over to Matt, who’s going to talk
about some common enterprise design scenarios and how
we can approach those. MATT NOWINA: Thank
you very much, Zach. I don’t know how many of
you are public presenters, but this is normally
the part where you get stuck looking
into the lights and forget everything
that you’ve prepared. But what we really want to
talk about at this next stage is how do we start mapping
these analogs into Google Cloud solutions specifically. So here you get a
snapshot of what the Google Cloud
networking product portfolio looks like today. There’s 20+ products
and services all focused on enabling your
journey to the cloud. We’ve grouped these into
different sections that represent the way that
customers are thinking about the cloud– so connecting,
scale, optimize, security, and modernize. And so, what we’re going
to do in this next section is we’re going to touch through
a series of these products and sort of answer questions
that customers come to us with. So in this section, what
we’re going to focus on are the Cloud
Interconnect, VPN, and VPCs. So one of the first
things that customers think about when
they’re coming to us is how do I take advantage
of this new network. Oftentimes when you have
on-premises environments, you have fixed infrastructure and fixed locations, but you don't have a way to extend your applications to where your customers are, or improve speed and latency. And so, that's where, as Zach was mentioning before, global VPC comes into play. This is a way of
simplifying the ability to connect to all of the
regions where you deploy things, makes it simple for you
to enable replication across your applications,
as well as leveraging Google-managed services that
are built from the ground up on a multi-region deployment
model and the associated availability. The next thing, once we’ve
started to map out the network, we start to think about,
well, how do we fit this into our operational model. So by show of hands,
how many of you no longer have dedicated
networking or security teams? So there’s a few. And we know the
industry is moving towards DevOps models
or DevSecOps models, but until we fully
move into that, we have to think
about how can we take our workflows in our
on-premise environment today and map those over to the cloud. This is where shared
VPC comes into play. So shared VPC is
designed around the idea that we have different teams
that deploy our applications and different teams that
manage our networks. And we don’t just want to give
free rein to the application teams to deploy networks
as they see fit. We still want to have
some level of control. So shared VPC is
enterprise friendly. It’s a centralized
model, and it allows you to centralize
your administration and auditability. So now we’ve got a global VPC. We have a deployment
model that works. As Zach mentioned before,
we want to look at– I mean, we’re a
hybrid enterprise. How are we going to
interconnect into GCP? And it’s important to note
that you need different models depending upon what
you’re going to rely on that hybrid connectivity for. Are you just doing management? Do you need to do batch data
loads at certain times of day? Or, is this going to become
a mission critical part of the application? This is where things like
Cloud VPN, Cloud Interconnect come into play, and being
able to map your requirements and your costs to the
exact implementation. So what does it look
like when we start to put these things together? This is an example– a zoomed in, simplified
example here– but starting on the left-hand side, you
can see an on-premises network with four dedicated connections
coming into Google Cloud. These are coming into
two separate regions that are both accessible
through the global VPC. That VPC is stored within
a shared VPC host project, and we can share
individual subnets out to the service projects. So again, it’s a
relatively simple way of leveraging
physical connectivity, ensuring a four nines
SLA, and giving you centralized administration. So from here, one of the
other common questions that we get, and I was sort
of confused when I first heard this, but there’s
some customers who are surprised to hear that our
managed services are typically hosted on the internet,
on external IPs. Now Zach and I would probably
be among the first to argue that simply putting
something on a public IP really is about accessibility
and not security. But for customers
that have invested in these interconnects,
that want to leverage this private
connectivity for accessing the managed services,
it makes sense to have an option to do so. And so, that’s where we start to
introduce private Google access and private service access. So what these are are ways
of extending those managed services to your
on-prem environment. I’ll pause as people
take pictures. But what it really
looks like is this. So that same interconnect
model that you had before, what you’re now
able to do is to– as a part of your dynamic
peering with Google, you’re able to advertise
a restricted VIP back to your on-premises environment. So this is a special IP range
that will come from your VPC to your on-premises environment. And then through
the use of DNS you can swing services over
to that restricted VIP. The three basic models for using this are: the first is to do an enterprise-wide CNAME of *.googleapis.com to the restricted VIP. If you don't want this to apply to all of your services, what you may choose to do is implement a DNS view, so a view only for certain clients implementing the exact same redirection. Or you can even implement it at the host level.
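As a sketch of that first, enterprise-wide model, assuming Cloud DNS private zones via the google-api-python-client: a private zone overrides googleapis.com for the VPC, pins restricted.googleapis.com to its documented 199.36.153.4/30 range, and CNAMEs everything else to it. Project and network names are hypothetical:

```python
from googleapiclient import discovery

dns = discovery.build('dns', 'v1')
project = 'my-project'  # hypothetical

# A private zone that overrides googleapis.com for one VPC.
dns.managedZones().create(project=project, body={
    'name': 'restricted-googleapis',
    'dnsName': 'googleapis.com.',
    'description': 'Send Google APIs to the restricted VIP',
    'visibility': 'private',
    'privateVisibilityConfig': {'networks': [{
        'networkUrl': 'https://www.googleapis.com/compute/v1/'
                      'projects/%s/global/networks/my-vpc' % project,
    }]},
}).execute()

# restricted.googleapis.com lives on 199.36.153.4/30; CNAME the rest to it.
dns.changes().create(project=project, managedZone='restricted-googleapis',
                     body={'additions': [
    {'name': 'restricted.googleapis.com.', 'type': 'A', 'ttl': 300,
     'rrdatas': ['199.36.153.4', '199.36.153.5',
                 '199.36.153.6', '199.36.153.7']},
    {'name': '*.googleapis.com.', 'type': 'CNAME', 'ttl': 300,
     'rrdatas': ['restricted.googleapis.com.']},
]}).execute()
```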
So we've talked about connectivity. Now I want to spend just
a few minutes on security, and more specifically
on Cloud Armor, IAP, and VPC firewall rules. So no one is going to consider
leveraging a cloud service provider if they don’t
have complete faith in the implementation
of firewalling. GCP VPC firewalls provide
a micro-segmentation model, because they are effectively implemented at the host level. What that means is two instances
running on the same physical host cannot connect to one
another without traversing that firewall. And the default stance of that
firewall is to deny ingress. At the same time, these
rules are stateful. So they’re simpler to maintain
than more traditional stateless ACLs. And we can see the
segmentation model implemented in the following slide. So again, this is a
relatively simplified model here, but when I started
thinking about this– I have a question. How many people have
had to implement 802.1x in an on-premises environment? So there's a few people. This was port-based
network access control. So this was introduced at a
time when we no longer knew what endpoint was going
to plug into what switch, so where in the network
they might be, and be able to dynamically configure
that port with the associated rules for that endpoint. Well, the same thing can be
done with VPC firewall rules. Through the use of tags
or service accounts, you can have your
endpoints inherit the rules that they need in order to
access the correct endpoints. And you can do this
very similar to 802.1x, but without the pain of
having to manage TACACS.
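A minimal sketch of that kind of identity-based rule with the Compute API; the tag names and network are hypothetical, and a service-account flavor of the same rule would use sourceServiceAccounts and targetServiceAccounts instead of tags:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# The rule follows workload identity (tags), not switch ports or IPs:
# any instance tagged 'db' accepts 5432 from any instance tagged 'web'.
rule = {
    'name': 'allow-web-to-db',
    'network': 'global/networks/my-vpc',
    'direction': 'INGRESS',
    'allowed': [{'IPProtocol': 'tcp', 'ports': ['5432']}],
    'sourceTags': ['web'],
    'targetTags': ['db'],
}

compute.firewalls().insert(project='my-project', body=rule).execute()
```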
So with this combination we have granular rules that are applied
dynamically within our VPC, and then as we start to
push out towards the edge, we have Cloud Armor and
Identity Aware Proxy. Cloud Armor will extend our
defense-in-depth strategy by adding layer seven DDoS protection and
web application firewalling to our apps. And IAP provides a way of
extending our applications to only specific users. In fact, we also
introduced quite recently a blog post that
talks about using IAP in place of bastion hosts. So now you can open up port 22
to your individual hosts that you want to manage,
but ensure that’s only exposed through to
certain identities without the need of having
a separate management host.
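A sketch of the corresponding firewall rule, assuming IAP TCP forwarding: port 22 only needs to admit IAP's published source range rather than the whole internet. Resource names are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# SSH reachable only from IAP's TCP forwarding range; who may actually
# connect is then decided by IAM identity, not by network position.
rule = {
    'name': 'allow-ssh-from-iap',
    'network': 'global/networks/my-vpc',
    'direction': 'INGRESS',
    'allowed': [{'IPProtocol': 'tcp', 'ports': ['22']}],
    'sourceRanges': ['35.235.240.0/20'],  # IAP's documented range
    'targetTags': ['ssh-via-iap'],
}

compute.firewalls().insert(project='my-project', body=rule).execute()
```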
Then the next part, and this is a really critical one, because we've all heard
repeatedly these stories about misconfigurations
of access rules on a managed service
that end up exposing data to unauthorized clients. So how do we address this? Well, Google’s actually taken
a two-pronged approach to this. The first is a set of open
source security tools called Forseti, which give you the
ability to establish inventory, policies, and remediation
actions whenever changes are made within your environment. And the second is
VPC Service Controls, which allows you to establish
a trusted perimeter model around your VPCs and projects. So this is what it
looks like in practice. So here we have a project in
the VPC, and the associated services that we
want to protect. We also, through
the same mechanism of that private Google
access, have the ability to extend these services to
our on-premises environment and say, that’s a part
of our trusted perimeter. So when VPC Service
Controls is enabled, only resources from
within the perimeter can interact with
those services, and they can’t be
used to copy those to any external
unauthorized projects and prevent access
from the internet. So this protects against
that misconfiguration. Lastly, all of these
security controls are great, but centralized logging and SIEM
solutions are not going away. So this is where VPC
flow logs and firewall logs come into play. The VPC flow logs provide
you with NetFlow-style data, tcpdump-like information
without the payload that gives you information
about the flows within your VPC environment. And firewall logs
give you insight into what's being allowed and blocked by your VPC firewalls.
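A hedged sketch of turning flow logs on for one subnet; the patch requires the subnet's current fingerprint, and all names here are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')
project, region, subnet = 'my-project', 'us-central1', 'subnet-us-central1'

# Fetch the subnet first: its fingerprint is required for a patch.
current = compute.subnetworks().get(
    project=project, region=region, subnetwork=subnet).execute()

compute.subnetworks().patch(
    project=project, region=region, subnetwork=subnet,
    body={
        'fingerprint': current['fingerprint'],
        'enableFlowLogs': True,  # NetFlow-style records for this subnet
    },
).execute()
```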
So now that you have a sense of the various products, the analogs to the on-prem,
what the options are in GCP, what we
want to do is try something a little different. As network
specialists, Zach and I get the chance to see
many different customer configurations. And then this next section,
we wanted to try and verbalize the thought process
that we go through when considering different sets
of customer requirements. Before we start
that, I just wanted to give you a few quick
points in terms of VPC design pre-work and recommendations. So the first is identifying
who your stakeholders are. This can vary depending
upon who you’re trying to design this VPC for. Is it for an individual
application, a line of business, your
entire organization? It’s important to
understand who you’re trying to address, and
make sure that you really understand their requirements. The second is to start with
security objectives and not security controls. Many times we see
customers come to GCP and say, how do I do X in GCP,
rather than thinking about why they’re doing X. So by starting
with the security objectives, you have a very
clear understanding of what you’re trying
to achieve and what your options are in the cloud. The next is understanding how
many VPCs you’re going to need. And I don’t mean coming
up with a static number, like 5, 6, 7, 10. The important thing
here is to understand what you’re trying to achieve,
where your scale and quota limits are going
to play, and get an overall understanding
of what magnitude you’re going to have to address. And lastly, think simple. Don’t design things
just because you can. I mean, we all know that
simplicity is directly correlated with supportability. So keeping it to
exactly what you need is going to be important. Anything you want to
add on this slide? ZACH SEILS: I think the point
of number three is relevant. It’s not a static number
you’re trying to get. It’s really a
pattern that you’re trying to use so that you can
grow and scale your network environment inside
the cloud in the most efficient manner possible. MATT NOWINA: So let’s start
with a simple scenario here. Zach, what do you see in
this initial scenario? ZACH SEILS: Sure. So again, the idea here
is this is basically our day job, right? So how do we kind of think
about approaching this out loud? So obviously here I’ve
got a single VPC, global, pretty straightforward. Looks like I have both
development and production workloads in the same VPC. They’re across
different regions, so they’re in different subnets. Single project, which
is also relevant when we look at how we scale out. And the other thing
that’s apparent here is it looks to me
like this is predominantly a cloud isolated workload. I don’t see any hybrid
connectivity back to the on-premises environment. So those are kind of the things
that initially pop out at me. MATT NOWINA: Any
best practices you think we can take
away from this design? ZACH SEILS: Yeah. I mean, I think a couple here
that really resonate with me, something you just mentioned,
which is start out simple. So I think this is a
very common approach that customers who are
either new to cloud can take just so they kind of
get their bearings in the cloud environment. Or, even customers that
are coming from other cloud providers, and
they’re just trying to get familiar
and have experience with how some of our networking
constructs may differ. So for example, the global
VPC, how does it actually behave in practice? So simplicity is a key one here. I think the other one
here, too, is just the small number of subnets
with larger address ranges. There’s no really reason to
kind of over-rotate and start creating a bunch of
subnets within a region. You can start with one with
adequate IP address space. We have a great feature
where you can actually grow the size of
your existing subnets in a completely hitless manner
to the VMs that are already deployed.
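A sketch of that hitless grow operation; the new range must contain the existing one, and the names here are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# Expand in place: running VMs keep their addresses; the new range
# simply has to contain the old one (e.g. a /20 grown to a /16).
compute.subnetworks().expandIpCidrRange(
    project='my-project',
    region='us-central1',
    subnetwork='subnet-us-central1',
    body={'ipCidrRange': '10.0.0.0/16'},
).execute()
```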
This doesn't preclude you from creating multiple subnets. This is just what we see and
what we recommend for customers who are just getting started. One nice thing
here is that things are relatively programmatic
and easy to deploy and manage inside of Google Cloud. So even for core
infrastructure, stuff that takes a lot of
planning and a lot of effort to implement in an
on-premises environment, they can actually be pretty
disposable and pretty easy to kind of delete
and recreate inside of the cloud environment. So I think overall a
good starting point. One thing I do
notice here, though, and I think a pretty common
recommendation as customers start to scale in
the cloud, is really kind of starting to segregate or
put more kind of firm isolation between development and
production workloads. MATT NOWINA: So you
mean something that looks a little more like this? So in this design,
there is a couple of things that jump
out to me right away. I mean, the first is
exactly what you’ve sort of identified there. We’ve moved from a subnet
model as an isolation boundary between our
different environments, to the VPC level. So at the VPC level, we've now
segregated firewall rule management into
two separate VPCs. And then the other
big one that jumps out is the hybrid connectivity. So here we can
see that we’ve now deployed cloud routers,
dedicated interconnect, and VLAN attachments into
each of the individual VPCs. ZACH SEILS: Yeah. I mean, just a point kind of
on the hybrid connectivity. If you notice here, we have
actually separate connectivity coming from on-premises
into each VPC. And this is virtual connectivity. So if you're using
VPN, for example, these are separate VPN tunnels. If you’re using
interconnect, this can be the same physical circuit
that you pair Google’s network on, but different logical
connections, those interconnect attachments I
mentioned previously that terminate in the
separate projects. You kind of extend the isolation
of the different environments all the way through
the connectivity back to your on-premises environment. So it’s really a clean kind of
separation which works well. MATT NOWINA: And as you’ve
mentioned in previous slides, because we now have
separate independent VPCs and there is no inherent
connectivity between them, we need to start thinking
about how does our application deployment look, where
are our build servers, where are we actually
deploying from. If we’re deploying from
on-premises, this works. We’ve got connectivity
between them. But if the build servers are
sitting in one of those VPCs, we now have to
start to look at VPC peering or connecting
those VPCs together in order to allow for that build
process or that deploy process to succeed. And when we start to think
about that peering, we need to start thinking about, well, what are the aggregate
resource requirements when we mesh these
two VPCs together. So when you think about
this design, what’s the next logical extension? Where do people go from here? Especially if we
wanted to sort of align with that workflow
framework from earlier. ZACH SEILS: Yeah. So one thing I noticed
when Matt asked who doesn’t have network
or security teams anymore, there was like a few hands. And so, if I assume the inverse,
I assume that most of you still do. And so, one thing that we see
if we go to the next slide is this concept of shared
VPC that Matt talked about. And so, to me,
shared VPC is as much an organizational construct
as it is a set of technologies that you leverage in the cloud. And what I mean by
that is that shared VPC is designed for
organizations where you want to maintain
centralized administration and control of the network
and the security functions in the cloud. And so, shared VPC does that. Shared VPC is still
fundamentally a single VPC, but it’s a single
VPC that can be leveraged by multiple
projects, and it has a curated set of
permissions, IAM permissions. So there is a specific role for
network admins in the shared VPC, and they control,
as you might imagine, creating subnets, establishing
hybrid connectivity, establishing routing policies. There is an explicit
role for security admins in the shared VPC
model, who control firewall rules, et cetera. So if your organization
is structured in that way, and that’s an
organizational construct you intend to carry
over into the cloud, the shared VPC is a really
nice model for this. So here you can see we’ve got a
couple of things in play here. We’ve got actually
multiple different VPCs all still within a single project. And then the shared VPC model,
we call this a host project. So your host project has all
of your networking and security stuff– the VPC, the firewall
rules, the connectivity back to your on-premises
environments. And then you have one or
more service projects. Service projects are
separate projects that are usually given to the
application or development teams. In those separate projects, they
can spin up their own compute, they can spin up their
own Kubernetes clusters, they have the autonomy to manage
their workloads themselves. And these service projects
attach to the shared VPC to leverage those resources. So the networking and security
teams maintain control over the network and security,
the application teams, they manage their workloads,
they manage compute, they manage Kubernetes
clusters, et cetera.
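A minimal sketch of wiring that up through the Compute API, where the host and service project IDs are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# The networking team's project becomes the shared VPC host...
compute.projects().enableXpnHost(project='net-host-project').execute()

# ...and an application team's project attaches as a service project,
# so its VMs and clusters can land on the host project's subnets.
compute.projects().enableXpnResource(
    project='net-host-project',
    body={'xpnResource': {'id': 'app-team-project', 'type': 'PROJECT'}},
).execute()
```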
So I mean, in terms of best practices, just kind of thinking
about this a little bit– number one is I like simplicity. So we’ve got a
pretty simple model. It’s still the single VPC
kind of model, if you will, which again kind of just
reinforces the point that we want to
kind of start to think about non-networking constructs
as the primary identifier for workload. So we don’t want just
the fact that you happen to be connected to
a VPC to say everything about what you can do. There’s a better way to
address that, which we’re going to talk about in a little bit. Another nice thing
about shared VPC is you actually have
pretty granular control on which portions of the VPC
are visible to the service projects. So for example, you can say
this particular group of service projects for service
XYZ, they can only see this particular subnet
or set of subnets in the VPC. So you can kind
of help people not make mistakes in terms of
deploying their workloads in the wrong location. So this works really well. It’s actually a
very common thing. It’s pretty popular with
enterprise customers, mostly, as I
mentioned, because of that organizational alignment. But a question, I guess,
back for you is what if you need to kind of scale this out? Because fundamentally we’re
dealing with a single VPC, what if things go
really well and you need to kind of scale up
to maybe tens of thousands of virtual machine instances? MATT NOWINA: Yeah, so
I think that’s when we start to see a move towards
this model, where now we are separating out
the host projects and going with a single
VPC per host project. This allows us to
more accurately align VPC and project quotas to
an individual host project, and allow them to scale
independently of one another. So no longer are you
going to be worried about if a dev resource
spins up things that your project relies
upon, because now they’re managed independently. The other big thing that this
design starts to introduce is segregation at the IEM level. So in the previous models when
you’re using the security admin role, within that
host project you would have had the ability
to modify firewall rules across any of the VPCs. In this model they’re now
independent of one another. So we can have different users
mapped to each of those host projects, and we’re starting to
see a scale out pattern here. What we’re talking about is
building that host project segregation at the
environment level. So prod, test, dev. But we could continue to
make this more granular if our application
requirements demand it, where we can create host
projects on individual line of business or application. ZACH SEILS: So
just so I’m clear, I just want to make sure I
understood one thing you said. So the segregation of
the IAM permissions, that's because the
permissions are associated at the project level? MATT NOWINA: Correct. ZACH SEILS: OK. So by moving into
separate projects here, you're gaining the ability to
kind of delegate administration to different parts of
the cloud environment. MATT NOWINA: That’s right. As we continue to move out
in this scale model, what you can see is now
we are increasing the number of cloud
routers, increasing the number of VLAN attachments. And while it’s a software
defined network on the Google Cloud side, on the
on-premises side we need to think about
how we’re going to be managing those connections. So what if it’s a pain
on the on-premises side? What can we do to help
optimize for that? ZACH SEILS: Yeah,
it’s a great question. It’s a frequent feedback
we hear from customers. It’s like, it’s great. The network in the cloud
is software defined. I can spin things up and delete
things with relative ease. But when it starts to talk
about hybrid interconnectivity with your on-premises
environment, that may not always be the case, right? So, getting the appropriate permissions to change configuration, add additional interfaces, create new BGP peers. Sometimes customers
want to avoid this. And what they’re
really after basically is trying to leverage not only
the same physical connectivity with the Google Cloud network,
but the same logical or virtual connections with the
Google Cloud network. And so, we move on to
a different scenario here, where the actual VPC structure, with dev, test, and prod in
multiple projects, that really stays the same. The one big difference
here is we’ve actually leveraged another VPC
that we’re terming here in this presentation
a connectivity VPC. Now, this is not a separate
type of VPC that you check a box and say, I want this
to be connectivity. This just happens to be
a normal VPC that we’re using in a very specific way. So what we’ve
actually done here is we’re actually moving all of
the hybrid connectivity out of the individual
environment-based projects, and we’re putting that
into this connectivity VPC. And then we peer
the connectivity VPC with all of those
environment-based projects. So this actually
relies on a feature that we’ve just
recently released called VPC peering custom routes
that allows us to propagate routes that are
learned dynamically from your on-premises
environment across VPC peering relationships. So the routes that you
advertise into the Google Cloud environment that
specify what networks in your on-prem
environment are going to be reachable
from the cloud, we can now propagate those
routes all the way down to these environment-based VPCs.
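A hedged sketch of one side of that setup, assuming the networkPeering form of addPeering with custom-route flags; all project and network names are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')

# Connectivity VPC: export the dynamic routes learned from on-premises.
compute.networks().addPeering(
    project='connectivity-project', network='connectivity-vpc',
    body={'networkPeering': {
        'name': 'to-prod',
        'network': 'projects/prod-project/global/networks/prod-vpc',
        'exchangeSubnetRoutes': True,
        'exportCustomRoutes': True,
    }},
).execute()

# Environment VPC: import those custom routes from the peer.
compute.networks().addPeering(
    project='prod-project', network='prod-vpc',
    body={'networkPeering': {
        'name': 'to-connectivity',
        'network': 'projects/connectivity-project/global/networks/'
                   'connectivity-vpc',
        'exchangeSubnetRoutes': True,
        'importCustomRoutes': True,
    }},
).execute()
```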
So a pretty powerful capability, a nice way
same physical connectivity, but to really remove a lot of
the kind of hybrid connectivity constructs out of
those environment VPCs. Another common thing
I see as customers grow, a lot of customers,
especially larger enterprises, they may have multiple business
units or business entities inside the organization. And those entities pretty much
want to operate autonomously within the cloud, except
when it comes to paying for hybrid connectivity. They almost always
want to share that. So I want to use the same
kind of high speed connections I have with the Google
Cloud environment even though I may be an entirely
different business entity. So where we have
this segregation here by environment type, like
dev, test, and production, you could also think about that
to be separate logical business entities or small peer groups of
VPCs for a particular business entity. Name them whatever you want. Inevitably in that case,
there’s almost always services that you want to share
across those business groups, whether it’s active directory,
or source code repositories, or the CI/CD pipeline,
those resources are also a good fit for
this connectivity VPC because it has peered access
to all of those downstream VPCs as well. So this acts–
again, this is really just about how you
use the networking constructs in the cloud versus
just going and saying this is going to be like
a connectivity VPC. The scope and use
of the VPC really defines its purpose inside
of the cloud environment. MATT NOWINA: So it’s
not like you’re limited to a single connectivity VPC. This is actually a
scale-out pattern. ZACH SEILS: Yeah, you
could scale it out. I’m working with
one customer now where they have 13 independent
business entities inside of their organization. Each one has their own kind of
connectivity VPC, if you will, and then they have a set of
small group of downstream VPCs that are environmentalized,
so dev, stage, production, et cetera. So I think we kind of–
we have one more, right? One more. So I think maybe we’ll do
a little curve ball here, just thinking about
what customers commonly kind of come to us with. I’m going to give this one
to you just because I can. So what if a customer needs
to bring a third-party network capability or device into the
cloud as a virtual machine? MATT NOWINA: Why would
you ever need that? ZACH SEILS: Let’s
just say it’s needed. How would that
influence and change the way you’re doing VPC
design and some of the things that we’ve talked about? MATT NOWINA: So I
mean, wouldn’t it be nice if there were cloud
native solutions for everything we wanted? But I mean, the reality is
that for any number of reasons there are going to be times when
you need to bring appliances into the cloud. You need to start to think
about what those deployment models look like. Now, the icons are a
little small at this point, because we’re starting to
group these things together. But what we wanted to
sort of call out here is the idea that
typically, these devices require multiple NICs. So you can think of an NGFW that
you wanted to do layer seven inspection on, or
something that’s going to act as a router
between your various VPCs. But there are specific
rules within GCP that actually require us
to modify the VPC design. So here, what you’re seeing
is that multi-NIC requires different VPCs for each
of the interface cards, and all of those VPCs have
to be in the same project. ZACH SEILS: So sorry,
Matt, just one second. So when you say
multi-NIC devices, you’re talking about a virtual
appliance for a security vendor or something? So it’s multiple interfaces. MATT NOWINA: Exactly. ZACH SEILS: OK. MATT NOWINA: Yeah So the best
practices, the considerations you need to bring into
play when you’re thinking about these devices, though,
is how are you routing traffic to those devices, are you
modifying the default route, what are the implications to
accessing managed services when you’re no longer using
the default internet gateway, instead using
a third-party device, as well as thinking
about high availability. So how are you ensuring
that this device is up and forwarding packets? How do you health check it? Each one of these are
important considerations, but there is a deployment
pattern for using these devices within GCP.
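As a sketch of that deployment pattern: a multi-NIC appliance instance whose interfaces attach to subnets in different VPCs of the same project. Every name and image here is hypothetical; a real appliance image would come from its vendor:

```python
from googleapiclient import discovery

compute = discovery.build('compute', 'v1')
project, zone = 'my-project', 'us-central1-a'

instance = {
    'name': 'virtual-appliance',
    'machineType': 'zones/%s/machineTypes/n1-standard-4' % zone,
    'canIpForward': True,  # required if the appliance routes traffic
    'disks': [{
        'boot': True,
        'initializeParams': {
            # Placeholder image for the sketch, not a real appliance.
            'sourceImage':
                'projects/debian-cloud/global/images/family/debian-9',
        },
    }],
    # One NIC per VPC; all of those VPCs must live in this project.
    'networkInterfaces': [
        {'subnetwork': 'projects/%s/regions/us-central1/subnetworks/'
                       'untrust-subnet' % project},
        {'subnetwork': 'projects/%s/regions/us-central1/subnetworks/'
                       'trust-subnet' % project},
    ],
}

compute.instances().insert(project=project, zone=zone,
                           body=instance).execute()
```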
ZACH SEILS: Good stuff. So I think we've covered a number of different scenarios. I think this is
pretty commonly what we see from customers in
terms of how they start their journey really specific
to networking inside the cloud. Again, the premise here
is just start simple. It’s easy to kind
of expand and evolve over time as your requirements
change in the cloud environment. There's no reason to kind of
overly engineer from a design perspective. Also, think less about
the traditional networking constructs like placement in
the network and IP addresses and subnet membership as really
kind of primary identifier of a particular workload. As Matt mentioned,
we have capabilities within the firewall to
do micro-segmentation that can actually follow a
particular virtual machine instance or Kubernetes
pod regardless of where it happens to be
instantiated in the network. So this makes actually
creating the policies and maintaining those policies
inside of our environment much, much more simple
by leveraging these kind of abstractions for identity. And so, I think that– we have something else, right? You have something else
you want to talk about. MATT NOWINA: So I almost
wish I had a black turtleneck on for this part. As much as Zach and
I would like to think that a 40-minute presentation on
some example of VPC deployment scenarios would be enough
for you to sort of rethink your designs, we actually
have gone one step further. So we believe that as a
cloud service provider, the obligation is on
us to provide best practices for these things. You would have seen,
as we were going through these
deployments, that we had different sets
of best practices that go down the side. So I am very pleased to
announce during this session that we have just made live our
VPC best practices and design guide. So the links that
you see up here are links to all
the things that we think are important when first
moving into your VPC design decision. So we want you to
start by thinking about your organization, your
line of business, your project. What are you trying
to design for? And then use these links. So bit.ly/NET201-VPC for our
VPC best practices guide. NET201-ENT for best practices
for enterprise organizations. NET201-POLICY for understanding
policy design as it applies to enterprise customers. You may have also noticed
in the past few months we launched a new Coursera
course for networking in the Google Cloud Platform. We encourage you
to use that if you want to continue to
get your hands dirty, understand how these different
components work together. And then get started
today, build something, understand that this VPC
design that you implement now uses all the information that
you have available to you, but it may not
apply in the future. There may be iterations,
and that’s OK. And lastly, showcase
your skills. So another thing
that we’ve worked on in the last few months is to
launch the Google Certified Professional Network Engineer. And I’ve been told there’s
some pretty cool swag if you decide to go and write the test
either today or in the future. Additionally, we
have a special guest with us today who’s actually
the author of the VPC best practices– Mike Columbus, who will be
signing autographs if anyone wants. Mike, put your hand up. He’s over there in the audience
if anyone wants to talk to him. ZACH SEILS: Also, Matt, you also
worked on the certification. MATT NOWINA: Yeah. ZACH SEILS: Perfect. So one more slide. Again, the Dory’s open. It will stay open through
April 25, as we mentioned. Please add your questions
there now or in the future. We really look forward to
engaging with you online. And take the survey. Let us know how we did, let
us know what things you want to hear about in the future,
whether it’s in print, online, whether it’s in the
future Next sessions. We really, really appreciate
you taking the time to come to our
session, and we hope you enjoy the rest
of your week at Next. MATT NOWINA: Thank
you very much. [MUSIC PLAYING]
