Friday, 15 September 2023

NSX 4.x Site Resiliency model - Continued...

 

Hello everyone!

Let’s continue our discussion of the Site Resiliency model that NSX offers, this time through Federation rather than Multisite. In this post we will look at the benefits and improvements that Federation brings.

NSX Federation (a Quick Brief)

Unlike the NSX Multisite architecture, NSX Federation does not require the MTU over the WAN, or on the provider side, to be raised from the typical (default) value to 1700+. That is a big change in infrastructure configuration requirements.
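As a minimal sketch of the MTU difference described above, the helper below records the WAN MTU values quoted in this article (1700 for Multisite end to end, the default 1500 for Federation) and checks a configured MTU against them. The names REQUIRED_WAN_MTU and wan_mtu_ok are hypothetical and not part of any NSX API, and the numbers are the ones used in this post, so please verify them against the design guide for your release.

```python
# Hypothetical helper, not an NSX API: WAN MTU values as quoted in this article.
REQUIRED_WAN_MTU = {
    "multisite": 1700,   # Multisite: 1700+ end to end, including the WAN
    "federation": 1500,  # Federation: typical default MTU over the WAN
}


def wan_mtu_ok(model: str, configured_mtu: int) -> bool:
    """Return True if the WAN MTU meets the minimum quoted for the model."""
    try:
        required = REQUIRED_WAN_MTU[model.lower()]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None
    return configured_mtu >= required


if __name__ == "__main__":
    # Compare both models against a WAN link left at the default MTU.
    for model in ("multisite", "federation"):
        print(model, "with WAN MTU 1500:", wan_mtu_ok(model, 1500))
```

With a WAN link left at 1500, the check fails for Multisite and passes for Federation, which is exactly the infrastructure difference discussed above.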

NSX Managers can be in different geographical locations despite the 10ms RTT limit, because global objects are pushed down to the Local NSX Managers by the Async Replicator Service, through the Application Proxy Hub (APH) offered by the Global NSX Managers. This service replicates only between clusters at different sites, not amongst the nodes inside a single cluster.

The RTT between Global Manager nodes within the same NSX Manager cluster must not go beyond 10ms, whereas the Active and Standby NSX Manager cluster instances can be up to 500ms RTT apart, as shown in the figures below for each scenario. The scenario below is the stretched NSX Global Manager architecture (Active Global Manager cluster only).

Fig 1.1

The figure above shows the round-trip time in milliseconds amongst the NSX Manager instances.

The figure below shows the cluster-to-cluster “Async Replicator” activity that synchronises the sites and supports the stretched architecture used in Federation.

Fig 1.2

Another major benefit of Federation is that you don’t need to configure a bigger MTU across sites (as overlay encapsulation normally requires). It stays at the default. Within a site (local, not stretched), however, you still need to configure the larger MTU (9000+).

Edges across sites see each other as RTEPs (Remote Tunnel Endpoints), and between them the MTU can be brought down even further, to 1500 or less. The picture below gives a high-level overview of a Federation.

Fig 1.3

There are many different options and scenarios to explore when designing an NSX Federation solution. Below are some that I will explain in a bit more detail.

Federation with Stretch Active Global NSX Manager Cluster

This scenario is useful and feasible only where the sites / regions are not too far apart, with an RTT between 1ms and 10ms (or up to 150ms in the case of NSX 4).

In such a scenario, you can build a Global Manager cluster (Active only) with one GM instance in each site, making it active per site with GSLB integration, and place the LMs (Local NSX Managers) in the same topological model, as explained in Fig 1.1.

It doesn’t require an additional vCenter Server per site; a single vCenter Server is sufficient in this scenario [Ref: NSX Design Guide 4.1 v.1.3 – Multisite page 27]. It is best suited to metropolitan network scenarios or intra-city branch / data-center availability zones. There is not even a need for vCenter Server ELM.

Federation with Stretch Active/Stand-by Global NSX Manager Cluster

This scenario is useful and feasible only when your organization’s regions are quite far from one another, with more than 150ms RTT. In this scenario, it is recommended to put all the NSX Managers (Global or Local) in the same DC and connect the Local NSX Manager clusters (one per site) at a maximum of 10ms/500ms RTT from one another, as shown in Fig 1.3 above.

It is also worth noting that, with the introduction of NSX 4.1.x, the maximum RTT amongst NSX Manager clusters is now 500ms instead of 10ms [Ref: NSX Design Guide – Multi-location page-81].
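To tie the two scenarios together, here is a minimal sketch that maps a measured inter-site RTT to one of them. The function name suggest_gm_design and the returned labels are hypothetical, and the thresholds (10ms, 150ms for NSX 4, 500ms between manager clusters) are simply the figures quoted in this article, not values read from NSX.

```python
# Hypothetical helper: thresholds are the RTT figures quoted in this article,
# not values read from NSX itself.
STRETCHED_GM_MAX_RTT_MS = 10         # stretched Active GM cluster
STRETCHED_GM_MAX_RTT_NSX4_MS = 150   # upper bound the article quotes for NSX 4
CLUSTER_MAX_RTT_MS = 500             # between NSX Manager clusters (NSX 4.1.x)


def suggest_gm_design(inter_site_rtt_ms: float, nsx4: bool = True) -> str:
    """Map a measured inter-site RTT to one of the two scenarios above."""
    if inter_site_rtt_ms < 0:
        raise ValueError("RTT cannot be negative")
    stretched_limit = (
        STRETCHED_GM_MAX_RTT_NSX4_MS if nsx4 else STRETCHED_GM_MAX_RTT_MS
    )
    if inter_site_rtt_ms <= stretched_limit:
        return "stretched-active"
    if inter_site_rtt_ms <= CLUSTER_MAX_RTT_MS:
        return "active-standby"
    return "exceeds-quoted-limit"


if __name__ == "__main__":
    # A few sample RTTs, in milliseconds, to show where each one lands.
    for rtt in (5, 120, 300, 600):
        print(f"{rtt} ms ->", suggest_gm_design(rtt))
```

Treat the result as a starting point for the design discussion only; the actual choice also depends on GSLB, vCenter placement and the other points covered above.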

Thanks for your valuable time 😊 My next topic will be how to recover the NSX 4.x Global Manager (Active cluster) if it becomes unavailable and is replaced by the Standby Global Manager.

As always, I look forward to your valuable input on all the articles I write 😊

Thursday, 17 August 2023

NSX Cross Site Design and Architecture

 

The NSX Multi-site Architecture 

By – Adnan  Hussain (VCIX – NV) 

Hi there! If you are looking for Multisite topologies based on application availability across sites, whether in the form of stretched clusters or an Active/Active or Active/Standby site architecture, then you are at the right place.

I am writing this article as a series of episodes so that I can stay connected with you all and we can discuss different aspects through this portal, address your questions and learn more.

In order to keep applications available across multiple sites, even from on-premises to public cloud tenancies, you need to keep your infrastructure ready, expandable and responsive to different architectural challenges and changes.

VMware is a pioneer in providing infrastructure software such as




All of the above technologies run on software logic with no dependency on hardware make and model (only the x86 architecture is needed).

I will discuss each of the technologies listed above individually in detail, but first let’s start with NSX and its Multisite capability.

NSX Multisite is basically divided into two major topological structures, and you can choose one depending on your requirements.

1. Multisite – Traditional architecture 
2. Federation – Advanced architecture 

Multisite – Traditional Architecture 

This is simple to understand and deploy, supports a maximum of 3 sites, and offers two options within this topology:

1. Single-site NSX Manager Architecture 
2. Multi-site NSX Manager Architecture 

So let’s discuss the single-site NSX Manager architecture in more detail.

In this case, all instances of NSX Manager must be in the same site, while the data plane, including ESXi hosts and Edge Nodes, spans multiple (three) sites.

  • The Round Trip Time (RTT) between NSX Managers must not go beyond 10 milliseconds if stretched across sites, racks or datacenter(s).

  • The Round Trip Time (RTT) amongst Transport Nodes must not go beyond 150 milliseconds if stretched across sites, racks or DC(s).

  • And, in this topology (Multi-site), you must configure an MTU of 1700+ end to end.

The diagram below illustrates the high-level construct of physical and virtual network binding under the traditional NSX multi-site requirements.


In light of the above explanation, let’s dig into some more options and possibilities.

Let’s say we have a high-bandwidth application that requires Active/Active instances on two different sites located far from each other, approximately 5000+ km apart (across countries).

What would be the deployment strategy for infrastructure services and network design in this case? Note that the applications are core internal business apps and are not required to be published through 3rd-party ISP/cloud services.
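Before looking at the design, it helps to estimate what 5000 km means in RTT terms. The sketch below is a rough lower bound only: it assumes light travels through fiber at about 200,000 km/s (roughly 200 km per millisecond) and ignores routing detours, queuing and equipment delay, so real-world RTT will be higher. The function names are hypothetical, and the 10ms and 150ms limits are the Multisite figures listed earlier in this article.

```python
# Assumption: signal speed in fiber is roughly 200,000 km/s (about 2/3 of c),
# i.e. about 200 km per millisecond. Real links add routing and queuing delay.
FIBER_KM_PER_MS = 200.0

# Multisite limits as listed earlier in this article.
MANAGER_MAX_RTT_MS = 10          # between NSX Manager nodes
TRANSPORT_NODE_MAX_RTT_MS = 150  # amongst Transport Nodes


def estimate_rtt_ms(distance_km: float) -> float:
    """Lower-bound RTT estimate: there and back over the fiber path."""
    return 2 * distance_km / FIBER_KM_PER_MS


def check_multisite(distance_km: float) -> dict:
    """Compare the estimated RTT with the two limits quoted in this article."""
    rtt = estimate_rtt_ms(distance_km)
    return {
        "estimated_rtt_ms": rtt,
        "managers_can_stretch": rtt <= MANAGER_MAX_RTT_MS,
        "transport_nodes_ok": rtt <= TRANSPORT_NODE_MAX_RTT_MS,
    }


if __name__ == "__main__":
    print(check_multisite(5000))
```

Under these assumptions, 5000 km works out to about 50 ms of RTT at best: far beyond the 10ms allowed between NSX Manager nodes, but still within the 150ms allowed amongst Transport Nodes. That is why the managers cannot simply be stretched between two such sites.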

In the example above, the infrastructure for the application should follow the design below, or something close to it, with the following justifications.


The above information is derived from VMware online resources for VCF.


There is more to discuss about this design, which we can go through one by one at a steady pace so that the information is easy to digest.

My next topic in this series will be site resiliency models and their impact when using the multi-site architecture. Stay tuned 😊 …
 
Follow me on LinkedIn: www.linkedin.com/comm/mynetwork/discovery-see-all?usecase=PEOPLE_FOLLOWS&followMember=adnan-hussain-69750823




















