Nothing But NAT - Part 4
Layer 2 NAT / L2NAT
Just when you think you’ve got a handle on NAT, Cisco comes along and introduces a technology called Layer 2 NAT, which of course also gets implemented in certain SKUs of Allen-Bradley Stratix. The purpose of this post is to introduce L2NAT, share what I think about its use, and point out where I think marketing (shocker!) has been misleading.
What is L2NAT and how is it different from traditional (Layer 3) NAT?
Pulled from the Allen-Bradley Stratix 5700™ Network Address Translation (NAT) [Publication ENET-WP032A-EN-E – August 2013]
Ok, noted on above. Are there any other issues L2NAT was created to solve?
Between Cisco Live presentations and info in Allen-Bradley marketing sheets, the marketers highlight the fact that the traditional NAT solution for multiple machines requires multiple boxes - one L3 NAT device per machine.
Exhibit 1 - from a Cisco Technical Marketing Engineer at Cisco Live: of course you can use an L3 device, but “This is a multibox solution. Sometimes requiring a L3 device per machine.”
Exhibit 2 - Allen-Bradley describes Layer 2 NAT implementation as allowing for a “scalable, high performance, single box solution.”
Well a single box solution DOES sound better. So what’s your beef?
With traditional (Layer 3) NAT, one of the inherent benefits of having a Layer 3 boundary is broadcast domain isolation between the Private and Public segments. You could incorporate, for example, 10/20/100 OEM machines into your plant network using L3 NAT and know that you weren’t adding to the broadcast domain of the plant network. The notes below suggest that with L2NAT, perhaps unsurprisingly, no similar broadcast domain isolation exists, and therefore adding these 10/20/100 OEM machines into a plant network would have a direct broadcast domain impact.
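To put rough numbers on that concern, here is a back-of-the-napkin sketch in Python. It is a simplified model, not anything from the Cisco or Rockwell documents: the function name, the 50 existing plant hosts, and the 15 devices per machine are all assumptions I made up for illustration.

```python
def plant_broadcast_domain_size(plant_hosts, machines, devices_per_machine,
                                layer3_boundary):
    """Count hosts sharing the plant broadcast domain (simplified model)."""
    if layer3_boundary:
        # L3 NAT: each machine exposes only its NAT device's public-side
        # interface to the plant broadcast domain.
        return plant_hosts + machines
    # No Layer 3 boundary (standalone L2NAT, single VLAN): every device in
    # every machine shares the plant broadcast domain.
    return plant_hosts + machines * devices_per_machine


# Assumed values: 50 existing plant hosts, 15 devices per OEM machine.
for machines in (10, 20, 100):
    with_l3 = plant_broadcast_domain_size(50, machines, 15, layer3_boundary=True)
    without_l3 = plant_broadcast_domain_size(50, machines, 15, layer3_boundary=False)
    print(f"{machines} machines: L3 NAT={with_l3} hosts, no L3 boundary={without_l3} hosts")
```

The absolute numbers mean nothing outside of these assumptions. The point is that one grows by one host per machine and the other grows by every device inside every machine.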
Pulled from Deploying Network Address Translation within a Converged Plantwide Ethernet Architecture, Design and Implementation Guide, April 2016 [Publication ENET-TD007B-EN-P]
Ok yeah all that sounds terrible. Why would anyone use L2NAT then?
This is where things start to get murky. In reviewing the two documents previously referenced along with Stratix Managed Switches User Manual [1783-um007_-en-p], more nuance is provided. All of these documents have different architecture examples of L2NAT deployments and it’s typically a list like the following:
Single Skid/Machine Aggregated by One NAT Switch, Single VLAN
Single Skid/Machine Aggregated by One NAT Switch, Multiple VLANs
Multiple Skids/Machines Aggregated by One NAT Switch, Multiple VLANs
or
Example 1: Using NAT With A Layer 3 Uplink
Example 2: NAT In A Ring Topology With A Layer 3 Uplink
Example 3: Using NAT With A Layer 2 Uplink
Example 4: Machine to Machine Communication
There’s a subtle but CRITICAL difference in these architectures. While some employ what I would call standalone L2NAT, or what the marketing earlier referred to as a “single box solution”, many of the architectures necessitate a Layer 3 switch.
Furthermore in the two excerpts below, they’re saying they REALLY want you to route traffic through a Layer 3 switch as the standalone L2NAT application is only appropriate for when “Only a few inside devices need to talk to the outside network” and for “Small number of skids of machines.” Then another reminder is provided that with standalone L2NAT there is “No true Layer 2 segmentation between skids or machines and plant-wide network.”
So the “single box solution” like the one below, which they market as scalable, is something they seem to be explicitly calling NOT scalable in the image above. It doesn’t help that the IMPORTANT note saying the architecture below isn’t best practice seems to appear in only one of the multiple L2NAT documents and webpages I reviewed. All of the rest leave this guidance out, which seems problematic.
Any more complaints?
Let’s assume a customer missed the ONE document advising them to add a Layer 3 switch into the architecture and extended the use case above plant-wide to incorporate 24 skids/machines, each with its own Stratix 5700 NAT performing standalone L2NAT. And why wouldn’t they? Those other documents said this was a “scalable, high performance, single box solution”, right? Well, Layer 3 NAT and L2NAT share a common requirement:
A private-to-public translation for each device on the private subnet that communicates on the public subnet.
But standalone L2 NAT uniquely adds the following requirement:
A public-to-private translation for each device on the public subnet that communicates on the private subnet.
Think about that for a second. You deploy 24 Stratix switches and set up the NAT translations on day 1 for the Line Controller PLC to be able to communicate with each of the 24 machine PLCs. Six months later you stand up Ignition or name-your-favorite SCADA application and would like it to communicate with each of the machine PLCs as well. In the Layer 3 NAT world, unless the rule you wrote was explicitly source restrictive (not the default), you have no further work to do. SCADA would already be able to reach those machine PLCs using the same private-to-public translation that the Line Controller is using. But in standalone L2NAT we now need to TOUCH EACH OF THOSE 24 Stratix configurations to add the missing public-to-private translation for SCADA.
In the L2NAT architecture where you add a Layer 3 switch, this cumbersome requirement goes away in favor of a simplified, specialized Gateway Translation parameter.
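The difference in day-2 effort can be sketched with a toy Python model. To be clear, this is not a vendor API or real Stratix configuration: `NatSwitch`, the mode names, and the host names are all hypothetical. It also bakes in two assumptions: the L3 NAT rule is not source restrictive, and in the Layer 3 uplink case the new SCADA host is reached through the Layer 3 switch whose gateway translation is already configured.

```python
from dataclasses import dataclass, field


@dataclass
class NatSwitch:
    """Hypothetical model of one per-machine NAT device (not a vendor API)."""
    name: str
    mode: str  # "l3_nat", "l2nat_standalone", or "l2nat_l3_uplink"
    public_to_private: set = field(default_factory=set)

    def needs_edit_for(self, public_host: str) -> bool:
        # Assumption: only standalone L2NAT needs a public-to-private
        # translation for every individual public-side host.
        if self.mode != "l2nat_standalone":
            return False
        return public_host not in self.public_to_private


def build_plant(mode, machines=24):
    # Day 1: every switch already has an entry for the line controller.
    return [NatSwitch(f"skid-{i:02d}", mode, {"line-controller"})
            for i in range(1, machines + 1)]


def switches_to_touch(switches, public_host):
    """Names of the switches whose config must change for a new public host."""
    return [s.name for s in switches if s.needs_edit_for(public_host)]


for mode in ("l3_nat", "l2nat_standalone", "l2nat_l3_uplink"):
    touched = switches_to_touch(build_plant(mode), "scada-server")
    print(f"{mode}: {len(touched)} switch configs to edit")
```

Under those assumptions, only the standalone L2NAT plant reports 24 configs to edit when the SCADA server shows up. The other two report zero.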
So make no mistake, the L2NAT version that is best practice REQUIRES the purchase, deployment, and ultimately use of... ANOTHER box - a Layer 3 switch. In other words, the BEST PRACTICE is NOT standalone L2NAT and NOT a “single box solution.”
So, you hate L2NAT?
Not exactly.
Cisco or Rockwell will say the L3 NAT version of solving the same situation as the image above requires one L3 NAT appliance per skid/machine (refer back to Image 2) but I would counter:
1) You could utilize a Hirschmann RSPx5, a Moxa EDR, a Phoenix Contact NAT Switch, etc. to cover both the switch and the L3 NAT needs, many of which can also address the Ring architecture support mentioned at the end of the L2NAT pitch in Image 1. Also, I can’t help but notice one might be able to call THIS a single box solution insofar as the way AB uses the term, but THIS would be a single box solution with broadcast domain isolation and no requirement to define all public hosts in NAT translations :)
2) You could do something like we did in Nothing But NAT - Part 3, where you use whatever switches you want in the skids and configure overlapping NAT in a single NGFW (or more than one in a cluster for redundancy) to accomplish the same thing. I might argue the “management” is a little easier with all the machines’ NAT rules being in a single appliance vs spread out across all the individual L2NAT switches. This uses FEWER boxes, as we’re using a single NAT appliance to cover multiple machines :)
Ultimately, I accept L2NAT as another tool in the toolbelt to evaluate for a given application. I just wish it was marketed/messaged differently. As the situation stands, my neck hurts from the whiplash of marketing L2 NAT as a “scalable, single box solution” and elsewhere flip-flopping that it shouldn’t be a single box solution as best practice.
Thanks for coming to my TED talk!