Wednesday, May 22, 2019

IP Broadcasting and Multicasting in the Cloud

Broadcasting and Multicasting in the Cloud

In public clouds such as Amazon EC2, Google Compute Engine and Microsoft Azure, native support for multicast and broadcast is missing. In fact, on AWS it has been on the "to do" list since 2009 (see https://forums.aws.amazon.com/thread.jspa?messageID=280285). Broadcast and multicast are integral parts of today's network solutions, and this is a missed opportunity for all public cloud platforms.

Additionally, in public clouds Layer 2 access is generally limited by the design of VPCs, Security Groups and ACLs. This makes public cloud networking very different from a datacenter, where there is usually full L2 access within a VLAN, and reachability across VLANs through inter-VLAN routing (for example via SVIs).

Broadcasting, Multicasting, Anycasting & Unicasting

Before delving into broadcast and multicast, let's look at the most common addressing mode in IP networks: unicast. In unicast, two hosts on the network communicate with each other, typically in a client-server topology. The vast majority of Internet traffic is unicast, with servers answering a continuous stream of requests from billions of client endpoints (mobile devices, IoT devices, and traditional PCs and laptops). The main reason is the protocol in use: TCP is preferred because of its guaranteed delivery and recovery mechanisms, and TCP supports only unicast. UDP, on the other hand, can be used with unicast, multicast and broadcast packets.

In broadcast addressing mode (see RFC 919, October 1984), a packet is addressed to all hosts on a local network rather than to a single host.

In multicast (see RFC 966, December 1985), which is basically a more selective form of broadcast, a packet is addressed not to all hosts but to a group of hosts called a "multicast group". Multicast groups are dynamic: any host can join, leave and rejoin a group "on the fly" using a protocol called IGMP (Internet Group Management Protocol). A multicast group is identified by an IP address from the reserved multicast range (224.0.0.0 - 239.255.255.255).
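As a small illustration of the reserved range, the following sketch checks whether an IPv4 address can identify a multicast group. The function name is_multicast_group is made up for this example; the range 224.0.0.0 - 239.255.255.255 is written as the equivalent prefix 224.0.0.0/4.

```python
import ipaddress

# 224.0.0.0 - 239.255.255.255 expressed as a single CIDR prefix.
MULTICAST_RANGE = ipaddress.IPv4Network("224.0.0.0/4")


def is_multicast_group(addr: str) -> bool:
    """Return True if addr lies inside the reserved IPv4 multicast range."""
    return ipaddress.IPv4Address(addr) in MULTICAST_RANGE


print(is_multicast_group("224.0.0.1"))        # True
print(is_multicast_group("239.255.255.255"))  # True
print(is_multicast_group("192.168.1.10"))     # False
```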

Once a host has joined a multicast group, it starts receiving messages addressed to that group. The protocol most commonly used with multicasting is the User Datagram Protocol (UDP). UDP is a very flexible protocol that can work with any addressing mode. TCP, on the other hand, works with unicast only.
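To make the join step concrete, here is a minimal sketch of a UDP receiver joining a group with the standard IP_ADD_MEMBERSHIP socket option, which is what prompts the kernel to send the IGMP membership report. The helper names build_mreq and open_multicast_receiver, the group 239.1.2.3 and the port 5007 are assumptions chosen for illustration, and the network underneath must actually deliver multicast for anything to arrive.

```python
import socket


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: group address, then local interface address.

    "0.0.0.0" lets the kernel pick the interface.
    """
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Create a UDP socket bound to `port` and join the multicast `group`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on the same host to share the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what triggers the IGMP membership report.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


# Usage (hypothetical group and port; recvfrom blocks until a datagram arrives):
#   receiver = open_multicast_receiver("239.1.2.3", 5007)
#   data, sender = receiver.recvfrom(1500)
```

Leaving the group again is the mirror image: the same structure passed with IP_DROP_MEMBERSHIP.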

IPv4 has unicast, multicast & broadcast.
IPv6 has unicast, anycast & multicast; it has no broadcast, whose role is taken over by multicast.

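Anycast is a relatively newer addressing mode (kind of a subset of multicast) where a packet is sent to only a single host within a group, usually the nearest one. Please note that anycast is defined as a distinct address type in IPv6 only; in IPv4 a similar effect is achieved through routing, by announcing the same address from several locations.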

Broadcasting and Multicasting at Layer 2

At Layer 2, we deal with Ethernet, the most prevalent Layer 2 protocol in use today, and a PDU here is called a "frame". The Ethernet frame carries a source and a destination hardware address, called a MAC address, which is a 48-bit address usually written in hexadecimal, such as 01:23:45:67:89:01. It consists of 6 octets: the first 3 octets are the OUI (Organizationally Unique Identifier) and the last 3 octets identify the individual device. Within the first octet of the OUI, the least significant bit (b0) indicates whether the address is multicast or unicast, and the second least significant bit (b1) indicates whether the MAC address is locally or universally administered (locally unique or universally unique).

So, for example, in 06:00:00:00:00:00 the first octet (06) is 00000110 in binary. Its b1 bit is 1, which means this is a locally administered address and not universally unique.

Now, a destination MAC address in an Ethernet frame is considered unicast if the b0 bit is set to 0, and a group (multicast or broadcast) address if b0 is set to 1. In the above example of MAC 06:00:00:00:00:00, the LSB of the first octet is 0 (06 = 00000110), so the MAC address is unicast. A frame sent to this address is a unicast PDU carrying a unicast packet and is meant to reach only a single host/NIC/node, unlike a broadcast frame, which is delivered to all hosts/nodes/NICs in the broadcast domain. For multicast this bit (b0) is also set to 1, with the caveat that the frame is meant only for those hosts which have joined the specific multicast group!
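The two bit tests can be sketched in a few lines. The function name mac_flags and the dictionary keys are made up for this example; the only thing taken from the text is the meaning of bits b0 and b1 of the first octet.

```python
def mac_flags(mac: str) -> dict:
    """Decode the b0 and b1 bits of the first octet of a MAC address."""
    first_octet = int(mac.split(":")[0], 16)
    return {
        # b0: 1 = group address (multicast or broadcast), 0 = unicast
        "group": bool(first_octet & 0x01),
        # b1: 1 = locally administered, 0 = universally administered
        "local": bool(first_octet & 0x02),
    }


print(mac_flags("06:00:00:00:00:00"))  # {'group': False, 'local': True}
print(mac_flags("ff:ff:ff:ff:ff:ff"))  # {'group': True, 'local': True}
```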

When an IP unicast packet is passed to Layer 2 so that it can be sent to the next hop, it is wrapped in a unicast Ethernet frame. The MAC address of the next hop is determined using the Address Resolution Protocol (ARP), which incidentally uses broadcast Ethernet frames to find the MAC address for a given IP. If a switch does not know which port leads to the unicast MAC address in the frame, it forwards the frame out of all of its ports (except the originating port), an action known as a unicast flood.

IP broadcast and multicast do not use ARP. IP broadcasts are always sent to the "all-ones" Ethernet address ff:ff:ff:ff:ff:ff. Since the low bit of the first byte is 1, this is a group address, and as the broadcast address it is delivered to all hosts on the L2 network. IP multicast instead uses a formula to convert the IP multicast group address to an Ethernet address. This formula is described in RFC 1112: the low-order 23 bits of the group address are placed into the low-order 23 bits of 01:00:5e:00:00:00. The group address 224.1.2.4, for example, is translated to 01:00:5e:01:02:04. The mapping is not unique: because only 23 of the 28 variable bits of a group address are copied, 32 group addresses correspond to the same multicast address on the Ethernet.
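The RFC 1112 mapping is short enough to write out. In this sketch the function name multicast_ip_to_mac is made up for illustration; the logic keeps the low-order 23 bits of the group address and places them under the fixed prefix 01:00:5e.

```python
import ipaddress


def multicast_ip_to_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF       # keep only the low-order 23 bits
    mac = 0x01005E000000 | low23       # place them under the 01:00:5e prefix
    # Format the 48-bit value as six colon-separated hex octets.
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_ip_to_mac("224.1.2.4"))    # 01:00:5e:01:02:04
print(multicast_ip_to_mac("225.129.2.4"))  # 01:00:5e:01:02:04 (same MAC, different group)
```

The second call shows the non-uniqueness: the bits that distinguish 225.129.2.4 from 224.1.2.4 all fall outside the 23 bits that are copied.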

Applications using IP Multicasting

While many more applications use unicast addressing, multicasting does have a few important use cases. The two main areas seem to be infrastructure for high availability solutions, and to implement "zero config" discovery mechanisms. 
Examples of high availability solutions that use multicasting are the well-known keepalived (an implementation of the Virtual Router Redundancy Protocol, or VRRP), uCarp, the Red Hat Cluster Suite (based on the open source Corosync/OpenAIS projects) and JGroups. In this category, there is also the venerable Veritas Cluster Server (VCS). It should be mentioned that some of these projects have recently added unicast support, precisely because of the lack of multicasting in the cloud. However, in all cases the optimal solution is to use multicasting. Multicast networking in this category is used to send "heartbeat" messages. All nodes listen to these messages. If a message is not received for a certain amount of time, the nodes assume something went wrong and can start a corrective action. At the Layer 2 level, solutions such as MSCS (Microsoft Clustering Services) and many others also use multicast to send "heartbeat" messages.
Examples of discovery solutions that use multicasting include the Apple Bonjour/Zeroconf protocol (built on multicast DNS and DNS service discovery), the Java data stores Hazelcast and EhCache, and the Oracle Grid Infrastructure. These solutions use multicast to announce a presence or a status on the network, without having to explicitly configure which other nodes exist.
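The heartbeat logic described above can be sketched independently of the transport. The class below is a hypothetical illustration, not how any of the named products implement it: it only tracks when each node was last heard from and reports the nodes whose heartbeat is overdue. The names HeartbeatMonitor, record and failed_nodes are made up for this example.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat per node and flag nodes that have gone silent."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout  # seconds of silence before a node is suspected
        self.clock = clock      # injectable so the logic can be tested without sleeping
        self.last_seen = {}

    def record(self, node: str) -> None:
        """Call this whenever a heartbeat message from `node` is received."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self) -> list:
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

In a real deployment, record would be called from the loop that receives multicast datagrams, and a non-empty result from failed_nodes would start the corrective action.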

Conclusion

People have tried to work around the lack of multicasting using various OS-level tools. A few interesting ones are using n2n to set up a peer-to-peer L2 VPN between virtual machines, or using various approaches to turn multicast into unicast. Some of these approaches may have valid use cases. That said, in all cases these solutions add significant complexity and push what is essentially a network responsibility back into the OS.