.. _config-bgp-floating-ip-over-l2-segmented-network:

==========================================
BGP floating IPs over L2 segmented network
==========================================

The general principle is that L2 connectivity will be bound to a single rack.
Everything outside the switches of the rack will be routed using BGP. To
perform the BGP announcement, neutron-dynamic-routing is leveraged.

To achieve this, the servers of each rack are set up with a different
management network, using one VLAN ID per rack (the light green and orange
networks below).


.. image:: figures/bgp-floating-ip-over-l2-segmented-network.png


On the OpenStack side, a provider network must be set up, using a different
subnet range and VLAN ID for each rack. This includes:

* an address scope

* some network segments for that network, which are attached to a named
  physical network

* a subnet pool using that address scope

* one provider network subnet per segment (each subnet and segment pair
  matches the physical network name of one rack)

A segment is attached to a specific VLAN and physical network name. In the
above figure, the provider network is represented by two subnets: the dark
green one and the red one. The dark green subnet is on one network segment, and
the red one on another. Both subnets have the subnet service type
"network:floatingip_agent_gateway", so that they cannot be used by virtual
machines directly.

On top of all of this, a floating IP subnet without a segment is added, which
spans all of the racks. This subnet must have the following service types:

* network:routed

* network:floatingip

* network:router_gateway

Since the network:routed subnet isn't bound to a segment, it can be used on all
racks. As the service types network:floatingip and network:router_gateway are
also set on it, the subnet can only be used for floating IPs and router
gateways. The subnets bound to segments are then used as floating IP gateways
(i.e. the next hop to reach these floating IPs and router external gateways).


Configuring the Neutron API side
--------------------------------

On the controller side (i.e. the API and RPC server),
neutron-dynamic-routing-common and python3-neutron-dynamic-routing must be
installed. In addition, "segments" and "bgp" must be added to the list of
plugins in service_plugins. For example, in neutron.conf:

  .. code-block:: ini

     [DEFAULT]
     service_plugins=router,metering,qos,trunk,segments,bgp
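
As a quick sanity check, the setting above can be verified with a few lines of
shell. This is only a sketch: the helper name "has_service_plugin" is
hypothetical, and it does a plain-text match on the configuration file without
querying Neutron or checking which section the option is in.

  .. code-block:: bash

     # Hypothetical helper: succeed if PLUGIN is listed in the
     # service_plugins option of FILE (commented-out lines are ignored).
     has_service_plugin() {
         local conf_file=$1 plugin=$2 plugins
         # Keep the value of the last service_plugins line, without spaces
         plugins=$(sed -n \
             's/^[[:space:]]*service_plugins[[:space:]]*=[[:space:]]*//p' \
             "$conf_file" | tail -n 1 | tr -d '[:space:]')
         case ",${plugins}," in
             *",${plugin},"*) return 0 ;;
             *) return 1 ;;
         esac
     }

     # Example call (the path is the usual location, adjust as needed):
     #   has_service_plugin /etc/neutron/neutron.conf bgp && echo "bgp enabled"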


The BGP agent
-------------

The neutron-bgp-agent must be installed. It is best to install it on two
machines per rack; which machines does not matter much. Each of these BGP
agents then establishes a session with one switch and advertises all of the
BGP configuration.
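
Once the agents are running, it can be useful to confirm that they have
registered with Neutron. The following sketch prints the hosts that have no
BGP agent registered. The function name "missing_bgp_agents" is hypothetical,
and the sketch assumes that the "openstack" client is installed with admin
credentials loaded.

  .. code-block:: bash

     # Hypothetical helper: print every host given as argument for which
     # no BGP agent is registered in Neutron.
     missing_bgp_agents() {
         local registered host
         registered=$(openstack network agent list --agent-type bgp \
             --format value -c Host)
         for host in "$@"; do
             # -x: match the whole line, -F: fixed string, -q: no output
             printf '%s\n' "$registered" | grep -qxF "$host" \
                 || printf '%s\n' "$host"
         done
     }

     # Example call, using the host names of this guide:
     #   missing_bgp_agents mycloud-compute-1.example.com \
     #       mycloud-compute-4.example.com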


Setting-up BGP peering with the switches
----------------------------------------

First, a peer that represents the network equipment must be created, followed
by a matching BGP speaker. The BGP speaker must then be associated with a
dynamic-routing agent (in our example, the dynamic-routing agents run on
compute 1 and 4). Finally, the peer is added to the BGP speaker, so that the
speaker initiates a BGP session with the network equipment.

  .. code-block:: console

     $ # Create a BGP peer to represent the switch 1,
     $ # which runs FRR on 10.1.0.253 with AS 64601
     $ openstack bgp peer create \
           --peer-ip 10.1.0.253 \
           --remote-as 64601 \
           rack1-switch-1

     $ # Create a BGP speaker on compute-1
     $ BGP_SPEAKER_ID_COMPUTE_1=$(openstack bgp speaker create \
           --local-as 64999 --ip-version 4 mycloud-compute-1.example.com \
           --format value -c id)

     $ # Get the agent ID of the dragent running on compute 1
     $ BGP_AGENT_ID_COMPUTE_1=$(openstack network agent list \
           --host mycloud-compute-1.example.com --agent-type bgp \
           --format value -c ID)

     $ # Add the BGP speaker to the dragent of compute 1
     $ openstack bgp dragent add speaker \
           ${BGP_AGENT_ID_COMPUTE_1} ${BGP_SPEAKER_ID_COMPUTE_1}

     $ # Add the BGP peer to the speaker of compute 1
     $ openstack bgp speaker add peer \
           mycloud-compute-1.example.com rack1-switch-1

     $ # Tell the speaker not to advertise tenant networks
     $ openstack bgp speaker set \
           --no-advertise-tenant-networks mycloud-compute-1.example.com


This operation can be repeated for a second machine in the same rack if the
deployment uses bonding (and therefore LACP between both switches), as shown
in the figure above. It must also be done for each rack. One way to deploy is
to select two machines in each rack and install the
neutron-dynamic-routing-agent on each of them, so that they can "talk" to both
switches of the rack.
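
Since the same steps are repeated for every agent host, they can be wrapped in
a small function. This is a minimal sketch built only from the commands shown
above. The function name "setup_bgp_speaker" and the "LOCAL_AS" variable are
hypothetical, and the values in the example call are placeholders rather than
values taken from a real deployment.

  .. code-block:: bash

     # Hypothetical wrapper around the peering commands of this section.
     # Usage: setup_bgp_speaker <agent-host> <peer-name> <peer-ip> <remote-as>
     setup_bgp_speaker() {
         local host=$1 peer_name=$2 peer_ip=$3 remote_as=$4
         local local_as=${LOCAL_AS:-64999}   # local AS used in this guide
         local speaker_id agent_id

         # Peer representing the switch
         openstack bgp peer create --peer-ip "$peer_ip" \
             --remote-as "$remote_as" "$peer_name" || return 1

         # Speaker named after the host, and the agent running on that host
         speaker_id=$(openstack bgp speaker create --local-as "$local_as" \
             --ip-version 4 --format value -c id "$host") || return 1
         agent_id=$(openstack network agent list --host "$host" \
             --agent-type bgp --format value -c ID) || return 1

         openstack bgp dragent add speaker "$agent_id" "$speaker_id" || return 1
         openstack bgp speaker add peer "$speaker_id" "$peer_name" || return 1
         openstack bgp speaker set --no-advertise-tenant-networks "$speaker_id"
     }

     # Example call (placeholder peer name, address and AS number):
     #   setup_bgp_speaker mycloud-compute-4.example.com \
     #       rack2-switch-1 10.2.0.253 64602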


Setting-up physical network names
---------------------------------

Before setting up the provider network, the physical network name must be set
on each host, according to the rack name.

This is done in /etc/neutron/plugins/ml2/ml2_conf.ini as follows:

  .. code-block:: ini

     [ml2_type_flat]
     flat_networks = physnet-rack1

     [ml2_type_vlan]
     network_vlan_ranges = physnet-rack1


Once this is done, the provider network can be created, using physnet-rack1
as "physical network".


Setting-up the provider network
-------------------------------

Everything that is in the provider network's address scope will be advertised
through BGP. Here is how to create the address scope:

  .. code-block:: console

     $ # Create the address scope
     $ openstack address scope create --share --ip-version 4 provider-addr-scope


Then, the network can be created using the physical network name set above:

  .. code-block:: console

     $ # Create the provider network that spans all racks
     $ openstack network create --external --share \
           --provider-physical-network physnet-rack1 \
           --provider-network-type vlan \
           --provider-segment 11 \
           provider-network


This automatically creates both a network and a segment. By default, this
segment has no name, which isn't convenient, but a name can be set afterwards:

  .. code-block:: console

     $ # Get the network ID:
     $ PROVIDER_NETWORK_ID=$(openstack network show provider-network \
           --format value -c id)

     $ # Get the segment ID:
     $ FIRST_SEGMENT_ID=$(openstack network segment list \
           --format csv -c ID -c Network | \
           q -H -d, "SELECT ID FROM - WHERE Network='${PROVIDER_NETWORK_ID}'")

     $ # Set the 1st segment name, matching the rack name
     $ openstack network segment set --name segment-rack1 ${FIRST_SEGMENT_ID}
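
The segment lookup above relies on the external "q" utility. As an alternative
sketch, and assuming the installed client supports the "--network" filter of
"openstack network segment list", the segment ID can be obtained with the
client alone. The function name "first_segment_id" is hypothetical, and the
result is only a single ID while the network still has exactly one segment.

  .. code-block:: bash

     # Hypothetical helper: print the ID of the segment(s) of a network.
     # Right after the network creation there is exactly one segment.
     first_segment_id() {
         openstack network segment list --network "$1" \
             --format value -c ID
     }

     # Example call:
     #   openstack network segment set --name segment-rack1 \
     #       "$(first_segment_id provider-network)"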


Setting-up the 2nd segment
--------------------------

The 2nd segment, which will be attached to our provider network, is created
this way:

  .. code-block:: console

     $ # Create the 2nd segment, matching the 2nd rack name
     $ openstack network segment create \
           --physical-network physnet-rack2 \
           --network-type vlan \
           --segment 13 \
           --network provider-network \
           segment-rack2


Setting-up the provider subnets for the BGP next hop routing
------------------------------------------------------------

These subnets will be used in different racks, depending on which physical
network is configured on the machines. In order to use the address scope,
subnet pools must be used. Here is how to create the subnet pool with the two
ranges that are used later when creating the subnets:

  .. code-block:: console

     $ # Create the provider subnet pool which includes all ranges for all racks
     $ openstack subnet pool create \
           --pool-prefix 10.1.0.0/24 \
           --pool-prefix 10.2.0.0/24 \
           --address-scope provider-addr-scope \
           --share \
           provider-subnet-pool


Then, this is how to create the two subnets. In this example, .1 is kept for
the gateway, .2 for the DHCP server, and .253 and .254 for the switches, which
use these addresses for the BGP announcements:

  .. code-block:: console

     $ # Create the subnet for physnet-rack1, using segment-rack1 and
     $ # the subnet service type network:floatingip_agent_gateway
     $ openstack subnet create \
           --service-type 'network:floatingip_agent_gateway' \
           --subnet-pool provider-subnet-pool \
           --subnet-range 10.1.0.0/24 \
           --allocation-pool start=10.1.0.4,end=10.1.0.252 \
           --gateway 10.1.0.1 \
           --network provider-network \
           --network-segment segment-rack1 \
           provider-subnet-rack1

     $ # The same, for the 2nd rack
     $ openstack subnet create \
           --service-type 'network:floatingip_agent_gateway' \
           --subnet-pool provider-subnet-pool \
           --subnet-range 10.2.0.0/24 \
           --allocation-pool start=10.2.0.4,end=10.2.0.252 \
           --gateway 10.2.0.1 \
           --network provider-network \
           --network-segment segment-rack2 \
           provider-subnet-rack2


Note the service types. network:floatingip_agent_gateway ensures that these
subnets are only used as gateways (i.e. the next BGP hop). The above can be
repeated for each new rack.
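
Repeating these steps for a new rack could look like the sketch below. It
reuses the layout of the two subnets above (a /24 range, gateway on .1,
allocation pool from .4 to .252) and the "segment-rackN" naming. The function
name "add_rack_subnet" is hypothetical. It also assumes that the range of the
new rack is not yet part of the subnet pool, and therefore adds it first with
"openstack subnet pool set".

  .. code-block:: bash

     # Hypothetical helper: create the next-hop subnet of rack number RACK.
     # BASE is the first three octets of the /24 range, for example 10.3.0
     # Usage: add_rack_subnet <rack-number> <base>
     add_rack_subnet() {
         local rack=$1 base=$2

         # Assumption: the new range must first be added to the subnet pool
         openstack subnet pool set --pool-prefix "${base}.0/24" \
             provider-subnet-pool || return 1

         openstack subnet create \
             --service-type 'network:floatingip_agent_gateway' \
             --subnet-pool provider-subnet-pool \
             --subnet-range "${base}.0/24" \
             --allocation-pool "start=${base}.4,end=${base}.252" \
             --gateway "${base}.1" \
             --network provider-network \
             --network-segment "segment-rack${rack}" \
             "provider-subnet-rack${rack}"
     }

     # Example call for a third rack (placeholder range):
     #   add_rack_subnet 3 10.3.0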


Adding a subnet for VM floating IPs and router gateways
-------------------------------------------------------

This is to be repeated each time a new subnet must be created for floating IPs
and router gateways. First, the range is added in the subnet pool, then the
subnet itself is created:

  .. code-block:: console

     $ # Add a new prefix in the subnet pool for the floating IPs:
     $ openstack subnet pool set \
           --pool-prefix 85.125.24.0/24 \
           provider-subnet-pool

     $ # Create the floating IP subnet
     $ openstack subnet create vm-fip \
           --service-type 'network:routed' \
           --service-type 'network:floatingip' \
           --service-type 'network:router_gateway' \
           --subnet-pool provider-subnet-pool \
           --subnet-range 85.125.24.0/24 \
           --network provider-network

The service type network:routed ensures that BGP is used through the provider
network to advertise the IPs. network:floatingip and network:router_gateway
limit the use of these IPs to floating IPs and router gateways.

Setting-up BGP advertising
--------------------------

The provider network needs to be added to each of the BGP speakers. This means
that each time a new rack is set up, the provider network must be added to the
two BGP speakers of that rack.

  .. code-block:: console

     $ # Add the provider network to the BGP speakers.
     $ openstack bgp speaker add network \
           mycloud-compute-1.example.com provider-network
     $ openstack bgp speaker add network \
           mycloud-compute-4.example.com provider-network


In this example, we've selected two compute nodes that are also running an
instance of the neutron-dynamic-routing-agent daemon.
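
When several speakers are involved, the same command can be applied in a loop.
The following is a minimal sketch. The function name "advertise_network_on"
is hypothetical, and the speaker names are the ones used in this guide.

  .. code-block:: bash

     # Hypothetical helper: add NETWORK to each of the given BGP speakers.
     # Usage: advertise_network_on <network> <speaker>...
     advertise_network_on() {
         local network=$1 speaker
         shift
         for speaker in "$@"; do
             openstack bgp speaker add network "$speaker" "$network" \
                 || return 1
         done
     }

     # Example call:
     #   advertise_network_on provider-network \
     #       mycloud-compute-1.example.com mycloud-compute-4.example.com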


Per project operation
---------------------

The following operations can be performed by each customer. A subnet pool
isn't mandatory, but it is nice to have. Typically, the customer network will
not be advertised through BGP (but this can be done if needed).

  .. code-block:: console

     $ # Create the tenant private network
     $ openstack network create tenant-network

     $ # Self-service network pool:
     $ openstack subnet pool create \
           --pool-prefix 192.168.130.0/23 \
           --share \
           tenant-subnet-pool

     $ # Self-service subnet:
     $ openstack subnet create \
           --network tenant-network \
           --subnet-pool tenant-subnet-pool \
           --prefix-length 24 \
           tenant-subnet-1

     $ # Create the router
     $ openstack router create tenant-router

     $ # Add the tenant subnet to the tenant router
     $ openstack router add subnet \
           tenant-router tenant-subnet-1

     $ # Set the router's default gateway. This will use one public IP.
     $ openstack router set \
           --external-gateway provider-network tenant-router

     $ # Create a first VM on the tenant subnet
     $ openstack server create --image debian-10.5.0-openstack-amd64.qcow2 \
           --flavor cpu2-ram6-disk20 \
           --nic net-id=tenant-network \
           --key-name yubikey-zigo \
           test-server-1

     $ # Finally, add a floating IP
     $ openstack floating ip create provider-network
     $ openstack server add floating ip test-server-1 85.125.24.17
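
In the last step above, the address passed to "openstack server add floating
ip" is the one allocated by the preceding command. To avoid copying it by
hand, the allocated address can be captured from the command output, as in
this sketch. The function name "attach_new_floating_ip" is hypothetical.

  .. code-block:: bash

     # Hypothetical helper: allocate a floating IP on NETWORK, attach it to
     # SERVER, and print the allocated address.
     # Usage: attach_new_floating_ip <server> [network]
     attach_new_floating_ip() {
         local server=$1 network=${2:-provider-network} fip
         fip=$(openstack floating ip create "$network" \
             --format value -c floating_ip_address) || return 1
         openstack server add floating ip "$server" "$fip" || return 1
         printf '%s\n' "$fip"
     }

     # Example call:
     #   attach_new_floating_ip test-server-1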

Verification
------------

If everything goes well, the floating IPs are advertised over BGP through the
provider network:

  .. code-block:: console

     $ # Check the advertised routes:
     $ openstack bgp speaker list advertised routes \
           mycloud-compute-4.example.com
     +-----------------+-----------+
     | Destination     | Nexthop   |
     +-----------------+-----------+
     | 85.125.24.7/32  | 10.1.0.48 |
     | 85.125.24.12/32 | 10.1.0.65 |
     | 85.125.24.16/32 | 10.2.0.23 |
     | 85.125.24.17/32 | 10.2.0.35 |
     +-----------------+-----------+
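
The same check can be scripted, for example to verify a single floating IP.
This is a sketch only: the function name "is_advertised" is hypothetical, and
it relies on the "Destination" column shown in the output above.

  .. code-block:: bash

     # Hypothetical helper: succeed if PREFIX is among the routes advertised
     # by SPEAKER.
     # Usage: is_advertised <speaker> <prefix>
     is_advertised() {
         local speaker=$1 prefix=$2
         # -x: match the whole line, -F: fixed string, -q: no output
         openstack bgp speaker list advertised routes "$speaker" \
             --format value -c Destination | grep -qxF "$prefix"
     }

     # Example call:
     #   is_advertised mycloud-compute-4.example.com 85.125.24.17/32 \
     #       && echo "advertised"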
