On Fri, Oct 18, 2019 at 10:49 AM Arturo Borrero Gonzalez
<aborrero(a)wikimedia.org> wrote:
On 10/18/19 2:19 PM, Arturo Borrero Gonzalez wrote:
Ok for the floating IP!
The domain name, I would use what Bryan suggests:
* k8s.tools.eqiad1.wikimedia.cloud
* k8s.tools.eqiad.wmflabs
And the counterparts:
* k8s.toolsbeta.eqiad1.wikimedia.cloud
* k8s.toolsbeta.eqiad.wmflabs
regards.
The final question. I may have found a contradiction.
If we are going to use a floating IP for this (we just agreed on that), then
using 'tools.eqiad1.wikimedia.cloud' may be wrong, since that's meant to hold
private IP addresses.
Shall we step back and consider the 'wmcloud.org' domain?
I am missing information on how to handle floating IPs with respect to the
new domains. Just asking for the additional clarification.
I think it is important that we are having this debate (and clarifying the
examples in the wiki); this will make things easier in the future.
Today the only floating IP pool we have in eqiad1 is the IPv4
publicly routable address space labeled "wan-transport-eqiad"
(185.15.56.0/25)[0]. An address drawn from this pool today would
typically be associated with a *.wmflabs.org FQDN. One example:
185.15.56.18 is a floating IP associated with the mx-out01 instance in
the cloudinfra project with an A record of mx-out01.wmflabs.org and PTR
records of mx-out01.wmflabs.org and
instance-mx-out01.cloudinfra.wmflabs.org.
The agreed replacement for *.wmflabs.org is *.wmcloud.org.
I believe that these are statements of fact. These statements of fact
lead me to ask two new questions as we look to a future with more
usage of HAProxy or some other LBaaS solution being applied more
commonly inside Cloud VPS projects:
* It seems reasonable that DNS records for *.wmcloud.org should all
relate to publicly routable addresses. Does it also seem reasonable
that *.wikimedia.cloud DNS records should all relate to non-publicly
routable addresses? If an address from a publicly routable IPv4
space is being used only to provide a service IP for a service that is
100% internal to Cloud VPS (like the example of the HAProxy cluster
fronting a Kubernetes cluster inside the tools project), is it
acceptable to create an A record in the *.wikimedia.cloud zone for
that address?
* Should we have a floating IP pool of non-publicly routable IPv4
addresses for use cases where the service that is being provided is
only intended to be internal to a single project or the Cloud VPS
tenant network? Routable IPv4 addresses are a limited commodity, and
currently Cloud VPS has a very small number of them available.
One of the reasons that we are spending time discussing these things
is that we hope to decide on a set of standards and practices which
will make it easier to reason about use and maintenance of Cloud VPS.
Towards that end, I think I would propose these answers to the new
questions:
* The *.wikimedia.cloud zone should only contain A records pointing to
non-publicly routable IPv4 addresses. My reasoning for this is that it
makes it easier to quickly think about the general threat model for a
FQDN in this zone. If the FQDN ends in wikimedia.cloud, then the IP
associated with that FQDN is not publicly routable.
* We should create a pool of floating IPs using non-publicly routable
IPv4 addresses for the explicit purpose of being used as service IPs
for HA/LB systems within the Cloud VPS internal network. These service
IPs would then be given FQDNs in the *.wikimedia.cloud zone without
breaking the rule that all FQDNs ending in wikimedia.cloud are not
publicly routable.
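
To make the proposed rule concrete, here is a rough Python sketch of how
a DNS record could be checked against it. This is only an illustration
under assumptions: the function names and the zone tuples are made up
for this example, 172.16.0.10 is a placeholder address, and
example.wmcloud.org is a hypothetical name. The sketch also assumes
that *.wmcloud.org records should always be publicly routable, which is
still an open question above.

```python
import ipaddress

# Assumption: these tuples mirror the proposal, not any existing config.
PRIVATE_ONLY_ZONES = ("wikimedia.cloud",)
PUBLIC_ONLY_ZONES = ("wmcloud.org",)


def in_zone(fqdn, zone):
    """Return True if fqdn is the zone apex or a name below it."""
    fqdn = fqdn.rstrip(".").lower()
    return fqdn == zone or fqdn.endswith("." + zone)


def record_follows_rule(fqdn, address):
    """Check one A record (fqdn -> address) against the proposed rule."""
    ip = ipaddress.ip_address(address)
    if any(in_zone(fqdn, zone) for zone in PRIVATE_ONLY_ZONES):
        # wikimedia.cloud names must not resolve to public addresses
        return ip.is_private
    if any(in_zone(fqdn, zone) for zone in PUBLIC_ONLY_ZONES):
        # wmcloud.org names are assumed to be publicly routable
        return ip.is_global
    # Zones outside the proposal are not judged here
    return True


# Placeholder private address: follows the rule
print(record_follows_rule("k8s.tools.eqiad1.wikimedia.cloud", "172.16.0.10"))
# Address from the public floating IP pool: breaks the rule
print(record_follows_rule("k8s.tools.eqiad1.wikimedia.cloud", "185.15.56.18"))
```

A check like this could run wherever records are created, but where
exactly it would hook in is not something this sketch tries to answer.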
We may have to adjust a lot of security groups if the new floating
pool of private IPs is outside of the 172.16.0.0/21 subnet. If we
carve it out as a subset of that CIDR then I *think* it requires
careful changes to the existing "cloud-instances2-b-eqiad" subnet to
cordon off a /24 or /25 block into a new neutron subnet that would be
the source of the new pool and not part of the subnet that
nova/neutron use to give out fixed IPs to instances. Hopefully Arturo
or Jason can reason about the work and impact of this better than I
can.
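
For a sense of the address math only, this Python sketch shows what
cordoning off a /24 or /25 from 172.16.0.0/21 would leave behind. Taking
the last block of the range is an arbitrary assumption for illustration;
the sketch says nothing about which addresses are already allocated to
instances or how neutron would need to be reconfigured.

```python
import ipaddress

# The existing instance range mentioned above
instances = ipaddress.ip_network("172.16.0.0/21")


def carve_last_block(network, new_prefix):
    """Split off the last block of size new_prefix from network.

    Returns the carved block and the sorted list of networks that
    remain. Choosing the last block is an assumption for this example.
    """
    block = list(network.subnets(new_prefix=new_prefix))[-1]
    remaining = sorted(network.address_exclude(block))
    return block, remaining


for prefix in (24, 25):
    block, remaining = carve_last_block(instances, prefix)
    print(f"/{prefix} pool: {block} ({block.num_addresses} addresses)")
    print("  left for fixed IPs:", ", ".join(str(net) for net in remaining))
```

Note that the remainder is no longer a single CIDR block, which is part
of why the change to the existing subnet would need care.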
Back to Arturo's question, I think I agree that if the IP in use is
from the "wan-transport-eqiad" pool (which is a great name for a
network and a horrible name for a pool), then the FQDN used for that
IP should be in the wmcloud.org zone (or another zone dedicated to
public IPs) and not the wikimedia.cloud zone.
I am also starting to feel like this is all too complicated somehow.
Maybe it is just that using valid, registered TLDs is somehow adding
cognitive burden for me that the legacy 'fake' TLDs did not?
[0]: It seems we have a buggy version of python-openstackclient that
makes digging this information out of the system more difficult than
it should be.
https://bugs.launchpad.net/python-openstackclient/+bug/1616129
Bryan
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808