opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Cleaned up and fixed record objects and tests https://review.opendev.org/c/openstack/designate/+/849831 | 04:25 |
---|---|---|
opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Fixed issues with __repr__ and __str__ on objects https://review.opendev.org/c/openstack/designate/+/849847 | 04:28 |
opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Added additional test coverage for adapters https://review.opendev.org/c/openstack/designate/+/849848 | 04:29 |
opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Cleaned up and fixed record objects and tests https://review.opendev.org/c/openstack/designate/+/849831 | 04:39 |
opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Fixed issues with __repr__ and __str__ on objects https://review.opendev.org/c/openstack/designate/+/849847 | 04:40 |
opendevreview | Erik Olof Gunnar Andersson proposed openstack/designate master: Added additional test coverage for adapters https://review.opendev.org/c/openstack/designate/+/849848 | 04:41 |
opendevreview | Michael Johnson proposed openstack/designate master: Enable cache_ok on custom sqlalchemy UUID type https://review.opendev.org/c/openstack/designate/+/850681 | 17:53 |
ozzzo_work | One of my regions has a bunch of zones in ERROR status, and the designate-worker.log says "Could not find <serial> for <zone> on enough nameservers." | 18:29 |
ozzzo_work | When I use dig to check the SOA, that same serial is reported | 18:29 |
ozzzo_work | When I create new VMs, they get correct DNS records, but the recordset entries are in ERROR status | 18:30 |
ozzzo_work | It looks like Designate is correctly updating the DNS servers, but then it gets confused and thinks that it failed | 18:30 |
ozzzo_work | what would cause that? | 18:30 |
ozzzo_work | When I first started investigating, mariadb was jammed up so I recovered it, and then bounced the designate containers on all 3 controllers, and mariadb is fine now, but Designate is still broken | 18:31 |
ozzzo_work | when I create a VM, the forward and reverse zones go PENDING for a while and then back to ERROR | 18:33 |
eandersson | ozzzo_work is this Train? | 18:51 |
johnsom | eandersson Yeah, it's likely Train if they are on 16.2 | 19:13 |
johnsom | ozzzo_work I would check your nameservers configuration. The error is not enough name servers had the record. Do a "designate-manage pool show_config" and make sure the "nameservers" records are correct and reachable from the workers on the controllers. | 19:15 |
eandersson | Yep - was just gonna suggest the same, make sure to use the show_config command, as the configuration file may not be in sync. | 19:15 |
ozzzo_work | eandersson: yes RHOSP Train | 19:23 |
ozzzo_work | it looks like we don't have designate-manage installed. Is there an equivalent "openstack" command? | 19:24 |
johnsom | No, that is the official command | 19:24 |
johnsom | It needs to be run from a location that has access to the DB. You may need to run it from inside one of your containers | 19:25 |
ozzzo_work | I'm getting "-bash: designate-manage: command not found" | 19:26 |
ozzzo_work | Does that mean that I need to install the client? I've been using the openstack client for everything | 19:26 |
johnsom | It is installed with designate | 19:26 |
ozzzo_work | oic got it | 19:26 |
johnsom | Right, openstack client works with the API, designate-manage (like neutron-manage, etc.) talks to the database directly. It is used for the DB migrations, pools management, etc. | 19:27 |
ozzzo_work | oic this shows the same thing I see in pools.yml. Do I need to check the DNS servers (also-notify) or the Designate servers? | 19:28 |
ozzzo_work | I used nc to verify that I can connect to port 53 on the DNS servers from all 3 controllers | 19:28 |
johnsom | All of them in this list: nameservers: - host: 10.21.21.88 port: 53 | 19:29 |
ozzzo_work | yes I can connect to 53 on all of those | 19:29 |
ozzzo_work | I'm using: nc -vz <IP> 53 | 19:30 |
ozzzo_work | the "nameservers" are my controllers; the "also-notifies" are the DNS servers that they update | 19:31 |
johnsom | For nc you probably need to use "-u" for UDP | 19:31 |
ozzzo_work | that works for both sets | 19:32 |
johnsom | The nameservers should be the bind9 instances, which may be running on the controllers. Also, you might make sure to run that from inside the worker container as from outside might behave differently. | 19:34 |
ozzzo_work | ok | 19:34 |
ozzzo_work | I can connect to them from inside the worker container | 19:36 |
johnsom | Hmmm, so that is odd. | 19:37 |
johnsom | Or at least the most common configuration issue isn't at play here. | 19:37 |
ozzzo_work | we last changed config here 2 weeks ago; the problem seems to have started at 9AM this morning with mariadb failing | 19:38 |
ozzzo_work | er.. 6AM eastern time; 9AM UTC | 19:38 |
johnsom | The message you reported comes after the records are created by designate: the worker attempts to query all of the bind instances to make sure they pulled the new zone update (i.e. have the new serial #). | 19:43 |
johnsom | If you are confident in the pools configuration we had you look at, you could try a "designate-manage pool update" to see if that gets things re-synced. | 19:45 |
ozzzo_work | should I run that from one of the designate_worker containers? | 19:48 |
johnsom | yes | 19:48 |
ozzzo_work | all zones are PENDING now; waiting | 19:51 |
ozzzo_work | I get this in designate-worker.log: https://paste.openstack.org/show/bRPHOtYt9DKBFnsNLdKO/ | 19:55 |
ozzzo_work | and then more of the "enough nameservers" errors | 19:55 |
ozzzo_work | and the zones all changed from PENDING to ERROR | 19:56 |
ozzzo_work | we don't have a zone called dva3-p4gen-3 | 19:57 |
ozzzo_work | but we do have tenant networking in this region; this could be a network that a customer created | 19:58 |
johnsom | It's in your designate database if it's trying to update the zone configuration. | 19:58 |
ozzzo_work | do I need to spelunk in the database, or is there a better way? | 20:00 |
johnsom | It should show up in an openstack zone list --all-projects | 20:01 |
ozzzo_work | ok I see it, so it looks like my DB isn't broken. What else would explain the "failed: not found"? | 20:04 |
johnsom | Can you check your bind9 logs and see if there is a reason bind9 might be rejecting the zone updates? | 20:08 |
ozzzo_work | on the also-notify servers or the controllers? | 20:19 |
johnsom | The instances listed in the pool nameserver list | 20:21 |
ozzzo_work | the DNS servers are accepting the updates. When I see the error: Could not find <serial> for <zone> on enough nameservers, I can dig SOA and I see that same SN | 20:21 |
ozzzo_work | where are the bind9 logs on my controllers? I see a designate_backend_bind9 container but no bind9 logs in /var/log/kolla/designate/ | 20:22 |
johnsom | It would not be under designate; look for bind or named somewhere under /var/log | 20:23 |
ozzzo_work | ok I found it; it's in the container log: https://paste.openstack.org/show/bkkf9jiNCiLt1lIR8unc/ | 20:35 |
ozzzo_work | I see those errors for all of the "dva3-p4gen-?" zones and some of the reverse zones | 20:36 |
ozzzo_work | the "dva3-p4gen-?" zones are customer domains that we are doing DNS for, we call it "bring your own domain" | 20:37 |
opendevreview | Merged openstack/designate master: Cleaned up and fixed record objects and tests https://review.opendev.org/c/openstack/designate/+/849831 | 20:38 |
ozzzo_work | we have those same zones in dva4 which isn't broken; I'm not sure that they are the cause of the dva3 issue | 20:39 |
johnsom | I think I would take one of the zones listed in the "not enough nameservers" and trace it through the worker, mdns and bind9 logs. Something is out of sync between the designate DB and the bind instances that isn't reconciling. | 20:46 |
johnsom | I would have expected the update command to help resync all of that, but something else is going on. | 20:47 |
johnsom | You might also check the bind configuration for the zone. | 20:49 |
ozzzo_work | ok I'll search logs, ty for the advice! | 21:04 |
johnsom | NP, let us know what you find | 21:04 |
opendevreview | Michael Johnson proposed openstack/designate master: Fix pecan lookup_controller DeprecationWarning https://review.opendev.org/c/openstack/designate/+/850695 | 21:27 |
opendevreview | Michael Johnson proposed openstack/designate master: DNM: testing git review https://review.opendev.org/c/openstack/designate/+/850699 | 21:38 |
opendevreview | Michael Johnson proposed openstack/designate master: Fix sqlalchemy table_names DeprecationWarning https://review.opendev.org/c/openstack/designate/+/850704 | 23:15 |
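
The check discussed above (the worker queries every pool nameserver for the zone's SOA and counts how many report the new serial) can be reproduced by hand. The following is a minimal sketch, not Designate's own code: it shells out to `dig`, which sends the query over UDP by default, and the function names, the "serial at or above the expected value" comparison, and the `threshold_percentage` parameter are assumptions made for illustration. Any zone name and serial passed to it are placeholders to be taken from your own worker log.

```python
import subprocess


def parse_soa_serial(dig_short_output):
    """Return the SOA serial from `dig +short ... SOA` output, or None."""
    for line in dig_short_output.splitlines():
        fields = line.split()
        # A +short SOA answer has seven fields:
        # mname rname serial refresh retry expire minimum
        if len(fields) == 7 and fields[2].isdigit():
            return int(fields[2])
    return None


def query_serial(zone, host, port=53, timeout=3):
    """Ask one nameserver for the zone's SOA serial via dig (UDP by default)."""
    cmd = ["dig", "+short", "+time=%d" % timeout, "+tries=1",
           "-p", str(port), "@" + host, zone, "SOA"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout + 2)
    except (OSError, subprocess.TimeoutExpired):
        # dig is missing or the server never answered.
        return None
    return parse_soa_serial(result.stdout)


def enough_nameservers(serials, expected, threshold_percentage=100):
    """True if enough servers report a serial at or above `expected`.

    `serials` holds one entry per nameserver; None means no usable answer.
    The comparison and the percentage threshold are assumptions of this
    sketch, not a copy of the worker's logic.
    """
    if not serials:
        return False
    ok = sum(1 for s in serials if s is not None and s >= expected)
    return ok * 100 >= threshold_percentage * len(serials)
```

To use it, call `query_serial(zone, host, port)` once for each entry in the `nameservers` list printed by `designate-manage pool show_config` (for example the `10.21.21.88` port `53` entry quoted above), then pass the collected serials and the serial from the error message to `enough_nameservers`. Running it from inside the worker container, as suggested in the discussion, keeps the network path the same as the one the worker uses.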