*** rpittau|afk is now known as rpittau | 06:41 | |
*** odyssey4me is now known as Guest847 | 08:50 | |
snadge | i have dhcp and dhcpv6 enabled on the wan interface, and now the ipv4 address just times out after 1800 seconds | 11:56
snadge | it's like it won't renew.. but if i manually renew it does, pretty frustrating | 11:56
snadge | i could just set the ipv4 address statically.. as it is a static ip, but that's annoying, i want to know why it's doing that | 11:57
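A minimal way to inspect the lease timers and force a renewal, assuming the box uses ISC dhclient (the interface name and lease-file path below are common defaults, not confirmed from the log):

    # the renew/rebind/expire timers live in the lease file
    cat /var/lib/dhcp/dhclient.leases
    # release the current ipv4 lease and request a fresh one
    dhclient -r eth0 && dhclient -4 eth0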
spatel | noonedeadpunk yesterday i upgraded openstack V to W without any issue. | 12:58
spatel | it took 8 hours to complete the full upgrade | 12:58
mgariepy | nice spatel, what is your network stack ? ovs ? or still lxb ? | 13:06 |
spatel | lxb | 13:06 |
mgariepy | how many host do you have ? | 13:06 |
spatel | this is production, around 100 hosts | 13:06
mgariepy | cool. | 13:07 |
noonedeadpunk | great news! | 13:07 |
spatel | is it normal to take 8 hours ? | 13:07
noonedeadpunk | against 100 hosts? | 13:07 |
spatel | yes | 13:07 |
spatel | i have one more environment which has 328 computes :) | 13:08 |
mgariepy | when i do an upgrade like that i tend to split the task over a couple of days. | 13:08
spatel | mgariepy is it ok if infra is running on wallaby and compute is running on victoria ? | 13:08
noonedeadpunk | me too | 13:08 |
noonedeadpunk | yes, totally | 13:08 |
spatel | i thought they don't work with mixed versions | 13:08
mgariepy | control plane first on day 1 (infra + keystone + other services) and then the nova/neutron on day 2. | 13:09 |
spatel | oh!! damn it.. i didn't know that | 13:09 |
mgariepy | well your upgrade was live and for 8 hours you had a mix of the 2 releases! ;) | 13:09 |
spatel | next time i will split it out... it would be good to put notes in the official doc in case people aren't aware :) | 13:09
mgariepy | you can also split that in 3 days if you want ! | 13:10
noonedeadpunk | we split it over a week lol | 13:10
mgariepy | depending on how many hosts :P | 13:10 |
mgariepy | my cloud is kinda small so .. 2 days is enough haha | 13:10 |
noonedeadpunk | so we can tell customers exactly when things can fail, and what exactly | 13:10
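A rough sketch of what such a split can look like with openstack-ansible playbooks; the day grouping follows mgariepy's description above, but the exact playbook list and --limit patterns are illustrative, not the official upgrade procedure:

    # day 1: shared infrastructure + keystone and the rest of the control plane
    openstack-ansible setup-infrastructure.yml
    openstack-ansible os-keystone-install.yml
    # day 2 (or later): nova/neutron, optionally a batch of computes at a time
    openstack-ansible os-nova-install.yml --limit compute_hosts
    openstack-ansible os-neutron-install.yml --limit compute_hosts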
spatel | i have noticed that when you upgrade / restart the OVS agent, the network takes a hit | 13:12
spatel | is that true? | 13:12 |
noonedeadpunk | yep, ovs restart does break networking on compute | 13:12 |
spatel | so we have to take downtime, that sucks... | 13:12 |
noonedeadpunk | because eventually you shut down and re-create bridges | 13:12
spatel | my big problem is that in the remote datacenter i am planning to use dpdk (because sriov doesn't support bonding). so i have to use ovs | 13:13
spatel | noonedeadpunk as you know i haven't upgraded ceph yet.. i am planning to test it in the lab and later do it in production | 13:14
spatel | noonedeadpunk i have a very stupid question: what if i lost the deployment node? in that case, how do i rebuild the deployment node? | 13:15
noonedeadpunk | you need to backup /etc/openstack_deploy :) | 13:15 |
noonedeadpunk | or store it in git | 13:26 |
mgariepy | and push it somewhere. | 13:26 |
noonedeadpunk | not the best idea to store user_secrets in git though | 13:26
mgariepy | i encrypt the secrets and store them in a private git server. | 13:27
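A minimal sketch of that workflow, assuming ansible-vault for the encryption; the remote URL is a placeholder, and /etc/openstack_deploy is the standard OSA config path:

    cd /etc/openstack_deploy
    git init && git remote add origin git@git.example.com:ops/openstack_deploy.git
    # encrypt the secrets before they ever land in a commit
    ansible-vault encrypt user_secrets.yml
    git add -A && git commit -m "backup deployment config"
    git push -u origin master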
mgariepy | how do you guys do that noonedeadpunk ? | 13:29 |
jamesdenton | anyone here run into an issue with the dashboard where, after logging in, you get a 504 after a while? Looking at nova-api, it's just spinning its wheels on /v2.1/os-simple-tenant-usage/ | 13:29
jamesdenton | repeated (successful) requests for GET /v2.1/os-simple-tenant-usage/7a8df96a3c6a47118e60e57aa9ecff54 (project). strange | 13:31 |
mgariepy | is it only one node that is failing ? | 13:32 |
jamesdenton | i see the same behavior across all controllers running that service | 13:33 |
jamesdenton | It's the Project -> Compute -> Overview page that's problematic, not all of horizon | 13:34 |
mgariepy | do you see the error in the nova logs? | 13:34 |
mgariepy | when i do an upgrade on horizon i usually destroy one of the containers and create a new one, just to have a quick fallback if the upgrade fails for some mysterious reason.. | 13:36
jamesdenton | https://pastebin.com/ndzhmk1B | 13:36 |
jamesdenton | horizon logs look clean, but something is making repeated requests to this url | 13:37 |
jamesdenton | which is related to that overview page | 13:37 |
mgariepy | are the requests all coming from the same horizon container ? | 13:40
mgariepy | or mostly** | 13:40 |
jamesdenton | ahh, good question, i'll have to check. it's "load balanced" but i need to make sure | 13:42
mgariepy | the connections to horizon are sticky per client IIRC | 13:43
jamesdenton | nova usage-list uses the same os-simple-tenant-usage calls, testing it now. | 13:43 |
mgariepy | your loadbalancer would log the balancing. in the nova api logs, if you list the last 1000 requests, are they coming mostly from the same host ? | 13:45
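One way to get that tally, assuming a default-ish haproxy HTTP log format where the ninth field is backend/server (verify the field position against your own log format first):

    # count which backend served the last 1000 nova-api requests
    tail -n 1000 /var/log/haproxy.log | grep nova | awk '{print $9}' | sort | uniq -c | sort -rn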
jamesdenton | this is just a lab env | 13:45 |
mgariepy | lol ok and ? take 100 calls then haha | 13:46
jamesdenton | :D | 13:46 |
mgariepy | you are the only client ? | 13:46 |
jamesdenton | yes, i'm the only client. Problem persists across a reboot, too, so i wonder if there's something funky in the DB | 13:46
mgariepy | flush memcached | 13:47 |
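If you want to try that, the memcached text protocol makes it a one-liner; the host and port below are the defaults and may differ per deployment:

    # invalidate every cached item on this memcached instance
    printf 'flush_all\r\nquit\r\n' | nc 127.0.0.1 11211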
jamesdenton | it happened after a recent upgrade, but can't recall which | 13:47 |
mgariepy | is it a 3 ctrl node deployment ? | 13:48
jamesdenton | yes | 13:48 |
mgariepy | block-migrate on hdds is soooo painful | 13:48 |
mgariepy | did you try to remove one ctrl node from the lb to see if it's only one node that is causing the issue ? | 13:51
jamesdenton | yeah, i've done the usual stuff. | 14:02 |
mgariepy | you should try to start the weekend early haha | 14:07
Adri2000 | hi... this backport https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/806210 would be happy with another +2 :) | 14:17 |
mgariepy | Adri2000, done. | 14:22 |
Adri2000 | thank you mgariepy! | 14:23 |
noonedeadpunk | Eventually would be great to review https://review.opendev.org/q/topic:"osa%252Fgalera_pki"+(status:open) as well :) | 14:51 |
jamesdenton | mgariepy So, I've isolated this to some issue w/ nova api versioning. But maybe really something with haproxy, dunno. If I use microversion <= 2.39 it works, >= 2.40 fails | 15:17
jamesdenton | https://docs.openstack.org/nova/latest/reference/api-microversion-history.html | 15:17 |
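One way to reproduce this outside horizon is to pin the microversion on a direct API call and flip it across the 2.39/2.40 boundary; the endpoint URL and token below are placeholders:

    # request the usage report at a pinned microversion; change 2.39 to 2.40 to compare
    curl -s -H "X-Auth-Token: $TOKEN" \
         -H "X-OpenStack-Nova-API-Version: 2.39" \
         http://nova-api.example.com:8774/v2.1/os-simple-tenant-usage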
mgariepy | jamesdenton, a mismatch of the microversion between horizon/nova and other stuff? | 15:35
jamesdenton | i can't imagine so, microversion 2.40 is pretty old (older than this env, even). but i'm not really sure what the deal is. running wallaby now and this issue started happening in this env during victoria, IIRC. this lab sees some abuse. at this point it's the principle of the thing - i need to fix it to keep myself sane | 15:37 |
jamesdenton | i might adjust the endpoints to bypass haproxy and see if it occurs directly to nova-api | 15:38 |
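A sketch of that endpoint swap with the openstack CLI; the backend address and endpoint ID are placeholders:

    # find the compute endpoints, then point one straight at a nova-api backend
    openstack endpoint list --service compute
    openstack endpoint set --url http://172.29.236.10:8774/v2.1 <endpoint-id>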
mgariepy | keep us in the loop | 15:42 |
mgariepy | i don't see why haproxy would cause issue with a version like that. | 15:43 |
jamesdenton | I was thinking maybe a difference in the payload was causing an issue. but the same thing is happening directly to nova api when i changed the endpoint | 15:44 |
jamesdenton | mgariepy so, it looks like there was something about terminated instances that was causing an issue. The issue persisted, even after deleting all instances from all projects. Had to run "nova-manage db archive_deleted_rows" | 15:58 |
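For reference, the command as commonly run; --until-complete repeats the archive batches until no soft-deleted rows are left:

    # move soft-deleted rows out of the main nova tables into the shadow tables
    nova-manage db archive_deleted_rows --until-complete --verbose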
mgariepy | when did you install that system ? is it an old install upgraded for the last 10 years ? | 15:59 |
jamesdenton | Austin -> Wallaby | 15:59 |
jamesdenton | Not really, probably Queens/Rocky -> W | 15:59 |
mgariepy | LOL | 15:59 |
jamesdenton | but this issue crept up some time in the last few weeks | 15:59 |
mgariepy | did you run nova api with debug ? | 16:00 |
mgariepy | how did you see the issue with the deleted instances? | 16:00
jamesdenton | oh yeah. in fact, i thought it was fixed but it's not. as soon as i created a new instance, the problem came back. I mentioned deleted instances because you could still see that returned in the payload | 16:01 |
mgariepy | db migration missing? | 16:01 |
mgariepy | if nova errors out on a db issue there should be a traceback somewhere, no ? | 16:02
mgariepy | unless it's in try: except: pass haha | 16:03 |
jamesdenton | i would think so. only thing i see in the logs is repeated attempts against os-simple-tenant-usage | 16:03
opendevreview | Merged openstack/openstack-ansible-os_swift stable/ussuri: Revert "split templates to work around configparser bug" https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/806210 | 16:35 |