Friday, 2025-01-17

04:33 <opendevreview> Dr. Jens Harbott proposed openstack/neutron-dynamic-routing master: Add context wrapper for router_interface delete callback  https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/939509
07:45 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Reimplement ``common.utils.spawn_n``  https://review.opendev.org/c/openstack/neutron/+/939095
07:45 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove ``subprocess_popen``  https://review.opendev.org/c/openstack/neutron/+/939097
07:46 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify``  https://review.opendev.org/c/openstack/neutron/+/939210
07:46 <ralonsoh> slaweq, lajoskatona hello! Please check ^^ these 3 small patches if you have time
07:46 <ralonsoh> thanks!!
07:47 <lajoskatona> ralonsoh: Hi, checking
07:48 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify``  https://review.opendev.org/c/openstack/neutron/+/939210
08:08 <opendevreview> Sahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: remove usage of eventlet for AsyncProcess  https://review.opendev.org/c/openstack/neutron/+/939348
08:08 <opendevreview> Sahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/939321
08:08 <opendevreview> Sahid Orentino Ferdjaoui proposed openstack/neutron master: common: fix wait_until_true to support native thread  https://review.opendev.org/c/openstack/neutron/+/937843
08:08 <opendevreview> Sahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/937765
08:14 <ralonsoh> ykarel, hi, are you checking the issue with test_list_pagination_with_href_links?
08:14 <ralonsoh> I think I saw a patch recently
08:31 <opendevreview> Michal Nasiadka proposed openstack/neutron master: WIP: Add ``OVNGatewayHAChassisGroup`` scheduler class  https://review.opendev.org/c/openstack/neutron/+/939518
08:47 <opendevreview> Dr. Jens Harbott proposed openstack/neutron-dynamic-routing master: Add context wrapper for router_interface delete callback  https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/939509
08:49 <ykarel> ralonsoh, no, not checking, but i recall a bug exists for it
08:49 <ykarel> the patch was for a different but similar issue; the current one is different
09:16 <noonedeadpunk> frickler: just as an update on the IPv6 state with the BGP-OVN driver - we got a little bit stuck due to https://bugs.launchpad.net/ovn-bgp-agent/+bug/2020410 which we just found. As proxy_ndp doesn't work alike to proxy_ndp which was very confusing for me at least
09:18 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: WIP == [OVN] ``PortBindingUpdateUpEvent``  https://review.opendev.org/c/openstack/neutron/+/939345
09:26 <ralonsoh> hi folks, one last review for https://review.opendev.org/c/openstack/neutron/+/937545?
09:26 <ralonsoh> thanks in advance
09:56 <opendevreview> Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Add config dict dataplane_podified_services  https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939519
10:37 <opendevreview> Merged openstack/ovn-bgp-agent master: Introduce multinode tempest job  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/936968
11:15 <ralonsoh> sahid, please check https://review.opendev.org/c/openstack/neutron/+/939348/comment/d272208e_7a28cb10/
11:20 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify``  https://review.opendev.org/c/openstack/neutron/+/939210
13:26 <opendevreview> Merged openstack/neutron master: [OVN] Reduce the OVN hash ring touch interval  https://review.opendev.org/c/openstack/neutron/+/937351
13:45 <opendevreview> Merged openstack/neutron master: [OVN] Check if the LRP exists in ``check_provider_distributed_ports``  https://review.opendev.org/c/openstack/neutron/+/938889
13:59 <mlavalle> haleyb: meeting today?
13:59 <haleyb> mlavalle: i completely forgot until a second ago, let me check wiki
14:00 <slaweq> o/
14:00 <slaweq> (in case there will be a meeting) :)
14:01 <ihrachys> ralonsoh: wondering if https://8db7cf905af5e8e0b8f5-eb0a5b05669bc8730a6c9efdcc0969f3.ssl.cf5.rackcdn.com/938106/4/gate/neutron-tempest-plugin-ovn/ce26d89/testr_results.html may mean port down missed...
14:01 <ihrachys> "RuntimeError: Timed out waiting for port 'e3b17cad-63ff-4b8d-ad30-ba2c2680d749' to transition to status 'DOWN'."
14:01 <haleyb> o/ i don't see anything on the agenda
14:01 <haleyb> so i will cancel
14:01 <ralonsoh> ihrachys, let me check
14:01 <haleyb> sorry i need to look the day before and send an email
14:02 <slaweq> thx haleyb
14:02 <slaweq> have a great weekend then :)
14:02 <haleyb> you too
14:02 <mlavalle> haleyb: ack, have a nice weekend. You too slaweq
14:02 <ihrachys> ralonsoh: (this is a run before your timeout change merged btw so not reflecting on it)
14:03 <lajoskatona> o/
14:03 <ralonsoh> ihrachys, https://bugs.launchpad.net/neutron/+bug/2085421 there is a bug related to this
14:03 <lajoskatona> As so many good people lurking here let me use this opportunity :P: Please review my taas doc patches: https://review.opendev.org/q/topic:%22taas_driver_docs%22
14:04 <ralonsoh> you hit this issue not in the cleanup, but same command
14:04 <ralonsoh> lajoskatona, sure
14:04 <lajoskatona> and if you have some more time please also check the tap mirror feature: https://review.opendev.org/q/topic:%22bug/2015471%22  Thanks in advance
14:10 <ykarel> ralonsoh, i checked again the logs https://review.opendev.org/c/openstack/neutron/+/934409
14:10 <ralonsoh> ykarel, yes, I'm reading the comment and the logs
14:10 <ykarel> and it looks like the patch is hiding the actual issue
14:10 <ykarel> ack
14:10 <ralonsoh> but the PG was created before (several seconds before)
14:11 <ralonsoh> so I don't know why the PG deletion fails... doesn't make sense
14:14 <ykarel> looks like some cache/idl issue
14:43 <ihrachys> ralonsoh: this doesn't look like the same issue, even if the same test case
14:43 <ihrachys> I see: SetLSwitchPortCommand(_result=None, lport=e3b17cad-63ff-4b8d-ad30-ba2c2680d749, external_ids_update={'neutron:device_owner': ''}, columns={'parent_name': [], 'up': False, 'tag': []}, if_exists=True) in neutron
14:43 <ihrachys> but I don't see the corresponding release message in ovn-controller
14:43 <ihrachys> errors listed in the bug are different
14:47 <opendevreview> Vasyl Saienko proposed openstack/neutron master: Increase ovs operation timeout for functional tests  https://review.opendev.org/c/openstack/neutron/+/939439
14:50 <ralonsoh> ihrachys, right, this is not the same issue
14:50 <ralonsoh> and it could be caused, again, by the hash ring
14:50 <ralonsoh> ihrachys, I'm planning to remove the refresh thing
14:51 <ralonsoh> we expect all API workers to be always enabled
14:51 <ralonsoh> actually if one API worker goes down, the API will fail
14:51 <ralonsoh> so once we initialize the hash ring nodes, why should we continue refreshing them?
14:52 <ihrachys> ralonsoh: I am not sure about hash ring implication. looks like we ran txn against nb db to set up=false for LSP. and txn succeeded. so I'd think, at this point ovn-controller should release the port no?
14:53 <ralonsoh> yes, and OVN is working fine but the test is expecting the port to go down and we don't catch this event
14:53 <ihrachys> ralonsoh: not sure why refresh. maybe it was an attempt to handle deaths of workers?
14:54 <ralonsoh> but a dead worker is a critical event
14:54 <ralonsoh> that implies (or should imply) the API restart
14:54 <ihrachys> ralonsoh: what I'm saying is ovn-controller is not setting pb to down. shouldn't there be a release message in ovn-controller for the subport? there was a message when it claimed the pb.
14:55 <ralonsoh> ihrachys, hold on, a trunk subport?
14:55 <ralonsoh> do we have PB for subports?
14:55 <ralonsoh> I would need to check that
14:55 <ihrachys> there's 'Setting lport e3b17cad-63ff-4b8d-ad30-ba2c2680d749 up in Southbound' message when it configures the subport pb
14:57 <ralonsoh> I would need to check this log deeply
14:58 <ralonsoh> ihrachys, actually with my patch for the PB events, I'm hitting this error
14:58 <ihrachys> I admit I don't know what we should see in a good run. it could be that ovn-controller doesn't post "releasing" message for this type.
14:58 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/939345
14:59 <opendevreview> Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Add config dict dataplane_podified_services  https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939519
14:59 <opendevreview> Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Skip test_dscp_bwlimit_external_network with only one compute  https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939539
14:59 <ralonsoh> ihrachys, I thought the subports were not bound/released, because they were just a configuration for the parent port
14:59 <ralonsoh> actually there is one physical port only
14:59 <ralonsoh> this is why I was asking about the subport PB
15:02 <ihrachys> ah the first message is probably before the port was added as subport then?
15:06 <ihrachys> so yeah I think you are right then. the lsp down event missed probably. same issue.
15:07 <ihrachys> ralonsoh: does anyone besides neutron complain that futurist doesn't honor intervals in wsgi env by chance?
15:18 <ralonsoh> ihrachys, but futurist (or any other library) is using threads. If we monkey_patch that implies we are using greenthreads
15:19 <ralonsoh> and to be honest, I don't know why we are not releasing the GIL on time
15:43 <opendevreview> Ihar Hrachyshka proposed openstack/neutron master: tests: Don't assume update outside ovs txn notifies separately from create  https://review.opendev.org/c/openstack/neutron/+/939542
16:36 <opendevreview> Ihar Hrachyshka proposed openstack/neutron master: ovn: Use green futurist executor when eventlet is used  https://review.opendev.org/c/openstack/neutron/+/939545
16:36 <ihrachys> ralonsoh: I wonder ^ ...
16:43 <opendevreview> Ihar Hrachyshka proposed openstack/neutron master: ovn: Use green futurist executor when eventlet is used  https://review.opendev.org/c/openstack/neutron/+/939545
16:44 <ralonsoh> let me check
16:44 <ralonsoh> I'm going offline for some hours, I'll check later
16:47 <opendevreview> Ihar Hrachyshka proposed openstack/neutron master: DNM - Test "neutron-ovn-tempest-ipv6-only-ovs*" with WSGI  https://review.opendev.org/c/openstack/neutron/+/939360
19:06 <ihrachys> haleyb: fyi apparently our nodes have memory pressure / something leaks and eventually functional job gets stuck / pids oomkilled: https://bugs.launchpad.net/neutron/+bug/2095196
19:06 <ihrachys> it shows as ovs timeouts but really it's just everything hangs
19:38 <haleyb> ihrachys: interesting, i think we had dealt with that before, will have to see if we can change something. will have to see if there is an obvious culprit
19:50 <haleyb> think https://review.opendev.org/c/openstack/neutron/+/921420/ was the change, increased swap
19:50 <haleyb> https://launchpad.net/bugs/2065821
19:50 <haleyb> cover job was failing
19:58 <opendevreview> Merged x/whitebox-neutron-tempest-plugin master: Skip test_dscp_bwlimit_external_network with only one compute  https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939539
20:07 <opendevreview> Brian Haley proposed openstack/neutron master: Increase swap in functional job to 12GB  https://review.opendev.org/c/openstack/neutron/+/939558
20:08 <haleyb> and bug 1966394
21:55 <opendevreview> Terry Wilson proposed openstack/neutron master: Update Nova aggregates on changed host mappings  https://review.opendev.org/c/openstack/neutron/+/935990
21:59 <ihrachys> haleyb: this seems like sweeping trash under the carpet :p
22:03 <haleyb> ihrachys: it would be temporary until we can investigate further i would hope
22:03 <ihrachys> haleyb: then at the very least add a TODO to remove it?
22:03 <ihrachys> I am fine with unblocking gate as much as possible; concerned later followups often never happen :)
22:08 <haleyb> looking at logs/screen-memory_tracker.txt i don't see one process being a hog in that patch
22:13 <ihrachys> yeah same, nothing popped up. there are lots of privsep processes I think but no single hogger
22:16 <haleyb> ihrachys: so in the "shortcut url" patch, i do see mem went to zero, https://19862c796f0176e05bae-03c0af7271380f8d13c3735dacc9c317.ssl.cf2.rackcdn.com/939272/2/gate/neutron-functional/23f58b6/controller/logs/screen-dstat.txt
22:17 <haleyb> so something got triggered, typically is well above 1G
22:19 <haleyb> it fell off a cliff at Jan 17 20:41:05.634911
22:20 <haleyb> in 20 seconds it gobbled 2G, that's one little piggy
22:21 <haleyb> ovs-vswitchd
22:21 <ihrachys> it got close to zero in 20:18:20 too (150mb)
22:22 <haleyb> Jan 17 20:48:10.003783 np0039605950 memory_tracker.sh[342949]: [ovs-vswitchd (pid:66606)]=925872KB
22:22 <haleyb> 2 seconds before that
22:22 <haleyb> Jan 17 20:46:29.766784 np0039605950 memory_tracker.sh[332205]: [ovs-vswitchd (pid:66606)]=531716KB
22:23 <ihrachys> that's minutes not seconds
22:23 <ihrachys> but yeah, impressive performance, top eater
22:24 <haleyb> ETOOMANYNUMBERS
22:24 <ihrachys> pid number 66606, SPOOKY
22:24 <ihrachys> :)
22:27 <haleyb> number of the beast :-o at least we have a change that can trigger it
22:29 <ihrachys> a lot of messages in vswitchd: 2025-01-17T20:47:18.413Z|13078|netdev_linux|INFO|gre0 device has unknown hardware address family 778 though I don't know if maybe it's normal
22:29 <haleyb> yeah, and we could be reading that whole file wrong
22:53 <haleyb> all i can say is I don't like those messages, but i don't see anything in that code showing a memory leak
23:26 <opendevreview> Merged openstack/neutron master: Add option to configure live migration activation strategy for OVN  https://review.opendev.org/c/openstack/neutron/+/938106
23:39 <opendevreview> Brian Haley proposed openstack/neutron master: Increase swap in functional job to 12GB  https://review.opendev.org/c/openstack/neutron/+/939558
