Nick | Message | Time |
---|---|---|
opendevreview | Dr. Jens Harbott proposed openstack/neutron-dynamic-routing master: Add context wrapper for router_interface delete callback https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/939509 | 04:33 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Reimplement ``common.utils.spawn_n`` https://review.opendev.org/c/openstack/neutron/+/939095 | 07:45 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove ``subprocess_popen`` https://review.opendev.org/c/openstack/neutron/+/939097 | 07:45 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify`` https://review.opendev.org/c/openstack/neutron/+/939210 | 07:46 |
ralonsoh | slaweq, lajoskatona hello! Please check ^^ these 3 small patches if you have time | 07:46 |
ralonsoh | thanks!! | 07:46 |
lajoskatona | ralonsoh: Hi, checking | 07:47 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify`` https://review.opendev.org/c/openstack/neutron/+/939210 | 07:48 |
opendevreview | Sahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: remove usage of eventlet for AsyncProcess https://review.opendev.org/c/openstack/neutron/+/939348 | 08:08 |
opendevreview | Sahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling https://review.opendev.org/c/openstack/neutron/+/939321 | 08:08 |
opendevreview | Sahid Orentino Ferdjaoui proposed openstack/neutron master: common: fix wait_until_true to support native thread https://review.opendev.org/c/openstack/neutron/+/937843 | 08:08 |
opendevreview | Sahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent https://review.opendev.org/c/openstack/neutron/+/937765 | 08:08 |
ralonsoh | ykarel, hi, are you checking the issue with test_list_pagination_with_href_links? | 08:14 |
ralonsoh | I think I saw a patch recently | 08:14 |
opendevreview | Michal Nasiadka proposed openstack/neutron master: WIP: Add ``OVNGatewayHAChassisGroup`` scheduler class https://review.opendev.org/c/openstack/neutron/+/939518 | 08:31 |
opendevreview | Dr. Jens Harbott proposed openstack/neutron-dynamic-routing master: Add context wrapper for router_interface delete callback https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/939509 | 08:47 |
ykarel | ralonsoh, no, not checking, but i recall a bug for it exists | 08:49 |
ykarel | the patch was for a different but similar issue; the current one is different | 08:49 |
noonedeadpunk | frickler: just as an update on the IPv6 state with the BGP-OVN driver - we got a little bit stuck due to https://bugs.launchpad.net/ovn-bgp-agent/+bug/2020410 which we just found. proxy_ndp doesn't work like proxy_arp, which was very confusing for me at least | 09:16 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: WIP == [OVN] ``PortBindingUpdateUpEvent`` https://review.opendev.org/c/openstack/neutron/+/939345 | 09:18 |
ralonsoh | hi folks, one last review for https://review.opendev.org/c/openstack/neutron/+/937545? | 09:26 |
ralonsoh | thanks in advance | 09:26 |
opendevreview | Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Add config dict dataplane_podified_services https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939519 | 09:56 |
opendevreview | Merged openstack/ovn-bgp-agent master: Introduce multinode tempest job https://review.opendev.org/c/openstack/ovn-bgp-agent/+/936968 | 10:37 |
ralonsoh | sahid, please check https://review.opendev.org/c/openstack/neutron/+/939348/comment/d272208e_7a28cb10/ | 11:15 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [eventlet-deprecation] Remove eventlet from ``TestNovaNotify`` https://review.opendev.org/c/openstack/neutron/+/939210 | 11:20 |
opendevreview | Merged openstack/neutron master: [OVN] Reduce the OVN hash ring touch interval https://review.opendev.org/c/openstack/neutron/+/937351 | 13:26 |
opendevreview | Merged openstack/neutron master: [OVN] Check if the LRP exists in ``check_provider_distributed_ports`` https://review.opendev.org/c/openstack/neutron/+/938889 | 13:45 |
mlavalle | haleyb: meeting today? | 13:59 |
haleyb | mlavalle: i completely forgot until a second ago, let me check wiki | 13:59 |
slaweq | o/ | 14:00 |
slaweq | (in case there will be a meeting) :) | 14:00 |
ihrachys | ralonsoh: wondering if https://8db7cf905af5e8e0b8f5-eb0a5b05669bc8730a6c9efdcc0969f3.ssl.cf5.rackcdn.com/938106/4/gate/neutron-tempest-plugin-ovn/ce26d89/testr_results.html may mean port down missed... | 14:01 |
ihrachys | "RuntimeError: Timed out waiting for port 'e3b17cad-63ff-4b8d-ad30-ba2c2680d749' to transition to status 'DOWN'." | 14:01 |
haleyb | o/ i don't see anything on the agenda | 14:01 |
haleyb | so i will cancel | 14:01 |
ralonsoh | ihrachys, let me check | 14:01 |
haleyb | sorry i need to look the day before and send an email | 14:01 |
slaweq | thx haleyb | 14:02 |
slaweq | have a great weekend then :) | 14:02 |
haleyb | you too | 14:02 |
mlavalle | haleyb: ack, have a nice weekend. You too slaweq | 14:02 |
ihrachys | ralonsoh: (this is a run before your timeout change merged btw so not reflecting on it) | 14:02 |
lajoskatona | o/ | 14:03 |
ralonsoh | ihrachys, https://bugs.launchpad.net/neutron/+bug/2085421 there is a bug related to this | 14:03 |
lajoskatona | As so many good people lurking here let me use this opportunity :P: Please review my taas doc patches: https://review.opendev.org/q/topic:%22taas_driver_docs%22 | 14:03 |
ralonsoh | you hit this issue not in the cleanup, but same command | 14:04 |
ralonsoh | lajoskatona, sure | 14:04 |
lajoskatona | and if you have some more time please also check the tap mirror feature: https://review.opendev.org/q/topic:%22bug/2015471%22 Thanks in advance | 14:04 |
ykarel | ralonsoh, i checked again the logs https://review.opendev.org/c/openstack/neutron/+/934409 | 14:10 |
ralonsoh | ykarel, yes, I'm reading the comment and the logs | 14:10 |
ykarel | and it looks like the patch is hiding the actual issue | 14:10 |
ykarel | ack | 14:10 |
ralonsoh | but the PG was created before (several seconds before) | 14:10 |
ralonsoh | so I don't know why the PG deletion fails... doesn't make sense | 14:11 |
ykarel | looks like some cache/idl issue | 14:14 |
ihrachys | ralonsoh: this doesn't look like the same issue, even if the same test case | 14:43 |
ihrachys | I see: SetLSwitchPortCommand(_result=None, lport=e3b17cad-63ff-4b8d-ad30-ba2c2680d749, external_ids_update={'neutron:device_owner': ''}, columns={'parent_name': [], 'up': False, 'tag': []}, if_exists=True) in neutron | 14:43 |
ihrachys | but I don't see the corresponding release message in ovn-controller | 14:43 |
ihrachys | errors listed in the bug are different | 14:43 |
opendevreview | Vasyl Saienko proposed openstack/neutron master: Increase ovs operation timeout for functional tests https://review.opendev.org/c/openstack/neutron/+/939439 | 14:47 |
ralonsoh | ihrachys, right, this is not the same issue | 14:50 |
ralonsoh | and could be caused, again, due to the hash ring | 14:50 |
ralonsoh | ihrachys, I'm planning to remove the refresh thing | 14:50 |
ralonsoh | we expect all API workers to be always enabled | 14:51 |
ralonsoh | actually if one API worker goes down, the API will fail | 14:51 |
ralonsoh | so once we initialize the hash ring nodes, why should we continue refreshing them? | 14:51 |
ihrachys | ralonsoh: I am not sure about hash ring implication. looks like we ran txn against nb db to set up=false for LSP. and txn succeeded. so I'd think, at this point ovn-controller should release the port no? | 14:52 |
ralonsoh | yes, and OVN is working fine but the test is expecting the port to go down and we don't catch this event | 14:53 |
ihrachys | ralonsoh: not sure why refresh. maybe it was an attempt to handle deaths of workers? | 14:53 |
ralonsoh | but a dead worker is a critical event | 14:54 |
ralonsoh | that implies (or should imply) the API restart | 14:54 |
ihrachys | ralonsoh: what I'm saying is ovn-controller is not setting pb to down. shouldn't there be a release message in ovn-controller for the subport? there was a message when it claimed the pb. | 14:54 |
ralonsoh | ihrachys, hold on, a trunk subport? | 14:55 |
ralonsoh | do we have PB for subports? | 14:55 |
ralonsoh | I would need to check that | 14:55 |
ihrachys | there's 'Setting lport e3b17cad-63ff-4b8d-ad30-ba2c2680d749 up in Southbound' message when it configures the subport pb | 14:55 |
ralonsoh | I would need to check this log deeply | 14:57 |
ralonsoh | ihrachys, actually with my patch for the PB events, I'm hitting this error | 14:58 |
ihrachys | I admit I don't know what we should see in a good run. it could be that ovn-controller doesn't post "releasing" message for this type. | 14:58 |
ralonsoh | https://review.opendev.org/c/openstack/neutron/+/939345 | 14:58 |
opendevreview | Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Add config dict dataplane_podified_services https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939519 | 14:59 |
opendevreview | Eduardo Olivares proposed x/whitebox-neutron-tempest-plugin master: Skip test_dscp_bwlimit_external_network with only one compute https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939539 | 14:59 |
ralonsoh | ihrachys, I thought the subports were not bound/released, because they were just a configuration for the parent port | 14:59 |
ralonsoh | actually there is one physical port only | 14:59 |
ralonsoh | this is why I was asking about the subport PB | 14:59 |
ihrachys | ah the first message is probably before the port was added as subport then? | 15:02 |
ihrachys | so yeah I think you are right then. the lsp down event missed probably. same issue. | 15:06 |
ihrachys | ralonsoh: does anyone besides neutron complain that futurist doesn't honor intervals in wsgi env by chance? | 15:07 |
ralonsoh | ihrachys, but futurist (or any other library) is using threads. If we monkey_patch, that implies we are using greenthreads | 15:18 |
ralonsoh | and to be honest, I don't know why we are not releasing the GIL on time | 15:19 |
opendevreview | Ihar Hrachyshka proposed openstack/neutron master: tests: Don't assume update outside ovs txn notifies separately from create https://review.opendev.org/c/openstack/neutron/+/939542 | 15:43 |
opendevreview | Ihar Hrachyshka proposed openstack/neutron master: ovn: Use green futurist executor when eventlet is used https://review.opendev.org/c/openstack/neutron/+/939545 | 16:36 |
ihrachys | ralonsoh: I wonder ^ ... | 16:36 |
opendevreview | Ihar Hrachyshka proposed openstack/neutron master: ovn: Use green futurist executor when eventlet is used https://review.opendev.org/c/openstack/neutron/+/939545 | 16:43 |
ralonsoh | let me check | 16:44 |
ralonsoh | I'm going offline for some hours, I'll check later | 16:44 |
opendevreview | Ihar Hrachyshka proposed openstack/neutron master: DNM - Test "neutron-ovn-tempest-ipv6-only-ovs*" with WSGI https://review.opendev.org/c/openstack/neutron/+/939360 | 16:47 |
ihrachys | haleyb: fyi apparently our nodes have memory pressure / something leaks and eventually functional job gets stuck / pids oomkilled: https://bugs.launchpad.net/neutron/+bug/2095196 | 19:06 |
ihrachys | it shows as ovs timeouts but really it's just everything hangs | 19:06 |
haleyb | ihrachys: interesting, i think we had dealt with that before, will have to see if we can change something. will have to see if there is an obvious culprit | 19:38 |
haleyb | think https://review.opendev.org/c/openstack/neutron/+/921420/ was the change, increased swap | 19:50 |
haleyb | https://launchpad.net/bugs/2065821 | 19:50 |
haleyb | cover job was failing | 19:50 |
opendevreview | Merged x/whitebox-neutron-tempest-plugin master: Skip test_dscp_bwlimit_external_network with only one compute https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/939539 | 19:58 |
opendevreview | Brian Haley proposed openstack/neutron master: Increase swap in functional job to 12GB https://review.opendev.org/c/openstack/neutron/+/939558 | 20:07 |
haleyb | and bug 1966394 | 20:08 |
opendevreview | Terry Wilson proposed openstack/neutron master: Update Nova aggregates on changed host mappings https://review.opendev.org/c/openstack/neutron/+/935990 | 21:55 |
ihrachys | haleyb: this seems like sweeping trash under the carpet :p | 21:59 |
haleyb | ihrachys: it would be temporary until we can investigate further i would hope | 22:03 |
ihrachys | haleyb: then at the very least add a TODO to remove it? | 22:03 |
ihrachys | I am fine with unblocking the gate as much as possible; I'm concerned that later followups often never happen :) | 22:03 |
haleyb | looking at logs/screen-memory_tracker.txt i don't see one process being a hog in that patch | 22:08 |
ihrachys | yeah same, nothing popped up. there are lots of privsep processes I think but no single hogger | 22:13 |
haleyb | ihrachys: so in the "shortcut url" patch, i do see mem went to zero, https://19862c796f0176e05bae-03c0af7271380f8d13c3735dacc9c317.ssl.cf2.rackcdn.com/939272/2/gate/neutron-functional/23f58b6/controller/logs/screen-dstat.txt | 22:16 |
haleyb | so something got triggered, typically it is well above 1G | 22:17 |
haleyb | it fell off a cliff at Jan 17 20:41:05.634911 | 22:19 |
haleyb | in 20 seconds it gobbled 2G, that's one little piggy | 22:20 |
haleyb | ovs-vswitchd | 22:21 |
ihrachys | it got close to zero in 20:18:20 too (150mb) | 22:21 |
haleyb | Jan 17 20:48:10.003783 np0039605950 memory_tracker.sh[342949]: [ovs-vswitchd (pid:66606)]=925872KB | 22:22 |
haleyb | 2 seconds before that | 22:22 |
haleyb | Jan 17 20:46:29.766784 np0039605950 memory_tracker.sh[332205]: [ovs-vswitchd (pid:66606)]=531716KB | 22:22 |
ihrachys | that's minutes not seconds | 22:23 |
ihrachys | but yeah, impressive performance, top eater | 22:23 |
haleyb | ETOOMANYNUMBERS | 22:24 |
ihrachys | pid number 66606, SPOOKY | 22:24 |
ihrachys | :) | 22:24 |
haleyb | number of the beast :-o at least we have a change that can trigger it | 22:27 |
ihrachys | a lot of messages in vswitchd: 2025-01-17T20:47:18.413Z|13078|netdev_linux|INFO|gre0 device has unknown hardware address family 778 though I don't know if maybe it's normal | 22:29 |
haleyb | yeah, and we could be reading that whole file wrong | 22:29 |
haleyb | all i can say is I don't like those messages, but i don't see anything in that code showing a memory leak | 22:53 |
opendevreview | Merged openstack/neutron master: Add option to configure live migration activation strategy for OVN https://review.opendev.org/c/openstack/neutron/+/938106 | 23:26 |
opendevreview | Brian Haley proposed openstack/neutron master: Increase swap in functional job to 12GB https://review.opendev.org/c/openstack/neutron/+/939558 | 23:39 |
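
The interval between the two ovs-vswitchd samples quoted at 22:22 was briefly misread in the discussion ("2 seconds" versus "minutes not seconds"). A minimal sketch of how the two lines could be compared is shown here. The regular expression is an assumption fitted only to those two quoted lines and is not taken from `memory_tracker.sh` itself; the year 2025 is taken from the vswitchd timestamp quoted at 22:29; the helper names `parse_sample` and `growth` are hypothetical.

```python
import re
from datetime import datetime

# The two ovs-vswitchd samples quoted in the log, copied verbatim.
SAMPLES = [
    "Jan 17 20:46:29.766784 np0039605950 memory_tracker.sh[332205]: "
    "[ovs-vswitchd (pid:66606)]=531716KB",
    "Jan 17 20:48:10.003783 np0039605950 memory_tracker.sh[342949]: "
    "[ovs-vswitchd (pid:66606)]=925872KB",
]

# Assumed line layout, inferred from the two samples only:
#   <Mon DD HH:MM:SS.ffffff> <host> <tracker>[<n>]: [<name> (pid:<pid>)]=<kb>KB
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"\[(?P<name>[^\s\]]+) \(pid:(?P<pid>\d+)\)\]=(?P<kb>\d+)KB$"
)


def parse_sample(line, year=2025):
    """Return (timestamp, process name, pid, memory in KB) for one line."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognised memory tracker line: {line!r}")
    # The log lines carry no year, so one is supplied explicitly.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
    return ts, m["name"], int(m["pid"]), int(m["kb"])


def growth(first_line, second_line):
    """Return (elapsed seconds, memory growth in KB) between two samples."""
    t1, _, _, kb1 = parse_sample(first_line)
    t2, _, _, kb2 = parse_sample(second_line)
    return (t2 - t1).total_seconds(), kb2 - kb1


elapsed_s, delta_kb = growth(*SAMPLES)
print(f"{delta_kb} KB over {elapsed_s:.1f} s")  # prints "394156 KB over 100.2 s"
```

Under these assumptions the process grew by roughly 385 MB in about 100 seconds, which matches the "minutes not seconds" correction made at 22:23.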