15:00:56 <slaweq> #startmeeting neutron_ci
15:01:01 <openstack> Meeting started Tue Dec 8 15:00:56 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:03 <slaweq> welcome again :)
15:01:05 <openstack> The meeting name has been set to 'neutron_ci'
15:01:24 <bcafarel> hi again
15:02:47 <slaweq> let's wait a few more minutes for other folks
15:03:46 <lajoskatona> Hi
15:03:55 <bcafarel> it gives me more time to watch progress/failure on https://review.opendev.org/c/openstack/neutron/+/766000
15:04:40 <slaweq> ok, let's start and do that quickly
15:04:49 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:04:56 <slaweq> #topic Actions from previous meetings
15:05:01 <slaweq> bcafarel to check and update doc https://docs.openstack.org/neutron/latest/contributor/policies/release-checklist.html
15:05:09 * bcafarel looks for link
15:05:49 <bcafarel> https://review.opendev.org/c/openstack/neutron/+/765959 I also added a note on the neutron-tempest-plugin template and *-master jobs
15:06:57 <slaweq> thx
15:07:03 <slaweq> I will check that later
15:08:49 <slaweq> next one
15:08:51 <slaweq> ralonsoh to report and check issue with TestSimpleMonitorInterface in functional tests
15:08:56 <slaweq> ralonsoh is not here today
15:09:18 <slaweq> but as this hits us pretty often recently, I spent some time yesterday checking it
15:09:23 <slaweq> LP https://bugs.launchpad.net/neutron/+bug/1907068
15:09:25 <openstack> Launchpad bug 1907068 in neutron "Functional test neutron.tests.functional.agent.linux.test_ovsdb_monitor.TestSimpleInterfaceMonitor.test_get_events" [Critical,In progress] - Assigned to Slawek Kaplonski (slaweq)
15:09:25 <slaweq> Patch https://review.opendev.org/c/openstack/neutron/+/765792
15:09:34 <slaweq> I hope that this will help with that issue
15:09:38 <slaweq> so please review it :)
15:10:21 <bcafarel> universal solution (aka add some sleep()) detected :)
15:10:28 <slaweq> bcafarel: yes :/
15:10:37 <slaweq> but I don't really see a better way to solve it
15:11:38 <slaweq> next one
15:11:40 <slaweq> slaweq to check if test_dhcp_port_status_active will still be failing after https://review.opendev.org/c/openstack/neutron/+/755313 is merged
15:11:47 <slaweq> it didn't help for sure
15:11:56 <slaweq> so bug reported https://bugs.launchpad.net/neutron/+bug/1906654
15:11:57 <openstack> Launchpad bug 1906654 in neutron "neutron_tempest_plugin.api.admin.test_dhcp_agent_scheduler.DHCPAgentSchedulersTestJSON.test_dhcp_port_status_active is failing often" [Critical,Confirmed]
15:12:05 <slaweq> and skip proposed https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/765327
15:12:31 <slaweq> it's already merged so it should give us some breathing room in CI
15:12:40 <slaweq> but we need to investigate why this really happens
15:12:53 <slaweq> next one
15:12:56 <slaweq> slaweq to report LP about SSH failures in the neutron-ovn-tempest-ovs-release-ipv6-only
15:12:59 <slaweq> Bug https://bugs.launchpad.net/neutron/+bug/1906490
15:13:00 <openstack> Launchpad bug 1906490 in neutron "SSH failures in the neutron-ovn-tempest-ovs-release-ipv6-only job" [Critical,Confirmed]
15:13:01 <slaweq> Patch to skip failing test https://review.opendev.org/c/openstack/neutron/+/765070
15:13:34 <slaweq> as in most or all cases it's this one test which is failing, I proposed to skip it for now
15:13:41 <slaweq> so please review that patch too :)
15:15:56 <slaweq> and that's all regarding last week
15:16:00 <slaweq> let's move on
15:16:12 <slaweq> anything regarding stadium projects or stable branches?
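[editor's note: for context on the sleep() fix discussed above — the usual alternative to a bare sleep() in this kind of flaky functional test is a poll-with-timeout helper (neutron carries a similar `wait_until_true` utility). This is a minimal self-contained sketch with hypothetical names, not the actual patch under review:]

```python
import time


def wait_until_true(predicate, timeout=60, sleep=1):
    """Poll predicate() until it returns True or timeout (seconds) expires.

    Returns True on success, False on timeout. A stand-in for the kind of
    helper used instead of a fixed sleep() when waiting for async events.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(sleep)
    return False


# Example: wait for a (simulated) interface monitor to report an event.
events = []
events.append("interface-added")
assert wait_until_true(lambda: len(events) >= 1, timeout=5, sleep=0.1)
```

The advantage over a fixed sleep() is that the test returns as soon as the condition holds, and only pays the full timeout on genuine failure.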
15:17:28 <bcafarel> not much from my side, https://review.opendev.org/c/openstack/requirements/+/764021 still not merged for ussuri
15:17:54 <bcafarel> and I did not check stable branches much yet, though nobody complained they were broken so it should be OK
15:18:15 <slaweq> :)
15:18:22 <lajoskatona> for stadium: the patch to mark some bgpvpn tests unstable was just merged
15:18:38 <slaweq> lajoskatona: yes, I saw
15:18:46 <slaweq> and it makes the job voting again, right?
15:19:06 <lajoskatona> so the job is voting again
15:19:14 <lajoskatona> yes exactly
15:19:21 <slaweq> ++
15:21:15 <slaweq> W
15:21:29 <slaweq> sorry
15:21:49 <slaweq> so I think we can move on
15:21:55 <slaweq> #topic Grafana
15:22:40 <slaweq> overall the check queue seems to be in a very bad state due to that issue with pip dependencies
15:23:14 <slaweq> also functional and fullstack jobs are failing 100% of the time
15:25:29 <bcafarel> sigh
15:26:08 <bcafarel> for pip deps I am playing whack-a-mole "fix one dep, hit a new error" in https://review.opendev.org/c/openstack/neutron/+/766000
15:26:17 <slaweq> for fullstack it seems that the error is similar https://zuul.opendev.org/t/openstack/build/d9c4e4ac0b794b7f8c667249a9eb3a35
15:26:22 <slaweq> but not exactly the same
15:26:51 <slaweq> it's the same for functional and fullstack
15:26:53 <lajoskatona> bcafarel: LOL
15:26:53 <bcafarel> yes that sounds like a similar root cause (new resolver in pip or whatever)
15:27:30 <slaweq> lajoskatona: bcafarel: as You are already playing with it, can You also check the fullstack and functional jobs?
15:28:14 <bcafarel> at least to pass CI we will need all of them fixed indeed
15:28:44 <lajoskatona> slaweq: yeah, I'll ask around in case some common wisdom comes from the infra team or a similar place
15:28:56 <slaweq> lajoskatona: thx
15:29:26 <slaweq> please use this LP which I reported to track all those issues, I don't think we need another one for the fullstack/functional jobs too
15:29:47 <bcafarel> +1
15:30:10 <slaweq> thx
15:31:34 <slaweq> other than that I don't have too many other things to discuss today
15:32:32 <slaweq> in most cases I think we are hitting known issues, like the TestSimpleMonitor failure in the functional job (should be fixed with sleep()), and the fullstack lack of resources which should be fixed by lajoskatona's patch
15:32:46 <slaweq> and the neutron-ovn-tempest-ovs-release-ipv6-only issues with ssh
15:34:13 <slaweq> I have just one more topic to discuss
15:34:19 <slaweq> a few days ago I sent an email http://lists.openstack.org/pipermail/openstack-discuss/2020-December/019240.html
15:34:27 <slaweq> about kernel panics in guest vms
15:35:38 <slaweq> according to the log there we should try to use the "noapic" option during boot of the vm
15:36:00 <slaweq> do You know if there is any way we can do that from our jobs? (I'm not an expert there and I don't really know)
15:36:12 <slaweq> or would we need to change the cirros image to achieve that?
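[editor's note: the whack-a-mole bcafarel describes typically ends in explicit pins — pip's new backtracking resolver (pip 20.3, rolled out around this time) refuses conflicting requirement sets that the old resolver silently installed, and the usual fix is a constraints-file entry. A hypothetical fragment, with package names and versions purely illustrative:]

```text
# upper-constraints.txt (illustrative only): pin versions the new
# resolver can agree on across all of the job's requirement files
decorator===4.4.2
dogpile.cache===1.1.1
```

Applied with pip's real `-c` option, e.g. `pip install -c upper-constraints.txt -r requirements.txt` (`===` is pip's arbitrary-equality operator, the style OpenStack's upper-constraints file uses).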
15:37:14 <bcafarel> hmm
15:37:46 <bcafarel> I think this can be done in qemu itself (passing this kind of option), I guess nova folks will know better
15:39:13 <slaweq> according to sean-k-mooney's reply nova could provide such an option, so IIUC currently it isn't really possible
15:39:41 <sean-k-mooney> currently we can't disable it, no
15:39:56 <sean-k-mooney> we would have to add a new flag to the glance image properties
15:40:05 <sean-k-mooney> this looks like a cirros kernel bug
15:40:20 <sean-k-mooney> i don't think the 5.3 kernel has been patched
15:41:20 <sean-k-mooney> cirros is i believe more or less unmaintained so we will eventually have to move to something like alpine to fill the same role as a long-term solution
15:41:41 <slaweq> sean-k-mooney: are You talking about https://alpinelinux.org/ ?
15:41:45 <slaweq> or something else?
15:41:48 <sean-k-mooney> yes that
15:42:12 <slaweq> but I saw there that those images are much bigger than cirros
15:42:13 <sean-k-mooney> it's the smallest actively developed os that would be a reasonable replacement
15:42:24 <sean-k-mooney> not that much
15:42:34 <sean-k-mooney> they are still ~40mb
15:42:45 <sean-k-mooney> less than 100mb certainly
15:43:00 <sean-k-mooney> we have a min vm size of 1GB
15:43:14 <slaweq> http://dl-cdn.alpinelinux.org/alpine/v3.12/releases/x86_64/alpine-virt-3.12.1-x86_64.iso
15:43:18 <slaweq> this one, right?
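[editor's note: to make the proposed flag concrete — an image property like the one sean-k-mooney describes would ultimately toggle the APIC feature element in the libvirt guest XML that nova generates. The property name (`hw_apic`) and helper below are purely hypothetical illustrations, not nova or libvirt driver code:]

```python
import xml.etree.ElementTree as ET


def build_features_xml(image_properties):
    """Build a libvirt <features> element, omitting <apic/> when a
    (hypothetical) hw_apic='off' image property is set."""
    features = ET.Element("features")
    ET.SubElement(features, "acpi")
    if image_properties.get("hw_apic", "on") != "off":
        ET.SubElement(features, "apic")
    return ET.tostring(features, encoding="unicode")


print(build_features_xml({}))                  # APIC enabled (default)
print(build_features_xml({"hw_apic": "off"}))  # APIC feature omitted
```

Note this only covers the hypervisor-side switch; passing "noapic" on the guest kernel command line instead would require changing the cirros image itself, which is the trade-off the channel is weighing.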
15:43:30 <bcafarel> and with a larger community around it too, when you use cirros people often reply "oh you are from openstack"
15:43:48 <sean-k-mooney> slaweq: ya one sec
15:44:06 <sean-k-mooney> https://review.opendev.org/q/topic:%22alpine%22+(status:open%20OR%20status:merged)
15:44:40 <sean-k-mooney> i started adding support for alpine but i hit an initramfs issue using the minimal filesystem image
15:44:49 <sean-k-mooney> which is for containers and chroots
15:45:03 <sean-k-mooney> i had planned to eventually swap to that iso as a base
15:45:09 <sean-k-mooney> but have not had time to work on it
15:45:32 <sean-k-mooney> short term if this is blocking the gate
15:45:33 <slaweq> ok, I will try to check this iso locally maybe
15:45:54 <sean-k-mooney> i can add a flag to nova quickly for this
15:46:01 <sean-k-mooney> and then ping the core team to review
15:46:11 <sean-k-mooney> e.g. to disable the ioapic
15:46:12 <slaweq> it's not blocking the gate but I see some jobs failing due to kernel panics in the guest vm at least a few times a week
15:46:55 <slaweq> for sure we have more urgent and critical issues but somehow fixing this one would also help, I think not only Neutron but other projects too :)
15:47:16 <sean-k-mooney> i'll bring it up on the nova channel shortly and get buy-in, if there is no objection i'll quickly write a patch as a workaround
15:47:31 <slaweq> sean-k-mooney: would be great, thx a lot
15:47:31 <sean-k-mooney> ya it showed up in the nova jobs too
15:48:46 <slaweq> ok, that was the last thing I had for this week
15:49:47 <slaweq> if You don't have anything else, I will give You a few minutes back
15:50:32 <slaweq> thx for attending the meeting
15:50:35 <bcafarel> sounds good
15:50:38 <slaweq> have a great week
15:50:40 <bcafarel> o/
15:50:40 <lajoskatona> Bye
15:50:41 <slaweq> o/
15:50:44 <slaweq> #endmeeting