14:00:48 #startmeeting RDO meeting - 2021-07-21
14:00:48 Meeting started Wed Jul 21 14:00:48 2021 UTC and is due to finish in 60 minutes. The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:48 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:48 The meeting name has been set to 'rdo_meeting___2021_07_21'
14:01:10 Please add topic to agenda https://etherpad.opendev.org/p/RDO-Meeting
14:01:16 #topic roll call
14:02:04 o/
14:02:10 #chair spotz
14:02:10 Current chairs: spotz ykarel
14:02:22 o/
14:02:31 #chair amoralej
14:02:31 Current chairs: amoralej spotz ykarel
14:02:31 o/
14:02:35 #chair jcapitao
14:02:35 Current chairs: amoralej jcapitao spotz ykarel
14:03:45 Ok let's start with topics in agenda
14:03:59 #topic C9 Stream Updates
14:05:46 #link https://review.rdoproject.org/r/c/testproject/+/33878
14:06:02 #link https://review.rdoproject.org/r/c/testproject/+/34549
14:06:37 #info some tests were run with devstack + c9
14:07:08 amoralej, any other updates apart from ^?
14:07:18 i think that's the main progress
14:07:28 i tried to debug some issues in devstack runs
14:07:37 but i think i need someone to check
14:07:49 amoralej, related to tempest failures?
14:07:54 yes
14:07:56 i noticed some of those were random
14:08:07 i'm not sure if it's just random tbh
14:08:20 okk
14:08:32 tosky, ^ we are running devstack on Centos9 and got some errors in cinder
14:08:50 Wasn't there a post about tempest failures? Might have been last week
14:08:50 may we get some help from some cinder expert?
14:08:54 like in https://logserver.rdoproject.org/49/34549/20/check/devstack-platform-centos-9-stream/d2b0b81/testr_results.html there was just 1 failure
14:09:35 and the run before and after ^ there were more failures
14:09:52 after adding swift we got more errors
14:09:53 https://logserver.rdoproject.org/49/34549/20/check/devstack-platform-centos-9-stream/793602b/testr_results.html
14:10:01 but i'm not sure if those are related to swift
14:10:07 or maybe performance related
14:10:17 but there seems to be some pattern
14:10:24 and be mainly related to cinder
14:10:36 amoralej: is it really cinder or something that looks cinder but it's really, say, nova? No :)
14:11:06 in fact it's nova complaining but seems to be cinder :)
14:13:24 it's unclear to me, tbh
14:13:49 Jul 19 07:59:29.720234 node-0001435902 devstack@c-api.service[107162]: CRITICAL cinder [None req-3be6ea4f-0dfb-4686-9ad3-3ad58f157fc6 tempest-ServerStableDeviceRescueTest-1149582697 tempest-ServerStableDeviceRescueTest-1149582697-project] Unhandled error: OSError: write error
14:13:59 that looks suspicious
14:14:08 https://logserver.rdoproject.org/49/34549/20/check/devstack-platform-centos-9-stream/793602b/controller/logs/screen-c-api.txt
14:14:59 anyway, we can follow up later
14:15:16 yeap ok let's move to next
14:15:33 #topic Jenkins Migration Updates
14:15:53 #link https://review.rdoproject.org/r/q/topic:jenkins-v2
14:16:01 jcapitao, anything to add in this
14:16:28 so the patch to migrate promotion jobs for tripleo master is ready
14:16:36 #link https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/34419
14:17:07 as well for per releases and distro jobs (based on master patch)
14:17:17 #link https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/34560
14:17:36 now we're working on tripleo-quickstart 3rd party job which needs some specific Jenkins configuration
14:18:46 ok Good, Thanks jcapitao
14:19:05 weshay|ruck, rlandy|ruck can u have a look ^
14:19:34 ykarel: ack - will look after meetings
14:19:44 weshay|ruck is PTO
14:19:53 jcapitao: thanks for setting this up
14:19:56 rlandy|ruck, okk Thanks
14:20:42 Ok let's move to next topic
14:20:57 #topic RDO Trunk repos for centos-[queens|rocky|stein] from last meeting
14:21:13 amoralej, you got some updates on ^?
14:21:21 yes, sorry, i realized i missed sending the mail
14:21:33 i was writing it right before the meeting, i'll send it asap
14:21:49 ok np
14:21:58 can followup next week
14:22:17 #topic chair for next week
14:22:25 any volunteer?
14:22:32 I can take it
14:22:52 Thanks jcapitao
14:22:58 Thanks jcapitao
14:23:05 #action jcapitao to chair next week
14:23:24 #topic Open Floor
14:23:31 Feel free to bring any topic now
14:23:50 amoralej: back to that cinder issue (so that's in the log) that error you reported (OSError) can be seen also in different tests (look for tempest-ServerStableDeviceRescueTest), so it may not be critical
14:23:57 there is a proxy error here: https://logserver.rdoproject.org/49/34549/20/check/devstack-platform-centos-9-stream/793602b/controller/logs/screen-n-cpu.txt
14:24:06 look for Jul 19 07:59:29
14:25:26 tosky, iiuc the actual error there is:
14:25:27 ERROR nova.virt.libvirt.driver [None req-bc98f5f5-5bdf-4839-8e92-54e818ebbfca tempest-ServerStableDeviceRescueTest-1149582697 tempest-ServerStableDeviceRescueTest-1149582697-project] Waiting for libvirt event about the detach of device vdb with device alias virtio-disk1 from instance ca12fb3e-0e7c-4460-ab03-c77a593ccc75 is timed out.
14:25:40 that's how i was to check cinder
14:26:00 but i'm not sure who is responsible for that detaching
14:26:07 is that nova issue?
14:28:48 it's on the boundary and that's the part I'm not sure about
14:28:57 I'm sure geguileo knows if it's cinder or not
14:29:03 yeah, we are in the same page then :)
14:29:28 ok i just rechecked, and will request node hold, so can check on live env
14:29:35 good
14:29:36 tosky: let me catch up on the conversation...
14:31:27 Joel Capitao proposed config master: Update script which sends review based on u-c changes https://review.rdoproject.org/r/c/config/+/33518
14:32:34 tosky: amoralej that issue is in Nova's domain, because the timeout is coming from libvirt
14:33:12 it's failing before they even call os-brick (telling the instance to stop using the block device)
14:33:15 ok, i'll look for some nova friend now
14:34:24 amoralej: no, I may be wrong...
14:34:28 let me have another look...
14:35:33 looking at request req-bc98f5f5-5bdf-4839-8e92-54e818ebbfca which is the one with the error pasted above
14:35:48 that error was temporary, because on the next try it succeeded
14:36:16 so, it failed calling cinder :-(
14:38:05 yes seems so, both the test where it was seeing timeout, have actually passed.
14:38:22 failing one is ERROR nova.compute.manager [None req-70164f48-e90d-494a-8ccd-fbcade5428d4 tempest-AttachVolumeMultiAttachTest-1454555876 tempest-AttachVolumeMultiAttachTest-1454555876-project] [instance: b725072c-0ddf-41ba-9ffd-ad61aa4e54fd] Failed to attach deaabc20-29d2-4ec4-a8a0-ba0fc827b0d7 at /dev/vdb: cinderclient.exceptions.ClientException: Proxy Error (HTTP 502)
14:39:10 the scary part is that the actual cinder code returns 200
14:39:25 Jul 19 07:59:34.690149 node-0001435902 devstack@c-api.service[107162]: INFO cinder.api.openstack.wsgi [req-bc98f5f5-5bdf-4839-8e92-54e818ebbfca req-86510417-b70f-4423-83b6-cceb0e70caab tempest-ServerStableDeviceRescueTest-1149582697 tempest-ServerStableDeviceRescueTest-1149582697-project]
14:39:26 it may be even uwsgi itself
14:39:27 https://38.102.83.14/volume/v3/170ba9baa97a46baac4c99ef456e5d84/attachments/beac4203-1e49-472f-8a1e-2bb3d7d8fe3c returned with HTTP 200
14:40:09 geguileo, the log is written before the request result is passed to uwsgi?
14:40:51 it would seem so, because it fails after cinder API code is "done"
14:41:12 https://stackoverflow.com/questions/36156887/uwsgi-raises-oserror-write-error-during-large-request
14:42:25 amoralej: no I believe it's wsgi who returns the log return value
14:43:57 let's see if we can reproduce it and check
14:44:18 amoralej: https://github.com/openstack/cinder/blob/81f2aaeea91bce6455f9b09cc8795855200e75e1/cinder/api/openstack/wsgi.py#L938-L954
14:46:56 that's the INFO logging and what cinder does afterwards...
14:47:09 doesn't seem to do any OS calls
14:47:52 uwsgi_response_writev_headers_and_body_do(): Broken pipe [core/writer.c line 306]
14:48:06 yes, looks like it's uwsgi itself
14:50:00 Lock acquired by "attachment_update"
14:50:07 doesn't seem this released ^
14:50:26 ^ seems related geguileo ^?
14:51:20 following req-70164f48-e90d-494a-8ccd-fbcade5428d4 in cinder api log
14:51:32 https://logserver.rdoproject.org/49/34549/20/check/devstack-platform-centos-9-stream/d2b0b81/controller/logs/screen-c-api.txt
14:52:40 Ok let's have this offline, /me ends meeting
14:52:43 Thanks all
14:52:53 #endmeeting