opendevreview | Merged openstack/tempest master: Setting Tempest run concurrency to 4 for a few jobs https://review.opendev.org/c/openstack/tempest/+/890689 | 00:07 |
---|---|---|
opendevreview | Merged openstack/tempest master: Skip test_image_tasks_create() for bug 2030527 https://review.opendev.org/c/openstack/tempest/+/890687 | 00:29 |
opendevreview | Merged openstack/devstack master: Add SERVICE_REPORT_INTERVAL knob https://review.opendev.org/c/openstack/devstack/+/890439 | 05:05 |
opendevreview | wangjiaqi proposed openstack/hacking master: Use py3 as the default runtime for tox https://review.opendev.org/c/openstack/hacking/+/890725 | 06:40 |
opendevreview | sean mooney proposed openstack/devstack master: [WIP] add support for zswap and ksmtuned https://review.opendev.org/c/openstack/devstack/+/890693 | 11:29 |
opendevreview | sean mooney proposed openstack/devstack master: [WIP] add support for zswap and ksmtuned https://review.opendev.org/c/openstack/devstack/+/890693 | 12:34 |
opendevreview | Merged openstack/tempest master: Remove nova-network tests https://review.opendev.org/c/openstack/tempest/+/890471 | 13:04 |
opendevreview | Brian Haley proposed openstack/devstack master: Fix $LOGDIR owner to be stack.stack https://review.opendev.org/c/openstack/devstack/+/890792 | 13:08 |
opendevreview | Lukas Piwowarski proposed openstack/tempest master: Skip test_list_no_containers when pre-prov creds are used https://review.opendev.org/c/openstack/tempest/+/890798 | 14:06 |
dansmith | kopecmartin: around by chance? | 14:10 |
kopecmartin | dansmith: yes, what's up? | 14:11 |
dansmith | so I'm trying to debug this tempest test | 14:11 |
dansmith | and I swear it looks like _run_cleanups() starts running in the middle of the test, and somewhat in parallel to continued operation of the test | 14:11 |
dansmith | https://5248ae6d6484a440a059-bda51eedb42181063c5344e0473d3d05.ssl.cf2.rackcdn.com/888470/1/gate/nova-multi-cell/2964990/testr_results.html | 14:11 |
dansmith | open that test failure and search for _run_cleanups() | 14:12 |
dansmith | (the test not the class) | 14:12 |
dansmith | that first instance of _run_cleanups() happens right after we do a get on the server | 14:13 |
dansmith | and then right after it does a post for tokens, we see a post to do the rebuild | 14:13 |
dansmith | and then further down we see an ssh validation | 14:13 |
dansmith | so I just realized.. | 14:14 |
dansmith | maybe this is the test class rebuilding the server after the test is done? | 14:14 |
dansmith | it just so happens to be a rebuild test, so maybe that's where I went off the rails | 14:14 |
dansmith | er, wait, do we rebuild in between? _recreate_server() actually creates a new one... | 14:17 |
kevko | i would like to ask also ... is this test passing upstream ? | 14:17 |
kevko | https://paste.openstack.org/show/bOCkBdzQyhB9F1XrRPUr/ | 14:17 |
dansmith | kevko: it should be easy to verify, but yes: | 14:20 |
dansmith | test_server_detach_rules[compute,id-be615530-f105-437a-8afe-ce998c9535d9,image,network,slow,volume] | 14:20 |
dansmith | pass | 14:20 |
kopecmartin | dansmith: hm, the server is recreated in the setup (if the server isn't found) and in the resource_setup .. _test_rebuild_server should only rebuild it .. although I see that we run validation twice (once as part of the test and once in _test_rebuild_server) | 14:23 |
dansmith | kopecmartin: that log looks to me like _run_cleanups() is doing a rebuild of the server | 14:24 |
dansmith | it's doing a POST to /action, and that post is a rebuild | 14:24 |
dansmith | I can't even find _run_cleanups() .. is that part of tempest or testtools or something? | 14:24 |
dansmith | maybe something has done an addCleanup() that is doing a rebuild somewhere? | 14:25 |
kevko | dansmith: can u share a link to some config/test results in zuul so i can check my configs ? | 14:26 |
dansmith | kevko: the link I posted above has that test passing | 14:26 |
kevko | dansmith: can i see somehow configs ? | 14:27 |
dansmith | kopecmartin: oooh, I may have found it | 14:27 |
dansmith | kevko: go up the directory tree from that log report | 14:27 |
dansmith | kopecmartin: self.addCleanup(self._rebuild_server_and_check, old_image, server) | 14:28 |
dansmith | several tests do that | 14:28 |
kopecmartin | yes, only if the image is different than the original image | 14:31 |
kopecmartin | i'm going through the code and i don't see anything suspicious (except the extra validation) yet | 14:32 |
kopecmartin | _run_cleanups is a mystery though, i don't see such function anywhere o.O | 14:33 |
dansmith | I think that must be something in testtools that runs all your addCleanup() hooks | 14:33 |
dansmith | but yeah, I think what's going on is that we schedule a rebuild of the test server and then we later call another cleanup hook to detach the volume, but it can't (yet) | 14:34 |
dansmith | we should probably run the detach first so we're not rebuilding the server with the volume attached anyway | 14:35 |
kevko | dansmith: can i check somehow what roles users have ? | 14:36 |
dansmith | kopecmartin: yeah so those two volume wait traces at the bottom match what we schedule in attach_volume() so that's the deal | 14:38 |
dansmith | so detaching first might make that more likely to succeed, but the problem is likely just the flakiness involved in volume detach | 14:39 |
dansmith | kevko: tempest normally creates users for each test, and the roles that user has would be "member" AFAIK for anything that doesn't use an admin client | 14:42 |
kopecmartin | #startmeeting qa | 15:00 |
opendevmeet | Meeting started Tue Aug 8 15:00:36 2023 UTC and is due to finish in 60 minutes. The chair is kopecmartin. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
opendevmeet | The meeting name has been set to 'qa' | 15:00 |
kevko | dansmith: okay, and why do you have service_token_roles_required = False | 15:00 |
kopecmartin | #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Agenda_for_next_Office_hours | 15:01 |
kevko | dansmith: because there should be = True | 15:01 |
kopecmartin | agenda ^ | 15:01 |
frickler | o/ | 15:01 |
lpiwowar | o/ | 15:01 |
kevko | dansmith: as per https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html | 15:01 |
lpiwowar | Is gerrit down or is it only problem on my side? | 15:02 |
kopecmartin | kevko: where is that? i don't see such setting in tempest or devstack | 15:03 |
kopecmartin | lpiwowar: seems down | 15:04 |
opendevreview | Lukas Piwowarski proposed openstack/tempest master: Skip test_list_no_containers when pre-prov creds are used https://review.opendev.org/c/openstack/tempest/+/890798 | 15:04 |
kevko | kopecmartin: https://5248ae6d6484a440a059-bda51eedb42181063c5344e0473d3d05.ssl.cf2.rackcdn.com/888470/1/gate/nova-multi-cell/2964990/controller/logs/screen-c-api.txt << ctrl+f service_token_roles_required | 15:04 |
kevko | ^^ this should be True as per https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html | 15:04 |
frickler | gerrit looks fine to me | 15:05 |
kevko | that's the reason why upstream tempest.scenario.test_server_volume_attachment.TestServerVolumeAttachmentScenario.test_server_detach_rules is passing ... | 15:05 |
kevko | in my env where i am setting service_token_roles_required = True it's failing ...but i got 401 ...so i think the test is broken .... | 15:05 |
lpiwowar | frickler: it works for me now too | 15:06 |
frickler | kevko: so maybe devstack/tempest need to adapt the deployment and then fix the test. can we go on with the meeting now, though? | 15:07 |
kopecmartin | kevko: the setting seems to be coming from keystone middleware .. | 15:07 |
frickler | likely default=false for backwards compat | 15:07 |
kopecmartin | i'll try to dig deeper, but now let's start with the meeting | 15:07 |
kopecmartin | yeah, will need to find some patches .. maybe it's something we forgot to enable in tempest/devstack | 15:08 |
kopecmartin | #topic Announcement and Action Item (Optional) | 15:08 |
kopecmartin | none from my side | 15:09 |
kopecmartin | #topic Bobcat Priority Items progress | 15:09 |
kopecmartin | #link https://etherpad.opendev.org/p/qa-bobcat-priority | 15:09 |
* kopecmartin | checking the doc | 15:09 |
kopecmartin | seems i need to follow up on | 15:10 |
kopecmartin | #link https://review.opendev.org/c/openstack/devstack/+/558930 | 15:10 |
kopecmartin | there have been recent changes | 15:10 |
frickler | I updated the venv name according to dansmith's suggestion | 15:11 |
frickler | which actually I had commented upon 5 years ago, too ;) | 15:11 |
frickler | then there's still the discussion whether to symlink only specific binaries into /usr/local or add the complete venv into the path | 15:12 |
frickler | I would like us to proceed with what we have now, but there also hasn't been a lot of reviews | 15:13 |
kopecmartin | yeah, let me revisit that right after this meeting, personally i'd rather merge this sooner than later as it may break someone else and we'll be approaching another release soon | 15:17 |
frickler | in particular adding the bookworm job would be good, ack | 15:19 |
kopecmartin | and as the jobs passed, it mostly works, the rest can be figured out later | 15:19 |
kopecmartin | #topic Gate Status Checks | 15:20 |
kopecmartin | #link https://review.opendev.org/q/label:Review-Priority%253D%252B2+status:open+(project:openstack/tempest+OR+project:openstack/patrole+OR+project:openstack/devstack+OR+project:openstack/grenade) | 15:20 |
kopecmartin | nothing there | 15:20 |
kopecmartin | any urgent patches to review? | 15:20 |
lpiwowar | Can somebody please take a look at this patch: https://review.opendev.org/c/openstack/tempest/+/890653 ? It's about skipping tests that do user password update when minimum_password_age is set in keystone.conf. | 15:20 |
lpiwowar | But it is probably not urgent kopecmartin. | 15:21 |
kopecmartin | lpiwowar: right, it's on my list | 15:21 |
lpiwowar | kopecmartin: thanks! | 15:21 |
kopecmartin | #topic Bare rechecks | 15:21 |
kopecmartin | #link https://etherpad.opendev.org/p/recheck-weekly-summary | 15:22 |
kopecmartin | due to all the rechecks we made last week we quite pushed the percentage down :D | 15:22 |
kopecmartin | #topic Periodic jobs Status Checks | 15:22 |
kopecmartin | periodic stable full | 15:22 |
kopecmartin | #link https://zuul.openstack.org/builds?pipeline=periodic-stable&job_name=tempest-full-yoga&job_name=tempest-full-xena&job_name=tempest-full-zed&job_name=tempest-full-2023-1 | 15:22 |
kopecmartin | periodic stable slow | 15:22 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=tempest-slow-2023-1&job_name=tempest-slow-zed&job_name=tempest-slow-yoga&job_name=tempest-slow-xena | 15:22 |
kopecmartin | periodic extra tests | 15:22 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=tempest-full-2023-1-extra-tests&job_name=tempest-full-zed-extra-tests&job_name=tempest-full-yoga-extra-tests&job_name=tempest-full-xena-extra-tests | 15:22 |
kopecmartin | periodic master | 15:22 |
kopecmartin | #link https://zuul.openstack.org/builds?project=openstack%2Ftempest&project=openstack%2Fdevstack&pipeline=periodic | 15:22 |
opendevreview | Ashley Rodriguez proposed openstack/devstack-plugin-ceph master: Remote Ceph with cephadm https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/876747 | 15:23 |
kopecmartin | tempest-all timeouts .. it seems it's run with concurrency == 6 | 15:25 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=tempest-all&project=openstack%2Ftempest&project=openstack%2Fdevstack&pipeline=periodic&skip=0 | 15:25 |
kopecmartin | hm | 15:25 |
kopecmartin | #topic Distros check | 15:27 |
kopecmartin | cs-9 | 15:27 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=tempest-full-centos-9-stream&job_name=devstack-platform-centos-9-stream&skip=0 | 15:27 |
kopecmartin | fedora | 15:27 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=devstack-platform-fedora-latest&skip=0 | 15:27 |
kopecmartin | debian | 15:27 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=devstack-platform-debian-bullseye&skip=0 | 15:27 |
kopecmartin | focal | 15:27 |
kopecmartin | #link https://zuul.opendev.org/t/openstack/builds?job_name=devstack-platform-ubuntu-focal&skip=0 | 15:27 |
kopecmartin | rocky | 15:27 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=devstack-platform-rocky-blue-onyx | 15:27 |
kopecmartin | openEuler | 15:27 |
kopecmartin | #link https://zuul.openstack.org/builds?job_name=devstack-platform-openEuler-22.03-ovn-source&job_name=devstack-platform-openEuler-22.03-ovs&skip=0 | 15:27 |
frickler | F36 is eol, I think we need to drop that from 2023.1, too | 15:29 |
kopecmartin | right, i've just removed that from the agenda | 15:29 |
kopecmartin | frickler: should we also delete the job from stable/2023.1? | 15:30 |
kopecmartin | i see it failing there | 15:30 |
frickler | yes, that's what I just said ... or tried to | 15:31 |
kopecmartin | :D sorry | 15:31 |
kopecmartin | i read it but didn't understand it cause i was thinking about the agenda ... ah | 15:32 |
* kopecmartin | needs another coffee | 15:32 |
kopecmartin | so yes, i agree, will you propose a patch or should I? | 15:32 |
frickler | I can do the patch | 15:32 |
kopecmartin | perfect, thanks! | 15:33 |
gmann | o/ | 15:33 |
kopecmartin | o/ | 15:33 |
lpiwowar | o/ | 15:34 |
kopecmartin | great, so that was fedora | 15:34 |
kopecmartin | this will help with centos9 | 15:34 |
kopecmartin | #link https://review.opendev.org/q/Icd99f467d47aaafaaf3ee8f2a3c4da08842cb672 | 15:34 |
kopecmartin | the rest of the distros look ok it seems | 15:35 |
kopecmartin | #topic Sub Teams highlights | 15:36 |
kopecmartin | Changes with Review-Priority == +1 | 15:36 |
kopecmartin | #link https://review.opendev.org/q/label:Review-Priority%253D%252B1+status:open+(project:openstack/tempest+OR+project:openstack/patrole+OR+project:openstack/devstack+OR+project:openstack/grenade) | 15:36 |
kopecmartin | nothing there | 15:36 |
kopecmartin | #topic Open Discussion | 15:36 |
kopecmartin | anything for the open discussion? | 15:36 |
lpiwowar | I added the skip for the object storage test we talked about last week: https://review.opendev.org/c/openstack/tempest/+/890798 ... We can start the discussion there. | 15:37 |
sean-k-mooney | are there any known issues with grenade failing in keystone upgrade | 15:37 |
gmann | sean-k-mooney: not afaik | 15:37 |
gmann | it was all good till yesterday, have not checked today yet | 15:38 |
sean-k-mooney | ok cause i have seen this 3 times today on one of my patches | 15:38 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/e4ecb12b5ea74a669f6e0ff59f478ad5/log/controller/logs/grenade.sh_log.txt#5430 | 15:38 |
kopecmartin | lpiwowar: does it relate to this https://bugs.launchpad.net/tempest/+bug/2028671 ? | 15:38 |
lpiwowar | kopecmartin: yes:) | 15:38 |
sean-k-mooney | oh never mind | 15:38 |
lpiwowar | I will add the Related-Bug: | 15:39 |
gmann | ok, have not seen that. but I will monitor | 15:39 |
sean-k-mooney | it's related to my patch | 15:39 |
opendevreview | Lukas Piwowarski proposed openstack/tempest master: Skip test_list_no_containers when pre-prov creds are used https://review.opendev.org/c/openstack/tempest/+/890798 | 15:39 |
gmann | just FYI, after nova-network test removal, I am removing the glance v1 APIs tests too #link https://review.opendev.org/q/topic:remove-glance-v1-tests | 15:40 |
opendevreview | Ashley Rodriguez proposed openstack/devstack-plugin-ceph master: Remote Ceph with cephadm https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/876747 | 15:40 |
gmann | it needs some updates on the plugins side also, for config as well as scenario manager logic, so doing those first | 15:40 |
kopecmartin | gmann: thanks, i added the topic to my review list | 15:41 |
gmann | with that i found there are many plugins that still have their scenario manager copy, didn't we remove all of those ? or were they just missed | 15:41 |
gmann | lpiwowar: I think you did the most of the work in that ^^ | 15:41 |
gmann | kopecmartin: thanks | 15:41 |
gmann | I can check if we can remove them but just want to check if there is any reason we did not do that while doing it for other plugins | 15:42 |
lpiwowar | gmann: I think that it was rpopelka if I remember correctly. | 15:42 |
kopecmartin | not all the patches are merged gmann | 15:42 |
kopecmartin | #link https://review.opendev.org/q/topic:tempest-scenario-manager-cleanup | 15:42 |
kopecmartin | or maybe we missed some | 15:42 |
gmann | lpiwowar: ohk | 15:42 |
gmann | kopecmartin: nice, thanks I will check those | 15:43 |
gmann | watcher and barbican are the one and changes are not merged. | 15:43 |
gmann | this explains it. | 15:43 |
kopecmartin | dansmith: what do we wanna do with that test_rebuild_server_with_volume_attached test? | 15:44 |
gmann | kopecmartin: is it failing consistently ? | 15:44 |
lpiwowar | gmann: We recently hit an issue with this in barbican-tempest-plugin regarding the removal of scenario manager: https://review.opendev.org/c/openstack/barbican-tempest-plugin/+/889354 | 15:44 |
kopecmartin | gmann: no, it doesn't seem so | 15:45 |
gmann | lpiwowar: ohk, will check this change. thanks for info | 15:45 |
ykarel | wrt rebuild test is this related to issue Host key for server does not match? | 15:45 |
ykarel | recently noticed in some of neutron jobs | 15:45 |
lpiwowar | The failure was caused by victoria job using master barbican-tempest-plugin. I'm trying to fix this here: https://review.opendev.org/c/openstack/barbican-tempest-plugin/+/890784 | 15:45 |
gmann | not sure, I think we fixed that test right ? | 15:46 |
kopecmartin | gmann: to save you some time, the barbican issue is that they use master plugin on victoria branch ... missing dependencies etc | 15:46 |
gmann | kopecmartin: I see, they can use the old plugin, as victoria is too old for tempest master and barbican-tempest-plugin to use | 15:46 |
gmann | I will have a look into that | 15:46 |
kopecmartin | ykarel: seems not .. i started looking into that just an hour ago and started with only this https://5248ae6d6484a440a059-bda51eedb42181063c5344e0473d3d05.ssl.cf2.rackcdn.com/888470/1/gate/nova-multi-cell/2964990/testr_results.html | 15:47 |
ykarel | kopecmartin, ack i was talking about https://cfc27c554a994f35be4e-76bf72ffc642f12cb8e7d8393148d522.ssl.cf5.rackcdn.com/874797/41/gate/neutron-ovn-tempest-ipv6-only-ovs-release/8162e79/testr_results.html | 15:48 |
gmann | kopecmartin: I think we added wait for rebuild to finish there but need to check | 15:48 |
ykarel | and as per opensearch i see that in multiple jobs and times | 15:48 |
opendevreview | Lukas Piwowarski proposed openstack/tempest master: Skip test_list_no_containers when pre-prov creds are used https://review.opendev.org/c/openstack/tempest/+/890798 | 15:48 |
gmann | kopecmartin: we do wait for server to be active after rebuild https://github.com/openstack/tempest/blob/3e996dc6cefd5ad7136001a6eb846a9255cce961/tempest/api/compute/servers/test_server_actions.py#L158 | 15:50 |
opendevreview | sean mooney proposed openstack/devstack master: add support for zswap and ksmtuned https://review.opendev.org/c/openstack/devstack/+/890693 | 15:51 |
gmann | kopecmartin: dansmith: I think what is happening here is that the rebuild is stuck and teardown tries to detach the volume and fails. | 15:52 |
kopecmartin | gmann: yes, the issue seemed to be about a race between attached volumes | 15:52 |
gmann | we can make teardown a little bit smarter to handle that | 15:52 |
kopecmartin | gmann: btw, we run validation twice there, once in _test_rebuild_server and once at the end of the test | 15:52 |
gmann | kopecmartin: yeah, I think we can remove it unless there is a reason whoami-rajat knows of | 15:53 |
gmann | let me have a look into this in my noon and will update teardown also | 15:55 |
kopecmartin | ack, sounds good, thanks | 15:55 |
kopecmartin | gmann: btw, frickler and i are feeling adventurous and would like to move this forward - https://review.opendev.org/c/openstack/devstack/+/558930 | 15:56 |
gmann | kopecmartin: ok, i have not checked it actually but I think you and dansmith are looking into it. | 15:57 |
gmann | but do not wait for me, i might not be able to check it for this week or so | 15:57 |
kopecmartin | okey | 15:58 |
kopecmartin | #topic Bug Triage | 15:58 |
kopecmartin | #link https://etherpad.openstack.org/p/qa-bug-triage-bobcat | 15:58 |
kopecmartin | new numbers there ^^ | 15:58 |
kopecmartin | time is almost up | 15:58 |
kopecmartin | so if there isn't anything else , we can call it an hour | 15:58 |
kopecmartin | thanks everyone | 15:58 |
kopecmartin | #endmeeting | 15:58 |
opendevmeet | Meeting ended Tue Aug 8 15:58:56 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:58 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/qa/2023/qa.2023-08-08-15.00.html | 15:58 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/qa/2023/qa.2023-08-08-15.00.txt | 15:58 |
opendevmeet | Log: https://meetings.opendev.org/meetings/qa/2023/qa.2023-08-08-15.00.log.html | 15:58 |
frickler | thx kopecmartin | 15:59 |
lpiwowar | thanks kopecmartin | 15:59 |
gmann | thanks everyone | 15:59 |
kopecmartin | kevko: let's get to your issue now | 16:01 |
dansmith | gmann: yeah, something like that, but I'm not sure the rebuild is actually stuck or else we would fail waiting for active or something, but I'm not sure | 16:04 |
gmann | dansmith: yeah, will fix that in teardown not to fail in that case. | 16:05 |
kopecmartin | kevko: so the job which uses service_token_roles_required = false and probably shouldn't is nova-multi-cell defined in nova ... to get things moving, i'd suggest to edit the job, file a bug in tempest that the test is failing if that setting is true .. then we'll have a way of testing the fix in tempest | 16:16 |
sean-k-mooney | i thought that option had been removed from oslo | 16:31 |
sean-k-mooney | and now the service token role was always enforced | 16:31 |
sean-k-mooney | huh i thought that got removed https://docs.openstack.org/nova/latest/configuration/config.html#keystone_authtoken.service_token_roles_required | 16:33 |
sean-k-mooney | did that get reverted? | 16:33 |
sean-k-mooney | i guess not https://opendev.org/openstack/keystonemiddleware/src/branch/master/keystonemiddleware/auth_token/_opts.py#L175 | 16:38 |
sean-k-mooney | i thought setting that to true and eventually removing it was part of the cinder volume cve fix | 16:38 |
whoami-rajat | gmann, not 100% sure about the test since that's not the one i added, this one has an image backed server with a volume attached to it | 17:06 |
whoami-rajat | based on the failure we are trying to detach a volume when the instance state is rebuilding, i think we need a waiter there until the VM becomes active | 17:07 |
gmann | whoami-rajat: ah right. no issue, I think I misread the test name | 17:07 |
whoami-rajat | regarding validation, i agree that it's done twice | 17:08 |
whoami-rajat | looking at the failure here, https://5248ae6d6484a440a059-bda51eedb42181063c5344e0473d3d05.ssl.cf2.rackcdn.com/888470/1/gate/nova-multi-cell/2964990/testr_results.html | 17:08 |
gmann | whoami-rajat: we do that but it seems the server is stuck somewhere and test teardown is trying to detach the volume, we can handle that in teardown | 17:08 |
whoami-rajat | ok, got it | 17:09 |
dansmith | gmann: kopecmartin frickler: thoughts on getting this in so we can hopefully diagnose if some of our service timeouts are related to apache pool exhaustion? https://review.opendev.org/c/openstack/devstack/+/890526 | 17:17 |
gmann | dansmith: agree, +2 | 17:19 |
opendevreview | Ghanshyam proposed openstack/tempest master: Ignore 409 error in detach volume cleanup https://review.opendev.org/c/openstack/tempest/+/890821 | 18:02 |
gmann | dansmith: kopecmartin ^^ what do you think of this for the test_rebuild_server_with_volume_attached failure | 18:18 |
dansmith | gmann: will that help? I guess I assumed if the volume is wedged the instance won't actually go away | 18:19 |
gmann | dansmith: that we handle in teardown, if anything is wrong with the server and it cannot be recovered, we delete the server and create a new one | 18:20 |
dansmith | right, but will the delete actually complete if we're stuck? | 18:20 |
gmann | dansmith: https://github.com/openstack/tempest/blob/3e996dc6cefd5ad7136001a6eb846a9255cce961/tempest/api/compute/servers/test_server_actions.py#L80 | 18:20 |
gmann | dansmith: ohk, so we can either wait for server to be in active before detach and if nothing comes up then force delete ? | 18:21 |
dansmith | gmann: I'm not saying we shouldn't do what you propose, I'm just not sure it will help | 18:22 |
dansmith | if the instance is stuck in DELETING waiting for the volume or something, a test that does a server list will still see it right? | 18:22 |
dansmith | I guess my point is.. | 18:22 |
gmann | dansmith: I got your point, you are right that it will fail in delete server because it has a volume attached | 18:22 |
dansmith | I feel like we need to get to the bottom of these volume issues | 18:22 |
gmann | it is server task stuck in rebuilding state | 18:23 |
dansmith | probably because it can't detach the volume itself to complete the rebuild right? | 18:24 |
gmann | humm, possible | 18:24 |
dansmith | gmann: tbh I'm not sure that rebuilding in between tests is really the best plan anyway | 18:24 |
dansmith | I'm sure it was intended to be faster, but I dunno, seems like it's asking for this sort of problem | 18:24 |
gmann | dansmith: but the test is for rebuild in between only. is that what you mean, or something we are doing during teardown/setup ? | 18:39 |
dansmith | gmann: we're doing a rebuild of the class server so that the next test sees it as based on the original image right? | 18:40 |
gmann | dansmith: ah, right | 18:46 |
gmann | we should do that with test level server | 18:46 |
dansmith | right, so the test should be testing rebuild, but the cleanup should just delete and re-create it if it needs to | 18:49 |
dansmith | that said, I think it's possible the delete will fail if the volume is hung anyway, | 18:49 |
dansmith | but less rebuilding in cleanup would be better anyway | 18:49 |
gmann | dansmith: agree, let me update that | 18:51 |
dansmith | slaweq: sorry if I should know this, but do you have stats on the recheck "reason" ? | 19:06 |
dansmith | I'm wondering if we could calculate what percentage of the rechecks have "volume" or "cinder" in the reason | 19:07 |
slaweq | @dansmith no, I still didn't make this in my script | 19:24 |
dansmith | slaweq: okay | 19:32 |
opendevreview | Ghanshyam proposed openstack/tempest master: Improve test_rebuild_server_with_volume_attached https://review.opendev.org/c/openstack/tempest/+/890821 | 20:02 |
opendevreview | Merged openstack/devstack stable/2023.1: Use RDO official CloudSIG mirrors for C9S deployments https://review.opendev.org/c/openstack/devstack/+/890120 | 23:40 |
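
Note on the cleanup ordering discussed between 14:33 and 14:35: Python's standard `unittest.TestCase.addCleanup()` runs registered hooks in reverse order of registration (last in, first out). The sketch below is a minimal standalone illustration using only the standard library. The class name `RebuildLikeTest`, the `calls` list, and the string labels are made up for the demo; it does not use tempest or testtools, and testtools is only assumed to order cleanups the same way. It also assumes the volume-detach hook is registered before the rebuild hook, in which case the rebuild cleanup runs first and the detach runs against a server that may still be rebuilding.

```python
import unittest

# Records the order in which the cleanup hooks actually run.
calls = []


class RebuildLikeTest(unittest.TestCase):
    """Hypothetical test: mimics the registration order, not tempest itself."""

    def test_order(self):
        # Registered first (e.g. when the volume is attached).
        self.addCleanup(calls.append, "detach_volume")
        # Registered later (e.g. when the test rebuilds to a new image).
        self.addCleanup(calls.append, "rebuild_server")


def run_demo():
    """Run the test once and return the observed cleanup order."""
    calls.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RebuildLikeTest)
    suite.run(unittest.TestResult())
    return list(calls)


if __name__ == "__main__":
    print(run_demo())
```

If the detach should happen before the rebuild, one option under these assumptions is to register the rebuild hook first, or to do both steps inside a single cleanup function so the order is explicit.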
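
Sketch of the two cleanup ideas raised at 17:07 and 18:02: wait for the server to become active before detaching, and tolerate a 409 from the detach call. Everything here is a hypothetical stand-in (the `Conflict` class, `wait_for_status`, `detach_in_cleanup`, and the callables passed in); it is not tempest's code and not the content of change 890821. As dansmith points out in the log, ignoring the conflict may not help if the server never leaves the rebuilding state.

```python
import time


class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 raised by an API client."""


def wait_for_status(get_status, wanted, timeout=5.0, interval=0.01):
    """Poll get_status() until it returns `wanted`; False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def detach_in_cleanup(get_status, detach, timeout=5.0, interval=0.01):
    """Best-effort detach for use as a cleanup hook.

    Waits for the server to report ACTIVE, then calls detach(). A Conflict
    is swallowed so the cleanup itself does not fail the test. Returns True
    if the detach call succeeded and False if it was rejected.
    """
    wait_for_status(get_status, "ACTIVE", timeout, interval)
    try:
        detach()
    except Conflict:
        return False
    return True
```

A caller would pass in whatever functions fetch the server status and issue the detach request in its own environment; the return value only says whether the detach call was accepted, not whether the volume is gone.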