*** ramishra has joined #openstack-qa | 00:01 | |
*** ramishra has quit IRC | 00:06 | |
*** ramishra has joined #openstack-qa | 00:12 | |
*** dviroel has quit IRC | 00:32 | |
*** tkajinam has quit IRC | 00:55 | |
*** tkajinam has joined #openstack-qa | 00:56 | |
*** ramishra has quit IRC | 01:08 | |
*** ramishra has joined #openstack-qa | 01:14 | |
*** artom has quit IRC | 01:56 | |
*** paras333 has quit IRC | 02:14 | |
*** paras333 has joined #openstack-qa | 02:25 | |
*** hamalq has quit IRC | 02:29 | |
*** paras333 has quit IRC | 02:30 | |
*** enriquetaso has quit IRC | 02:37 | |
*** ccamposr__ has quit IRC | 02:55 | |
*** ccamposr__ has joined #openstack-qa | 02:55 | |
*** rcernin has quit IRC | 03:54 | |
*** rcernin has joined #openstack-qa | 04:00 | |
*** psahoo has joined #openstack-qa | 04:16 | |
*** vishalmanchanda has joined #openstack-qa | 04:45 | |
*** gcheresh has joined #openstack-qa | 04:52 | |
*** whoami-rajat_ has joined #openstack-qa | 05:20 | |
*** ralonsoh has joined #openstack-qa | 05:21 | |
*** vhari has quit IRC | 06:18 | |
*** anandgvb has joined #openstack-qa | 06:19 | |
*** hemanth_n has joined #openstack-qa | 06:24 | |
*** ysandeep|holiday is now known as ysandeep | 06:24 | |
*** zbr has quit IRC | 06:30 | |
*** zbr has joined #openstack-qa | 06:32 | |
*** eolivare has joined #openstack-qa | 06:32 | |
*** csatari has quit IRC | 06:34 | |
*** ajitha has quit IRC | 06:34 | |
*** knikolla has quit IRC | 06:34 | |
*** rpioso has quit IRC | 06:34 | |
*** sboyron has joined #openstack-qa | 06:35 | |
*** lxkong has quit IRC | 06:35 | |
*** ajitha has joined #openstack-qa | 06:36 | |
*** knikolla has joined #openstack-qa | 06:36 | |
*** ajitha has quit IRC | 06:36 | |
*** csatari has joined #openstack-qa | 06:36 | |
*** lxkong has joined #openstack-qa | 06:37 | |
*** rpioso has joined #openstack-qa | 06:37 | |
*** anandgvb has joined #openstack-qa | 06:38 | |
*** anandgvb has quit IRC | 06:39 | |
*** vhari has joined #openstack-qa | 06:43 | |
*** Yarboa has quit IRC | 07:04 | |
*** Yarboa has joined #openstack-qa | 07:07 | |
yoctozepto | tbarron gouthamr vkmc gmann tosky kopecmartin : re: branching for ceph plugin - it aligns with my vision to branch all current stables; I replied to hberaud on the patch as well; I would use the stable/train branch to test all EMs people care about - it already is designed to be "branchless"; for future branches we will branch regularly | 07:32 |
* yoctozepto asked if we should branch tempest as well would say yes | 07:33 | |
yoctozepto | but I'm not in the mood for fighting the tradition for now ;p | 07:34 |
yoctozepto | ceph-plugin was different though | 07:34 |
yoctozepto | I'm glad you liked my idea :D | 07:34 |
kopecmartin | well, branching would help in packaging, f.e. for packaging tempest for train we had to do quite a few workarounds (mainly reverts) in order to succeed | 07:46 |
kopecmartin | https://review.rdoproject.org/r/c/openstack/tempest-distgit/+/31408 | 07:46 |
kopecmartin | and that was only 26.0.0 tag .. the newer ones are even more problematic | 07:46 |
kopecmartin | i would say the newer tags won't be delivered in a package for train , 27 not for sure, and 26.1.0 maybe | 07:47 |
kopecmartin | on the other hand, i don't know if this benefit would outweigh the disadvantages | 07:49 |
*** tosky has joined #openstack-qa | 07:50 | |
yoctozepto | kopecmartin: maybe we should prepare a SWOT for branching tempest and discuss this during the PTG then | 07:55 |
*** rpittau|afk is now known as rpittau | 07:55 | |
*** jpena|off is now known as jpena | 07:57 | |
kopecmartin | yoctozepto: that we can do | 07:58 |
kopecmartin | yoctozepto: i put the topic to the 3rd day - https://etherpad.opendev.org/p/qa-xena-ptg | 08:00 |
*** gfidente|afk is now known as gfidente | 08:01 | |
*** lucasagomes has joined #openstack-qa | 08:01 | |
yoctozepto | kopecmartin: oh, cool, that's the best timing for me | 08:01 |
yoctozepto | you are reading my mind | 08:01 |
kopecmartin | ::) | 08:01 |
yoctozepto | I will try to fill it in before then; I hope others join too (especially you, kopecmartin) | 08:02 |
kopecmartin | sure, i'll write something down | 08:03 |
yoctozepto | thanks :-) | 08:07 |
*** slaweq_ has joined #openstack-qa | 08:19 | |
*** slaweq has quit IRC | 08:19 | |
*** rcernin has quit IRC | 08:24 | |
*** ricolin has quit IRC | 08:39 | |
*** slaweq_ is now known as slaweq | 08:51 | |
*** rcernin has joined #openstack-qa | 09:12 | |
*** yamamoto has quit IRC | 09:28 | |
yoctozepto | dansmith: https://bugs.launchpad.net/devstack/+bug/1923728 | 09:43 |
openstack | Launchpad bug 1923728 in devstack "install_tempest randomly fails in CI" [High,Triaged] | 09:43 |
yoctozepto | cc gmann, kopecmartin | 09:43 |
*** dtantsur|afk is now known as dtantsur | 09:45 | |
*** yamamoto has joined #openstack-qa | 09:59 | |
*** yamamoto has quit IRC | 09:59 | |
*** yamamoto has joined #openstack-qa | 10:00 | |
*** whoami-rajat_ is now known as whoami-rajat | 10:17 | |
*** Luzi has joined #openstack-qa | 10:46 | |
*** rcernin has quit IRC | 11:04 | |
*** yamamoto has quit IRC | 11:06 | |
*** paras333_ has joined #openstack-qa | 11:10 | |
*** paras333_ has quit IRC | 11:11 | |
*** eolivare_ has joined #openstack-qa | 11:12 | |
*** tkajinam has quit IRC | 11:14 | |
*** eolivare has quit IRC | 11:15 | |
*** dviroel_ has joined #openstack-qa | 11:21 | |
*** psahoo_ has joined #openstack-qa | 11:28 | |
*** psahoo has quit IRC | 11:31 | |
*** jpena is now known as jpena|lunch | 11:32 | |
*** yamamoto has joined #openstack-qa | 11:32 | |
*** dviroel_ is now known as dviroel__ | 11:35 | |
*** eolivare_ has quit IRC | 11:35 | |
*** dviroel__ is now known as dviroel | 11:36 | |
*** yamamoto has quit IRC | 11:39 | |
*** rcernin has joined #openstack-qa | 11:45 | |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/train: Update .gitreview for stable/train https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786208 | 11:58 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/train: Update TOX_CONSTRAINTS_FILE for stable/train https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786209 | 11:58 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/ussuri: Update .gitreview for stable/ussuri https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786210 | 11:59 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/ussuri: Update TOX_CONSTRAINTS_FILE for stable/ussuri https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786211 | 11:59 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/victoria: Update .gitreview for stable/victoria https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786212 | 11:59 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/victoria: Update TOX_CONSTRAINTS_FILE for stable/victoria https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786213 | 11:59 |
*** rcernin has quit IRC | 11:59 | |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/wallaby: Update .gitreview for stable/wallaby https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786214 | 11:59 |
openstackgerrit | OpenStack Release Bot proposed openstack/devstack-plugin-ceph stable/wallaby: Update TOX_CONSTRAINTS_FILE for stable/wallaby https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786215 | 11:59 |
*** vhari has quit IRC | 12:04 | |
*** vhari has joined #openstack-qa | 12:10 | |
yoctozepto | yay, branched | 12:12 |
*** eolivare_ has joined #openstack-qa | 12:16 | |
*** hemanth_n has quit IRC | 12:24 | |
*** Yarboa has quit IRC | 12:25 | |
*** Yarboa has joined #openstack-qa | 12:26 | |
*** nweinber has joined #openstack-qa | 12:27 | |
*** jpena|lunch is now known as jpena | 12:29 | |
*** Luzi has quit IRC | 12:44 | |
*** ysandeep is now known as ysandeep|afk | 12:46 | |
*** tkajinam has joined #openstack-qa | 13:14 | |
dansmith | yoctozepto: really hard to imagine what that could be, but I'll start looking yeah | 13:23 |
yoctozepto | thanks dansmith | 13:23 |
*** ysandeep|afk is now known as ysandeep | 13:25 | |
*** ricolin has joined #openstack-qa | 13:34 | |
*** artom has joined #openstack-qa | 13:36 | |
*** vhari has quit IRC | 13:37 | |
*** anandgvb has joined #openstack-qa | 13:46 | |
*** hemanth_n has joined #openstack-qa | 13:47 | |
*** vhari has joined #openstack-qa | 13:50 | |
*** hemanth_n has quit IRC | 13:52 | |
gmann | yoctozepto: kopecmartin the mission of Tempest being branchless is to test all the supported stable branches with the same set of tests to avoid interop or backward compatibility issues. I do not think we need to go back to making it branched. | 13:53 |
rpittau | hi all, it seems that ironic-grenade job in victoria is now fixed, in some way | 13:54 |
rpittau | although we're still seeing an issue on master due to the cirros version | 13:54 |
rpittau | or better, due to the CIRROS_VERSION variable in devstack | 13:54 |
yoctozepto | qa did not touch anything related afaik ;-) | 13:55 |
*** arxcruz has quit IRC | 13:57 | |
rpittau | yoctozepto: ok, then there's something wrong on how the value of CIRROS_VERSION gets set | 13:57 |
yoctozepto | for the record, I meant both the ironic-grenade and this :-) | 13:58 |
*** anandgvb has quit IRC | 13:59 | |
*** arxcruz has joined #openstack-qa | 13:59 | |
*** yamamoto has joined #openstack-qa | 14:01 | |
rpittau | yoctozepto: oh, I understand that, just wondering if I can get some help troubleshooting the issue | 14:04 |
rpittau | the only thing that I can see is that the value of CIRROS_VERSION from devstack is ignored by ironic plugin, but not by glance | 14:04 |
*** amodi has joined #openstack-qa | 14:06 | |
yoctozepto | rpittau: oh, sure; it's only on master now? this cirros | 14:06 |
rpittau | yoctozepto: yeah | 14:06 |
yoctozepto | https://opendev.org/openstack/ironic/src/commit/b4d8a493d9fdb693e64f3d0b11245523c2510665/devstack/common_settings#L10 | 14:07 |
yoctozepto | well, you override this to 0.5.1 | 14:07 |
rpittau | we still have CIRROS_VERSION on 0.5.1 by default in ironic, I see that in devstack was moved to 0.5.2 in wallaby | 14:07 |
yoctozepto | yes | 14:07 |
rpittau | mmm ok but shouldn't that take 0.5.2 from devstack? Am I confusing the priority? | 14:08 |
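The priority question here comes down to ordinary shell assignment semantics. The following is a minimal sketch, assuming (the log does not confirm this) that the plugin's settings are evaluated after devstack has already applied its own default. The three function names are hypothetical stand-ins for sourced settings files, not real devstack or ironic code.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for settings files that are sourced in order.
# None of these function names exist in devstack or ironic.

devstack_defaults() {
    # Applies the default only if nothing has set the variable yet.
    CIRROS_VERSION=${CIRROS_VERSION:-0.5.2}
}

plugin_hard_override() {
    # Unconditional assignment: whichever assignment runs last wins.
    CIRROS_VERSION=0.5.1
}

plugin_soft_default() {
    # Respects a value that is already set.
    CIRROS_VERSION=${CIRROS_VERSION:-0.5.1}
}

# Case 1: the plugin assigns unconditionally after devstack.
unset CIRROS_VERSION
devstack_defaults
plugin_hard_override
hard_result=$CIRROS_VERSION

# Case 2: the plugin only supplies a fallback.
unset CIRROS_VERSION
devstack_defaults
plugin_soft_default
soft_result=$CIRROS_VERSION

echo "hard override: $hard_result"
echo "soft default:  $soft_result"
```

Under that ordering assumption, an unconditional assignment in the plugin replaces devstack's value, while a `${VAR:-default}` form would leave it alone.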
dansmith | yoctozepto: gmann: I'm just going to have to add some debug dumping for us to merge and run against the firehose I think | 14:10 |
rpittau | I also tried updating that value to 0.5.2 in ironic but it still fails | 14:10 |
dansmith | the only thing I can think of that might have happened is if something in one of those job configs has forked the main shell and we arrived at the wait in a child instead of the parent | 14:10 |
yoctozepto | rpittau: fails on what exactly? | 14:11 |
yoctozepto | I should have asked that first, shouldn't I? :D | 14:11 |
rpittau | yoctozepto: glance reads the value from devstack and downloads 0.5.2, while ironic still reads 0.5.1 | 14:11 |
yoctozepto | dansmith: can't it be done on affected projects? | 14:12 |
yoctozepto | rpittau: you mean even after fixing it in ironic? | 14:12 |
yoctozepto | show me | 14:12 |
rpittau | yoctozepto: yeah | 14:12 |
dansmith | yoctozepto: not sure what you mean | 14:12 |
rpittau | yoctozepto: https://zuul.opendev.org/t/openstack/build/ef180709484d46c1b890e7a10000c307 | 14:13 |
yoctozepto | dansmith: I mean without merging debugging in the gate... | 14:13 |
yoctozepto | create a debugging patch and depends-on in octavia and others | 14:13 |
yoctozepto | rpittau: looking | 14:13 |
dansmith | yoctozepto: with lots of rechecks you mean? sure, but the debug dump would be useful for the next time something happens, and will go quicker in the fire hose.. but whatever you want I guess | 14:13 |
gmann | dansmith: yoctozepto it's happening in many projects, it seems, not just octavia http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22finished%20install_tempest%20with%20result%20127%5C%22 | 14:14 |
yoctozepto | hmm, since it's random | 14:14 |
yoctozepto | dansmith might have a point here | 14:14 |
dansmith | gmann: right | 14:14 |
yoctozepto | that it would be more efficient to just collect it | 14:14 |
yoctozepto | eh, eh | 14:14 |
yoctozepto | well, we released wallaby | 14:14 |
yoctozepto | so go ahead :D | 14:14 |
yoctozepto | rpittau: I replied on the change | 14:16 |
*** slaweq has quit IRC | 14:16 | |
yoctozepto | rpittau: and one extra hint | 14:16 |
yoctozepto | ideally, there would be no need to do these overrides | 14:17 |
yoctozepto | I see neutron has it overridden to 0.5.1 in a few places too, eh | 14:17 |
*** vhari has quit IRC | 14:17 | |
rpittau | yoctozepto: ok, I see the point, devstack should just take care of that | 14:18 |
yoctozepto | indeed | 14:19 |
yoctozepto | but then I see you are using this later for your own purposes | 14:19 |
rpittau | yes, we're special :D | 14:19 |
yoctozepto | perhaps it's introduced because the value is not seen in the plugin | 14:19 |
yoctozepto | it was* | 14:19 |
rpittau | I'll start to see if patching wallaby works fine | 14:19 |
*** slaweq has joined #openstack-qa | 14:20 | |
rpittau | thanks yoctozepto :) | 14:20 |
yoctozepto | try dropping this line altogether first, rpittau | 14:20 |
yoctozepto | yw | 14:20 |
rpittau | alright, in case we want a specific version I guess we can set that as variable in the job | 14:21 |
rpittau | I probably need to update wallaby first anyway | 14:21 |
yoctozepto | yes, that should be the way to go properly | 14:21 |
yoctozepto | yes, wallaby | 14:21 |
yoctozepto | drop the line in wallaby :D | 14:21 |
rpittau | ok, much clearer now, thanks! :) | 14:21 |
yoctozepto | then drop the bass | 14:21 |
rpittau | :D | 14:22 |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 14:22 |
dansmith | yoctozepto: gmann ^ | 14:22 |
dansmith | yoctozepto: gmann: looks like this: https://termbin.com/9kso | 14:23 |
dansmith | (when forced to fail locally) | 14:23 |
yoctozepto | almost like kernel panic | 14:23 |
yoctozepto | devstack panic | 14:24 |
dansmith | the failed ls is fine, it's just saying nothing else was running in the background.. else it will show those things | 14:24 |
yoctozepto | devstack screen of death | 14:24 |
dansmith | yoctozepto: devstack guru meditation | 14:24 |
*** yamamoto has quit IRC | 14:24 | |
yoctozepto | lovin' it | 14:24 |
yoctozepto | we can keep this permanently | 14:25 |
yoctozepto | I was worried you had worse debugging in mind | 14:25 |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 14:25 |
dansmith | (small tweak) | 14:25 |
dansmith | yoctozepto: right, I was expecting to just keep this | 14:26 |
yoctozepto | dansmith: righteously | 14:26 |
yoctozepto | +2 from me | 14:26 |
dansmith | thanks | 14:26 |
yoctozepto | ping gmann | 14:26 |
gmann | +A | 14:33 |
gmann | yoctozepto: these backports are ready https://review.opendev.org/q/Ic747ac9ddbb21a01e9dc18d8e8ad324d4d7d050d | 14:34 |
yoctozepto | ok | 14:34 |
*** dviroel is now known as dviroel|lunch | 15:07 | |
*** hyang has joined #openstack-qa | 15:09 | |
*** artom has quit IRC | 15:21 | |
*** hyang has quit IRC | 15:21 | |
*** hyang has joined #openstack-qa | 15:24 | |
*** ysandeep is now known as ysandeep|away | 15:25 | |
*** vhari has joined #openstack-qa | 15:31 | |
*** rcernin has joined #openstack-qa | 15:39 | |
*** vhari has quit IRC | 15:39 | |
*** rcernin has quit IRC | 15:43 | |
*** dviroel|lunch is now known as dviroel | 15:45 | |
*** hyang has quit IRC | 15:59 | |
*** gcheresh has quit IRC | 16:03 | |
*** lucasagomes has quit IRC | 16:04 | |
*** vhari has joined #openstack-qa | 16:06 | |
openstackgerrit | Merged openstack/tempest master: Add live migration with trunk test https://review.opendev.org/c/openstack/tempest/+/774689 | 16:08 |
johnsom | FYI, I am kicking the tires on the async debug patch using the octavia tempest tests: https://review.opendev.org/786275 | 16:15 |
*** rpittau is now known as rpittau|afk | 16:21 | |
*** gfidente is now known as gfidente|afk | 16:25 | |
*** hamalq has joined #openstack-qa | 16:28 | |
*** psahoo_ has quit IRC | 16:45 | |
*** ralonsoh has quit IRC | 16:45 | |
*** eolivare_ has quit IRC | 16:48 | |
yoctozepto | ++ | 17:04 |
*** clarkb has quit IRC | 17:08 | |
*** clarkb has joined #openstack-qa | 17:08 | |
*** jpena is now known as jpena|off | 17:10 | |
*** gcheresh has joined #openstack-qa | 17:11 | |
johnsom | Looks like that async debug patch is bad: | 17:13 |
johnsom | https://www.irccloud.com/pastebin/skhViOKg/ | 17:13 |
johnsom | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e33/786275/1/check/octavia-v2-dsvm-scenario/e33611c/job-output.txt | 17:13 |
dansmith | johnsom: no, that's expected | 17:13 |
dansmith | the ls thing I mean.. it means no other jobs were waiting | 17:13 |
dansmith | the cat failure means that the process we're waiting for is gone (i.e. already waiting) | 17:14 |
dansmith | er, already waited-for | 17:14 |
johnsom | lol, ok... So we have some results! grin | 17:14 |
johnsom | What am I looking for here. We have a number that passed and some that failed. | 17:15 |
johnsom | Looks like the passing is stable branch, so probably doesn't have the async change that is failing | 17:15 |
johnsom | Hmm, that one I linked still says install failed with 127 though | 17:16 |
johnsom | [2674 Async install_tempest:22726]: finished install_tempest with result 127 in 37 seconds | 17:16 |
dansmith | johnsom: the install is succeeding, as you can see. it's that our wait() on the child fails | 17:16 |
johnsom | 2021-04-14 16:37:38.438202 | controller | [2674 Async install_tempest:22726]: Waiting for completion of install_tempest running on PID 22726 (1 other jobs running) | 17:19 |
johnsom | 2021-04-14 16:37:38.440901 | controller | /opt/stack/devstack/inc/async: line 152: wait: pid 22726 is not a child of this shell | 17:19 |
dansmith | unfortunately it seems like $$ isn't giving me what I want | 17:20 |
dansmith | this debug patch hasn't started running in the gate, so I'm going to tweak it and pull it out | 17:20 |
johnsom | Ok. Let me know if I can help in some way | 17:21 |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 17:23 |
dansmith | johnsom: if you could recheck your DNM ^ | 17:23 |
johnsom | Yep, NP | 17:23 |
johnsom | 786275 in zuul | 17:25 |
dansmith | johnsom: just to give you some context here, what we've done is run the install_tempest function with &, recorded its pid, and then later we call wait on that pid | 17:27 |
dansmith | and bash is saying that that pid is not a child we can wait for | 17:27 |
dansmith | it's clear that it's running because we see it finish well above the point at which we wait for it | 17:27 |
johnsom | Yeah, just coming up to speed on this now. | 17:28 |
dansmith | so either we double-waited for it, or we're calling wait from not the original parent somehow | 17:28 |
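The pattern described in the last few messages (start a function with `&`, record its PID, `wait` on that PID later) can be sketched as follows. This is an illustration only, not the code in devstack's `inc/async`: `slow_task` is a hypothetical stand-in for `install_tempest`, and the exit code 3 is arbitrary. The second `wait` shows the 127 status the shell returns for a PID it does not consider its child; PID 1 is used because it is never a child of this script.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for install_tempest; name and body are illustrative.
slow_task() {
    sleep 1
    return 3
}

slow_task &            # run the function in the background
task_pid=$!            # record the PID of the background child

# ... other foreground work would happen here ...

wait "$task_pid"       # blocks until the child exits, returns its status
task_status=$?

# Waiting on a PID that is not a child of this shell fails with 127.
wait 1 2>/dev/null
not_child_status=$?

echo "task exited with $task_status"
echo "wait on a non-child returned $not_child_status"
```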
johnsom | Where would the console log for the child end up (once this is resolved)? | 17:29 |
dansmith | we cat it when we do the wait (which is displayed), but it's also preserved in /opt/stack/async/install_tempest.log | 17:30 |
johnsom | Ok | 17:30 |
dansmith | we also shouldn't really be able to double-wait, | 17:31 |
dansmith | because in order to wait, we cat the ini file for that task that we created, and then wait on that pid | 17:31 |
dansmith | so unless there's a crazy race in there, it surely seems like the task must be still ready for us to wait, but we're not the parent | 17:32 |
clarkb | and linux shouldn't be reusing pids until the parent waits and reaps them right? | 17:33 |
clarkb | (maybe bash does some magic around that?) | 17:33 |
dansmith | right, until we wait, the child is still "running" in Z state | 17:33 |
*** artom has joined #openstack-qa | 17:34 | |
clarkb | just talking out loud here: if the child double forks and changes process groups it would reassociate to init as its parent (daemonizing essentially) any chance something like that is happening? | 17:34 |
johnsom | I would be super shocked if we are in a use case reusing PIDs.... lol | 17:34 |
johnsom | We have a lot of tests and such, but.... grin | 17:34 |
dansmith | clarkb: a grandchild could do that, but we should still have the original child to wait for | 17:35 |
clarkb | dansmith: ah right, we would never see the grandchild in that case | 17:35 |
dansmith | clarkb: but yeah I thought about that | 17:35 |
dansmith | clarkb: right, so that could cause us to exit our wait before all the work has been done, but otherwise unrelated I think | 17:35 |
clarkb | ++ | 17:35 |
dansmith | johnsom: you're loading a lot of plugins.. are you doing anything in the other stack phases like test_config in any of those? | 17:38 |
*** rcernin has joined #openstack-qa | 17:39 | |
johnsom | I don't think so. Here is a successful run if you want to search: https://zuul.opendev.org/t/openstack/build/5d411e88a2394062994dbf3660172919/logs | 17:40 |
*** dtantsur is now known as dtantsur|afk | 17:44 | |
*** rcernin has quit IRC | 17:44 | |
dansmith | johnsom: and to be clear, it's not 100% fail right? | 17:44 |
dansmith | that kinda makes it even weirder | 17:44 |
johnsom | Yeah, that is absolutely the strange thing | 17:45 |
johnsom | The successful run I just linked was last night. Plus rechecks of the same patch will sometimes succeed. | 17:46 |
yoctozepto | eh the random issues | 17:46 |
dansmith | okay... so... | 17:49 |
dansmith | on a successful run it almost looks like install_tempest gets called twice | 17:49 |
dansmith | well, maybe not, maybe that's just plugin config after the install in the successful case | 17:54 |
*** gcheresh has quit IRC | 17:55 | |
johnsom | dansmith https://65eec222656bf012f685-8fedbdeaaaf080ca764e90bf11123056.ssl.cf5.rackcdn.com/786275/1/check/octavia-v2-dsvm-scenario/32e9a5f/job-output.txt | 18:05 |
johnsom | Failure with the new patch | 18:05 |
yoctozepto | well, whatever you find out, I guess it might need fixing on both sides - for devstack to easily spot the oddity and octavia (and others) to avoid the oddity | 18:10 |
johnsom | So, looking closer at the logs. It's clear that job finished before the "wait" was called. Calling wait in bash on a non-existent PID will throw a 127 error. Is this just a bug where we are attempting to wait on a PID that already exited? | 18:10 |
yoctozepto | I might find some cycles later this week | 18:10 |
yoctozepto | johnsom: ooh, this! | 18:11 |
yoctozepto | that makes total sense | 18:11 |
clarkb | a child process shouldn't fully exit until its parent has waited on it and reaped it though | 18:11 |
yoctozepto | but why do we wait for it then? | 18:11 |
yoctozepto | clarkb: yes, but if we wait twice... | 18:11 |
yoctozepto | now though why do we? :D | 18:11 |
clarkb | right that could explain it (the double wait theory) | 18:12 |
yoctozepto | dansmith: ^^ | 18:12 |
yoctozepto | Dan knows his code so will find out | 18:13 |
* yoctozepto goes to his well-deserved rest | 18:13 | |
dansmith | johnsom: no, the early "finish" message is printed by the child itself, basically saying "okay I'm done" and then later when we wait, we cat the console so you can see what the job was doing | 18:13 |
dansmith | there's only one wait, and that's where it's failing | 18:14 |
dansmith | yeah, so that most recent fail seems to show that the pids are all correct.. i.e. it's the same parent waiting for the same child | 18:17 |
dansmith | and this: cat: /proc/22719/status: No such file or directory | 18:17 |
dansmith | really means that the child is gone from the system | 18:17 |
dansmith | and that pid does not get waited for multiple times (by us) | 18:20 |
jparker | artom: for https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/776112/7/whitebox_tempest_plugin/config.py#238 would I be updating zuul.yaml or the devstack settings? I thought the zuul.yaml parameters were for how the environment was deployed. | 18:31 |
artom | jparker, yeah, the devstack settings can work too | 18:32 |
jparker | artom: ack ty | 18:33 |
artom | jparker, I agree the zuul/devstack settings thing can be confusing... | 18:33 |
artom | Devstack can install whitebox (via the plugin) without Zuul | 18:33 |
artom | And Zuul uses Devstack... | 18:33 |
artom | So it's a matter of "does the value that we're setting apply for *all* Devstack deployments, even those folks do manually on their own machines/VMs?" | 18:34 |
artom | "Or is this something specific to our Zuul CI?" | 18:34 |
artom | I realize it's *very* academic | 18:35 |
artom | Because in practice they're one and the same | 18:35 |
artom | But that's my reading of it | 18:35 |
jparker | artom: does it make more sense to just default to True? | 18:37 |
artom | jparker, for the "allow_disabling" thing? Don't see why not | 18:38 |
*** whoami-rajat has quit IRC | 18:38 | |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 18:38 |
artom | It's True in upstream master, and is downstream OSP16 | 18:38 |
artom | *in | 18:38 |
jparker | artom: ack ok I'll just update the default | 18:39 |
jparker | artom: also not sure if we should set up some discussion about pinning whitebox for a py27 13 only deployment sometime soon | 18:39 |
*** jlvillal has joined #openstack-qa | 18:42 | |
*** vishalmanchanda has quit IRC | 18:44 | |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph master: Remove the stable branch jobs from master gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786307 | 18:53 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/wallaby: Remove the stable branch jobs from stable/wallaby gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786308 | 18:56 |
openstackgerrit | James Parker proposed openstack/whitebox-tempest-plugin master: Test allow disabling CPU flags https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/776112 | 18:57 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/victoria: Remove the stable branch jobs from stable/victoria gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786309 | 19:08 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/wallaby: Remove the stable branch jobs from stable/wallaby gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786308 | 19:10 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/victoria: Remove the stable branch jobs from stable/victoria gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786309 | 19:12 |
dansmith | johnsom: clarkb: I dunno if you'd know, but any chance this job is running inside a container | 19:13 |
dansmith | ? | 19:13 |
johnsom | Should not be, no. | 19:13 |
johnsom | It's just a standard dsvm job that runs tempest. We use the standard zuul templates. | 19:14 |
dansmith | ack, I wouldn't have thought, just looking for reasons why it would have been different | 19:15 |
dansmith | based on this: | 19:15 |
dansmith | https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1590001 | 19:15 |
openstack | Launchpad bug 1590001 in lxd (Ubuntu) "bash complains wait: pid is not a child of this shell" [Undecided,Invalid] | 19:15 |
dansmith | looks like bash's wait() no longer translates directly to the waitpid() system call and it does its own processing of states, but that it can lose track | 19:16 |
dansmith | and about 200 children seemed to trigger it in that bug | 19:16 |
dansmith | which I'm sure we're near or above, so just wondering if that's what is happening | 19:17 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/wallaby: Remove the stable branch jobs from stable/wallaby gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786308 | 19:17 |
dansmith | and, | 19:18 |
dansmith | that's by far the oldest child | 19:18 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/ussuri: Remove the stable branch jobs from stable/ussuri gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786310 | 19:20 |
openstackgerrit | Ghanshyam proposed openstack/devstack-plugin-ceph stable/train: Remove the stable branch jobs from stable/train gate https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786311 | 19:25 |
*** sboyron has quit IRC | 19:30 | |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 19:37 |
*** gcheresh has joined #openstack-qa | 19:52 | |
openstackgerrit | Merged openstack/devstack-plugin-ceph stable/wallaby: Update .gitreview for stable/wallaby https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786214 | 19:54 |
openstackgerrit | Merged openstack/devstack-plugin-ceph stable/wallaby: Update TOX_CONSTRAINTS_FILE for stable/wallaby https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786215 | 19:54 |
*** nweinber has quit IRC | 20:00 | |
*** gcheresh has quit IRC | 20:10 | |
dansmith | clarkb: you might be interested: https://github.com/mfragkoulis/bash/blob/master/jobs.c#L4213 | 20:12 |
dansmith | CHILD_MAX defaults to 32, by the way | 20:13 |
dansmith | johnsom: latest patch passes your DNM, but we probably need more than one run to confirm a fix | 20:19 |
johnsom | Yep, it's intermittent for sure | 20:25 |
dansmith | johnsom: so if it's cool, I will recheck your patch when this is done, before I go clean up the devstack patch for actual merge if it succeeds again | 20:26 |
dansmith | johnsom: it sounded like it was pretty frequent and that two successive passes would be a good indication that it's better right? | 20:26 |
johnsom | dansmith Be my guest. Yeah, it was repeatable, just not every time. | 20:27 |
dansmith | johnsom: what I meant was.. rechecking and then revising my patch to be cleaned up will take a little longer serial time, | 20:30 |
dansmith | I dunno how panicked you are about your gate being unstable | 20:30 |
johnsom | We are not in “critical” mode at the moment | 20:32 |
johnsom | This week would be nice. Grin | 20:33 |
dansmith | ack, well, I'll recheck this in a bit and try to get it cleaned up this afternoon after that second run if things look good | 20:33 |
clarkb | dansmith: ah so it is doing magic after all? | 20:38 |
dansmith | clarkb: yeah, I think it's actually catching SIGCHLD, recording the status and then just making wait read from that status | 20:38 |
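If `wait` is answered from a table of statuses that bash recorded earlier, then a status that has dropped out of that table is lost to the caller. One way to avoid depending on that table is for each background task to write its own exit status to a file. The sketch below is an assumption for illustration and is not taken from the devstack patch; `run_async`, `wait_async`, `status_dir` and `async_status` are hypothetical names.

```shell
#!/usr/bin/env bash
# Illustrative only: not the approach confirmed by the devstack patch.
status_dir=$(mktemp -d)

run_async() {
    name=$1
    shift
    # The background subshell records the task's exit status itself,
    # so the parent does not rely on the shell remembering it.
    ( "$@"; echo "$?" > "$status_dir/$name.status" ) &
    echo "$!" > "$status_dir/$name.pid"
}

wait_async() {
    name=$1
    # wait is only used to block until the child is gone; its own return
    # value is ignored. If the shell has already forgotten the child, the
    # child has exited and its status file is already written.
    wait "$(cat "$status_dir/$name.pid")" 2>/dev/null
    async_status=$(cat "$status_dir/$name.status")
}

# Hypothetical task that fails with a recognizable status.
fails_with_7() { return 7; }

run_async ok_task true
run_async bad_task fails_with_7

# wait_async must run in the main shell (not inside $(...)), because a
# subshell would not be the parent of the background children.
wait_async ok_task
ok_status=$async_status
wait_async bad_task
bad_status=$async_status

echo "ok_task: $ok_status, bad_task: $bad_status"
rm -rf "$status_dir"
```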
*** rcernin has joined #openstack-qa | 21:02 | |
dansmith | johnsom: second run also worked | 21:25 |
dansmith | johnsom: agree that two in a row was unlikely before the fix? | 21:25 |
johnsom | Are you watching the console on that DNM patch? | 21:26 |
johnsom | Ah, yeah, I see it started. | 21:26 |
johnsom | I would lean towards three to be sure | 21:27 |
dansmith | aight | 21:30 |
dansmith | well, I'll get the cleanup going and then we can recheck and get the third on top of that | 21:31 |
johnsom | Sounds like a pan | 21:31 |
johnsom | plan | 21:31 |
openstackgerrit | Merged openstack/grenade stable/wallaby: Make heat and octavia grenade jobs as voting https://review.opendev.org/c/openstack/grenade/+/785930 | 21:35 |
openstackgerrit | Merged openstack/grenade stable/ussuri: Make heat and octavia grenade jobs as voting https://review.opendev.org/c/openstack/grenade/+/785932 | 21:35 |
openstackgerrit | Dan Smith proposed openstack/devstack master: Add some debug to async_wait failures https://review.opendev.org/c/openstack/devstack/+/786250 | 21:36 |
openstackgerrit | Dan Smith proposed openstack/devstack master: Work around CHILD_MAX bash limitation for async https://review.opendev.org/c/openstack/devstack/+/786330 | 21:36 |
dansmith | johnsom: ^ | 21:37 |
johnsom | Nice, thanks~! | 21:37 |
*** rcernin has quit IRC | 21:45 | |
*** rcernin has joined #openstack-qa | 21:46 | |
*** rcernin has quit IRC | 22:20 | |
*** yamamoto has joined #openstack-qa | 22:30 | |
*** tkajinam has quit IRC | 22:43 | |
*** tkajinam has joined #openstack-qa | 22:52 | |
*** hamalq has quit IRC | 22:57 | |
*** hamalq has joined #openstack-qa | 22:57 | |
*** yamamoto has quit IRC | 22:59 | |
*** yamamoto has joined #openstack-qa | 22:59 | |
*** rcernin has joined #openstack-qa | 23:06 | |
openstackgerrit | Merged openstack/devstack-plugin-ceph stable/victoria: Update .gitreview for stable/victoria https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786212 | 23:10 |
openstackgerrit | Merged openstack/devstack-plugin-ceph stable/victoria: Update TOX_CONSTRAINTS_FILE for stable/victoria https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/786213 | 23:10 |
*** tosky has quit IRC | 23:29 | |
dansmith | johnsom: looks like we're good on #3 too | 23:49 |
johnsom | +1, I have already given my +1 vote on your patch. Thanks for the work on this. | 23:49 |
dansmith | johnsom: ah I see now | 23:50 |
*** irclogbot_0 has quit IRC | 23:50 | |
*** irclogbot_3 has joined #openstack-qa | 23:56 |