14:00:08 <mwhahaha> #startmeeting tripleo
14:00:08 <mwhahaha> #topic agenda
14:00:08 <mwhahaha> * Review past action items
14:00:08 <mwhahaha> * One off agenda items
14:00:08 <mwhahaha> * Squad status
14:00:08 <mwhahaha> * Bugs & Blueprints
14:00:08 <mwhahaha> * Projects releases or stable backports
14:00:08 <mwhahaha> * Specs
14:00:08 <mwhahaha> * open discussion
14:00:09 <openstack> Meeting started Tue Oct 16 14:00:08 2018 UTC and is due to finish in 60 minutes. The chair is mwhahaha. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:10 <mwhahaha> Anyone can use the #link, #action and #info commands, not just the moderator!
14:00:10 <mwhahaha> Hi everyone! who is around today?
14:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:13 <openstack> The meeting name has been set to 'tripleo'
14:00:32 <marios|rover> abishop: https://bugs.launchpad.net/tripleo/+bug/1797918
14:00:32 <openstack> Launchpad bug 1797918 in tripleo "teclient returns failures when attempting to provision a stack in rdo-cloud" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:00:35 <Tengu> «o/
14:00:41 <matbu> o/
14:00:41 <marios|rover> o/
14:00:43 <rfolco> o/
14:00:46 <Tengu> you're a bit early mwhahaha :)
14:00:53 <weshay> abishop, wait for the rdo-cloud status update
14:00:55 <fultonj> o/
14:00:56 <weshay> 0/
14:01:00 <mwhahaha> am not
14:01:03 <mwhahaha> i'm 8 seconds late
14:01:09 <mwhahaha> Meeting started Tue Oct 16 14:00:08 2018
14:01:10 <EmilienM> o/
14:01:12 <beagles> o/
14:01:18 <ksambor> o/
14:01:23 <abishop> weshay, marios|rover: ack, thx!
14:01:26 <abishop> o/
14:01:47 <Tengu> mwhahaha: ah, maybe my server is a bit late? thanks ntpd -.-
14:02:06 <mwhahaha> maybe you should be running chrony
14:02:07 <mwhahaha> :D
14:02:14 <Tengu> :]
14:02:17 <Tengu> easy target ;)
14:02:35 <panda> o/
14:02:48 <rlandy> o/
14:03:13 <ccamacho> o/
14:04:02 <mwhahaha> alright let's do this
14:04:09 <owalsh> o/
14:04:10 <mwhahaha> #topic review past action items
14:04:10 <mwhahaha> None!
14:04:29 <mwhahaha> since there were no action items let's move on
14:04:33 <mwhahaha> #topic one off agenda items
14:04:33 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:04:44 <mwhahaha> (bogdando) replace ovb jobs with multinode plus HW-provisioning done on fake QEMU VMs locally on undercloud, then switched to predeployed servers
14:04:46 <dprince> hi
14:04:47 <bogdando> o/
14:05:00 <bogdando> I hope the topic is self-explanatory?
14:05:08 <jrist> o/
14:05:13 <mwhahaha> not really
14:05:28 <mwhahaha> where would these multinode jobs run?
14:05:33 <bogdando> ok, it's about removing network communications while testing HW prov
14:06:17 <bogdando> once we are sure qemu vms can be introspected and provisioned, we switch to deployed servers and continue the multinode setup, instead of ovb
14:06:35 <bogdando> so the qemu vms are fakes to be thrown away
14:06:56 <openstackgerrit> Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: Run DLRN gate role only if to_build is true https://review.openstack.org/610728
14:06:56 <bogdando> the expectation is to see only ironic et al bits in action and have 0 interference from L2 networking
14:07:00 <mwhahaha> so would this be an additional step to be added to like the undercloud job?
14:07:07 <dtantsur> this removes coverage for non-deployed-servers flow, right?
14:07:18 <bogdando> dtantsur: not sure, thus asking/proposing
14:07:32 <dtantsur> btw your idea is quite close to what derekh is doing for his ironic-in-overcloud CI job
14:07:42 <bogdando> do we really test something of Ironic stack after the nodes are introspected?
14:07:50 <bogdando> interesting
14:07:52 <mwhahaha> yes
14:07:57 <dtantsur> tear down at least?
14:07:58 <mwhahaha> we need to make sure the images are good
14:08:02 <bogdando> I see
14:08:03 <mwhahaha> and that the provisioning aspect works
14:08:15 <bogdando> ok, could we provision those QEMU vms then?
14:08:18 <mwhahaha> so if you just wanted to test introspection early in the undercloud job, that might be ok
14:08:34 <mwhahaha> but we need ovb because it provides actual provisioning coverage (and coverage for our image building)
14:08:35 <dtantsur> bogdando: this is what devstack does. nested virt is slooooooooooooooooo
14:08:39 <dtantsur> ooooooooooooooooo
14:08:43 <dtantsur> you got the idea :)
14:08:47 * mwhahaha hands dtantsur an ow
14:09:03 <dtantsur> yay, here it is: ooow! thanks mwhahaha
14:09:06 <bogdando> well yes, but hopefully it doesn't take that long just to provision a node? got stats?
14:09:34 <derekh> bogdando: the ironic in overcloud job can be seen here https://review.openstack.org/#/c/582294/ see tripleo-ci-centos-7-scenario012-multinode-oooq-container
14:09:40 <dtantsur> no hard data, but expect a 2-3x slowdown of roughly everything
14:09:59 <mwhahaha> given our memory constraints i'm not sure we have enough to do the undercloud + some tiny vms for provisioning
14:10:15 <openstackgerrit> Michele Baldessari proposed openstack/puppet-tripleo master: Fix ceph-nfs duplicate property https://review.openstack.org/609599
14:10:17 <ooolpbot> URGENT TRIPLEO TASKS NEED ATTENTION
14:10:18 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797600
14:10:19 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797838
14:10:19 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797918
14:10:19 <openstack> Launchpad bug 1797600 in tripleo "Fixed interval looping call 'nova.virt.ironic.driver.IronicDriver._wait_for_active' failed: InstanceDeployFailure: Failed to set node power state to power on." [Critical,Incomplete]
14:10:20 <openstack> Launchpad bug 1797838 in tripleo "openstack-tox-linters job is failing for the tripleo-quickstart-extras master gate check" [Critical,In progress] - Assigned to Sorin Sbarnea (ssbarnea)
14:10:21 <openstack> Launchpad bug 1797918 in tripleo "teclient returns failures when attempting to provision a stack in rdo-cloud" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:10:24 <bogdando> Would be nice to see it in real CI job logs, how long it takes
14:10:31 <dtantsur> tiny?
14:10:39 <bogdando> in real ovb jobs
14:10:41 <dtantsur> our IPA images require 2G of RAM just to run...
14:11:01 <mwhahaha> right we can't do that on our upstream jobs
14:11:50 <bogdando> anyway, once the provisioning is done the idea was to throw the vms away and switch to multinode as usual
14:11:51 <openstackgerrit> Michele Baldessari proposed openstack/puppet-tripleo master: WIP Fix ipv6 addrlabel for ceph-nfs https://review.openstack.org/610987
14:12:16 <bogdando> maybe run some sanity checks before that...
14:12:41 <dtantsur> this removes the ability to check that our overcloud-full images 1. are usable, 2. can be deployed by ironic
14:12:48 <holser_> o/
14:12:59 <dtantsur> (this is assuming that ironic does its job correctly, which is verified in our CI.. modulo local boot)
14:13:01 <arxcruz> o/
14:13:11 <bogdando> dtantsur: ;-( for 1, but not sure for 2
14:13:31 <dtantsur> bogdando: ironic has some requirements for images. e.g. since we default to local boot, we need grub2 there.
14:13:41 <dtantsur> and if/when we switch to uefi, fun will increase
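(For reference — a rough sketch, not the project's actual CI coverage, of the kind of image check dtantsur is describing just above: confirming that an overcloud-full image carries the grub2 bits ironic's local-boot default relies on. It assumes libguestfs-tools is installed on the host; the image path is only an example.)

    # Look inside the image without booting it; grub2-install should be present
    # in /usr/sbin if the grub2 packages were built into the image.
    IMG=overcloud-full.qcow2
    if virt-ls -a "$IMG" /usr/sbin | grep -qx 'grub2-install'; then
        echo "grub2 looks present in $IMG"
    else
        echo "grub2 appears to be missing from $IMG" >&2
    fi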
14:13:42 <bogdando> I thought that 2 will be covered by Ironic provisioning those images on qemu
14:13:50 <weshay> hrm.. need scenario12 doc'd https://github.com/openstack/tripleo-heat-templates/blob/master/README.rst#service-testing-matrix
14:14:01 <derekh> weshay: will do
14:14:17 <weshay> thank you sir /me looks at code
14:14:24 <bogdando> ok, how is that idea different to what derekh does?
14:15:08 <dtantsur> bogdando: derekh only needs to verify that ironic boots stuff. he does not need to verify the final result.
14:15:28 <bogdando> also, an alternative may be to convert those qemu into images for the host cloud
14:15:36 <bogdando> to continue with multi-node
14:15:56 <weshay> fyi.. anyone else.. scenario12 defined here https://review.openstack.org/#/c/579603/
14:15:59 <bogdando> the takeaway is we avoided L2 dances :)
14:16:05 <dtantsur> bogdando: what's the problem you're trying to solve?
14:16:18 <dtantsur> these L2 dances you're referring to?
14:16:20 <bogdando> avoiding networking issues
14:16:34 <mwhahaha> i think we need to fix those and not just work around them in ci
14:16:44 <dtantsur> ++
14:16:46 <bogdando> the main thing to solve is https://bugs.launchpad.net/tripleo/+bug/1797526
14:16:46 <openstack> Launchpad bug 1797526 in tripleo "Failed to get power state for node FS01/02" [Critical,Triaged]
14:16:48 <mwhahaha> we need proper coverage for this
14:16:54 <bogdando> no one knows how to solve it, it seems
14:17:07 <bogdando> and a bunch of similar issues hitting us periodically
14:17:09 <mwhahaha> this also seems like a good thing to add some resilience to
14:17:17 <dtantsur> that's bad, but won't we just move the problems to a different part?
14:17:41 <dtantsur> I can tell you that nested virt does give us a hard time around networking from time to time
14:17:51 <dtantsur> e.g. ubuntu updated their ipxe ROM - and we're broken :(
14:17:54 <weshay> noting that it's not easy to delineate the issues we're seeing in 3rd party atm..
14:18:03 <bogdando> I do not know tbh, if solving network issues is the domain of tripleo
14:18:08 <weshay> introspection errors can be caused by rdo-cloud networking
14:18:11 <bogdando> the base OS rather
14:18:15 <bogdando> and infrastructure
14:18:16 <mwhahaha> so it seems like it would be better to invest in improved logging/error correction in the introspection process
14:18:17 <bogdando> not tripleo
14:18:19 <weshay> it's a very unstable env atm
14:18:36 <dprince> is this something that a retry would resolve? Like ironic.conf could be tuned for this a bit perhaps?
14:18:37 <weshay> ya.. we have a patch to tcpdump during introspection
14:18:37 <mwhahaha> rather than coming up with a complex ci job to just skip it
14:18:42 <bogdando> I do not mean control/data plane network ofc
14:18:56 <bogdando> but provisioning and hardware layers
14:19:29 <dtantsur> bogdando: OVB is an openstack cloud. if we cannot make it reliable enough for us.. we're in trouble
14:19:33 <bogdando> if integration testing may be done w/o real networking involved, only virtual SDNs, maybe we can accept that? dunno
14:19:49 <derekh> but the problems in CI (at least the ones I've looked at) were where ipmi to the bmc wasn't working over a longer term, retries ain't going to help
14:19:49 <bogdando> just not sure if that would be moving the problems to a different part
14:20:30 <weshay> dtantsur, agree.. there is work being done to address it. apevec has some details there however right now I agree it's unstable and painful
14:20:30 <dtantsur> what's the problem with IPMI? the BMC is unstable? using UDP?
14:20:35 <derekh> Can you make periodic jobs leave envs around on failure so they can be debugged ?
14:20:37 <bogdando> dtantsur: it is not related to the testing patches
14:20:38 <weshay> and it's causing a lot of false negatives
14:20:46 <dtantsur> if the former, we can fix it. if the latter, we can switch to redfish (http based)
14:20:53 <bogdando> only produces irrelevant noise and tons of grey failures
14:20:58 <weshay> derekh, you can recreate those w/ the reproducer scripts
14:21:05 <bogdando> weshay: ++
14:21:06 <weshay> in logs/reproduce-foo.sh
14:21:12 <bogdando> for false negatives
14:21:15 <dprince> derekh: I'd like to know more on this. If it's a long term connectivity issue.. Then that is our problem right?
14:21:24 <derekh> weshay: yes, but we were not able to on friday
14:21:34 <bogdando> basically, 99% of errors one can observe in those integration builds had nothing to do with the subject of testing
14:22:09 <bogdando> (taking numbers out of thin air)
14:22:16 <openstackgerrit> John Fulton proposed openstack/tripleo-common master: Remove ceph-ansible Mistral workflow https://review.openstack.org/604783
14:22:29 <dtantsur> are we sure IPMI is always to blame in these 99% of made up cases? :)
14:22:32 <weshay> derekh, k k.. not sure what you were hitting, it's hit or miss w/ the cloud atm, but also happy to help. derekh, probably getting w/ panda locally will help too
14:22:33 <derekh> It looks to me like either 1. ipmi traffic from the undercloud to the bmc node was dispupted or 2. traffic from the BMC to the rdo-cloud api was distrupted (perhaps because of DNS outages)
14:22:49 <weshay> ya
14:23:02 <derekh> *disrupted
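(A minimal sketch of the sort of long-running BMC reachability probe that would separate "the BMC stopped answering for a long stretch" from "ironic gave up after its retries" — the distinction derekh is drawing here. The BMC addresses and credentials are placeholders, not real OVB values.)

    # Poll each OVB BMC once a minute from the undercloud and log failures with
    # timestamps, so sustained outages stand out from one-off packet loss.
    BMCS="192.0.2.10 192.0.2.11"          # placeholder BMC addresses
    while true; do
        for bmc in $BMCS; do
            if ! timeout 30 ipmitool -I lanplus -H "$bmc" -U admin -P password \
                    power status >/dev/null 2>&1; then
                echo "$(date -u +%FT%TZ) $bmc: ipmi power status failed" >> ~/ipmi-probe.log
            fi
        done
        sleep 60
    done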
14:23:26 <bogdando> mwhahaha: so not sure for "we need to fix those"
14:23:29 <bogdando> :)
14:23:41 <weshay> the next topic in the agenda is related
14:23:47 <weshay> basically rdo-cloud status
14:24:00 <mwhahaha> k let's dive into the next topic then
14:24:00 <mwhahaha> RDO-Cloud and Third Party TripleO CI
14:24:06 <mwhahaha> weshay: is that yours
14:24:19 <weshay> ya.. just basically summarizing what most of you already know
14:24:33 <weshay> we have an unstable env for 3rd party atm.. and it's causing pain
14:24:53 <weshay> the rdo team has built a new ovs package based on osp and we're waiting on ops to install
14:25:03 <mwhahaha> should we remove it from check until we can resolve stability?
14:25:10 <weshay> nhicher, is also experimenting with other 3rd party clouds
14:25:13 <openstackgerrit> Derek Higgins proposed openstack/tripleo-heat-templates master: Add scenario 012 - overlcoud baremetal+ansible-ml2 https://review.openstack.org/579603
14:25:24 <dtantsur> mwhahaha: if it's not gating, you can ignore it even if it's voting
14:25:24 <mwhahaha> so we can actually devote efforts to troubleshoot rather than continue to add useless load and possibly hiding issues?
14:25:34 <mwhahaha> dtantsur: i mean not even run it on patches
14:25:35 <weshay> mwhahaha, I think that's on the table for sure.. however I can't say I'm very comfortable with that idea
14:25:37 <dtantsur> ah
14:25:56 <mwhahaha> if it's 25% what's the point of even running it other than wasting resources
14:26:00 <dtantsur> well, that means not touching anything related to ironic until it gets fixed
14:26:03 <dtantsur> (which may be fine)
14:26:12 <weshay> testing and loading the cloud?? ya .. it's a fair question
14:26:28 <weshay> mwhahaha, we still need rdo-cloud for master promotions and other branches
14:26:33 <weshay> so the cloud NEEDS to work
14:26:48 <weshay> I've had luck on the weekends
14:27:00 <mwhahaha> if it's fine on the weekends that points to load related issues
14:27:08 <weshay> fine is relative
14:27:10 <derekh> in the meantime merge the previously mentioned scenario012 and use it to verify ironic in overcloud instead of the undercloud
14:27:11 <weshay> better maybe
14:28:13 <mwhahaha> weshay: do we know what has changed in the cloud itself since the stability issues started?
14:28:45 <weshay> mwhahaha, personally I'd like to see the admins, tripleo folks, and ironic folks chat for a minute and see what we can do to improve the situation there
14:28:56 <huynq> Hi, What about the Undercloud OS upgrade?
14:29:03 <weshay> mwhahaha, I have 0 insight into any changes
14:29:23 <dtantsur> I'm ready to join such a chat, even though I'm not sure we can do much on the ironic side
14:29:35 <weshay> better lines of communication between tripleo, ops, and ironic, ci etc.. are needed..
14:29:35 <dtantsur> (well, retries for IPMI are certainly an option)
14:30:05 <weshay> dtantsur, agree however being defensive and getting the steps in ci that prove it's NOT ironic would be helpful I think
14:30:20 <weshay> it's too easy for folks to just think.. oh look.. introspection failed
14:30:22 <weshay> imho
14:30:39 <dtantsur> unstable IPMI can be hard to detect outside of ironic
14:30:40 <weshay> when really it's the networking or other infra issue
14:30:54 <dtantsur> as I said, if UDP is a problem, we have TCP-based protocols to consider
14:31:11 <weshay> ah.. interesting
14:31:14 <panda> maybe add a tcpdump log from a tcpdump in the background ?
14:31:23 <weshay> panda, ya.. there is a patch up to do that
14:31:27 <rlandy> we added that already
14:31:28 <weshay> rascasoft, already added that code :)
14:31:40 <bogdando> > as I said, if UDP is a problem, we have TCP-based protocols to consider
14:31:40 <bogdando> good idea to explore
14:31:49 <derekh> along with a tcpdump on the BMC node
14:31:51 <panda> I can never have an original idea :(
14:31:57 <weshay> lolz
14:32:07 <rlandy> would the IPMI retries go into the CI code or the ironic code?
14:32:20 <rascasoft> actually I used it to debug a previous ironic problem so it should fit also this issue
14:32:29 <rlandy> we had retries at one point
14:32:45 <dtantsur> we have retries in ironic. we can have moar of them for the CI.
14:33:02 <panda> rascasoft: pass me the link to that change in pvt please
14:33:21 <rascasoft> panda, it's merged into extras
14:33:34 <weshay> lol
14:33:49 <mwhahaha> dtantsur: are they on by default?
14:33:54 <mwhahaha> or is it tunable in a way?
14:33:58 <mwhahaha> (and are we doing that)
14:33:59 <rascasoft> panda, https://github.com/openstack/tripleo-quickstart-extras/commit/78c60d2102f5e22c3abb30bb0c8179d4c999829c
14:34:03 <dtantsur> mwhahaha: yes, yes, dunno
14:34:35 <weshay> I like the idea of not using udp if possible
14:34:41 <rlandy> dtantsur: what's involved with switching to TCP-based protocols - can we discuss after the meeting? we can try it
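(To mwhahaha's question about whether the ironic-side retries are tunable: a sketch only, using the two [ipmi] options dtantsur links a bit further down. The values are arbitrary examples, not recommendations, and on a TripleO undercloud these would normally be set through the deployment's own configuration mechanism rather than edited by hand.)

    # Stretch how long ironic keeps retrying a failed IPMI command, and the
    # minimum wait between ipmitool invocations (both in seconds).
    sudo crudini --set /etc/ironic/ironic.conf ipmi command_retry_timeout 120
    sudo crudini --set /etc/ironic/ironic.conf ipmi min_command_interval 5
    # then restart the ironic conductor service/container (name varies by release)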
14:35:00 <weshay> and maybe ironic erroring out with nodes unreachable, please check network
14:35:01 <bogdando> but https://github.com/openstack/tripleo-quickstart-extras/commit/78c60d2102f5e22c3abb30bb0c8179d4c999829c doesn't show the vbmc VM side
14:35:03 <dtantsur> it's a different protocol, so the BMC part will have to be changed
14:35:15 <dtantsur> hint: etingof may be quite happy to work on anything redfish-related :)
14:35:16 <bogdando> and we need to place dstate there also IMO
14:35:22 <bogdando> dstat
14:35:23 <mwhahaha> #action rlandy, weshay, dtantsur to look into switching to TCP for IPMI and possibly tuning retries
14:35:26 <mwhahaha> tag, you're it
14:35:59 <derekh> btw, yesterday I plotted these ipmi outages on an undercloud I'm running on rdo-cloud, they lasted hours, retries and tcp aren't going to help that https://plot.ly/create/?fid=higginsd:2#/
14:36:05 <weshay> can we just blame evilien?
14:36:19 <weshay> derekh++
14:36:47 <dtantsur> yeah, hours is too much
14:36:47 <derekh> in the case of the env about DNS wasn't working (using 1.1.1.1, so the BMC couldn't talk to the rdo-cloud API)
14:36:49 <EmilienM> :-o
14:37:24 <derekh> /about/above/
14:37:27 <weshay> dtantsur, derekh so would you guys be open to the idea of an ovb no-change job that runs say every 4 hours or so and does some kind of health report on our 3rd party env?
14:37:28 <dtantsur> for the record: these two options together control ipmi timeouts: https://docs.openstack.org/ironic/latest/configuration/config.html#ipmi
14:37:36 <weshay> until we have this solved?
14:37:59 <dtantsur> weshay: you mean, instead of voting OVB jobs?
14:38:23 <weshay> I think the two are independent proposals, both I think should be considered
14:38:42 * dtantsur welcomes etingof
14:38:51 <derekh> weshay: if it could be left up afterwards it would be great, I'd be happy to jump on and try and debug it
14:38:52 <weshay> I think ironic takes the brunt of at least the current issues in our 3rd party cloud
14:39:05 <weshay> rock on..
14:39:06 <dtantsur> etingof: tl;dr a lot of ipmi outages in the ovb CI, thinking of various ways around it
14:39:37 <weshay> really happy to see you guys in the mix and proactive here.. this has been very painful for the project.. so thank you!!!!!
14:39:38 <etingof> that reminds me of my ipmi patches
14:39:45 <dtantsur> etingof: yep, this is also relevant
14:39:49 <bogdando> https://imgflip.com/i/2k8a00
14:39:49 <derekh> weshay: actually if you want I could just do that on my own rdo account
14:39:57 <dtantsur> we may want to increase retries here, so we'd better sort out our side
14:40:13 <weshay> I'm not sure if I see the same networking issues in personal tenants as I see in the openstack-infra tenant
14:40:26 <weshay> not sure what others have noticed
14:40:28 <derekh> weshay: ok
14:40:31 <dtantsur> etingof: but also thinking if switching ovb to redfish will make things a bit better
14:40:40 <etingof> so I have this killer patch that limits ipmitool run time -- https://review.openstack.org/#/c/610007/
14:41:14 <panda> getting tcpdumps from the BMC is a bit more difficult
14:41:18 <etingof> that patch might hopefully improve the situation when we have many nodes down
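(A rough manual version of the "tcpdump in the background" idea panda and derekh raise above — the extras patch rascasoft linked already does the undercloud side in CI. IPMI/RMCP traffic uses UDP port 623; the interface name and file path here are placeholders.)

    # Capture IPMI traffic on the undercloud while introspection runs; comparing
    # this with a capture taken on the OVB BMC node (harder to arrange, as panda
    # notes) shows whether requests ever left, and where replies got lost.
    sudo tcpdump -i eth0 -nn -w /var/log/ipmi-undercloud.pcap udp port 623 &
    TCPDUMP_PID=$!
    # ... run introspection / the failing job here ...
    sudo kill "$TCPDUMP_PID"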
14:41:23 <weshay> https://docs.openstack.org/ironic/pike/admin/drivers/redfish.html ?
14:41:36 <dtantsur> weshay: yep
14:42:21 <etingof> we would probably have to switch from virtualbmc to the sushy-tools emulator
14:43:20 <mwhahaha> the redfish stuff seems like a longer term investment
14:43:29 <dtantsur> etingof: ovb does not use vbmc, but its own bmc implementation
14:43:42 <dtantsur> well, but you have a point: sushy-tools has a redfish-to-openstack bridge already
14:44:19 <etingof> dtantsur, ah, right! because it manages OS instances rather than libvirt?
14:44:24 <dtantsur> yep
14:44:37 <etingof> luckily, sushy-tools can do both right out of the box \o/
14:45:03 <dtantsur> it's a big plus indeed, it means we can remove the protocol code from ovb itself
14:45:25 <weshay> help me understand how sushy-tools and redfish would help introspection / provisioning when networking is very unstable
14:45:35 <weshay> maybe offline, but that is not clear to me yet
14:45:47 <dtantsur> weshay: simply because of using tcp instead of udp for power actions
14:45:54 <weshay> ah k
14:45:58 <dtantsur> I'm not sure if it's going to be a big contribution or not
14:46:01 <dtantsur> but worth trying IMO
14:46:09 <dtantsur> esp. since the world is (slowly) moving towards redfish anyway
14:46:10 <weshay> agree, should be on the table as well
14:46:13 <weshay> nhicher, ^^^
14:46:41 <derekh> sounds like a lot of work to take on before figuring out the problem
14:46:59 <dtantsur> well, the proper fix is to make networking stable
14:47:01 <derekh> although possibly work we'll do eventually anyways
14:47:08 <weshay> we should wait for the admins to upgrade ovs imho
14:47:13 <weshay> run for a day or two w/ that
14:47:22 <mwhahaha> any ETA on the OVS upgrade?
14:47:25 <etingof> will we have to maintain ipmi-based deployments for a long time (forever)?
14:47:34 <weshay> asking admins to join
14:47:46 <etingof> because people in the trenches still use ipmi
14:47:52 <mwhahaha> yea we'll have to support it
14:48:12 <weshay> amoralej, do you know?
14:48:16 <mwhahaha> but we don't necessarily need to rely on it 100% upstream
14:48:22 <dtantsur> this ^^^
14:48:53 <Tengu> well it needs to be tested anyway, if it's supported.
14:48:57 <mwhahaha> anyway so it looks like there are some possible improvements to OVB that could be had, who wants to take a look at that?
14:49:05 <weshay> <kforde> weshay, I need to do it in staging ... that will be tomorrow
14:49:06 <amoralej> weshay, what's the question?
14:49:12 <mwhahaha> Tengu: yea but we could limit it to a single feature set rather than all OVB jobs
14:49:16 <amoralej> ovs update?
14:49:18 <weshay> amoralej, when we're going to get the new ovs package in rdo-cloud
14:49:25 <Tengu> mwhahaha: yup.
14:49:32 <mwhahaha> Tengu: and we can just do a simple 1+1 deploy rather than full HA
14:49:44 <Tengu> +1
14:49:52 <mwhahaha> Tengu: we technically support the redfish stuff (i think) but we don't have any upstream coverage in tripleo
14:49:57 <amoralej> weshay, no idea about the plan to deploy it in rdo-cloud
14:49:59 <mwhahaha> anyway we need to move on
14:50:10 <amoralej> alan did a build that hopefully will help
14:50:20 <mwhahaha> who wants to take on the sushy-tools ovb review bits? dtantsur or etingof?
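(Purely illustrative, not the actual OVB integration being discussed: what the sushy-tools route sketched by etingof and dtantsur might look like. The emulator can drive OpenStack instances the way OVB's BMC does today, and ironic then speaks Redfish over HTTP/TCP instead of IPMI over UDP. The flags, cloud name, and node details below are assumptions for the sketch, and the redfish hardware type would also have to be enabled in ironic.conf.)

    # Run the Redfish emulator against a cloud defined in clouds.yaml
    # ("rdo-cloud" is a placeholder entry), then enrol a node over Redfish.
    pip install --user sushy-tools
    sushy-emulator --os-cloud rdo-cloud --port 8000 &
    openstack baremetal node create --driver redfish \
        --driver-info redfish_address=http://127.0.0.1:8000 \
        --driver-info redfish_system_id=/redfish/v1/Systems/<instance-uuid> \
        --driver-info redfish_username=admin \
        --driver-info redfish_password=password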
14:50:45 <weshay> amoralej, k.. we need to expose these details more to the folks consuming 3rd party imho
14:50:48 * dtantsur pushes etingof forward :D
14:51:00 <mwhahaha> #action etingof to take a look at OVB+sushy-tools
14:51:00 <mwhahaha> done
14:51:03 <mwhahaha> moving on :D
14:51:04 <bogdando> dtantsur: network is unreliable by definition :)
14:51:19 <dtantsur> bogdando: yeah, there are grades of unreliability :)
14:51:27 <bogdando> especially thinking of edge cases in the future, we barely can/should "fix" it
14:51:41 <mwhahaha> in all seriousness we do need to move forward, please feel free to continue this discussion after the meeting or on the ML
14:51:51 <dtantsur> k k
14:52:01 <mwhahaha> (rfolco) CI Community Meeting starts immediately upon this meeting closing in #oooq. This is our team's weekly "open office hours." All are welcome! Ask/discuss anything, we don't bite. Agenda (add items freely) --> https://etherpad.openstack.org/p/tripleo-ci-squad-meeting ~ L49.
14:52:40 <mwhahaha> that's it on the meeting items
14:52:50 <mwhahaha> moving on to the status portion of our agenda
14:52:55 <mwhahaha> #topic Squad status
14:52:55 <mwhahaha> ci
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-ci-squad-meeting
14:52:55 <mwhahaha> upgrade
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-upgrade-squad-status
14:52:55 <mwhahaha> containers
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-containers-squad-status
14:52:56 <mwhahaha> edge
14:52:56 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-edge-squad-status
14:52:57 <mwhahaha> integration
14:52:57 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-integration-squad-status
14:52:58 <mwhahaha> ui/cli
14:52:58 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-ui-cli-squad-status
14:52:59 <mwhahaha> validations
14:52:59 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-validations-squad-status
14:53:00 <mwhahaha> networking
14:53:01 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-networking-squad-status
14:53:01 <mwhahaha> workflows
14:53:01 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-workflows-squad-status
14:53:02 <mwhahaha> security
14:53:02 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-security-squad
14:53:16 <mwhahaha> any particular highlights that anyone would like to raise?
14:54:50 <mwhahaha> sounds like no
14:55:13 <mwhahaha> #topic bugs & blueprints
14:55:13 <mwhahaha> #link https://launchpad.net/tripleo/+milestone/stein-1
14:55:13 <mwhahaha> For Stein we currently have 28 blueprints and about 743 open Launchpad bugs. 739 stein-1, 4 stein-2. 102 open Storyboard bugs.
14:55:13 <mwhahaha> #link https://storyboard.openstack.org/#!/project_group/76
14:55:47 <weshay> rfolco, post the link for the bluejeans ci here please
14:55:58 <mwhahaha> please take a look at the open blueprints as there are a few without approvals and such
14:56:07 <weshay> oops sorry
14:56:11 <rfolco> https://bluejeans.com/5878458097 --> ci community mtg bj
14:56:21 <mwhahaha> any specific bugs, other than the rdo cloud ones, that people want to point out?
14:57:18 <mwhahaha> sounds like no on that as well
14:57:19 <mwhahaha> #topic projects releases or stable backports
14:57:25 <mwhahaha> EmilienM: stable releases?
14:57:56 <EmilienM> mwhahaha: I do them once a month now
14:58:01 <EmilienM> no updates this week
14:58:04 <mwhahaha> k
14:58:11 <mwhahaha> #topic specs
14:58:11 <mwhahaha> #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:58:38 <mwhahaha> please take some time to review the open specs, looks like there are only a few for stein
14:58:45 <mwhahaha> #topic open discussion
14:58:47 <mwhahaha> anything else?
14:59:42 <huynq> mwhahaha: What about the Undercloud/Overcloud OS upgrade?
14:59:52 <mwhahaha> huynq: upgrade to what?
15:00:03 <openstackgerrit> Marius Cornea proposed openstack/tripleo-upgrade master: Add build option to plugin.spec https://review.openstack.org/611005
15:00:19 <slagle> thrash: therve : is it an ok idea to kick off a workflow from within an action?
15:00:33 <thrash> slagle: Uhhh... I would say no
15:00:35 <huynq> mwhahaha: e.g. from CentOS7 to CentOS8
15:01:16 <mwhahaha> huynq: some folks are still looking into that (cc: chem, holser_) but i'm not sure we have a solid plan at the moment due to lack of CentOS8
15:01:25 <mwhahaha> alright we're out of time
15:01:28 <mwhahaha> #endmeeting