14:00:08 <mwhahaha> #startmeeting tripleo
14:00:08 <mwhahaha> #topic agenda
14:00:08 <mwhahaha> * Review past action items
14:00:08 <mwhahaha> * One off agenda items
14:00:08 <mwhahaha> * Squad status
14:00:08 <mwhahaha> * Bugs & Blueprints
14:00:08 <mwhahaha> * Projects releases or stable backports
14:00:08 <mwhahaha> * Specs
14:00:08 <mwhahaha> * open discussion
14:00:09 <openstack> Meeting started Tue Oct 16 14:00:08 2018 UTC and is due to finish in 60 minutes. The chair is mwhahaha. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:10 <mwhahaha> Anyone can use the #link, #action and #info commands, not just the moderator!
14:00:10 <mwhahaha> Hi everyone! who is around today?
14:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:13 <openstack> The meeting name has been set to 'tripleo'
14:00:32 <marios|rover> abishop: https://bugs.launchpad.net/tripleo/+bug/1797918
14:00:32 <openstack> Launchpad bug 1797918 in tripleo "teclient returns failures when attempting to provision a stack in rdo-cloud" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:00:35 <Tengu> «o/
14:00:41 <matbu> o/
14:00:41 <marios|rover> o/
14:00:43 <rfolco> o/
14:00:46 <Tengu> you're a bit early mwhahaha :)
14:00:53 <weshay> abishop, wait for the rdo-cloud status update
14:00:55 <fultonj> o/
14:00:56 <weshay> 0/
14:01:00 <mwhahaha> am not
14:01:03 <mwhahaha> i'm 8 seconds late
14:01:09 <mwhahaha> Meeting started Tue Oct 16 14:00:08 2018
14:01:10 <EmilienM> o/
14:01:12 <beagles> o/
14:01:18 <ksambor> o/
14:01:23 <abishop> weshay, marios|rover: ack, thx!
14:01:26 <abishop> o/
14:01:47 <Tengu> mwhahaha: ah, maybe my server is a bit late? thanks ntpd -.-
14:02:06 <mwhahaha> maybe you should be running chrony
14:02:07 <mwhahaha> :D
14:02:14 <Tengu> :]
14:02:17 <Tengu> easy target ;)
14:02:35 <panda> o/
14:02:48 <rlandy> o/
14:03:13 <ccamacho> o/
14:04:02 <mwhahaha> alright let's do this
14:04:09 <owalsh> o/
14:04:10 <mwhahaha> #topic review past action items
14:04:10 <mwhahaha> None!
14:04:29 <mwhahaha> since there were no action items let's move on
14:04:33 <mwhahaha> #topic one off agenda items
14:04:33 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:04:44 <mwhahaha> (bogdando) replace ovb jobs with multinode plus HW-provisioning done on fake QEMU VMs locally on undercloud, then switched to predeployed servers
14:04:46 <dprince> hi
14:04:47 <bogdando> o/
14:05:00 <bogdando> I hope the topic is self-explanatory?
14:05:08 <jrist> o/
14:05:13 <mwhahaha> not really
14:05:28 <mwhahaha> where would these multinode jobs run?
14:05:33 <bogdando> ok, it's about removing network communications while testing HW prov
14:06:17 <bogdando> once we are sure qemu vms can be introspected and provisioned, we switch to deployed servers and continue the multinode setup, instead of ovb
14:06:35 <bogdando> so the qemu vms are fakes to be thrown away
14:06:56 <openstackgerrit> Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: Run DLRN gate role only if to_build is true https://review.openstack.org/610728
14:06:56 <bogdando> the expectation is to see only ironic et al bits in action and have 0 interference from L2 networking
14:07:00 <mwhahaha> so would this be an additional step to be added to like the undercloud job?
14:07:07 <dtantsur> this removes coverage for non-deployed-servers flow, right?
14:07:18 <bogdando> dtantsur: not sure, thus asking/proposing
14:07:32 <dtantsur> btw your idea is quite close to what derekh is doing for his ironic-in-overcloud CI job
14:07:42 <bogdando> do we really test something of Ironic stack after the nodes are introspected?
14:07:50 <bogdando> interesting
14:07:52 <mwhahaha> yes
14:07:57 <dtantsur> tear down at least?
14:07:58 <mwhahaha> we need to make sure the images are good
14:08:02 <bogdando> I see
14:08:03 <mwhahaha> and that the provisioning aspect works
14:08:15 <bogdando> ok, could we provision those QEMU vms then?
14:08:18 <mwhahaha> so if you just wanted to test introspection early in the undercloud job, that might be ok
14:08:34 <mwhahaha> but we need ovb because it provides actual provisioning coverage (and coverage for our image building)
14:08:35 <dtantsur> bogdando: this is what devstack does. nested virt is slooooooooooooooooo
14:08:39 <dtantsur> ooooooooooooooooo
14:08:43 <dtantsur> you got the idea :)
14:08:47 * mwhahaha hands dtantsur an ow
14:09:03 <dtantsur> yay, here it is: ooow! thanks mwhahaha
14:09:06 <bogdando> well yes, but hopefully it doesn't take that long just to provision a node? got stats?
14:09:34 <derekh> bogdando: the ironic in overcloud job can be seen here https://review.openstack.org/#/c/582294/ see tripleo-ci-centos-7-scenario012-multinode-oooq-container
14:09:40 <dtantsur> no hard data, but expect a 2-3x slowdown of roughly everything
14:09:59 <mwhahaha> given our memory constraints i'm not sure we have enough to do the undercloud + some tiny vms for provisioning
14:10:15 <openstackgerrit> Michele Baldessari proposed openstack/puppet-tripleo master: Fix ceph-nfs duplicate property https://review.openstack.org/609599
14:10:17 <ooolpbot> URGENT TRIPLEO TASKS NEED ATTENTION
14:10:18 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797600
14:10:19 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797838
14:10:19 <ooolpbot> https://bugs.launchpad.net/tripleo/+bug/1797918
14:10:19 <openstack> Launchpad bug 1797600 in tripleo "Fixed interval looping call 'nova.virt.ironic.driver.IronicDriver._wait_for_active' failed: InstanceDeployFailure: Failed to set node power state to power on." [Critical,Incomplete]
14:10:20 <openstack> Launchpad bug 1797838 in tripleo "openstack-tox-linters job is failing for the tripleo-quickstart-extras master gate check" [Critical,In progress] - Assigned to Sorin Sbarnea (ssbarnea)
14:10:21 <openstack> Launchpad bug 1797918 in tripleo "teclient returns failures when attempting to provision a stack in rdo-cloud" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:10:24 <bogdando> Would be nice to see it in real CI job logs, how long it takes
14:10:31 <dtantsur> tiny?
14:10:39 <bogdando> in real ovb jobs
14:10:41 <dtantsur> our IPA images require 2G of RAM just to run...
14:11:01 <mwhahaha> right we can't do that on our upstream jobs
14:11:50 <bogdando> anyway, once the provisioning is done the idea was to throw the vms away and switch to multinode as usual
14:11:51 <openstackgerrit> Michele Baldessari proposed openstack/puppet-tripleo master: WIP Fix ipv6 addrlabel for ceph-nfs https://review.openstack.org/610987
14:12:16 <bogdando> maybe run some sanity checks before that...
14:12:41 <dtantsur> this removes the ability to check that our overcloud-full images 1. are usable, 2. can be deployed by ironic
14:12:48 <holser_> o/
14:12:59 <dtantsur> (this is assuming that ironic does its job correctly, which is verified in our CI.. modulo local boot)
14:13:01 <arxcruz> o/
14:13:11 <bogdando> dtantsur: ;-( for 1, but not sure for 2
14:13:31 <dtantsur> bogdando: ironic has some requirements for images. e.g. since we default to local boot, we need grub2 there.
14:13:41 <dtantsur> and if/when we switch to uefi, fun will increase
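(For reference — a rough sketch, not the project's actual CI coverage, of the kind of image check dtantsur is describing just above: confirming that an overcloud-full image carries the grub2 bits ironic's local-boot default relies on. It assumes libguestfs-tools is installed on the host; the image path is only an example.)

    # Look inside the image without booting it; grub2-install should be present
    # in /usr/sbin if the grub2 packages were built into the image.
    IMG=overcloud-full.qcow2
    if virt-ls -a "$IMG" /usr/sbin | grep -qx 'grub2-install'; then
        echo "grub2 looks present in $IMG"
    else
        echo "grub2 appears to be missing from $IMG" >&2
    fi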
14:13:42 <bogdando> I thought that 2 will be covered by Ironic provisioning those images on qemu
14:13:50 <weshay> hrm.. need scenario12 doc'd https://github.com/openstack/tripleo-heat-templates/blob/master/README.rst#service-testing-matrix
14:14:01 <derekh> weshay: will do
14:14:17 <weshay> thank you sir /me looks at code
14:14:24 <bogdando> ok, how is that idea different to what derekh does?
14:15:08 <dtantsur> bogdando: derekh only needs to verify that ironic boots stuff. he does not need to verify the final result.
14:15:28 <bogdando> also, an alternative may be to convert those qemu into images for the host cloud
14:15:36 <bogdando> to continue with multi-node
14:15:56 <weshay> fyi.. anyone else.. scenario12 defined here https://review.openstack.org/#/c/579603/
14:15:59 <bogdando> the takeaway is we avoided L2 dances :)
14:16:05 <dtantsur> bogdando: what's the problem you're trying to solve?
14:16:18 <dtantsur> these L2 dances you're referring to?
14:16:20 <bogdando> avoiding networking issues
14:16:34 <mwhahaha> i think we need to fix those and not just work around them in ci
14:16:44 <dtantsur> ++
14:16:46 <bogdando> the main thing to solve is https://bugs.launchpad.net/tripleo/+bug/1797526
14:16:46 <openstack> Launchpad bug 1797526 in tripleo "Failed to get power state for node FS01/02" [Critical,Triaged]
14:16:48 <mwhahaha> we need proper coverage for this
14:16:54 <bogdando> no one knows how to solve it, it seems
14:17:07 <bogdando> and a bunch of similar issues hitting us periodically
14:17:09 <mwhahaha> this also seems like a good thing to add some resilience to
14:17:17 <dtantsur> that's bad, but won't we just move the problems to a different part?
14:17:41 <dtantsur> I can tell you that nested virt does give us a hard time around networking from time to time
14:17:51 <dtantsur> e.g. ubuntu updated their ipxe ROM - and we're broken :(
14:17:54 <weshay> noting that it's not easy to delineate the issues we're seeing in 3rd party atm..
14:18:03 <bogdando> I do not know tbh, if solving network issues is the domain of tripleo
14:18:08 <weshay> introspection errors can be caused by rdo-cloud networking
14:18:11 <bogdando> the base OS rather
14:18:15 <bogdando> and infrastructure
14:18:16 <mwhahaha> so it seems like it would be better to invest in improved logging/error correction in the introspection process
14:18:17 <bogdando> not tripleo
14:18:19 <weshay> it's a very unstable env atm
14:18:36 <dprince> is this something that a retry would resolve? Like ironic.conf could be tuned for this a bit perhaps?
14:18:37 <weshay> ya.. we have a patch to tcpdump during introspection
14:18:37 <mwhahaha> rather than coming up with a complex ci job to just skip it
14:18:42 <bogdando> I do not mean control/data plane network ofc
14:18:56 <bogdando> but provisioning and hardware layers
14:19:29 <dtantsur> bogdando: OVB is an openstack cloud. if we cannot make it reliable enough for us.. we're in trouble
14:19:33 <bogdando> if integration testing may be done w/o real networking involved, only virtual SDNs, maybe we can accept that? dunno
14:19:49 <derekh> but the problems in CI (at least the ones I've looked at) were where ipmi to the bmc wasn't working over a longer term, retries ain't going to help
14:19:49 <bogdando> just not sure if that would be moving the problems to a different part
14:20:30 <weshay> dtantsur, agree.. there is work being done to address it. apevec has some details there however right now I agree it's unstable and painful
14:20:30 <dtantsur> what's the problem with IPMI? the BMC is unstable? using UDP?
14:20:35 <derekh> Can you make periodic jobs leave envs around on failure so they can be debugged ?
14:20:37 <bogdando> dtantsur: it is not related to the testing patches
14:20:38 <weshay> and it's causing a lot of false negatives
14:20:46 <dtantsur> if the former, we can fix it. if the latter, we can switch to redfish (http based)
14:20:53 <bogdando> only produces irrelevant noise and tons of grey failures
14:20:58 <weshay> derekh, you can recreate those w/ the reproducer scripts
14:21:05 <bogdando> weshay: ++
14:21:06 <weshay> in logs/reproduce-foo.sh
14:21:12 <bogdando> for false negatives
14:21:15 <dprince> derekh: I'd like to know more on this. If it's a long term connectivity issue.. Then that is our problem right?
14:21:24 <derekh> weshay: yes, but we were not able to on friday
14:21:34 <bogdando> basically, 99% of errors one can observe in those integration builds had nothing to do with the subject of testing
14:22:09 <bogdando> (taking numbers out of thin air)
14:22:16 <openstackgerrit> John Fulton proposed openstack/tripleo-common master: Remove ceph-ansible Mistral workflow https://review.openstack.org/604783
14:22:29 <dtantsur> are we sure IPMI is always to blame in these 99% of made up cases? :)
14:22:32 <weshay> derekh, k k.. not sure what you were hitting, it's hit or miss w/ the cloud atm, but also happy to help. derekh, probably getting w/ panda locally will help too
14:22:33 <derekh> It looks to me like either 1. ipmi traffic from the undercloud to the bmc node was dispupted or 2. traffic from the BMC to the rdo-cloud api was distrupted (perhaps because of DNS outages)
14:22:49 <weshay> ya
14:23:02 <derekh> *disrupted
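(A minimal sketch of the sort of long-running BMC reachability probe that would separate "the BMC stopped answering for a long stretch" from "ironic gave up after its retries" — the distinction derekh is drawing here. The BMC addresses and credentials are placeholders, not real OVB values.)

    # Poll each OVB BMC once a minute from the undercloud and log failures with
    # timestamps, so sustained outages stand out from one-off packet loss.
    BMCS="192.0.2.10 192.0.2.11"          # placeholder BMC addresses
    while true; do
        for bmc in $BMCS; do
            if ! timeout 30 ipmitool -I lanplus -H "$bmc" -U admin -P password \
                    power status >/dev/null 2>&1; then
                echo "$(date -u +%FT%TZ) $bmc: ipmi power status failed" >> ~/ipmi-probe.log
            fi
        done
        sleep 60
    done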
14:23:26 <bogdando> mwhahaha: so not sure for "we need to fix those"
14:23:29 <bogdando> :)
14:23:41 <weshay> the next topic in the agenda is related
14:23:47 <weshay> basically rdo-cloud status
14:24:00 <mwhahaha> k let's dive into the next topic then
14:24:00 <mwhahaha> RDO-Cloud and Third Party TripleO CI
14:24:06 <mwhahaha> weshay: is that yours
14:24:19 <weshay> ya.. just basically summarizing what most of you already know
14:24:33 <weshay> we have an unstable env for 3rd party atm.. and it's causing pain
14:24:53 <weshay> the rdo team has built a new ovs package based on osp and we're waiting on ops to install
14:25:03 <mwhahaha> should we remove it from check until we can resolve stability?
14:25:10 <weshay> nhicher, is also experimenting with other 3rd party clouds
14:25:13 <openstackgerrit> Derek Higgins proposed openstack/tripleo-heat-templates master: Add scenario 012 - overlcoud baremetal+ansible-ml2 https://review.openstack.org/579603
14:25:24 <dtantsur> mwhahaha: if it's not gating, you can ignore it even if it's voting
14:25:24 <mwhahaha> so we can actually devote efforts to troubleshoot rather than continue to add useless load and possibly hiding issues?
14:25:34 <mwhahaha> dtantsur: i mean not even run it on patches
14:25:35 <weshay> mwhahaha, I think that's on the table for sure.. however I can't say I'm very comfortable with that idea
14:25:37 <dtantsur> ah
14:25:56 <mwhahaha> if it's 25% what's the point of even running it other than wasting resources
14:26:00 <dtantsur> well, that means not touching anything related to ironic until it gets fixed
14:26:03 <dtantsur> (which may be fine)
14:26:12 <weshay> testing and loading the cloud?? ya .. it's a fair question
14:26:28 <weshay> mwhahaha, we still need rdo-cloud for master promotions and other branches
14:26:33 <weshay> so the cloud NEEDS to work
14:26:48 <weshay> I've had luck on the weekends
14:27:00 <mwhahaha> if it's fine on the weekends that points to load related issues
14:27:08 <weshay> fine is relative
14:27:10 <derekh> in the meantime merge the previously mentioned scenario012 and use it to verify ironic in overcloud instead of the undercloud
14:27:11 <weshay> better maybe
14:28:13 <mwhahaha> weshay: do we know what has changed in the cloud itself since the stability issues started?
14:28:45 <weshay> mwhahaha, personally I'd like to see the admins, tripleo folks, and ironic folks chat for a minute and see what we can do to improve the situation there
14:28:56 <huynq> Hi, What about the Undercloud OS upgrade?
14:29:03 <weshay> mwhahaha, I have 0 insight into any changes
14:29:23 <dtantsur> I'm ready to join such a chat, even though I'm not sure we can do much on the ironic side
14:29:35 <weshay> better lines of communication between tripleo, ops, and ironic, ci etc.. are needed..
14:29:35 <dtantsur> (well, retries for IPMI are certainly an option)
14:30:05 <weshay> dtantsur, agree however being defensive and getting the steps in ci that prove it's NOT ironic would be helpful I think
14:30:20 <weshay> it's too easy for folks to just think.. oh look.. introspection failed
14:30:22 <weshay> imho
14:30:39 <dtantsur> unstable IPMI can be hard to detect outside of ironic
14:30:40 <weshay> when really it's the networking or other infra issue
14:30:54 <dtantsur> as I said, if UDP is a problem, we have TCP-based protocols to consider
14:31:11 <weshay> ah.. interesting
14:31:14 <panda> maybe add a tcpdump log from a tcpdump in the background ?
14:31:23 <weshay> panda, ya.. there is a patch up to do that
14:31:27 <rlandy> we added that already
14:31:28 <weshay> rascasoft, already added that code :)
14:31:40 <bogdando> > as I said, if UDP is a problem, we have TCP-based protocols to consider
14:31:40 <bogdando> good idea to explore
14:31:49 <derekh> along with a tcpdump on the BMC node
14:31:51 <panda> I can never have an original idea :(
14:31:57 <weshay> lolz
14:32:07 <rlandy> would the IPMI retries go into the CI code or the ironic code?
14:32:20 <rascasoft> actually I used it to debug a previous ironic problem so it should fit also this issue
14:32:29 <rlandy> we had retries at one point
14:32:45 <dtantsur> we have retries in ironic. we can have moar of them for the CI.
14:33:02 <panda> rascasoft: pass me the link to that change in pvt please
14:33:21 <rascasoft> panda, it's merged into extras
14:33:34 <weshay> lol
14:33:49 <mwhahaha> dtantsur: are they on by default?
14:33:54 <mwhahaha> or is it tunable in a way?
14:33:58 <mwhahaha> (and are we doing that)
14:33:59 <rascasoft> panda, https://github.com/openstack/tripleo-quickstart-extras/commit/78c60d2102f5e22c3abb30bb0c8179d4c999829c
14:34:03 <dtantsur> mwhahaha: yes, yes, dunno
14:34:35 <weshay> I like the idea of not using udp if possible
14:34:41 <rlandy> dtantsur: what's involved with switching to TCP-based protocols - can we discuss after the meeting? we can try it
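(To mwhahaha's question about whether the ironic-side retries are tunable: a sketch only, using the two [ipmi] options dtantsur links a bit further down. The values are arbitrary examples, not recommendations, and on a TripleO undercloud these would normally be set through the deployment's own configuration mechanism rather than edited by hand.)

    # Stretch how long ironic keeps retrying a failed IPMI command, and the
    # minimum wait between ipmitool invocations (both in seconds).
    sudo crudini --set /etc/ironic/ironic.conf ipmi command_retry_timeout 120
    sudo crudini --set /etc/ironic/ironic.conf ipmi min_command_interval 5
    # then restart the ironic conductor service/container (name varies by release)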
14:35:00 <weshay> and maybe ironic erroring out with nodes unreachable, please check network
14:35:01 <bogdando> but https://github.com/openstack/tripleo-quickstart-extras/commit/78c60d2102f5e22c3abb30bb0c8179d4c999829c doesn't show the vbmc VM side
14:35:03 <dtantsur> it's a different protocol, so the BMC part will have to be changed
14:35:15 <dtantsur> hint: etingof may be quite happy to work on anything redfish-related :)
14:35:16 <bogdando> and we need to place dstate there also IMO
14:35:22 <bogdando> dstat
14:35:23 <mwhahaha> #action rlandy, weshay, dtantsur to look into switching to TCP for IPMI and possibly tuning retries
14:35:26 <mwhahaha> tag, you're it
14:35:59 <derekh> btw, yesterday I plotted these ipmi outages on an undercloud I'm running on rdo-cloud, they lasted hours, retries and tcp aren't going to help that https://plot.ly/create/?fid=higginsd:2#/
14:36:05 <weshay> can we just blame evilien?
14:36:19 <weshay> derekh++
14:36:47 <dtantsur> yeah, hours is too much
14:36:47 <derekh> in the case of the env about DNS wasn't working (using 1.1.1.1, so the BMC couldn't talk to the rdo-cloud API)
14:36:49 <EmilienM> :-o
14:37:24 <derekh> /about/above/
14:37:27 <weshay> dtantsur, derekh so would you guys be open to the idea of an ovb no-change job that runs say every 4 hours or so and does some kind of health report on our 3rd party env?
14:37:28 <dtantsur> for the record: these two options together control ipmi timeouts: https://docs.openstack.org/ironic/latest/configuration/config.html#ipmi
14:37:36 <weshay> until we have this solved?
14:37:59 <dtantsur> weshay: you mean, instead of voting OVB jobs?
14:38:23 <weshay> I think the two are independent proposals, both I think should be considered
14:38:42 * dtantsur welcomes etingof
14:38:51 <derekh> weshay: if it could be left up afterwards it would be great, I'd be happy to jump on and try and debug it
14:38:52 <weshay> I think ironic takes the brunt of at least the current issues in our 3rd party cloud
14:39:05 <weshay> rock on..
14:39:06 <dtantsur> etingof: tl;dr a lot of ipmi outages in the ovb CI, thinking of various ways around it
14:39:37 <weshay> really happy to see you guys in the mix and proactive here.. this has been very painful for the project.. so thank you!!!!!
14:39:38 <etingof> that reminds me of my ipmi patches
14:39:45 <dtantsur> etingof: yep, this is also relevant
14:39:49 <bogdando> https://imgflip.com/i/2k8a00
14:39:49 <derekh> weshay: actually if you want I could just do that on my own rdo account
14:39:57 <dtantsur> we may want to increase retries here, so we'd better sort out our side
14:40:13 <weshay> I'm not sure if I see the same networking issues in personal tenants as I see in the openstack-infra tenant
14:40:26 <weshay> not sure what others have noticed
14:40:28 <derekh> weshay: ok
14:40:31 <dtantsur> etingof: but also thinking if switching ovb to redfish will make things a bit better
14:40:40 <etingof> so I have this killer patch that limits ipmitool run time -- https://review.openstack.org/#/c/610007/
14:41:14 <panda> getting tcpdumps from the BMC is a bit more difficult
14:41:18 <etingof> that patch might hopefully improve the situation when we have many nodes down
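(A rough manual version of the "tcpdump in the background" idea panda and derekh raise above — the extras patch rascasoft linked already does the undercloud side in CI. IPMI/RMCP traffic uses UDP port 623; the interface name and file path here are placeholders.)

    # Capture IPMI traffic on the undercloud while introspection runs; comparing
    # this with a capture taken on the OVB BMC node (harder to arrange, as panda
    # notes) shows whether requests ever left, and where replies got lost.
    sudo tcpdump -i eth0 -nn -w /var/log/ipmi-undercloud.pcap udp port 623 &
    TCPDUMP_PID=$!
    # ... run introspection / the failing job here ...
    sudo kill "$TCPDUMP_PID"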
14:41:23 <weshay> https://docs.openstack.org/ironic/pike/admin/drivers/redfish.html ?
14:41:36 <dtantsur> weshay: yep
14:42:21 <etingof> we would probably have to switch from virtualbmc to the sushy-tools emulator
14:43:20 <mwhahaha> the redfish stuff seems like a longer term investment
14:43:29 <dtantsur> etingof: ovb does not use vbmc, but its own bmc implementation
14:43:42 <dtantsur> well, but you have a point: sushy-tools has a redfish-to-openstack bridge already
14:44:19 <etingof> dtantsur, ah, right! because it manages OS instances rather than libvirt?
14:44:24 <dtantsur> yep
14:44:37 <etingof> luckily, sushy-tools can do both right out of the box \o/
14:45:03 <dtantsur> it's a big plus indeed, it means we can remove the protocol code from ovb itself
14:45:25 <weshay> help me understand how sushy-tools and redfish would help introspection / provisioning when networking is very unstable
14:45:35 <weshay> maybe offline, but that is not clear to me yet
14:45:47 <dtantsur> weshay: simply because of using tcp instead of udp for power actions
14:45:54 <weshay> ah k
14:45:58 <dtantsur> I'm not sure if it's going to be a big contribution or not
14:46:01 <dtantsur> but worth trying IMO
14:46:09 <dtantsur> esp. since the world is (slowly) moving towards redfish anyway
14:46:10 <weshay> agree, should be on the table as well
14:46:13 <weshay> nhicher, ^^^
14:46:41 <derekh> sounds like a lot of work to take on before figuring out the problem
14:46:59 <dtantsur> well, the proper fix is to make networking stable
14:47:01 <derekh> although possibly work we'll do eventually anyways
14:47:08 <weshay> we should wait for the admins to upgrade ovs imho
14:47:13 <weshay> run for a day or two w/ that
14:47:22 <mwhahaha> any ETA on the OVS upgrade?
14:47:25 <etingof> will we have to maintain ipmi-based deployments for a long time (forever)?
14:47:34 <weshay> asking admins to join
14:47:46 <etingof> because people in the trenches still use ipmi
14:47:52 <mwhahaha> yea we'll have to support it
14:48:12 <weshay> amoralej, do you know?
14:48:16 <mwhahaha> but we don't necessarily need to rely on it 100% upstream
14:48:22 <dtantsur> this ^^^
14:48:53 <Tengu> well it needs to be tested anyway, if it's supported.
14:48:57 <mwhahaha> anyway so it looks like there are some possible improvements to OVB that could be had, who wants to take a look at that?
14:49:05 <weshay> <kforde> weshay, I need to do it in staging ... that will be tomorrow
14:49:06 <amoralej> weshay, what's the question?
14:49:12 <mwhahaha> Tengu: yea but we could limit it to a single feature set rather than all OVB jobs
14:49:16 <amoralej> ovs update?
14:49:18 <weshay> amoralej, when we're going to get the new ovs package in rdo-cloud
14:49:25 <Tengu> mwhahaha: yup.
14:49:32 <mwhahaha> Tengu: and we can just do a simple 1+1 deploy rather than full HA
14:49:44 <Tengu> +1
14:49:52 <mwhahaha> Tengu: we technically support the redfish stuff (i think) but we don't have any upstream coverage in tripleo
14:49:57 <amoralej> weshay, no idea about the plan to deploy it in rdo-cloud
14:49:59 <mwhahaha> anyway we need to move on
14:50:10 <amoralej> alan did a build that hopefully will help
14:50:20 <mwhahaha> who wants to take on the sushy-tools ovb review bits? dtantsur or etingof?
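(Purely illustrative, not the actual OVB integration being discussed: what the sushy-tools route sketched by etingof and dtantsur might look like. The emulator can drive OpenStack instances the way OVB's BMC does today, and ironic then speaks Redfish over HTTP/TCP instead of IPMI over UDP. The flags, cloud name, and node details below are assumptions for the sketch, and the redfish hardware type would also have to be enabled in ironic.conf.)

    # Run the Redfish emulator against a cloud defined in clouds.yaml
    # ("rdo-cloud" is a placeholder entry), then enrol a node over Redfish.
    pip install --user sushy-tools
    sushy-emulator --os-cloud rdo-cloud --port 8000 &
    openstack baremetal node create --driver redfish \
        --driver-info redfish_address=http://127.0.0.1:8000 \
        --driver-info redfish_system_id=/redfish/v1/Systems/<instance-uuid> \
        --driver-info redfish_username=admin \
        --driver-info redfish_password=password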
14:50:45 <weshay> amoralej, k.. we need to expose these details more to the folks consuming 3rd party imho
14:50:48 * dtantsur pushes etingof forward :D
14:51:00 <mwhahaha> #action etingof to take a look at OVB+sushy-tools
14:51:00 <mwhahaha> done
14:51:03 <mwhahaha> moving on :D
14:51:04 <bogdando> dtantsur: network is unreliable by definition :)
14:51:19 <dtantsur> bogdando: yeah, there are grades of unreliability :)
14:51:27 <bogdando> especially thinking of edge cases in the future, we barely can/should "fix" it
14:51:41 <mwhahaha> in all seriousness we do need to move forward, please feel free to continue this discussion after the meeting or on the ML
14:51:51 <dtantsur> k k
14:52:01 <mwhahaha> (rfolco) CI Community Meeting starts immediately upon this meeting closing in #oooq. This is our team's weekly "open office hours." All are welcome! Ask/discuss anything, we don't bite. Agenda (add items freely) --> https://etherpad.openstack.org/p/tripleo-ci-squad-meeting ~ L49.
14:52:40 <mwhahaha> that's it on the meeting items
14:52:50 <mwhahaha> moving on to the status portion of our agenda
14:52:55 <mwhahaha> #topic Squad status
14:52:55 <mwhahaha> ci
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-ci-squad-meeting
14:52:55 <mwhahaha> upgrade
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-upgrade-squad-status
14:52:55 <mwhahaha> containers
14:52:55 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-containers-squad-status
14:52:56 <mwhahaha> edge
14:52:56 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-edge-squad-status
14:52:57 <mwhahaha> integration
14:52:57 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-integration-squad-status
14:52:58 <mwhahaha> ui/cli
14:52:58 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-ui-cli-squad-status
14:52:59 <mwhahaha> validations
14:52:59 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-validations-squad-status
14:53:00 <mwhahaha> networking
14:53:01 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-networking-squad-status
14:53:01 <mwhahaha> workflows
14:53:01 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-workflows-squad-status
14:53:02 <mwhahaha> security
14:53:02 <mwhahaha> #link https://etherpad.openstack.org/p/tripleo-security-squad
14:53:16 <mwhahaha> any particular highlights that anyone would like to raise?
14:54:50 <mwhahaha> sounds like no
14:55:13 <mwhahaha> #topic bugs & blueprints
14:55:13 <mwhahaha> #link https://launchpad.net/tripleo/+milestone/stein-1
14:55:13 <mwhahaha> For Stein we currently have 28 blueprints and about 743 open Launchpad bugs. 739 stein-1, 4 stein-2. 102 open Storyboard bugs.
14:55:13 <mwhahaha> #link https://storyboard.openstack.org/#!/project_group/76
14:55:47 <weshay> rfolco, post the link for the bluejeans ci here please
14:55:58 <mwhahaha> please take a look at the open blueprints as there are a few without approvals and such
14:56:07 <weshay> oops sorry
14:56:11 <rfolco> https://bluejeans.com/5878458097 --> ci community mtg bj
14:56:21 <mwhahaha> any specific bugs, other than the rdo cloud ones, that people want to point out?
14:57:18 <mwhahaha> sounds like no on that as well
14:57:19 <mwhahaha> #topic projects releases or stable backports
14:57:25 <mwhahaha> EmilienM: stable releases?
14:57:56 <EmilienM> mwhahaha: I do them once a month now
14:58:01 <EmilienM> no updates this week
14:58:04 <mwhahaha> k
14:58:11 <mwhahaha> #topic specs
14:58:11 <mwhahaha> #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:58:38 <mwhahaha> please take some time to review the open specs, looks like there are only a few for stein
14:58:45 <mwhahaha> #topic open discussion
14:58:47 <mwhahaha> anything else?
14:59:42 <huynq> mwhahaha: What about the Undercloud/Overcloud OS upgrade?
14:59:52 <mwhahaha> huynq: upgrade to what?
15:00:03 <openstackgerrit> Marius Cornea proposed openstack/tripleo-upgrade master: Add build option to plugin.spec https://review.openstack.org/611005
15:00:19 <slagle> thrash: therve : is it an ok idea to kick off a workflow from within an action?
15:00:33 <thrash> slagle: Uhhh... I would say no
15:00:35 <huynq> mwhahaha: e.g. from CentOS7 to CentOS8
15:01:16 <mwhahaha> huynq: some folks are still looking into that (cc: chem, holser_) but i'm not sure we have a solid plan at the moment due to lack of CentOS8
15:01:25 <mwhahaha> alright we're out of time
15:01:28 <mwhahaha> #endmeeting