14:00:05 #startmeeting tripleo
14:00:06 Meeting started Tue Feb 2 14:00:05 2021 UTC and is due to finish in 60 minutes. The chair is marios. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:09 The meeting name has been set to 'tripleo'
14:00:14 #topic agenda
14:00:15 * Review last minutes & action items
14:00:15 * One off agenda items
14:00:15 * Bugs & Blueprints
14:00:15 * Projects releases or stable backports
14:00:17 * Specs
14:00:19 * open discussion
14:00:22 Anyone can use the #link, #action and #info commands, not just the moderatorǃ
14:00:24 Hello folks, who is around today? o/
14:00:27 o/
14:00:34 o/
14:00:36 o/
14:00:37 0/
14:00:49 0/
14:00:59 \o
14:01:07 o/
14:01:10 o/
14:01:12 hi
14:01:20 hello all... k let's get going and folks will catch up as they can
14:01:29 #topic review last meeting logs & action items
14:01:40 #link http://eavesdrop.openstack.org/meetings/tripleo/2021/tripleo.2021-01-19-14.00.html
14:01:51 anything someone wants to bring up about last meeting?
14:01:54 * marios checks minutes
14:02:04 o/
14:02:41 k ... moving on to one-off items
14:02:48 #topic one off agenda items
14:02:53 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:02:55 o/
14:02:57 o/
14:03:13 \o
14:03:16 if you don't mind I'd like to start with the community items this week
14:03:22 i think there will be some discussion around the first one
14:03:40 #info "tripleo ci usage" - came up at the TC and they have reached out to us
14:03:55 #link http://eavesdrop.openstack.org/meetings/tc/2021/tc.2021-01-21-15.00.log.html#l-48
14:04:00 o/
14:04:03 o/
14:04:04 #link http://paste.openstack.org/show/jD6kAP9tHk7PZr2nhv8h jobs breakdown
14:04:08 o/
14:04:28 so the logs above are from a recent (last week) tc meeting where our (high) job usage came up
14:04:42 the paste is from a compilation/breakdown of the jobs
14:04:59 perhaps gate jobs must not be same as check?
why do they always so? if check can't be passed, nothing normally gets merged, so repeating same at the gate seems pointless
14:05:05 o/
14:05:12 let's leave only a few standalone jobs for the gate
14:05:18 the tc reached out via mnaser (email to me) about this. I pointed to some of the things we're doing around reducing content providers and removing upgrade jobs upstream and waiting to hear back
14:05:35 * weshay|ruck notes the historical context
14:05:39 and let periodics catch merge-related collisions, if any
14:05:44 thanks bogdando ... so yeah this is to ask for ideas around how else we can improve
14:05:48 at one point tripleo used 55+% of upstream resources
14:05:52 (and with my vexxhost hat on, there's some seemingly slow systems on our donated ci resources, so working on that)
14:06:09 ramishra: brought us some more ideas just now in the community call around reducing some of the unnecessary scenario coverage eg. against changes in tripleoclient
14:06:11 bogdando, merging collisions will break check jobs
14:06:12 I think the problem now is not so much the time of a job as the recheck numbers
14:06:18 o/ mnaser
14:06:21 and we've brought that down to 30.8%, perhaps more is needed but.. I think the tripleo group has responded well to infra concerns in the past
14:06:32 sshnaidm|ruck: that's even better then
14:06:52 bogdando, having merging conflict in repo? :)
14:07:21 with my tripleo-ci hat on, i can say that we will likely have at least one task on improvements/optimizations to jobs this coming sprint which should help (files: matching, tempest jobs optimization ... ? other ideas)
14:07:30 the above paste is for the period from 2020-12-20 06:25:32,187 to 2021-01-19 15:53:34,923, can we collect metrics post cleanup of upgrade/providers jobs, and also before introduction of provider/consumer jobs so we have more data
14:07:40 ykarel: right good point
14:07:42 amolkahat proposed openstack/tripleo-ci master: Log directory workspace/logs do not exists.
https://review.opendev.org/c/openstack/tripleo-ci/+/772372
14:07:55 ykarel, yes, and after we optimize standalones
14:08:06 on how much we improved post cleanup and how much provider/consumer helped in reducing resources
14:08:11 yes.. moving off of docker.io to use content-providers did increase our util... but we're bringing that back down as well
14:08:19 marios, worth mentioning that it was a transient period
14:08:28 sshnaidm|ruck: not in the repo, no, just if something that passed the check gets suddenly broken while gating
14:08:33 for data.. one can refer to
14:08:34 http://dashboard-ci.tripleo.org/d/Z4vLSmOGk/cockpit?viewPanel=71&orgId=1&from=now-6M&to=now
14:08:38 and loses coverage
14:08:47 to see zuul enqueued time over the last 6 months
14:08:53 and still gets merged... so periodics to catch that up
14:08:54 sshnaidm|ruck: yes right ... we were in a rush to implement it on a deadline before docker changes and we missed some efficiency improvements but they are coming/have come now
14:09:21 as weshay|ruck pointed out above too 16:08 <@weshay|ruck> yes.. moving off of docker.io to use content-providers did increase our util... but we're bringing that back down as well
14:09:30 mnaser: is there any data around on what we need to reduce to, either in terms of node time or %'age?
14:09:42 Ronelle Landy proposed openstack/tripleo-specs master: WIP: Add spec for single sourcing repo setup https://review.opendev.org/c/openstack/tripleo-specs/+/772442
14:09:49 mnaser, yes..
we would love to set targets
14:09:57 not to rehash, but when this has come up in the past, it's always "tripleo uses too much"
14:09:58 slagle: mnaser: right, i was waiting to hear back about any specific actions that are required from us now, i didn't see something in the followup tc meeting either
14:10:06 then we want to know what our targets are
14:10:07 slagle: +1
14:10:08 * weshay|ruck agress w/ slagle
14:10:14 agrees even
14:10:14 i think that's a fair ask
14:10:16 we're going to keep adding stuff
14:10:28 Merged openstack/diskimage-builder master: Fix CentOS Stream 8 base repo in centos element https://review.opendev.org/c/openstack/diskimage-builder/+/771979
14:10:29 unless we set some limits for ourselves
14:11:01 slagle: well we kinda *are* in the sense that we aren't just mindlessly adding things, we do have recurring cycles of 'lets improve our jobs' in the ci team
14:11:06 slagle: are setting limits i mean
14:11:07 also would be good if those metrics can be public so we can always see for ourselves how much it is getting improved/degraded with ci improvement efforts like at https://grafana.opendev.org/d/9XCNuphGk/zuul-status?orgId=1
14:11:47 mnaser, marios: yes, i know, but it seems like we don't have the insight into resource usage and quotas, etc.
14:11:49 do we have numbers about how many jobs are failing and how many of them are just unreliable requiring to recheck?
14:12:08 slagle: i think that's the hard question the tc needs to come up with -- what is the tripleo target they should aim for
14:12:12 we're actually quite good about managing our 3rd party rdo jobs, b/c we have set our quotas, and we limit our jobs to respect that
14:12:14 amoralej: we could get some stats around that from grafana
14:12:27 amoralej: see job exploration panel you can search by job and it gives breakdown
14:12:37 would be nice to see how many rechecks are put in
14:12:37 amoralej, it'd be possible to see if people were adding bug number or reason to recheck, but it's not so popular
14:12:56 especially if check has passed, then rechecking the failed gate, and here we start all over again
14:12:59 mnaser, and /me notes.. the data re: our usage is usually pastebin'd to us from time to time
14:13:14 is there on-going data we can refer to?
14:13:19 k i am going to cap this in a couple mins at 10 mins as we have other items, we can revisit in open discussion and also on mailing list after we hear back from tc
14:13:31 mnaser: marios: i'm happy to work point on this part of the conversation, if I can help at all from the tripleo side
14:13:39 * sshnaidm|ruck is thinking of automation and visualization of rechecks in grafana
14:13:42 in fact zbr and I checked in w/ infra 3-4 weeks ago..
and they seemed ok w/ our consumption at that time
14:13:45 slagle: ok, great, thank you, i'll take this and bring it back to the tc meeting tmrw
14:13:48 but especially mnaser we need to hear about what the ask is from the tc at this point besides 'please reduce jobs' i mean we need something a bit more specific/target to work towards
14:13:51 mnaser: please consider making recheck'ed jobs always be the last in the queue
14:14:04 so new jobs will arrive before any queued rechecks
14:14:21 mnaser: and as brought up here, can we recheck those stats after our most recent improvements, they must be a little lower now
14:14:57 k anything else before we move on for now at least
14:14:59 yeah so i think actionable things are: make some sort of transparent way of getting access to stats to see improvements or not, get a 'quota' to see where they can aim for
14:15:15 and a suggestion of 'recheck' going to end of queue
14:15:29 mnaser: sounds good thank you
14:15:53 k just a couple of announcement type community points before i hand it off to others ...
14:15:53 maybe worth not allowing recheck if there is no reason mentioned in it?
14:15:59 #info openstack-tempest-skiplist - release-management: none https://review.opendev.org/c/openstack/governance/+/771488 http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019846.html
14:16:10 like must set something after "recheck"
14:16:23 that came up from the release team... we don't need to make releases there so we made it so
14:16:36 #info os-collect-config, os-refresh-config, os-apply-config moving to independent release model https://review.opendev.org/c/openstack/releases/+/772570 http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019778.html
14:16:54 Ronelle Landy proposed openstack/tripleo-specs master: WIP: Add spec for single sourcing repo setup https://review.opendev.org/c/openstack/tripleo-specs/+/772442
14:17:08 os-collect-config and friends... moving to no longer create branches...
we aren't using it, looks like no-one else is either (heat may be, but they didn't seem interested to own it at this point)
14:17:10 sshnaidm|ruck: being always the last on queue would be much less restrictive
14:17:18 #info wallaby-2 release https://review.opendev.org/c/openstack/releases/+/771830 http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019839.html
14:17:25 #info stable/ussuri release https://review.opendev.org/c/openstack/releases/+/772047 (or request from ykarel)
14:17:31 yeah we need to force some discipline for blind rechecks.. though I'm not sure pushing all rechecks to end of queue is a good idea
14:17:32 #info created wallaby-2 release https://launchpad.net/tripleo/wallaby & move_bugs.py tripleo wallaby2 wallaby-3 https://gist.github.com/marios/b3155fe3b1318cc26bfa4bc15c764a26#gistcomment-3612676
14:17:40 and only projects with flaky jobs will suffer, not all of them waiting in the gate
14:17:41 bogdando, it still prevents people from blind rechecks and motivates them to see what is the problem
14:17:48 k any comments or questions on any of those community projects?
14:18:02 sshnaidm|ruck: bogdando: ramishra: let's revisit in open discussion in a bit otherwise it will be hard to stick to the agenda in time
14:18:06 k
14:18:09 thanks folks sorry
14:18:19 k lets move on... taking it from the top of the etherpad
14:18:27 jpodivin: please go ahead?
14:18:39 jpodivin: is that one yours validations-libs and validations-common documentation dependencies
14:18:41 hi
14:18:50 yep that is me sorry for the delay.
14:19:07 Thing came up by accident and turned rather complex.
14:19:20 Is everyone familiar with openstack/requirements?
14:19:41 jpodivin: so this is missing openstack/requirements from some of our repos?
14:20:22 essentially yes,
14:20:34 Jose Luis Franco proposed openstack/tripleo-ansible master: Reshape tripleo-transfer to transfer files between remote hosts.
https://review.opendev.org/c/openstack/tripleo-ansible/+/771657
14:20:53 but the real issue is that the projects have the checks disabled.
14:21:30 jpodivin: openstack-tox-requirements job you mean
14:21:42 yes.
14:21:54 Turns out it's a bit more widespread.
14:22:09 weshay|ruck: yeah, no pressure from infra regarding us at this moment, but we should not take this as granted forever, it can easily change.
14:22:17 jpodivin: so is that something you can/interested in working on ? i think we should have that job running in all the repos afaik
14:22:30 zbr, yes.. agree
14:22:48 certainly, the issue is, that there is some confusion about necessity of it.
14:22:55 anyone else have comment about missing openstack-tox-requirements job on tripleo repos or have context about why it was removed?
14:22:58 Merged openstack/tripleo-heat-templates stable/ussuri: Remove External{Internal,Public,Admin}Url parameters https://review.opendev.org/c/openstack/tripleo-heat-templates/+/773418
14:23:07 with the newly upgraded pip, some of the dependency issues are seemingly resolved.
14:23:07 Merged openstack/tripleo-quickstart-extras master: Refresh bindep config https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/772879
14:23:39 in VF a question came up, if we shouldn't just break from openstack/requirements.
14:23:45 in fact that is one issue why I find the low-resource testing (lint/unit/molecule) as a good investment for us: it helps us save resources (and time) by finding bugs quickly and lowering the roundtrips.
14:23:50 jpodivin: what is VF
14:24:04 validation framework probably
14:24:20 zbr +1
14:24:26 yes that is what I meant.
14:24:34 sorry
14:24:40 should have led with that.
14:25:22 so jpodivin are you blocked on that or would you like to try and post reviews so we can see any specific objections to adding those?
14:26:02 by see i mean i am hoping folks will comment on the reviews
14:26:11 Giulio Fidente proposed openstack/tripleo-heat-templates master: Enforces minimum Ceph client version to Mimic https://review.opendev.org/c/openstack/tripleo-heat-templates/+/773539
14:26:39 k going to move on in a second if there is nothing further on that for now...
14:27:04 I think that's it for now.
14:27:09 thanks jpodivin
14:27:18 ok slagle would you like to go ahead?
14:27:29 i think you're one is next on https://etherpad.opendev.org/p/tripleo-meeting-items
14:27:37 your you're ;)
14:27:45 sure, topic is ephemeral-heat
14:27:53 first, any objections to merging the spec?
14:28:05 #link https://review.opendev.org/c/openstack/tripleo-specs/+/765000
14:28:23 the initial patches are already posted
14:28:26 #link https://review.opendev.org/q/topic:%22ephemeral-heat%22+(status:open%20OR%20status:merged)
14:29:00 +2 merge it
14:29:05 if anyone wants to test locally, i have a script
14:29:08 #link https://gist.github.com/slagle/7fc36854947ec2ac2e309f6981b99b65
14:29:09 * cloudnull is happy to +w
14:29:47 nice thanks for the update slagle looks like it is starting to shape up and we might even have something by wallaby (just cos the spec was a bit late)
14:29:49 you don't need an undercloud if you use pre-deployed nodes. which is really nice, you can run it from a laptop. i'm testing it with a bare centos vm
14:30:10 slagle: well let's just merge the spec now
14:30:11 cloudnull: thanks.
let's merge it
14:30:14 i think we've had enough time
14:30:16 going to
14:30:29 marios: that's all from me
14:30:32 slagle done
14:30:33 done
14:30:34 :D
14:30:46 double done
14:30:56 ;)
14:31:19 next is storage fultonj fmount gfidente o/ folks please go ahead
14:31:22 Sandeep Yadav proposed openstack/tripleo-ci master: Remove tripleo_* file pattern for Standalone scen https://review.opendev.org/c/openstack/tripleo-ci/+/773692
14:31:39 #info Need six reviews to get THT standalone scenario001 using cephadm in place of ceph-ansible and default to Ceph octopus
14:31:51 list of reviews on line 26 of
14:31:54 https://etherpad.opendev.org/p/tripleo-meeting-items
14:32:03 fultonj: https://review.opendev.org/q/topic:%22cephadm_integration%22+(status:open%20OR%20status:merged) ?
14:32:09 topic branch might be better if that is correct
14:32:26 same topic yes, but order not same
14:32:31 fultonj: k
14:32:53 changing default ceph container from nautilus to octopus
14:33:14 hard coding jobs to nautilus
14:33:18 then moving jobs one by one
14:33:40 001-standalone is working without ceph-ansible and with octopus
14:33:44 it will move first
14:33:50 hopefully this week
14:33:53 then 004
14:34:01 Merged openstack/tripleo-heat-templates stable/ussuri: Remove ffwd lifecycle environment files.
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/772414
14:34:03 #link https://blueprints.launchpad.net/tripleo/+spec/tripleo-ceph
14:34:11 has been updated status to "good progress"
14:34:20 fultonj: nice
14:34:20 #info We expect W to not require ceph-ansible at GA, but ceph is still deployed DURING overcloud if network isolation desired
14:34:42 #info However, we plan to add CLI tools to deploy ceph before overcloud for experimental use in W (just no network isol before overcloud yet)
14:34:50 #info ceph-ansible deployment of Nautilus will work but be deprecated in W (removed in X)
14:35:03 #info cephadm deployment of Pacific will be recommended Ceph default in W
14:35:07 #info Pacific to GA in March during Wallaby M3 (currently using Octopus)
14:35:19 we have a draft of the release note
14:35:20 #link https://review.opendev.org/c/openstack/tripleo-heat-templates/+/767294/45/releasenotes/notes/cephadm-28185ca8ac814567.yaml
14:35:24 still working on it
14:35:35 so it's M2, and I think we're on schedule for the ceph switch
14:36:42 that's all from the ceph side. any questions?
14:37:17 o/ i got dropped ... is this thing on
14:37:29 ack
14:37:32 thanks cloudnull
14:37:45 sorry if i missed something fultonj is there something more on the storage update
14:37:57 thanks fultonj great summary
14:38:02 marios: nothing else, thanks
14:38:04 marios: I think it's all on ceph side
14:38:09 thanks as always for a great update
14:38:14 fultonj:++
14:38:24 moving on ...
14:38:35 from ci i added a point earlier about some ongoing work, just early socialisation
14:38:41 #info (early work/ongoing) Spec for single sourcing repo setup using tripleo-repos https://review.opendev.org/c/openstack/tripleo-specs/+/772442
14:38:57 we want to streamline the way we are laying down repos ... if you are interested please comment on the spec ^^^
14:39:16 we may not merge it for W realistically delivered for X
14:39:31 any other one-off items?
14:39:49 Merged openstack/tripleo-specs master: Ephemeral Heat for the Overcloud spec https://review.opendev.org/c/openstack/tripleo-specs/+/765000
14:39:53 then let's move on and get to open discussion quickly as possible...
14:40:01 #topic Bugs and blueprints
14:40:01 #link https://bugs.launchpad.net/tripleo/
14:40:01 #link https://storyboard.openstack.org/#!/project/openstack/tripleo-ansible
14:40:01 #link https://launchpad.net/tripleo/+milestone/wallaby-3
14:40:02 #link https://launchpad.net/tripleo/wallaby
14:40:34 i closed out the wallaby-2 series in launchpad and moved the bugs to wallaby-3 https://gist.github.com/marios/b3155fe3b1318cc26bfa4bc15c764a26#gistcomment-3612676
14:40:42 Ronelle Landy proposed openstack/tripleo-specs master: WIP: Add spec for single sourcing repo setup https://review.opendev.org/c/openstack/tripleo-specs/+/772442
14:40:49 any bugs someone wants to highlight?
14:40:55 sshnaidm|ruck: do we have any gate blockers right now or akahat|rover
14:41:08 marios, gate blockers - no
14:41:13 only promotion blockers
14:41:15 sshnaidm|ruck: thank you
14:41:23 sshnaidm|ruck: anything you want to highlight on bugs?
14:41:45 nope, akahat|rover ? ^
14:41:54 marios, no
14:41:59 akahat|rover: sshnaidm|ruck: ack thanks folks just checking
14:42:09 since its the right agenda item ;) ... moving on
14:42:18 #topic Project releases or stable backports
14:42:19 #info tripleo wallaby repos https://releases.openstack.org/teams/tripleo.html#wallaby
14:42:30 as noted in the etherpad i made a wallaby-2 release and a stable/ussuri.
14:43:18 any other release requests or other comments?
14:43:34 k moving on ...
14:43:35 #topic specs
14:43:36 #info https://review.opendev.org/q/project:openstack/tripleo-specs
14:43:37 #info https://opendev.org/openstack/tripleo-specs/src/branch/master/specs/wallaby
14:43:43 i think we got them all now no?
14:44:10 ah frrouter one can still merge
14:44:22 #link https://review.opendev.org/c/openstack/tripleo-specs/+/758249
14:44:41 any objection to merging that one?
14:44:44 it has come up here before
14:44:47 it has enough votes
14:45:03 doesn't look like it will be updated far as i can see ... anyone else know more about that one?
14:45:55 k i just workflowed it
14:46:02 we can update it if there is something further anyway ;)
14:46:16 #topic open discussion
14:46:18 Anything else that folks want to bring up to the meeting?
14:46:45 any more discussion on the jobs /resource usage ?
14:46:53 or any other topic
14:46:59 any other tripleo topic :)
14:47:19 Pranali Deore proposed openstack/puppet-tripleo master: Handle cinder_mount_point_base for cinder mounting needs https://review.opendev.org/c/openstack/puppet-tripleo/+/773699
14:47:24 going once
14:47:50 2x
14:47:54 swift removal work is coming along nicely - https://review.opendev.org/q/topic:%22env_merging%22+(status:open) - if folks have a moment to review it would be greatly appreciated :D
14:48:05 * cloudnull shameless plug -cc ramishra
14:48:17 thanks cloudnull
14:49:00 cloudnull: iiuc, it seems more like the approach now is completely remove plan and associated handling
14:49:16 *instead* of replace swift with native os calls?
14:49:35 seems like the server side env handling facilitated the shift
14:49:56 yeah, its a bit of a change from what I was originally trying to do, however, imo what rabi has put together is a lot better.
14:50:19 make sense. just wanted to make sure i was understanding the intent
14:50:29 cloudnull: is it worth updating the spec?
14:50:38 while the plan is going away - https://review.opendev.org/c/openstack/python-tripleoclient/+/773565 - that will archive all the files
14:50:45 so we still have access to rebuild it all
14:51:28 marios we could though the approach is more an evolution of intent.
so idk if its worth it
14:51:40 but happy to do it if you think it best
14:51:57 cloudnull: no just checking, i mean we still are removing swift ... just also removing the plan altogether
14:52:01 cloudnull: thanks
14:52:05 ++
14:52:57 k last call any other discussion please
14:53:09 * cloudnull promises to remain silent
14:53:14 you have the right
14:53:26 thank you everyone for participating today and bringing your topics. next meeting Tue 16 Feb
14:53:38 #endmeeting tripleo