15:01:11 #startmeeting openstack-helm
15:01:12 Meeting started Tue Apr 9 15:01:11 2019 UTC and is due to finish in 60 minutes. The chair is portdirect. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:16 The meeting name has been set to 'openstack_helm'
15:01:21 \o
15:01:23 o/
15:01:25 o/
15:01:25 o/
15:01:29 let's give it a couple of mins for people to arrive
15:01:41 \o o/
15:01:43 the agenda for today is here: https://etherpad.openstack.org/p/openstack-helm-meeting-2019-04-09
15:01:49 o/
15:01:51 please add to it :)
15:02:20 itxaka: your blue is... quite punchy :D
15:02:42 I know, I don't know what to change it to so it's easily discernible but doesn't make eyes bleed
15:03:13 =)
15:03:27 it reminds me of word processing in displaywrite4
15:04:27 souvenirs!
15:04:39 o/
15:05:09 Hi
15:05:18 ok - let's go
15:05:27 #topic Office hours
15:05:43 This is Prabhu from Juniper
15:05:54 so after last week's meeting I did a straw poll amongst cores
15:06:23 I thought you were sending this publicly on the ML?
15:06:25 and we would like to run office hours on Wednesday each week from 20:00-21:00 UTC
15:06:54 as this is the time that would allow us to ensure that we have at least a couple of cores present every week
15:07:18 are there no EU cores? seems a very late time for EU people to ask questions :)
15:07:25 itxaka: there are none
15:07:42 and that was the critical thing we needed to ensure to give us a chance of giving real value to these sessions
15:08:04 itxaka: at the moment only lost Europeans - though it would be great to change that
15:08:06 portdirect: that's correct, without cores, it's impossible to get mentored
15:08:40 Let's see if you have enough people to mentor. I am pretty sure jayahn will love this news :)
15:08:56 It's a good step forward, thanks portdirect for organising this!
15:09:20 itxaka: we can still have jhesketh as our representative for questions :)
15:09:25 :)
15:09:56 ok - I'll send out a note to the ML after this and get the wiki updated
15:10:00 I have been traveling, and someone mentioned my name. :)
15:10:26 I am in Las Vegas for the NAB show. Doing some exhibition. :)
15:10:26 I did -- please don't hesitate to read the meeting log -- there is a proposal for the office hours time.
15:10:44 I will surely read through
15:10:49 portdirect: thanks
15:11:10 portdirect: thanks as well. :)
15:11:45 np - I'm hoping that we can get some real flow out of these
15:11:51 so let's move on
15:11:55 #topic Status of Tempest chart
15:11:58 that's me!
15:12:04 floor's yours :)
15:12:24 So looking yesterday at the tempest chart and trying to deploy it, I found out that it has some weird and missing default values which make it impossible to deploy
15:12:55 i.e. wrong image (does not exist), default null values which are not really null, and so on
15:13:18 was wondering if we need more testing in there, as it's currently not possible to deploy tempest with the default values properly
15:13:38 btw, sent some patches already for those things, but it makes me wonder what else may be broken as well
15:14:03 I think this really speaks to the gaps we have in our testing atm
15:14:08 we used to trigger an experimental job with the following: https://github.com/openstack/openstack-helm/blob/master/tools/deployment/multinode/900-tempest.sh
15:14:11 which links with the next point, which is basically the same but for spiceproxy :)
15:14:16 albeit, it's a bit dated at this point
15:14:45 jayahn: I think you are currently using the tempest chart?
15:15:07 I'm not sure what a solution could be, other than getting more people to use the tempest chart and report things :)
15:15:32 itxaka: making this part of the jobs, and voting?
15:15:40 not voting
15:15:48 as currently it's not exercised at all
15:15:54 not immediately
15:15:59 I mean as a long-term goal
15:16:04 if it doesn't vote it will get ignored forever :D
15:16:21 also in my experience, running tempest as an experimental check previously would more often than not put a large amount of strain on the nodepool VMs
15:16:21 if it's maintained, I don't see a reason to not make it voting
15:16:36 it needs a promotional period like all the other jobs we introduce
15:16:50 srwilkers: I have noticed the impact when doing this in parallel, but running in serial is longer with less impact.
15:16:51 where we can vet it and determine whether it's reliable or not
15:16:53 I really think it would be great to make this a goal
15:17:12 as tempest should be the de facto way we validate the basic sanity of an OSH-deployed cloud
15:17:25 I don't disagree
15:17:25 srwilkers: totally -- it's not like we should drop a new job and say "hey, it's the new world order"
15:17:32 it's up to cores to +w anyway :p
15:17:42 well, if you make it sound that cool, maybe we will
15:17:44 portdirect: agreed
15:17:52 s/will/should/g
15:18:22 would anyone like to step up here?
15:18:28 tempest has multiple tests, we don't need to run everything. I think we could run only smoke tests
15:18:36 ++
15:18:49 +1 for some tests
15:18:51 portdirect: I guess the first action point would be to fix things, and discuss it with jayahn?
15:18:52 smoke
15:19:07 evrardjp: yes
15:19:09 tempest is set up to run smoke tests by default if I recall
15:19:20 it is, srwilkers
15:19:23 if we get jayahn to review (as the main user of said helm chart), we should be good to re-introduce a non-voting job
15:19:51 I can take that unless somebody really, really, *really* wants it :)
15:19:57 fix, test, add job
15:20:05 srwilkers: we should maybe whitelist tests until we're confident, then move to more complete scenarios
15:20:17 because there are lots of smoke tests in the default test list
15:20:21 evrardjp: I'm fine with that. However, I'd also like to see more individuals take interest in maintaining the jobs we currently have; we've got a small subset of people actually looking at our periodic jobs and attempting to fix what's broken
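
For context on the non-voting job mechanics being discussed, a minimal sketch of what a Zuul check job wrapping the tempest script linked above might look like follows. This is an illustrative sketch only: the job, parent, and playbook names are placeholders, not actual contents of the openstack-helm repo.

```yaml
# Illustrative sketch only; job, parent and playbook names are placeholders.
- job:
    name: openstack-helm-multinode-tempest
    parent: openstack-helm-chart-deploy   # assumed multinode deployment base job
    description: Deploy a multinode OSH cloud and run tempest smoke tests.
    run: playbooks/osh-tempest.yaml
    timeout: 7200
    voting: false   # non-voting during the promotional period

- project:
    check:
      jobs:
        - openstack-helm-multinode-tempest
```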
15:20:30 itxaka: I think from what jayahn just posted in openstack-helm, that would be great - and I'd add jaesang to reviews
15:20:37 srwilkers: that's a fair point.
15:21:01 but if those jobs are voting, people will have no choice but to fix things? :p
15:21:07 that's a lazy approach
15:21:19 srwilkers: you have been doing pretty much all the work here - and really deserve thanks for that
15:21:21 srwilkers, should we bring that up in the agenda to discuss? I'm interested in that
15:21:38 the lazy approach IMO would be to make the jobs non-voting or experimental when they start to fail...
15:21:47 but I agree it's always the problem when new jobs are introduced
15:21:54 it's not even new jobs evrardjp
15:22:18 we've had failing voting jobs before, and my previous statement applies: it's a small set of people working to triage and fix them as soon as possible to get the pipes moving again
15:22:43 which is why I'm super skeptical about just jamming in new voting jobs, until that changes
15:22:52 srwilkers: agreed
15:22:53 that's everyone's duty IMO.
15:23:05 you see something failing, you fix it
15:23:21 I think we should probably try and implement what we planned at the last PTG evrardjp
15:23:21 some people are just unaware of some of the issues though, and they don't monitor periodics
15:23:38 but aware that we are all time-poor there
15:23:40 * itxaka didn't even know there were periodic jobs...
15:25:45 did I drop, or did everyone go silent?
15:25:50 http://zuul.openstack.org/builds?project=openstack%2Fopenstack-helm
15:26:21 the scarier one is here:
15:26:27 http://zuul.openstack.org/builds?project=openstack%2Fopenstack-helm&pipeline=periodic
15:27:46 should we... move on to the next point?
15:27:59 I would be fine with that
15:28:32 #topic Status of Spiceproxy
15:28:41 itxaka: you're up again :)
15:28:47 same thing, different name
15:29:08 here I think jayahn has a production cloud using this for VDI?
15:29:09 spiceproxy was missing serviceaccounts, which leads me to think that there was no testing in there for it
15:29:24 damn, jayahn has everything which is broken :P
15:29:40 no - I think this is just the tip of the iceberg
15:29:49 thanks for digging into it itxaka
15:29:54 so same thing as with tempest, should we fix, test, add non-voting?
15:30:02 lgtm
15:30:04 I think that's the prudent thing
15:30:14 itxaka: you already added the svc account I think?
15:30:23 I could help there, unfortunately I have never used spiceproxy so I'm not sure what the configs for it are
15:30:27 portdirect, yep, I did
15:30:31 would do this in two different patches though (fixing and changing testing)
15:30:44 I could take that on if you like, I used to use spice a bit
15:30:48 but it still didn't work for me due to permission errors, so I'm wondering what else is missing in there
15:31:00 (except if tempest testing of this is already present, which I doubt)
15:31:11 the spice must flow
15:31:12 portdirect, that sounds good, I'll post in openstack-helm the exact error that I got then
15:31:26 thx - I'll update the etherpad
15:32:09 ok to move on?
15:32:14 :+1
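
For readers following the spiceproxy fix, what was missing boils down to the following plain-Kubernetes shape: a ServiceAccount object plus a serviceAccountName reference in the pod spec. This is a minimal illustrative sketch (object names, labels, and image are placeholders); in the chart itself this is normally generated through helm-toolkit's RBAC helpers rather than written by hand.

```yaml
# Minimal sketch only; names, labels and image are placeholders.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nova-spiceproxy
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nova-spiceproxy
spec:
  selector:
    matchLabels:
      application: nova
      component: spiceproxy
  template:
    metadata:
      labels:
        application: nova
        component: spiceproxy
    spec:
      serviceAccountName: nova-spiceproxy   # the reference that was missing
      containers:
        - name: nova-spiceproxy
          image: docker.io/openstackhelm/nova:latest   # placeholder image
```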
15:32:18 #topic Status of image registry authentication
15:32:26 not sure who added this?
15:32:31 that would be me
15:32:48 I'm looking into using a private registry to pull images
15:32:55 and I need to use authentication
15:33:06 stumbled upon angiewang's spec
15:33:21 yeah - it's a nice bit of work
15:33:24 but it seems to need some work and I was wondering what the status on it was
15:33:49 we should chase up with Angie, but since it's been merged I don't think we have managed to make any progress
15:34:09 I could not find any related patch in progress
15:34:11 though the spec makes pretty clear the work that needs to be done, so I think anyone could pick it up
15:34:43 wow, that spec is pretty cool, the code is basically there :o
15:34:47 I would gladly try it but I have very little experience with helm
15:35:12 hi, currently the implementation for supporting a private registry with auth has not started yet.
15:35:55 sounds like an opportunity for some mentoring, no?
15:36:07 itxaka: +1
15:36:09 that would be great
15:37:16 portdirect: sorry to hijack the topic, but there is an onboarding session at the next summit for OSH, right?
15:37:23 evrardjp: there is
15:37:37 awesome
15:37:47 sadly those are not recorded, but the eventual slides can be re-used/shared.
15:37:52 pgaxatte: will you be there?
15:37:59 evrardjp: yes
15:38:05 perfect.
15:38:16 angiewang: do you have time to work on this atm? it would be great if we could get a reference to spread out across charts
15:38:44 angiewang: and if you could bring pgaxatte along the way, that would be helpful to grow the community
15:39:17 (no pressure though!)
15:40:11 do you mean pgaxatte will be implementing the changes in my spec and I will be helping with any issues?
15:41:20 I just mean there are opportunities to work together
15:41:30 for example, pgaxatte could learn by reviewing
15:41:45 together we can spread the load
15:41:48 or the other way around: pgaxatte could learn by doing and you would mentor in the reviews
15:42:04 anyway I'm ok with it
15:42:10 or what portdirect said.. but basically communication is key :)
15:42:46 let it be noted that I still have a lot of things to understand on openstack-helm and openstack-helm-infra
15:42:47 s/but//
15:43:05 pgaxatte: don't worry - we all do :)
15:43:30 especially on helm-toolkit
15:43:55 are you working on the tungsten-fabric charts atm pgaxatte?
15:44:30 I have other tasks right now, I am not available to implement this at the moment (in the next month). I can help pgaxatte in reviews though.
15:44:38 portdirect: no, we're looking at core components and mistral at the moment
15:45:26 angiewang: help with the reviews would be great too
15:45:42 ok - mistral falls well within that bucket of things that have regressed in tests since the last PTG - we should also work to restore that
15:46:31 portdirect: that's something we plan on working on too, we'll probably have the occasion to contribute back
15:46:58 pgaxatte: yeah, I can help in reviews.
15:47:04 I can also help with implementing a month later
15:47:19 if it's not a rush
15:47:40 angiewang: will you be in Denver at the end of the month?
15:47:40 I think the introductions are done now
15:47:44 no problem, we can bypass that for now in our roadmap
15:48:11 portdirect, no
15:48:16 pgaxatte: as a short-term workaround, you can add a .dockercfg with the appropriate creds to the kubelet's home dir
15:48:34 but let's see if we can get the ball rolling on implementing the correct approach
15:48:50 angiewang: that's a shame, would have been great to have you out there
15:49:16 ok to move on?
15:49:29 ok for me
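
As a reference for the short-term workaround mentioned above, a similar effect can also be achieved per-namespace with a standard Kubernetes image pull secret referenced via imagePullSecrets. This is an illustrative sketch only (secret name, namespace, registry, and image are placeholders), not the approach described in the spec itself.

```yaml
# Illustrative sketch only; secret name, namespace, registry and image are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: private-registry-creds
  namespace: openstack
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <base64-encoded ~/.docker/config.json>
---
apiVersion: v1
kind: Pod
metadata:
  name: registry-auth-example
  namespace: openstack
spec:
  imagePullSecrets:
    - name: private-registry-creds   # kubelet uses this credential to pull the image
  containers:
    - name: example
      image: registry.example.com/openstackhelm/heat:latest
```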
15:49:29 Sorry to jump in.
15:49:43 I'm Prabhu, first time at this mtg.
15:49:52 Need some time at the end
15:50:01 will do prabhusesha
15:50:08 I didn't add my stuff to the agenda
15:50:15 can you add that now? https://etherpad.openstack.org/p/openstack-helm-meeting-2019-04-09
15:50:20 thanks!
15:50:24 #topic Reviews
15:50:47 so - we have quite a few outstanding reviews this week that could do with some love
15:51:11 https://review.openstack.org/#/c/647493/ Add internal tenant id in conf (irc: LiangFang)
15:52:05 ^ would be great to get some more feedback on this - would be really nice to have this feature
15:52:10 https://review.openstack.org/#/c/651140/ WIP: Use real rpc methods to find out if the service is alive
15:52:10 problem - maybe not enough read-only RPC calls?
15:52:10 problem - RPC service versioning changes per objective and sometimes per release (versions could be tracked with release-specific values overrides)
15:52:46 jsuchome raises a good point here, and one we were painfully aware of when implementing the probes initially
15:53:03 hi
15:53:10 it would be nice to revisit and see if there's a way we can implement this without causing tracebacks in logs
15:53:28 originally I wanted to ask if it could be the right approach, to use some real methods instead of a fake one
15:53:35 but I was looking at it today
15:53:39 it would be lovely if oslo.messaging supported a `ping`, but there we are ;)
15:53:44 maybe a different approach? do we have to check for RPC aliveness? are there any other places we can look at for how to do this?
15:53:59 and it seems to me that some components like conductor and scheduler do not seem to support read-only RPC calls
15:54:14 this was the challenge we hit jsuchome :(
15:54:35 ah, so that explains the weird implementation
15:55:00 itxaka: in the absence of richer reporting from the agents, I'm at a loss as to how we can do this another way
15:55:07 but would love to have one
15:55:45 I guess it's the wrong approach, trying to add such 'ping' calls to nova code... ?
15:55:49 get hypervisor status for compute?
15:56:07 evrardjp: but that does not cover the hard ones
15:56:22 yeah. I suppose we should discuss this in self-healing
15:56:22 we need read-only RPC across the board
15:56:27 compute actually is solvable, as proposed in that patch
15:56:30 because they probably have ideas there
15:56:43 evrardjp: agreed
15:56:59 looks like a good collaboration topic for the PTG
15:57:02 would anyone else (apart from the helm project) benefit from new calls?
15:57:04 it would be awesome if we could move off this approach
15:57:05 with the oslo team
15:57:24 jsuchome: if done properly - every deployment/management project could
15:57:52 anything that implements monitoring would take advantage of that
15:57:53 it doesn't need to be a community goal, but something like that can be approach by a pop-up team
15:58:04 approached*
15:58:09 hm, might be a good idea to bring it up in the nova meeting
15:58:34 there have been a few efforts in the past here, though normally they have stalled
15:58:46 getting a project like nova to lead the way would be great
15:58:59 * portdirect we ok to run over a few mins?
15:59:29 portdirect: in this room? I think it's better to continue in #openstack-helm if necessary
15:59:40 being a good citizen :)
15:59:41 prabhusesha: you ok if we discuss your topic in the main #openstack-helm channel? `Keystone failure with all-in-one setup`
16:00:01 ok
16:00:06 is anyone here active in nova? If not, I can try to ask there myself (still the RPC topic)
16:00:18 jsuchome: that would rock
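
For context on what the WIP review above touches: the probes in question are Kubernetes exec-based liveness/readiness checks wrapped around a health-probe script that issues an RPC call to the agent. The snippet below only illustrates that general shape; the script path, flags, and timings are placeholders, not the chart's actual interface.

```yaml
# Illustrative shape only; script path, flags and timings are placeholders.
livenessProbe:
  exec:
    command:
      - python
      - /tmp/health-probe.py
      - --config-file
      - /etc/nova/nova.conf
      - --probe-type
      - liveness
  initialDelaySeconds: 120
  periodSeconds: 90
  timeoutSeconds: 70
```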
16:00:37 itxaka: did my comments in the etherpad help you with your questions re reviews?
16:00:58 if not, let's take it to the openstack-helm channel
16:01:01 portdirect, yep, feel free to ignore those, I got enough to move them forward
16:01:14 awesome - thanks :)
16:01:18 I mean, it would be better if it were someone better known to the nova community, but I'd do that if no one else fits the description
16:01:25 sorry, we're out of time folks
16:01:35 #endmeeting openstack-helm