16:01:26 #startmeeting openstack_ansible_meeting
16:01:27 Meeting started Tue Sep 25 16:01:26 2018 UTC and is due to finish in 60 minutes. The chair is mnaser. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:29 round 2
16:01:30 The meeting name has been set to 'openstack_ansible_meeting'
16:01:32 #topic rollcall
16:02:13 o/ ? :)
16:02:18 o/
16:02:21 o/
16:02:40 o/
16:02:49 weizj proposed openstack/openstack-ansible-os_tacker master: Remove the unnecessary vao https://review.openstack.org/605117
16:03:03 ok cool lets get rolling
16:03:11 #topic Last week highlights
16:03:19 i see some stale items there, i think the only relevant one is evrardjp's
16:03:24 evrardjp: Making Ocata "extended maintenance" (This would add an ocata-em tag, not close the branch). See also: https://governance.openstack.org/tc/resolutions/20180301-stable-branch-eol.html . No real value for us, what do we point to for upstream repos (IMO, the ocata-em upstream tags when available, else the stable branch it is attached to).
16:03:39 weizj proposed openstack/openstack-ansible-os_tacker master: Remove the unnecessary verbose defined https://review.openstack.org/605117
16:03:54 evrardjp: would you like to expand on that?
16:04:47 yes
16:05:13 well I guess we discussed this today with odyssey4me and it was discussed during PTG (sorry to have missed that conversation)
16:05:36 <_d34dh0r53_> o/
16:05:38 There are discussions on the ML about moving projects to EM for ocata
16:05:39 it was a very brief discussion
16:06:01 I suggest we move said projects to EM as soon as possible, instead of continuing to track stable/ocata
16:06:31 For the ops side of the house are we going to be a tag vs branch?
16:06:39 it looks like most maintainers/users dont seem to care
16:06:46 indeed
16:07:13 so the advantage of doing so is that we have a reference that's stable and meaningful - never bumped until someone definitely wants to
16:07:26 so do we just tag ocata-em and call it a day?
16:07:40 and it won't break anytime should EOL appear and the branch be deleted
16:07:42 For me, I'd just like to reach a stage where we update the docs to indicate the status (and provide reference to what it means) - and stop releasing it.
16:08:10 mnaser: oh you mean just mark OSA as EM?
16:08:15 Yeah my concern is folks looking in the wrong location, so I definitely want to make sure our docs are good
16:08:17 and not care about the branches?
16:08:37 no, tag ocata-em and run jobs against ocata-em only
16:08:42 it will break, eventually, inevitably
16:08:54 my question is
16:08:59 (i should know this answer)
16:09:03 It's nice to have the branch there for those of us who'll still need to do patch updates. We have a ton of newton forks thanks to the EOL, and had to do a ton of tooling changes to switch where things pointed.
16:09:05 you mean for upstream projects or OSA?
16:09:15 if we don't follow stable policy
16:09:21 it's kinda up to us to do what we want
16:09:30 i dont think stable policy works well or makes sense for a deployment project
16:09:40 Agreed, which is why we don't follow it.
16:09:41 i really hope no one is doing ocata greenfield deployments :X
16:09:41 agreed mnaser
16:09:59 Merged openstack/openstack-ansible-ops master: Change VM definition in deploy-vm mnaio playbook https://review.openstack.org/602832
16:09:59 mnaser: there are ppl doing it though, but we can't do anything for them :p
16:10:00 Merged openstack/openstack-ansible-ops master: Use group vars to reduce redundancy in host vars https://review.openstack.org/602834
16:10:01 Merged openstack/openstack-ansible-ops master: Enable setting mnaio disk size by pxe server group https://review.openstack.org/602835
16:10:01 Merged openstack/openstack-ansible-ops master: Make root partition size configurable https://review.openstack.org/602859
16:10:02 ahahah
16:10:42 The integrated build is pinned, so we don't need to do anything there. We can leave the roles as-is and if they break, we can fix them as needed.
16:11:02 I'd suggest removing the periodic tests though.
16:11:05 ok I will follow odyssey4me/mnaser's proposition: next release will be marked as EM instead of the usual tag, and the rest stays the same. If someone wants to fix, (s)he can
16:11:26 I will clean things up: removing periodic testing -- changing the version name
16:11:35 marking jobs as non voting
16:11:40 Ok and give me a heads up and I'll try to get the docs patch up at the same time
16:11:46 we can discuss things in the review
16:12:04 spotz: you can push a patch already without the date I'd say.
16:12:20 well, as for non-voting jobs - we know that no-one bothers to check them, so why not just remove them all and make patches just do doc builds and nothing more?
16:12:32 i am in favour of dropping nv jobs.
16:12:36 lets not waste resources
16:12:37 maybe just linters
16:12:45 Anyone using an EM project should be doing their own testing.
16:12:50 ++
16:12:58 well actually
16:13:01 things should still be backported so
16:13:02 that should be written in a release note then
16:13:04 patches should be good
16:13:16 maybe we keep the testing up until we can't?
16:13:22 the branch would still be up so it's fine
16:13:26 well 'em' gives the impression that someone maintains it. if you drop the jobs, then things start failing so the 'maintenance' is not accurate anymore
16:13:37 drop the testing till we cant
16:13:37 yeah, the EM should definitely include a release note saying that it's EM and therefore no longer actively tested or developed
16:13:48 means someone has to go out and write a patch to drop the jobs
16:13:50 when hey break
16:13:52 they*
16:14:13 yeah, I'm fine with leaving them as-is and removing them piecemeal as they break
16:14:14 i dunno
16:14:25 there is no good reason why things start failing when the openstack components are pinned
16:14:25 I've already done some :p
16:14:26 hwoarang: that's kinda the idea of EM though: until we can't merge things anymore, we keep the branch open, else we close the branch
16:14:34 hwoarang: we've seen plenty of things happen
16:14:42 external dependencies changing url (ms + github stuff)
16:14:46 mariadb changing urls
16:14:49 aaaaaall sort of things happen
16:14:51 yeah but if the branch is open you need some jobs to get an idea of how terrible the situation is :)
16:15:04 evrardjp: We've got docs where it mentions the location which is why, if the branch goes away and we become a tag, I'd rather wait
16:15:04 hwoarang: :)
16:15:13 well see the thing is
16:15:17 we dont follow stable policy
16:15:22 that's true
16:15:27 so we have zero obligation of doing this
16:15:32 indeed
16:15:37 my question is: is there someone here willing to step up to handle this
16:15:44 we can close the branch right now if you want : p
16:15:45 because it provides value for their organization
16:16:07 if someone steps up to take care of the jobs, tagging em, etc, i'm all for it
16:16:22 but i dont think a deployment project being in EM makes a whole lot of sense, it does for the actual openstack services
16:16:23 I would prefer not to work on this anymore :p
16:16:37 then lets not do it :)
16:16:41 we could draft up a small email to ML
16:16:46 and ask if someone wants to step up
16:17:05 I am not in agreement there -- deployers probably want their branch to live FOREVER
16:17:20 but they will never get an incentive to upgrade -- which is a bad message :p
16:17:30 against sending an email to ML?
16:17:37 killing the branch is the ultimate incentive
16:17:41 if someone wants it to live forever, they can pick up the work :-)
16:17:41 oh no I am for it : )
16:17:51 mnaser: well that was the idea of EM :D
16:17:55 mnaser: we (rax) need it, and will handle it
16:17:55 anyway
16:18:04 well there, no need of an ML post
16:18:12 we spent much time for this, sorry for taking the time
16:18:20 nah, it's fine.
16:18:33 Good covos are worth the time
16:18:34 we only need a small subset of the roles
16:18:37 convos even...
16:18:52 in that case, we can go ahead with EM and given odyssey4me (or someone at rax) will be leading it then i'll leave it up to him/them to decide how they'd like to maintain it
16:18:56 i think we could leave gating in place for now and remove nv and periodics
16:19:01 (as long as we all agree on it)
16:19:02 at least it's discussed and we know the interest of people is there -- we'll clarify the situation outside this meeting, summarize in the ML and take actions.
16:19:02 but we would prefer to keep it upstream rather than have to manage downstream forks, right d34dh0r53 prometheanfire cjloader ?
16:19:25 yes, upstream is better
16:19:26 mnaser: agreed
16:19:33 absolutely
16:19:41 always
16:19:54 awesome
16:19:58 understood -- so no EOL -- a path towards EM is better :)
16:20:09 or no EM but keeping things okayish
16:20:15 so in that case, it's your call to see how best we deal with EM (and whether you want to maintain jobs or not)
16:20:23 as long as we have some form of ownership, i'm happy to have it
16:20:54 (shared responsibility is still a thing, we all help each other land stuff, but the bulk of removing any jobs or what not)
16:21:55 cool
16:21:59 on to other fun stuff
16:22:09 any other comments about this?
16:22:56 none
16:22:56 i guess not!
16:23:06 #topic bug triage
16:23:18 #link https://bugs.launchpad.net/openstack-ansible/+bug/1793781
16:23:19 Launchpad bug 1793781 in openstack-ansible "Queens -> Rocky Upgrade breaks cinder/glance" [Undecided,New]
16:24:02 based on last comment seems like a user config issue?
16:24:20 indeed
16:24:25 Yeah so DNF
16:24:28 or invalid
16:24:52 quorum on invalid?
16:25:02 ok
16:25:22 #link https://bugs.launchpad.net/openstack-ansible/+bug/1793389
16:25:22 Launchpad bug 1793389 in openstack-ansible "Upgrade to Ocata: Keystone Intermittent Missing 'options' Key" [Undecided,New] - Assigned to Alex Redinger (rexredinger)
16:26:01 hmm
16:26:04 looks like they pushed a fix
16:26:07 but nothing on master
16:26:22 is alex on irc?
16:26:30 EM on ocata, it's time :p
16:26:32 hahah
16:26:44 its a patch that went straight to master
16:26:50 err
16:26:53 straight to stable branches
16:26:55 which means it will regress
16:27:22 true. we should ask for proper backports
16:27:23 d34dh0r53: https://review.openstack.org/#/c/604846/ mind taking off the +2 here because there is no master patch?
16:27:39 it started in queens apparently: https://review.openstack.org/#/c/604804/1
16:27:54 and that looks like a keystone bug too
16:27:54 Well unless there's no issue with master and it is only a backport?
16:27:59 my bad, should have looked at that closer
16:28:10 no its an upgrade issue with keystone
16:28:18 well master is different now
16:28:29 but it doesn't prevent a master implementation
16:29:37 confirmed/medium and ask for an impl. in master?
16:29:40 also it's very weird: why only the service tenant, and not the other keystone commands?
16:29:50 it looks like a keystone bug to me honestly
16:30:17 im tempted to put this down as invalid
16:30:19 or a conf issue
16:30:21 OH
16:30:24 you know what this might be?!
16:30:29 memcache not being flushed
16:30:30 in the upgrade
16:30:44 so the dict it pulls is invalid
16:30:45 mmm
16:30:51 we have a restart in it
16:31:04 not during the keystone play
16:31:16 oh wait
16:31:31 https://github.com/openstack/openstack-ansible/blob/stable/ocata/scripts/run-upgrade.sh#L185
16:31:46 it is before the keystone play
16:32:18 mnaser: I suggest we ask in the bug if memcached was properly flushed, see how it goes?
16:32:20 but is the bug reporter using the same upgrade tooling?
16:32:32 yeah my point odyssey4me
16:32:42 mark it incomplete until we know how the upgrade was triggered
16:32:49 I just asked him to join
16:32:56 Merged openstack/openstack-ansible-ops master: MNAIO: Correct README regarding DATA_DISK_DEVICE https://review.openstack.org/604064
16:33:23 hmm
16:33:29 he's on his way
16:33:37 we dont have many bugs
16:34:04 o/ Alex_Red1nger
16:34:10 do you need scrollback?
16:34:11 o/
16:34:13 \o
16:34:25 Yes please
16:34:29 Alex_Red1nger: we were talking about https://bugs.launchpad.net/openstack-ansible/+bug/1793389
16:34:29 Launchpad bug 1793389 in openstack-ansible "Upgrade to Ocata: Keystone Intermittent Missing 'options' Key" [Undecided,New] - Assigned to Alex Redinger (rexredinger)
16:34:33 hey Alex_Red1nger, it's fun with bugs time currently : )
16:34:35 Alex_Red1nger: pasted it in slack
16:34:46 d34dh0r53: boo slack!
16:34:53 yep, boo slack
16:35:00 but didn't want to spam this changel
16:35:03 chanel
16:35:04 cut him some.......slack
16:35:09 ba dum cha
16:35:17 Haha
16:35:18 d34dh0r53: spamalot much?
16:35:32 that's sir spamalot to you
16:35:34 Anyway, this is definitely a keystone bug.
16:35:54 hehehe
16:35:55 My patch is meant as a work-around to expedite upgrades.
16:36:01 Might I ask what it means when my bug ticket is moved from open to Invalid with no comment of why?
16:36:14 * spotz notes no nano in our containers is a major bug:)
16:36:33 spotz: +1
16:36:42 skiedude: got a link to one?
16:36:53 https://bugs.launchpad.net/openstack-ansible/+bug/1793781
16:36:53 Launchpad bug 1793781 in openstack-ansible "Queens -> Rocky Upgrade breaks cinder/glance" [Undecided,Invalid]
16:36:54 Alex_Red1nger: you were using the upgrade tooling?
16:37:16 Merged openstack/openstack-ansible-ops master: MNAIO: Ensure consistent defaults https://review.openstack.org/604059
16:37:17 Merged openstack/openstack-ansible-ops master: MNAIO: Disable metering services by default https://review.openstack.org/604034
16:37:18 skiedude: sorry, i should have added a comment, it looks like you mentioned that it was something with your config so we assumed that it was just something to figure out in your config.
16:37:57 ah ok, well I'm just assuming it is my config, I'm using the basic config options in the test production example, so nothing special
16:38:06 mnaser: Yeah, I ran into the bug during execution of run-upgrade.sh
16:38:09 Alex_Red1nger: was flushing memcache part of the upgrade process, as done in every stable branch upgrade script? https://github.com/openstack/openstack-ansible/blob/stable/pike/scripts/run-upgrade.sh#L192
16:38:09 guess its not technically a bug, just wasn't sure where to put it for support
16:38:12 mnaser: yeah, the error pops up when going from ocata to pike and it happens randomly, sometimes it passes successfully, other times, it just needs another run of the keystone playbook and it goes away
16:38:24 (using the run-upgrade.sh)
16:38:26 im assuming the "it needs another run" really is
16:38:32 aha, ok thanks for clarifying
16:38:33 stuff getting cleared out of memcache
16:38:57 maybe there needs to be some sort of check-until-it's-cleared task?
16:39:01 yeah, it was kind of random
16:39:14 caching does seem like a likely culprit, since nothing in the code seems susceptible to a race.
16:39:26 skiedude: Best thing is to ask here, with the understanding someone might not be available, or the ML
16:39:30 yeah im 99% sure its the memcache
16:39:49 understood, where's a link to the ML?
16:40:04 skiedude: I'll PM you so as not to spam:)
16:40:10 thanks!
16:41:04 maybe the memcache flush needs to happen within the keystone role rather, because that flush happens while the old system is running
16:41:23 exactly, i'd classify that as a bug in our deployment tooling
16:41:29 more retries is just hoping that luck expires memcache stuff
16:41:32 could just have a flag to set, and use that flag in the upgrade script
16:41:44 maybe as a handler
16:41:59 with a delegate_to
16:42:07 yep, probably as a handler because that'll fire when the old one is shut down and the new one starts
16:42:15 anytime we restart keystone
16:42:18 we do that
16:42:33 that sounds like it'd be too much - and break rolling upgrades
16:42:48 i dunno how much it'd break rolling upgrades
16:42:52 memcache is just stateless
16:42:53 it'll repopulate
16:43:06 ok, but doesn't it hold all the tokens and such?
16:43:45 nope, with fernet tokens
16:43:49 we've only ever needed this done on a major upgrade as far as I know, so it seems prudent to me to just have a flag and use it during the major upgrade
16:43:50 they dont get stored anywhere actually
16:44:05 with fernet tokens it validates them using the credential files
16:44:05 anyway, those are details which can get sorted out in review
16:44:12 yeah.
16:44:18 im not opposed to adding more retries if you think it helps
16:44:19 it wont hurt
16:44:23 Alex_Red1nger: just need a patch for master too
16:44:29 if you can do that then we'll be able to push the rest
16:45:14 given the cache is for performance, it seems prudent to me not to flush the cache every time you restart keystone - but rather only when needed, otherwise we'd just be slowing the environment down
16:45:31 not opposed to that either, or if we know its a major upgrade, we can figure out the details
16:45:43 Alex_Red1nger: ill mark it as confirmed and we'll take a patch for master before we can land the backported ones
16:46:03 Arx Cruz proposed openstack/openstack-ansible-os_tempest master: WIP - Enable stackviz support https://review.openstack.org/603100
16:46:09 skiedude: mailing list or irc (outside meeting hours) will be the best place to gather help :)
16:46:16 aka in less than 15 minutes
16:46:36 #link https://bugs.launchpad.net/openstack-ansible/+bug/1791085
16:46:36 Launchpad bug 1791085 in openstack-ansible "OVN Metadata Service Broken" [Undecided,New] - Assigned to James Denton (james-denton)
16:46:40 mnaser: OK, I'll move things over.
16:46:59 jamesdenton: do we need anything about that or is it just there to track progress?
16:47:13 eventually, yes.
16:47:26 it's #22 of #100
16:47:50 heehe
16:47:55 so do we mark as wishlist? how do we track wip items usually?
16:48:12 good question. certainly not urgent
16:48:32 i guess confirmed
16:48:35 confirm/in-progress I guess - priority probably low/medium
16:48:37 because its confirmed broken.
16:48:48 confirmed/low it is
16:48:53 #link https://bugs.launchpad.net/openstack-ansible/+bug/1782388
16:48:53 Launchpad bug 1782388 in openstack-ansible "Installing Multipath But Not Enabling In Nova Causes Volume Attachment Failures" [Undecided,New]
16:48:54 thanks
16:48:56 our trail
16:49:08 i dont see byron around
16:50:47 i guess i can leave it as is
16:50:49 #topic open discussion
16:50:58 10 fun minutes if anyone has anything fun
16:51:13 Can I possibly get some eyes on https://review.openstack.org/#/c/556586/?
16:51:23 The designate spec
16:51:26 mnaser evrardjp did we want to chat about rocky releasing?
16:51:36 we can
16:51:38 oh sure
16:51:46 i'll look at that spec soon cjloader
16:52:01 lets see the state of rocky
16:52:20 https://review.openstack.org/#/q/project:%255Eopenstack/openstack-ansible.*+is:open+branch:stable/rocky
16:52:22 we have some open things
16:52:25 We chatted a bit earlier. I suggested that we do the final RC tag tomorrow (Wednesday) and ask for the release based on it. JP's on holiday friday.
16:52:37 mnaser: when I proposed it, rocky was successfully building in periodics.
16:52:52 im just trying to see if there is something really important to merge
16:53:08 mnaser: makes sense
16:53:11 but for the most part
16:53:15 they all seem mostly like cleanups
16:53:24 All the galera_client deps are gone from rocky, so I'm happy to go ahead
16:53:34 nothing too fundamental and important
16:53:37 After the meeting I need to know if we have anache conf overrides:)
16:53:41 there are always bugs, and we release often, so I think it's time we released it into the wild
16:53:43 apache even
16:53:51 i agree, it's in a good state, nothing pressing
16:53:51 odyssey4me: I agree there
16:54:01 just for confirmation
16:54:08 So we can check periodics in the morning, and if it's good - we go for it
16:54:15 another final rc, right? not reusing what's up
16:54:16 +1
16:54:17 ++ i agree
16:54:26 ok sounds good
16:54:31 yes, what's up is pretty far behind
16:54:34 doesnt have all the galera_client stuff
16:54:47 i think the galera_client stuff has helped drive deploy times down so much and made it more stable
16:54:48 very happy about that one
16:54:51 mnaser: But it's working well in SJC so far?
16:55:03 wonderfully so
16:55:08 ok
16:55:08 Sweet:)
16:55:13 all my rocky-deploy patches were merged too
16:55:31 any other subjects? :)
16:55:35 pike
16:55:42 pike release?
16:55:57 need to get https://review.openstack.org/#/c/604623/ going so that we can get pike updated
16:56:18 Alex Redinger proposed openstack/openstack-ansible-os_keystone master: Increase retries to mitigate intermittent failure https://review.openstack.org/605146
16:56:37 i think the release is on hold right now
16:56:41 because of some issues with ssl?
16:56:45 doug posted something about it a few days ago on the ML
16:56:56 ohh, must have missed that, I'll go look
16:56:59 or maybe not, im not sure
16:57:00 https://review.openstack.org/#/q/project:openstack/releases
16:57:03 i see things merging
16:57:16 d34dh0r53: you can ask in #openstack-release
16:57:32 I thought the comment there was specific to us and I was gonna blame mhayden
16:57:32 mnaser: yeah
16:57:46 I asked earlier - no response yet
16:58:07 I think it's fixed now
16:58:28 the releases issue I mean
16:58:44 sorry I haven't got the chance to continue on that -- a lot of fire still
16:58:46 I can poke Sean
16:58:48 cool
16:58:56 Alex Redinger proposed openstack/openstack-ansible-os_keystone stable/rocky: Increase retries to mitigate intermittent failure https://review.openstack.org/605148
16:59:46 thats all folks?
17:00:04 that's all from me
17:00:11 yep, I'm good
17:00:28 thanks everyone
17:00:31 #endmeeting
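
A minimal sketch of the flag-gated memcached flush discussed at 16:41-16:45 (a handler with delegate_to, enabled only when the upgrade tooling asks for it). The handler name, the keystone_flush_memcached variable, the memcached_all group, and the reliance on nc being installed on the memcached hosts are illustrative assumptions, not the actual os_keystone implementation:

    # handlers/main.yml (sketch): flush memcached alongside a keystone restart,
    # but only when the deployer or upgrade script explicitly opts in.
    - name: Flush memcached
      shell: "echo 'flush_all' | nc 127.0.0.1 11211"
      delegate_to: "{{ item }}"
      with_items: "{{ groups['memcached_all'] | default([]) }}"
      when: keystone_flush_memcached | default(false) | bool
      listen: Restart keystone services

With something like this, run-upgrade.sh could pass -e keystone_flush_memcached=true for the major-upgrade case, while routine keystone restarts leave the cache untouched, which addresses the rolling-upgrade and performance concerns raised above.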
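
The "Increase retries to mitigate intermittent failure" patches referenced above take the other approach: retrying the Keystone API task that intermittently hits the missing 'options' key while stale cache entries expire. A rough sketch, using a hypothetical task rather than the actual os_keystone one:

    # Sketch only: retry a Keystone service-setup call until it succeeds.
    - name: Ensure service project exists
      os_project:
        cloud: default
        state: present
        name: service
        domain_id: default
      register: _service_project
      until: _service_project is success
      retries: 10
      delay: 5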