19:03:39 <fungi> #startmeeting infra
19:03:40 <openstack> Meeting started Tue Mar 27 19:03:39 2018 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:44 <openstack> The meeting name has been set to 'infra'
19:03:51 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:57 <fungi> #topic Announcements
19:04:20 <fungi> #info clarkb is travelling this week, which is why I am returning as a guest lecturer
19:04:51 <fungi> he thankfully didn't leave me any new announcements, and i wasn't creative enough to come up with more than that one
19:05:15 <fungi> #topic Actions from last meeting
19:05:46 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-03-20-19.01.html Minutes from last meeting
19:05:52 <fungi> "1. (none)"
19:05:57 <fungi> and done
19:06:40 <fungi> #topic Specs approval: Amend top-level project hosting spec (corvus)
19:06:54 <fungi> #link https://review.openstack.org/555104 Amend top-level project hosting spec
19:06:59 <AJaeger> lecturer? Ok, I'll move a few rows up and listen...
19:07:18 <corvus> i went around and got informal support for this change from almost everyone who reviewed the original
19:07:25 <corvus> enough so that i went ahead and actually implemented the change
19:07:37 <fungi> yeah, this one is really just a minor adjustment to the git hosting part of the hosting spec anyway
19:07:52 <corvus> but now (or soon) would be a good time to raise objections or questions on it :)
19:08:25 <corvus> i think we may be only minutes away from declaring that entire spec implemented
19:08:25 <fungi> we're just about to the point where that spec can be moved to implemented if this change is approved, right?
19:08:34 <fungi> heh
19:08:40 <corvus> yeah, i mean, we might already be there if i weren't typing with a sandwich in my hand
19:08:45 <fungi> fastest enter key in the west
19:09:26 <fungi> i'll risk incurring the wrath of our illustrious ptl and declare open season for roll call on this, since he's already put a roll call vote on it himself
19:09:54 <corvus> i think that's appropriate
19:10:16 <fungi> #info Infra Council voting is open on the "Amend top-level project hosting spec" change until 19:00 UTC on Thursday, March 29
19:10:44 <fungi> any other comments before we continue with the agenda?
19:11:20 <fungi> #topic Priority Efforts: A Task Tracker for OpenStack - utf8mb4 transition (fungi)
19:11:59 <fungi> just a quick note that diablo_rojo discovered a while back we couldn't import lp bugs containing 4-byte characters into storyboard
19:12:35 <fungi> i've been hacking on solving it, and there are a couple remaining outstanding changes i've tested manually on storyboard-dev so that we can move forward there
19:12:55 <corvus> i bet i can guess which emoji caused the problem
19:12:57 <dmsimard> \o
19:13:04 <fungi> #link https://review.openstack.org/555787 Use utf8mb4 for MySQL database charset (puppet-storyboard)
19:13:04 <diablo_rojo> Looks like py27 & py35 failed again on your latest patch :/
19:13:13 <dmsimard> (╯°□°)╯︵ ┻━┻ < corvus
19:13:31 <corvus> there was a very specific emoji which took out etherpad.o.o during a summit...
19:13:31 <fungi> https://review.openstack.org/556626 For utf8mb4 shorten teams.name and users.email (storyboard)
19:13:40 <dmsimard> interesting
19:14:02 <fungi> actually, the problem emoji was something like "person with both hands raised in celebration" which appears in babel install stdout these days
19:14:04 <diablo_rojo> corvus, out of curiousity, what emoji?
19:14:23 <diablo_rojo> But also didn't show up in the lp bug actually.
19:14:40 <fungi> the one which notoriously took down etherpad at the summit was teh snowman
19:15:06 <corvus> fungi: oh was it? i thought it was... nevermind.
19:15:11 <fungi> anyway, barring anything beyond a minor typo failing the unit tests on 556626 those could both use quick reviews
19:15:39 * mordred has already reviewed
19:15:39 <corvus> i thought it was, pardon my language, U+1f4a9.
19:15:59 * dmsimard googled that
19:16:01 <fungi> corvus: it's possible the first etherpad outage was a different codepoint, yes, and then the subsequent deaths were all snowman tests
19:16:03 * mordred did too
19:16:41 <fungi> anyway, i didn't have much else to add here other than it would be nice to be able to keep up our momentum on project imports, so figured i'd highlight these priority changes
19:16:54 <dmsimard> +1, we need to ditch launchpad one day :)
19:17:10 <dmsimard> (and ubuntu one SSO)
19:17:38 <fungi> #topic Priority Efforts: Zuul v3 - git.zuul-ci.org (corvus)
19:17:47 <fungi> i guess this is now a thing?
19:17:52 <corvus> er, as mentioned, this is nearly done
19:17:57 <corvus> it may in fact be done
19:18:04 <corvus> i just hit a missing css file in one of my tests
19:18:06 <mordred> it renders for me
19:18:14 <pabelanger> and me!
19:18:30 <corvus> so i need to figure out whether all the servers are identical, or if i left git08 different somehow, or if i just caught it mid-rollout
19:18:31 <fungi> done-ish!
19:18:50 <corvus> i may ask for help testing in a bit, but am not ready now
19:18:53 <mordred> $ git clone https://git.zuul-ci.org/zuul
19:18:56 <mordred> just worked fine for me
19:19:01 <corvus> anyway, hopefully it's the latter, and it's all done :)
19:19:24 <fungi> i suppose the next big chunk will be updating zuul repos to switch git references
19:19:40 <corvus> yeah, that's mostly documentation
19:19:49 <corvus> the zuul and gitreview files won't change
19:19:55 <fungi> sure
19:20:28 <fungi> more critical that people who want to continuously consume the zuul stdlib will be able to use git.zuul-ci.org urls now
19:21:01 <fungi> though i suppose things like puppet-zuul can also be adjusted in time
19:21:25 <fungi> anything else you wanted to say about this, or does anyone still have any questions about it?
19:21:33 <dmsimard> Are we planning to move things like puppet-zuul and puppet-nodepool there ?
19:21:35 <fungi> obviously closely related to the spec change discussed earlier
19:21:53 <fungi> dmsimard: i'm not aware of any immediate plans to do that, but it would likely be possible
19:22:01 <corvus> dmsimard: not atm; i don't think the zuul project has taken on packaging/distribution tasks
19:22:12 <dmsimard> okay -- and another question
19:22:18 <corvus> (and if we did, i can think of two choices which would probably be ahead of puppet on our list)
19:22:24 <fungi> heh
19:22:35 <dmsimard> zuul and zuul-jobs are currently mirrored on both git.o.o and git.zuul-ci.org by gerrit automatically ? or by other means
19:22:51 <dmsimard> (and I guess the other repos)
19:23:03 <corvus> dmsimard: automatically. git.zuul-ci.org are symlinks on the git farm to the git.o.o repos
19:23:11 <fungi> dmsimard: it's all the same server, just some magic linking to make it appear in different sites
19:23:24 <dmsimard> ah, nice. thanks.
19:23:24 <corvus> the magic is configured in projects.yaml in project-config
19:23:32 <dmsimard> I'll have a look
19:23:35 <fungi> dmsimard: also worth reading that spec/change
19:23:42 <fungi> which roughly outlines how it works
19:24:11 <dmsimard> I don't have anything else, we can move on :)
19:25:11 <fungi> #topic Clouds update (dmsimard)
19:25:22 <dmsimard> ohai
19:25:30 <fungi> clarkb suggested maybe you could give us a brief summary of the limestone situation
19:25:57 <fungi> well, you or somebody, but i picked you because you seemed to have the most background on it
19:26:06 <fungi> feel free to delegate ;)
19:26:22 <dmsimard> The networking is still giving us a bit of trouble. My understanding is that there's two main problems still in the process of being resolved
19:27:00 <dmsimard> #1 In some roles we assumed that nodes always had an ipv4 which was the case until Limestone came around with ipv6 only nodes
19:27:25 <fungi> well, most recently the case, it was last the case with osic before the great ansibling
19:27:41 <ianw> #link https://review.openstack.org/#/c/556747/
19:27:51 <frickler> see https://review.openstack.org/556784 too
19:27:55 <dmsimard> #2 We've witnessed some DNS lookup failures (most notably with pypi? maybe with other things) so we need to fix DNS things too.
19:28:39 <dmsimard> What we've done for the time being is that we pulled limestone out by assigning custom nodepool labels (prefixed with limestone-) and these are being tested under https://review.openstack.org/#/c/556849/
19:28:39 <ianw> current theory on this from me is that we're using the v4 nat gateway, and we know that to be a little unreliable, because of a bad ipv6 detection in unbound
19:28:45 <ianw> #link https://review.openstack.org/556740
19:29:02 <fungi> so in both cases it's basically been a regression of things we had working with osic under a similar network model, but lost in the zuulv3 transition because we didn't have anywhere to keep us honest about continuing to support that model
19:29:04 <dmsimard> These custom labels will allow us to make sure that the patches we want to land are indeed fixing the issues we're seeing without impacting our regular customers :)
19:29:11 <pabelanger> Yah, I feel we had this same issue on osic with DNS
19:29:20 <pabelanger> and added the logic to do ipv6 there too
19:29:35 <dmsimard> fungi: I wasn't root back when OSIC was a thing so I'll have to defer to the others for that
19:29:37 <fungi> pabelanger: oh, we did. we had a whole dns-over-ipv6 chooser in the old ready scripts
19:29:52 <pabelanger> Yup
19:30:13 <corvus> which, to be fair, we still have, though as 556740 shows, it may not be working
19:30:56 <ianw> i'm uncertain if it was working, and something changed, or it never worked
19:31:15 <corvus> dmsimard, mordred: if you have a sec to look at https://review.openstack.org/556740 -- several of us are curious as to (whether or why) that's necessary
19:31:48 <dmsimard> corvus: I've seen that and I am dubious as well, I intend on testing it but haven't got around to it yet
19:32:00 <dmsimard> corvus: these variables should be equivalent
19:32:10 <ianw> that's what i thought, there's a couple of links in there to the logs
19:32:29 <corvus> should we start by throwing some debug lines in that role?
19:33:09 <ianw> corvus: i can ...
19:33:33 <ianw> since it triggers on rax, there's a fair probability of hitting it for comparison purposes
19:34:36 <fungi> makes sense as that's also got global ipv6 by default
19:35:42 <clarkb> ya as does ovh except it doesnt work :(
19:36:36 <corvus> i have looked at the supplied log, and the logs of a result including 556740 and agree that the old one lacks unbound_use_ipv6, and the new one has it
19:37:03 <corvus> so that seems to indicate the change actually changes behavior. why is still a mystery.
19:37:21 <pabelanger> I wonder if we are missing facts on that playbook
19:37:36 <pabelanger> but hostvars is setup
19:38:03 <corvus> it's the first pre-playbook, from base
19:39:13 <mordred> *weird*
19:40:44 <corvus> i also verified the change correctly sets the value to false on a host w/o ipv6
19:41:20 <fungi> so we have evidence it does what we want, but no documented explanation for why it works that way?
19:41:29 <corvus> fungi: that's where i am on the subject
19:41:34 <corvus> i left a bewildered +2
19:41:42 <ianw> that's about where i got to :)
19:42:03 <fungi> something we'd like to bring to upstream ansible attention?
19:43:42 <pabelanger> I'm a happy to add some debug statements and see if I can find out why it is failing, but agree, weird
19:43:57 <ianw> i can try a few debug statements and file an issues
19:44:21 <pabelanger> ianw: happy to debug along side
19:44:39 <fungi> sounds good
19:45:05 <fungi> i like when we figure out how to fix thnigs, but i'm left feeling uneasy when we can't explain why the fix works (or is even necessary)
19:45:36 <fungi> if it's fixed via unintended side-effect then there's an increased risk it will re-break
19:45:49 <corvus> i reckon we can continue to debug this post-meeting. this is the only limestone-blocking change that hasn't merged?
19:45:50 <fungi> when assumptions spontaneously change out from under us
19:46:28 <corvus> wait a sec
19:46:41 <corvus> this role runs in base/pre, right?
19:46:49 <ianw> corvus: i think so
19:46:54 <corvus> that means it's trusted, so it's not self-testing
19:47:14 <corvus> which means my analysis of the results are wrong, and it may not be safe to land without going through the base-test dance.
19:47:28 * fungi hadn't looked, assumed this was being tested via support already landed to base-test
19:47:50 <pabelanger> don't we run it from base-minimal?
19:47:59 <pabelanger> for ozj
19:48:00 <corvus> we've got nice big warnings around the base job itself, but not the roles it uses
19:48:52 <corvus> it may be covered by the integration jobs; if so, perhaps it's safe to land? i'm not sure what that means for evaluating whether the change works though.
19:49:58 <corvus> anyway, perhaps dmsimard, ianw, i, others can work through that some more post-meeting
19:50:33 <pabelanger> wfm
19:50:48 <fungi> cool, we'll move on with the agenda in that case
19:50:50 <fungi> thanks!
19:50:58 <fungi> #topic LimeSurvey spec (anteaya)
19:51:02 <fungi> #link https://review.openstack.org/349831 Add survey spec
19:51:06 <fungi> i gather anteaya is interested in breathing some new life into this proposed spec pleia2 put together a couple years ago
19:51:13 <anteaya> yes
19:51:39 <anteaya> we don't have an open source survey tool and I think that it would be useful to have one
19:51:42 <anteaya> I'd use it
19:52:03 <anteaya> since the last patchset a puppet-limesurvey module has appeared
19:52:04 <fungi> we've heard from various corners of the community that they'd be interested in having ways to put together ad hoc surveys
19:52:29 <corvus> we could at least use it to verify interest in the tool. ;)
19:52:41 <anteaya> and I have learned via testing with fungi that users just need a token to take the survey, the don't need to register
19:52:50 <anteaya> corvus, :)
19:52:56 <fungi> heh
19:53:06 <anteaya> so perhaps we can launch it with the current server auth
19:53:13 <fungi> so i guess this is as much to refresh visibility into the spec and see if anyone has any recent input on it before we see about putting it up for a council vote?
19:53:22 <anteaya> whilst those that are dueling in the auth wars figure out a winner
19:53:36 <pabelanger> could the civs software from cornell also be used for surveys? Just thinking about the discussions around elections
19:54:00 <fungi> civs can be used for certain kinds of surveys
19:54:01 <anteaya> pabelanger, well the only thing I've ever seen it used for is elections
19:54:22 <anteaya> hard to use it for select one answer out of these ones, or fill in some text
19:54:40 <anteaya> also, correct me if I'm wrong civs sometimes has load issues on its servers
19:54:47 <fungi> well, what civs can't really do well is multi-question sorts of surveys, nor surveys with free-form answers
19:54:57 <anteaya> indeed yes
19:54:58 <fungi> we could host our own instance of civs and may want to for other purposes
19:55:10 <anteaya> civs is the tool for the method it advertizes
19:55:17 <pabelanger> I think civs supports ' Allow voters to write in new choices', but I have never tested it
19:55:17 <fungi> but the two tools aren't really filling the same needs
19:55:30 <pabelanger> unsure about multi options
19:55:31 <anteaya> but the surveys I want to offer need more flexibility in how questions are answered
19:55:54 <pabelanger> ack
19:56:24 <anteaya> any objection to moving ahead with server auth for the time being?
19:56:41 <fungi> pabelanger: less about multi-option, more about your typicak feedback-style surveys where you're setting x-out-of-y values for some things, setting multi-select for some, entering your opinions on certain questions
19:56:46 <anteaya> I'm on my own dime here so I will have to stand this up on a server for myself if this gets bogged down
19:57:00 <anteaya> but thought I would try here first
19:57:39 <corvus> what's the server auth for?
19:57:46 <anteaya> for the admins
19:58:01 <anteaya> to set up the surveys and create the tokens for inviting users
19:58:03 <corvus> is it like, we make a secret password and give it out to people who ask to be able to create surveys?
19:58:09 <anteaya> or takers of surveys
19:58:52 <anteaya> well if that is the way infra wants to do it
19:59:11 <anteaya> out of the box is is register with username email and password
19:59:29 <corvus> it's not a preference, i'm trying to understand the question "any objection to moving ahead with server auth for the time being?"
19:59:35 <anteaya> and, I believe, click on emailed verification
19:59:45 * anteaya nods
20:00:01 <fungi> i don't see any reason we couldn't turn up a trial deployment, especially if there's already a puppet module for it so it's just a patch to system-config and someone to enter hiera keys and run the launch script, which may allow us to suss out what the missing bits are for a more self-service experience for people who want to set up their own surveys (maybe we could set up surveys through
20:00:02 <anteaya> so I don't know how infra wants to deal with who can register
20:00:03 <fungi> structured metadata and code review instead of through service-specific accounts for example? i haven't looked at how it stores them or what sort of api it might have, just guessing)
20:00:51 <fungi> oh, anyway, we're at time for today
20:00:55 <corvus> also, there's an openid module for apache, which may be able to be used as the "webserver-based" authentication.
20:01:15 <anteaya> corvus, could you add a link to that in the spec?
20:01:23 <anteaya> thanks for the time
20:01:35 <fungi> yeah, that's something clarkb has successfully coupled with some stuff
20:01:53 <clarkb> its a bit clunky and has bugs but mostly works
20:02:22 <corvus> noted
20:02:44 <fungi> anyway, we're two minutes over, but doesn't sound like there are major concerns if somebody's volunteering to do the patch to system-config (and i can do the hiera/launch bits if nobody else wants to volunteer for those)
20:03:06 <anteaya> my patch won't work but I can stand one up
20:03:11 <anteaya> corrections welcome
20:03:18 <fungi> that's why we have code review and tests!
20:03:23 <fungi> thanks everyone!
20:03:24 <anteaya> :)
20:03:30 <fungi> #endmeeting
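
Note on the utf8mb4 topic (19:11 through 19:16): the import failure fungi describes concerns characters that take four bytes in UTF-8, which MySQL's legacy 3-byte utf8 charset cannot store. The sketch below is an illustration only and is not taken from storyboard or from the changes linked in the meeting. The helper names and the sample list are hypothetical, and it assumes the "person with both hands raised in celebration" character is U+1F64C. It shows that the snowman (U+2603) fits in three bytes, while U+1F4A9 and U+1F64C need four.

```python
# Illustration only: helper names and samples are hypothetical, not storyboard code.

def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any code point lies above U+FFFF, i.e. needs four UTF-8 bytes."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Assumption: U+1F64C is the character fungi paraphrased in the meeting.
samples = {
    "snowman": "\u2603",
    "U+1f4a9 (mentioned by corvus)": "\U0001F4A9",
    "person raising both hands (assumed U+1F64C)": "\U0001F64C",
}

for name, ch in samples.items():
    print(f"{name}: U+{ord(ch):04X}, {utf8_len(ch)} bytes, "
          f"needs utf8mb4: {needs_utf8mb4(ch)}")
```

A check like needs_utf8mb4() could in principle be run over bug text before an import to spot affected stories, though the meeting does not say that any such pre-check was used.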