19:03:22 <fungi> #startmeeting infra
19:03:23 <openstack> Meeting started Tue Jan 12 19:03:22 2016 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:27 <openstack> The meeting name has been set to 'infra'
19:03:31 <nibalizer> o/
19:03:32 <craige> o/
19:03:51 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:03:53 <yolanda> o/
19:04:02 <fungi> #topic Announcements
19:04:14 <zaro> o/
19:04:34 <fungi> i didn't have any specific announcements at this point, anything i need to mention for posterity which we won't cover as part of the agenda?
19:04:46 <SotK> o/
19:05:08 <fungi> #topic Actions from last meeting
19:05:11 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-01-05-19.05.html
19:05:13 <fungi> there were none, completed successfully
19:05:26 <fungi> #topic Specs approval
19:05:37 <krotscheck> o/
19:06:07 <fungi> #info "Consolidation of docs jobs" specification voting is deferred and placed back into an in-progress state due to requested revisions
19:06:16 <fungi> #link https://review.openstack.org/246550
19:06:18 <AJaeger> fungi, yep. I want to enhance a bit more but currently concentrate on the translation work since that's more urgent.
19:06:30 <fungi> there were no approvals or new proposed specs for voting this week
19:06:50 <fungi> #topic Priority Efforts: Zuul v3
19:07:01 <jeblair> jhesketh timrc pabelanger anteaya Clint mordred GheRivero zaro yolanda rcarrillocruz SpamapS fungi fbo cschwede greghaynes nibalizer fdegir: ping!
19:07:09 * Clint twitches.
19:07:13 <jeblair> you all put your name on an etherpad in vancouver to help with zuulv3
19:07:16 <jhesketh> pong!
19:07:19 <anteaya> I did
19:07:21 <yolanda> yes
19:07:22 <Clint> yup
19:07:23 <fungi> yep
19:07:25 <nibalizer> yep
19:07:25 <anteaya> I have reviewed some patches
19:07:27 <jeblair> and, as we all know, you can't edit an etherpad after the fact.
19:07:33 <Clint> true
19:07:35 <fungi> totally impossible
19:07:35 <anteaya> that is the law
19:07:37 <jeblair> so you're stuck with it.
19:08:04 <fungi> is there a list of next steps we can get more specific about volunteering to tackle now?
19:08:11 <jeblair> anyway... i have some really bad code that very roughly sketches out the bulk of the major changes in zuulv3
19:08:28 <jeblair> and in the spec, there are a few major work items that i think can proceed in parallel: http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#work-items
19:08:36 <fbo> pong - yes
19:09:12 <jeblair> i think we can take my very bad code and land it on the v3 branch
19:09:24 <jeblair> and then start to make it better with some parallel efforts
19:09:27 <jeblair> probably not too many yet
19:09:40 <anteaya> jeblair: the gerrit topic you are using is feature/zuulv3 (v3) yes?
19:10:04 <fungi> it's actually a branch
19:10:11 <jeblair> but i think that a) nodepool, b) job definition and playbook loading/organization, c) ansible launching are all major areas that can happen without too much toe stepping
19:10:16 <pabelanger> jeblair: indeed!
19:10:27 <anteaya> sorry that is the branch yes: https://review.openstack.org/#/q/status:open+branch:feature/zuulv3+topic:v3
19:10:33 <jhesketh> yes, I've been landing some of jeblair's code.. There are some very large patchsets so I haven't finished reviewing them all, but the strategy was to land them and then create followups where necessary
19:10:42 <clarkb> re nodepool, current plan is to get image builder changes in tomorrow and have it running for thursday image builds
19:10:51 <greghaynes> ohai
19:11:00 <clarkb> which is sort of ancillary to the zuulv3-related work
19:11:03 <pabelanger> clarkb: nice, ready to start using them
19:11:04 <jeblair> cool, so we can probably branch nodepool next week for work on (a)
19:11:19 <greghaynes> Ya, hopefully
19:11:24 <fungi> the udea was to branch nodepool after the image worker stuff is in then?
19:11:27 <greghaynes> third time is the charm
19:11:27 <fungi> er, idea
19:11:33 <clarkb> fungi: yes
19:11:37 <fungi> righteous
19:11:51 <jeblair> does anyone want to start hacking on any of those things? i'm thinking i'd like to take (b).. anyone want (a) or (c) ?
19:12:19 <jhesketh> I'm happy to take a look at c)
19:12:23 <jeblair> or did someone have their heart set on (b)? :)
19:12:42 <jhesketh> (also happy for (b) if there are no other takers)
19:13:01 <anteaya> I will continue to open patches and stare at them, saying I did so, and asking questions if I can
19:13:11 <mordred> we want to branch nodepool before we land the shade patches?
19:13:45 <jeblair> maybe we should land shade?
19:13:51 <jhesketh> (err, (a) I meant)
19:14:09 <jeblair> mordred: how ready is shade?
19:14:21 <fungi> so we're mostly in need of a volunteer for (a) so that jhesketh doesn't need to work on both (a) and (c)
19:14:35 <mordred> jeblair: shade patch I've been ignoring until the image worker patches land
19:14:38 <jeblair> yeah, the nodepool work is fairly self-contained
19:14:38 <fungi> specifically "Modify nodepool to support new allocation and distribution"
19:14:44 <mordred> jeblair: largely just needs to be rebased once the codebase is no longer churning
19:15:01 <yolanda> sorry, i cannot commit to that in the next few weeks, but i'll be happy to review, and help in the future
19:15:09 <clarkb> mordred: don't we also need to get to the bottom of the larger api call counts?
19:15:15 <clarkb> mordred: or was that figured out?
19:15:17 <mordred> jeblair: how about I take a quick stab at the rebase after the image builders land, and if it's big, we can shelve it until the v3 timeframe
19:15:24 <mordred> clarkb: figured out
19:15:30 <clarkb> yay
19:15:38 <jeblair> mordred: sounds like a plan.
19:15:45 <mordred> \o/
19:16:02 <jeblair> so, no volunteers yet to actually do the nodepool v3 work?
19:16:13 <mordred> oh, I'll definitely work on it
19:16:18 <fungi> my pipeline has backed up a lot since the summit, so while i'm willing to take a stab at item (a) i'm hoping someone else steps up
19:16:44 * mordred actively wants to work on it
19:16:53 <fungi> all yours, mordred!
19:16:58 <nibalizer> jeblair: i want to work on a) but I'm worried about overcommitting
19:17:07 <jeblair> cool, i'll propose a change to the spec to add these
19:17:20 <fungi> nibalizer: overcommitting is a time-honored infra tradition
19:17:45 <jeblair> nibalizer: i'm hoping that as these larger things take shape we can start having more people chip in on less all-encompassing tasks
19:18:42 <fungi> i expect some somewhat tightly scoped subtasks to fall out of these larger tasks as well
19:18:50 <jeblair> jhesketh: i've learned some things this very day about ansible, so we'll chat later
19:19:06 <jhesketh> jeblair: sounds good
19:19:23 <jeblair> fungi: i think that's good for now
19:19:25 <jhesketh> (I'd also like to chat about a few zuul things after this)
19:19:41 <fungi> thanks jeblair, mordred, jhesketh!
19:19:55 <fungi> #topic Turning off HPCloud node provider(s) gracefully (clarkb)
19:20:04 <fungi> this was held over from last week's agenda
19:20:22 <fungi> we chatted a little after the meeting about plans, but it would help to echo the nuggets of that in the meeting
19:20:42 <clarkb> sure, the basic plan was to not ease off of it and instead just turn it off at some point before the 31st
19:20:59 <fungi> i believe we decided some time shortly before the 31st we should gracefully shift from current quotas to 0 in nodepool
19:21:01 <clarkb> that way we can control when it happens and don't run into funniness with nodepool operations (say 500 tests all turning off at the same time like aderaan)
19:21:24 <jeblair> clarkb: a great disturbance indeed
19:21:32 <pleia2> (alderaan)
19:21:35 <fungi> jeblair beat me to the quote
19:21:48 <clarkb> the changes to do this have been written, there are two portions, the first is to set our max-servers to -1 so that we stop running jobs, then the second removes the configuration from nodepool.yaml
19:21:51 <mordred> pleia2: did your IRC client just show you a picture of alderaan?
19:21:51 <pleia2> makes sense, and I spoke with some HPE folks last week and they were asking about our plans
19:21:58 <craige> +1 pleia2
19:22:01 <clarkb> let me get change links
19:22:17 <clarkb> #link https://review.openstack.org/#/c/264371/
19:22:19 <clarkb> that one and its child
19:22:24 <jeblair> it might be good to push this back as far as we feel comfortable though...
19:22:28 <fungi> it would help to get confirmation from hpe that they don't intend to pull that rug out from under us before the 31st
19:22:39 <pleia2> fungi: no, the 31st it is
19:22:45 <yolanda> we will be fine until the 31st
19:22:47 <fungi> okay, awesome
19:23:02 <jeblair> so that a) we have more opportunity for other quota to magically show up, and b) do we have some image types that are hpcloud only?
19:23:10 <anteaya> the 31st is a Sunday
19:23:11 <fungi> so given that's a sunday we likely want to do it on a day people are more likely to be around to pull the trigger
19:23:12 <pleia2> jeblair: ++
19:23:17 <clarkb> jeblair: there is one that is only in hpcloud
19:23:28 <clarkb> and I actually need to finish detangling that (or someone does)
19:23:32 <clarkb> it is devstack-centos7 iirc
19:23:51 <clarkb> yup
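
[A minimal sketch of the two-step turn-down clarkb describes above, assuming the nodepool.yaml provider layout of the period; the provider names and numbers here are illustrative, not the contents of the change under review:]

    providers:
      - name: hpcloud-b1
        # first change: stop allocating new nodes here by dropping the
        # quota to -1, so no new jobs run against hpcloud
        max-servers: -1
      - name: rax-dfw
        max-servers: 150
    # second change (the child of 264371): remove the hpcloud provider
    # stanzas, and any hpcloud-only images such as devstack-centos7,
    # from this file entirely
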
19:23:54 <fungi> ianw: were there issues getting that booting in other providers?
19:24:03 <anteaya> I'm at nova mid-cycle the last week of Jan, so won't be around to answer the phone
19:24:14 <clarkb> so I may need to split the 3 changes into 2 changes with devstack-centos7 handled first, then max-servers = -1, then remove hpcloud from config
19:24:22 <clarkb> er 2 into 3
19:24:24 <jeblair> clarkb: ++
19:24:24 <clarkb> I can maths
19:24:39 <ianw> fungi: i haven't tried it on the other providers yet. it's on my new-year todo list
19:24:40 <fungi> clarkb and i aren't really around that week leading up to the 31st either
19:24:57 <clarkb> right, traveling sunday through thurs
19:24:57 <anteaya> fewer phone answerers
19:25:02 <fungi> ianw: are there any voting jobs depending on it? i assume no?
19:25:02 <ianw> given past experience, my inclination is that it will not work
19:25:24 <ianw> should not be voting jobs
19:25:26 <pleia2> yeah, I'm leaving for LCA on the 27th
19:25:29 <clarkb> ianw: rax is the only provider with funny lack of dhcp, we may be able to get it working on the other providers with dib
19:25:41 <ianw> clarkb: yep, that's the plan
19:25:42 <fungi> i won't be home until mid-day on friday the 29th but could approve changes that afternoon
19:25:46 <clarkb> ianw: kk
19:26:12 <clarkb> there is also a non-zero possibility that I may be buying a house which makes everything extra crazy
19:26:17 <anteaya> so who _is_ available the 28th/29th to answer the channel and calm and disgruntled masses?
19:26:25 <anteaya> s/and/any
19:26:28 <jhesketh> I should be around until the 31st then off to LCA (but still available)
19:26:35 <anteaya> thanks
19:26:36 <fungi> i mean, i _could_ do it from the plane or a hotel before the afternoon of the 29th but would rather not commit to that
19:26:37 <anteaya> one
19:26:44 <anteaya> fungi: yup
19:26:46 * krotscheck is probably not available
19:26:47 <jhesketh> clearly a different timezone to the masses though
19:26:57 <AJaeger> I might be around but can't commit yet either ;(
19:27:04 <anteaya> jhesketh: we'll take what we can get
19:27:20 <jhesketh> or alternatively we move it a few days earlier
19:27:26 <anteaya> jhesketh: ++
19:27:27 * yolanda will not be available, traveling
19:27:36 <jeblair> i expect to be around then
19:27:36 * craige will already be at LCA but you can ping me to help, jhesketh
19:27:53 <anteaya> craige: you'd need to be here to help
19:27:57 <anteaya> jeblair: yay two
19:28:29 <fungi> so as for what to expect, we're going to roughly halve our current capacity. if that coincides with some major problems in another provider (particularly rackspace) then that could be pretty terrible. otherwise it's likely we'll just be backed up a bit more than usual at peak times
19:29:43 <clarkb> right, we also need to work on getting unittests running on the new providers
19:30:10 <clarkb> which is the bindep and pre-test setup related work
19:31:04 <fungi> yeah, those macros should in theory be ready to get added to jobs/job-templates now, but more testing would help
19:31:58 <fungi> the trick is that we need to reorder builders a little bit because bindep's ability to read a project-supplied list of package names means that repo cloning needs to happen prior to revoking sudo
19:32:23 <clarkb> though we could flip to bindep with the global list to start
19:32:26 <fungi> right now our jobs which revoke sudo do so before cloning the repo
19:32:35 <clarkb> then reorder; either way we have options, and much of the work is done, we just need to get it in place
19:32:59 <fungi> yeah, it should just work and ignore repo-supplied package lists in that case
19:33:09 <fungi> rather, ignore the fact that it can't find any
19:33:22 <clarkb> I am not sure I have time to take that on this week, but can attempt next week
19:33:27 <jeblair> i don't think the reorder should be problematic
19:33:43 <fungi> the database setup and bindep macros should just be a no-op on bare-.* workers so could be added now
19:34:28 <fungi> does anybody have time to hack on that a little between now and the end of the month?
19:34:53 <fungi> i can help but am wary of promising to have available time to do it all before then
19:35:55 <ianw> i think i can get involved with that
19:36:22 <fungi> ianw: awesome--it would actually help to have more rh-oriented insight on it too
19:36:35 <ianw> as mentioned, i'll be keeping on top of the general rpm distro side of things throughout
19:37:23 <fungi> i'll get up with you after the meeting on that
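
[A minimal sketch of the ordering constraint fungi describes: bindep reads a package list out of the cloned repo, so the clone has to happen while the job still has sudo to install anything missing. The file below follows the other-requirements.txt format bindep consumes; the package names and the exact invocation are illustrative assumptions:]

    # other-requirements.txt in the project repo: one distro package
    # per line, with optional platform/profile selectors in brackets
    libffi-dev [platform:dpkg]
    libffi-devel [platform:rpm]
    mysql-client [platform:dpkg test]

    # builder ordering (sketch): clone the repo first, install whatever
    # the "test" profile reports as missing, and only then revoke sudo,
    # e.g. something like:
    #   bindep -b test | xargs sudo apt-get install -y
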
19:37:43 <fungi> so did we decide to merge the hpcloud turn-down changes on friday the 29th?
19:38:28 * craige thought we did.
19:38:31 <fungi> that's the soonest i could do it that week, but i didn't see anybody else volunteer
19:39:03 <anteaya> I won't be here to respond to questions so I don't have an opinion
19:39:45 <clarkb> 29th sounds good
19:39:52 <fungi> #agreed HPCloud nodepool quotas will be set to 0 on some time on Friday, January 29th in preparation for their public cloud sunset on the 31st
19:39:54 <clarkb> gives us a couple of additional days to work through anything funny that may come up
19:40:19 <fungi> meh, my grammar was terrible on that but not so terrible that i'm going to fix it
19:40:35 <fungi> okay, so 3 more topics in the next 20 minutes
19:40:51 <fungi> er, 4
19:40:58 <fungi> #topic puppetlabs-apache migration (pabelanger)
19:41:08 <fungi> #link https://review.openstack.org/205596
19:41:17 <fungi> pabelanger: how's this going?
19:42:59 <fungi> or does anybody else know why he wanted to discuss it in the meeting?
19:43:26 <Clint> *crickets*
19:43:31 <fungi> i guess we can come back to it after the other topics if he returns and there's still time
19:43:32 <anteaya> I do not know
19:43:39 <fungi> #topic clarifying requirements for moving release tools into project-config repository (dhellmann)
19:43:44 <dhellmann> hi!
19:43:50 <anteaya> dhellmann: hi
19:43:50 <fungi> howdy
19:44:02 <dhellmann> so last week during the discussion of release automation it was mentioned that the scripts would need to move as part of that work
19:44:16 <dhellmann> I interpreted that as them needing to move into project-config, though that may not be a valid interpretation
19:44:29 <dhellmann> so I'm looking for clarification on why and where, to make sure I can line up that work
19:44:55 <dhellmann> in fact, now that I think harder, I think fungi said the jenkins slave scripts directory?
19:44:57 <anteaya> dhellmann: sorry if this is obvious, which release tools?
19:45:06 <fungi> dhellmann: basically any scripts that run on the release.slave or signing.slave hosts will need to be in project-config's jenkins/scripts directory
19:45:22 <anteaya> fungi: ah thanks now I understand
19:45:23 <dhellmann> anteaya : good question. There are a bunch of tools in openstack-infra/release-tools. Some shell, some python. They are all needed as part of the release tagging and publishing process.
19:45:41 <pabelanger> sorry
19:45:46 <jeblair> 19:50:16 <fungi> dhellmann: i think we discussed a while back that it might need to move to project-config jenkins/scripts directory instead
19:45:49 <jeblair> http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-01-05-19.05.log.html
19:45:52 <dhellmann> fungi : ok, so that's where. I want to understand why, because that's going to make maintaining them much more inconvenient for us.
19:45:52 <anteaya> dhellmann: sounds like scripts that run on release or signing slaves are the ones that need to be moved, yes?
19:45:57 <pabelanger> fungi: got distracted, will loop back in open topic
19:46:13 <dhellmann> not that I'm objecting, just making sure I fully understand
19:46:30 <dhellmann> anteaya : that will be enough of them that we might as well move them all.
19:46:35 <anteaya> oh okay
19:46:44 <fungi> dhellmann: mostly so that the infra-core and project-config-core reviewers have a chance to audit them to make sure they hopefully don't expose the private key material or other credentials we have secured on those hosts
19:47:11 <anteaya> so do scripts that don't run on release and signing slaves go in jenkins/scripts as well (this is for the rest of infra)?
19:47:20 <dhellmann> fungi : ok. Since they are a mix of python and bash, how do we manage the installation on the nodes? is it ok for them to pip install things, for example?
19:47:40 <AJaeger> dhellmann: in a virtual environment that they set up
19:47:48 <fungi> dhellmann: we do pip install some things from pypi if that's what you're asking
19:47:53 <dhellmann> AJaeger : sure, they use a virtualenv now
19:48:04 <fungi> though actually now that i think about it, no we don't any longer (twine was an exception)
19:48:19 <fungi> we now install distro packages of twine on release.slave
19:48:23 <dhellmann> fungi : yeah, the scripts look for a virtualenv and create it if they need to, but it's not clear if that goes against security policies to do that at runtime vs. image build time or something
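
[A minimal shell sketch of the create-on-demand virtualenv pattern dhellmann describes here; the path is hypothetical, and pyyaml and launchpadlib are the dependencies he names just below:]

    # reuse an existing virtualenv if a previous run left one behind,
    # otherwise create it and install the tools' python dependencies
    if [ ! -d .venv ]; then
        virtualenv .venv
        .venv/bin/pip install pyyaml launchpadlib
    fi
    . .venv/bin/activate
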
19:48:49 <pabelanger> fungi: will have to get back to it next week, need to step away from computer
19:48:59 <dhellmann> in particular, the script that figures out which releases were requested is python and needs pyyaml and the script that updates launchpad with comments after a release needs some launchpad libraries
19:49:07 <fungi> dhellmann: we try to make sure any dependencies are installed on the machine in its system context rather than using a virtualenv
19:49:13 <dhellmann> there are likely to be others, those are the big ones I can think of off the top of my head
19:49:18 <fungi> ideally from distro packages of those python libraries
19:49:18 <jeblair> both of those are in ubuntu
19:49:35 <dhellmann> ok, so I'll get the rest of the list of requirements
19:49:46 <dhellmann> and I guess we can change the script to not use a virtualenv and require that those things be installed
19:49:55 <dhellmann> however, some of the scripts themselves are installed as console scripts, too
19:50:00 <dhellmann> I guess that will need to change
19:50:06 <jeblair> the desire for packages is more about a desire for stability of jobs that run in the post or release pipelines than security, per se
19:50:16 <dhellmann> ok, that makes sense, too
19:50:28 <fungi> yeah, avoids shifting dependencies
19:50:44 <dhellmann> and how do I express the dependencies on system packages for those scripts?
19:51:18 <fungi> dhellmann: puppet
19:51:21 <fungi> #link http://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/release_slave.pp
19:51:36 <dhellmann> ok
19:51:50 <fungi> you can see there where it's installing the "twine" package
19:51:56 <fungi> for example
19:52:10 <fungi> also python-wheel
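
[A minimal sketch of how such a dependency is expressed in the release_slave.pp manifest fungi links above; twine and python-wheel are the two packages the log confirms, anything beyond those would be an assumption:]

    # install release-job dependencies as distro packages rather than
    # via pip or a virtualenv, keeping post/release pipeline jobs stable
    package { 'twine':
      ensure => present,
    }
    package { 'python-wheel':
      ensure => present,
    }
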
19:52:28 <fungi> other questions about this?
19:52:57 <dhellmann> fungi : yes, can ttx and I get +2 approval on that part of the project-config tree?
19:53:30 <AJaeger> dhellmann: I promise to give your changes priority in reviewing...
19:53:39 <fungi> dhellmann: not really without agreeing to review project-config changes in general. are the scripts that complex?
19:53:54 <fungi> and going to experience frequent changes?
19:54:04 <ttx> they tend to change often
19:54:14 <dhellmann> fungi : it's hard to say. if I have to move the code, I would like to move the review team with it, though.
19:54:20 <ttx> but maybe that's just a transition phase
19:54:50 <AJaeger> I suggest we see how this turns out and then discuss again if needed
19:54:56 <dhellmann> fungi : some of the python is a little twisty, but the tagging stuff is pretty simple
19:55:06 <anteaya> dhellmann, ttx: if one of you proposes something, as long as the other +1's it, AJaeger and I will give it priority
19:55:16 <jeblair> the jenkins slave scripts directory isn't a great place for things that need to be updated frequently -- since they are installed on images, they can take weeks to actually be updated.
19:55:31 <dhellmann> yeah, that's not optimal in this case
19:55:31 <fungi> it seems like it's probably similar to some of the other scripts we have for, e.g., proposing requirements changes or translations updates from a complexity perspective
19:56:24 <ttx> jeblair: I fear "urgent" release tools updates as we discover problems in a release and need to fix something in the next one coming in 10 min
19:56:27 <dhellmann> fungi : there's a lot of launchpad and release notes stuff in here, too
19:56:28 <fungi> true, not the case at the moment for the release and signing slaves, but in zuul/nodepool v3 those become dynamically-generated workers
19:56:38 <dhellmann> in fact, reno is needed and that's probably not available in a system package yet
19:57:14 <fungi> i'm worried that this topic is going to overrun the remainder of the meeting (and is probably not something we're going to be able to decide in the next 3 minutes)
19:57:22 <dhellmann> yeah, I agree
19:57:29 <ttx> yeah, we should move that off-meeting
19:58:10 <fungi> yolanda: zaro: are your topics urgent enough that they need to get covered in the next minute?
19:58:10 <dhellmann> fungi: I have the tc meeting next, so I'll come back to -infra tomorrow and ping you to continue?
19:58:27 <yolanda> fungi, not from my side, i'm progressing on several areas of infra cloud
19:58:52 <fungi> dhellmann: yeah, it's likely a large-ish change both to the complexity of what we expect to run in release jobs and to the teams expected to review the tooling, so tomorrow would be good to continue
19:58:53 <zaro> fungi: no
19:59:25 <fungi> dhellmann: and we should try to get input from more project-config-core and infra-root people
19:59:51 <dhellmann> fungi : sounds good
19:59:54 <fungi> okay, i'm going to defer pabelanger, yolanda and zaro's topics to next week and skip open discussion
19:59:56 <fungi> thanks everyone!
19:59:59 <fungi> #endmeeting