19:01:00 #startmeeting infra
19:01:01 Meeting started Tue Sep 29 19:01:00 2015 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:04 The meeting name has been set to 'infra'
19:01:16 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:28 #topic Announcements: reminder of Stackforge migration on October 17
19:01:36 #link http://lists.openstack.org/pipermail/openstack-infra/2015-August/003119.html
19:01:43 ohai
19:01:54 just a friendly reminder, this big rename is happening on october 17
19:02:00 o/
19:02:08 also any non-emergency rename work is postponed for now
19:02:08 o/
19:02:11 o\
19:02:28 Clint has patches, stackforge-rename is the gerrit topic I think
19:02:31 we didn't have any pending renames listed on the agenda after i cleared out the ones we already took care of
19:02:42 #topic Announcements: Mitaka Summit Planning Etherpad
19:02:51 let the games begin
19:02:52 #link http://lists.openstack.org/pipermail/openstack-infra/2015-September/003237.html
19:02:58 #link https://etherpad.openstack.org/p/infra-mitaka-summit-planning
19:03:04 mass-stackforge-renames
19:03:05 thanks pleia2 for putting that together!
19:03:11 o/
19:03:45 let's collectively brainstorm on summit topics for our allotted timeslots/rooms and then discuss them in an upcoming meeting
19:03:53 o/
19:04:01 brainstorm in the etherpad you mean?
19:04:07 yep
19:04:10 aye
19:04:22 if we have lots of ideas in there by next tuesday i'll try to set aside a good chunk of the meeting to work through prioritizing them as a group
19:04:33 if not, i'll shame you into adding more ;)
19:04:35 note: we have some shared meetup space with qa and ironic; hopefully we'll have some cross pollination with our big projects
19:04:40 O/
19:05:30 yes, i'd love for us to have one focus be infra cloud, since we can ab^H^Huse the opportunity with ironic folk to flatter them and work through challenges we might have
19:05:51 fungi: i'll bring barricades
19:05:59 fungi: ++
19:06:09 and as always, qa are our brethren, so plenty of things we can work with them on
19:06:26 #topic Announcements: jenkins-job-builder RFH and core reviewer list adjustments
19:06:35 #link http://lists.openstack.org/pipermail/openstack-infra/2015-September/003238.html
19:06:40 #link https://review.openstack.org/#/admin/groups/194,members
19:07:14 thanks to zxiiro for stepping up and helping as a core reviewer on jjb!
19:07:28 zxiiro++
19:08:07 also i've pruned some long-inactive core reviewers on jjb
19:08:27 thanks for past service
19:08:32 absolutely
19:08:42 and as always, i know the current reviewers appreciate any additional reviewing help you can spare
19:08:43 congratulations zxiiro
19:08:46 I'll forward that mail to my team, might get someone else to start reviewing
19:09:02 thanks waynr!
19:09:08 #topic Actions from last meeting
19:09:18 #link http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-09-22-19.02.html
19:09:26 nibalizer propose change adding nibalizer to infra-root
19:09:30 yolanda propose change adding yolanda to infra-root
19:09:39 where did those end up? i'd like to go ahead and approve them
19:09:44 pending from review
19:09:47 i meant to before the meeting, but ENOTIME
19:09:57 several +2, pending from more and approval
19:10:07 #link https://review.openstack.org/226526
19:10:17 #link https://review.openstack.org/226643
19:10:29 thanks, was just hunting those up
19:10:38 sorry, for joining late...
19:10:44 AJaeger: glad you are here
19:11:21 there's already a majority of root admins with +2 on that so i'll approve right after the meeting. any remaining +2 feel free to add yourselves before then
19:11:35 fungi, thx
19:11:42 jeblair automate some sort of mass contact attempt for stackforge move
19:11:49 i did that!
19:11:50 i saw positive evidence that did indeed happen
19:11:58 we're getting lots of replies too
19:12:04 seems to have helped
19:12:13 things are indeed looking a little better with the wiki now
19:12:14 yup people are aware and talking
19:12:19 the wikipage had many edits in the history
19:12:25 in the email, i said we'd put unresponsive projects on the retire list by the end of the week
19:12:34 yay deadlines
19:12:39 most awesome. to reiterate, i agree
19:12:45 is it worth sending a second wave?
19:12:53 so maybe next week, we send out another email to those and say "hey, we did this; better change the wiki page if you disagree"?
19:13:03 that's not a terrible idea
19:13:11 especially since you already have the script!
19:13:29 yep, marginal cost is low
19:13:31 and I like the wording of that
19:13:32 Clint write script to prepare project-config change for migration
19:13:33 #link https://review.openstack.org/#/c/228002/
19:13:39 timing
19:13:53 he had it all typed up there, i bet ;)
19:13:56 he did
19:14:11 comedy duo
19:14:23 please to reviewing
19:14:38 it has some parent changes which also look important
19:14:38 Clint: 228002 is just for the channels, we need similar script for zuul/layout.yaml as well
19:14:59 AJaeger: oops
19:15:06 Or at least some sed magic to replace everything - we can leave them in the "Stackforge" section
19:15:14 Yeah, that could work...
19:15:28 We also need gerrit/projects.yaml to get updated
19:15:47 it does gerrit/projects.yaml
19:15:57 and i'm not sure what needs renaming in zuul/layout.yaml
19:16:08 oh, i see
19:16:13 Clint: namespaces on the repos
19:16:13 Clint: we use "stackforge/repo" - but have it in sections.
19:16:19 got it
19:16:32 yeah, i'm less concerned about the section ordering/grouping there and more about making the rename happen
19:16:47 fungi: yeah...
19:16:54 sed should be enough...
19:17:02 Clint: so you want to work on the layout.yaml stuff too since you already have the projects.yaml and channels stuff worked out?
19:17:06 sure
19:17:23 #action Clint write script to prepare layout.yaml change for migration
19:17:28 thanks!
19:17:40 everyone review https://review.openstack.org/#/c/214207/
19:17:48 that seems to have happened and it's now merged
19:17:57 jhesketh: are the indices working as expected?
19:18:19 was that the change that jhesketh mentioned wanting to rewrite completely?
19:18:33 fungi: its easy to test let me get a link
19:19:11 http://logs.openstack.org/51/226751/14/check/nodepool-coverage/ doesn't seem to work properly
19:19:54 http://logs.openstack.org/51/226751/14/check/nodepool-coverage/d06565b/ is the path to files in swift
19:20:08 so we should get an index page with a listing of job uuid prefixes
19:20:12 but get a 404 instead
19:20:18 :(
19:20:32 anyways something to go digging in apache logs to debug
19:20:32 similarly http://logs.openstack.org/43/226643/8/check/gate-system-config-pep8/ has no index even though it should include http://logs.openstack.org/43/226643/8/check/gate-system-config-pep8/886b686/
19:21:02 #action jhesketh look into why the swift upload indices are still not appearing
19:21:15 #topic Specs approval: Translation check site spec (pleia2)
19:21:25 #link http://specs.openstack.org/openstack-infra/infra-specs/specs/translation_check_site.html
19:21:35 that got approved last week
19:21:40 thanks everyone
19:22:00 looking forward to an exciting implementation
19:22:01 yay
19:22:15 #topic Specs approval: JJB 2.0 API (waynr)
19:22:19 #link https://review.openstack.org/182542
19:22:24 * waynr beeps
19:22:37 this spec has been submitted for council vote
19:22:56 i'd like to resume work on #link https://review.openstack.org/155305 , can i get some more eyes?
19:23:14 right now it has only one rollcall vote, so it should probably collect a few more
19:23:33 fungi: okay, is there a list of folks I should ping to get eyes on?
19:23:55 fungi: would you like to set a voting timeframe?
19:24:21 waynr: they're here
19:24:37 cool
19:25:17 i just re-reviewed
19:26:01 #info voting open on Jenkins Job Builder 2.0.0 API Changes/Rewrite spec until 2015-10-01 19:00 UTC
19:26:27 #topic Specs approval: Host third-party CI monitoring dashboard (mmedvede)
19:26:39 thanks fungi, the spec proposes deploying CI Watch, can be seen running at
19:26:42 #link https://review.openstack.org/194437
19:26:56 #link Example deployment: http://ci-watch.tintri.com/project?project=nova
19:27:14 as one of infrastructure's services at ci-dashboard.openstack.org
19:27:46 the third party group has been working toward agreement on what tool infra should host for some time
19:27:47 looks like it has a couple rollcall votes but still needs a few more
19:28:00 congratulations to everyone involved for getting this far
19:28:05 #info voting open on initial CI monitoring dashboard service spec until 2015-10-01 19:00 UTC
19:28:18 third party folks are supporting this solution
19:28:18 maybe a question for another time, but how are we keeping track of this vs. openstack health?
19:28:27 I could see confusion happening
19:28:31 pleia2: it is a good question
19:28:56 should probably select naming very carefully at this stage
19:28:57 right, one is for upstream ci tracking failure rates for specific jobs if i recall correctlt
19:29:05 correctly
19:29:10 fungi: yes that is openstack health
19:29:11 * pleia2 nods
19:29:15 so potential for confusion does exist
19:29:21 do we have spec for openstack health?
19:29:27 and the spec for it is sdague's and was called ci-watch
19:29:30 qa might
19:29:39 oh, the hosting spec for it, yes
19:29:40 I also don't see how its supposed to provide additional info over what is already in gerrit. Are we looking for trends? maybe it should interpret the data then?
19:29:41 I thought it was in infra
19:29:47 yes the hosting spec
19:30:07 clarkb: it is a visualization tool
19:30:13 clarkb: openstack-health shows subunit data
19:30:14 anteaya: yes but its not showing me any new information
19:30:18 #link openstack health sample deploy: http://15.126.244.104:8001/#/
19:30:21 clarkb: For CI operators this shows a lot more data at once.
19:30:25 and really helps projects see what they need to see to track their jobs
19:30:25 anteaya: lifeless specifically talking about the third party ci thing
19:30:30 clarkb, we're just looking for a quick and easy tool
19:30:38 clarkb: oh the third party tool
19:30:46 clarkb: projects want it and developers want it
19:30:48 skylerberg: right and because of that its unreadable
19:30:55 It can be tedious to figure out when your CI started failing and to see if others are also failing from a similar problem.
19:30:56 skylerberg: anyways we can iterate on what is actually useful later
19:31:21 yeah, i'm fine with this being an experiment. its success or failure does not need to be predetermined
19:31:26 FWIW: grafana could also be used for health checks
19:31:30 clarkb: nova is using the third party ci dashboard in their meetings
19:31:37 Openstack health looks like it is more for overall status. third-parties are interested in how their CI performs, and that is what CI Watch does
19:31:40 clarkb: to track their nova drivers
19:32:24 mmedvede: right I think openstack health is more in line with what I would expect the third party ci dashboard to do. Interpret the vast quantity of data for me so I don't have to scroll horizontally multiple browser widths. But again we can hash those details out later
19:32:42 okay, further feedback to the review. if there are new concerns we can defer approving it but get your comments in by thursday
19:32:50 #link sdague's ci watch spec: https://review.openstack.org/#/c/192253/
19:32:54 fungi: thanks
19:33:01 o/ (sorry for late)
19:33:05 worst case we can always adjust the spec after it merges too
19:33:09 I would like to see if we can approve the spec as is and figure out the naming afterwards
19:33:23 it has taken over a year for the third party folks to get this far
19:33:26 not like specs are written in stone, just good to get them as close as possible in the beginning if we can
19:33:35 it would be nice if we can figure out a way to approve the spec
19:33:36 clarkb: Don't worry we are on the same page and have discussed adding pagination and summarizing results. The scrolling forever thing is a known issue :)
19:33:39 I think the spec is great as is
19:33:50 #topic Specs approval: Spec for nodepool image update retries (yolanda)
19:34:04 #link https://review.openstack.org/155305
19:34:13 (last-minute addition)
19:34:27 it has been stopped for long, i'd like to resume it these days
19:34:40 so it needs some more eyes, it's a quite simple addition
19:34:41 yolanda: does DIB make that not necessary?
19:35:02 in my experience dib builds are quite reliable due to the caching dib performs
19:35:15 it does seem like it should make it less necessary at least
19:35:20 clarkb, when i was using DIB, it was failing to me as well
19:35:25 since it's not been on the agenda before today's meeting i'll defer opening voting until next week, but this can serve as a reminder that it's ready
19:35:30 although i'm not a dib user for a long time
19:36:08 yolanda: my preference would be to not complicate image builds more right now because we are trying to switch to shade, have recently found many bugs in image building and need many more tests, and in theory DIB fixes many of the reliability problems present with snapshot builds
19:36:35 there's also an assumption here that it's important for image builds to succeed daily
19:36:38 Also uploads and such retry with shade
19:36:46 or they can and should
19:36:47 yes, not ready for voting, but needs more reviews
19:36:47 i'll add for next week agenda
19:36:47 clarkb, also, do we want to deprecate snapshots method at some point or we will let it coexist?
19:37:07 we should deprecate snapshots
19:37:12 +1
19:37:12 right, we've so far (upstream) operated on the assumption that it's nice to have daily image refreshes but jobs should be resilient to working from stale images when needed
19:37:57 i could get onboard with snapshot deprecation, though obviously that implies a major version bump in nodepool when we eventually do remove them
19:37:58 i think i'd be in favor of postponing this until we're on fully-dib and shade, so we see what our actual need is then
19:38:05 jeblair: wfm
19:38:19 ok
19:38:45 for downstream usage at least, is important to have images regenerated daily
19:39:07 really a need now, if we have some failure, we are enforced to regenerate manually
19:39:18 yolanda: that's unfortunate; i'm happy to talk with you about ways to avoid that
19:39:48 jeblair, yes, i'd like to have help on that
19:39:56 cool
19:40:09 (also our upstream jobs serve as a model for how we work around them)
19:40:14 #topic Priority Efforts
19:40:44 any urgent updates on the priority efforts front? if not, i propose we skip ahead to discussion topics since we're at 20 minutes remaining and have several on the agenda
19:40:55 #vote skip
19:41:05 mostly just looking for blockers which need some attention
19:41:13 going once...
19:41:17 twice...
19:41:25 i just want to say thanks to TW guys for the huge efforts on puppet module testing
19:41:28 #topic Policy around bash-isms vs. POSIX compliance in infra (pleia2)
19:41:35 this is actually broader than this specific thing, with more cores and the team growing in general, I don't think we have any kind of style guide or such to judge things like this
19:41:42 maybe we should?
19:42:02 pleia2: was there a usage you questioned that motivated the topic?
19:42:03 defining things like posix vs bash, preferred puppet methods, etc
19:42:05 we have never been posix and explicitly bash everywhere
19:42:15 so far i think our policy has been "don't use bashisms unless you declare #!/bin/bash in your entry-point scripts"
19:42:17 i think it's bash everywhere
19:42:23 anteaya: negative review and "fixes" switching to posix rather than bash
19:42:32 pleia2: link?
19:42:32 so it was clear to me that not everyone knows we decided on bash
19:42:34 we were entertaining adopting bash8 at some point...
19:42:41 anteaya: in the meeting agenda, I'll copy...
19:42:45 #link https://review.openstack.org/212663
19:42:45 I usually stick to posix if it's trivial, just for easier porting. This is just a question of reviewer calibration. I've added nits to ask if posix, but I don't think I've ever -1'ed.
19:42:55 thanks fungi
19:42:58 #link https://review.openstack.org/#/c/224326/1/jenkins/scripts/common_translation_update.sh
19:42:58 fungi: I had a patch that I abandoned where I did try to put in some consistency
19:42:59 dougwig: thank you for handling it that way :)
19:43:00 dougwig: so I disagree on that actually because sh isn't ever bourne shell anymore
19:43:04 bash is way more portable
19:43:14 So, if [a = b] ; instead of [a == b]
19:43:20 i would say maintain consistency with surrounding code
19:43:21 so if we want portability we should be explicit that its bash and not argue :)
19:43:37 you mean [[ ]] which would be bash vs posix
19:43:38 er not argue over the code (if it works in bash great)
19:43:40 I agree with clark
19:43:48 greghaynes: == is also bash, = is posix
19:44:01 maybe it's just this old dog being stuck on '='. no worries, not arguing. :)
19:44:06 /bin/sh can be any number of incompatible things at this point
19:44:29 * jeblair is just happy if his shell scripts run and don't delete filesystems
19:44:34 as long as the shebang is an accurate reflection of the script's interpreter requirements, i see "portability" changes as unnecessary rewrite noise for the most part
19:44:38 jeblair: ha ha ha
19:45:13 fungi: this is a stance i could agree with (i could also agree with clarkb)
19:45:38 it's easy to run checkbashisms against a script and propose "fixes" based on that, but since we don't have consensus that we should avoid using bash in our scripts (what's the portability/efficiency benefit there?) it's a waste of reviewer and developer effort
19:46:08 yeah I'm not sure what problem we are solving here
19:46:17 right I think we should just be explicit and be fine with them. I am guilty of abusing bashisms to make the code simpler and more readable fwiw so I definitely have a bias here
19:46:18 ++ to posix changes being not a useful way to spend time for everyone involved
19:46:23 for us
19:46:29 But I think that in general saying "bash is ok" is good as a result
19:46:36 wfm
19:46:39 yep
19:46:44 fungi: yep, changes just to modify unnecessary. i'd also not reject things that are "posix-ism" (maybe bash has a better/different way) if consistent with existing code
19:46:50 wfm as well....
19:46:56 so is it valuable to record decisions like this somewhere for reviewers?
19:46:58 i sometimes wonder whether people are looking for easy ways to provide benefit, so the answer is likely finding them other more productive ways to contribute
19:47:09 i don't think there's an argument here, i'm guessing you can move on. :)
19:47:14 plenty of spelling errors in devstack :)
19:47:29 pleia2: i think so
19:47:35 pleia2, a guide for proper reviews sounds like a good idea
19:47:39 thats what the first line of the script is for
19:47:56 #agree Project Infrastructure scripts don't seek to eliminate use of bash, and unsolicited POSIX-compliance changes are not a review priority
19:47:58 "this is bash" done
19:49:03 clarkb: that doesn't help if someone thinks "oh, wouldn't it be better if this weren't in bash"
19:49:10 oh, i think the keyword is "agreed" not "agree"?
19:49:16 * Clint nods.
19:49:21 fungi: agreed
19:49:26 Clint: let's chat later to see about other institutional knowledge and come up with a proposal for such a thing if we get critical mass :)
19:49:30 so anyway, anyone strongly disagree before i correct my agree?
19:49:38 pleia2: sure
19:49:57 #agreed Project Infrastructure scripts don't seek to eliminate use of bash, and unsolicited POSIX-compliance changes are not a review priority
19:50:21 #topic Fedora 22 nodes (ianw)
19:50:35 #link https://review.openstack.org/186619
19:50:39 oh this was last week, but if we have 2 minutes
19:50:49 i have fedora-minimal changes out, they're stuck in the dib queue
19:50:55 ahh, yep i see the sept 22 on these
19:50:57 i wouldn't mind if we merged this while that goes through
19:51:04 then i'll convert
19:51:45 the issue was that we want the minimal images to boot on rax
19:51:52 oh, right, i kept these topics in because they didn't get hit last week
19:52:16 right, the original fedora22 images would only boot on hpcloud
19:52:32 ianw: so you would like to start with that then convert to minimal and add rax after dib updates go in?
19:52:33 ianw: tomorrow im back to being able to work real hours
19:52:35 yep, but i'd rather us start building the images, and i can debug everything that goes wrong that !rax
19:52:49 I am fine with that
19:52:51 so ill try and review those
19:52:55 clarkb: exactly, there's so much other stuff, puppet etc
19:52:58 ianw: ya
19:53:04 let's get that shaken out
19:53:33 greghaynes: np, thanks. the fedora-minimal images are bigger than the non-minimal images :) still looking into that
19:53:42 ianw: also you had a topic up for docker use in a project's job configuration. looks like that merged. still urgent? if not i'll jump to gerrit upgrade discussion
19:53:43 hah
19:54:12 well, on docker, i mean i guess we don't care so much if jobs go ahead and install their own docker debs, etc
19:54:34 i just pointed out to the owner of the job that it's very possible we'll do something to the underlying platform that totally breaks it though
19:54:41 ideally they do and we cache the packages in images
19:54:48 and it's going to be a super-fragile job with network, imo
19:54:48 i think we care more as they start to need to use dockerhub (maybe we need a mirror) or are part of official openstack projects
19:55:28 so yeah, just a heads up that i think this has started in the wild, and it may come back to us to do a more centralised thing (mirrors, caching, etc) in the future
19:55:35 thanks ianw!
19:55:41 #topic Gerrit upgrade (zaro, anteaya)
19:55:45 #link https://etherpad.openstack.org/p/I0eV4IgkZS
19:55:56 just a quick check-in on current gerrit upgrade status
19:55:59 so zaro has been working hard on the gerrit upgrade
19:56:16 and as we tried to outline on the etherpad he is currently blocked
19:56:26 so i think the questions is are there any ideas on how we can move this forward?
19:56:30 so we thought sharing that info would be a good start to figuring out next steps
19:56:43 basically we are a unique snowflake
19:57:00 and unless we fix the upstream thing ourselves zaro feels it won't get fixed
19:57:21 so the summary is that most gerrit users just up the timeout and hope they don't still hit the issue, jgit devs are disinclined to view it as a bug?
19:57:59 zaro: gerrit and android and cyanogenmod are all hosted publicly as well
19:58:04 zaro: do they not care about this bug?
19:58:07 well, it's definitely a bug. the issue is whether it's a gerrit bug or a jgit bug
19:58:13 also jgit itself is publicly hosted too
19:58:36 gerrit and android use a database-backed rather than file-backed repo solution though right? or was that disinformation?
19:58:54 that is correct
19:58:55 (and yes, i know filesystems are technically databases)
19:59:00 good points. so maybe i'm way off. but just seems like nobody wants to touch it.
19:59:09 BUT my understanding of the bug is its thread contention of the same object in gerrit/jgit
19:59:14 so would be surprised if that fixes it for them
19:59:14 if we look to 2.11 is it fixed there?
19:59:26 or is this issue just going to follow future releases?
19:59:35 well, we're down to the last few seconds for the meeting, but maybe try to put our heads together in #openstack-infra and regroup
19:59:44 thanks for the time
19:59:53 thanks for bringing it up!
19:59:54 it's a bug in 2.11
20:00:08 it's a bug in master
20:00:11 thanks for attending, everyone!
20:00:16 #endmeeting