19:01:22 #startmeeting tripleo
19:01:23 Meeting started Tue Mar 18 19:01:22 2014 UTC and is due to finish in 60 minutes. The chair is SpamapS. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:24 hi hi
19:01:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:27 The meeting name has been set to 'tripleo'
19:01:27 hello
19:01:28 o/
19:01:47 o/
19:01:54 hello
19:02:02 #link https://wiki.openstack.org/wiki/Meetings/TripleO
19:02:05 Morning
19:02:18 #topic bugs
19:02:22 o/
19:02:25 o/
19:03:01 #link https://bugs.launchpad.net/tripleo/
19:03:01 #link https://bugs.launchpad.net/diskimage-builder/
19:03:01 #link https://bugs.launchpad.net/os-refresh-config
19:03:01 #link https://bugs.launchpad.net/os-apply-config
19:03:01 #link https://bugs.launchpad.net/os-collect-config
19:03:03 #link https://bugs.launchpad.net/tuskar
19:03:06 #link https://bugs.launchpad.net/python-tuskarclient
19:03:37 still swimming in red
19:04:21 o/
19:04:31 https://bugs.launchpad.net/tripleo/+bug/1270646
19:04:32 no assignee
19:04:41 "PMTUd / DF broken in GRE tunnel configuration"
19:05:05 Does anyone have time or want to dive into that?
19:05:26 dprince: am I to believe that you don't actually see this problem on the RH CI region?
19:05:48 i see it on baremetal all the time
19:05:49 SpamapS: Not yet.
19:05:57 i've been asking neutron folk to look at it
19:06:05 SpamapS: I'm hitting a problem much earlier than that.
19:06:14 I can dig in it
19:06:24 so far the only response has been that, yes, you must lower mtu to at least 1458 in the vm
19:06:27 ahh.. can't tell if the fuel injectors are broken if the starter is busted right? ;)
19:06:49 slagle: ugh
19:07:17 in fact, comment 13 says to do that
19:07:24 Right, but no actual solution.
19:07:26 i don't think that's a "workaround", i think it's the requirement
19:07:26 just a workaround
19:07:30 why?
19:07:35 you have to account for the gre overhead
19:07:43 (that's what i'm being told anyway)
19:07:52 slagle: IIRC, the problem is that something is setting DF
19:07:57 openstack install guide says to do the same thing
19:09:08 http://docs.openstack.org/admin-guide-cloud/content/openvswitch_plugin.html
19:09:13 actually says to use 1400
19:09:21 well that may just be the common workaround. :-P
19:09:38 maybe
19:09:41 the root cause is that most of public routers, block the ICMP messages with the PMTU
19:09:48 My reading was that we're not seeing the icmp messages to tell us something was fragmented
19:09:51 we need someone in neutron to care about this :)
19:10:11 indeed, ok, so we need somebody to care..
19:10:15 And the neutron team's response was that you can't rely on those because of what GheRivero said
19:10:25 no neutron/host do not realize about the limitations
19:10:39 GheRivero: so you'll assign yourself and spend a little time making sure it should still be a critical and/or getting MTU set lower on vms?
19:10:51 SpamapS: sure
19:10:58 GheRivero: my hero. :)
19:11:08 https://bugs.launchpad.net/tripleo/+bug/1271344
19:11:20 "neutron-dhcp-agent doesn't hand out leases for recently used addresses"
19:11:26 I haven't run into this in a while
19:13:06 Seems to me like it is really blocked on Neutron and we can't do much to mitigate
19:13:34 so I'll leave it at that
19:14:09 no derekh so I'll skip bug 1272803
19:14:19 https://bugs.launchpad.net/tripleo/+bug/1272969
19:14:27 "dhcp-all-interfaces interactions breaks bridged configurations"
19:14:29 dprince: ^^ ?
19:15:17 SpamapS: That is the one you put a -2 on sir
19:15:22 SpamapS: for good reason :)
19:15:32 dprince: indeed, I rechecked it this morning..
19:15:57 SpamapS: My fix works on Fedora, I fixed the seed on Ubuntu, but still seem to get Undercloud/Overcloud failures w/ Ubuntu.
19:16:04 dprince: anyway, if it is still just wending its way through our twisty review process.. we can move on.
19:16:37 SpamapS: yes, I think so. Also, I proposed a session for the dev summit around core networking stuff
19:17:27 dprince: cool, I want to also tie into that the discussion about how to make sure when we reboot we configure os-collect-config properly as well... they are related
19:17:46 https://bugs.launchpad.net/tripleo/+bug/1285269 seems to be a duplicate of 1289582, marking as such
19:18:43 https://bugs.launchpad.net/tripleo/+bug/1287453 is the heat domain users thing.. we're making incremental progress on that I think.
19:18:47 slagle: concur?
19:20:34 anyway, more bugs..
19:21:21 no assignee for https://bugs.launchpad.net/tripleo/+bug/1290486 .. it probably could use a new pair of eyes to try and verify it.
19:21:46 SpamapS: yes, i believe the patch just needs another core review
19:22:25 also https://bugs.launchpad.net/tripleo/+bug/1290490 looks like it may need to be verified.. may have been transient problems
19:22:37 4 more..
19:23:03 slagle: https://bugs.launchpad.net/tripleo/+bug/1291011 ... you're assigned.. isn't that a dupe?
19:23:22 log is lost so we'll never know.. :-P
19:23:45 slagle: I think we need to close that as Invalid .. I seem to recall that one breaking CI for a day and then getting fixed upstream.
19:24:16 greghaynes: https://bugs.launchpad.net/tripleo/+bug/1291060 .. I pinged you earlier.. you'll pick this back up yes?
19:24:31 yep, hopefully this afternoon
19:24:39 I'm happy to start looking at +1290486 (dhcp agent not serving)
19:24:51 It should be a good learning exercise
19:25:08 SpamapS: not sure it's a dupe, but i'll close it, since we're not seeing it anymore
19:25:28 this one is ongoing and causes headaches a lot ... https://bugs.launchpad.net/tripleo/+bug/1292141
19:25:42 I think derekh is looking into it.. but he's not here to claim the bug so I'll leave it alone for now
19:26:07 and https://bugs.launchpad.net/tripleo/+bug/1293782 has an assignee.. StevenK seems to be on it
19:26:21 two untriaged bugs https://bugs.launchpad.net/tripleo/+bugs?search=Search&field.status=New
19:26:25 please attack those :)
19:26:29 ^^ (fwiw on the 1292141, I'm using checkspelling on and haven't hit this since)
19:27:02 jang1: interesting
19:27:34 the case semi-sensitivity is "an interesting choice" :-/
19:28:04 jang1: I didn't look closely in the comments, but if you haven't added your findings there it would be most helpful.
19:28:37 I'm not sure how checkspelling works; is it possible that it's implicitly trying to get the file a second time and succeeding that time?
19:28:55 any other bugs people want to discuss? The other bits seem to be under control and have a healthy bug collection relative to their size and complexity.
19:29:11 tchaypo: that's what I wonder too. :-P
19:29:37 checkspelling generates a redirect if the original file isn't there and a nearby one is.
19:30:08 ah that sounds .. diabolical. :-P
19:30:24 Although the errors seem to specifically be on pypi.openstack.org - I'm guessing you must have your own mirror, maybe that's inherently more stable?
19:30:46 Do your logs show spelling doing its thing and issuing redirects?
19:31:04 yup.
19:31:20 Actually we should probably take this offline and let the meeting continue
19:31:23 also to the having a local mirror. There was truly no alternative.
19:31:24 Aye
19:31:28 (agreed)
19:31:32 #topic reviews
19:31:48 http://www.stackalytics.com/report/reviews/tripleo-group/open
19:34:07 #link http://www.stackalytics.com/report/reviews/tripleo-group/open
19:34:10 #link http://stackalytics.com/report/contribution/tripleo-group/30
19:34:14 #link http://stackalytics.com/report/contribution/tripleo-group/90
19:34:18 there we go
19:34:21 so russellb's stats are I think down
19:34:25 or were last week anyway
19:34:28 * SpamapS updates wiki
19:34:43 up again now
19:34:57 \o/
19:35:53 russellb: the only thing that stays the same is everything changes :-P
19:36:13 * slagle wonders what "Emails" is in the stackalytics tables
19:36:19 # http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:36:25 russellb: a real statsman you are
19:36:40 whatever it is, i'm aiming to keep my 0 of them
19:36:42 slagle: hopefully something that is sorted ascending.. as in.. the winner is the person who sends 0 :)
19:36:56 anyway, we are not doing as bad as we were
19:37:03 but there are 89 reviews waiting on reviewers
19:37:42 note that there are a bunch of reviews in t-i-e that are for glance changes that seem to have been sort of abandoned by the submitters
19:38:29 I think we can probably run through and mark them all WIP since they are mostly dependent on OUTDATED patches
19:38:42 That should be part of our normal weeding process.
19:39:35 +1 to mark them WIP
19:39:47 anyway, on a related note.. CI _should_ be working, but please always wait for recent check passes before +A
19:40:08 and check in #tripleo if you are seeing spurious failures.
19:40:21 anybody else have more on reviews?
19:40:46 #topic Projects needing releases
19:40:56 not a problem, I can take this
19:41:00 woot
19:41:08 rpodolyaka1: my release tag pushing hero :)
19:41:12 :)
19:41:18 #topic CD cloud status
19:41:21 :(
19:41:33 Basically Heat is so broken we can't keep the stack alive.
19:41:38 so I've given up
19:41:55 in what manner is it broken?
19:41:59 just generally all over?
19:42:14 This is, to me, a critical bug.. but a strong argument has been made that so is the lack of graceful failure notifications.. and the latter unblocks people working on HA
19:42:28 tchaypo: no, specifically, if a stack-update fails, the stack must be deleted
19:42:45 tchaypo: which means servers turned off, data deleted, generally "not a good thing for a running cloud"
19:43:17 excitement
19:43:18 so basically, our CD cloud is dead until we can recover from a failed stack-update
19:43:35 Or until Heat promises 0% failure rate for all OpenStack apis.
19:43:55 #topic CI
19:44:07 CI is working.. or was when the meeting started.
19:44:20 We were broken by Neutron for about 24 hours or so
19:44:28 a 0% failure rate sounds difficult to achieve. is there anything we can do to make the rebuild easier?
19:44:49 tchaypo: it is impossible to achieve. That was snark. :)
19:44:54 say, some way of making the server do pxe-boot on next startup and be treated as though it's a new blank machine ready for emborgening?
19:45:13 tchaypo: the retry problem is a fundamental flaw in the way Heat handles updates. It requires rewriting stack-update.
19:45:22 do all our CI's now pass or is precise still broke?
19:45:31 greghaynes: precise is the only CI we have
19:45:44 all 3 jobs are passing
19:45:47 or at least, should be
19:45:59 er, precise undercloud I think is the one that was broke forever
19:46:01 \O/
19:46:48 btw topic for extra discussion - splitting the meeting into two different tz friendly times
19:47:30 yeah I think we'll have to do an alternating TZ meeting
19:47:34 5am is not really doable
19:47:52 ok anyway, CI .. working.. growing.. caring.. sharing.. it's all good until it is broken by neutron/nova/keystone again.. ;)
19:48:05 #topic alt TZ meeting times
19:48:22 I think we should probably start this on the mailing list, since those who might benefit most from the discussion are likely not here or are very tired right now
19:48:31 lifeless: concur?
19:49:33 yes
19:50:04 anybody want to take that action?
19:50:26 sorry, trying to keep Emails=0
19:50:45 jk, i'll do it :)
19:51:14 nice
19:51:33 #action slagle to email list about alternating time zone friendly meetings
19:51:37 #topic open discussion
19:53:03 I've been keeping notes about things I had to do to get myself set up at https://etherpad.openstack.org/p/tripleo-newdev-notes
19:53:14 tchaypo: ^5 .. great idea
19:53:28 Hopefully we can point future newbies at it and it will help get them across all the things faster
19:53:34 tchaypo: +1
19:54:06 yes.. another good thing to do with that knowledge is to write docs for the tools
19:54:25 It can be hard and extra-boring to write docs for tools that you create yourself and organically grow over many months..
19:54:31 but as a newbie, you'll know what is and isn't obvious.
19:54:52 I've also been rummaging through unassigned bugs and looking for things I can replicate and work on, which is working okay as a way to get me introduced to bits of the stack
19:55:35 but if anyone has something they want an extra pair of hands/eyes on, especially if it's something that would benefit from more working during APAC daylight, feel free to ping me :)
19:56:07 tchaypo: I will have some Heat templates that will need mass testing and refactoring soon. :)
19:56:17 but not quite yet
19:56:33 I'd like to give everyone back 3 minutes of their lives. Anyone object to early endmeeting?
19:56:56 none from me
19:57:24 * SpamapS evokes the black knight
19:57:26 SO BE IT
19:57:28 #endmeeting