16:01:16 #startmeeting neutron_ci
16:01:17 Meeting started Tue Oct 24 16:01:16 2017 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:20 The meeting name has been set to 'neutron_ci'
16:01:21 o/
16:01:22 o/
16:01:24 hi
16:01:32 #topic Actions from prev meeting
16:01:42 "haleyb to follow up with infra on missing graphite data for zuulv3 jobs"
16:01:52 * haleyb hides
16:01:53 haleyb, any news on the missing data?
16:01:58 o/
16:02:08 haleyb, meaning no? :)
16:02:19 * mlavalle waiting for a call from the hospital where his wife is undergoing a minor surgical procedure. Might drop from the meeting to go pick her up. Will advise the team
16:02:30 i have been busy with other priorities, will ping them now
16:02:47 mlavalle, ack, thanks for the heads up, I hope everything is ok
16:02:49 mlavalle: don't worry about work
16:02:58 mlavalle: I hope everything will be alright
16:03:02 minor thing, she is ok
16:03:46 haleyb, ok. let's tackle that in the next few days. if you are busy with other stuff, we can find someone else to follow up
16:03:52 empty grafana is a PITA
16:04:32 #action jlibosva to post a patch for a decorator that skips test case results if they failed
16:04:37 #undo
16:04:37 Removing item from minutes: #action jlibosva to post a patch for a decorator that skips test case results if they failed
16:04:39 oops
16:04:44 #action haleyb to follow up with infra on missing graphite data for zuulv3 jobs
16:04:49 I think we have been pushing statsd events for a while now
16:05:00 just a matter of updating the paths to the gauges/counters?
16:05:03 clarkb, 'a while' is weeks?
16:05:28 ihrachys: zuulv3 hasn't been deployed for weeks :P it has only been up since last week, iirc
16:06:11 ok
16:06:12 clarkb: i'll have to follow up in the infra channel as our grafana page has no stats, maybe we just have to change where we grab them?
16:06:19 haleyb: ya that
16:06:26 i didn't see new names in graphite
16:06:36 I think we had that discussion, there was a doc page somewhere describing the new paths
16:07:41 i just don't see it in the FAQ or the migration doc
16:08:41 sorry, can't find the link now :-x
16:08:53 after a quick look the data is there, swing by the infra channel after your meeting and we'll get you sorted out
16:09:04 clarkb: great, thanks
16:09:29 great. next item was "jlibosva to post a patch for a decorator that skips test case results if they failed"
16:09:43 I did today but it failed CI because of my own mistakes
16:09:55 the next PS was sent a few minutes ago: https://review.openstack.org/#/c/514660/
16:11:07 and that's all from me :)
16:11:10 great, seems reasonable, I will have a look
16:11:17 thanks
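To make the decorator idea above a bit more concrete, here is a minimal sketch of the pattern (recording a skip instead of a failure for a known-unstable test), assuming unittest/testtools-style test methods. The decorator name and wording are illustrative only; the actual implementation is the one under review at https://review.openstack.org/#/c/514660/.

    import functools
    import unittest


    def skip_if_failed(reason):
        """Hypothetical decorator: record a skip instead of a failure.

        A genuine skip raised by the test is preserved; any other exception
        is converted into SkipTest so a known-unstable test does not fail the
        run while the instability is being investigated.
        """
        def decorator(test_method):
            @functools.wraps(test_method)
            def wrapper(self, *args, **kwargs):
                try:
                    return test_method(self, *args, **kwargs)
                except unittest.SkipTest:
                    # keep real skips as skips
                    raise
                except Exception as exc:
                    raise unittest.SkipTest(
                        'unstable test %s failed (%s): %s'
                        % (test_method.__name__, reason, exc))
            return wrapper
        return decorator

Usage would then look like @skip_if_failed('<bug reference>') on the flaky test method, with the reference pointing at the tracked instability.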
16:11:21 next was "slaweq to take over https://review.openstack.org/#/c/504186/ from armax"
16:11:32 I pushed a patch today: https://review.openstack.org/#/c/514586/
16:11:33 I see the patch was abandoned, replaced by https://review.openstack.org/#/c/514586/
16:11:54 I didn't abandon any patch but maybe someone else did
16:12:19 the fullstack test for trunk passed with my patch: https://review.openstack.org/#/c/514586/
16:12:24 one question about it. do we want that config opt to be exposed to users or is it just for testing purposes?
16:12:37 afaik the original patch from armax stated that it's just for testing
16:13:17 should be test only
16:13:27 probably need to pick some explicit name for the option
16:13:53 like '__prefix_for_testing'
16:14:02 his patch was doing something else, he didn't initialize the trunk handler if trunk wasn't enabled
16:14:23 my config option is available to everyone but IMHO it should be used only for tests
16:14:49 I can of course update the help message and name for this option to be more "test" only
16:15:04 yeah, maybe that would be safer
16:15:08 thanks :)
16:15:17 so, please review my patch and I will address your comments :)
16:15:18 let us review what you have and then you respin
16:15:26 ihrachys: thx
16:15:59 next item was "ihrachys to talk to oslo folks about RWD garbage data in the socket on call interrupted"
16:16:27 I popped into their channel a bunch of times, didn't seem like there was much interest in it. I was told Thierry is the expert on rootwrap. :)
16:16:50 I also started looking at the library code but was distracted by the flu
16:17:00 we can corner him in Sydney, if you want :-)
16:17:20 mlavalle, well I guess I should just ask him. though afair he was not behind the daemon mode.
16:17:32 cool
16:17:53 so tl;dr I will continue looking at it
16:17:53 toshii is also looking at it
16:18:06 he sent a patch to prove that rwd is not friends with eventlet
16:18:11 as there might be yet another issue
16:18:15 toshii?
16:18:22 https://review.openstack.org/#/c/514547/
16:18:33 didn't know that's his nick :)
16:18:56 rwd fails with fullstack in agents too, not just the testrunner - http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Second%20simultaneous%5C%22
16:19:20 the 15 minute window doesn't show the results but a bigger window does ^^
16:20:04 interesting. is it the same issue?
16:20:26 ihrachys: yeah, he's in a very different timezone than you are so you probably don't overlap :)
16:20:36 no, I think that's a different issue
16:20:52 one is that data are left in the socket if a timeout is raised before the data are read
16:21:15 this one is about accessing the fd concurrently, if I understand the trace
16:21:30 fully flushing the data before calling a command should fix the first, from what I understand
16:21:37 right, I also think it's different
16:21:48 the error is: RuntimeError: Second simultaneous read on fileno 9 detected. Unless you really know what you're doing, make sure that only one greenthread can read any particular socket. Consider using a pools.Pool.
16:21:48 maybe we need another bug
16:22:14 the second failure usually happens when some part of the i/o libraries is not patched
16:22:22 but I thought each process (agent) spawns its own rwd process
16:22:41 hmm, we had this issue in the past with customized agents
16:22:56 that dhcp agent was not monkey_patched but that was fixed iirc
16:23:25 I see it happens in the dhcp agent only
16:24:22 we'll need a separate bug for that. I will ask toshii to spin it off the existing one.
16:25:09 ack
16:25:12 it could be e.g. that oslo.rootwrap is imported before monkey patching happens in the customized fullstack dhcp agent
16:25:38 thanks for letting us know there is also that issue
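To illustrate the hypothesis raised above: if a process (such as the customized fullstack dhcp agent) imports I/O-touching modules before eventlet monkey patching runs, partially unpatched code can end up with two green threads reading the same file descriptor, which eventlet reports as exactly the 'Second simultaneous read on fileno N detected' error from the logstash query. The sketch below shows only the ordering; the rootwrap daemon command line is illustrative.

    # Patch the standard library before importing anything that does I/O;
    # modules imported earlier may keep references to the unpatched
    # primitives.
    import eventlet
    eventlet.monkey_patch()

    # Only after patching, import the code that talks to the rootwrap daemon
    # over its socket.
    from oslo_rootwrap import client  # noqa: E402

    # Each agent process starts its own daemon, so the socket is not shared
    # between processes; the error above points at concurrent reads within a
    # single, partially patched process.
    rootwrap_client = client.Client(
        ['sudo', 'neutron-rootwrap-daemon', '/etc/neutron/rootwrap.conf'])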
16:25:51 finally, we had "mlavalle to explore how to move legacy jobs to new style"
16:25:59 mlavalle, any discoveries?
16:26:09 i made some progress there reading the guide
16:26:16 the process has two steps
16:26:38 1) Move a legacy job to our repo, under .zuul.yaml
16:27:06 2) Once moved and executing from our repo, tweak it to exploit zuulv3 goodies
16:27:25 I am in the process of moving legacy-neutron-dsvm-api
16:27:56 clarkb: I have a question. is .zuul.yaml expected to be at the top of our repo?
16:28:02 mlavalle, is it an identical copy when the first step is done?
16:28:11 yes it is
16:28:16 mlavalle: yes I think zuul only looks at the repo root
16:28:34 will zuul use our override right after it lands, or will we need to clean it up first in the global space?
16:29:12 I mean, are we able to gate on the move right away, or will we need to merge/clean up to see the result?
16:29:29 you should be able to gate on the move right away
16:29:33 goood
16:29:44 but you'll have potentially duplicate jobs running until cleaned up
16:29:48 (so you'll want to get that done quickly too)
16:29:51 and then there are a number of steps explained in the doc on how to remove it from zuul
16:30:14 I will take care of this as well
16:30:16 clarkb, you mean two jobs with the same name? or do you suggest that the names will differ?
16:30:30 no, the moved job has a different name
16:30:33 the names have to differ
16:30:37 it is not legacy anymore
16:30:40 but they will likely be functionally equivalent
16:30:47 then you delete the old one and have a single set of job(s)
16:30:53 mlavalle, I thought 'legacy' means 'no fancy ansible'
16:31:11 legacy means it is not in our repo
16:31:17 gotcha
16:31:31 as soon as it is in our repo it is up to us what the job does
16:31:47 as far as exploiting fancy ansible goes
16:31:57 I see. so we can stick to calling a script as we do, and that's fine with infra?
16:32:34 yes
16:33:20 clarkb: I will add your name to the patches in neutron and infra so you can babysit me
16:33:53 nice progress, thanks mlavalle and clarkb
16:34:50 #topic Scenarios
16:35:12 during the previous meeting, we briefly talked about this bug: https://launchpad.net/bugs/1717302
16:35:13 Launchpad bug 1717302 in neutron "Tempest floatingip scenario tests failing on DVR Multinode setup with HA" [High,Confirmed]
16:35:15 * mlavalle just got a call from the hospital. She did great. Picking her up in 1 hour, so will be able to finish this meeting :-)
16:35:47 haleyb, I believe Swami was going to have a look?
16:36:13 ihrachys: yes, but i don't see an update
16:36:57 i am thinking he is on PTO or otherwise busy
16:37:17 right, I am just pulling the attention of the L3 Olympus :)
16:37:23 LOL
16:37:46 will make sure we review this on Thursday
16:37:52 delegate is my middle name
16:38:28 * mlavalle wonders if haleyb is Zeus
16:39:03 not I
16:39:46 there are other scenario failures, but it seems like we already have our hands full, so I will skip enumerating all of them...
16:40:31 we already talked about grafana, and fullstack... so I will skip to...
16:40:33 #topic Open discussion
16:40:42 anything to discuss beyond all of the above?
16:40:59 I don't have anything
16:41:06 I'm good
16:41:26 nothing here
16:41:38 ok
16:42:07 one thing is that I need to follow up with Chandan on the tempest plugin repo. it is becoming a concern now, we are diverging and not making progress.
16:42:20 #action ihrachys to follow up with Chandan on progress to split off the tempest plugin
16:42:23 yeah, good catch
16:42:38 if we don't have anything else, I will wrap it up
16:42:40 ihrachys: going back to your comment above, we can't use __ in config options
16:42:44 as a prefix
16:42:50 oslo doesn't like it
16:42:52 meh
16:43:13 don't shoot the messenger!
16:43:16 ok, we can say 'test_prefix_dont_use_yes_I_know_what_I_do_here' :)
16:43:16 :)
16:43:29 ahhh, armax was lurking
16:43:30 that's very reasonable
16:43:33 yes, that's a good name
16:43:34 :)
16:43:35 :)
16:43:50 actually I just looked at the patch while I was boringly attending another meeting :)
16:43:59 I dunno, I am having second thoughts :)
16:44:21 have we considered running fullstack serially for the time being?
16:44:35 armax: no, that would take an eternity to finish :)
16:45:02 yeah, it's 52 mins now
16:45:04 did you quantify what eternity means?
16:45:07 with 8 (?) threads
16:45:37 the setup takes a long time and it's run per test method
16:45:48 I don't have exact numbers tho
16:45:54 there are jobs that take about 2 hours
16:46:05 so it's not like if fullstack finishes fast we can do anything else
16:46:20 you assume no one runs it locally
16:46:40 who runs the entire fullstack suite locally? that's insane :)
16:46:50 armax: I agree :)
16:46:53 I am pretty sure it will take like 5h serially.
16:47:06 OK then
16:47:51 slaweq, agree with what exactly? that we can serialize or that the patch is a hack? :)
16:48:07 ihrachys: should be easy to check, though
16:48:18 that probably no one runs all fullstack tests locally :)
16:48:31 I do
16:48:32 I can try to check it with one thread
16:48:39 when I work on fullstack stuff
16:48:58 slaweq, ok please check it so that we have the data
16:48:58 jlibosva: you're not staring at the screen though, are you? :)
16:49:05 ihrachys: ok, I will
16:49:16 armax: I am, that's why I do only a few things a day :-p
16:49:32 jlibosva: looks like we need to teach you a thing or two then :)
16:49:44 preaching the importance of a short feedback loop to developers, I love it
16:49:55 :)
16:49:59 ihrachys: run only the test you touch!
16:50:06 that's how race conditions creep in!
16:50:56 but even running the whole thing doesn't stop that, so that's just an emitter of CO2 emissions
16:50:57 ok, seems like we've fallen into a joke contest
16:51:10 slaweq will check and we'll decide what to do with the data
16:51:15 ok
16:51:21 * jlibosva likes jokes
16:51:35 armax, CO2 is good. just move from California to a more reasonable place (New Hampshire anyone?)
16:51:43 either way, it looks to me like the cleanest solution to a test problem is to limit the change to the test code
16:52:17 armax: just say you want to use the monkey patch approach you and I talked about at the PTG :)
16:52:22 but I am not too strongly opinionated
16:52:35 jlibosva: I actually did on the patch :)
16:52:42 :D
16:53:03 I mean, I wouldn't be against it :)
16:53:08 jlibosva: both are hackish
16:53:12 huh. so we are back to zero. ok, let's comment on the patch. at this point, I agree with anything that makes those green. :)
16:53:14 right
16:53:20 one hack is hidden in the fullstack subtree
16:53:27 and that feels cleaner
16:53:30 ihrachys: no
16:53:32 we're not
16:53:48 we learned in the process
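Circling back to the test-only option naming settled earlier: since oslo.config reportedly rejects a leading '__' in option names, the warning has to live in the name and help text instead. Below is a minimal sketch of how such an option might be registered with oslo.config; the option name echoes the one suggested in the discussion (lowercased), and the default, help text and helper are illustrative rather than what the patch under review actually does.

    from oslo_config import cfg

    # Per the discussion above, a leading '__' is not accepted by oslo.config,
    # so the option name itself has to make the test-only intent obvious.
    TEST_ONLY_OPTS = [
        cfg.StrOpt('test_prefix_dont_use_yes_i_know_what_i_do_here',
                   default='',
                   help='FOR TESTING PURPOSES ONLY. Prefix used by fullstack '
                        'tests to isolate the resources they create. Do not '
                        'set this in production deployments.'),
    ]


    def register_test_only_opts(conf=cfg.CONF):
        """Register the illustrative test-only options on the given config."""
        conf.register_opts(TEST_ONLY_OPTS)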
16:54:01 ihrachys: so we should use https://pypi.python.org/pypi/pytest-vw to make it green :D
16:54:41 jlibosva, we should use -vw ^ instead of your decorator :)
16:54:53 eh, I would have gone as far as to stop running the job altogether
16:55:02 but pytest-vw looks good too
16:55:20 won't we need to change all the stuff to pytest?
16:55:39 jlibosva, do you want all green or not???
16:55:53 I do, less emissions, love it
16:56:23 I think we should make 'what green means' configurable
16:56:48 but I digress
16:56:55 I see the contest is rolling forward
16:56:59 but I will call it a day
16:57:01 thanks everyone :)
16:57:04 thanks :)
16:57:07 o/
16:57:11 thanks
16:57:15 clarkb, thanks a lot for popping into the meeting, really helpful
16:57:17 #endmeeting