17:00:11 #startmeeting
17:00:12 Meeting started Thu Jun 28 17:00:11 2012 UTC. The chair is jaypipes. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:24 good afternoon QA team
17:00:41 Hi there.
17:00:43 rohitk: around?
17:00:47 davidkranz: afternoon!
17:01:11 * jaypipes looks for some Rackers...
17:01:33 hi!
17:02:00 rohitk: hi! :) so, I've taken a first look at your proposed whitebox stuff. overall, looks quite good :)
17:02:26 thanks Jay :)
17:02:41 rohitk: if you could do a code review on this: https://review.openstack.org/#/c/8812/
17:03:00 rohitk: that would be appreciated. you, too, davidkranz, although you already reviewed a prior patchset
17:03:12 jaypipes: yes, will take a look for sure
17:03:17 thx! :)
17:03:37 jaypipes: I already put some comments on your last patch.
17:04:00 jaypipes: Other than the little bugs I mentioned, is it nearing being ready to go?
17:04:09 davidkranz: yep.
17:04:16 jaypipes: Great!
17:04:33 jaypipes: Are there any remaining failures other than that keystone issue?
17:04:44 davidkranz: still not possible to run in parallel (due to a bug in the multiprocess plugin)
17:05:06 davidkranz: but all tests do run in isolation now, and all keystone creds are being cleaned up properly.
17:05:21 jaypipes: That's too bad. The serialization of server creation on a server is also a problem for us but Vishy doesn't see it as being very important.
17:05:49 jaypipes: He suggested some possible patches but they didn't really do anything.
17:05:52 davidkranz: well, I've filed a bug with the nosetests folks on the plugin error
17:06:19 davidkranz: if there is no action on that bug by end of today, I will send a post to their mailing list
17:06:37 jaypipes: Sounds good.
17:06:43 davidkranz: re: server creation serialization, are you referring to a bug in Nova, or a bug in Tempest?
17:06:59 davidkranz: or even a bug in libvirt/KVM?
17:07:45 jaypipes: https://bugs.launchpad.net/nova/+bug/1016633, which Thierry marked as wishlist. I don't agree.
17:07:47 Launchpad bug 1016633 in nova "Bad performance problem with nova.virt.firewall" [Wishlist,Confirmed]
17:08:39 davidkranz: one sec, lemme re-read that
17:09:18 davidkranz: did my second patch help at all?
17:09:55 vishy: No, the initial success was noise from variance. I ran it a bunch more times and the average wasn't any different than without it.
17:10:32 vishy: I don't think this is a critical, must-fix-now issue but I also think it is not really wishlist.
17:11:09 davidkranz: I'm not completely sure that it is the firewall code that is the culprit, but if it is, I would blame it on the multiple iptables saves and restores
17:11:46 vishy: is that a particularly expensive call? iptables save/restore?
17:11:46 might be solved by updating iptables in a periodic task, but that is a pretty heavy refactor and opens up potential for security holes
17:12:02 jaypipes: I've noticed it being slow before, yes.
17:12:09 vishy: I don't know much about this area but I'm sure you guys will be able to come up with something.
17:12:52 davidkranz, jaypipes: I don't really see it as a huge issue considering that multiple vms on the same server will always be slow due to image downloading etc.
17:13:15 vishy: Even if they are booting the same image?
17:13:28 we've even discussed serializing everything to eliminate race conditions
17:13:37 vishy: I thought images were cached.
17:13:58 davidkranz: they are, but I think in normal cloud usage the cache hits will be relatively infrequent
17:14:38 vishy: I am bringing this up because we are actually working on some applications that want to spin up vms pretty fast, but it is not a public cloud use case.
17:14:52 davidkranz: we need much better profiling of the launch to determine exactly why it is slow
17:15:03 vishy: Of course.
17:15:25 vishy: I just think a lot of people are working on cases that are not random users on a public cloud.
17:15:27 davidkranz: The distance in the log is not super useful, because it could be yielding to another greenthread during that call
17:15:31 davidkranz: we have similar use cases -- the tenant wants to spin up dozens of VMs (same flavor/image) at once
17:16:40 vishy: by "much better profiling", is this something davidkranz or I can work on? do you refer to more debug statements in the logs or something more invasive?
17:16:42 davidkranz: anyway I will continue to try to come up with ways to speed it up.
17:17:00 vishy: I could try to do this. It is the kind of thing I have a lot of experience with but not so much with the nova code.
17:17:13 jaypipes: It would be nice to do automatic profiling somehow and find out what the slow calls are
17:17:26 vishy: OK, thanks.
17:18:07 vishy: agreed, though doing so in something like cachegrind only shows a single process. where we are seeing the issues is when multiple processes (and greenthreads within processes) are in use. :(
17:18:33 in other news, my pug's snore volume just went through the roof...
17:18:49 LoL
17:19:01 too hot for her today... better turn up the air con.
17:19:32 jaypipes: I know the issue. I have a silent standard poodle but was taking care of a beagle for the weekend...
17:19:33 davidkranz: well, regardless, there aren't going to be any quick fixes for this kind of thing.
17:19:42 davidkranz: :)
17:20:10 jaypipes: Do you think it is worth me spending any time trying to find out where the time is going?
17:20:11 davidkranz: so I think we should just press on, bringing as much data to vishy and others like comstud as we can by doing more and more repeatable testing
17:20:34 davidkranz: I think it is worth the time putting debug log statements into pieces of Nova, yes...
17:20:41 davidkranz: that seems to me to be an easy win.
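The "debug log statements" idea discussed above could look roughly like the following. This is only a minimal sketch: the helper name `timed` and the logger name `launch_timing` are hypothetical and not part of Nova. It also measures wall-clock time, so, as vishy points out, a long gap may include time spent yielded to another greenthread rather than time spent in the call itself.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, chosen for illustration only.
LOG = logging.getLogger("launch_timing")


@contextmanager
def timed(step, log=LOG):
    """Log the wall-clock duration of the wrapped block at DEBUG level."""
    start = time.monotonic()
    try:
        yield
    finally:
        # Runs even if the wrapped block raises, so failures are timed too.
        elapsed = time.monotonic() - start
        log.debug("%s took %.3fs", step, elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    with timed("stand-in for a slow call"):
        time.sleep(0.01)  # placeholder for the code being measured
```

Wrapping each suspect call in `with timed("..."):` during a run that boots 16 or 32 servers would give per-step durations to compare across runs, which is coarser than a profiler but works across multiple processes.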
17:20:48 jaypipes: Trying my test with 16 or 32 servers might reveal something.
17:20:56 jaypipes: OK, I will give it a try.
17:21:04 davidkranz: and that would allow us at least to begin narrowing down for vishy where the problems might be lying
17:21:32 jaypipes: Yeah. Is there an easy way to restart nova in devstack without rerunning stack.sh?
17:21:42 davidkranz: what would be super is if we can marry test runs with something like Tach so we can get a better/easier view on the timings
17:21:42 jaypipes: After making a code change.
17:22:01 jaypipes: I have never used Tach.
17:22:23 davidkranz: you can always screen -x, go to the nova-compute/api/scheduler/network nodes and Ctrl-C, then hit up and Enter
17:22:41 davidkranz: but I have a script that resets it all and reruns stack.sh... I find it easier.
17:22:45 and more consistent :)
17:23:04 davidkranz: remember to set SCREEN_LOG_DIR
17:23:05 jaypipes: Yeah, that's what I have been doing.
17:23:15 k
17:23:53 davidkranz: well, it would be super if we could get my and rohitk's branches in today. that would free up a lot of the other things, because both patches touch a lot of files.
17:23:54 jaypipes: I have been using syslog due to having trouble with SCREEN_LOG_DIR. I might have a question about that later.
17:24:01 sure
17:24:38 jaypipes: I basically gave the go-ahead for your submission except for the little bugs. So I'm ready to go when they are fixed.
17:25:02 jaypipes: I will look at rohitk's in a little bit.
17:25:09 davidkranz: k. thx. I'll wait to get rohitk's views and then will look at approving it.
17:25:11 jaypipes: we've been doing a lot of heavy duty refactoring to the core tempest files
17:25:20 would be good to get daryl too, but no idea where he is
17:25:29 and this should reduce over time
17:25:32 IMO
17:25:39 rohitk: indeed. that's why I'm eager to get them in so merge hell isn't that bad ;)
17:25:55 jaypipes: Yours should probably go first.
17:26:02 davidkranz: ++
17:26:06 k.
17:26:17 alright, let's end the meeting then unless anyone has objections?
17:26:30 jaypipes: that should speed up the Fuzz test efforts too
17:27:20 No objection. But it would be nice to know what is going on with the swift tests.
17:27:42 But no one who knows is here :(
17:28:36 right...
17:28:42 I'll email Jose and Daryl
17:28:56 ok dokey, bye guys
17:29:00 #endmeeting