16:01:46 <mnaser> #startmeeting openstack_ansible_meeting
16:01:47 <openstack> Meeting started Tue Jul 23 16:01:46 2019 UTC and is due to finish in 60 minutes. The chair is mnaser. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:50 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:01:50 <mnaser> #topic rollcall
16:01:52 <mnaser> o/
16:01:56 <noonedeadpunk> o/
16:01:57 <namrata> o/
16:02:05 <guilhermesp> o/
16:02:25 <mnaser> how's everyone?!
16:02:32 <jrosser> o/
16:02:41 * guilhermesp fixes deployments
16:02:56 <mnaser> fun :)
16:03:04 <mnaser> #topic office hours
16:03:53 <mnaser> personally
16:03:56 <mnaser> https://review.opendev.org/#/c/671783/
16:04:00 <mnaser> i love the -392 part :D
16:04:19 <guilhermesp> it is a lot :P
16:04:29 <noonedeadpunk> So I was thinking about dropping os-log-dir-setup.yml and the rsyslog client as part of this
16:04:45 <guilhermesp> seems that we've started the work of cleaning things up
16:05:19 <mnaser> yeah i think at this point rsyslog_client won't really do much if we never run it
16:05:26 <mnaser> i would even retire rsyslog_server unless someone wants to maintain it
16:05:31 <noonedeadpunk> but it's still used by ceph, unbound and things like tempest, rally and the utility container...
16:05:44 <mnaser> utility needs rsyslog? o_O
16:06:03 <noonedeadpunk> not rsyslog but os-log-dir-setup.yml
16:06:08 <mnaser> ahhh ok
16:06:16 <mnaser> this is the thing that does the bind mounts right?
16:06:17 <noonedeadpunk> I mixed things up a bit :(
16:06:26 <noonedeadpunk> yep
16:06:33 <mnaser> i was always wondering why we just didn't log things inside the container and kill all those complicated bind mounts
16:08:47 <jrosser> it's probably historical, so you can just collect up whatever is in /openstack/logs/* and not worry about which containers you have
16:08:54 <noonedeadpunk> probably for compressing without entering the container
16:08:57 <spotz> I thought it was so we could find everything in one place?
16:09:37 <noonedeadpunk> but since the integrated tests, which are actually metal ones, everything is already in one place
16:10:15 <evrardjp> o/
16:10:26 <spotz> evrardjp!!!!
16:10:46 <evrardjp> spotz: ! :)
16:11:02 <evrardjp> mnaser: love the - in -392
16:12:46 <cjloader> o/
16:12:48 <evrardjp> mnaser: yes, historical. I would prefer removing all the complex bind mounts too, as this was a pain to deal with. Cleaning this up would also simplify the code further
16:13:45 <jrosser> there is something that collects container journals on the host anyway isn't there?
16:13:47 <noonedeadpunk> so, the thing is that we don't need that almost anywhere, except 3 playbooks
16:14:01 <jrosser> so with everything moving to journals there is now little point in keeping the bind mounts
16:14:32 <mnaser> yeah i agree with all of this so maybe it would be a good clean up
16:14:45 <mnaser> btw also, knock on wood, our CI has been relatively stable recently
16:14:51 <mnaser> im pretty happy with where it is
16:15:06 <mnaser> the upgrade jobs need work unfortunately
16:15:37 <evrardjp> yeah.
16:15:44 <evrardjp> we've decreased coverage though :(
16:16:07 <evrardjp> if ppl want to help on increasing coverage, I have plenty of ideas, so little time.
16:16:17 <jrosser> evrardjp: ^ can you define what you mean by that, as there are several different things
16:18:20 <evrardjp> well I think the first thing to do is to match what we removed -- so define new "specific" jobs having a pre-run play configuring the o_u_c or user_variables, to get feature parity back
16:18:36 <evrardjp> then I guess the idea would be to implement multi-node jobs in periodics
16:20:22 <noonedeadpunk> Yeah, since we might be missing system packages in roles
16:20:50 <noonedeadpunk> we won't catch this if they are already installed by a previous role
16:21:01 <mnaser> would we get coverage again if we run both container and metal on every change?
16:21:12 <mnaser> i mean im not opposed to it but we need to figure out why centos takes stupidly long with lxc
16:21:16 <mnaser> its like 2h20 to run an aio
16:21:35 <jrosser> i would like to see haproxy back in some form
16:22:08 <mnaser> yes with the bind-to-mgmt stuff you're doing
16:22:10 <mnaser> it'll be running fine
16:22:19 <jrosser> the current metal jobs are fast but they are not sufficiently real life
16:22:42 <jrosser> i'm a bit stuck on the galera stuff there
16:25:12 <mnaser> jrosser: it's odd that it just works for us, i can try to help with looking at the failures
16:25:16 <mnaser> did you end up doing the my.cnf adjustments?
16:25:25 <jrosser> i did, and it's inconsistent
16:26:00 <jrosser> here's the change for the client my.cnf https://review.opendev.org/#/c/672101/
16:26:24 <mnaser> jrosser: lol
16:26:26 <mnaser> you're gonna hate me
16:26:40 <mnaser> jrosser: check the review :p
16:26:43 <jrosser> i figured i may have messed it up
16:27:13 <noonedeadpunk> yeah
16:27:14 <jrosser> aaaaaahhhhhh crap :)
16:27:42 <jrosser> thank you :)
16:27:44 * jrosser fixes
16:28:09 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-galera_client master: Default to connecting clients via ip rather than local socket https://review.opendev.org/672101
16:28:22 <mnaser> so that should hopefully help
16:28:48 <evrardjp> mnaser: it won't be enough, and I didn't intend to run centos+lxc :)
16:29:07 <mnaser> evrardjp: well i figured we'd run all the operating systems we cover
16:29:14 <evrardjp> I just wanted to have like mariadb cluster testing + keystone and stop there
16:30:00 <mnaser> that probably can be wired up in logic
16:30:09 <evrardjp> we already have all we need
16:30:17 <mnaser> yes but writing the 'dependency' system
16:30:19 <evrardjp> just change affinity
16:30:35 <evrardjp> what do you mean?
16:31:08 <mnaser> when testing os_keystone then grab what services need os_keystone (i.e. galera, memcache, etc)
16:31:13 <evrardjp> the idea to not run centos+lxc was just to have scenarios (ubuntu+lxc is the most frequent one)
16:31:38 <mnaser> when testing os_nova then grab its dependencies which are os_neutron, os_keystone, os_glance
16:31:45 <mnaser> whose dependencies are.. etc
16:31:46 <mnaser> to reduce our run times
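(A rough illustration of the dependency resolution mnaser is describing: given the role under test, resolve the transitive set of services a CI scenario would need to deploy. This is a hypothetical Python sketch with a made-up dependency map, not the project's tooling; as evrardjp notes just below, tests/role/bootstrap already does something along these lines.)

#!/usr/bin/env python3
# Hypothetical sketch only: the dependency map below is illustrative and
# does not reflect any file that ships with openstack-ansible.

DEPENDENCIES = {
    "os_keystone": {"galera_server", "memcached_server", "rabbitmq_server"},
    "os_glance": {"os_keystone"},
    "os_neutron": {"os_keystone"},
    "os_nova": {"os_keystone", "os_glance", "os_neutron"},
}


def required_roles(role, deps=DEPENDENCIES):
    """Return the role plus everything it transitively depends on."""
    needed, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in needed:
            needed.add(current)
            stack.extend(deps.get(current, ()))
    return needed


if __name__ == "__main__":
    # e.g. testing os_nova pulls in keystone, glance, neutron and the shared
    # infrastructure they need, and nothing else -- keeping job runtimes down.
    print(sorted(required_roles("os_nova")))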
16:31:47 <evrardjp> that's kinda what's done in tests/role/bootstrap
16:32:03 <evrardjp> but I understand it would be smarter if we do it that way
16:32:17 <evrardjp> I thought this could be done using a CLI :)
16:32:21 <mnaser> mhmm
16:32:31 <mnaser> i think we should have full coverage, if we do aio_lxc, we do it for all supported systems
16:32:38 <evrardjp> just encapsulate the logic there, instead of relying on so many conditionals in ansible
16:32:41 <mnaser> imho we either do it or drop it
16:32:56 <mnaser> or otherwise we'll get a change that breaks a role for centos that won't be caught there
16:33:02 <mnaser> and then itll be broken in integrated
16:33:45 <evrardjp> I meant to keep centos for baremetal, so it would still be tested. Just keeping the most common use cases
16:33:59 <evrardjp> but I understand your point
16:34:38 <mnaser> i feel like with a little bit more effort we'd understand the fundamental reason why centos is sooooo slow
16:34:47 <mnaser> i think we've regressed something
16:34:50 <mnaser> it never took this long before
16:35:22 <jrosser> the data is all there in the ARA reports / db
16:35:38 <jrosser> to decide if it's specific things that are slow, or it's just across the board
16:35:44 <mnaser> across the board
16:35:46 <mnaser> every operation is slower
16:35:49 <mnaser> like 3-4x slower
16:35:54 <mnaser> even simple things
16:36:09 <jrosser> does that still stand outside CI?
16:36:12 <noonedeadpunk> Could our connection module be affecting it?
16:36:36 <mnaser> i havent tried outside CI, i thought about our connection module but figured it would regress in both OSes?
16:36:54 <jamesdenton> o/
16:36:59 <mnaser> bonjour
16:37:39 <jamesdenton> i've been talking on IRC for like, at least a week, and just realized nothing was going thru
16:37:45 <mnaser> lolol
16:37:52 <jamesdenton> my feelings were hurt for a bit
16:37:55 <jamesdenton> lol
16:38:34 <mnaser> yeah i dont know
16:38:39 <mnaser> for the centos stuff
16:38:44 <mnaser> it deffo needs some profiling
16:39:01 <mnaser> it'd be nice to get to nspawn and not have to deal with that but
16:39:29 <jrosser> we bit off too much there in one go
16:39:43 <mnaser> yea macvlan+nspawn together is a lot
16:39:47 <jrosser> nspawn + macvlan is too much
16:39:48 <jrosser> yes
16:40:00 <mnaser> bridge+nspawn is easier to consume but i dunno if i currently have the cycles to help with it
16:40:25 <jrosser> i think there may be (was?) a limitation with nspawn and the number of interfaces you could create
16:41:30 <jrosser> evrardjp: did you do some work on ansible profiling?
16:41:39 <evrardjp> long ago
16:41:44 <evrardjp> dw is better :D
16:41:50 <jrosser> i was just looking for dw but he's not in #mitogen
16:42:07 <jrosser> don't want to waste a bunch of time learning 10 wrong tools when someone can just say "do this"
16:42:18 <evrardjp> that reminds me I need to connect to that channel since I reconfigured my bouncer
16:42:24 <evrardjp> jrosser: totally
16:42:35 <evrardjp> just wait for him, last time he was super helpful to me
16:42:53 <evrardjp> maybe ping him on twitter?
16:45:43 <jrosser> done
16:46:47 <chandankumar> cloudnull: jrosser needs +w on this https://review.opendev.org/#/c/672225/
16:46:47 <mnaser> so we'll keep improving and cleaning up things, the journald stuff seems to be cleaning things up well
16:47:18 <cloudnull> chandankumar done
16:47:25 <chandankumar> cloudnull: thanks :-)
16:47:59 <mnaser> OH
16:48:01 <mnaser> also
16:48:06 <mnaser> did y'all see my email to the ML
16:48:10 <mnaser> about openstack-ansible-collab
16:48:52 <guilhermesp> hectic days, just saw an email around, but I will take a look
16:50:24 <jamesdenton> Just a heads up, but the unicast flood issue that was brought up last week is related to a change in os-vif introduced in Stein. See: https://bugs.launchpad.net/os-vif/+bug/1837252.
16:50:24 <openstack> Launchpad bug 1837252 in neutron "IFLA_BR_AGEING_TIME of 0 causes flooding across bridges" [Undecided,Incomplete]
16:50:27 <jrosser> chandankumar: did you find a solution for your tempest undefined var?
16:51:15 <mnaser> jamesdenton: affecting lxb only?
16:51:31 <chandankumar> jrosser: the above changes worked here https://review.opendev.org/#/c/672231/ but I need to come up with a better solution
16:51:33 <jamesdenton> it affects the qbr bridges, too, with OVS. Just not sure what the overall effect is there
16:51:56 <chandankumar> it might be happening due to mixing of venv and distro stuff
16:52:37 <chandankumar> jrosser: sorry, wrong review
16:52:58 <chandankumar> jrosser: https://review.opendev.org/#/c/667219/
16:57:30 <jrosser> evrardjp: word according to dw "'perf record -g ansible-playbook ...' of the ansible run /and/ separately on the host using simply 'perf record' followed by 'perf report' might show something obvious"
16:57:51 <jrosser> namrata: did you want to ask about upgrades?
16:58:08 <namrata> yeah I was waiting for open discussion
16:58:22 <evrardjp> jrosser: oh yeah that rings a bell :)
16:58:31 <namrata> Hi, I would like to contribute to openstack-ansible and I can start with the issue I faced while upgrading R->S, i.e. upgrading the WSREP SST method from xtrabackup-v2 to mariabackup.
16:58:31 <jrosser> namrata: just ask :)
16:59:02 <evrardjp> jrosser: could it be the fact we just added all those plays, and maybe there is cruft in the inventory?
16:59:05 <evrardjp> I haven't checked tbh
16:59:29 <namrata> jrosser suggested to take this up in the meeting so we can discuss how to handle it
16:59:36 <jrosser> mnaser: ^ so for fixing up the R->S upgrade for galera, do we make patches to master? i wasn't totally clear where we do that
16:59:42 <jrosser> evrardjp: ^ maybe you can advise too
17:00:11 <evrardjp> well, is S->master broken too?
17:00:28 <jrosser> if you start on S you are already on the new replication method
17:00:33 <evrardjp> ok
17:00:55 <evrardjp> so it's the S upgrade only, so it's only implementable in stein
17:01:07 <evrardjp> you got your answer? :p
17:01:30 <jrosser> i guess it's made more complicated by not having a working R->S upgrade CI job :/
17:01:38 <jrosser> well, unless we do, of course
17:02:07 <mnaser> yeah, stable-only patch
17:02:09 <namrata> okay so I should push to stable/stein then
17:02:55 <jrosser> namrata: ok so that sounds like your answer, write something that goes onto stable/stein
17:03:08 <namrata> jrosser thanks
17:03:10 <jrosser> and thanks for taking the time to fix it up :)
17:03:28 <namrata> :]
17:15:41 <mnaser> #endmeeting
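(For reference on the galera upgrade namrata raised above: the R->S change moves the cluster's state snapshot transfer from Percona's xtrabackup-v2 to MariaDB's own mariabackup. A minimal sketch of the my.cnf settings involved is below; the actual variable names and templates used by the openstack-ansible galera_server role may differ.)

# illustrative only -- the galera_server role templates these settings,
# and the exact option names/values used by OSA may vary
[mysqld]
wsrep_sst_method = mariabackup        # was xtrabackup-v2 on Rocky
wsrep_sst_auth   = root:PASSWORD      # placeholder; mariabackup SST authenticates as a database user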