16:01:49 <evrardjp> #startmeeting openstack_ansible_meeting
16:01:51 <openstack> Meeting started Tue Feb 6 16:01:49 2018 UTC and is due to finish in 60 minutes. The chair is evrardjp. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:54 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:02:13 <evrardjp> #topic rollcall
16:02:15 <prometheanfire> o/
16:02:36 <evrardjp> nobody else?
16:03:11 <mgariepy> i'm half-here.
16:03:22 <odyssey4me> o/
16:03:38 <evrardjp> let's start, 3.5 people :)
16:03:43 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1747629
16:03:44 <openstack> Launchpad bug 1747629 in openstack-ansible "A worker was found in dead state" [Undecided,New]
16:03:49 <evrardjp> #topic bugs
16:03:55 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1747629
16:04:08 <openstackgerrit> Kevin Carter (cloudnull) proposed openstack/openstack-ansible master: Ensure neutron agents & rabbitmq do not restart when upgrading https://review.openstack.org/541320
16:04:11 <odyssey4me> mgariepy https://media.giphy.com/media/xT9IgN8YKRhByRBzMI/giphy.gif ;)
16:04:12 <cloudnull> mgariepy: ^
16:04:20 <cloudnull> evrardjp: i'm here this morning :)
16:04:26 <evrardjp> woot
16:04:36 <cloudnull> 4.5 people :)
16:04:37 <evrardjp> ok so the first one is interesting
16:04:41 <cloudnull> more like 3.75
16:04:43 <cloudnull> :P
16:04:46 <spotz> sorta here, ping if needed
16:05:16 <evrardjp> it doesn't seem to reproduce in my machine, but reliably happens in the gates. It seems we have dead workers after a certain time in the translations jobs
16:05:17 <odyssey4me> hmm, that worker dead state issue is something I've seen when the host runs out of memory
16:05:34 <odyssey4me> ansible basically just croaks
16:05:37 <evrardjp> odyssey4me: indeed, I've seen that too
16:05:51 <evrardjp> so maybe there is something to reduce memory consumption for that job
16:06:46 <mgariepy> cloudnull, reviewed ;)
16:06:58 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_neutron master: add ml2 config for networking bgpvpn https://review.openstack.org/522598
16:07:24 <evrardjp> is there anyone that wants to confirm this, work on it?
16:07:45 <evrardjp> if not I'll move to the next bug
16:08:00 <evrardjp> ok next
16:08:01 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1747628
16:08:02 <openstack> Launchpad bug 1747628 in openstack-ansible "Upgrades to Queens are broken due to new container scaffolding" [Undecided,In progress] - Assigned to Kevin Carter (kevin-carter)
16:08:09 <evrardjp> cloudnull: could you have a look?
16:08:11 <evrardjp> great!
16:08:15 <evrardjp> thanks
16:08:26 <evrardjp> must be something real quick to fix
16:08:32 <evrardjp> let's move on
16:08:32 <cloudnull> evrardjp: already done.
16:08:36 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1747350
16:08:38 <prometheanfire> lol
16:08:38 <openstack> Launchpad bug 1747350 in openstack-ansible "openstack-ansible failed on ubuntu 16.04 aio " [Undecided,New]
16:08:54 <cloudnull> https://review.openstack.org/#/c/541315/
16:08:58 <evrardjp> cloudnull: yeah I've seen, thanks.
16:08:58 <cloudnull> evrardjp: ^
16:09:12 <odyssey4me> that failure is not a failure
16:09:19 <odyssey4me> it's a try/rescue block
16:10:10 <evrardjp> it looks invalid
16:10:13 <evrardjp> indeed
16:10:18 <odyssey4me> I've commented
16:10:31 <cloudnull> ++ odyssey4me with https://review.openstack.org/#/q/Ic54a7524c09a170e20830c5f8d2c2a0658159ed0
16:10:37 <odyssey4me> That, and there is still the set of rabbitmq try/rescue blocks that causes confusion.
16:10:45 <andymccr> invalid!
16:10:59 <cloudnull> hopefully it'll be clearer when the playbooks fail or succeed
16:11:16 <cloudnull> that is until ansible 2.5 is released and the try/except ux is better
16:12:09 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1747313
16:12:09 <openstack> Launchpad bug 1747313 in openstack-ansible "openstack-ansible failed to build AIO" [Undecided,New]
16:12:10 <odyssey4me> Ja, added that comment to the bug. I'm not a fan of that set of patches - I would far rather have us rework that try/rescue block into something that doesn't require a task failure.
16:12:16 <cloudnull> cool
16:12:25 <odyssey4me> But meh, I don't really have the time to spend on it.
16:12:25 <evrardjp> we've all commented on that bug. nice!
16:12:57 <evrardjp> odyssey4me: that's what we should probably do indeed. Because the rescue is meant as a rescue :)
16:13:05 <evrardjp> last resort stuff, not expected failures.
16:13:08 <evrardjp> anyway
16:13:11 <evrardjp> next one is selinux
16:13:14 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1747313
16:13:15 <openstack> Launchpad bug 1747313 in openstack-ansible "openstack-ansible failed to build AIO" [Undecided,New]
16:13:28 <odyssey4me> this looks kinda like a dup of the last one, except for the selinux thing
16:13:41 <odyssey4me> I thought that we had selinux things in the ansible bootstrap though
16:13:54 <odyssey4me> in this one the failure is again not actually a failure
16:14:19 <evrardjp> let's check real quick
16:14:19 <odyssey4me> ah, except for the "Check for unlabeled device files" task
16:14:28 <evrardjp> wait
16:14:33 <evrardjp> openstack-ansible-security
16:14:37 <evrardjp> that's old.
16:15:04 <odyssey4me> oh, good catch there
16:15:13 <odyssey4me> it's tagged with the beta, but doesn't look beta related
16:15:23 <evrardjp> no it's not.
16:15:26 <evrardjp> that's not possible.
16:15:37 <evrardjp> will comment on the bug
16:16:20 <evrardjp> marked it as incomplete
16:16:29 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1746142
16:16:30 <openstack> Launchpad bug 1746142 in openstack-ansible "Nova uid/gid sync, default/standards" [Undecided,New]
16:17:23 <odyssey4me> The reason it's not defaulted is exactly for the reason he describes... it will break an upgrade
16:17:23 <evrardjp> here it's a question of: is that a whishlist, or a real bug? (think about people using shared storage)
16:17:53 <cloudnull> wishlist
16:18:04 <odyssey4me> both the cinder and nova roles implement very broad access rights to the storage folders so that the uid/gid should not matter
16:18:11 <odyssey4me> that's as far as I recall, at least
16:18:32 <odyssey4me> so if we implement a default, we have to implement a migration tool too so that upgrades work
16:18:32 <andymccr> yeah wishlist
16:18:42 <odyssey4me> so yeah, that's a feature request - and a hard one to do
16:18:48 <andymccr> imo just dont set the uid/gid why would you need to? unless you are doing a new deploy
16:19:38 <odyssey4me> andymccr on a shared storage system (like NFS), if each compute has a different nova uid, then stuff doesn't work right... but we work around that using very brooad access settings which is quite horrible, really
16:20:17 <odyssey4me> we've long talked about having a global uid/gid map - even back in juno I remember chatting about it
16:20:19 <evrardjp> I think we all agree, and we can continue
16:20:32 <evrardjp> if someone wants consistant uid he can do it beforehand
16:20:37 <odyssey4me> but the upgrade issue is always where it got stuck
16:21:09 <evrardjp> next
16:21:11 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745675
16:21:14 <openstack> Launchpad bug 1745675 in openstack-ansible "aide database file is missing" [Undecided,New]
16:21:37 <evrardjp> mhayden: ?
16:21:43 <evrardjp> could you have a look at it?
16:22:25 <evrardjp> who is okay with the fact I assign that to mhayden ?
16:22:27 <evrardjp> :D
16:22:48 <odyssey4me> well, anyone could pick it up I guess - but mhayden would probably be interested :)
16:23:01 <evrardjp> yeah
16:23:11 <odyssey4me> looks like that process needs a little bit of TLC to make it more reliable
16:23:12 <evrardjp> I leave it as new for now. Just assigned mhayden for the incentive
16:23:24 <evrardjp> let's move on
16:23:25 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745361
16:23:26 <openstack> Launchpad bug 1745361 in openstack-ansible "Failed to create subvolume /var/lib/machines/ when run 'openstack-ansible setup-hosts.yml' multiple times" [Undecided,New]
16:24:22 <evrardjp> gokhan: was definitely unlucky on this one
16:24:30 <openstackgerrit> Kevin Carter (cloudnull) proposed openstack/openstack-ansible master: Ensure neutron agents & rabbitmq do not restart when upgrading https://review.openstack.org/541320
16:24:41 <andymccr> hmm
16:25:00 <odyssey4me> Defer to cloudnull for that one.
16:25:15 <odyssey4me> There may already be a fix available - I remember seeing something about quotas and such.
16:25:43 <evrardjp> yeah maybe
16:25:49 <evrardjp> cloudnull: ?
16:25:57 <evrardjp> could you handle this one?
16:26:02 <cloudnull> yup, https://review.openstack.org/#/c/527592/
16:26:04 <evrardjp> triage it and take decisions?
16:26:22 <cloudnull> I need to update it seems as reno is unhappy
16:26:44 <evrardjp> great
16:26:59 <evrardjp> yeah I've quickly reviewed to link both the bug and the review.
16:27:06 <evrardjp> let's continue
16:27:08 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745287
16:27:09 <openstack> Launchpad bug 1745287 in openstack-ansible "ceph-mon : collect admin and bootstrap keys fails on CentOS7" [Undecided,New]
16:27:11 <andymccr> cloudnull knocking it out the park!
16:27:19 <odyssey4me> cloudnull need a closes-bug in that commit msg too then
16:27:27 <cloudnull> odyssey4me: ++ will add
16:28:27 <mhayden> evrardjp: sorry -- got caught in something... feel free to assign
16:28:30 <andymccr> hmm that is weird
16:28:32 <odyssey4me> looks like it failed to collect the data it needs from the mons
16:28:43 <odyssey4me> connectivity, or perhaps ssh key issue?
16:28:52 <evrardjp> it looks like a problem with quorum?
16:29:07 <evrardjp> but on top of that unable to find a keyring on /etc/ceph/ceph.client.admin.keyring
16:29:08 <odyssey4me> or yes, perhaps actually a ceph cluster issue
16:29:26 <odyssey4me> sounds to me like the cluster setup wasn't complete
16:29:48 <andymccr> well its a ceph-mon role from ceph-ansible task that is failing
16:29:53 <evrardjp> andymccr: does that look right to you? "addr": "172.29.236.177:6789/0",
16:30:26 <evrardjp> that doesn't like nice
16:30:27 <andymccr> im sure its the mon service
16:30:30 <andymccr> not sure what the /0 is
16:30:38 <evrardjp> or "addr": "0.0.0.0:0/1",
16:31:29 <jamespage> odyssey4me: pylxd 2.0.6 is now on pypi
16:31:40 <odyssey4me> jamespage brilliant, thank you very much!
16:31:52 <andymccr> the /0 is fine
16:32:12 <andymccr> "extra_probe_peers": [
16:32:12 <andymccr> "172.29.236.32:6789/0",
16:32:12 <andymccr> "172.29.236.33:6789/0"
16:32:16 <andymccr> for e.g. on my test aio right now i have ^
16:32:21 <andymccr> and that succeeded fine
16:32:40 <evrardjp> the configuration doesn't look that bad on the given tar file
16:32:55 <evrardjp> jamespage: \o/
16:33:10 <jamespage> tinwood: ^^ take the praise (not my work)
16:33:22 <evrardjp> tinwood: \o/ too!
16:33:27 <tinwood> thanks
16:33:30 <evrardjp> andymccr: what's your take?
16:33:33 <evrardjp> on the bug
16:33:36 <andymccr> http://tracker.ceph.com/issues/21427 id follow that
16:33:53 <andymccr> it is failing on a specific task - ceph-create-keys i think if we could get more info
16:33:58 <andymccr> like what does a manual run output. etc
16:34:10 <andymccr> its hard to say if its something setup in our deploys or a ceph-ansible bug or a ceph bug overall
16:35:03 <evrardjp> could you comment on the bug to ask the additional info, please?
16:35:11 <evrardjp> I'll mark it as incomplete
16:35:55 <evrardjp> ok let's move on
16:35:57 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745281
16:35:58 <openstack> Launchpad bug 1745281 in openstack-ansible "galera_server : Create galera users fails on CentOS7" [Undecided,New]
16:36:24 <evrardjp> this proves how centos is broken :/
16:36:41 <evrardjp> or on how the deployer's environment is broken
16:36:45 <evrardjp> :D
16:36:55 <andymccr> will do
16:37:10 <andymccr> it also shows ppl kinda like centos :P
16:37:46 <evrardjp> I will tackle this one I guess, except if mgariepy has the time to do it between two things?
16:38:52 <openstackgerrit> Kevin Carter (cloudnull) proposed openstack/openstack-ansible-lxc_hosts master: Clean-up old systemd prep and allow machinctl to grow https://review.openstack.org/527592
16:38:53 <idlemind> +1 centos, i haven't seen this one yet but i only have a single galera container atm (fail on me i know right)
16:39:52 <evrardjp> good point idlemind , I have to try with a cluster.
16:39:55 <evrardjp> let's move on
16:40:07 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745270
16:40:08 <openstack> Launchpad bug 1745270 in openstack-ansible "unable to connect to epmd (port 4369) on CentOS7" [Undecided,New]
16:40:40 <evrardjp> now I am thinking this whole series is a networking issue :p
16:41:04 <evrardjp> let's move on
16:41:06 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745215
16:41:07 <openstack> Launchpad bug 1745215 in openstack-ansible "Every openstack client is built in the repo build" [Undecided,New]
16:42:16 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible stable/newton: Update os_nova and repo_build role SHA's https://review.openstack.org/541259
16:43:15 <cloudnull> evrardjp: seems like not a bug, given the mentioned refactor ?
16:43:28 <odyssey4me> yeah, feature release
16:43:43 <odyssey4me> it's a bit painful, but not a bug - it's due to heat and that's life
16:44:00 <odyssey4me> if you don't use heat, remove it from your config and life will be awesome again :)
16:44:23 <openstackgerrit> Major Hayden proposed openstack/openstack-ansible-tests master: Add a status line for SELinux status https://review.openstack.org/541371
16:45:01 <odyssey4me> I mean, we could propose patches to the heat client repo, which would be nice - if someone has the time and inclination, awesome
16:45:24 <evrardjp> yeah. In the meantime it's wishlist I'd say, and confirmed.
16:45:29 <odyssey4me> yep
16:45:40 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1745212
16:45:41 <openstack> Launchpad bug 1745212 in openstack-ansible "default_bind_mount_logs changes on N>O upgrade" [Undecided,New]
16:46:24 <evrardjp> we can mark this fix released I guess?
16:46:28 <odyssey4me> looks like it
16:46:53 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1744458
16:46:54 <openstack> Launchpad bug 1744458 in openstack-ansible "Failed to build cradox on CentOS7 " [Undecided,New]
16:47:01 <odyssey4me> unless it's not actually solved? mgariepy - I see the reviews were marked 'related', not 'closes'
16:47:27 <odyssey4me> I think this was solved recently
16:47:59 <odyssey4me> https://review.openstack.org/#/c/530570/
16:48:01 <evrardjp> Yes I think it looks solved
16:48:32 <evrardjp> fix released
16:48:35 <evrardjp> woot
16:48:38 <mgariepy> hrm , sorry was the time i was half - not here haha :)
16:48:41 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1743032
16:48:42 <openstack> Launchpad bug 1743032 in openstack-ansible "Galera cluster maintenance in OpenStack-Ansible" [Undecided,New] - Assigned to Kevin Carter (kevin-carter)
16:49:22 <evrardjp> cloudnull: do you have time to check at this one, or should we unassign it?
16:49:40 <nurdie_> https://docs.openstack.org/openstack-ansible-ceph_client/latest <-- is that just a guide on what to include when making our own ceph-install play? So, for instance, this could replace the ceph-install play that gets called from /opt/openstack-ansible/playbooks/setup-infrastructure.yml?
16:50:03 <nurdie_> for integration with an existing ceph cluster
16:50:05 <evrardjp> nurdie_: let's talk after the bug triage, in 10 minutes.
16:50:14 <nurdie_> evrardjp, thanks!
16:50:15 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_nova stable/newton: Revert "Switch the LXD compute driver test to non-voting" https://review.openstack.org/541372
16:51:08 <cloudnull> evrardjp: I can look into that today
16:51:18 <evrardjp> that's great!
16:51:19 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_nova stable/newton: Revert "Switch the LXD compute driver test to non-voting" https://review.openstack.org/541372
16:51:25 <evrardjp> next
16:51:27 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1739472
16:51:28 <openstack> Launchpad bug 1739472 in openstack-ansible "mariadb-client excluded in ocata centos7 deploy" [Undecided,New] - Assigned to Markos Chandras (hwoarang)
16:52:11 <evrardjp> hwoarang: said he cannot attend the meeting today, so let's postpone the discussion of this, unless someone has anything to say?
16:52:34 <evrardjp> ok let's move on then
16:52:38 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1737827
16:52:39 <openstack> Launchpad bug 1737827 in openstack-ansible "(ceph-client): setting 'nova_ceph_client' results in deployment where volumes can't be attached to VMs" [Undecided,New]
16:53:07 <odyssey4me> that's an old bug IIRC
16:53:29 <odyssey4me> what I mean is, I've seen that discussed in channel before - I think admin0 made some noise about it
16:53:41 <andymccr> ahh igot feedback on that
16:54:09 <evrardjp> odyssey4me: yeah, I am sure I got that biting me too in the past... but things might have changed now :)
16:54:12 <evrardjp> andymccr: ?
16:54:24 <andymccr> lemme look at the comments
16:54:37 <andymccr> i'll take that one
16:54:40 <evrardjp> ok
16:54:50 <evrardjp> assigning it to you
16:54:58 <evrardjp> thanks
16:55:05 <evrardjp> next
16:55:07 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1729263
16:55:09 <openstack> Launchpad bug 1729263 in openstack-ansible "Swift (master) transient tempest failures under centos " [Undecided,New]
16:55:20 <evrardjp> let's skip that one
16:55:25 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1721554
16:55:26 <openstack> Launchpad bug 1721554 in openstack-ansible "os_ceilometer fails without swift installed" [Medium,New]
16:56:01 <evrardjp> I have the impression there was a patch for this one
16:57:01 <evrardjp> mnaser: are you there?
16:57:06 <mnaser> o/
16:57:24 * mnaser reads that bug
16:58:06 <evrardjp> thanks mnaser !
16:58:49 <mnaser> okay it looks like this is something that should be skipped if not using swift
16:58:50 <evrardjp> If you are working on removing mongo there are maybe other things on that topic that can cause issues, and things to be aware of...
16:58:57 <evrardjp> thanks for having a look!
16:59:46 <evrardjp> if you need any help, don't hesitate mnaser !
17:00:00 <evrardjp> ok we are out of time
17:00:13 <evrardjp> thanks everyone for your help on making the world of openstack-ansible better
17:00:22 <evrardjp> #endmeeting