#openstack-ansible log

16:01:43 <evrardjp> #startmeeting openstack_ansible_meeting
16:01:44 <openstack> Meeting started Tue Apr 18 16:01:43 2017 UTC and is due to finish in 60 minutes.  The chair is evrardjp. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:46 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:48 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:01:58 <andymccr> lets do this!
16:01:58 <evrardjp> #topic Last week AP
16:02:15 <evrardjp> All of last week's APs are done
16:02:25 <evrardjp> #topic this week triage
16:02:42 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1682481
16:02:44 <openstack> Launchpad bug 1682481 in openstack-ansible "When using openstack-ansible-lxc_container_create role with an SSH jumpbox, sits waiting forever for container even though ssh is available." [Undecided,New]
16:02:59 <andymccr> hmm.
16:03:14 <andymccr> confirmed/medium? im not sure what a good fix would be though
16:03:41 <evrardjp> delegate_to: "{{ physical_host }}" ?
16:04:23 <evrardjp> but I'd say confirmed medium yes.
16:04:35 <evrardjp> ansible 2.3 and wait_for_connection should be a good solution
16:04:43 <evrardjp> but we are not there yet.
16:04:44 <andymccr> im not sure that'd work - basically you'd need to test the port via the SSH jumpbox i assume?
16:04:54 <andymccr> the only way that'd work is to test SSH rather than test a port
16:05:07 <evrardjp> yeah, that's what wait_for_connection will do
16:05:16 <andymccr> yeah but thats only 2.3 i guess
16:05:33 <evrardjp> I'll mark low-hanging-fruit and ask if delegate_to: "{{ physical_host }}" would be enough
16:06:03 <andymccr> i think only performing an SSH connection test would actually work based on the bug
16:06:05 <andymccr> but yeah lets move on
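
[Editor's note] The exchange above contrasts two readiness checks for bug 1682481: probing the container's SSH port directly from the deployment host, which cannot succeed when the container is only reachable through a jumpbox, and attempting a real SSH connection that follows the same path Ansible uses. The Python sketch below only illustrates that difference; it is not the role's implementation. The function names, the host names in the usage note, and the choice of OpenSSH's `-J` (ProxyJump) option are assumptions made for illustration.

```python
import socket
import time
from typing import Callable, List


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Direct TCP probe. Only meaningful if this machine can route to `host`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ssh_probe_command(target: str, jumpbox: str) -> List[str]:
    """Build an ssh call that reaches `target` through `jumpbox` and runs `true`.

    Both host names are hypothetical placeholders supplied by the caller.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt for a password
        "-o", "ConnectTimeout=5",   # give up quickly on an unreachable host
        "-J", jumpbox,              # hop through the jumpbox (OpenSSH ProxyJump)
        target,
        "true",
    ]


def wait_until(probe: Callable[[], bool], retries: int = 5, delay: float = 0.0) -> bool:
    """Call `probe` until it returns True or `retries` attempts are used up."""
    for _ in range(retries):
        if probe():
            return True
        time.sleep(delay)
    return False
```

A caller could pass `lambda: subprocess.run(ssh_probe_command("container1", "jump.example.org")).returncode == 0` as the probe to `wait_until`, so that readiness is judged over the jumpbox path rather than by a direct port check.
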
16:06:20 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1682169
16:06:23 <openstack> Launchpad bug 1682169 in openstack-ansible "Upgrade N->O nova-manage cell_v2 issue" [Undecided,New]
16:06:40 <andymccr> i'll take a look at this one
16:06:57 <evrardjp> ok, for triaging or for fixing?
16:06:59 <andymccr> i need to run some tests around upgrades but it sounds plausible and repeatable so need to figure it out
16:07:04 <evrardjp> ok
16:07:09 <andymccr> fixing/confirming at the very least :)
16:07:13 <evrardjp> #action andymccr triage https://bugs.launchpad.net/openstack-ansible/+bug/1682169
16:07:20 <evrardjp> this way it's marked for next week.
16:07:34 <evrardjp> next
16:07:36 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1682161
16:07:37 <openstack> Launchpad bug 1682161 in openstack-ansible "Upgrade N->O Nova services launch with newton tag and not ocata (systemd issue)" [Undecided,New]
16:08:15 <cloudnull> o/
16:08:27 <spotz> cloudnull is volunteering:)
16:08:31 <evrardjp> it makes sense to me.
16:08:40 <evrardjp> I've seen others like this sadly.
16:08:49 <evrardjp> or I seem to recall others like this.
16:08:55 * cloudnull reading bug
16:08:58 <evrardjp> Confirmed medium?
16:09:07 <andymccr> hmm
16:09:20 <evrardjp> https://github.com/openstack/openstack-ansible-os_nova/blob/master/tasks/nova_init_systemd.yml#L78
16:10:09 <cloudnull> evrardjp: i think we need a daemon reload too
16:10:22 <andymccr> im wondering if the 2 are related
16:10:34 <evrardjp> or this: https://github.com/openstack/openstack-ansible-os_nova/blob/966ea269c9acb88052aca8982c37e0ba8a94a207/tasks/nova_init_common.yml#L20
16:10:42 <evrardjp> cloudnull: agreed.
16:11:09 <evrardjp> I think we need to make this daemon-reload standard on all our roles
16:11:19 <evrardjp> I thought of this: https://review.openstack.org/#/c/440463/
16:11:28 <cloudnull> well..
16:11:30 <cloudnull> https://github.com/openstack/openstack-ansible-os_nova/blob/stable/ocata/handlers/main.yml#L71
16:11:34 <cloudnull> so it's part of the task
16:11:39 <evrardjp> oh ok
16:11:51 <cloudnull> but... maybe it doesn't actually reload the daemon ?
16:11:52 <evrardjp> I misread
16:12:27 <cloudnull> is Grelaud (fabrice-grelaud) here ?
16:13:24 <cloudnull> so this could be an upstream ansible bug too
16:13:27 <evrardjp> so we are at a state where we need confirmation
16:14:02 <cloudnull> based on the diff it makes sense that something in the service / systemd task handler is not actually doing the daemon reload
16:14:28 <evrardjp> yeah
16:14:39 <evrardjp> oh wait
16:15:04 <cloudnull> this spells it out fairly clearly https://launchpadlibrarian.net/315388078/nova-ocata-systemd-bug.txt
16:17:45 <cloudnull> so i guess the next step is trying to figure out how we fix this everywhere.
16:18:51 <evrardjp> If we had a log of the whole run that would be great
16:18:55 <evrardjp> do we need to ask for this?
16:19:14 <evrardjp> or will someone try to figure it out without the complete run log?
16:19:21 <cloudnull> that would be helpful
16:19:33 <cloudnull> I think we've enough to confirm the issue and mark it medium
16:19:39 <evrardjp> I can only think of issues in the run after the flush_handlers.
16:19:48 <jmccrory> did this help? https://review.openstack.org/#/c/452327/
16:19:48 <evrardjp> but even that
16:19:52 <cloudnull> go ahead and assign me to it and I'll try and recreate
16:20:11 <cloudnull> jmccrory: it should have
16:20:18 <evrardjp> wait.
16:20:28 <evrardjp> Maybe he is not using the latest ocata. That's right
16:20:40 <evrardjp> cloudnull: ok
16:20:46 <cloudnull> oh. . .https://github.com/openstack/openstack-ansible-os_nova/blob/15.1.0/handlers/main.yml#L71
16:20:55 <cloudnull> that bug was on 15.1.0
16:20:58 <evrardjp> haha!
16:21:03 <cloudnull> which did not have that change
16:21:09 <spotz> score!
16:21:12 <evrardjp> good catch jmccrory we were only on the branch, not on the right tag!
16:21:20 <cloudnull> good catch jmccrory!
16:21:20 <evrardjp> boom, already fixed!
16:21:38 <jmccrory> hah good
16:21:41 <cloudnull> andymccr: when can we get a tag out for 15.1.1
16:21:54 <andymccr> https://review.openstack.org/#/c/456640/
16:22:05 <evrardjp> boom!
16:22:06 <evrardjp> team team team
16:22:09 <cloudnull> fixed!
16:22:11 <andymccr> i will try to follow it up cloudnull
16:22:15 <cloudnull> cool
16:22:29 <evrardjp> marking as invalid
16:22:31 <evrardjp> next
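
[Editor's note] The discussion of bug 1682161 turns on ordering: if a systemd unit file changes and the service is restarted without a daemon reload first, systemd keeps using the unit definition it already has loaded. The Python sketch below illustrates that ordering only; it is not the os_nova handler. The helper name and the unit name in the usage note are hypothetical, while `systemctl daemon-reload` and `systemctl restart` are the standard systemd commands.

```python
from typing import List


def restart_commands(unit: str, unit_file_changed: bool) -> List[List[str]]:
    """Return the systemctl calls needed to restart `unit`, in order.

    Illustrative helper: the caller decides whether the unit file changed.
    """
    commands: List[List[str]] = []
    if unit_file_changed:
        # Reload first so systemd re-reads the unit file from disk;
        # otherwise the restart reuses the previously loaded definition.
        commands.append(["systemctl", "daemon-reload"])
    commands.append(["systemctl", "restart", unit])
    return commands
```

For example, `restart_commands("nova-compute.service", True)` yields the reload followed by the restart, which a caller could then execute with `subprocess.run(..., check=True)`.
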
16:22:42 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1682108
16:22:44 <openstack> Launchpad bug 1682108 in openstack-ansible "[vagrant] openstack-ansible-lxc_hosts: The `lxc` module is not importable. Check the requirements."" [Undecided,New]
16:23:09 <evrardjp> that's hwoarang's bug
16:23:30 <andymccr> ahh this has a fix already too :)
16:23:35 <hwoarang> yeah so i posted a solution which i'm not sure if it's a workaround or not
16:23:37 <evrardjp> Related, not closed
16:23:56 <evrardjp> well at least we can review this :)
16:24:08 <hwoarang> i need some help from someone with access to a live openstack CI host to figure out why these hosts use pip2 from the real root instead of the one from the virtualenv :(
16:24:46 <hwoarang> otherwise i can't progress much since it's going to be a guessing game
16:24:54 <evrardjp> yeah, I guess.
16:24:57 <evrardjp> hahaha.
16:25:11 <cloudnull> hwoarang: we might be able to ask infra to hold a node for us
16:25:24 <evrardjp> that's possible
16:25:37 <cloudnull> sadly what evrardjp said is normal. its a guessing game.
16:25:39 <evrardjp> otherwise, cloudnull, is there a way to get the latest infra image for local testing?
16:25:54 <cloudnull> you can build it with DIB
16:26:11 <evrardjp> there is no artifact of the daily build ?
16:26:32 <cloudnull> hwoarang: in the past we've added rapid break points to a commit so it runs, breaks, and then we go look and try to move things ahead
16:27:08 <evrardjp> I guess we should avoid that as much as possible, IIRC.
16:27:38 <hwoarang> i see
16:27:42 <evrardjp> cloudnull: could you have a look with hwoarang?
16:28:04 <hwoarang> if holding a node for half an hour is possible that would be very helpful
16:28:08 <cloudnull> sure.
16:28:30 <hwoarang> ok lets discuss after
16:28:39 <hwoarang> the meeting
16:28:41 <evrardjp> #action hwoarang cloudnull investigate on https://bugs.launchpad.net/openstack-ansible/+bug/1682108 with infra
16:28:42 <openstack> Launchpad bug 1682108 in openstack-ansible "[vagrant] openstack-ansible-lxc_hosts: The `lxc` module is not importable. Check the requirements."" [Undecided,New]
16:28:53 <evrardjp> something like that, we'll remember for next week :)
16:29:03 <evrardjp> ok next
16:29:05 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1681714
16:29:05 <openstack> Launchpad bug 1681714 in openstack-ansible " Ceph cluster’s performance are not monitored by Telegraf" [Undecided,New] - Assigned to Bertrand Lallau (bertrand-lallau)
16:29:17 <evrardjp> confirmed wishlist?
16:29:22 <andymccr> sure
16:29:22 <cloudnull> ++
16:29:31 <evrardjp> ok next
16:29:32 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1681695
16:29:34 <openstack> Launchpad bug 1681695 in openstack-ansible "Incorrect keystone with multiple memcache configuration" [Undecided,New] - Assigned to Jean-Philippe Evrard (jean-philippe-evrard)
16:30:01 <evrardjp> I'll work on this.
16:30:04 <cloudnull> if bertrand is here, we're intending to port a mess of ceph plugins over to monitorstack soon and get that into our telegraf setup
16:30:21 <evrardjp> for the last one basically we have to think where to put memcache in the future
16:30:35 <evrardjp> but it's fine, I already have a commit for the start of this.
16:30:54 <evrardjp> that's all for today
16:30:55 <andymccr> cool :)
16:30:57 <andymccr> good job!
16:31:02 <evrardjp> thanks everyone!
16:31:08 <evrardjp> anything else to add?
16:31:13 <andymccr> nope!
16:31:16 <evrardjp> #endmeeting