16:03:55 <evrardjp> #startmeeting openstack_ansible_meeting
16:03:56 <openstack> Meeting started Tue Aug 22 16:03:55 2017 UTC and is due to finish in 60 minutes.  The chair is evrardjp. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:57 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:04:00 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:04:09 <evrardjp> #topic rollcall
16:04:09 <spotz> \o/
16:04:40 <evrardjp> waiting a few seconds for ppl to join
16:04:54 <prometheanfire> o/
16:05:37 <andymccr> o/
16:06:52 <jmccrory> o/
16:07:01 <evrardjp> #topic this week's bugs
16:07:06 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1712315 Newton integrated build fails in nodepool when executing galera_server role
16:07:07 <openstack> Launchpad bug 1712315 in openstack-ansible "Newton integrated build fails in nodepool when executing galera_server role" [Undecided,New]
16:07:54 <evrardjp> That sounds serious for gating.
16:08:11 <evrardjp> Confirmed critical ?
16:08:19 <andymccr> we are non voting on xenial
16:08:22 <andymccr> so its not critical
16:08:32 <andymccr> we think its due to overlapping repos from infra/what we add in newton
16:08:43 <andymccr> but we should get it fixed/investigate more
16:08:54 <odyssey4me> it may become critical when infra removes the trusty gates
16:09:00 <odyssey4me> perhaps high for now
16:09:13 <andymccr> this is true
16:09:57 <evrardjp> sounds like a good classification to me. Will note the remark about criticality
16:10:04 <evrardjp> we have a fix idea?
16:10:26 <andymccr> not sure - i haven't looked into it enough
16:10:53 <evrardjp> ok
16:11:10 <evrardjp> let's continue in the meantime, if someone has time, it would be great to work on it...
16:11:11 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711765
16:11:12 <openstack> Launchpad bug 1711765 in openstack-ansible "dynamic_inventory.py adds 'physical_host = None' to hostvars" [Undecided,New]
16:11:20 <tasker> <- submitter if you have any questions.
16:11:34 <tasker> of  1711765
16:11:38 <evrardjp> a bug submitter at the triage meeting, that's great!
16:11:44 <tasker> . )
16:12:06 <evrardjp> thanks for being fast at replying too
16:12:31 <evrardjp> the openstack user config seems ok, except for an unclosed quote on "ip", but I guess that's from the editing
16:12:39 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_neutron stable/pike: Simplify the string check for offline db migrations  https://review.openstack.org/495349
16:12:54 <tasker> right. boss-men wanted the IPs cleaned for submission.
16:13:09 <evrardjp> tasker: small question, are all the tuples (hostname, ip) the same for all the groups?
16:13:16 <evrardjp> fine for me
16:13:17 <tasker> yes
16:13:23 <evrardjp> ok
16:14:59 <tasker> I was thinking about the title earlier this morning and thinking that it might not accurately summarize what the problem is. it's not so much that "physical_host = None", but that None gets added to the host_list.
16:15:21 <openstackgerrit> Merged openstack/openstack-ansible-os_neutron stable/pike: tasks: neutron_install: Fix virtualenv-tools issue on openSUSE  https://review.openstack.org/495360
16:15:28 <evrardjp> yes, but weirdly I tried a full on metal thing recently, and I didn't get that
16:15:30 <openstackgerrit> Merged openstack/openstack-ansible-os_neutron master: Update reno for stable/pike  https://review.openstack.org/495064
16:15:35 <evrardjp> I am concerned that there is something else there
16:15:57 <evrardjp> while the sanity check doesn't hurt by itself, I am worried it could help hide other issues.
16:16:25 <tasker> and I'm more than willing to blame a problem in my inventory, but I thought that the sanity check could help at least prevent further issues. ... ahh, I got you.
16:17:22 <palendae> Has been a while since I looked at the inventory code, not sure what the follow on effects of that would be
16:17:24 <evrardjp> yeah. jmccrory opinion?
16:17:34 <evrardjp> palendae: \o/
16:17:45 <palendae> So basically I'm echoing evrardjp :p
16:18:01 <evrardjp> palendae: ah that's what it meant :)
16:18:17 <tasker> the follow on is when the upstream ansible inventory.Host class tries to stringify the host.name when it's None.
16:18:20 <evrardjp> how can we help triaging the issue at the inventory then?
16:18:35 <palendae> tasker: Right, I'm more thinking of our inventory module, which is a bit messy
16:18:42 <jmccrory> evrardjp agree about sanity check maybe hiding cause of issue
16:18:48 <palendae> It's totally possible there's some hidden assumption about 'None' somewhere else
16:19:19 <evrardjp> so let's focus _not on the fix_ but rather on triaging the issue then
16:19:47 <evrardjp> how can we reproduce this, as the o_u_c seems alright at first sight
16:20:15 <odyssey4me> I think that might have been added to help with the connection plugin
16:20:32 <tasker> does it help you to understand that my gateways are virtual machines within the cluster infrastructure?
16:21:24 <odyssey4me> https://github.com/openstack/openstack-ansible-plugins/blob/6c6753bbede974a4f787f0e574e693ac0414371c/connection/ssh.py#L68-L69
16:21:32 <evrardjp> tasker: you added a group, and then you are using your own playbooks targeting those nodes and you get that issue, whilst you don't get the issue for the other plays?
16:21:48 <jmccrory> think connection plugin was updated to ignore a None or non existent physical_host, but don't know why the inventory script would have set one to None
16:21:48 <odyssey4me> it would appear that this will return None on its own unless I'm misunderstanding how it works
16:21:59 <evrardjp> odyssey4me: oh I see.
16:22:01 <tasker> it occurs regardless of playbook -- it's whenever dynamic_inventory.py runs.
16:22:15 <lady_ada> @cloudnull thanks to your pointers last week, I was able to correct my configuration and get Ocata deployed on CentOS7~
16:22:36 <evrardjp> tasker: I guess these new groups have been configured with an env.d ?
16:22:49 <tasker> that, I don't know.
16:23:20 <evrardjp> tasker: currently openstack ansible needs a structure to generate groups and hosts. If we don't provide a structure for new groups, it will not work.
16:23:30 <evrardjp> it's not simple YAML parsing like upstream ansible.
16:23:39 <evrardjp> we have extra bits
16:25:47 <evrardjp> tasker, we need information about those extra groups
16:25:58 <evrardjp> in the meantime I will mark it as incomplete
16:26:26 <tasker> thanks. I'm sorry I cannot provide any more. I did not lay out the cluster. I'll try to get more info for you in the meantime.
16:26:52 <evrardjp> thanks
16:26:55 <evrardjp> next
16:26:57 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711638
16:26:57 <openstack> Launchpad bug 1711638 in openstack-ansible "neutron-dnsmasq.log not forwarded to rsyslog" [Undecided,New]
16:26:58 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711638
16:27:05 <jmccrory> evrardjp : looks like that's it. i just tried adding the last few lines from the o_u_c in ticket, got a null physical host
16:27:44 <evrardjp> yeah that's what I think too. Let's see what will be the real output, in case there was indeed an env.d
16:27:53 <evrardjp> So we'll probably mark this as invalid.
16:27:58 <evrardjp> but let's see
16:28:29 <odyssey4me> well, we should perhaps make the dynamic inventory handle that case better instead of forcing an env.d file to be created
16:28:32 <evrardjp> for the next one, dnsmasq, I think it would be linked to a bug recently fixed during the bug squash, about rsyslog not starting properly
16:28:39 <evrardjp> odyssey4me: agreed.
16:28:54 <evrardjp> odyssey4me: I think we should take ansible 2.4 yaml upstream format, and get done with it.
16:29:05 <palendae> Just make sure there's a migration path
16:29:06 <evrardjp> but that's me and my radical thinking :p
16:29:19 <palendae> I don't disagree that our current structure should be deprecated
16:29:23 <palendae> Just make sure there's a path
16:29:28 <evrardjp> oh yeah, that would be the problem of radical thinking :p
16:29:32 <odyssey4me> evrardjp yeah, if we can use a better upstream method, and adjust the inventory to cater for both then that'd be a win
16:29:37 <odyssey4me> we can then deprecate the old method
16:29:58 <odyssey4me> I expect that in the dynamic inventory we could import something to allow both to work
16:30:01 <evrardjp> yeah, I think making the inventory optional and all that jazz.... But I haven't written the spec for the ptg yet.
16:30:12 <evrardjp> they can be cumulative.
16:30:20 <evrardjp> well
16:30:22 <palendae> Doesn't have to happen all at once, either
16:30:38 <evrardjp> in the future, it's possible for them to be cumulative
16:30:40 <palendae> I'd advocate a slower approach just because our current inventory isn't very well understood
16:30:42 <evrardjp> so we can phase out
16:30:46 <palendae> And pretty vital
16:30:49 <evrardjp> yeah
16:31:06 <evrardjp> that's what ansible 2.4 could bring us.
16:31:10 <palendae> Again, not disagreeing at all; just thinking slower might be better
16:31:17 <evrardjp> but let's continue the bug triage :)
16:31:20 <palendae> Yes
16:31:28 <evrardjp> for https://bugs.launchpad.net/openstack-ansible/+bug/1711638
16:31:29 <openstack> Launchpad bug 1711638 in openstack-ansible "neutron-dnsmasq.log not forwarded to rsyslog" [Undecided,New]
16:31:38 <evrardjp> I think it's related to rsyslog client.
16:31:55 <evrardjp> so I expect this to be fixed soon, when I'll backport the change.
16:32:06 <openstackgerrit> Merged openstack/openstack-ansible stable/pike: Bootstrap Ansible fails if partial keypair exists  https://review.openstack.org/496164
16:32:19 <openstackgerrit> Merged openstack/openstack-ansible stable/newton: Bootstrap Ansible fails if partial keypair exists  https://review.openstack.org/496166
16:32:21 <evrardjp> I'd say we can confirm in the meantime: if the rsyslog client doesn't ship logs, then it makes sense that dnsmasq logs aren't forwarded.
16:32:26 <openstackgerrit> Merged openstack/openstack-ansible stable/ocata: Bootstrap Ansible fails if partial keypair exists  https://review.openstack.org/496165
16:32:36 <openstackgerrit> Merged openstack/openstack-ansible master: Update reno for stable/pike  https://review.openstack.org/494992
16:32:37 <evrardjp> what do you think?
16:32:42 <evrardjp> Adri2000: are you there?
16:34:01 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_tempest master: Update reno for stable/pike  https://review.openstack.org/495079
16:34:10 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_nova master: Update reno for stable/pike  https://review.openstack.org/495067
16:34:10 <evrardjp> ok let's move on
16:34:12 <evrardjp> next
16:34:14 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711577
16:34:14 <openstack> Launchpad bug 1711577 in openstack-ansible "os_neutron: vpnaas variables missing for RedHat" [Undecided,New]
16:34:15 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_ironic master: Update reno for stable/pike  https://review.openstack.org/495055
16:34:18 <evrardjp> Ok I'll take it.
16:34:26 <evrardjp> I think it's confirmed and low
16:34:39 <evrardjp> but I have a patch pending that touches vpnaas already.
16:35:24 <odyssey4me> evrardjp apologies - was elsewhere, the rsyslog one, I don't think we've told that file to ship anywhere... but it doesn't exist until neutron is running so I'm not sure it'll be easy to make it do the thing... but it may relate
16:35:26 <openstackgerrit> Jean-Philippe Evrard proposed openstack/openstack-ansible-os_neutron master: Fix VPNaaS variable definition for non-ubuntu  https://review.openstack.org/492495
16:36:25 <evrardjp> odyssey4me: oh, so it means it would become a wishlist item?
16:36:38 <evrardjp> I'd say it's still a bug, because expectations aren't set
16:36:41 <Adri2000> I think it's mentioned in the rsyslog config
16:37:03 <odyssey4me> evrardjp I'd suggest that it needs triage first I think - need to verify whether it's already configured
16:37:20 <evrardjp> odyssey4me: that's what we are doing right now :)
16:38:06 <Adri2000> /etc/rsyslog.d/99-neutron-rsyslog-client.conf:$InputFileName /var/log/neutron/neutron-dnsmasq.log
16:38:27 <evrardjp> Sounds confirmed to me
16:38:31 <odyssey4me> in that case, the two bugs may be related then - with the extra complexity that the file may not exist until a later restart
16:38:50 <evrardjp> odyssey4me: that's true
16:38:57 <evrardjp> criticality?
16:39:00 <odyssey4me> if we move all log shipping to use systemd then we'll likely have better results
16:39:05 <Adri2000> and: "<evrardjp> I think it's related to rsyslog client." < if you mean bug #1699875, I'm not sure because I fixed that bug locally on this deployment
16:39:06 <openstack> bug 1699875 in openstack-ansible "rsyslog client postrotate script contains invalid command" [High,In progress] https://launchpad.net/bugs/1699875 - Assigned to Jean-Philippe Evrard (jean-philippe-evrard)
16:39:21 <odyssey4me> my suggestion is low - it's not a big impact and there is a workaround (restart) available
16:40:19 <evrardjp> Adri2000: then what do you mean there?
16:40:23 <evrardjp> odyssey4me: I agree.
16:40:47 <Adri2000> evrardjp: I mean that a restart won't fix the neutron-dnsmasq.log issue
16:40:57 <Adri2000> unless I missed something of course :)
16:41:13 <Adri2000> I think it's really a different issue
16:41:22 <evrardjp> oh ok
16:41:26 <evrardjp> Oh yeah permissions.
16:41:35 <evrardjp> sounds different then.
16:41:41 <Adri2000> maybe permissions indeed
16:41:43 <spotz> oops got distracted:(
16:42:01 <evrardjp> I guess we need more ppl to look at it.
16:42:19 <evrardjp> in the meantime it's triaged, and we'll continue our lives.
16:42:28 <evrardjp> we have much work this week.
16:42:31 <evrardjp> next
16:42:47 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711376
16:42:49 <openstack> Launchpad bug 1711376 in openstack-ansible "intermittent AIO error: Timeout (7s) waiting for privilege escalation prompt" [Undecided,New]
16:42:59 <evrardjp> hwoarang: could you tell us more?
16:43:10 <evrardjp> is the fix there a definitive fix?
16:44:15 <evrardjp> has anyone noticed that, or got the chance to triage it?
16:44:53 <evrardjp> ok let's move on
16:44:55 <evrardjp> next
16:44:55 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711349
16:44:56 <openstack> Launchpad bug 1711349 in openstack-ansible "CentOS/Pike - resolvconf - No file was found when using with_first_found." [Undecided,New]
16:45:17 <evrardjp> mhayden: logan-?
16:45:22 <odyssey4me> this one appears to not be centos specific
16:45:30 <evrardjp> ok
16:45:34 <odyssey4me> and cloudnull has a patch in for it
16:46:32 <evrardjp> I think it got a few eyes on it, so it should be good.
16:47:27 <odyssey4me> I think that one is release critical for pike though
16:48:14 <evrardjp> andymccr: are our pike gates busted?
16:48:25 <andymccr> evrardjp: not that i was aware of.
16:48:30 <evrardjp> sorry to ask, I didn't get the chance to babysit
16:48:37 <andymccr> oh maybe on centos7
16:48:48 <odyssey4me> no, this doesn't show up in the gates for some reason - but it does show up when using vagrant or another environment to build
16:48:48 <andymccr> ok if that error is happening i have an idea why
16:49:09 <odyssey4me> I think we need a little more review/test attention in https://review.openstack.org/#/c/493739/
16:50:21 <evrardjp> well we need a proper triaging of this issue.
16:50:48 <evrardjp> We need a manual build if the gates don't show it.
16:50:51 <evrardjp> ok.
16:51:03 <evrardjp> I will try it.
16:51:36 <evrardjp> next
16:51:49 <jmccrory> it shows up in the periodic upgrade
16:51:59 <evrardjp> oh
16:52:01 <evrardjp> darn
16:52:04 <jmccrory> http://logs.openstack.org/periodic/periodic-openstack-ansible-upgrade-aio-master-ubuntu-xenial/
16:53:15 <evrardjp> yeah sounds legit to me then.
16:53:21 <evrardjp> Confirmed high ?
16:53:45 <jmccrory> sounds right
16:54:25 <evrardjp> ok next
16:54:27 <evrardjp> #link https://bugs.launchpad.net/openstack-ansible/+bug/1711347
16:54:28 <openstack> Launchpad bug 1711347 in openstack-ansible "Config file override mechanism broken" [Undecided,New]
16:55:34 <jamesdenton> oh thats a good one
16:55:36 <jamesdenton> :P
16:55:39 <evrardjp> :)
16:55:46 <evrardjp> I see what you did there jamesdenton! :p
16:55:47 <andymccr> what kind of person would file a bug like this
16:56:36 <evrardjp> HR trap there andymccr.
16:56:36 <ricardoas> Hi everyone... I'm trying to launch an aio with ironic following the instructions on https://docs.openstack.org/openstack-ansible/latest/contributor/quickstart-aio.html
16:56:37 <ricardoas> I copied ironic.yaml.aio to conf.d, removed .aio, and ran openstack-ansible setup-{hosts,infrastructure,openstack}.yml but there is no ironic related container... is there any additional step to enable it?
16:56:48 <jamesdenton> yeah i don't quite know how to answer that
16:56:57 <evrardjp> haha
16:56:59 <evrardjp> well
16:57:06 <evrardjp> let's finish this triage shall we? :D
16:57:35 <evrardjp> so for us to reproduce the issue, we need to be full python3 ? I thought we didn't support that
16:58:13 <andymccr> evrardjp: not yet
16:58:24 <evrardjp> fair enough ;)
16:58:30 <andymccr> evrardjp: i think thats wish list. we need to do that next cycle tbh
16:58:40 <jamesdenton> Well, this environment was deployed on a fresh Ubuntu 16.04 image, so i'm not sure how it would've ended up using Py 3.5 rather than 2.7
16:58:42 <evrardjp> my point is that I am surprised that the action plugin is by default using python3
16:58:57 <evrardjp> /opt/ansible-runtime/lib/python3.5/site-packages/ansible/executor/task_executor.py
16:59:05 <evrardjp> :p
16:59:31 <evrardjp> so probably need to make the config template dual python compatible too.
16:59:35 <spotz> ricardoas: We're in the middle of bug triage, someone can help you in a few
16:59:42 <evrardjp> for me it makes sense to do it.
17:00:21 <evrardjp> so I'd say confirmed and medium at least, because it's in openstack strategy, .... but on top of it, if it prevents deploys, I'd move it high
17:00:26 <ricardoas> thanks, spotz! I'll keep trying here... :)
17:00:35 <evrardjp> andymccr: opinion?
17:00:45 <andymccr> hmm
17:00:57 <andymccr> so how do we repeat this error?
17:01:18 <evrardjp> have python3 when bootstrapping ansible I guess
17:01:28 <andymccr> evrardjp: but we do that by default already?
17:01:37 <andymccr> for bootstrapping ansible i believe we already do py3
17:01:40 <andymccr> atleast on ubuntu
17:01:40 <evrardjp> and then we could technically use our plugin without the wrapper
17:02:17 <evrardjp> yeah but if we install python, all the #!/usr/bin/env python will be using python2
17:02:24 <evrardjp> what I mean here is we generally use python2
17:02:39 <evrardjp> we could force python3 in our wrapper?
17:02:50 <evrardjp> jamesdenton: what did you do to get that issue?
17:03:02 <evrardjp> Nothing in particular? Did you bootstrap-ansible.sh ?
17:03:14 <jamesdenton> yes, i bootstrapped aio and ansible
17:03:21 <jamesdenton> basically followed the developer guide
17:03:46 <evrardjp> mmm . We need to stop the bug triage meeting for today, we are already exceeding time allocated.
17:03:47 <jamesdenton> the environment went down fine. Just making changes post-deploy using the override mechanism is how i ran into it
17:03:57 <odyssey4me> we'll need to make a call about whether py3 for the ansible venv should be in pike or not
17:04:13 <odyssey4me> we may need to revert to py2 for pike, and continue work on it for queens
17:04:17 <evrardjp> odyssey4me: it's too late for that I think.
17:04:31 <odyssey4me> evrardjp it's never too late ;)
17:04:37 <evrardjp> fair enough :)
17:04:58 <andymccr> jamesdenton: hmm maybe
17:04:59 <evrardjp> well let's leave the bug as is, for ppl to triage it
17:05:02 <andymccr> we should test that then
17:05:14 <andymccr> ok thats an interesting bug at least
17:05:17 <odyssey4me> I'd prefer to continue with what we have - but if we think there's a fair risk of trouble we could perhaps revert it - it's only tested on ubuntu right now
17:05:27 <andymccr> probably agree with odyssey4me on ripping out py3 from pike
17:05:45 <evrardjp> I'd say ok to me.
17:06:05 <evrardjp> let's close the bug triage and discuss that in another meeting
17:06:11 <odyssey4me> yup
17:06:22 <evrardjp> thanks everyone for this very busy and very active bug triage!
17:06:28 <evrardjp> more next week ! :p
17:06:31 <evrardjp> #endmeeting