16:01:03 <evrardjp> #startmeeting openstack_ansible_meeting
16:01:05 <openstack> Meeting started Tue Feb 28 16:01:03 2017 UTC and is due to finish in 60 minutes.  The chair is evrardjp. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:08 <asettle> Hooray
16:01:08 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:01:19 <spotz> o/
16:01:25 <evrardjp> Large list today
16:01:38 <andymccr> mgariepy: perhaps that works as a stop gap for now.
16:01:45 <evrardjp> So, sorry for not having sent an email last week to say the meeting was cancelled
16:01:49 <evrardjp> during PTG
16:01:54 <andymccr> ^i am also sorry for that.
16:02:01 <evrardjp> anyway, let's move on.
16:02:05 <andymccr> ahha yes quickly move on :P
16:02:12 <alextricity25> lol
16:02:15 <evrardjp> :D
16:02:21 <evrardjp> first one of the day!
16:02:23 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667960
16:02:23 <openstack> Launchpad bug 1667960 in openstack-ansible "Rebuilding keystone[0] container breaks credential keys" [Undecided,New]
16:03:04 <evrardjp> I saw a commit in a similar topic, done by logan-
16:03:23 <evrardjp> I think other ppl voted on it, and I think we can mark this as confirmed
16:03:23 <andymccr> ok
16:03:29 <andymccr> yeah that sounds legit
16:03:54 <evrardjp> I still think there are other ways to solve this, and we are increasingly lacking a proper facility to do file distribution with ansible
16:04:11 <evrardjp> but let's forget this for now, and focus on the bug
16:04:21 <evrardjp> IMO the criticality is high (at minimum)
16:04:30 <evrardjp> if we break credential keys, it's serious
16:04:32 <logan-> agreed on criticality and confirmedness
16:04:32 <andymccr> yeah
16:04:32 <andymccr> agreed
16:04:32 <andymccr> at least high
16:04:42 <logan-> because by the time you find out it is broken, the keys are unrecoverable
16:04:47 <andymccr> yeah
16:04:51 <evrardjp> critical then?
16:04:54 <andymccr> yeah lets say that
16:04:57 <evrardjp> ok
16:04:57 <logan-> yep
16:04:59 <andymccr> that is pretty nasty
16:05:00 <evrardjp> who works on it
16:05:16 <evrardjp> logan-: you have time to work on this one too?
16:05:30 <logan-> i have some ideas on how we might be able to solve it. will work on a patch this week for it
16:05:35 <evrardjp> I think whatever the method, the important thing is getting it merged right now (harry principle)
16:05:55 <openstackgerrit> Merged openstack/openstack-ansible stable/newton: Bump BIRD and etcd role pins  https://review.openstack.org/438688
16:05:59 <evrardjp> logan-: great!
16:06:01 <openstackgerrit> Merged openstack/openstack-ansible master: Use an explicit version of urrlib3  https://review.openstack.org/438977
16:06:09 <evrardjp> let's move on to the next one.
16:06:21 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667814
16:06:21 <openstack> Launchpad bug 1667814 in openstack-ansible "inventory-manage.py --file option can cause the output to overwrite the default inventory file" [Undecided,New] - Assigned to Joel Griffiths (5s-ubuntu)
16:06:23 <andymccr> well id rather it was a good fix than an anything goes fix :P but yeah it would be good to get it fixed asap
16:07:03 <evrardjp> I was not even aware of this feature
16:07:26 <evrardjp> palendae: can we assume it's worked on by Joel Griffiths?
16:07:32 <andymccr> hmm
16:07:51 <evrardjp> if True; status -> In Progress?
16:08:20 <andymccr> ahh this one is diff
16:08:48 <evrardjp> that's a bad bug
16:08:51 <andymccr> yeah
16:09:11 <evrardjp> can someone confirm it?
16:09:31 <andymccr> haven't tested it
16:09:52 <andymccr> evrardjp: lets move on - im sure palendae will be back in a bit and can help us look at that one!
16:09:59 <evrardjp> Importance would be medium because it breaks stuff (high) but with a low probability to happen.
16:10:00 <palendae> evrardjp: I'll double check it this afternoon, but last I talked to Joel it sounded plausible
16:10:05 <andymccr> ahh see! easy :P
16:10:10 <evrardjp> :D
16:10:36 <evrardjp> palendae: you can move it to In Progress if Joel is working on it :)
16:10:44 <evrardjp> in the meantime confirmed medium
16:10:49 <palendae> evrardjp: Alrighty
16:11:17 <evrardjp> next
16:11:19 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667796
16:11:19 <openstack> Launchpad bug 1667796 in openstack-ansible "lookup url function proxy issue with venv checksum" [Undecided,New]
16:11:45 <evrardjp> let's wait for news
16:11:58 <evrardjp> technically ansible should respect env vars for that
16:12:03 <evrardjp> IIRC
16:12:16 <andymccr> evrardjp: agreed - more info would be good
16:12:29 <evrardjp> I'll mark it as incomplete
16:12:43 <evrardjp> next
16:12:45 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667747
16:12:45 <openstack> Launchpad bug 1667747 in openstack-ansible "Swap creation fails on CentOS AIO" [Undecided,New]
16:13:18 <evrardjp> I'm fine with the fix.
16:13:22 <evrardjp> :D
16:13:22 <andymccr> agreed
16:13:34 <evrardjp> confirmed low, low hanging fruit
16:13:47 <evrardjp> any beginner in openstack-ansible wanting to patch this?
16:14:14 <evrardjp> ok let's move on
16:14:18 <evrardjp> next:
16:14:20 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667337
16:14:20 <openstack> Launchpad bug 1667337 in openstack-ansible "Error running os-nova-install with nova-config tag in 15.0.0" [Undecided,New]
16:15:10 <andymccr> hmm
16:15:15 <evrardjp> probably due to filtering of the tag
16:15:34 <evrardjp> worth having a look
16:15:53 <evrardjp> I think cloudnull had a patch for tags in nova IIRC
16:15:58 <andymccr> jamesdenton: does that only fail with the "--tags", or would it fail normally too?
16:16:33 <jamesdenton> hi
16:16:44 <asettle> A wild operator appears
16:17:05 <andymccr> the irc ping thing is beautiful :D
16:17:07 <jamesdenton> so i had no issue with os-nova-install as part of the deploy, only with --tags nova-config
16:17:11 <asettle> andymccr: truly is
16:17:14 <andymccr> jamesdenton: ok cool thanks!
16:17:41 <evrardjp> yeah, the pip install part is not part of config, it's only part of install
16:17:49 <jamesdenton> i'll hang around for the other bugs, too :P
16:17:55 <andymccr> jamesdenton: repeat offender!
16:18:01 <palendae> Stop breaking it
16:18:03 <palendae> ;)
16:18:19 <jamesdenton> :P
16:18:31 <evrardjp> the question is why ansible just doesn't "only run this tag"
16:18:49 <evrardjp> jamesdenton: meanwhile you can maybe use --skip-tags=nova-install
16:19:17 <evrardjp> or just run the whole thing
16:19:30 <evrardjp> let's say confirmed and low for the triage, ok for everyone?
16:19:31 <jamesdenton> i could - and will - if i need to do it again
16:19:37 <jamesdenton> thanks for the suggestion
16:19:49 <evrardjp> I think we could refine this
16:19:56 <andymccr> yeah that tags thing is weird
16:20:00 <andymccr> but yeah the triage seems ok
16:20:18 <evrardjp> ok
16:20:19 <evrardjp> next
16:20:31 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667193
16:20:31 <openstack> Launchpad bug 1667193 in openstack-ansible "During N->O upgrade, conditional check bombs when inventory hostname not in nova service name" [Undecided,New]
16:20:32 <evrardjp> still jamesdenton
16:20:34 <evrardjp> :D
16:20:40 <jamesdenton> ahh yes
16:21:06 <jamesdenton> In this case, the hostname of the compute node != to the name defined in inventory
16:21:41 <andymccr> oh i see
16:21:45 <evrardjp> what's the difference?
16:21:48 <andymccr> basically
16:21:49 <logan-> it is using ansible_hostname there, not inventory_hostname
16:21:49 <evrardjp> _ vs - ?
16:21:55 <logan-> do you have stale facts maybe?
16:22:13 <jamesdenton> no, in this case the inventory hostname was very basic (ie. compute01) vs i812847.NewtonTest.com
16:22:20 <evrardjp> ansible hostname should be fine, it should be the hostname of the compute node
16:22:26 <jamesdenton> and the latter is how the services register themselves to Nova
16:22:53 <jamesdenton> *shrug*
16:23:28 <logan-> unless you are overriding your nova hostname in nova.conf, ansible_hostname should match the default hostname that nova uses to register its service
16:23:45 <evrardjp> ^ agreed
16:23:46 <jamesdenton> yeah, definitely not overriding.
16:24:17 <evrardjp> jamesdenton: check the 127.0.1.1 line in your /etc/hosts
16:24:22 <evrardjp> could you report it in the bug?
16:24:25 <jamesdenton> yes one sec
16:24:36 <evrardjp> and hostname file
16:25:09 <evrardjp> I don't see a bug there in the code, but there could be a bug in our hosts file management
16:25:23 <evrardjp> (I still think we should use dns)
16:25:44 <evrardjp> rabbit!
16:26:31 <jamesdenton> i will update the bug in a bit - 127.0.0.1 line is just localhost
16:26:40 <evrardjp> and 1.1 ?
16:27:02 <evrardjp> Basically dump the file somewhere :)
16:27:04 <evrardjp> if you can
16:27:07 <evrardjp> same for hostname
16:27:16 <evrardjp> in the meantime, I'll have to continue the triage
16:27:24 <evrardjp> we have many bugs today
16:27:37 <evrardjp> next
16:27:40 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667130
16:27:40 <openstack> Launchpad bug 1667130 in openstack-ansible "During N->O upgrade, nova-api-os-compute service won't start" [Undecided,New]
16:27:55 <evrardjp> I guess it's all the same issue
16:28:18 <evrardjp> if you can't properly list and match, that's gonna be a problem, right?
16:29:41 <jamesdenton> evrardjp i will get you the info after this
16:30:03 <jamesdenton> that bug there is related to systemd
16:30:34 <evrardjp> andymccr: could you have a look at this one?
16:30:59 <evrardjp> so nova_compute_wait should run after a flush handlers
16:31:02 <andymccr> hmm
16:31:17 <evrardjp> so I guess the flush handlers should have been doing a systemctl reload
16:31:28 <andymccr> i'll look
16:31:33 <evrardjp> thanks
16:31:37 <andymccr> we dont seem to do a systemctl reload so maybe we need that instead?
16:31:48 <evrardjp> fun with placement api moves ! :)
16:31:56 <evrardjp> oh I thought we had that in handlers
16:32:05 <evrardjp> yes we do
16:32:07 <andymccr> most likely cells
16:32:20 <andymccr> ahh daemon reload
16:32:24 <evrardjp> yes
16:32:27 <andymccr> yeah
16:32:28 <andymccr> ok
16:32:36 <jamesdenton> yeah andymccr, daemon reload!!!
16:32:37 <evrardjp> maybe the module notified doesn't have it
16:32:45 <evrardjp> in that case, we need to daemon reload
16:33:17 <evrardjp> jamesdenton: if you have the full playbook run it could help, this way we see which one changed, and therefore which notified handler needs a daemon reload
16:33:33 <evrardjp> or we can just daemon_reload everywhere :D
16:33:57 <andymccr> yeah im wondering when it failed to restart
16:33:58 <andymccr> hmm
16:34:14 <andymccr> ok well we can move on
16:34:14 <jamesdenton> evrardjp i may have to get that to you later as well - i can try to run thru this upgrade in another environment. Sorry I don't have that info anymore. Let me know how i can make the bug reports more informational moving forward
16:34:29 <evrardjp> probably we can change command: /bin/true to systemd module, and force it changed_when :D but that's not really better IMO
16:34:43 <evrardjp> jamesdenton: thanks!
16:35:04 <evrardjp> sometimes it's also helpful to have vars, but here I don't think it's gonna be needed :D
16:35:06 <evrardjp> FYI :p
16:35:08 <evrardjp> ok next
16:35:22 <andymccr> i just wonder why nova crashed if we do a daemon reload on restarts...
16:35:24 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667127
16:35:24 <openstack> Launchpad bug 1667127 in openstack-ansible "Error running nova-install.yml during N->O upgrade due to table permissions" [Undecided,New]
16:35:24 <andymccr> but anyway
16:35:33 <jamesdenton> hey, me again
16:35:39 <evrardjp> andymccr: yes, let's gather more info.
16:35:44 <evrardjp> haha :D
16:35:54 <andymccr> i hate cells :P
16:35:55 <evrardjp> jamesdenton: you are the bug reporter of the week, congratulations!
16:36:16 <jamesdenton> early adopter award
16:36:31 <evrardjp> well you got the nova fun :D
16:36:35 <andymccr> hmm
16:36:41 <evrardjp> just a little after andymccr :D
16:36:47 <andymccr> the cell0 db should need access from nova_api user not nova user afaik
16:37:21 <evrardjp> I can't confirm or deny :/
16:37:22 <jamesdenton> OperationalError: (pymysql.err.OperationalError) (1044, u"Access denied for user 'nova'@'%' to database 'nova_cell0'")
16:37:31 <evrardjp> is there someone else that can help here?
16:38:06 <evrardjp> we don't want to overburden andymccr with nova, right ?
16:38:08 <andymccr> hmm
16:38:16 <andymccr> how is our gate passing is my question :/
16:38:39 <evrardjp> jamesdenton: is that a greenfield?
16:38:44 <jamesdenton> N->O upgrade
16:38:49 <evrardjp> ooh
16:38:57 <evrardjp> oh yes, I didn't see that in the title.
16:39:12 <evrardjp> you live on the edge!
16:39:31 <jamesdenton> yeah, i enjoy the pain
16:40:07 <evrardjp> I'd be happy to mark that as high importance, but I can't confirm it
16:40:17 <andymccr> its high for upgrades at least
16:40:50 <evrardjp> andymccr: let's mark it as confirmed and high? Could we have some eyes in the community there?
16:41:02 <evrardjp> that was meant to be 2 sentences
16:42:32 <evrardjp> ok I'll leave it as is, hoping it gets more attention next week
16:42:39 <evrardjp> next one is
16:42:51 <evrardjp> (sorry for that)
16:42:57 <evrardjp> so next one is
16:42:59 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667103
16:42:59 <openstack> Launchpad bug 1667103 in openstack-ansible "Error upgrading MariaDB during N->O upgrade" [Undecided,New]
16:43:11 <evrardjp> can we voluntold mancdaz for this one?
16:43:22 <andymccr> yeah just mark it critical and assign it to mancdaz
16:43:23 <spotz> +1:)
16:43:35 <mancdaz> evrardjp andymccr I don't think you understand how this works
16:43:48 <evrardjp> yes we do, yes we do!
16:44:08 <evrardjp> (same voice as in pulp fiction "yes you did")
16:44:15 <andymccr> that bug sounds sensible
16:44:25 <evrardjp> yes :/
16:44:37 <evrardjp> I still think it's worth confirming and high
16:44:39 <evrardjp> that's newton
16:44:42 <evrardjp> and breaking stuff
16:44:44 <jamesdenton> N->O
16:44:48 <mancdaz> it looks familiar
16:44:48 <evrardjp> from N
16:45:08 <evrardjp> jamesdenton: yes but upgrade of galera should be well known
16:45:10 <evrardjp> :D
16:45:29 <evrardjp> I guess systemd again :D
16:45:45 <evrardjp> andymccr: confirmed high ?
16:45:46 <andymccr> seems like we're doing a wsrep-new-cluster when we shouldnt
16:45:48 <andymccr> yeah
16:46:13 <evrardjp> galera needs love for its systemd units.
16:46:35 <evrardjp> I'm not assigning mancdaz, but we would be pleased :D
16:46:47 <evrardjp> next
16:46:59 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1667060
16:47:00 <openstack> Launchpad bug 1667060 in openstack-ansible "Ceilometer should batch by number of notification agents" [Undecided,New]
16:47:38 <evrardjp> stevelle: do you plan to have a commit after that, or do we keep it as wishlist for now?
16:47:46 <evrardjp> I'm not sure about the risk impact analysis on this one
16:48:35 <stevelle> evrardjp: based on upstream changes, medium is fine.
16:49:10 <stevelle> if someone wants to run ceilometer at anything beyond 1 node it's high
16:49:36 <evrardjp> so you mean we are breaking things?
16:49:57 <stevelle> you *can* get incorrect metrics if it's set wrong
16:50:04 <evrardjp> ok
16:50:07 <stevelle> not /will/
16:50:34 <evrardjp> ok, it breaks the user experience if wrongly deployed, so I understand your classification
16:50:41 <evrardjp> I'd be inclined to say medium
16:50:48 <stevelle> agree
16:50:57 <evrardjp> Let's put that into confirmed medium
16:51:17 <evrardjp> anyone can pick bugs, right :D
16:51:25 <andymccr> true true
16:51:44 <evrardjp> (just a friendly reminder :D)
16:51:59 <evrardjp> ok next
16:52:00 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1666765
16:52:00 <openstack> Launchpad bug 1666765 in openstack-ansible "RPC settings not set for neutron agents" [Undecided,New] - Assigned to Bjoern Teipel (bjoern-teipel)
16:52:09 <evrardjp> there are fixes in it
16:52:15 <openstackgerrit> Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-lxc_hosts stable/newton: Add Trusty backports repo if it is not enabled  https://review.openstack.org/439053
16:52:18 <evrardjp> still the status is new
16:52:36 <evrardjp> I'll retag
16:53:35 <evrardjp> next one
16:53:38 <evrardjp> https://bugs.launchpad.net/openstack-ansible/+bug/1666625
16:53:38 <openstack> Launchpad bug 1666625 in openstack-ansible "Ceilometer still sending samples to messaging queue" [Undecided,New]
16:54:37 <evrardjp> stevelle: could you give your opinion?
16:54:40 <evrardjp> it sounds legit
16:54:44 <evrardjp> alextricity25: are you working on it?
16:55:00 <alextricity25> not at the moment, no.
16:55:09 <alextricity25> I'm working on deprecating the ceilometer api ATM
16:55:17 <evrardjp> ok
16:55:36 <evrardjp> well I can't say for prios, but it sounds like a good thing to fix :p
16:55:45 <alextricity25> actually....This one was resolved by removing the ceilometer-collector
16:55:58 <stevelle> yeah, that was resolved at PTG
16:56:13 <evrardjp> so we fixed that during the PTG?
16:56:20 <evrardjp> Do you have a commit I can refer to?
16:56:23 <stevelle> looking
16:56:34 <evrardjp> thanks
16:57:17 <alextricity25> evrardjp: https://review.openstack.org/#/c/436222/
16:57:24 <stevelle> I396b154d106c0afba44d57792ae6dad39b33a6f5
16:57:50 <alextricity25> evrardjp: So really this bug was fixed by another bug fix
16:57:53 <evrardjp> you are talking about the same
16:57:54 <evrardjp> ok
16:57:58 <evrardjp> good
16:58:05 <evrardjp> I'll mark this as fix released.
16:58:12 <alextricity25> I just forgot to reference that. Sorry
16:58:17 <evrardjp> no that's fine :)
16:58:21 <stevelle> fix committed*
16:58:22 <andymccr> boom fixing bugs
16:58:40 <evrardjp> in which world are we living!
16:59:04 <evrardjp> Anyway, we don't have time for the next bugs, so let's call it a day, and postpone to next week
16:59:12 <andymccr> si si
16:59:22 <evrardjp> The imperatress?
16:59:29 <evrardjp> If that translate
16:59:38 <evrardjp> +s
16:59:39 <andymccr> jamesdenton: i'll be spinning up some aio stable/newton to test those upgrades
16:59:47 <andymccr> hopefully get some done tomorrow/thursday
16:59:47 <evrardjp> andymccr: thanks!
16:59:56 <jamesdenton> thanks andymccr. I don't know how soon i can get it done
17:00:08 <evrardjp> Collaboration \o/
17:00:19 <evrardjp> Let's close the meeting for today
17:00:23 <evrardjp> Thanks everyone
17:00:28 <stevelle> \o
17:00:32 <evrardjp> #endmeeting