16:01:03 #startmeeting openstack_ansible_meeting
16:01:05 Meeting started Tue Feb 28 16:01:03 2017 UTC and is due to finish in 60 minutes. The chair is evrardjp. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:08 Hooray
16:01:08 The meeting name has been set to 'openstack_ansible_meeting'
16:01:19 o/
16:01:25 Large list today
16:01:38 mgariepy: perhaps that works as a stopgap for now.
16:01:45 So, sorry for not having sent an email last week to say the meeting was cancelled
16:01:49 during the PTG
16:01:54 ^ i am also sorry for that.
16:02:01 anyway, let's move on.
16:02:05 ahha yes quickly move on :P
16:02:12 lol
16:02:15 :D
16:02:21 first one of the day!
16:02:23 https://bugs.launchpad.net/openstack-ansible/+bug/1667960
16:02:23 Launchpad bug 1667960 in openstack-ansible "Rebuilding keystone[0] container breaks credential keys" [Undecided,New]
16:03:04 I saw a commit on a similar topic, done by logan-
16:03:23 I think other ppl voted on it, and I think we can mark this as confirmed
16:03:23 ok
16:03:29 yeah that sounds legit
16:03:54 I still think there are other ways to solve this, and we are more and more lacking a proper facility to do file distribution with ansible
16:04:11 but let's forget this for now, and focus on the bug
16:04:21 IMO the criticality is high (at minimum)
16:04:30 if we break credential keys, it's serious
16:04:32 agreed on criticality and confirmedness
16:04:32 yeah
16:04:32 agreed
16:04:32 at least high
16:04:42 because by the time you find out it is broken, the keys are unrecoverable
16:04:47 yeah
16:04:51 critical then?
16:04:54 yeah let's say that
16:04:57 ok
16:04:57 yep
16:04:59 that is pretty nasty
16:05:00 who works on it
16:05:16 logan-: you have time to work on this one too?
16:05:30 i have some ideas on how we might be able to solve it.
will work on a patch this week for it
16:05:35 I think whatever the method, the important thing is getting it merged right now (harry principle)
16:05:55 Merged openstack/openstack-ansible stable/newton: Bump BIRD and etcd role pins https://review.openstack.org/438688
16:05:59 logan-: great!
16:06:01 Merged openstack/openstack-ansible master: Use an explicit version of urrlib3 https://review.openstack.org/438977
16:06:09 let's move on to the next one.
16:06:21 https://bugs.launchpad.net/openstack-ansible/+bug/1667814
16:06:21 Launchpad bug 1667814 in openstack-ansible "inventory-manage.py --file option can cause the output to overwrite the default inventory file" [Undecided,New] - Assigned to Joel Griffiths (5s-ubuntu)
16:06:23 well I'd rather it was a good fix than an anything-goes fix :P but yeah it would be good to get it fixed asap
16:07:03 I was not even aware of this feature
16:07:26 palendae: can we assume it's being worked on by Joel Griffiths?
16:07:32 hmm
16:07:51 if True; status -> In Progress?
16:08:20 ahh this one is diff
16:08:48 that's a bad bug
16:08:51 yeah
16:09:11 can someone confirm it?
16:09:31 haven't tested it
16:09:52 evrardjp: let's move on - I'm sure palendae will be back in a bit and can help us look at that one!
16:09:59 Importance would be medium because it breaks stuff (high) but with a low probability of happening.
16:10:00 evrardjp: I'll double check it this afternoon, but last I talked to Joel it sounded plausible
16:10:05 ahh see!
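[Editor's note: for context on the keystone credential-key bug (1667960) discussed above, here is a hedged sketch of the general approach talked about: fetch the existing credential keys from a surviving keystone container and push them to the rebuilt one, instead of letting the role regenerate them. This is illustrative only, not the patch logan- worked on; the `keystone_all` group name and key paths follow OSA conventions, but the task layout is an assumption.]

```yaml
# Illustrative sketch only: copy credential keys from a surviving keystone
# container to the rebuilt keystone[0], rather than regenerating them.
# Assumes at least two members in the keystone_all group.
- name: Fetch existing credential keys from a surviving keystone node
  fetch:
    src: "/etc/keystone/credential-keys/{{ item }}"
    dest: "/tmp/keystone-keys/{{ item }}"
    flat: yes
  with_items:
    - "0"
    - "1"
  delegate_to: "{{ groups['keystone_all'][1] }}"

- name: Distribute the fetched keys to the rebuilt container
  copy:
    src: "/tmp/keystone-keys/{{ item }}"
    dest: "/etc/keystone/credential-keys/{{ item }}"
    owner: keystone
    group: keystone
    mode: "0600"
  with_items:
    - "0"
    - "1"
```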
easy :P
16:10:10 :D
16:10:36 palendae: you can move it to In Progress if Joel is working on it :)
16:10:44 in the meantime confirmed medium
16:10:49 evrardjp: Alrighty
16:11:17 next
16:11:19 https://bugs.launchpad.net/openstack-ansible/+bug/1667796
16:11:19 Launchpad bug 1667796 in openstack-ansible "lookup url function proxy issue with venv checksum" [Undecided,New]
16:11:45 let's wait for news
16:11:58 technically ansible should respect env vars for that
16:12:03 IIRC
16:12:16 evrardjp: agreed - more info would be good
16:12:29 I'll mark it as incomplete
16:12:43 next
16:12:45 https://bugs.launchpad.net/openstack-ansible/+bug/1667747
16:12:45 Launchpad bug 1667747 in openstack-ansible "Swap creation fails on CentOS AIO" [Undecided,New]
16:13:18 I'm fine with the fix.
16:13:22 :D
16:13:22 agreed
16:13:34 confirmed low, low hanging fruit
16:13:47 any beginner in openstack-ansible wanting to patch this?
16:14:14 ok let's move on
16:14:18 next:
16:14:20 https://bugs.launchpad.net/openstack-ansible/+bug/1667337
16:14:20 Launchpad bug 1667337 in openstack-ansible "Error running os-nova-install with nova-config tag in 15.0.0" [Undecided,New]
16:15:10 hmm
16:15:15 probably due to filtering of the tag
16:15:34 worth having a look
16:15:53 I think cloudnull had a patch for tags in nova IIRC
16:15:58 jamesdenton: does that only fail with the "--tags", or would it fail normally too?
16:16:33 hi
16:16:44 A wild operator appears
16:17:05 the irc ping thing is beautiful :D
16:17:07 so i had no issue with os-nova-install as part of the deploy, only with --tags nova-config
16:17:11 andymccr: truly is
16:17:14 jamesdenton: ok cool thanks!
16:17:41 yeah, the pip install part is not part of config, it's only part of install
16:17:49 i'll hang around for the other bugs, too :P
16:17:55 jamesdenton: repeat offender!
16:18:01 Stop breaking it
16:18:03 ;)
16:18:19 :P
16:18:31 the question is why ansible just doesn't "only run this tag"
16:18:49 jamesdenton: meanwhile you can maybe use --skip-tags=nova-install
16:19:17 or just run the whole thing
16:19:30 let's say confirmed and low for the triage, ok for everyone?
16:19:31 i could - and will - if i need to do it again
16:19:37 thanks for the suggestion
16:19:49 I think we could refine this
16:19:56 yeah that tags thing is weird
16:20:00 but yeah the triage seems ok
16:20:18 ok
16:20:19 next
16:20:31 https://bugs.launchpad.net/openstack-ansible/+bug/1667193
16:20:31 Launchpad bug 1667193 in openstack-ansible "During N->O upgrade, conditional check bombs when inventory hostname not in nova service name" [Undecided,New]
16:20:32 still jamesdenton
16:20:34 :D
16:20:40 ahh yes
16:21:06 In this case, the hostname of the compute node != the name defined in inventory
16:21:41 oh i see
16:21:45 what's the difference?
16:21:48 basically
16:21:49 it is using ansible_hostname there, not inventory_hostname
16:21:49 _ vs - ?
16:21:55 do you have stale facts maybe?
16:22:13 no, in this case the inventory hostname was very basic (ie. compute01) vs i812847.NewtonTest.com
16:22:20 ansible hostname should be fine, it should be the hostname of the compute node
16:22:26 and the latter is how the services register themselves to Nova
16:22:53 *shrug*
16:23:28 unless you are overriding your nova hostname in nova.conf, ansible_hostname should match the default hostname that nova uses to register its service
16:23:45 ^ agreed
16:23:46 yeah, definitely not overriding.
16:24:17 jamesdenton: check the 127.0.1.1 line in your /etc/hosts
16:24:22 could you report it in the bug?
16:24:25 yes one sec
16:24:36 and hostname file
16:25:09 I don't see a bug there in the code, but there could be a bug in our hosts file management
16:25:23 (I still think we should use dns)
16:25:44 rabbit!
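[Editor's note: the `--tags nova-config` failure in bug 1667337 above matches how Ansible tag filtering works: `--tags` runs only the tasks carrying the requested tag, so a pip-install task tagged only `nova-install` is silently skipped, and config tasks then fail against a venv that was never built. A minimal sketch of that behaviour; the task contents and tag names mirror the discussion but are illustrative, not the actual os_nova role.]

```yaml
# Illustrative of the tag behaviour discussed: with
#   openstack-ansible os-nova-install.yml --tags nova-config
# only the second task runs; the venv build is silently skipped.
- name: Install nova into its venv      # selected by --tags nova-install
  pip:
    name: nova
    virtualenv: /openstack/venvs/nova
  tags:
    - nova-install

- name: Template out nova.conf          # selected by --tags nova-config
  template:
    src: nova.conf.j2
    dest: /etc/nova/nova.conf
  tags:
    - nova-config
```

The suggested `--skip-tags nova-install` workaround avoids this on an already-deployed host because the venv from the earlier full run still exists.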
16:26:31 i will update the bug in a bit - the 127.0.0.1 line is just localhost
16:26:40 and 1.1?
16:27:02 Basically dump the file somewhere :)
16:27:04 if you can
16:27:07 same for the hostname file
16:27:16 in the meantime, I'll have to continue the triage
16:27:24 we have many bugs today
16:27:37 next
16:27:40 https://bugs.launchpad.net/openstack-ansible/+bug/1667130
16:27:40 Launchpad bug 1667130 in openstack-ansible "During N->O upgrade, nova-api-os-compute service won't start" [Undecided,New]
16:27:55 I guess it's all the same issue
16:28:18 if you can't properly list and match, that's gonna be a problem, right?
16:29:41 evrardjp i will get you the info after this
16:30:03 that bug there is related to systemd
16:30:34 andymccr: could you have a look at this one?
16:30:59 so nova_compute_wait should run after a flush handlers
16:31:02 hmm
16:31:17 so I guess the flush handlers should have been doing a systemctl reload
16:31:28 i'll look
16:31:33 thanks
16:31:37 we don't seem to do a systemctl reload so maybe we need that instead?
16:31:48 fun with placement api moves! :)
16:31:56 oh I thought we had that in handlers
16:32:05 yes we do
16:32:07 most likely cells
16:32:20 ahh daemon reload
16:32:24 yes
16:32:27 yeah
16:32:28 ok
16:32:36 yeah andymccr, daemon reload!!!
16:32:37 maybe the module notified doesn't have it
16:32:45 in that case, we need to daemon reload
16:33:17 jamesdenton: if you have the full playbook run it could help, this way we see which one changed, and therefore which notified handler needs a daemon reload
16:33:33 or we can just daemon_reload everywhere :D
16:33:57 yeah I'm wondering when it failed to restart
16:33:58 hmm
16:34:14 ok well we can move on
16:34:14 evrardjp i may have to get that to you later as well - i can try to run through this upgrade in another environment. Sorry I don't have that info anymore.
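[Editor's note: on the daemon-reload discussion for bug 1667130 above — when an upgrade rewrites or moves a unit file, systemd keeps serving the cached copy until a daemon-reload, so a plain restart handler can fail to start the new service. A sketch of the kind of handler change being discussed, using the `systemd` module's `daemon_reload` option (available since Ansible 2.2); the handler name and unit name here are illustrative.]

```yaml
# Sketch of the handler change discussed: reload systemd's unit cache
# before restarting, so a rewritten nova-api-os-compute unit file is
# actually picked up instead of the stale cached one.
- name: Restart nova services
  systemd:
    name: nova-api-os-compute
    state: restarted
    daemon_reload: yes
    enabled: yes
```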
Let me know how i can make the bug reports more informative moving forward
16:34:29 probably we can change command: /bin/true to the systemd module, and force it changed_when :D but that's not really better IMO
16:34:43 jamesdenton: thanks!
16:35:04 sometimes it's also helpful to have vars, but here I don't think it's gonna be needed :D
16:35:06 FYI :p
16:35:08 ok next
16:35:22 i just wonder why nova crashed if we do a daemon reload on restarts...
16:35:24 https://bugs.launchpad.net/openstack-ansible/+bug/1667127
16:35:24 Launchpad bug 1667127 in openstack-ansible "Error running nova-install.yml during N->O upgrade due to table permissions" [Undecided,New]
16:35:24 but anyway
16:35:33 hey, me again
16:35:39 andymccr: yes, let's gather more info.
16:35:44 haha :D
16:35:54 i hate cells :P
16:35:55 jamesdenton: you are the bug reporter of the week, congratulations!
16:36:16 early adopter award
16:36:31 well you got the nova fun :D
16:36:35 hmm
16:36:41 just a little after andymccr :D
16:36:47 the cell0 db should need access from the nova_api user, not the nova user, afaik
16:37:21 I can't confirm or deny :/
16:37:22 OperationalError: (pymysql.err.OperationalError) (1044, u"Access denied for user 'nova'@'%' to database 'nova_cell0'")
16:37:31 is there someone else that can help here?
16:38:06 we don't want to overburden andymccr with nova, right?
16:38:08 hmm
16:38:16 how is our gate passing, is my question :/
16:38:39 jamesdenton: is that a greenfield?
16:38:44 N->O upgrade
16:38:49 ooh
16:38:57 oh yes, I didn't see that in the title.
16:39:12 you live on the edge!
16:39:31 yeah, i enjoy the pain
16:40:07 I'd be happy to mark that as high importance, but I can't confirm it
16:40:17 it's high for upgrades at least
16:40:50 andymccr: let's mark it as confirmed and high? Could we have some eyes from the community there?
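[Editor's note: the 1044 error quoted above for bug 1667127 means the user in the cell0 connection string has no grant on the `nova_cell0` schema. Whether the right fix is granting the `nova` user access or pointing cell0 at the `nova_api` user was still open in the discussion; this is a hedged sketch of the first option using Ansible's `mysql_user` module. The variable name follows OSA conventions but the grant itself is an assumption, not the merged fix.]

```yaml
# Hedged sketch: extend the nova DB user's grants to cover nova_cell0
# alongside the existing nova schema. Only illustrates the shape of the
# fix; the discussion had not settled on which user should own cell0.
- name: Grant nova DB user access to nova_cell0
  mysql_user:
    name: nova
    host: "%"
    password: "{{ nova_container_mysql_password }}"
    priv: "nova.*:ALL/nova_cell0.*:ALL"
    state: present
```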
16:41:02 that was meant to be 2 sentences
16:42:32 ok I'll leave it as is, hoping it gets more attention next week
16:42:39 next one is
16:42:51 (sorry for that)
16:42:57 so next one is
16:42:59 https://bugs.launchpad.net/openstack-ansible/+bug/1667103
16:42:59 Launchpad bug 1667103 in openstack-ansible "Error upgrading MariaDB during N->O upgrade" [Undecided,New]
16:43:11 can we voluntold mancdaz for this one?
16:43:22 yeah just mark it critical and assign it to mancdaz
16:43:23 +1 :)
16:43:35 evrardjp andymccr I don't think you understand how this works
16:43:48 yes we do, yes we do!
16:44:08 (same voice as in Pulp Fiction "yes you did")
16:44:15 that bug sounds sensible
16:44:25 yes :/
16:44:37 I still think it's worth confirming and high
16:44:39 that's newton
16:44:42 and breaking stuff
16:44:44 N->O
16:44:48 it looks familiar
16:44:48 from N
16:45:08 jamesdenton: yes but the upgrade of galera should be well known
16:45:10 :D
16:45:29 I guess systemd again :D
16:45:45 andymccr: confirmed high?
16:45:46 seems like we're doing a wsrep-new-cluster when we shouldn't
16:45:48 yeah
16:46:13 galera needs love for its systemd units.
16:46:35 I'm not assigning mancdaz, but we would be pleased :D
16:46:47 next
16:46:59 https://bugs.launchpad.net/openstack-ansible/+bug/1667060
16:47:00 Launchpad bug 1667060 in openstack-ansible "Ceilometer should batch by number of notification agents" [Undecided,New]
16:47:38 stevelle: do you plan to have a commit after that, or do we keep it as wishlist for now?
16:47:46 I'm not sure about the risk impact analysis on this one
16:48:35 evrardjp: based on upstream changes, medium is fine.
16:49:10 if someone wants to run ceilometer at anything beyond 1 node it's high
16:49:36 so you mean we are breaking things?
16:49:57 you *can* get incorrect metrics if it's set wrong
16:50:04 ok
16:50:07 not /will/
16:50:34 ok, it breaks the user experience if wrongly deployed, so I understand your classification
16:50:41 I'd be inclined to say medium
16:50:48 agree
16:50:57 Let's put that into confirmed medium
16:51:17 anyone can pick bugs, right :D
16:51:25 true true
16:51:44 (just a friendly reminder :D)
16:51:59 ok next
16:52:00 https://bugs.launchpad.net/openstack-ansible/+bug/1666765
16:52:00 Launchpad bug 1666765 in openstack-ansible "RPC settings not set for neutron agents" [Undecided,New] - Assigned to Bjoern Teipel (bjoern-teipel)
16:52:09 there are fixes in it
16:52:15 Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-lxc_hosts stable/newton: Add Trusty backports repo if it is not enabled https://review.openstack.org/439053
16:52:18 still the status is new
16:52:36 I'll retag
16:53:35 next one
16:53:38 https://bugs.launchpad.net/openstack-ansible/+bug/1666625
16:53:38 Launchpad bug 1666625 in openstack-ansible "Ceilometer still sending samples to messaging queue" [Undecided,New]
16:54:37 stevelle: could you give your opinion?
16:54:40 it sounds legit
16:54:44 alextricity25: are you working on it?
16:55:00 not at the moment, no.
16:55:09 I'm working on deprecating the ceilometer api ATM
16:55:17 ok
16:55:36 well I can't say for prios, but it sounds like a good thing to fix :p
16:55:45 actually... this one was resolved by removing the ceilometer-collector
16:55:58 yeah, that was resolved at the PTG
16:56:13 so we fixed that during the PTG?
16:56:20 Do you have a commit I can refer to?
16:56:23 looking
16:56:34 thanks
16:57:17 evrardjp: https://review.openstack.org/#/c/436222/
16:57:24 I396b154d106c0afba44d57792ae6dad39b33a6f5
16:57:50 evrardjp: So really this bug was fixed by another bug fix
16:57:53 you are talking about the same one
16:57:54 ok
16:57:58 good
16:58:05 I'll mark this as fix released.
16:58:12 I just forgot to reference that.
Sorry
16:58:17 no that's fine :)
16:58:21 fix committed*
16:58:22 boom, fixing bugs
16:58:40 in which world are we living!
16:59:04 Anyway, we don't have time for the next bugs, so let's call it a day and postpone to next week
16:59:12 si si
16:59:22 The imperatress?
16:59:29 If that translate
16:59:38 +s
16:59:39 jamesdenton: i'll be spinning up some AIO stable/newton to test those upgrades
16:59:47 hopefully get some done tomorrow/thursday
16:59:47 andymccr: thanks!
16:59:56 thanks andymccr. I don't know how soon i can get it done
17:00:08 Collaboration \o/
17:00:19 Let's close the meeting for today
17:00:23 Thanks everyone
17:00:28 \o
17:00:32 #endmeeting
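[Editor's note: closing out the ceilometer batching discussion from 16:47 (bug 1667060) — the concern was that batching and workload-partitioning settings need to account for the number of notification agents, or metrics can come out wrong on multi-node deployments. A hedged sketch of an OSA-style config override; the option names come from ceilometer's `[notification]` section, but the values and this particular override are illustrative, not the fix that eventually merged.]

```yaml
# Hedged sketch: an OSA-style ceilometer.conf override enabling
# coordinated batching across multiple notification agents.
# Values are illustrative only.
ceilometer_ceilometer_conf_overrides:
  notification:
    workload_partitioning: true
    batch_size: 100
    batch_timeout: 5
```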