16:00:35 #startmeeting openstack_ansible_meeting
16:00:36 my plate is always making me company :P
16:00:37 we can do it earlier.
16:00:38 Meeting started Tue Jul 31 16:00:35 2018 UTC and is due to finish in 60 minutes. The chair is mnaser. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:41 #topic roll call
16:00:42 The meeting name has been set to 'openstack_ansible_meeting'
16:00:44 o/
16:00:48 o/
16:00:51 o/
16:00:53 \m/
16:01:15 o/
16:01:19 o/
16:01:39 how's everyone doing on this lovely day
16:01:48 o/
16:02:20 (everyone is doing great apparently)
16:02:22 cool so lets dive in
16:02:32 #topic last week highlights
16:02:46 odyssey4me: Reminded we need more work done on Bionic.
16:02:58 and jrosser has picked up the challenge!
16:03:03 jrosser: has been putting in work, hopefully he's been getting the reviews he needs (i am getting back to OSA 'officially' today)
16:03:18 yes i have been - thanks everyone
16:03:31 great
16:03:33 i can cover some detail there?
16:03:43 o/
16:03:54 sure if there's something we need to be up to date on :)
16:04:06 top line is most stuff looks ok
16:04:14 rabbitmq version needed a bump
16:04:20 Merged openstack/openstack-ansible-os_heat master: Use generic vars file for ubuntu https://review.openstack.org/586696
16:04:31 bionic uses lxc 3.x which has some config keys changed
16:04:43 that'll be a tricky one
16:04:44 o/
16:04:56 vars/ubuntu-16.04.yml -> ubuntu.yml fixes most things
16:05:04 and glance v1 api removed
16:05:12 thats been it really
16:05:29 yeah i think it's good, it would maybe be good to have per-distro and per-version files
16:05:36 so we can override if any specific distros have weird things per version
16:05:57 but yeah, thanks for your awesome work
16:06:03 * mnaser will be putting in some centos 7 works again soon hopefully
16:06:05 hwoarang: Leap 15 is blocked due to lxc, sudo and mariadb bugs. Will work on that after holidays (and once aio_distro_basekit is in).
16:06:10 for the key changes we can use their converter tool
16:06:11 Merged openstack/openstack-ansible-os_ceilometer master: Use generic vars file for ubuntu https://review.openstack.org/586689
16:06:13 mnaser: well, I've asked to converge to one file where possible - so that new versions are more of a no-op if possible
16:06:17 less code churn
16:06:23 no worries :) - again thanks for the reviews everyone
16:06:26 odyssey4me: ++ absolutely agree
16:06:42 jrosser: maybe we should make sure the converter tool is used for upgrades?
16:06:52 Merged openstack/openstack-ansible-os_tacker master: Use generic vars file for ubuntu https://review.openstack.org/586708
16:06:52 yeah thats a good idea
16:07:02 the converter tool looks pretty useful
16:07:08 converter tool?
16:07:34 lxc 3.0 comes with a tool to convert old conf --> new
16:07:54 we're going to have to be careful about still supporting older lxc
16:08:04 so far there has only been one key thats been a problem
16:08:09 for things like centos
16:08:14 oh, interesting - although I suspect our upgrades for in-place would be more complex - not just converting existing config
16:08:24 in the past it's been to refresh the whole host
16:08:45 i ran into 2 i think, there is one for unconfined. but it may be a good idea - e.g. deploy as is now and then run the tool against the conf (or something similar)
16:08:53 odyssey4me: well that's why I bring the idea here: before starting full heads on renewing things, maybe we should just incorporate the tool
16:08:58 lxc.aa_profile -> lxc.apparmor.profile
16:09:12 andymccr: ++
16:09:15 however there is a massive list of apparently deprecated keys and a bunch of them are used
16:09:21 to there is follow up work to check those
16:09:25 Merged openstack/openstack-ansible-os_trove master: Use generic vars file for ubuntu https://review.openstack.org/586711
16:09:29 to/so
16:09:57 until a point where we only support lxc 3.0 + and then drop the tool? i dunno thats just an idea really
16:10:49 If the tool doesn't break the file with new keys, we can just use them both
16:11:01 I mean always use the tool
16:11:03 I don't know
16:11:10 im assuming lxc3 vars break lxc2
16:11:18 that's not what I meant
16:11:28 can we actually carry *both* lxc2 and lxc3 vars?
16:11:29 Merged openstack/openstack-ansible-os_sahara master: Use generic vars file for ubuntu https://review.openstack.org/586707
16:11:46 I meant, for greenfields: run the new templating and always use the tool, for upgrades: do not template, just use the tool
16:11:54 well
16:12:06 long story short, making conditionals on the version and running the tool?
16:12:20 I misexplained myself again
16:13:00 Merged openstack/openstack-ansible-os_gnocchi master: Use generic vars file for ubuntu https://review.openstack.org/586693
16:13:16 i dont want us to dive toooo much into this discussion, perhaps we can discuss this outside meeting?
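Editor's note on the lxc key rename discussed above: a minimal sketch of what renaming a legacy key in a container config could look like. It is not the converter tool that ships with lxc 3.0 and does not try to imitate it. The only rename listed is the one quoted in the meeting (lxc.aa_profile -> lxc.apparmor.profile); the function name and the mapping table are hypothetical and exist only for illustration.

```python
import re

# Only the rename quoted in the meeting is listed. Any further entries
# would have to be taken from the LXC documentation, not guessed.
LEGACY_KEY_RENAMES = {
    "lxc.aa_profile": "lxc.apparmor.profile",
}

# Matches "key = value" lines; comment lines start with "#" and never match.
_KEY_LINE = re.compile(r"^(\s*)([A-Za-z0-9_.]+)(\s*=.*)$")


def rename_legacy_keys(text, renames=LEGACY_KEY_RENAMES):
    """Return the config text with known legacy keys renamed.

    Lines whose key is not in the mapping are passed through unchanged,
    so running the function twice gives the same result as running it once.
    """
    converted = []
    for line in text.splitlines():
        match = _KEY_LINE.match(line)
        if match and match.group(2) in renames:
            indent, key, rest = match.groups()
            line = indent + renames[key] + rest
        converted.append(line)
    return "\n".join(converted)


if __name__ == "__main__":
    sample = "# container config\nlxc.aa_profile = unconfined\nlxc.start.auto = 1"
    print(rename_legacy_keys(sample))
```

Because already-converted keys are left alone, a step like this could in principle run on both greenfield and upgraded hosts, which is the property the "always use the tool" suggestion relies on.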
16:13:19 generate things for lxc2, run the tool to migrate to lxc3 always -- so upgrades will have no problems as things haven't changed -- but greenfield will be auto migrated
16:13:23 Merged openstack/openstack-ansible-os_glance master: Use generic vars file for ubuntu https://review.openstack.org/586692
16:13:27 ok
16:13:32 it is a very important subject but we are a bit limited on time, got a few bugs
16:13:36 hwoarang: Leap 15 is blocked due to lxc, sudo and mariadb bugs. Will work on that after holidays (and once aio_distro_basekit is in).
16:13:44 i dont think hwoarang is around so that probably hasnt progressed much
16:13:55 or maybe it has
16:14:00 Merged openstack/openstack-ansible-os_ironic master: Use generic vars file for ubuntu https://review.openstack.org/586698
16:14:02 anyone know a bit more details? :>
16:14:48 i gueeeess not :<, we'll follow up next week for this
16:14:54 evrardjp: M3 was last Thursday but we were waiting for a patch that never merged: https://review.openstack.org/#/c/585720/ . AP: Refactor testing to be lighter, refactor releasing, and release without the pin.
16:15:11 so looks like we're struggling to get m3 patch in
16:15:59 looks like cinder timeouts :\ at least the most recent run
16:16:06 can we have more people do rechecks and pay attention to that change if possible
16:16:07 mnaser: I have taken action points, I will release without the role pins. And I am working on the refactor testing already
16:16:15 ok cool
16:16:21 mnaser: I don't want any more rechecks
16:16:22 that seems reasonable.
16:16:24 that's useless now
16:16:31 do we want to abandon that change then?
16:16:32 more than 20 rechecks on infra timeouts
16:16:44 I am abandoning the chain of changes and will do things differently.
16:16:50 ok, sounds good
16:16:50 too bad for the reproducibility of m3
16:17:02 doubt anyone is deploying m3 release :X
16:17:12 evrardjp: Congratulations on our next PTL: mnaser!
16:17:20 ^^ well thats technically not official yet :p
16:17:28 but just an fyi as evrardjp put it up
16:17:33 evrardjp: Congratulations on our new core: jrosser!
16:17:46 congrats jrosser ! welcome and thanks for all the work you've done, look forward to bothering you for reviews :D
16:17:52 :)
16:17:59 #topic bug triage
16:18:04 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784660
16:18:04 Launchpad bug 1784660 in openstack-ansible "[WIP] Implement Neutron OVS-DPDK Support" [Undecided,New]
16:18:04 Congrats guys!
16:18:31 grats!
16:18:40 well thats a nice to have, do we wanna put this as like confirmed/wishlist?
16:19:00 or maybe in progress and assign to james as it looks like they might be wanting to pick up the work
16:19:53 oh jamesdenton already has patches up for it
16:20:05 i assume that this is just a 'tracking' bug
16:20:16 I expect so - wishlist/in progress
16:20:16 Is the open discussion after the bug triage? I was thinking it's the other way around..
16:20:23 and assign to jamesdenton
16:20:26 Tahvok: it is after :>
16:20:34 mnaser: ok, thanks!
16:20:43 so its in progress/wishlist/james
16:20:50 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784537
16:20:50 Launchpad bug 1784537 in openstack-ansible "Resource Usage tab doesn't appear in Horizon after installing Ceilometer" [Critical,New]
16:21:36 did ceilometer get its own dashboard
16:21:56 well, it used to a long time ago - it was terrible
16:22:04 yeah but it was part of horizon afaik
16:22:13 no idea
16:22:22 i cant find a ceilometer dashboard project
16:22:37 but
16:22:42 i think it would make sense "Resource Usage" is not visible because
16:22:46 ceilometer has no api anymore
16:22:58 so horizon would poll the keystone service catalog and find nothing
16:23:34 I could only find some screenshots of it on google..
16:23:58 one sec i think it was removed
16:24:13 wayback, like icehouse-ish
16:24:33 unless 'resource usage' is that pie chart thing which shows how many computes used, etc
16:24:42 https://github.com/openstack/horizon/commit/20ea82b9efe78516286c4b35c5dc644b296b4313#diff-23a06074ccaed6243de4f9b416f50c5f
16:24:47 https://github.com/openstack/horizon/commit/20ea82b9efe78516286c4b35c5dc644b296b4313
16:24:50 it was removed
16:25:14 so invalid/undecided
16:25:30 adding a commit with that
16:25:35 err
16:25:36 comment with that commit
16:25:38 english is hard
16:25:41 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784369
16:25:42 Launchpad bug 1784369 in openstack-ansible "Pike to queen failed to cleanup nova container" [Undecided,New]
16:26:48 looks like it hit a transient error
16:27:13 maybe could do with retry/until on that task
16:27:34 yeah it looks like rerunning it locally worked fine
16:28:12 confirmed/low because it's upgrade related and only rarely might happen?
16:28:27 yeah
16:28:42 cool, confirmed/low
16:28:45 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784264
16:28:45 Launchpad bug 1784264 in openstack-ansible "Pike to queen upgrade failed at repo-use.yml" [Undecided,New]
16:28:48 with a suggestion to add retries to the task, and perhaps add the low-hanging-fruit tag
16:28:57 Merged openstack/openstack-ansible-tests master: Add missing services and collect journal logs https://review.openstack.org/587259
16:28:58 Merged openstack/openstack-ansible-os_neutron master: Use generic vars file for ubuntu https://review.openstack.org/586702
16:29:20 odyssey4me: done!
16:29:31 ok so for that
16:29:33 i think i know what happened
16:29:35 this one should be resolved at the head of stable/queens now - so it'd be good to know the SHA/tag they were using
16:29:36 we ran into this
16:29:42 the 404 one?
16:29:50 aka 1784264?
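Editor's note on the retry/until suggestion for bug 1784369 above: a minimal sketch of the idea, written in Python rather than as an Ansible task. The helper name and its parameters are hypothetical; it only illustrates "run the task again until a condition holds, up to a limit" and is not the fix that was proposed for the playbook.

```python
import time


def retry_until(task, until, retries=3, delay=0.0, sleep=time.sleep):
    """Call task() until until(result) is true or the attempts run out.

    Returns (result, attempts_used). Raises RuntimeError when the
    condition is never met, so a persistent failure is still reported.
    """
    result = None
    for attempt in range(1, retries + 1):
        result = task()
        if until(result):
            return result, attempt
        if attempt < retries:
            # Give a transient error some time to clear before retrying.
            sleep(delay)
    raise RuntimeError(
        f"condition not met after {retries} attempts; last result: {result!r}"
    )
```

A transient failure, like the one that went away when the reporter's step was rerun locally, is absorbed by the loop, while a real breakage still fails after the last attempt.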
16:30:01 yep
16:30:08 we ran into it if operating systems dont match
16:30:20 so like if your repo is running centos 7.4
16:30:25 but your compute is 7.5
16:30:25 oh, that's interesting - no idea about that
16:30:30 compute tries to download centos-7.5
16:30:33 but that doesnt exist in the repo
16:30:44 * mnaser thinks we should use centos-7 only rather than centos-7
16:30:44 that'd definitely do it
16:30:46 like major release only
16:31:04 the workaround is to upgrade your repo and all containers
16:31:20 there has to be a single repo container for every distro/version/arch combo you want to serve
16:31:32 right, but centos 7.4 and 7.5 is similar (i think?)
16:31:37 so if you have a mix of target hosts, you need a mix of repo containers
16:31:41 i mean the difference isnt like 18.04 and 16.04
16:31:59 yeah, I dunno - I'm just saying what it was designed to do
16:32:06 odyssey4me: i think your build code might fix all of this too
16:32:22 i think?
16:32:24 all of that will hopefully go away soon, because that was implemented because of special packages like python-libvirt, which we no longer build
16:32:31 centos is supporting only the latest version.. I think even the deployers just want the latest...
16:32:40 so given we no longer build them, we could revert to a simpler deployment
16:33:13 anyway, that's a ton of work still to come in the next cycle.. so for now the reporter needs to know what you just said
16:33:48 added a comment
16:33:55 confirmed/medium
16:34:15 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784253
16:34:15 Launchpad bug 1784253 in openstack-ansible "Upgrade pike to queens failed " [Undecided,New]
16:35:17 id say
16:35:20 repo server is broken
16:35:26 so haproxy is returning 503
16:36:24 setting to invalid and adding comment that the repo server is down so thats the root cause
16:36:40 I think there is a message coming: more ppl should work on upgrades since I don't work on those anymore :(
16:36:51 upgrades worked fine for me :\
16:37:00 (all these bugs are the same user)
16:37:13 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784230
16:37:14 Launchpad bug 1784230 in openstack-ansible "Error while running setup-openstack.yml" [Undecided,New]
16:38:16 i am going to guess setup-infrastructure.yml isnt running or there is a firewall
16:38:24 * mnaser notices hostname of system is ceph-mon
16:38:28 Merged openstack/openstack-ansible-os_cinder master: Use generic vars file for ubuntu https://review.openstack.org/586690
16:38:32 Merged openstack/openstack-ansible-os_nova master: Remove dosfstools-dbg package from ubuntu installs https://review.openstack.org/587374
16:38:32 incomplete will ask for info if they ran it?
16:38:33 Merged openstack/openstack-ansible-os_nova master: Use generic vars file for ubuntu https://review.openstack.org/586704
16:38:50 just a sec
16:39:24 well, that error tends to come back if the galera_server haproxy connectivity is flaky/broken
16:39:39 yeah it's weirdly skipping hosts too
16:39:57 we don't know the branch, the cli call, nor if setup-infra ran fine
16:40:22 incomplete is the proper triage, but there is no message there
16:40:26 for the bug reporter
16:40:29 i just added a message.
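Editor's note, going back to the repo mismatch described for bug 1784264 (repo built on centos 7.4, compute on 7.5): a small sketch of why an exact distro/version/arch match misses and what a "major release only" key would change. The key layout and the function name are hypothetical and are not the actual OSA repo path format.

```python
def repo_key(distro, version, arch, major_only=False):
    """Build a lookup key for a repo tree (hypothetical layout).

    With major_only=True only the part of the version before the first
    dot is kept, so 7.4 and 7.5 map to the same key.
    """
    if major_only:
        version = version.split(".")[0]
    return f"{distro}-{version}-{arch}"


# Exact matching: the repo was built on 7.4, the compute host asks for 7.5.
published = {repo_key("centos", "7.4", "x86_64")}
wanted = repo_key("centos", "7.5", "x86_64")
exact_hit = wanted in published  # False, which is the 404 case

# Major-only matching: both sides collapse to the same key.
published_major = {repo_key("centos", "7.4", "x86_64", major_only=True)}
wanted_major = repo_key("centos", "7.5", "x86_64", major_only=True)
major_hit = wanted_major in published_major  # True
```

Under the same scheme Ubuntu 16.04 and 18.04 would still get different keys, which matches the remark that those two differ far more than two CentOS point releases.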
16:40:33 i was waiting because of the 'just a sec'
16:40:34 :)
16:40:37 ok I saw your message
16:40:39 ok
16:40:42 great
16:40:44 nex
16:40:48 #link https://bugs.launchpad.net/openstack-ansible/+bug/1784066
16:40:48 Launchpad bug 1784066 in openstack-ansible "ceph-install.yml using older version of ceph-ansible 3.0.9" [Undecided,New]
16:40:48 next
16:41:28 are we running newer ceph-ansible in queens or
16:43:04 https://github.com/openstack/openstack-ansible/blob/stable/queens/ansible-role-requirements.yml#L185-L188
16:43:24 well
16:43:30 if he removes the version of ceph ansible
16:43:36 sorry
16:43:39 if he removes the roles
16:43:45 the upgrade should have something to remove the old roles
16:43:48 then "rebootstrap" ceph ansible
16:43:51 if it doesn't, that's a bug
16:44:11 he is still with a version of ansible coming from OSA which may not be compatible with ceph-ansible
16:44:25 'ceph-ansible' and 'ceph_client' are correct - the others are old
16:44:56 well we never said we will support ppl randomly bumping ceph-ansible with the current code
16:45:00 oh look at comments they rebootstrapped
16:45:15 exactly
16:45:44 first line is osa was 3.0.9 I needed 3.1.x so I used the roles of 3.1 and now it doesn't work ...
16:45:59 so non-osa roles needs to be replaced in case of upgrades, right?
16:46:00 well... we can point to why it's failing
16:46:17 invalid/unsupported use case?
16:46:23 bugs are being used as support :\
16:46:39 Is there other official place for support?
16:46:44 except irc...
16:46:50 mailing lists
16:46:52 they are all from tux__
16:46:57 i think it is a valid bug looking at the upgrade playbooks
16:47:27 https://github.com/openstack/openstack-ansible/blob/master/scripts/upgrade-utilities/playbooks/ceph-galaxy-removal.yml something like that should have been created when the role-requirements moved to use the ceph-ansible repo instead of the galaxy role breakouts for -mon, -osd, etc
16:47:54 logan-: oh
16:47:59 so https://github.com/openstack/openstack-ansible/blob/stable/queens/ansible-role-requirements.yml#L188 is not enough
16:48:36 right that just checks out the ceph-ansible repo, but the old galaxy roles are still sitting higher in the roles path precedence
16:48:43 ok
16:48:47 that's the trick
16:48:52 thanks logan- !
16:49:03 yep, this is a typical feature implementation without considering the upgrade tooling
16:49:14 something we should be more wary of
16:49:28 it's super hard and we should all think about it in the reviews
16:49:34 but we are all hoomans
16:49:46 so status?
16:49:54 I think that could be important for monasca role, that has at least 5 role requirements
16:50:41 confirmed medium?
16:51:06 confirmed high
16:51:17 yeah, sounds fine to me - it'd be good to reference the discussion via an eavesdrop link
16:51:17 k done
16:51:19 logan-: can you push up a simple patch ?
16:51:27 ill put a comment to meeting after
16:51:34 #link https://bugs.launchpad.net/openstack-ansible/+bug/1783886
16:51:34 Launchpad bug 1783886 in openstack-ansible "haproxy_server: rsyslog unable to log haproxy locally" [Undecided,New]
16:51:40 need to up the pace to have some open discussion time.
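Editor's note on the roles path precedence point raised for bug 1784066: a sketch of first-match role lookup, which is why a stale galaxy role can shadow the copy inside the newer ceph-ansible checkout until it is removed. The directory names below are hypothetical, and the lookup is a simplified model of searching a list of role directories in order, not Ansible's own code.

```python
import os
import shutil
import tempfile


def resolve_role(role_name, roles_path):
    """Return the first directory in roles_path that contains role_name."""
    for base in roles_path:
        candidate = os.path.join(base, role_name)
        if os.path.isdir(candidate):
            return candidate
    return None


def demo():
    """Show the shadowing before and after the stale role is removed."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: an old galaxy role dir listed first, and the
        # roles shipped inside the repo checkout listed second.
        old = os.path.join(root, "galaxy_roles")
        new = os.path.join(root, "ceph-ansible", "roles")
        os.makedirs(os.path.join(old, "ceph-mon"))
        os.makedirs(os.path.join(new, "ceph-mon"))
        roles_path = [old, new]

        before = resolve_role("ceph-mon", roles_path)
        # What an upgrade cleanup step would have to do: drop the stale copy.
        shutil.rmtree(os.path.join(old, "ceph-mon"))
        after = resolve_role("ceph-mon", roles_path)
        return os.path.relpath(before, root), os.path.relpath(after, root)
```

Checking out the new repo alone changes nothing in this model; only removing the earlier entry lets the lookup reach the newer role, which is the gap the missing upgrade playbook would have covered.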
16:51:46 evrardjp ya
16:51:51 Kaio Kassiano Moura Oliveira proposed openstack/openstack-ansible-os_monasca-agent master: Add Ubuntu Bionic 18.04 support https://review.openstack.org/586933
16:51:57 perhaps better to just cut the bug triage off now
16:52:06 there are always bugs, those we miss this week we can cover next
16:52:14 odyssey4me: i agree
16:52:16 we have to make time for discussion
16:52:19 ++
16:52:23 lets cut it off here then
16:52:30 #topic open discussion
16:52:30 ok
16:52:35 Openvswitch configuration does not handle configuration properly on compute nodes. It should be configured with different interfaces on neutron agent container and compute hosts.
16:52:40 Tahvok: ^ all yours
16:52:43 Yep
16:53:00 well, queens/rocky have no neutron agent containers any more ;)
16:53:01 I've created some drawing for better understanding of the issue
16:53:25 odyssey4me: so... What do they have?
16:53:34 they sit on baremetal afaik
16:53:40 It goes onto the host instead.
16:53:43 get deployed directly on the host
16:53:59 so this only affects pike and older i assume
16:54:34 What was the reasoning to go bare metal on this?
16:54:41 Tahvok: sorry - shouldn't have interrupted you, please continue
16:55:06 You didn't... If it's bare metal, then the interface name on controller and compute is the same - issue solved :/
16:55:31 i think there is a way of setting variables *per host* i think?
16:55:34 Tahvok: It moved to bare metal so that L3 doesn't go down during major upgrades.
16:55:45 so you can configure vars for specific containers to work around this issue
16:55:49 And just generally that container changes don't affect the networking layer.
16:55:51 (because we had many issues of container rebootings)
16:55:52 Unless it would be cherry picked to pike?
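Editor's note on the per-host variable idea mentioned above: a sketch of how a host-specific override could take precedence over a shared default, so that a containerized agent and a compute host end up with different interface names. The variable name, the host names and the merge function are all hypothetical; this is not how OSA or Ansible implement variable precedence, only the general idea.

```python
def effective_vars(group_defaults, host_overrides, host):
    """Return the variables a host would see: defaults, then its overrides."""
    merged = dict(group_defaults)
    merged.update(host_overrides.get(host, {}))
    return merged


# Hypothetical default shared by every host in the group.
defaults = {"provider_interface": "eth12"}

# Hypothetical override for one compute host whose interface is named differently.
overrides = {"compute1": {"provider_interface": "bond1"}}

container_vars = effective_vars(defaults, overrides, "neutron_agents_container")
compute_vars = effective_vars(defaults, overrides, "compute1")
```

Hosts without an entry in the override table keep the shared default, so only the exceptions have to be written down.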
16:56:05 i dont think we'd have such a major change get cherry picked
16:56:09 no, that's not going to happen
16:56:09 however you can run it on baremetal
16:56:16 by using env.d and is_metal
16:56:20 agreed
16:56:22 Not the bare metal thing
16:56:26 The ovs configuration thing
16:56:29 yes, if you want to you can grab the env.d file and put it into your user space
16:57:01 context for ovs configuration?
16:57:11 oh yeah - so ovs config has always been partial and incomplete, but I've never seen anyone push up a patch to fix that
16:57:12 Sec
16:57:20 ovs works for us :X
16:57:34 thats on queens tho
16:57:41 cloudnull: we're picking up your topic shortly after if you are around / have the info :p
16:57:41 mnaser: just became the ovs expert in OSA :p
16:57:46 damnit
16:57:46 :p
16:57:47 Down here: https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-openvswitch.html
16:58:04 It says to configure network_interface var to set the interface
16:58:18 oh but that uses openvswitch for the br-mgmt bridge, we don't use ovs for br-mgmt, only for neutron
16:58:30 so i wouldn't be able to help out much as it's a different usecase :X
16:58:57 guilhermesp: Yes i have created those bug issue
16:59:08 I'd venture to say that using OVS for the container back-end is probably over complicating your life.
16:59:11 All I'm saying if the neutron agents are containerized, the interface name would differ between it and the compute hosts
16:59:31 Use linuxbridge for OSA's things, and use OVS for the neutron things - and keep them separate.
16:59:40 and i think there is a way in osa to override variables per host/container
16:59:53 that is the same issue that you might run into if you run different hardware
17:00:07 your port might be eth0 somewhere and eth1 elsewhere
17:00:13 Anyway, as you have said, starting from queens, it's bare metal, so there is no interface name difference
17:00:14 cloudnull: are you around? :>
17:00:19 Issue solved I guess :/
17:00:41 Tahvok: ++ sorry about that, you can run baremetal with OSA in queens
17:00:57 okay, cool, well, we're over time and i think cloudnull might be afks
17:00:58 mnaser: no, there won't be such difference, as you need to create br-vlan bridge - doesn't matter what interface is under it
17:01:07 Tahvok: true
17:01:08 #endmeeting