*** pbandark has quit IRC | 00:13 | |
*** lbragstad has quit IRC | 00:18 | |
*** chyka has quit IRC | 00:37 | |
*** Miouge has quit IRC | 00:43 | |
*** acormier has joined #openstack-ansible | 01:29 | |
*** acormier has quit IRC | 01:30 | |
*** acormier has joined #openstack-ansible | 01:30 | |
*** esberglu has quit IRC | 01:35 | |
*** lbragstad has joined #openstack-ansible | 01:56 | |
*** woodard_ has joined #openstack-ansible | 01:57 | |
*** woodard has quit IRC | 01:57 | |
*** acormier has quit IRC | 01:58 | |
*** chyka has joined #openstack-ansible | 02:10 | |
*** chyka has quit IRC | 02:14 | |
*** dave-mccowan has joined #openstack-ansible | 02:28 | |
*** dave-mccowan has quit IRC | 02:58 | |
*** andymccr has quit IRC | 03:00 | |
*** andymccr has joined #openstack-ansible | 03:03 | |
*** lbragstad has quit IRC | 03:04 | |
*** acormier has joined #openstack-ansible | 03:12 | |
*** acormier has quit IRC | 03:27 | |
*** dave-mccowan has joined #openstack-ansible | 03:38 | |
*** gkadam has quit IRC | 04:01 | |
*** gkadam has joined #openstack-ansible | 04:02 | |
*** udesale has joined #openstack-ansible | 04:04 | |
*** udesale_ has joined #openstack-ansible | 04:07 | |
*** dave-mccowan has quit IRC | 04:09 | |
JohnnyOSA | ls | 04:11 |
---|---|---|
*** nikm has quit IRC | 04:14 | |
cloudnull | mornings | 04:25 |
cloudnull | evenings | 04:25 |
cloudnull | :) | 04:25 |
*** udesale_ has quit IRC | 04:31 | |
*** udesale_ has joined #openstack-ansible | 04:31 | |
*** udesale has quit IRC | 04:32 | |
openstackgerrit | Merged openstack/openstack-ansible master: [Docs] Include test scenario as a new user story https://review.openstack.org/546523 | 04:38 |
*** gkadam has quit IRC | 04:41 | |
*** gkadam has joined #openstack-ansible | 04:51 | |
*** poopcat has quit IRC | 05:08 | |
gokhan | hi folks, I destroyed and recreated the rabbitmq containers for some reasons and now the rabbitmq users are missing. is there a better way to create rabbitmq users instead of running setup-openstack.yml ? | 05:34 |
openstackgerrit | Merged openstack/openstack-ansible-os_ceilometer stable/ocata: Deprecate auth_plugin option https://review.openstack.org/521860 | 05:39 |
openstackgerrit | Merged openstack/openstack-ansible-os_gnocchi stable/queens: Zuul: Remove project name https://review.openstack.org/545321 | 05:40 |
openstackgerrit | Merged openstack/openstack-ansible-os_gnocchi master: Change default gnocchi ceph pool name to metrics https://review.openstack.org/512313 | 05:40 |
openstackgerrit | Merged openstack/openstack-ansible-os_gnocchi stable/pike: Zuul: Remove project name https://review.openstack.org/542103 | 05:42 |
cloudnull | gokhan: yes. you still need to run setup-openstack.yml but you can do so with a tag | 05:43 |
cloudnull | to recreate the rabbitmq users | 05:43 |
cloudnull | the tag is common-rabbitmq | 05:43 |
cloudnull | to see all of the tags you can use the --list-tags option | 05:43 |
gokhan | cloudnull, I see that we can run with --step tag is it right tag ? | 05:44 |
*** bhujay has joined #openstack-ansible | 05:45 | |
cloudnull | the command would be `openstack-ansible setup-openstack.yml --tags common-rabbitmq` | 05:46 |
*** hybridpollo has quit IRC | 05:47 | |
gokhan | cloudnull, ok thanks. also I see that on rabbitmq containers dbus package is missing | 05:55 |
gokhan | cloudnull, because of that, when we run systemctl status rabbitmq-server we see a warning like "beware of timeouts" | 05:56 |
*** bhujay has quit IRC | 06:00 | |
*** bhujay has joined #openstack-ansible | 06:04 | |
*** aruns has joined #openstack-ansible | 06:04 | |
*** aruns__ has joined #openstack-ansible | 06:05 | |
*** nikm has joined #openstack-ansible | 06:08 | |
nikm | evrardjp: hi | 06:08 |
*** aruns has quit IRC | 06:09 | |
nikm | evrardjp: there is a diff between cert files of repo container and cert file of controller host in HA env deployed using openstack-ansible ocata | 06:10 |
*** dariko has quit IRC | 06:10 | |
nikm | evrardjp: thats why we were getting the curl issue http://paste.openstack.org/show/680299/ | 06:11 |
nikm | http://paste.openstack.org/show/680349/ | 06:11 |
nikm | when we copied manually the cert file of host to the container, the issue is getting solved | 06:12 |
nikm | how do we automate the copying of correct cert from host to containers | 06:12 |
cloudnull | gokhan: what release are you running. Ive seen that error in early pike | 06:14 |
cloudnull | you can install the dbus package | 06:15 |
gokhan | cloudnull, yes on pike 16.0.4 release. I installed it. if it is already solved, there is no problem, thanks :) | 06:16 |
cloudnull | I was looking for the patch | 06:17 |
cloudnull | but i cant quickly find it | 06:17 |
cloudnull | i do however know we fixed it | 06:17 |
cloudnull | but im glad you got it orted | 06:18 |
cloudnull | **sorted | 06:18 |
nikm | does anyone else know ? | 06:19 |
nikm | how do we copy the correct certs inside /etc/ssl/certs of container from controller host | 06:19 |
nikm | we are using openstack-ansible ocata on centos 7 | 06:20 |
nikm | http://paste.openstack.org/show/680299/ http://paste.openstack.org/show/680349/ are the issues which occurred inside the repo container and did not occur on the host hosting the containers | 06:21 |
nikm | when we copied manually the certs from /etc/ssl/certs of the host to the container, the issue gets resolved | 06:21 |
cloudnull | nikm: there are overrides for the various cert files however if you're using haproxy you should only need to override the one. | 06:21 |
cloudnull | all of the ssl is normally terminated at the lb | 06:21 |
cloudnull | are you seeing an ssl issue when hitting the repo containers directly ? | 06:22 |
gokhan | cloudnull, when run with common-rabbitmq tags, it gives error on ceph client. ceph_mon_host is undefined. I ignored this error. | 06:24 |
nikm | cloudnull: curl repos.fedorapeople.org is failing inside the repo container and there is no issue on the host | 06:25 |
nikm | so we took a diff on the cert files of host and container | 06:26 |
nikm | so we found one subject line is missing in container cert | 06:26 |
nikm | so when we manually copied cert from host to container | 06:26 |
nikm | the curl command started working | 06:27 |
*** acormier has joined #openstack-ansible | 06:27 | |
nikm | and also openstack-ansible setup-infrastructure.yml no longer failed on http://paste.openstack.org/show/680299/ http://paste.openstack.org/show/680349/ | 06:27 |
nikm | cloudnull : can u give the link for this "if you're using haproxy you should only need to override the one." | 06:29 |
*** acormier has quit IRC | 06:31 | |
nikm | cloudnull : can u tell me the overrides | 06:38 |
nikm | haproxy is not configured till now | 06:39 |
ThomasS | morning guys | 06:44 |
*** bhujay has quit IRC | 06:57 | |
*** ianychoi has quit IRC | 07:01 | |
*** arbrandes1 has joined #openstack-ansible | 07:01 | |
*** ianychoi has joined #openstack-ansible | 07:01 | |
*** mardim has joined #openstack-ansible | 07:03 | |
cloudnull | nikm: the overrides for haproxy can all be found here https://docs.openstack.org/openstack-ansible-haproxy_server/latest/ | 07:03 |
*** arbrandes has quit IRC | 07:03 | |
nikm | cloudnull: will it copy the host certs inside repo container | 07:05 |
*** SmearedBeard has joined #openstack-ansible | 07:06 | |
nikm | if we override haproxy | 07:06 |
*** bhujay has joined #openstack-ansible | 07:07 | |
*** vedin has joined #openstack-ansible | 07:11 | |
*** bhujay has quit IRC | 07:13 | |
vedin | Hi Everyone, I am facing an issue while running the setup-infrastructure playbook in the galera container; it gives the following error >>> http://paste.openstack.org/show/681310/ | 07:14 |
vedin | this issue looks HAProxy-related; I tried to re-run the HAProxy playbook, but I am not able to ping the internal LB IP | 07:15 |
nikm | hi guys | 07:23 |
nikm | how do we get log file in https://github.com/openstack/openstack-ansible/blob/stable/ocata/playbooks/haproxy-install.yml#L26 | 07:23 |
nikm | hi | 07:24 |
nikm | haproxy log file /var/log/haproxy | 07:24 |
*** aruns__ has quit IRC | 07:34 | |
*** aruns__ has joined #openstack-ansible | 07:36 | |
*** pcaruana has joined #openstack-ansible | 07:37 | |
*** john51 has quit IRC | 07:39 | |
sar | vedin: I've also had that error. It was a haproxy/keepalived issue. What is the status of the keepalived and haproxy services on the haproxy hosts? | 07:43 |
*** armaan has quit IRC | 07:48 | |
*** armaan has joined #openstack-ansible | 07:48 | |
*** SmearedBeard has quit IRC | 07:53 | |
*** Miouge has joined #openstack-ansible | 07:54 | |
*** chyka has joined #openstack-ansible | 07:56 | |
*** lvdombrkr has joined #openstack-ansible | 07:59 | |
*** lvdombrkr has quit IRC | 07:59 | |
*** lvdombrkr has joined #openstack-ansible | 07:59 | |
*** chyka has quit IRC | 08:01 | |
*** epalper has joined #openstack-ansible | 08:10 | |
*** threestrands_ has joined #openstack-ansible | 08:15 | |
*** admin0 has joined #openstack-ansible | 08:16 | |
admin0 | morning \o | 08:16 |
*** threestrands has quit IRC | 08:18 | |
*** mbuil has joined #openstack-ansible | 08:20 | |
*** jwitko_ has quit IRC | 08:25 | |
evrardjp | morning | 08:27 |
sar | morning evrardjp | 08:28 |
lvdombrkr | morning guys.. is it possible to connect a linux bridge to an ovs bridge? | 08:28 |
evrardjp | nikm: hey -- to modify containers you have two choices | 08:28 |
evrardjp | either modify the base container that gets used when creating new containers | 08:29 |
evrardjp | Or you modify the containers at their creation | 08:29 |
evrardjp | (you have a third option that could be modifying the container later, but that sounds bad) | 08:29 |
admin0 | i have strange issue .. greenfield new deployment: https://gist.github.com/a1git/743068ba24e1d2c357109e0adaac8328 | 08:29 |
admin0 | lxc_hosts : Retrieve base image | 08:30 |
evrardjp | lvdombrkr: while it would be technically possible, I doubt it's a great idea to mix and match on the same host | 08:30 |
evrardjp | admin0: there is a thing going on about that | 08:30 |
admin0 | aha .. | 08:30 |
admin0 | ok | 08:30 |
evrardjp | we did something -- then we changed it -- then it broke the upgrades -- so we added a compatibility layer | 08:31 |
*** sxc731 has joined #openstack-ansible | 08:31 | |
evrardjp | admin0: you might want to check in this commit: https://review.openstack.org/#/c/545849/1 | 08:31 |
evrardjp | it has links to changes | 08:32 |
evrardjp | depending on your version you might be hitting something. | 08:32 |
evrardjp | vedin: there are many causes that can lead to "no route to host" It depends on your config. | 08:33 |
evrardjp | sar: good morning :) | 08:33 |
vedin | sar : haproxy service is running but keepalived is not | 08:33 |
evrardjp | lvdombrkr: what are you trying to achieve? | 08:33 |
evrardjp | vedin: how many hosts do you have for haproxy hosts? | 08:33 |
admin0 | hmm. 1TB space in the variables | 08:33 |
evrardjp | admin0: check also the history of the bug, it might be something that's wrongly reported/need reboot or fancy stuff. | 08:34 |
evrardjp | vedin: if you have only one node, that's normal to not have keepalived. | 08:34 |
evrardjp | vedin: if you have more than one, your keepalived configuration is busted. Please give us your internal_lb_vip_address and external one + your user_*.yml variables. | 08:35 |
vedin | evrardjp: I am using internal_lb_vip_address: 192.168.4.100, and external_lb_vip_address: 192.168.64.128 | 08:37 |
cmorelli | morning guys. thanks to cloudnull and evrardjp for the help yesterday | 08:38 |
cmorelli | I finally managed to finish the deployment | 08:38 |
vedin | evrardjp: internal_lb_vip_address is from the management network and external_lb_vip_address from the external network | 08:39 |
admin0 | lvdombrkr, don't :D | 08:39 |
cmorelli | but now if I try to connect to Horizon dashboard I get this error "CSRF verification failed. Request aborted." right after the Horizon admin login | 08:39 |
lvdombrkr | evrardjp: nothing yet, i just want to know if it is possible.. can't find any information around | 08:39 |
admin0 | err .. | 08:40 |
admin0 | i set that variable and am retrying | 08:40 |
lvdombrkr | admin0 evrardjp: so yes or no? ))) | 08:40 |
*** electrofelix has joined #openstack-ansible | 08:40 | |
admin0 | lvdombrkr, i am setting up a new env today .. it uses ovs .. but needs some creative thinking | 08:41 |
admin0 | like only vlan and vxlan on ovs .. and only on computes . not on controllers as the script fails . . and network is on metal on one of the compute nodes and not on controllers | 08:41 |
admin0 | doing that, can work with ovs and get things done | 08:41 |
sar | vedin: do you have multiple haproxy hosts, or just one? | 08:42 |
cmorelli | more or less the same as shown here https://ask.openstack.org/en/question/50839/forbidden-403-csrf-verification-failed-request-aborted-more-information-is-available-with-debugtrue/ | 08:42 |
vedin | sar: I am using 3 nodes for haproxy | 08:43 |
sar | and you haven't specifically told it not to use keepalived? i think it defaults to installing keepalived if you have more than one haproxy host | 08:43 |
sar | if so, i guess keepalived should be running, and one of the hosts should have the vip's assigned | 08:44 |
vedin | sar : I have this configuration for keepalived >> http://paste.openstack.org/show/681446/ | 08:46 |
vedin | In all nodes keepalived service is down | 08:47 |
*** gillesMo has joined #openstack-ansible | 08:48 | |
sar | ok, then that is the problem | 08:48 |
sar | what is the output of systemctl status keepalived.service ? | 08:48 |
admin0 | evrardjp, this space requirement, that is on which node | 08:52 |
evrardjp | lvdombrkr: ovs is getting more traction lately and is tested in opnfv. We plan to introduce a scenario soon | 08:52 |
gillesMo | Hello, I have duplicate hypervisors and also a non-functioning overview page in horizon. I read this bug: https://bugs.launchpad.net/openstack-ansible/+bug/1736731 and it is my problem, I have 3 cells (cell0 and 2 cell1). I don't know if it's from my upgrade from Ocata to Pike or from a change of my internal_lb_vip_address (impacts endpoints URLs). | 08:53 |
openstack | Launchpad bug 1736731 in openstack-ansible "os_nova might create a duplicate cell1" [High,Confirmed] - Assigned to Jean-Philippe Evrard (jean-philippe-evrard) | 08:53 |
evrardjp | but right now it's not as tested as lxb | 08:53 |
evrardjp | unless you really need it, I'd say use lxb | 08:53 |
admin0 | lvdombrkr, i can share my configs on my hybrid ovs setup | 08:53 |
gillesMo | How many cells do we really need ? cell0, cell1, both ? | 08:53 |
evrardjp | admin0: container hosts | 08:53 |
evrardjp | gillesMo: at least cell0 cell1 | 08:53 |
armaan | evrardjp: Hello, how are you today? | 08:53 |
evrardjp | armaan: hello! | 08:53 |
admin0 | i have 734G gb free | 08:53 |
armaan | logan-: Hello, are you around? | 08:53 |
evrardjp | good and you ? | 08:54 |
evrardjp | admin0: then you're hitting the bug | 08:54 |
armaan | evrardjp: same as usual, struggling with something in the Ceph land :) | 08:54 |
evrardjp | hahaha who doesn't :) | 08:54 |
evrardjp | andymccr: and logan- are the best for your case | 08:54 |
admin0 | setting up that var in user_variables did not help | 08:54 |
evrardjp | vedin: hmmmm | 08:55 |
armaan | I am just wondering if anyone knows about the stable-3.0-good branch in ceph-ansible | 08:55 |
evrardjp | it should be up in all nodes | 08:55 |
lvdombrkr | admin0: yes it will be useful | 08:55 |
vedin | sar: keepalived service output >> http://paste.openstack.org/show/681453/ | 08:55 |
evrardjp | vedin: the interfaces exist on the host? | 08:55 |
gillesMo | evrardjp: OK, thanks. After deleting one of my 2 cell1, it works, but after updating cell0 (to correct the endpoint URL), I'm now getting duplicate hypervisors/compute services... | 08:55 |
vedin | evrardjp: yes it is, and it is associated with the br-flat bridge | 08:56 |
evrardjp | vedin: ahah | 08:57 |
evrardjp | found the issue -- read your log | 08:57 |
evrardjp | L11. | 08:57 |
evrardjp | vs L5-6 on http://paste.openstack.org/show/681446/ | 08:57 |
evrardjp | misconfiguration -- did you re-run the haproxy playbook after changing vars? | 08:58 |
admin0 | evrardjp, totally lost on how to move ahead :D | 08:58 |
evrardjp | gillesMo: wow. | 08:58 |
vedin | evrardjp: yes, but that interface is for external network for openstack | 08:58 |
admin0 | lvdombrkr, i am going to create a quick writeup on the current state of ansible + ovs | 08:58 |
evrardjp | gillesMo: why don't you use our playbooks? | 08:59 |
vedin | it is saying to provide IP | 08:59 |
*** pbandark has joined #openstack-ansible | 08:59 | |
evrardjp | it should modify the endpoints | 08:59 |
vedin | br-flat is associated with that interface | 08:59 |
evrardjp | vedin: drop your keepalived config | 08:59 |
evrardjp | this way I can understand | 08:59 |
evrardjp | network config and keepalived config and I have a full view :) | 08:59 |
evrardjp | admin0: you might be hitting a bug that's severe. | 09:00 |
evrardjp | which branch/version? | 09:00 |
admin0 | stable/pike | 09:00 |
evrardjp | changing the variable wouldn't do things without the update of the roles | 09:00 |
vedin | evrardjp: How to do that ? | 09:00 |
gillesMo | evrardjp: which playbook ? Of course I ran multiple times the os-* playbooks or setup-openstack.yml, haproxy-install.yml to change the endpoints URL, but I think that's what leads to my duplicate cell | 09:00 |
evrardjp | vedin: cat /etc/keepalived/keepalived.conf | pastebinit | 09:01 |
evrardjp | vedin: cat /etc/network/interfaces | pastebinit | 09:01 |
evrardjp | I love my useless cat | 09:01 |
evrardjp | you can ofc use pastebinit /etc/network/interfaces | 09:01 |
vedin | I am using centOS | 09:01 |
evrardjp | vedin: ... you get the idea :) | 09:02 |
evrardjp | gillesMo: that's the issue we have to solve then | 09:02 |
evrardjp | if you can redo-it, and file a bug that would be great. | 09:03 |
evrardjp | The idea is that we should adapt automatically. | 09:03 |
evrardjp | if it's dangerous, maybe throw a warning or something. | 09:03 |
*** pbandark has quit IRC | 09:03 | |
gillesMo | evrardjp: OK but I think this bug describe it well : https://bugs.launchpad.net/openstack-ansible/+bug/1736731 | 09:03 |
openstack | Launchpad bug 1736731 in openstack-ansible "os_nova might create a duplicate cell1" [High,Confirmed] - Assigned to Jean-Philippe Evrard (jean-philippe-evrard) | 09:03 |
*** Sha000000 has joined #openstack-ansible | 09:03 | |
evrardjp | darn. | 09:04 |
gillesMo | Sorry ! | 09:04 |
evrardjp | haha | 09:04 |
evrardjp | it's alright, I just didn't get any cycles to do it. Imagine! | 09:04 |
evrardjp | that's a high bug. | 09:04 |
evrardjp | any help is welcomed. | 09:04 |
vedin | evrardjp : keepalived configuration >> http://paste.openstack.org/show/681469/ >>> interfaces >> http://paste.openstack.org/show/681472/ | 09:05 |
evrardjp | vedin: is haproxy running? | 09:05 |
vedin | yes | 09:05 |
evrardjp | vedin: that's good. So next question -- why eno2 | 09:06 |
vedin | because that interface is related to that IP 192.168.64.128 | 09:06 |
evrardjp | it seems like the interface that will carry traffic is br-flat right? | 09:06 |
*** pbandark has joined #openstack-ansible | 09:07 | |
vedin | yes | 09:07 |
evrardjp | so why not having br-flat in the config? | 09:07 |
vedin | it was br-flat before, and the same issue occurred then too | 09:07 |
admin0 | evrardjp, is there a known tag that works ? | 09:07 |
evrardjp | mmm let's be consistent there | 09:07 |
vedin | ok, I will replace with br-flat now | 09:08 |
evrardjp | vedin: could you show me your br- information ? | 09:08 |
vedin | sure | 09:08 |
evrardjp | because I don't see bridges IPs there | 09:08 |
evrardjp | vedin: also: why is the ping IP 192.168.4.1 instead of 192.168.64.<external router ip> ? | 09:09 |
evrardjp | it would make more sense to shut down if you can't reach the network | 09:10 |
evrardjp | if a node is disconnected from the network at least | 09:10 |
vedin | evrardjp: bridge interfaces >> http://paste.openstack.org/show/681485/ | 09:10 |
evrardjp | it can work on the internal side, it just loses business value imo | 09:10 |
evrardjp | (test clustering is already what vrrp is doing) | 09:11 |
cmorelli | I probably found out what my problem is. Browsing the OS documentation it seems that I need to change a parameter in /etc/openstack_dashboard/local_settings.py, because I'm using HAProxy without SSL and I have to change a boolean flag over there. How to do this with openstack-ansible though? | 09:11 |
evrardjp | vedin: br-flat doesn't have any networking? | 09:11 |
evrardjp | sorry wrongly said | 09:11 |
cmorelli | I mean, what is the correct procedure? | 09:11 |
evrardjp | no IP? | 09:11 |
vedin | this bridge I want to use for openstack provider network | 09:12 |
evrardjp | oh ok | 09:12 |
evrardjp | so it's dedicated to that | 09:12 |
evrardjp | ok let's do something different then | 09:12 |
vedin | yes | 09:12 |
evrardjp | vedin: please configure your external lb vip address to an ip (or a dns name that points to an ip) in the range of your internal lb vip address. but a different one than the internal vip. Configure your user_* to this address, and use the nic as br-mgmt. | 09:13 |
admin0 | vedin, if your router/NAT can see mgmt, you can use the same IP range in both | 09:14 |
evrardjp | because that's effectively what you're doing: all your api traffic will flow through that network. | 09:14 |
admin0 | one will be for internal haproxy .. one will be external | 09:14 |
evrardjp | your tenant traffic would still be isolated on your flat network, and no overlap. Problem solved! | 09:15 |
evrardjp | gillesMo: I am very sorry you have that issue right now. | 09:15 |
evrardjp | In the meantime, I guess CLI calls are your best choice. | 09:15 |
evrardjp | admin0: to my knowledge they all work but start to fail after the first reboot, and that's what I think you're hitting. And that's what the bug is fixing. | 09:16 |
admin0 | evrardjp, is there a known case of a windows95 style reboot of the controller nodes (which is also the container hosts) and retry script to fix ? | 09:18 |
*** hamzy_ has quit IRC | 09:20 | |
*** hamzy_ has joined #openstack-ansible | 09:20 | |
evrardjp | admin0: I am not sure I understand what you mean -- retry script to fix? | 09:20 |
admin0 | i mean i cannot figure out how to fix and move ahead with the setup-hosts | 09:21 |
admin0 | so was asking if a reboot of the controllers fixes it | 09:22 |
evrardjp | admin0: I didn't get the chance to work on that -- you should probably contact other ppl, like cloudnull | 09:24 |
evrardjp | maybe jrosser has seen that | 09:24 |
nikm | evrardjp : how do I change base container or containers at creation | 09:30 |
vedin | evrardjp: if I want to use an external network ip for the external lb vip, then what changes do I need to make in my environment ?? | 09:30 |
evrardjp | nikm: it's in lxc_container_create role (for at creation) or lxc_host (for base) | 09:31 |
evrardjp | nikm: these are cache prep commands for example: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/stable/ocata/vars/ubuntu-16.04.yml#L56-L87 | 09:32 |
evrardjp | you can override some for your use case | 09:32 |
nikm | evrardjp: ok thanks | 09:33 |
evrardjp | vedin: not sure what you mean -- you mean using an ip from the same public range for provider network/tenant network AND api ? | 09:33 |
evrardjp | vedin: it's technically doable, but I'd put that into a separate nic. separation of concerns. | 09:34 |
vedin | yes, from the same external network | 09:34 |
evrardjp | You can do veth pairs and stuff like that to overcome the problem. | 09:34 |
evrardjp | that's up to you | 09:34 |
evrardjp | but you can't use the carrying interface for assigning an IP -- you must assign to the bridge. And then the bridge cannot be passed to neutron | 09:35 |
evrardjp | so you could have a br-ext that's wired to br-flat | 09:35 |
evrardjp | and br-ext having an IP | 09:35 |
evrardjp | vedin: we are using this kind of trickery in https://docs.openstack.org/openstack-ansible/latest/user/test/example.html#host-network-configuration | 09:36 |
evrardjp | not exactly the same for your use case though | 09:37 |
admin0 | lvdombrkr, http://www.openstackfaq.com/openstack-ansible-with-ovs-pike/ | 09:37 |
*** threestrands_ has quit IRC | 09:37 | |
admin0 | that is my config | 09:38 |
admin0 | on osa+ovs | 09:38 |
admin0 | this is working on PoC .. now trying to do a real deployment and stuck on the setup-hosts lxc stuff | 09:38 |
evrardjp | vedin: oh you're on centos -- maybe you don't need both | 09:39 |
evrardjp | you can maybe just have a veth that's plugged into the host and into the br-flat. | 09:40 |
evrardjp | (yeah that's on the same host) | 09:40 |
admin0 | i am going to put my hosts mapping as well | 09:40 |
admin0 | so that its more clear | 09:40 |
evrardjp | but on the host you could use that veth end for assigning an ip | 09:40 |
evrardjp | that should do the trick without br-ext. | 09:41 |
evrardjp | please note your traffic would still overlap -- You'd be on the same bridge and all. | 09:41 |
evrardjp | so tenant traffic can crush your apis | 09:41 |
evrardjp | but you can implement tc on outbound. which is not fully helpful but a good start. | 09:42 |
admin0 | lvdombrkr, updated: http://www.openstackfaq.com/openstack-ansible-with-ovs-pike/ | 09:42 |
admin0 | hopefully it will help | 09:42 |
evrardjp | vedin: does that help? | 09:43 |
*** gkadam_ has joined #openstack-ansible | 09:43 | |
*** gkadam has quit IRC | 09:44 | |
*** gkadam_ has quit IRC | 09:44 | |
*** gkadam_ has joined #openstack-ansible | 09:44 | |
lvdombrkr | admin0: thanks, i will look into it | 09:45 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Fix references https://review.openstack.org/546524 | 09:48 |
nikm | evrardjp: what about giving here /etc/ssl/certs https://github.com/openstack/openstack-ansible-lxc_hosts/blob/stable/ocata/vars/redhat-7.yml#L41 | 09:48 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Move more examples to user guide https://review.openstack.org/546525 | 09:49 |
nikm | evrardjp: will it copy the certs from host | 09:49 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Move Ceph example to user guides https://review.openstack.org/546526 | 09:49 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Move network architecture into reference https://review.openstack.org/546538 | 09:49 |
*** aruns has joined #openstack-ansible | 09:50 | |
*** aruns__ has quit IRC | 09:50 | |
evrardjp | nikm: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/02f2a6bf7d96a38c286d0c07a2408d1ee6ad9933/tasks/lxc_cache_preparation.yml#L83 | 09:50 |
evrardjp | nikm: you therefore have better: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/stable/ocata/defaults/main.yml#L106-L111 | 09:51 |
*** aruns__ has joined #openstack-ansible | 09:51 | |
evrardjp | this work is great right? :p | 09:51 |
nikm | :) | 09:52 |
*** SmearedBeard has joined #openstack-ansible | 09:52 | |
*** aruns has quit IRC | 09:54 | |
admin0 | i set my variable lxc_host_machine_volume_size: 500G and will retry | 09:55 |
admin0 | or i need to cherrypick that patch evrardjp ? | 09:56 |
evrardjp | admin0: check on your existing code , and maybe look for the previous days conversations. | 09:56 |
evrardjp | I can't help you there. | 09:56 |
admin0 | when will it go into stable/pike :D ? | 09:57 |
admin0 | so that nothing "fancy/extra" needs to be done and documented | 09:57 |
evrardjp | needs backporting first, then need bump | 10:03 |
evrardjp | I can't say but at least 2 weeks. | 10:03 |
*** aruns__ has quit IRC | 10:03 | |
openstackgerrit | Periyasamy Palanisamy proposed openstack/openstack-ansible master: Make Opendaylight as the BGP speaker using Quagga https://review.openstack.org/523907 | 10:04 |
*** aruns__ has joined #openstack-ansible | 10:04 | |
admin0 | evrardjp, is there also a switch feature to move from new lxc to old lxc behaviour .. ? | 10:10 |
admin0 | that might fix right | 10:10 |
admin0 | nah .. manually bump up the quota :D | 10:12 |
openstackgerrit | Merged openstack/openstack-ansible master: [Docs] Fix references https://review.openstack.org/546524 | 10:12 |
admin0 | machinectl set-limit infinity | 10:13 |
admin0 | :D | 10:13 |
openstackgerrit | Merged openstack/openstack-ansible master: [Docs] Move more examples to user guide https://review.openstack.org/546525 | 10:14 |
openstackgerrit | Merged openstack/openstack-ansible master: [Docs] Move Ceph example to user guides https://review.openstack.org/546526 | 10:14 |
openstackgerrit | Merged openstack/openstack-ansible master: [Docs] Move network architecture into reference https://review.openstack.org/546538 | 10:17 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible stable/queens: Remove periodic translations job https://review.openstack.org/546936 | 10:22 |
vedin | evrardjp : You mean, we should have IP assigned interface in all controller nodes which we want to use for external lb | 10:35 |
Taseer | evrardjp: is there anything I can do to mitigate => http://logs.openstack.org/71/503971/28/check/openstack-ansible-deploy-congress-ubuntu-xenial/aec2047/job-output.txt.gz#_2018-02-20_10_31_41_607378 | 10:35 |
admin0 | evrardjp, so I had to set-limit 500G and then systemctl restart /var/lib/machines to fix it and move forward | 10:37 |
evrardjp | vedin: I think it's advised to separate tenant traffic from API traffic, so it would be wise indeed to have an external_lb_vip_address on an ip/dns that's different. But like I said it's possible to do both. | 10:39 |
evrardjp | admin0: does it work? | 10:39 |
admin0 | ansible worked fine on the one machine where i set the limit and tested | 10:39 |
admin0 | now running the playbook like normal | 10:40 |
admin0 | if it worked on 1, should work on the other 2 controllers as well | 10:40 |
admin0 | will update | 10:40 |
evrardjp | Taseer: you'll have to debug it yourself I am afraid. | 10:40 |
*** nattanon has joined #openstack-ansible | 10:40 | |
nattanon | Hello !!! guys. Really need a help !!! | 10:41 |
evrardjp | nattanon: hello | 10:41 |
evrardjp | we'll try to do our best with the available resources. | 10:41 |
evrardjp | :D | 10:41 |
admin0 | hello nattanon | 10:41 |
gokhan | hi evrardjp odyssey4me, I have a problem on rabbitmq clusters. I have 2 different environments with the same configs, but on one of them rabbitmq gives error reports about timeouts. these are some logs: http://paste.openstack.org/show/681599/ I also get errors in the nova compute logs like this: http://paste.openstack.org/show/681586/ . What can be the reason for this? it is weird that on my second environment there is no timeout error. the only difference between my environments is that the servers are a different brand. what are your thoughts ? | 10:42 |
nattanon | I'm running OSA pike version with tag 16.0.8 | 10:42 |
admin0 | gokhan, different brand = ? | 10:42 |
admin0 | whats a brand ? | 10:42 |
admin0 | different os, different tags, different hardware ? | 10:42 |
gokhan | admin0, I mean hp and dell servers | 10:43 |
admin0 | what does tcpdump show ? | 10:43 |
nattanon | Then I ran into problems evacuating VMs provisioned using a volume | 10:43 |
admin0 | some hp's have crappy network cards :D | 10:43 |
gokhan | admin0, you are right. problem is on hp servers | 10:43 |
*** gunix has left #openstack-ansible | 10:43 | |
admin0 | :D | 10:44 |
admin0 | dump hp .. move to dell | 10:44 |
admin0 | speaking from experience ( might not be true in your case ) .. change to a different network card than what comes in HP and issue solved | 10:45 |
nattanon | Do you guys have any idea for that ? | 10:45 |
admin0 | nattanon, logs ? | 10:45 |
admin0 | problems is a very broad term as well .. like you had pain in the finger when typing evacuate :D | 10:46 |
admin0 | nattanon, the vm's don't want to move to new hypervisors ? | 10:47 |
gokhan | admin0, yep move to dell :) but now it is very difficult to change network card :( | 10:49 |
admin0 | you unscrew the old one out and screw the new one in | 10:49 |
nattanon | @admin0 , VM moved to new hypervisor but cant boot. Note that i'm using volume be a boot disk. | 10:50 |
nattanon | cinder-volume. | 10:51 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible stable/pike: Update existing container_networks https://review.openstack.org/528357 | 10:52 |
openstackgerrit | Maxime Guyot proposed openstack/openstack-ansible master: Ceph RadosGW integration https://review.openstack.org/517856 | 10:53 |
*** indistylo has joined #openstack-ansible | 10:54 | |
admin0 | nattanon, what is in the console ? | 10:54 |
*** gkadam__ has joined #openstack-ansible | 10:54 | |
*** aruns__ has quit IRC | 10:55 | |
nattanon | @admin0 Wait a sec let me give you all data about us, I'm preparing. | 10:55 |
*** gkadam_ has quit IRC | 10:57 | |
gokhan | admin0, yep I asked boss and we sold Mellanox 25gb cards but we can use them 2 months later :( and is there another way to solve it which you would advise ? maybe increase the timeout ? and also there is a bug in rabbitmq itself. one process, beam.smp, consumes more cpu | 10:58 |
*** manuelbuil has joined #openstack-ansible | 10:58 | |
gokhan | *sold took | 10:58 |
cmorelli | hi guys, any help on this? https://ibin.co/3sZjJfVWJf6M.png | 10:58 |
*** EmilienM has quit IRC | 10:59 | |
*** mbuil has quit IRC | 10:59 | |
*** nwonknu has quit IRC | 10:59 | |
*** mcarden has quit IRC | 10:59 | |
*** mattoliverau has quit IRC | 10:59 | |
cmorelli | I get this error using 15.1.15 playbooks | 10:59 |
nattanon | @admin0 Order of events: after compute1 went down we evacuated all VMs to compute2. All VMs were moved fine but can't boot the OS. | 10:59 |
cmorelli | trying to login to the dashboard for the first time | 10:59 |
nattanon | https://drive.google.com/open?id=1VAOFUrXlaK_DTTRWRgI_Wc_Dj3U4xxQY ---------- this is error log from console. | 10:59 |
admin0 | cmorelli, not using SSL is a crime :D | 10:59 |
*** mattoliverau has joined #openstack-ansible | 11:00 | |
*** mcarden has joined #openstack-ansible | 11:00 | |
*** EmilienM has joined #openstack-ansible | 11:00 | |
admin0 | nattanon, i think it happens if the disk was busy during the crash . use the nova rescue, boot to the new image, mount this disk and fsck | 11:00 |
cmorelli | admin0: am I doing something wrong? I'm setting `haproxy_ssl: false` in user-variables before the deployment, otherwise the setup-openstack.yml playbook fails | 11:00 |
cmorelli | I appreciate the help ;) | 11:01 |
cmorelli | I know not using ssl is a crime, but I would like to see it working first | 11:01 |
admin0 | cmorelli, not even using proper hostnames ? | 11:02 |
*** nwonknu has joined #openstack-ansible | 11:03 | |
nikm | evrardjp : while creating container which file of https://github.com/openstack/openstack-ansible-lxc_container_create can we use for copying files from host | 11:03 |
nikm | since we do not want to recreate container base image | 11:03 |
cmorelli | @admin0, what do you mean? | 11:04 |
nattanon | Then we tried to delete the instance, but it can't be completely deleted and the volume can't be deleted; my compute node goes down and can't get past this error. | 11:04 |
nattanon | https://pastebin.com/QvZVLQBF | 11:04 |
nikm | evrardjp: will container base image will be recreated with # openstack-ansible setup-infrastructure.yml --syntax-check | 11:05 |
nikm | openstack-ansible setup-infrastructure.yml | 11:05 |
nattanon | @admin0 i will try your advice first. | 11:06 |
*** pcaruana has quit IRC | 11:07 | |
admin0 | nattanon, first, on the compute node the vms were migrated to, use nova rescue, do the fsck and fix the volumes so that they boot .. client happy | 11:07 |
gokhan | evrardjp, how can we reach rabbitmq on browser ? I mean which user I must use ? | 11:07 |
admin0 | then work on the other one which gives this or that error to fix | 11:07 |
admin0 | evrardjp, how do I increase: Ensure that the LXC cache has been prepared (X retries left). | 11:08 |
admin0 | is there a var for it | 11:08 |
admin0 | so it works in 1 controller | 11:08 |
admin0 | but in the other 2 it failed .. but not due to that limit error | 11:08 |
admin0 | i think the disks on raid are a bit slow | 11:09 |
admin0 | so they are not getting enough time to expand as the retry limit is already reached | 11:09 |
*** alex____ has joined #openstack-ansible | 11:09 | |
*** alex____ has quit IRC | 11:09 | |
*** fusmu has joined #openstack-ansible | 11:10 | |
evrardjp | admin0: I don't think we have a var for that let me double check | 11:11 |
evrardjp | lxc_cache_prep_timeout | 11:11 |
evrardjp | 1200seconds | 11:12 |
evrardjp | lxc_cache_prep_timeout: 6000 | 11:12 |
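A minimal sketch of applying the override evrardjp gives above. It assumes overrides are kept in `/etc/openstack_deploy/user_variables.yml`; that path and the `set_override` helper are illustrative assumptions, not something stated in the channel.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenStack-Ansible): set or replace a
# top-level "key: value" line in a user variables file.
set_override() {
    file="$1"; key="$2"; value="$3"
    touch "$file"
    # Copy every line except an existing entry for this key.
    # grep exits non-zero when nothing is left to print, which is fine here.
    grep -v "^${key}:" "$file" > "${file}.tmp" || true
    # Append the new value and swap the file into place.
    printf '%s: %s\n' "$key" "$value" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
}

# Example call (path is an assumption; adjust to the deployment):
# set_override /etc/openstack_deploy/user_variables.yml lxc_cache_prep_timeout 6000
```

Calling the helper again with a larger value replaces the earlier line instead of adding a duplicate key.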
nattanon | @admin0 I'm curious whether an instance with a volume can be evacuated as a standard feature ?? | 11:12 |
nikm | evrardjp: hi | 11:13 |
admin0 | nattanon, it is a standard feature | 11:13 |
admin0 | did you evacuate after or before a crash | 11:14 |
admin0 | like pre-maintenance migration or post-crash evacuation | 11:14 |
cmorelli | @admin: this is exactly what I did: in /etc/openstack-deploy.yml I set `openstack_service_publicuri_proto: http` and `haproxy_ssl: false`; then I had to modify CSRF_COOKIE_SECURE=False and SESSION_COOKIE_SECURE=False in ansible conf `/etc/ansible/roles/os_horizon/templates/horizon_local_settings.py.j2` | 11:14 |
nikm | evrardjp: where can we copy files in https://github.com/openstack/openstack-ansible-lxc_container_create | 11:14 |
nattanon | before it crash . | 11:14 |
cmorelli | still, I get the CSRF error after login | 11:14 |
nattanon | @admin0 before it crash. | 11:14 |
admin0 | how many moved, how many worked fine , how many did not work at all ? | 11:15 |
nattanon | admin0: All failed to boot, but relocating to the new hypervisor is fine. | 11:15 |
admin0 | did nova rescue help ? | 11:16 |
admin0 | used it before ? | 11:16 |
admin0 | also what is the backend ? nfs, ceph, iscsi ? | 11:16 |
admin0 | @cmorelli, never tried that use case .. so no idea :) | 11:17 |
admin0 | mine is ssl, proper certificate and hostname mapping for both ext and int = must haves even before touching ansible | 11:17 |
evrardjp | nikm: don't copy the files, just pass the variables of the path on the host of the files you want to copy to container cache | 11:18 |
nattanon | admin0: Need 5 min, my compute is down so I need to pull it out and plug it in again. Like I said, deleting the evacuated vm makes my compute go away. | 11:18 |
*** Jack_Iv has joined #openstack-ansible | 11:19 | |
nattanon | T_T | 11:19 |
nattanon | admin0: More information cinder volume backend using ceph. | 11:21 |
admin0 | nattanon, what command u used ? | 11:22 |
admin0 | to evacuate ? | 11:22 |
admin0 | evrardjp, retrying with value set to 12000 | 11:22 |
*** epalper has quit IRC | 11:23 | |
nikm | evrardjp: what is the variable name in https://github.com/openstack/openstack-ansible-lxc_container_create | 11:23 |
nattanon | admin0: button on horizon | 11:24 |
nikm | to be used | 11:24 |
*** portante has quit IRC | 11:24 | |
admin0 | nikm, what exactly are you trying to do ? | 11:24 |
nattanon | admin0: Oooopss !!!!! nova rescue is work. | 11:24 |
nikm | evrardjp : like you told for base container image https://github.com/openstack/openstack-ansible-lxc_hosts/blob/stable/ocata/defaults/main.yml#L106-L111 | 11:24 |
admin0 | is working ? or more work for you ? | 11:25 |
nikm | evrardjp: i want to copy host cert files to an existing container | 11:25 |
nikm | while running openstack-ansible setup-infrastructure.yml | 11:25 |
admin0 | nikm, isn't that controlled using a variable and setup during haproxy run ? | 11:25 |
nattanon | admin0: sad it more work for me. T_T | 11:25 |
admin0 | oh .. well, there is no magic button :) | 11:26 |
*** mattoliverau_ has joined #openstack-ansible | 11:27 | |
nattanon | admin0: Actually, evacuating an instance with a volume should work fine with no further operations, right ? | 11:27 |
nikm | admin0 : do u mean https://docs.openstack.org/openstack-ansible-haproxy_server/latest/ variables | 11:28 |
nattanon | admin0: Should I go back and check the nova, cinder, ceph configuration ? | 11:28 |
admin0 | nikm, haproxy_user_ssl_cert: and haproxy_user_ssl_key: | 11:32 |
admin0 | point to the location where your key and cert are on the deploy host | 11:32 |
admin0 | and re-run haproxy-setup.yml | 11:32 |
admin0 | nattanon, should be = yes, is it always = no | 11:32 |
*** Faster-Fanboi_ has joined #openstack-ansible | 11:32 | |
*** epalper has joined #openstack-ansible | 11:33 | |
admin0 | you said more work .. so here is how it goes .. you have to do a nova rescue for every uuid .. and then boot into rescue .. but the rescue pass shown does not always work .. so you have to catch grub of the rescue system, and rescue itself ( single user mode ) and then fsck the vdb ( vda is the rescue itself ) and reboot | 11:33 |
admin0 | for each machine | 11:33 |
admin0 | do 1 first, verify that you see vdb and fsck works and that you can boot up the machine | 11:34 |
admin0 | checking conf has nothing to do with this | 11:34 |
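The per-instance procedure admin0 describes could be sketched roughly as below. `rescue_plan` is a hypothetical helper that only prints the steps and does not run them. `nova unrescue` is an addition not mentioned in the channel (it is the usual way to leave rescue mode), and the `/dev/vdb` device name follows admin0's description; a partitioned disk may need `/dev/vdb1` or similar.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the rescue steps for each UUID.
rescue_plan() {
    for uuid in "$@"; do
        # Boot the instance from a rescue image; the original disk shows up as vdb.
        echo "nova rescue ${uuid}"
        # Manual step on the instance console (device name is an assumption).
        echo "# console of ${uuid}: boot the rescue system, then: fsck -y /dev/vdb"
        # Return the instance to its normal boot disk.
        echo "nova unrescue ${uuid}"
    done
}

# Example: review the plan for one instance before doing the rest.
# rescue_plan <instance-uuid>
```

Printing the plan first fits the advice to verify one machine end to end before repeating the steps for every evacuated instance.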
cmorelli | @admin, to use SSL do I need to generate certificates beforehand? I read somewhere that the playbooks would generate self-signed certs automatically? | 11:34 |
*** mattoliverau has quit IRC | 11:34 | |
*** bradm has quit IRC | 11:34 | |
*** Faster-Fanboi has quit IRC | 11:34 | |
admin0 | cmorelli, it will generate itself | 11:34 |
admin0 | use haproxy_ssl_self_signed_regen: true | 11:34 |
*** portante has joined #openstack-ansible | 11:35 | |
openstackgerrit | Merged openstack/openstack-ansible-galera_client master: Fix cache update after initial apt_repository fail https://review.openstack.org/546559 | 11:35 |
admin0 | cmorelli, you can easily map the ip and domain and get one from letsencrypt :) good enough for 3 months before you have to repeat that again | 11:35 |
cmorelli | I'm following the official guide here... https://docs.openstack.org/project-deploy-guide/openstack-ansible/ocata/app-config-test.html#test-environment-config | 11:35 |
cmorelli | in particular, the bottom of the page says to just put `openstack_service_publicuri_proto: http` in the user config | 11:36 |
admin0 | csrf will come if horizon is accessed using a different ip than what you set in the config | 11:36 |
admin0 | you can go to the haproxy container and check what ip works fine with | 11:37 |
cmorelli | uh | 11:38 |
cmorelli | I will try to redeploy the playbooks with these two vars | 11:38 |
cmorelli | openstack_service_publicuri_proto: http | 11:38 |
cmorelli | haproxy_ssl_self_signed_regen: true | 11:38 |
cmorelli | hopefully it will be enough to make it work | 11:38 |
*** stuartgr has joined #openstack-ansible | 11:38 | |
*** armaan has quit IRC | 11:39 | |
cmorelli | @admin, this is what I have in the config global_overrides: | 11:39 |
cmorelli | internal_lb_vip_address: 10.13.0.11 | 11:39 |
cmorelli | external_lb_vip_address: 10.13.0.11 | 11:39 |
cmorelli | tunnel_bridge: "br-vxlan" | 11:39 |
cmorelli | management_bridge: "br-mgmt" | 11:39 |
cmorelli | etc ... | 11:40 |
*** armaan has joined #openstack-ansible | 11:40 | |
cmorelli | that ip address is the same that I try to access from the browser to login | 11:40 |
*** shardy has quit IRC | 11:42 | |
*** bradm has joined #openstack-ansible | 11:44 | |
*** armaan has quit IRC | 11:46 | |
*** Jack_Iv has quit IRC | 11:47 | |
*** udesale_ has quit IRC | 11:50 | |
Miouge | Support for letsencrypt would be kind of cool, if not already possible. | 11:51 |
Miouge | It might be a little tricky because one has to share the certs & key as well as a state between for the HTTP-01 validation | 11:54 |
admin0 | evrardjp, that fixed it | 11:56 |
admin0 | 12000 | 11:56 |
*** armaan has joined #openstack-ansible | 11:57 | |
admin0 | my disks are slow, but the patch and solution was tldr for me .. so setting up the limit manually and systemctl restart /var/lib/machines solved for me | 11:57 |
admin0 | \o/ :D | 11:58 |
*** Sha000000 has quit IRC | 12:04 | |
*** pcaruana has joined #openstack-ansible | 12:09 | |
*** nattanon has quit IRC | 12:10 | |
vedin | evrardjp : now this new problem we are facing http://paste.openstack.org/show/681802/ | 12:17 |
admin0 | vedin, can you not use ubuntu ? | 12:18 |
vedin | no we have to stick with centos 7 :( | 12:19 |
*** gunix has joined #openstack-ansible | 12:23 | |
gunix | https://docs.openstack.org/openstack-ansible/latest/contributor/quickstart-aio.html | 12:24 |
gunix | this returns 404 | 12:24 |
gunix | is this the official link ? | 12:24 |
odyssey4me | vedin did the repo build complete properly - can you confirm two things: 1. that haproxy is up and running with its configuration; 2. that the repo container has content in /var/www/repo/os-releases/<tag number>/ | 12:24 |
*** aruns has joined #openstack-ansible | 12:25 | |
odyssey4me | gunix it looks like evrardjp has been moving some stuff around, so that link's not working any more | 12:25 |
odyssey4me | the pike one is still there: https://docs.openstack.org/openstack-ansible/pike/contributor/quickstart-aio.html | 12:26 |
gunix | be careful with that. google has a delay when upgrading | 12:26 |
odyssey4me | and the process for an AIO has not changed since kilo, so you're safe to use it ;) | 12:26 |
gunix | :D | 12:26 |
*** dave-mccowan has joined #openstack-ansible | 12:27 | |
openstackgerrit | Merged openstack/openstack-ansible-lxc_hosts master: Install common packages into container cache https://review.openstack.org/546239 | 12:27 |
*** indistylo has quit IRC | 12:28 | |
evrardjp | gunix: where did you find that link ? | 12:29 |
gunix | evrardjp: on google | 12:29 |
evrardjp | yup, maybe we should do redirections | 12:30 |
vedin | odyssey4me: haproxy is working fine and /var/www/repo/os-releases/<tag number>/ have the files, i found ./mysql_python-1.2.5-cp27-cp27mu-linux_x86_64.whl file available in directory but it is throwing error for this python package only | 12:30 |
cmorelli | so, if I don't put `haproxy_ssl: false` in user config, I stumble in this error on the setup-openstack.yml playbook: | 12:31 |
cmorelli | fatal: [infra1_glance_container-57f5f85d]: FAILED! => {"attempts": 5, "changed": false, "failed": true, "module_stderr": "mesg: ttyname failed: Inappropriate ioctl for device\nTraceback (most recent call last):\n File \"/tmp/ansible_TLcg0A/ansible_module_keystone.py\", line 1459, in <module>\n main()\n File \"/tmp/ansible_TLcg0A/ansible_module_keystone.py\", line 1453, in main\n | 12:31 |
cmorelli | km.command_router()\n File \"/tmp/ansible_TLcg0A/ansible_module_keystone.py\", line 484, in command_router\n facts = action(variables=action_command['variables'])\n File \"/tmp/ansible_TLcg0A/ansible_module_keystone.py\", line 1030, in ensure_service\n self._authenticate()\n File \"/tmp/ansible_TLcg0A/ansible_module_keystone.py\", line 606, in _authenticate\n self.keystone = | 12:31 |
cmorelli | client.Client(**client_args)\n File \"/usr/local/lib/python2.7/dist-packages/keystoneclient/v3/client.py\", line 238, in __init__\n self.authenticate()\n File \"/usr/local/lib/python2.7/dist-packages/positional/__init__.py\", line 101, in inner\n return wrapped(*args, **kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/keystoneclient/httpclient.py\", line 581, in authenticate\n | 12:31 |
cmorelli | resp = self.get_raw_token_from_identity_service(**kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/keystoneclient/v3/client.py\", line 324, in get_raw_token_from_identity_service\n _('Authorization failed: %s') % e)\nkeystoneauth1.exceptions.auth.AuthorizationFailure: Authorization failed: Unable to establish connection to http://10.13.0.11:35357/v3/auth/tokens\n", "module_stdout": "", | 12:31 |
cmorelli | "msg": "MODULE FAILURE"} | 12:31 |
cmorelli | this time I also used `haproxy_ssl_self_signed_regen: true` as admin0 suggested | 12:31 |
cmorelli | I'm lost guys... | 12:32 |
evrardjp | cmorelli: what are you trying to achieve? | 12:32 |
*** Guy has joined #openstack-ansible | 12:32 | |
evrardjp | you want full http no https? | 12:32 |
evrardjp | cmorelli: what is your openstack_user_config ? | 12:32 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible master: Update documentation index to include Queens https://review.openstack.org/546968 | 12:33 |
evrardjp | cmorelli: because you can have two different IPs | 12:33 |
cmorelli | @evrardjp I'm just following the user guide , I wanted to do the test-environment setup (here: https://docs.openstack.org/project-deploy-guide/openstack-ansible/ocata/app-config-test.html#test-environment-config) , but I stumbled in this error first. Then I googled a little bit and found that people suggested to set `haproxy_ssl: false` in the user config | 12:34 |
evrardjp | maybe the text here is more clear: https://docs.openstack.org/openstack-ansible/latest/user/test/example.html#user-variables | 12:34 |
evrardjp | it's latest branch text | 12:34 |
*** Jack_Iv has joined #openstack-ansible | 12:34 | |
cmorelli | so, if I do that I get to the WebUI working, but then I get that CSRF error a | 12:34 |
*** Jack_Iv has quit IRC | 12:34 | |
evrardjp | cmorelli: which branch? | 12:35 |
evrardjp | you want to do ocata only? | 12:35 |
cmorelli | i'm using tag 15.1.15 | 12:35 |
evrardjp | Pike is the current latest released branch | 12:35 |
cmorelli | ocata | 12:35 |
evrardjp | why no Pike? | 12:35 |
evrardjp | he deserves love too. | 12:35 |
admin0 | :) | 12:36 |
admin0 | #lovePike | 12:36 |
evrardjp | admin0: do that for next version? | 12:36 |
evrardjp | :D | 12:36 |
admin0 | if it works flawless :) | 12:36 |
evrardjp | cmorelli: anyway it should work in both cases. | 12:36 |
cmorelli | I don't know, I would prefer not to use the bleeding edge version in general, it's nothing against the Pike release in particular :) | 12:36 |
evrardjp | cmorelli: the bleeding edge is master | 12:37 |
admin0 | cmorelli, if for prod, i think pike sets a good foundation for cells | 12:37 |
cmorelli | yes true | 12:37 |
evrardjp | in two weeks or so we'll release queens | 12:37 |
evrardjp | just FYI :p | 12:37 |
cmorelli | it's not prod... i'm just trying to setup openstack to demo an on-premises cloud in my workplace | 12:37 |
evrardjp | cmorelli: oh. | 12:38 |
cmorelli | that's why I don't really need SSL working immediately :) | 12:38 |
evrardjp | cmorelli: start with an empty machine, and do scripts/gate-check-commit.sh | 12:38 |
evrardjp | boom. | 12:38 |
evrardjp | well that's bad advice. | 12:38 |
evrardjp | follow https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html | 12:38 |
vedin | odyssey4me : what should i do now, i pasted what u asked | 12:38 |
cmorelli | but I have three machines, i don't want to do AIO | 12:38 |
evrardjp | if it's for very quick PoC that's what you want to show. Bleeding edge fast delivered! | 12:39 |
*** chyka has joined #openstack-ansible | 12:39 | |
evrardjp | or do it with more stable: | 12:39 |
cmorelli | I already setup Vlans, networking, everything that is needed to run the playbooks | 12:39 |
evrardjp | https://docs.openstack.org/openstack-ansible/pike/contributor/quickstart-aio.html | 12:39 |
evrardjp | up to you | 12:39 |
evrardjp | I'd say the easiest though is not to mess up with ssl and stuff at the beginning. Just do the standard procedure. | 12:40 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible stable/queens: Update Queens doc index https://review.openstack.org/546971 | 12:40 |
evrardjp | that's for test in Pike cmorelli : https://docs.openstack.org/project-deploy-guide/openstack-ansible/pike/app-config-test.html | 12:40 |
evrardjp | haha you beat me to it | 12:40 |
cmorelli | I understand that, but I also have to demo terraform/packer/ansible IaaS on top of Openstack, so the AIO solution could be too 'clogged', i think,.... am I wrong? | 12:41 |
cmorelli | maybe if I paste my openstack_user_config.yml in a pastebin, could you guys review it? | 12:41 |
odyssey4me | cmorelli pike is a currently stable release, ocata is getting a bit old and crusty by now... queens is a release candidate right now... so if you want n-1 then pike is what you want, not ocata | 12:42 |
odyssey4me | but meh, ocata works too if you want | 12:42 |
*** chyka has quit IRC | 12:43 | |
evrardjp | odyssey4me: I will update the deploy_guide system to make sure it's pointing to queens deploy guide | 12:43 |
odyssey4me | cmorelli and doing haproxy_ssl: no is not the answer for disabling the public endpoints - if you want to do that, you should only need to do https://docs.openstack.org/project-deploy-guide/openstack-ansible/ocata/app-config-test.html#user-variables | 12:43 |
odyssey4me | evrardjp My patches are a stop-gap to make sure our indexes expose the docs properly. Nothing more. You're welcome to make changes beyond that if you wish, but I think it's important that we expose the docs correctly ASAP. | 12:44 |
cmorelli | here is my openstack_user_config : https://pastebin.com/2HyDMFzd | 12:45 |
evrardjp | odyssey4me: ok | 12:45 |
*** sxc731 has quit IRC | 12:45 | |
evrardjp | agreed | 12:45 |
cmorelli | ok I wikk move to Pike then.... maybe my problem will be fixed | 12:45 |
cmorelli | *will | 12:45 |
cmorelli | in any case I appreciate if you can review it :) | 12:45 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible master: Update documentation index to include Queens https://review.openstack.org/546968 | 12:46 |
odyssey4me | vedin looking through backscroll | 12:47 |
odyssey4me | vedin ok, then if the container is unable to download the package, and haproxy and the repo look complete, then you'll have to troubleshoot connectivity issues from the source container to the LB, and from the LB to the repo | 12:48 |
odyssey4me | it could be bad network config, bad MTU, bad routing... all sorts | 12:48 |
evrardjp | odyssey4me: done | 12:51 |
*** zul has quit IRC | 12:51 | |
cmorelli | @odyssey4me: if I only put `openstack_service_publicuri_proto: http` in user variables without `haproxy_ssl: false`, then the playbook setup-openstack.yml does not finish correctly | 12:52 |
*** Chealion has quit IRC | 12:53 | |
cmorelli | a few minutes ago I posted the problem I get : ...Authorization failed: Unable to | 12:53 |
cmorelli | establish connection to http://10.13.0.11:35357/v3/auth/tokens | 12:53 |
odyssey4me | cmorelli hmm, that's odd - if that is the case then there must be some sort of regression somewhere, because the endpoint using http does not require haproxy not to be using certs, they're independent of each other | 12:54 |
evrardjp | I think it's the way we structured docs in ocata that's the problem | 12:54 |
*** Chealion has joined #openstack-ansible | 12:54 | |
*** zul has joined #openstack-ansible | 12:55 | |
evrardjp | cmorelli: please use two different IPs for internal and external_lb_vip_address, and re-run your playbooks. You can then safely remove these overrides. | 12:55 |
evrardjp | sorry | 12:55 |
evrardjp | please use two different IPs for internal and external_lb_vip_address, remove these overrides, and re-run your playbooks :) | 12:55 |
evrardjp | and give the CA to your users. Problem solved. | 12:56 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible stable/queens: Update Queens doc index https://review.openstack.org/546971 | 12:56 |
odyssey4me | yeah, realistically you should implement two different IP's regardless | 12:56 |
cmorelli | @evrardjp, sorry for my noobness: any suggestions on what ip to use as internal lb and external? Currently these three machines are running in a private lan | 12:56 |
cmorelli | 10.13.0.11-12-13 are my ips | 12:56 |
cmorelli | and .11 is the 'infra1' machine | 12:56 |
evrardjp | that's their mgmt network? | 12:57 |
evrardjp | br-mgmt I mean | 12:57 |
cmorelli | yes | 12:57 |
cmorelli | and i also have a vlan20 and vlan30 setup | 12:57 |
cmorelli | for br-storage and br-vxlan | 12:57 |
evrardjp | ok you can have 10.13.0.10 I guess for internal lb vip address (and put that into the reserved IPs) | 12:57 |
evrardjp | if 10 is not taken | 12:57 |
cmorelli | ah so for virtualip I can use any IP which is not already physically taken? | 12:58 |
evrardjp | and you can have 10.13.0.9 for external lb vip address for example (and put that into the reserved ips) | 12:58 |
evrardjp | cmorelli: you have to configure keepalived | 12:58 |
evrardjp | yes | 12:58 |
evrardjp | you must | 12:58 |
evrardjp | else you'd have conflicts | 12:58 |
cmorelli | gonna try immediately, thanks for the tip | 12:58 |
evrardjp | please check the prod guide on how to configure your user_variables.yml for keepalived | 12:59 |
evrardjp | please use a dns name for external lb vip address. | 12:59 |
evrardjp | you'll thank me later. | 12:59 |
evrardjp | :) | 12:59 |
cmorelli | but.... | 12:59 |
evrardjp | her emails? | 12:59 |
evrardjp | cmorelli: ? | 13:05 |
*** aruns has quit IRC | 13:07 | |
evrardjp | sorry odyssey4me | 13:09 |
*** wagner has joined #openstack-ansible | 13:09 | |
cmorelli | i am just confused about the DNS name that you suggested. I don't have any dns resolution setup , it is a private lan | 13:10 |
*** aruns has joined #openstack-ansible | 13:11 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible stable/queens: Update Queens doc index https://review.openstack.org/546971 | 13:13 |
*** niraj_singh has joined #openstack-ansible | 13:13 | |
*** electrofelix has quit IRC | 13:14 | |
odyssey4me | cmorelli using a DNS name makes it easier to use certs if you want to later, and also makes it easier to implement other things to make access to the system easier - for example if you use an IP, only that IP will work... but if you use a name, then even if you NAT/PAT access to the system it will work as long as you have a way to resolve the name back to the IP used for access internally/externally | 13:14 |
odyssey4me | so, given that the systems themselves use the internal endpoints to talk to each other, keep that as an IP, but generally it's advised to use a DNS name for the public endpoints - changing them after the deployment is a bit of a pain | 13:15 |
*** hamzy_ is now known as hamzy | 13:16 | |
*** JohnnyOSA has quit IRC | 13:17 | |
cmorelli | fatal: [infra1_galera_container-b3750f45]: FAILED! => {"attempts": 5, "changed": false, "cmd": "/usr/local/bin/pip2 install -U --constraint http://10.13.0.100:8181/os-releases/15.1.15/ubuntu-16.04-x86_64/requirements_absolute_requirements.txt pyasn1 pyOpenSSL requests urllib3", "failed": true, "msg": "\n:stderr: Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection | 13:18 |
cmorelli | broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f1db9b42590>: | 13:18 |
*** JohnnyOSA has joined #openstack-ansible | 13:18 | |
cmorelli | 10.13.0.100 is the internal lb vip | 13:18 |
cmorelli | that is set | 13:18 |
*** berendt has quit IRC | 13:18 | |
*** berendt has joined #openstack-ansible | 13:20 | |
odyssey4me | cmorelli trace the path of the data to find the failure: galera_container -> haproxy -> repo container | 13:21 |
odyssey4me | if the packets are making their way from one to the next just fine, then validate whether the repo container has the content | 13:21 |
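One way to walk the path odyssey4me describes is to request the same constraints file from each hop in turn. The sketch below assumes `curl` is available on each host; `repo_url` and `check_url` are hypothetical helper names, and the URL layout is copied from the pip error cmorelli pasted.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the constraints URL seen in the pip failure.
repo_url() {
    vip="$1"; tag="$2"; distro="$3"
    printf 'http://%s:8181/os-releases/%s/%s/requirements_absolute_requirements.txt\n' \
        "$vip" "$tag" "$distro"
}

# Hypothetical helper: succeed only if the URL answers with a successful
# HTTP status within 5 seconds.
check_url() {
    curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Example, run first inside the galera container, then on the haproxy host:
# url=$(repo_url 10.13.0.100 15.1.15 ubuntu-16.04-x86_64)
# check_url "$url" && echo "reachable" || echo "NOT reachable"
```

If the check fails from the container but succeeds on the haproxy host, the problem is likely between the container and the VIP; if it fails on both, the repo backend or its content is the next thing to inspect.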
*** woodard_ has quit IRC | 13:29 | |
openstackgerrit | Merged openstack/openstack-ansible stable/newton: Update all SHAs for 14.2.16 https://review.openstack.org/545732 | 13:30 |
*** udesale has joined #openstack-ansible | 13:32 | |
*** MikeW has joined #openstack-ansible | 13:36 | |
*** acormier has joined #openstack-ansible | 13:39 | |
openstackgerrit | Merged openstack/openstack-ansible-galera_server master: Fix cache update after initial apt_repository fail https://review.openstack.org/546312 | 13:40 |
*** acormier has quit IRC | 13:42 | |
*** woodard has joined #openstack-ansible | 13:44 | |
*** shardy has joined #openstack-ansible | 13:48 | |
admin0 | phew, .. setup hosts finished finally without a hiccup | 13:58 |
*** sxc731 has joined #openstack-ansible | 14:01 | |
admin0 | lvdombrkr, so the ovs thing i pasted, ironic still wants linuxbridge, so under those guidelines, for ironic, ironic is also linuxbridge | 14:03 |
*** lbragstad has joined #openstack-ansible | 14:10 | |
admin0 | hmm.. evrardjp .. in my new setup also, keepalived is running on all nodes, but no ip is being added/seen on any interface .. to move on, i manually did an ifconfig on one of the controllers and proceeded ahead | 14:13 |
admin0 | will check later why its failing/not working | 14:13 |
admin0 | quick question .. does stable/pike allow live migrations by default ( non shared storage ) ? | 14:20 |
admin0 | or do i need certain overrides to do | 14:20 |
*** manuelbuil has quit IRC | 14:20 | |
*** Sha000000 has joined #openstack-ansible | 14:21 | |
*** Sha000000 has quit IRC | 14:23 | |
*** Sha000000 has joined #openstack-ansible | 14:23 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-ceph_client master: Fix cache update after initial apt_repository fail https://review.openstack.org/547000 | 14:24 |
*** sxc731 has quit IRC | 14:25 | |
Tahvok | Anyone got promo code for ptg? | 14:25 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-openstack_hosts master: Fix cache update after initial apt_repository fail https://review.openstack.org/547003 | 14:29 |
odyssey4me | Tahvok do you mean the hotel code? it's on https://www.openstack.org/ptg/#tab_travel | 14:30 |
*** jwitko_ has joined #openstack-ansible | 14:30 | |
*** esberglu has joined #openstack-ansible | 14:31 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_nova master: Fix cache update after initial apt_repository fail https://review.openstack.org/547004 | 14:35 |
Tahvok | odyssey4me: the code if for the 155 euro price? or is there another discount? | 14:35 |
odyssey4me | Tahvok that's the only discount I'm aware of | 14:38 |
*** ansmith has joined #openstack-ansible | 14:38 | |
admin0 | strange .. keepalived is running .. but tcpdump registers no activity on vrrp | 14:40 |
*** Sha000000 has quit IRC | 14:48 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-rabbitmq_server master: Fix cache update after initial apt_repository fail https://review.openstack.org/547015 | 14:48 |
evrardjp | admin0: check the interfaces config and your network connectivity. But you already know that :D | 14:48 |
evrardjp | also check you are listening to everything | 14:48 |
odyssey4me | https://media.giphy.com/media/TNwRJDrAry7qU/giphy.gif | 14:49 |
evrardjp | (multicast address) | 14:49 |
evrardjp | HAHAHA | 14:49 |
evrardjp | :) | 14:49 |
admin0 | :D | 14:50 |
*** sxc731 has joined #openstack-ansible | 14:50 | |
admin0 | i had to do tcpdump -i any -n vrrp and then re-run the haproxy setup playbook 3 times ( coz i find nothing wrong ) so .. try try until die .. tryhard mode | 14:51 |
admin0 | and it worked :D | 14:51 |
*** kstev has joined #openstack-ansible | 14:52 | |
*** epalper has quit IRC | 14:52 | |
*** gkadam__ has quit IRC | 14:53 | |
ansmith | jmccrory: hello | 14:54 |
*** epalper has joined #openstack-ansible | 14:55 | |
*** kstev has quit IRC | 14:56 | |
*** kstev has joined #openstack-ansible | 14:56 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-pip_install stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547021 | 15:00 |
admin0 | \o/ setup infra also passed ... hiccups on setup-hosts = that bug which was done using manual quota limit ( thank god i did not have to cherry pick that fix ) and went relatively smooth .. .. and then to increase the container build wait .. i had to double from 6000 to 12000 | 15:15 |
admin0 | vip worked when i override the internal and external ids to 120 172 | 15:15 |
odyssey4me | admin0 so the default value is too small? | 15:15 |
admin0 | in this specific case yes . i am on a 4 disk raid 10 setup .. but what i noticed was the extraction took a long time and by the time it countdown to 6000 retries, it was timing out | 15:17 |
admin0 | maybe my disks are bad .. will be reviewing them | 15:18 |
admin0 | quota bug, i was able to move forward by doing machinectl set-limit 500G ; systemctl restart /var/lib/machines | 15:19 |
*** udesale has quit IRC | 15:19 | |
admin0 | cherry picking that code was scary for me | 15:19 |
admin0 | i meant that patch | 15:19 |
admin0 | odyssey4me, i think it was on a todo somewhere .. osa+ovs .. i have it up quick here: http://www.openstackfaq.com/openstack-ansible-with-ovs-pike/ --- that is how it works now that i have been able to make it work | 15:20 |
*** Guy has quit IRC | 15:22 | |
admin0 | now to setup-openstack :) | 15:22 |
admin0 | setup infra also passed on good | 15:22 |
Tahvok | Do we know what time ptg will end on Friday? | 15:27 |
odyssey4me | Tahvok Friday is typically more of a social day. We chat for the morning, but most people start leaving after lunch. | 15:28 |
Tahvok | Then it's fine. Trying to arrange my flight, saw one at 20:50, and was not sure if I would make it | 15:29 |
*** acormier has joined #openstack-ansible | 15:32 | |
*** mardim has quit IRC | 15:34 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-pip_install stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547021 | 15:35 |
openstackgerrit | Markos Chandras (hwoarang) proposed openstack/openstack-ansible-galera_server master: tasks: Fix use_percona_upstream variable usage https://review.openstack.org/535252 | 15:36 |
hwoarang | odyssey4me: evrardjp^ would you be able to get this in :) conflicts conflicts conflicts | 15:37 |
hwoarang | hello btw | 15:37 |
* hwoarang is super busy today | 15:37 | |
*** aruns has quit IRC | 15:38 | |
evrardjp | hwoarang: yup. | 15:38 |
evrardjp | odyssey4me: could you vote on hwoarang 's patch too? | 15:40 |
*** Sha000000 has joined #openstack-ansible | 15:41 | |
evrardjp | Tahvok: my flight is at 6pm on Friday , FYI. | 15:41 |
odyssey4me | evrardjp busy working through it | 15:43 |
*** SerenaFeng has joined #openstack-ansible | 15:45 | |
evrardjp | odyssey4me: thanks. | 15:46 |
hwoarang | thanks! | 15:47 |
*** mardim has joined #openstack-ansible | 15:53 | |
*** sxc731 has quit IRC | 15:57 | |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_client stable/queens: Fix cache update after initial apt_repository fail https://review.openstack.org/547043 | 15:58 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_client stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547044 | 15:58 |
*** mrch has joined #openstack-ansible | 15:59 | |
*** lvdombrkr has quit IRC | 16:03 | |
*** dariko has joined #openstack-ansible | 16:04 | |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_client stable/ocata: Fix cache update after initial apt_repository fail https://review.openstack.org/547048 | 16:06 |
*** wagner has quit IRC | 16:11 | |
*** mbuil has joined #openstack-ansible | 16:12 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-pip_install stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547021 | 16:13 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_client stable/newton: Fix cache update after initial apt_repository fail https://review.openstack.org/547052 | 16:16 |
*** kstev has quit IRC | 16:16 | |
MikeW | evrardjp I linted my json file and promise to never touch it again lol :) | 16:18 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_server stable/queens: Fix cache update after initial apt_repository fail https://review.openstack.org/547053 | 16:18 |
evrardjp | MikeW: good :) | 16:20 |
evrardjp | MikeW: what was the thing you wanted to do with it? Change ips? | 16:20 |
MikeW | evrardjp Still doesn't work I think you were right in renaming being an issue. Yeah changing IPs | 16:20 |
mhayden | odyssey4me: on https://review.openstack.org/546308 -- i replied about the http mirror | 16:26 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_server stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547061 | 16:30 |
*** SerenaFeng has quit IRC | 16:30 | |
*** yolanda has quit IRC | 16:30 | |
*** weezS has joined #openstack-ansible | 16:30 | |
*** yolanda has joined #openstack-ansible | 16:30 | |
*** armaan has quit IRC | 16:33 | |
*** armaan has joined #openstack-ansible | 16:33 | |
*** sxc731 has joined #openstack-ansible | 16:35 | |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_server stable/ocata: Fix cache update after initial apt_repository fail https://review.openstack.org/547063 | 16:36 |
*** fusmu has quit IRC | 16:37 | |
*** chyka has joined #openstack-ansible | 16:48 | |
*** Smeared_Beard has joined #openstack-ansible | 16:49 | |
*** SmearedBeard has quit IRC | 16:50 | |
mhayden | odyssey4me: many thanks good sir | 16:54 |
evrardjp | mhayden: do you know why it's not caching with https? | 16:56 |
mhayden | it bypasses the cache | 16:56 |
evrardjp | ok | 16:56 |
mhayden | the cache only works with http | 16:56 |
evrardjp | we might want to use gpg checking | 16:56 |
mhayden | yum has that enabled by default | 16:57 |
evrardjp | we disable it in the package install | 16:57 |
mhayden | and it's enabled by ansible-hardening too | 16:57 |
odyssey4me | I think that's pretty standard. I'm not aware of a proxy cache that is able to cache HTTPS content. It can proxy it, but not cache it. | 16:57 |
mhayden | well we could fix that | 16:57 |
evrardjp | (apt key) | 16:57 |
evrardjp | not apt yum key | 16:57 |
mhayden | odyssey4me: yeah, it would require a trusted cert with a mitm :) | 16:57 |
odyssey4me | I could be wrong. It has been a while. | 16:57 |
mhayden | evrardjp: if we aren't checking GPG keys on pkgs, i'd like to fix that too | 16:57 |
evrardjp | yeah I am fine with that too | 16:57 |
mhayden | the mariadb fix should help a lot since those packages are big | 16:58 |
evrardjp | well yeah but that takes a lot of time :D | 16:58 |
evrardjp | let me double check | 16:58 |
evrardjp | it seems I was wrong | 17:00 |
mhayden | GPG checking on packages should be really fast | 17:01 |
mhayden | since you're just verifying the sig | 17:01 |
mhayden | but i saw a 70-80% reduction in time when i switched the mariadb repo to http | 17:01 |
mhayden | at least for yum | 17:01 |
mhayden | which is good | 17:01 |
mhayden | i think the server pkg is ~ 90MB by itdelf | 17:01 |
mhayden | itself | 17:01 |
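The point above (a caching proxy can only cache plain HTTP, so the Galera/MariaDB repo was switched to http while package GPG checks stay on) can be sketched roughly as follows. This is only an illustration: `cache_friendly_url` is a hypothetical helper and the mirror URL is made up; it is not how the galera roles actually do it.

```python
from urllib.parse import urlsplit, urlunsplit


def cache_friendly_url(url: str) -> str:
    """Hypothetical helper: rewrite an https:// repo URL to http:// so a
    caching proxy can store the (large) package payloads.

    Integrity is assumed to come from GPG signature checks on the
    packages themselves, not from TLS.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


if __name__ == "__main__":
    # Made-up mirror URL, for illustration only.
    print(cache_friendly_url("https://mirror.example.org/repo/10.1/centos7-amd64"))
```

The trade-off is the one discussed in the channel: downloads over http are only reasonable if signature verification on the packages stays enabled.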
evrardjp | wow | 17:02 |
evrardjp | that's a big pkg | 17:02 |
evrardjp | all of that for DATABASES! | 17:02 |
evrardjp | pff | 17:02 |
evrardjp | let put our data into images. | 17:02 |
evrardjp | haha | 17:03 |
evrardjp | where should I put "Securing services with SSL certificates": A new section under user guide, or reference? | 17:04 |
mhayden | OMG WE HAD AN INTEGRATED GATE FOR CENTOS 7 SUCCEED | 17:05 |
mhayden | https://review.openstack.org/545455 | 17:05 |
evrardjp | I don't think putting it into reference/architecture/security. | 17:05 |
mhayden | 1 HR 53 M | 17:05 |
* mhayden REJOICES | 17:05 | |
mhayden | it is a basekit (thanks andymccr) but it is exciting nonetheless | 17:05 |
evrardjp | #success mhayden got centos OSA gate under 2h today | 17:05 |
openstackstatus | evrardjp: Added success to Success page (https://wiki.openstack.org/wiki/Successes) | 17:05 |
mhayden | WHEEEEE | 17:05 |
evrardjp | you're the deal! | 17:06 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_server stable/ocata: Fix cache update after initial apt_repository fail https://review.openstack.org/547063 | 17:06 |
evrardjp | where is the patch? | 17:06 |
mhayden | evrardjp: https://review.openstack.org/545455 | 17:06 |
mhayden | that depends on andymccr's https://review.openstack.org/543534 | 17:06 |
evrardjp | if that's a replacement of playbooks with debug: msg="i did it" it doesnt count | 17:06 |
odyssey4me | what is a good word to describe architectures other than x86_64/i386 | 17:06 |
evrardjp | non intel | 17:07 |
mhayden | evrardjp: also the other galera patch is here -> https://review.openstack.org/546309 | 17:07 |
mhayden | odyssey4me: alternative architectures? | 17:07 |
mhayden | secondary architectures? | 17:07 |
evrardjp | lol | 17:07 |
odyssey4me | alt_arch will do I guess | 17:07 |
evrardjp | non intel seem more likely | 17:07 |
odyssey4me | well, x86_64 covers AMD too ;) | 17:07 |
mhayden | we could use something like adreznec_stuff | 17:07 |
evrardjp | not IA64, not IA32 | 17:07 |
* mhayden winks at adreznec | 17:07 | |
mhayden | itanium? | 17:08 |
evrardjp | that's definitely not supported | 17:08 |
evrardjp | IA64... | 17:08 |
adreznec | I'm sure Itanium and SPARC support is coming any day now, right mhayden? | 17:10 |
mhayden | also, those new galera patches take 60-90 seconds on initial install without caching, and 20-30 sec after | 17:10 |
mhayden | adreznec: but of course | 17:10 |
* mhayden will be back shortly | 17:11 | |
*** kstev has joined #openstack-ansible | 17:11 | |
*** armaan has quit IRC | 17:12 | |
*** armaan has joined #openstack-ansible | 17:13 | |
evrardjp | I guess I will move that to user guide | 17:13 |
openstackgerrit | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Rearrange cell mapping tasks https://review.openstack.org/547072 | 17:15 |
*** admin0 has quit IRC | 17:16 | |
andymccr | cells :/ | 17:16 |
evrardjp | a nightmare in dragonball, a nightmare in openstack! | 17:17 |
jmccrory | heh yeah...finally moving to ocata. that was my face, andymccr | 17:18 |
andymccr | jmccrory: i thought we had fixed up the ordering stuff | 17:18 |
logan- | watch out for https://bugs.launchpad.net/openstack-ansible/+bug/1729661 also | 17:19 |
openstack | Launchpad bug 1729661 in openstack-ansible "Map instances to new Cell1 takes excessive amounts of time to run on upgraded cloud" [Wishlist,Confirmed] | 17:19 |
andymccr | i literally sat with a nova core and fleshed out the deploy path, and then the upgrade path (which are different and really hard to automate) | 17:19 |
andymccr | hahhahaha | 17:19 |
jmccrory | pushing stateless hypervisors at the same time, so the discover were never actually running in our environments | 17:19 |
openstackgerrit | git-harry proposed openstack/openstack-ansible-galera_server stable/newton: Fix Apt cache update due to adding Galera repo https://review.openstack.org/547074 | 17:20 |
jmccrory | or instance mapping rather | 17:20 |
odyssey4me | logan- I think that jmccrory's patch may actually help with that bug | 17:20 |
odyssey4me | but to be honest, I'm in the weeds of something else right now, so I'm probably not thinking straight | 17:20 |
andymccr | ocata was a nasty release for nova | 17:21 |
andymccr | that + the placement bits | 17:21 |
andymccr | from a deployment perspective made for not-fun | 17:21 |
jmccrory | yeah, should help there. this will make instance mapping and discover_hosts commands both only run one time, instead of once per compute | 17:21 |
*** gillesMo has quit IRC | 17:24 | |
odyssey4me | jmccrory perhaps it'd be good to add 'Related-Bug' or 'Closes-Bug' to the review then ;) | 17:25 |
openstackgerrit | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Rearrange cell mapping tasks https://review.openstack.org/547072 | 17:27 |
evrardjp | odyssey4me: +1 | 17:27 |
evrardjp | thanks jmccrory | 17:28 |
logan- | jmccrory: great | 17:29 |
openstackgerrit | Merged openstack/openstack-ansible-repo_build stable/pike: SUSE: Fix MariaDB development package https://review.openstack.org/546583 | 17:29 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client master: Allow Galera package downloads over HTTP https://review.openstack.org/546308 | 17:29 |
openstackgerrit | Merged openstack/openstack-ansible-galera_server master: tasks: Fix use_percona_upstream variable usage https://review.openstack.org/535252 | 17:29 |
ansmith | jmccrory: hi, had a question regarding https://review.openstack.org/#/c/499882/ | 17:36 |
jmccrory | ansmith sure, but haven't been able to get around to working on that unfortunately | 17:37 |
ansmith | jmccrory: would like to help if possible, might it be discussed at ptg? | 17:37 |
jmccrory | yeah definitely | 17:38 |
ansmith | should i propose in the etherpad or just plan to swing by mtg room | 17:39 |
jmccrory | yeah i think it'd be good to have it on etherpad. not sure what the time slots look like already, but if not it's own dedicated topic, there should be some free time to go over that in the room | 17:42 |
*** mrch has quit IRC | 17:43 | |
ansmith | sounds like a plan, will add it to epad | 17:44 |
*** idlemind has quit IRC | 17:50 | |
*** armaan has quit IRC | 17:51 | |
*** armaan has joined #openstack-ansible | 17:52 | |
openstackgerrit | Merged openstack/openstack-ansible-galera_server master: Allow Galera package downloads over HTTP https://review.openstack.org/546309 | 18:02 |
*** pbandark has quit IRC | 18:07 | |
*** admin0 has joined #openstack-ansible | 18:10 | |
*** mbuil has quit IRC | 18:16 | |
*** d3n14l has joined #openstack-ansible | 18:20 | |
*** sxc731 has quit IRC | 18:23 | |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible-galera_server stable/queens: Allow Galera package downloads over HTTP https://review.openstack.org/547097 | 18:30 |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible-galera_client stable/queens: Allow Galera package downloads over HTTP https://review.openstack.org/547098 | 18:30 |
admin0 | how does this error come: oslo_config.cfg.DefaultValueError: Error processing default value c3_cinder_api_container-57ae96e9 for Opt type of HostAddress. | 18:30 |
*** openstackgerrit has quit IRC | 18:33 | |
d3n14l | Hey there, I am looking for a best practices guide for writing ansible roles. Is there some guide by the osa project? | 18:33 |
mhayden | cloudnull: if you have a moment -> https://review.openstack.org/546153 | 18:36 |
*** openstackgerrit has joined #openstack-ansible | 18:36 | |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible master: CentOS 7 integrated gate optimization https://review.openstack.org/545455 | 18:36 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-galera_server master: Restore support for percona packages when using ppc64le https://review.openstack.org/547101 | 18:38 |
odyssey4me | adreznec not sure when/if this will touch you guys, but I've proposed https://review.openstack.org/547101 - will add a reno to it shortly | 18:39 |
openstackgerrit | Merged openstack/openstack-ansible stable/ocata: Update all SHAs for 15.1.17 https://review.openstack.org/545666 | 18:41 |
sar | Yesterday i got this error message in cinder-volume.log all the time when trying to migrate attached volumes: "Could not determine a suitable URL". | 18:41 |
sar | Today i added os_privileged_user_auth_url, os_privileged_user_name and os_privileged_user_password in cinder.conf, and now i get a new message: "BadRequest: Expecting to find domain in project. The server could not comply with the request since it is either malformed or otherwise incorrect. The client is assumed to be in error." | 18:41 |
sar | Does anyone know where i can specify domain in cinder.conf, or know otherwise how i can fix this? | 18:42 |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible-repo_build master: [WIP] Remove unneeded clients from heat https://review.openstack.org/546319 | 18:44 |
*** epalper has quit IRC | 18:45 | |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible master: [WIP] Test repo build w/trimmed heat requirements https://review.openstack.org/546331 | 18:47 |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible master: [WIP] Test repo build w/trimmed heat requirements https://review.openstack.org/546331 | 18:47 |
armaan | folks, have you released 15.1.17? | 18:50 |
shananigans | sar: Not sure it will help, but it looks like most of this might be set under the keystone_authtoken section. https://docs.openstack.org/cinder/latest/install/cinder-storage-install-ubuntu.html | 18:52 |
sar | I already have that in the keystone_authtoken section. I was looking at this: https://docs.openstack.org/ocata/config-reference/block-storage/config-options.html | 18:54 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-galera_server master: Restore support for percona packages when using ppc64le https://review.openstack.org/547101 | 18:54 |
sar | Where it describes os_privileged_user_name like this: OpenStack privileged account username. Used for requests to other services (such as Nova) that require an account with special rights. | 18:55 |
sar | (under the DEFAULT section) | 18:55 |
*** poopcat has joined #openstack-ansible | 18:55 | |
sar | The problem is, when cinder is trying to migrate an attached volume, it calls the nova api | 18:55 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-galera_server master: Restore support for percona packages when using ppc64le https://review.openstack.org/547101 | 18:55 |
*** d3n14l has left #openstack-ansible | 19:00 | |
*** stuartgr has quit IRC | 19:01 | |
*** Sha000000 has quit IRC | 19:01 | |
odyssey4me | armaan nope, that's the next proposed release - the sha bumps only just merged: https://review.openstack.org/545666 | 19:02 |
mnaser | debhelper : Depends: dh-strip-nondeterminism (>= 0.028~) but it is not going to be installed | 19:04 |
mnaser | anyone else run into this? | 19:04 |
armaan | odyssey4me: ahh, that makes sense. we were just going to upgrade Ocata to Pike. Using 15.1.6 for now | 19:05 |
odyssey4me | mnaser yeah, I just saw it in the panko role test | 19:05 |
mnaser | odyssey4me: looks like master went through though? | 19:05 |
mnaser | but this is for the stable branches | 19:05 |
mnaser | and infra said the ubuntu mirrors were fixed last night | 19:05 |
odyssey4me | I'm not sure why that package is actually even needed, but that sort of failure happens when the packages on the host are newer than those available in the apt sources configured... or there's a conflict. | 19:05 |
mnaser | looks like stable/pike passed | 19:06 |
odyssey4me | mnaser master failed too - same issue: https://review.openstack.org/546773 | 19:06 |
mnaser | and andreas did the rechecks at the same time | 19:06 |
mnaser | i wonder if maybe one provider got older nodepool images or something | 19:07 |
odyssey4me | mnaser FYI the original content was pike, so anything after that has the same code base and has not been adjusted other than to change the branches... | 19:07 |
mnaser | odyssey4me: yeah i was just going to ask if there was any differences but maybe this failure is an os_keystone one | 19:07 |
odyssey4me | but I did test master on a cloud image of my own yesterday, and it passed functional tests... so this is new | 19:07 |
mnaser | cause that's where it is failing | 19:07 |
odyssey4me | ah? that's new then | 19:08 |
mnaser | its failing during keystone distro pkg installs | 19:08 |
*** sxc731 has joined #openstack-ansible | 19:08 | |
mnaser | yep, its not pankos fault | 19:08 |
odyssey4me | if it's in a container where the fail happens, then bear in mind that our container image is downloaded from images.lxc-container.org, so it might have newer packages in it than those in the infra mirror | 19:08 |
mnaser | odyssey4me: https://review.openstack.org/#/c/547072/ recent os_nova change, failed with the same issue | 19:08 |
odyssey4me | sounds to me like this might be the case | 19:08 |
odyssey4me | we can wait it out, or ask infra to update the mirror again | 19:09 |
odyssey4me | this is very unusual, because the base image package list is very small | 19:09 |
odyssey4me | but with gcc releasing yesterday, and whatever this is today, I guess it happens sometimes | 19:09 |
mnaser | 20180222_03:49 | 19:09 |
mnaser | latest ubuntu xenial image | 19:09 |
mnaser | so that can explain it | 19:09 |
odyssey4me | I've been meaning to convert our base image prep into d-i-b to replace what we have in lxc_hosts today. | 19:10 |
odyssey4me | It just hasn't been that much of a priority. | 19:10 |
mnaser | odyssey4me: i'll pretty much be at all the osa ptg stuff and i can bring up some stuff that we learned to stabilize ci in puppet | 19:10 |
mnaser | ex: it looks like there is no usage of mirrors right now? and https://us.images.linuxcontainers.org/ is not mirrored either so that can't be good | 19:11 |
odyssey4me | cool bananas - I'm out for the night, cheerio! | 19:11 |
mnaser | o/ later | 19:11 |
mnaser | i'll try to ask infra to force an update | 19:11 |
admin0 | anyone knows how to fix this error. or what might I be missing: "oslo_config.cfg.DefaultValueError: Error processing default value c3_cinder_api_container-57ae96e9 for Opt type of HostAddress" | 19:14 |
admin0 | hmm.. nova works, neutron got the same error: oslo_config.cfg.DefaultValueError: Error processing default value c3_neutron_server_container-4e8ae1a8 for Opt type of HostAddress | 19:32 |
admin0 | what am I missing/doing wrong ? | 19:32 |
openstackgerrit | Merged openstack/openstack-ansible-os_panko stable/pike: Zuul: Remove project name https://review.openstack.org/546775 | 19:32 |
admin0 | hostname, hostname -f returns fine | 19:33 |
logan- | mnaser re: the linuxcontainers stuff above and in infra... it is downloaded thru a reverse proxy (see https://github.com/openstack/openstack-ansible/blob/99ca16e85e5b81fa111c152f0fae56bd05a5d814/tests/roles/bootstrap-host/tasks/prepare_aio_config.yml#L80-L85) | 19:33 |
logan- | i think that setting was implemented in both osa and role test gates | 19:34 |
logan- | s/osa/integrated | 19:34 |
admin0 | logan-, not my error right ? | 19:35 |
admin0 | aah . was for mnaser | 19:36 |
logan- | admin0: maybe the underscores are breaking it. does 'hostname' show '_' or '-' | 19:36 |
admin0 | none | 19:36 |
admin0 | infra nodes = c1 c2 and c3, compute = b4 .. b10 | 19:37 |
logan- | inside the container though | 19:37 |
logan- | the container hostnames should have '-' | 19:37 |
admin0 | - | 19:40 |
admin0 | oh | 19:40 |
admin0 | it generated all on _ | 19:40 |
admin0 | my inventory list output: https://gist.github.com/a1git/a9fd50ba71b62372887793552a019ff1 | 19:40 |
admin0 | i am using stable/pike | 19:41 |
logan- | the container name is not the same as the hostname of the container | 19:41 |
logan- | the container name is expected to contain '_' | 19:41 |
logan- | the hostname is not | 19:41 |
admin0 | err.. its auto generated by the script | 19:41 |
admin0 | sorry .. me confused | 19:42 |
logan- | lxc-attach -n `lxc-ls -1 | grep cinder_api` -- hostname | 19:42 |
logan- | what is the output | 19:42 |
admin0 | root@c3:~# lxc-attach -n `lxc-ls -1 | grep cinder_api` -- hostname | 19:42 |
admin0 | c3_cinder_api_container-57ae96e9 | 19:42 |
admin0 | root@c3:~# | 19:42 |
admin0 | oops .. sorry for multi-line | 19:42 |
admin0 | returned: c3_cinder_api_container-57ae96e9 | 19:43 |
logan- | as a quick test try changing the /etc/hostname and /etc/hosts in the container to set - | 19:43 |
logan- | and see if cinder still complains | 19:43 |
logan- | (reboot the container after changing the hostname files) | 19:43 |
admin0 | before i do this test.. assuming this change fixes it, what is the solution ? i have to redo the whole platform again ? | 19:44 |
admin0 | and this is a greenfield deployment .. why would the setup-hosts generate _ when we know it breaks in setup-openstack ? | 19:45 |
MikeW | My repo build isn't pulling down the magnum checksum and virtualenv... can I do this manually? | 19:45 |
logan- | admin0: i'm not sure if there is a patch proposed yet. I just recall seeing someone else with this issue from bug triage. looking for the bug report | 19:47 |
logan- | admin0: https://bugs.launchpad.net/openstack-ansible/+bug/1743805 | 19:47 |
openstack | Launchpad bug 1743805 in openstack-ansible "neutron-db-manage fails on hostname with underscore" [Undecided,Incomplete] | 19:47 |
mnaser | logan-: i see the bootstrap_host_ubuntu_repo but i dont see anything that sets it | 19:49 |
logan- | user_variables_aio.yml template i think | 19:49 |
mnaser | logan-: ok ill double check but indeed its here https://github.com/openstack-infra/system-config/blob/master/modules/openstack_project/templates/mirror.vhost.erb#L166-L169 | 19:49 |
logan- | yup | 19:50 |
logan- | https://github.com/openstack/openstack-ansible/blob/99ca16e85e5b81fa111c152f0fae56bd05a5d814/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L164-L168 | 19:50 |
mnaser | logan-: but still can't find bootstrap_host_ubuntu_repo (at least with github search which isn't always reliable) | 19:52 |
logan- | ohh gotcha | 19:52 |
logan- | tests/roles/bootstrap-host/tasks/install_packages.yml | 19:53 |
logan- | github search is the worst | 19:53 |
logan- | looks like it tries to figure it out here https://github.com/openstack/openstack-ansible/blob/99ca16e85e5b81fa111c152f0fae56bd05a5d814/tests/roles/bootstrap-host/tasks/install_packages.yml#L25-L49 | 19:55 |
*** weezS_ has joined #openstack-ansible | 20:03 | |
odyssey4me | o/ logan- mnaser trying to figure out how repositories are configured? I can help | 20:10 |
mnaser | odyssey4me: trying to figure out why this error is happening | 20:10 |
mnaser | infra recently started mirroring bionic | 20:10 |
mnaser | could it be related? | 20:10 |
* mnaser shrugs | 20:10 | |
odyssey4me | looks like https://review.openstack.org/546775 went through, which is odd | 20:11 |
odyssey4me | I wonder if some regions are out of date, and some are up to date | 20:11 |
mnaser | odyssey4me: thats not possible because of how afs works (afaik) | 20:11 |
odyssey4me | role tests and integrated build tests get set differently, so let's focus on one of them | 20:11 |
odyssey4me | mnaser yep, that's why I think it's odd | 20:11 |
mnaser | http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/result/b068b5cb-4466-40f1-98f9-43037bcbc06c/ | 20:12 |
odyssey4me | ok, so the basic overview of what happens is this | 20:12 |
mnaser | nothing bionic related here for the container | 20:12 |
odyssey4me | lxc_hosts preps an image on the host which forms the basis of all containers | 20:12 |
odyssey4me | the base cache is downloaded from images.linuxcontainers.org - inside infra it's downloaded through a reverse proxy | 20:13 |
odyssey4me | the download is initiated here: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/45bee5806a4249eaf511f9203a50cbacca88b72f/tasks/lxc_cache_prestage.yml#L66 | 20:14 |
odyssey4me | and finalised here: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/master/tasks/lxc_cache_preparation_systemd_new.yml#L41 | 20:14 |
odyssey4me | it's done via an async task because it takes some time, so we let the host do some other work while that's happening | 20:15 |
odyssey4me | inside nodepool/openstack-ci, the right mirror var is set in this task: https://github.com/openstack/openstack-ansible-tests/blob/master/common-tasks/test-set-nodepool-vars.yml#L25 | 20:15 |
odyssey4me | so if we look in https://review.openstack.org/#/c/546773/ at the failure - which is a role test | 20:16 |
odyssey4me | we can see when that var is set for the test by looking for 'Discover the lxc_image_cache_server value when in nodepool' in the ARA report in http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/ | 20:17 |
odyssey4me | we can see it did the right thing in the async result: http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/result/639ff31d-54ed-49c9-a0a4-1b631bcb2018/ | 20:18 |
odyssey4me | mnaser with me so far? | 20:18 |
mnaser | odyssey4me: yeah, also some interesting discussion from #openstack-infra too | 20:18 |
odyssey4me | it looks like the host prep worked just fine: http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/host/lxc-cache-prep-commands.log.txt.gz | 20:19 |
mnaser | combining what you're saying with what they're saying, we're ending up with either bionic images with xenial repos OR xenial images with bionic repos | 20:19 |
odyssey4me | what that means is that the base image was downloaded, and the prep was done using infra mirrors without a hitch - so that means the problem happens later | 20:19 |
admin0 | odyssey4me, this bug https://bugs.launchpad.net/openstack-ansible/+bug/1743805 affects all new installs . | 20:20 |
openstack | Launchpad bug 1743805 in openstack-ansible "neutron-db-manage fails on hostname with underscore" [Undecided,Incomplete] | 20:20 |
odyssey4me | a key thing to understand here is that all containers do not use the infra images - only the host does | 20:20 |
admin0 | i am doing a new greenfield install today .. cinder and neutron fails .. cinder i can live without for a few days, neutron cannot | 20:20 |
odyssey4me | also, the container apt sources are pristine - we only copy /etc/apt/sources.list from the host - everything else is laid down by our ansible tasks | 20:21 |
mnaser | odyssey4me: so download image, 'prep' it locally, build containers from it, right? | 20:21 |
odyssey4me | mnaser yep | 20:21 |
mnaser | is it possible that we're downloading a bionic image and prepping it with xenial repos | 20:21 |
odyssey4me | we would like to change the download part to something prepped by diskimage-builder so that we have *full* control of it, but it's not work anyone's picked up yet | 20:21 |
mnaser | odyssey4me: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/45bee5806a4249eaf511f9203a50cbacca88b72f/tasks/lxc_cache_prestage.yml#L47-L61 | 20:21 |
mnaser | that gets a list of all images | 20:22 |
mnaser | and grabs the latest one | 20:22 |
mnaser | which might happen to be bionic | 20:22 |
odyssey4me | ok, so what is a 'bionic' image? | 20:22 |
mnaser | odyssey4me: latest release of ubuntu | 20:22 |
mnaser | 18.04 | 20:23 |
odyssey4me | oh, that'd be weird | 20:23 |
mnaser | that codebase doesnt seem to filter | 20:23 |
mnaser | it seems to grab this file https://us.images.linuxcontainers.org/meta/1.0/index-system | 20:23 |
mnaser | then matches against cache_index_item | 20:23 |
odyssey4me | based on http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/host/lxc-cache-prep-commands.log.txt.gz it looks like it's using xenial sources - and that file came from the host | 20:24 |
odyssey4me | mnaser would it be helpful to continue the discussion in #openstack-infra? | 20:24 |
mnaser | odyssey4me: nono, i'm saying the lxc image we download is bionic (18.04) | 20:24 |
openstackgerrit | Major Hayden proposed openstack/openstack-ansible master: [WIP] Install Python 3.5 on CentOS 7 https://review.openstack.org/547126 | 20:24 |
mnaser | odyssey4me: for now, i think this is an OSA issue | 20:24 |
mnaser | one tiny thing i need to check | 20:25 |
mnaser | what "Set image index fact" is doing | 20:25 |
odyssey4me | looks to me like it's right? http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/result/5593ebbc-f171-49a7-81a2-4fe0883ef9d5/ | 20:25 |
odyssey4me | also see the actual image downloaded's path: http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/result/639ff31d-54ed-49c9-a0a4-1b631bcb2018/ | 20:26 |
odyssey4me | so, if it is bionic, then something has gone wrong upstream | 20:26 |
mnaser | {{ lxc_images[0].split(';')[-1] }} | 20:26 |
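
The lookup mnaser describes (fetch index-system, match lines against cache_index_item, take the first hit, keep the last `;`-separated field) can be sketched roughly as below. This is an illustration, not the role's actual code: the field layout of the index file, the sample lines, and the helper names `matching_images` and `image_path` are assumptions made for the example.

```python
# Hypothetical excerpt of an index-system file. The assumed layout is
# distro;release;arch;variant;build;path (one image per line).
SAMPLE_INDEX = """\
ubuntu;bionic;amd64;default;20180222_03:49;/images/ubuntu/bionic/amd64/default/20180222_03:49/
ubuntu;xenial;amd64;default;20180222_03:49;/images/ubuntu/xenial/amd64/default/20180222_03:49/
ubuntu;xenial;i386;default;20180222_03:49;/images/ubuntu/xenial/i386/default/20180222_03:49/
"""


def matching_images(index_text, cache_index_item):
    """Return the index lines whose leading fields equal cache_index_item."""
    prefix = cache_index_item + ";"
    return [line for line in index_text.splitlines() if line.startswith(prefix)]


def image_path(index_text, cache_index_item):
    """Mimic lxc_images[0].split(';')[-1]: last field of the first match."""
    images = matching_images(index_text, cache_index_item)
    if not images:
        raise LookupError("no image matches " + cache_index_item)
    return images[0].split(";")[-1]


if __name__ == "__main__":
    print(image_path(SAMPLE_INDEX, "ubuntu;xenial;amd64;default"))
```

If the match really is anchored on distro, release, arch and variant like this, a bionic line in the index cannot be selected for a xenial host, which fits odyssey4me's reading of the logs.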
mnaser | yeah | 20:26 |
mnaser | ok let me download this real quick to check | 20:26 |
odyssey4me | ok - let's take a step back for a bit - do that, that might be useful | 20:27 |
odyssey4me | lemme take a peek at the actual fail and look at it from that end | 20:27 |
mnaser | odyssey4me: interestingly enough xenial has the same exact update timestamp as bionic | 20:28 |
mnaser | 20180222_03:49 | 20:28 |
odyssey4me | debhelper happens to be in xenial-backports: https://packages.ubuntu.com/search?suite=xenial-backports&searchon=names&keywords=debhelper | 20:28 |
odyssey4me | mnaser ah yes, but the CI that builds those images runs regularly, and all the time | 20:29 |
mnaser | odyssey4me: but that release has no pinned dh-strip-nondeterminism dependency | 20:29 |
mnaser | odyssey4me: whereas look at https://packages.ubuntu.com/bionic/debhelper .. exact dependency it's trying to pull | 20:29 |
odyssey4me | it does dep on it: https://packages.ubuntu.com/xenial-backports/debhelper | 20:31 |
openstackgerrit | Merged openstack/openstack-ansible-tests master: Use ARA instead of profile_tasks callback https://review.openstack.org/546270 | 20:32 |
mnaser | odyssey4me: but it's not a pinned dependency. if you notice, the issue is that it's trying to install >= 0.028~ specifically in the error | 20:32 |
odyssey4me | yeah, that is odd | 20:32 |
admin0 | is there a way to force creation of containers with - instead of _ ? | 20:32 |
mnaser | so with the xenial-backports or xenial package, it'll install any release happily. but if it's trying to install debhelper from bionic, then it absolutely needs >= 0.028~ .. which doesn't exist in xenial | 20:33 |
mnaser | admin0: i think you're using an old release of osa | 20:33 |
mnaser | all new ones use dashes | 20:33 |
admin0 | stable/pike | 20:33 |
odyssey4me | admin0 the inventory uses _, but lxc-container-create should translate that to - | 20:34 |
odyssey4me | for the hostnames and dns entries, I mean | 20:34 |
admin0 | mnaser, checked out like 12 hours ago | 20:34 |
mnaser | admin0: not sure, my deployment def has dashes | 20:34 |
mnaser | id double check configs/vars/etc | 20:34 |
admin0 | mnaser, this is what i got https://gist.github.com/a1git/a9fd50ba71b62372887793552a019ff1 | 20:34 |
odyssey4me | that's been the case since mitaka... but you're not the first to report a problem, so maybe something new has crept into LXC which is breaking the mechanism | 20:34 |
mnaser | odyssey4me: thats not the same thing | 20:34 |
odyssey4me | not sure if cloudnull is around, 'cos he was looking into it | 20:35 |
mnaser | read what odyssey4me just mentioned re: inventory and hostname | 20:35 |
mnaser | lxc-attach -n c1_neutron_server_container-efcf9281 on c1 | 20:35 |
mnaser | and type hostname | 20:35 |
mnaser | odyssey4me: maybe we can talk a bit with infra some more, i'm a bit at a loss | 20:35 |
odyssey4me | yep, if you do ^ then you should see '-' instead | 20:35 |
admin0 | returns: c1_neutron_server_container-efcf9281 | 20:35 |
odyssey4me | mnaser okie dokey | 20:36 |
admin0 | i do not see - | 20:36 |
mnaser | odyssey4me: is queens supported on xenial by uca? | 20:40 |
admin0 | mnaser, which file is responsible for this ... maybe i can compare/check this | 20:42 |
mnaser | admin0: no idea, it just works for me :X | 20:42 |
mnaser | id retry the deployment i think something went wrong | 20:42 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/openstack-ansible-nspawn_container_create master: Updated from OpenStack Ansible Tests https://review.openstack.org/547133 | 20:43 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/openstack-ansible-nspawn_hosts master: Updated from OpenStack Ansible Tests https://review.openstack.org/547134 | 20:43 |
odyssey4me | mnaser yup: http://mirror.bhs1.ovh.openstack.org/ubuntu-cloud-archive/dists/xenial-updates/queens/ | 20:43 |
openstackgerrit | Merged openstack/openstack-ansible stable/queens: Update Queens doc index https://review.openstack.org/546971 | 20:45 |
openstackgerrit | Merged openstack/openstack-ansible master: Update documentation index to include Queens https://review.openstack.org/546968 | 20:45 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/openstack-ansible-os_panko master: Updated from OpenStack Ansible Tests https://review.openstack.org/547135 | 20:46 |
mnaser | odyssey4me: now to raise the question | 20:46 |
mnaser | why do we even install debhelper | 20:46 |
mhayden | it's a helper | 20:47 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client stable/ocata: Fix cache update after initial apt_repository fail https://review.openstack.org/547048 | 20:47 |
mhayden | for debs | 20:47 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client stable/newton: Fix cache update after initial apt_repository fail https://review.openstack.org/547052 | 20:47 |
odyssey4me | that is a very good question | 20:47 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client stable/queens: Fix cache update after initial apt_repository fail https://review.openstack.org/547043 | 20:47 |
mnaser | mhayden with the 🔥 answers | 20:47 |
mhayden | i doubt we need it unless we're building debs or converting a python pkg to deb | 20:47 |
mnaser | git blame time | 20:47 |
mhayden | OH EMOJI | 20:47 |
mhayden | 🍺 | 20:48 |
mnaser | hey cloudnull -- wanna try to remember a decision you took 2 years ago? https://github.com/openstack/openstack-ansible-os_keystone/commit/ebdcb34c3a95fc399fe077455bffe40617bccdaf :P | 20:48 |
admin0 | magic of git blame :) | 20:48 |
mnaser | im back to 3 years and still see debhelper | 20:49 |
mnaser | actually, it looks like this has existed since the start of os_keystone so before things got split | 20:49 |
odyssey4me | http://git.openstack.org/cgit/openstack/openstack-ansible-os_keystone/tree/vars/ubuntu-16.04.yml#n17 | 20:49 |
odyssey4me | it might just be leftover cruft from days of yore | 20:50 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547044 | 20:50 |
odyssey4me | mnaser mhayden yep, I found it in our icehouse keystone dep list | 20:51 |
odyssey4me | https://github.com/openstack/openstack-ansible/commit/c6e5c9a74e915915f1aaf1948c46a3f8afe9b823 | 20:52 |
mnaser | thats waaay back | 20:52 |
odyssey4me | yep, no mention of it before that | 20:53 |
odyssey4me | lemme check something else, to see if we can go back even further ;) | 20:53 |
mnaser | dont think i can go back beyond that :p | 20:53 |
mnaser | but while you do that i'll propose a patch to drop it for now | 20:53 |
odyssey4me | yep, looks to me like it's old cruft anyway | 20:54 |
odyssey4me | it's curious to me that there's an issue though | 20:54 |
odyssey4me | it's plausible that UCA was added, but the indexes weren't updated after that | 20:54 |
mnaser | does OSA have someone to reach out for UCA issues like this | 20:55 |
odyssey4me | I could see if jamespage is around. | 20:55 |
mnaser | do we need dh-apparmor too? | 20:55 |
odyssey4me | Heh, he's in channel - so if he's available, he'll pop up. | 20:55 |
mnaser | does mhayden respond to selinux only or apparmor questions work too :P | 20:55 |
odyssey4me | I doubt it, if it's a related package. | 20:55 |
* mhayden has little apparmor experience | 20:56 | |
mnaser | "dh-apparmor provides the debhelper tools used to install and migrate AppArmor profiles. This is normally used from package maintainer scripts during install and removal." | 20:56 |
odyssey4me | mhayden put selinux in permissive mode... I'm not sure if we can trust him any more | 20:56 |
odyssey4me | mnaser sounds like we can remove it | 20:56 |
admin0 | :D | 20:56 |
admin0 | "<odyssey4me> mhayden put selinux in permissive mode... I'm not sure if we can trust him any more" :D | 20:56 |
odyssey4me | mnaser what I'm suggesting is that a possible cause could be the issue which https://review.openstack.org/547003 is actually trying to solve | 20:58 |
mnaser | odyssey4me: i mean that change did pass | 20:58 |
mnaser | :P | 20:58 |
odyssey4me | the UCA repo is configured, but the cache does not get updated - then the install tries to install and finds a dep it can't resolve | 20:58 |
openstackgerrit | Merged openstack/openstack-ansible-plugins master: Make connection plugin compatible with Ansible 2.5 https://review.openstack.org/543576 | 20:58 |
openstackgerrit | Merged openstack/openstack-ansible-galera_server stable/newton: Fix Apt cache update due to adding Galera repo https://review.openstack.org/547074 | 20:58 |
mnaser | but it didn't try to install the os_keystone stuff | 20:58 |
odyssey4me | that said, how does it know about the new dep | 20:59 |
openstackgerrit | Mohammed Naser proposed openstack/openstack-ansible-os_keystone master: Drop unnecessary dependencies from role https://review.openstack.org/547139 | 20:59 |
odyssey4me | we can try a test patch to see if we get a pass | 20:59 |
mhayden | odyssey4me: TEMPORARILY | 20:59 |
odyssey4me | lemme push a test patch with a depends-on | 20:59 |
mhayden | odyssey4me: ಠ_ಠ | 20:59 |
odyssey4me | mhayden sure, sure... I bet you say that to all the auditors | 20:59 |
admin0 | i am going to destroy the containers, pull the 16.0.9 i see now and hope the new containers will be using - and not _ | 21:00 |
openstackgerrit | Mohammed Naser proposed openstack/openstack-ansible-repo_build master: Drop unnecessary dependencies from role https://review.openstack.org/547140 | 21:00 |
mnaser | voila | 21:00 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-os_keystone master: [TEST] Try using updated openstack_hosts https://review.openstack.org/547141 | 21:01 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/openstack-ansible-repo_build master: Drop unnecessary dependencies from role https://review.openstack.org/547140 | 21:03 |
odyssey4me | mnaser just added the depends-on - that repo build change should not merge unless the other does... although that said, the repo build patch data is not used, so meh | 21:04 |
odyssey4me | it's just better to link them :) | 21:04 |
mnaser | odyssey4me: no worries, i just used codesearch to find all debhelper references | 21:04 |
odyssey4me | yep, codesearch is *way* better than github search :) cc logan- | 21:05 |
logan- | oh neat. never even heard of codesearch | 21:05 |
odyssey4me | codesearch.openstack.org | 21:06 |
odyssey4me | I'm no expert, but it's far more useful than github search as it searches substrings. | 21:06 |
logan- | awesome | 21:06 |
odyssey4me | I only wish you could specify the branch to search too. | 21:06 |
odyssey4me | as far as I've seen - it does master only | 21:07 |
odyssey4me | (same as github) | 21:07 |
logan- | i gave up on github search long ago, always just clone+grep. will have to give codesearch a try :) | 21:07 |
mhayden | ripgrep is quite fast too | 21:14 |
odyssey4me | ripgrep? | 21:15 |
mhayden | rust implementation of grep | 21:15 |
mhayden | https://github.com/BurntSushi/ripgrep | 21:15 |
odyssey4me | neat, but still requires local clones | 21:15 |
mhayden | si | 21:15 |
mhayden | i always have updated local clones thanks to gertty | 21:16 |
odyssey4me | of course I know mhayden has all the OSA repositories cloned locally - every. single. one. | 21:16 |
logan- | lol | 21:16 |
odyssey4me | especially those new ones we imported yesterday when we broke zuul again | 21:16 |
mhayden | you betcha | 21:16 |
mhayden | hot damn | 21:16 |
mhayden | wait, you guys were breaking stuff and you didn't invite me? | 21:16 |
mhayden | that's, like, my specialty | 21:17 |
mhayden | ol logan- should be thankful nothing's broken in his datacenter yet | 21:17 |
odyssey4me | yes, the breakdown was spectacular - but lacked the pazezz it has when you're around | 21:17 |
* mhayden will try harder | 21:17 | |
odyssey4me | so, welcome to our new repositories: | 21:17 |
odyssey4me | https://github.com/openstack/openstack-ansible-os_panko | 21:17 |
odyssey4me | https://github.com/openstack/openstack-ansible-nspawn_hosts | 21:17 |
* mhayden is hungry for breadcrumbs | 21:18 | |
odyssey4me | https://github.com/openstack/openstack-ansible-nspawn_container_create | 21:18 |
* odyssey4me gives up trying to spell pazazz (whatever) and switches to the word 'flare' instead | 21:18 | |
odyssey4me | heh, although that's probably the wrong word - 'flair' ? | 21:19 |
* odyssey4me gives up on words | 21:19 | |
evrardjp[m] | haha | 21:19 |
odyssey4me | speaking of which, we should probably add some functional tests to those repositories :p | 21:19 |
mhayden | apparently pizzazz doesn't directly translate into afrikaans | 21:20 |
mhayden | ... the more you know ... | 21:20 |
odyssey4me | mhayden flair doesn't either: http://www.majstro.com/Web/Majstro/bdict.php?gebrTaal=eng&bronTaal=eng&doelTaal=afr&teVertalen=flair | 21:21 |
odyssey4me | logan- got a minute to look through https://review.openstack.org/#/q/topic:bug/1750656+(status:open+OR+status:merged) ? | 21:23 |
logan- | yep | 21:23 |
logan- | on it | 21:23 |
mhayden | odyssey4me: dang, how do people in south africa enjoy office space?! | 21:25 |
jamespage | odyssey4me: ok here for a bit | 21:27 |
odyssey4me | jamespage thanks :) | 21:27 |
odyssey4me | so some time this afternoon we started getting fails from our keystone role, which tries to install debhelper | 21:27 |
jamespage | lemme check what time I promoted everything | 21:27 |
odyssey4me | the error was that it couldn't find its dependency, which was pinned | 21:28 |
openstackgerrit | Merged openstack/openstack-ansible-galera_server stable/ocata: Fix cache update after initial apt_repository fail https://review.openstack.org/547063 | 21:28 |
openstackgerrit | Merged openstack/openstack-ansible-galera_server stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547061 | 21:28 |
openstackgerrit | Merged openstack/openstack-ansible-galera_server stable/queens: Fix cache update after initial apt_repository fail https://review.openstack.org/547053 | 21:28 |
jamespage | odyssey4me: 15:03 | 21:28 |
odyssey4me | it took a bit of spelunking, but we found that pinned requirement in UCA | 21:28 |
odyssey4me | ok, let's see if we can find the first failures in logstash | 21:29 |
odyssey4me | here's an example: http://logs.openstack.org/73/546773/1/check/openstack-ansible-functional-ubuntu-xenial/e232748/logs/ara/result/73cd2edf-2288-4f40-af0e-ca26f0d446db/ | 21:29 |
jamespage | that included fresh backports of debhelper + associated misc tooling bumps needed to support that including pkgbinarymangler, strip-nondeterminism and cmake (yah I know) | 21:29 |
jamespage | odyssey4me: have you managed to unstick that held package issue? | 21:30 |
*** deadnull has quit IRC | 21:30 | |
jamespage | I just did a quick check on fresh xenial with queens-updates and it's installable afaict | 21:30 |
jamespage | well it must be otherwise nothing would have built for the last three weeks :-) | 21:31 |
odyssey4me | this appears to be the earliest failure in logstash: http://logs.openstack.org/17/544117/11/check/openstack-ansible-octavia-ssl-nv/28fa63d/job-output.txt | 21:31 |
odyssey4me | it was using the mirror: http://logs.openstack.org/17/544117/11/check/openstack-ansible-octavia-ssl-nv/28fa63d/logs/etc/openstack/openstack1/apt/sources.list.d/uca.list.txt.gz | 21:33 |
odyssey4me | all the way: http://logs.openstack.org/17/544117/11/check/openstack-ansible-octavia-ssl-nv/28fa63d/logs/etc/openstack/openstack1/apt/sources.list.txt.gz | 21:33 |
odyssey4me | so it's a bit odd | 21:33 |
odyssey4me | I guess it's plausible that there was an infra mirror update process running at around the same time | 21:34 |
jamespage | odyssey4me: lots of moving parts | 21:34 |
odyssey4me | and perhaps the update you did wasn't done yet | 21:34 |
odyssey4me | so when the mirror happened, the source mirror was incomplete | 21:34 |
jamespage | binary package copies between -proposed and -updates PPA's in launchpad | 21:34 |
jamespage | sync of PPA -> UCA (via reprepro) | 21:34 |
*** acormier has quit IRC | 21:36 | |
jamespage | and then sync from UCA to infra mirrors as well | 21:36 |
*** acormier has joined #openstack-ansible | 21:37 | |
*** sxc731 has quit IRC | 21:37 | |
*** pcaruana has quit IRC | 21:37 | |
odyssey4me | yeah, none of which are event based - especially the last part | 21:38 |
jamespage | odyssey4me: urgh so | 21:38 |
jamespage | odyssey4me: debhelper went in at 15:06 | 21:38 |
jamespage | strip-nondt at 15:39 | 21:38 |
jamespage | that's either side of a sync from the source PPA's on LP to the actual UCA | 21:38 |
odyssey4me | heh, aha | 21:39 |
jamespage | so from 15:30 until 16:30 there would have been an installability issue | 21:39 |
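
The arithmetic behind that window can be checked with a small sketch. It assumes the PPA to UCA sync runs hourly at half past the hour, which is only inferred from the 15:30 to 16:30 window jamespage quotes; the function name `broken_window` is made up for the example, and the date comes from the channel log.

```python
from datetime import datetime, timedelta


def broken_window(landed, first_sync, interval):
    """Return (start, end) of the period during which a scheduled sync has
    copied only some of the packages, or None if no sync splits them.

    landed     -- mapping of package name to the time it reached the source
    first_sync -- any known sync time at or before the first landing window
    interval   -- time between scheduled syncs
    """
    earliest, latest = min(landed.values()), max(landed.values())
    sync = first_sync
    # Find the first sync that picks up at least one of the packages.
    while sync < earliest:
        sync += interval
    if sync >= latest:
        return None  # everything had landed before that sync ran
    start = sync
    # The mirror stays inconsistent until a sync runs after the last landing.
    while sync < latest:
        sync += interval
    return start, sync


day = datetime(2018, 2, 22)
LANDED = {
    "debhelper": day.replace(hour=15, minute=6),
    "strip-nondeterminism": day.replace(hour=15, minute=39),
}

if __name__ == "__main__":
    print(broken_window(LANDED, day.replace(hour=14, minute=30), timedelta(hours=1)))
```

Under those assumptions the 15:30 sync copies debhelper without its new dependency, and the 16:30 sync closes the gap. The infra mirror sync adds its own delay on top, which this sketch does not model.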
odyssey4me | ok, so these are all scheduled, not event driven | 21:39 |
odyssey4me | and we happened to find a perfect storm - I guess partly due to the volume of updates | 21:39 |
jamespage | unfortunately that does appear to be the case here | 21:40 |
evrardjp | good morning everyone | 21:40 |
*** acormier has quit IRC | 21:41 | |
odyssey4me | jamespage so this is when the last infra update was: http://mirror.bhs1.ovh.openstack.org/ubuntu-cloud-archive/timestamp.txt | 21:41 |
jamespage | should be OK now then | 21:41 |
evrardjp | what's the issue? | 21:42 |
odyssey4me | jamespage aha, that would then explain why https://review.openstack.org/546775 managed to get past the issue, but https://review.openstack.org/546773 which ran before that time did not | 21:43 |
odyssey4me | rechecked some more patches now to see what happens | 21:43 |
odyssey4me | jamespage ok, curiosity satisfied - mnaser does that all make more sense now? | 21:44 |
*** ansmith has quit IRC | 21:44 | |
odyssey4me | thanks jamespage :) | 21:44 |
jamespage | you're welcome - I'll have a think about how we can make this type of promotion a bit more transactional to keep things consistent | 21:45 |
mnaser | jamespage, odyssey4me: thanks for the info, what an odd set of coincidences | 21:45 |
odyssey4me | mnaser rather :) it happens | 21:46 |
odyssey4me | it would seem that it was an issue for not too long - 90 mins at most, although that's my speculative guess and I'm too lazy to dig up a more accurate guess :p | 21:47 |
*** dave-mcc_ has joined #openstack-ansible | 21:47 | |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Centralize Inventory documentation https://review.openstack.org/547149 | 21:47 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Move limited connectivity to user guide https://review.openstack.org/547150 | 21:47 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Migrate security into user guide https://review.openstack.org/547151 | 21:47 |
cloudnull | mnaser: whats up ? | 21:47 |
odyssey4me | evrardjp the symptom was http://logs.openstack.org/17/544117/11/check/openstack-ansible-octavia-ssl-nv/28fa63d/logs/ara/result/6bd11718-fa40-4de0-a8ae-89f0e6c02825/ | 21:48 |
mnaser | cloudnull: we're all good now but it looks like in a history far far away debhelper was added as a os_keystone dependency | 21:48 |
mnaser | and with some combination of issues, we couldn't install it | 21:48 |
odyssey4me | the root cause turned out to be a few things happening at the wrong place at the wrong time | 21:49 |
cloudnull | oh. | 21:49 |
*** dave-mccowan has quit IRC | 21:49 | |
cloudnull | ok, so nothing to see here https://i1.wp.com/angry.net/blog2/wp-content/uploads/2014/09/nothing-to-see-here.jpg ? | 21:50 |
odyssey4me | jamespage pushed a bunch of updates to the PPA for UCA, which then somewhere while it wasn't done got pulled into UCA, which then also got pulled in by infra's mirror process | 21:50 |
odyssey4me | so packages that UCA wanted weren't there | 21:50 |
cloudnull | admin0: odyssey4me: are we seeing hostnames with _ instead of - ? | 21:50 |
odyssey4me | the syncs are all done now, and everything is back to working | 21:50 |
evrardjp | cloudnull: :) | 21:50 |
admin0 | yes | 21:50 |
admin0 | i spent 10 hours to fail at cinder and neutron in the end :( | 21:50 |
admin0 | just now deleted all the containers and pulled tag 16.0.9 hoping it might fix | 21:51 |
evrardjp | odyssey4me: thanks for taking care of it | 21:51 |
evrardjp | and thanks jamespage and infra :) | 21:51 |
admin0 | so confident with OSA that i directly used it on production :D so took this long time to find out | 21:51 |
* admin0 feels cheated | 21:52 | |
odyssey4me | admin0 was this a fresh install, or an upgrade from a previous version? | 21:52 |
admin0 | greenfield | 21:52 |
admin0 | super fresh | 21:52 |
cloudnull | admin0: if it happens again I'd love to work through the issue and see what falls out | 21:53 |
cloudnull | we've had two different reports of this in the last couple of weeks but on different releases but no means to reproduce it | 21:53 |
admin0 | i nuked everything , pulled 16.0.9 , even deleted the /etc/ansible from deploy and redoing the setup ansible part | 21:53 |
cloudnull | so I'm a little bit at a loss on what went wrong | 21:53 |
admin0 | since this is new install, i can add your keys for you to poke around if required | 21:54 |
odyssey4me | cloudnull what's useful in this case is that admin0 is quite meticulous in documenting all his config ;) | 21:54 |
cloudnull | admin0: https://github.com/cloudnull.keys | 21:54 |
cloudnull | ++ | 21:54 |
admin0 | i also faced the machinectl bug and your patches scared me to cherry pick .. so i had to manually do machinectl limit to 500g, systemctl restart /var/lib/machines and then i do not have to do the patches or do cherry pick | 21:55 |
admin0 | that 16GB limit reached bug | 21:55 |
cloudnull | let me know what you find out, I'm happy to go dig-in | 21:55 |
admin0 | ok | 21:55 |
admin0 | cloudnull, unrelated to the errors, this is my hybrid setup .. osa with ovs - http://www.openstackfaq.com/openstack-ansible-with-ovs-pike/ | 21:56 |
cloudnull | that's awesome! | 21:56 |
admin0 | i have one running fine on ocata .. so this new greenfield is based on the same hybrid setup to use osa but on pike | 21:56 |
admin0 | here in the new one, we plan to run a platform that uses nfv .. so i will be able to document those as well | 21:57 |
cloudnull | admin0: might want to update the user variables for pike so that you can use firewall_v2 | 21:58 |
admin0 | ok | 21:58 |
cloudnull | other than that the configs look great! | 21:59 |
admin0 | this setup is going to be big .. i have 6 machines now .. but i want to migrate workload from an old cluster to here which has like 60 machines .. and we have one tenant that needs dedicated hardware where heat will create like 1000 machines every 3 hours, and then destroy .. again 1000 .. again destroy | 21:59 |
cloudnull | also nice touch w/ jumbo frames | 22:00 |
admin0 | so if you have a good example on using cells, i will incorporate from start | 22:00 |
admin0 | cloudnull, i experiment with OSA a lot :D | 22:00 |
cloudnull | I've not run cells in any meaningful way and have done VERY little with cells v2 | 22:00 |
odyssey4me | admin0 doing a cells implementation would be a small stretch from an ansible standpoint, but a bit impractical from an inventory standpoint right now | 22:01 |
admin0 | ok | 22:01 |
odyssey4me | if we start working on multi-cell implementations, I personally would rather see that we work out how to scale the inventory better | 22:01 |
odyssey4me | for now we recommend having multiple regions instead, each with their own inventory | 22:02 |
cloudnull | that said, 1000 vm workload in a single cloud of <500 compute hosts should be fine with the general configs | 22:02 |
mnaser | fyi at 60 machines you're going to have more issues dealing with cells v2 | 22:02 |
mnaser | how do i rephrase this | 22:02 |
mnaser | multiple cells at cells v2 scale is too much hassle | 22:02 |
mnaser | i think host aggregates with scheduler configuration would be much easier for you | 22:03 |
cloudnull | ^ that's what I've done more with | 22:03 |
mnaser | you'll obviously need a single cell because cells v2 has an api and 'default' cell but yeah | 22:03 |
admin0 | 1000 vm is just for training guys who destroy and bring it up every 4 hours . | 22:03 |
odyssey4me | yup, using host aggregates/availability zones for localised scale is far more useful | 22:03 |
admin0 | for their case, i disable ceilometer agents . | 22:04 |
admin0 | so firewall needs to change to firewall_v2 .. done | 22:05 |
admin0 | running | 22:05 |
admin0 | day gone .. here goes my night :D | 22:05 |
cloudnull | admin0: https://gist.github.com/cloudnull/bd3f03191683088b3f8ac46b8ef5799b | 22:06 |
cloudnull | I recently had to dig that info up | 22:06 |
cloudnull | which came from the osic cloud at 252 compute hosts, 3 infra nodes. | 22:07 |
cloudnull | which should give you an idea of what a single cell should be capable of | 22:08 |
odyssey4me | (with awesome hardware) ;) | 22:09 |
cloudnull | ^ this is true | 22:09 |
admin0 | so this hostname thing .. how do I validate if 16.0.9 does it correctly .. it should be visible after lxc-hosts-setup.yml right ? | 22:09 |
cloudnull | after lxc container create | 22:09 |
admin0 | ok | 22:09 |
admin0 | right | 22:10 |
admin0 | i have to login and do hostname -f | 22:10 |
admin0 | ok | 22:10 |
admin0 | running ! | 22:10 |
cloudnull | great. | 22:10 |
cloudnull | ping me if you figure anything out. | 22:10 |
cloudnull | or generally see the issue | 22:10 |
admin0 | ok | 22:12 |
*** armaan has quit IRC | 22:13 | |
odyssey4me | admin0 if you just do setup-hosts.yml, then check, that will do | 22:14 |
admin0 | its running | 22:14 |
odyssey4me | cloudnull could you push up some patches to get the functional tests back into the nspawn roles? | 22:15 |
odyssey4me | we broke infra with what was in the seeded repositories last night, so they got force-removed from the repo | 22:15 |
odyssey4me | if you could push up the configs to get tests back in, that'd be awesome | 22:15 |
spotz | odyssey4me: OSA broke infra? | 22:15 |
odyssey4me | if you don't manage, then I'll likely pick that up in the morning | 22:16 |
odyssey4me | spotz it was a team effort ;) | 22:16 |
spotz | heheh | 22:16 |
odyssey4me | cloudnull wrote the code, I inspected it and got it imported - boom... we discovered that infra had no tests on importing repositories which checked whether the zuul stuff worked, and when the repo imported, zuul broke | 22:17 |
admin0 | :D | 22:17 |
cloudnull | oh ? | 22:17 |
admin0 | does the /etc/hosts also get cleaned of old stuff in cases like mine when i have to delete the old hostname_ip mapping to start fresh | 22:17 |
cloudnull | jajaja. | 22:17 |
admin0 | right now, been clearing out that by hand | 22:17 |
cloudnull | odyssey4me: that's funny :) | 22:18 |
cloudnull | so what do i need to do, maybe just push up the roles without the zuul things? | 22:18 |
odyssey4me | cloudnull the roles are imported | 22:19 |
cloudnull | oh ok | 22:19 |
admin0 | ok: [c3_nova_api_os_compute_container-92064114 -> 172.29.236.3] | 22:19 |
admin0 | - not a good sign right ? | 22:19 |
odyssey4me | http://eavesdrop.openstack.org/irclogs/%23openstack-ansible/%23openstack-ansible.2018-02-22.log.html#t2018-02-22T21:17:47 | 22:19 |
odyssey4me | cloudnull so we just need the functional test config (ie the zuul.d directory and content) pushed up in a review | 22:20 |
admin0 | new format should be c3-nova-api-os-compute-container-xxx ? | 22:20 |
cloudnull | ok | 22:20 |
cloudnull | ill get on that | 22:20 |
cloudnull | admin0: yes, | 22:20 |
cloudnull | can you cat /etc/hosts | 22:21 |
odyssey4me | cloudnull thanks | 22:21 |
admin0 | i will checkout only 16.0.9 ( not master and then checkout ) and give it 1 more try | 22:21 |
odyssey4me | what you had in the source broke zuul badly :) that said, if you're now pushing it up for review, zuul will tell you that it's no good and you can figure it out until it's right | 22:21 |
odyssey4me | I do suspect that part of the issue was that the roles were pulled in together, not one after the other - and the jobs depended on each other | 22:22 |
odyssey4me | anyway, I'll leave it to you | 22:22 |
cloudnull | odyssey4me: interesting that it accepted it given that it was busted. | 22:22 |
odyssey4me | time for me to go back to !computering | 22:22 |
cloudnull | odyssey4me: take care. | 22:22 |
cloudnull | admin0: both the name with _ and name with - should work | 22:23 |
odyssey4me | cloudnull the fault was that it imported the repo, without verifying that the repo contents would not break it... so that's fixed now in a new test :) | 22:23 |
odyssey4me | new system, new lessons ;) | 22:23 |
cloudnull | odyssey4me: so we're all welcome :) | 22:23 |
admin0 | cloudnull, is there a way to just force/change all hostname generation with - | 22:23 |
cloudnull | the hostname should be with a - | 22:23 |
admin0 | so that i will know it will just work for cinder/neutron on setup-openstack playbook | 22:24 |
odyssey4me | https://media.giphy.com/media/tXTqLBYNf0N7W/giphy.gif | 22:24 |
cloudnull | however an alias will exist with an _ | 22:24 |
cloudnull | odyssey4me: exactly | 22:24 |
admin0 | is there a way i can get rid of _ and use only -? | 22:24 |
cloudnull | admin0: so /etc/hostname should be only with a - | 22:24 |
cloudnull | however a line in /etc/hosts will contain both | 22:24 |
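
What cloudnull describes (dashes only in /etc/hostname, with the underscore name kept as an alias in /etc/hosts) might look like the sketch below. It is not the lxc_container_create implementation: the helper names, the alias ordering, and the `openstack.local` domain are assumptions for illustration.

```python
def container_hostname(inventory_name):
    """Translate an inventory/container name into a hostname: '_' becomes '-'."""
    return inventory_name.replace("_", "-")


def hosts_entry(ip, inventory_name, domain="openstack.local"):
    """Build an /etc/hosts line holding the FQDN, the dashed short name and
    the original underscore name as a trailing alias.
    The domain default is a placeholder, not a confirmed setting."""
    hostname = container_hostname(inventory_name)
    return "{} {}.{} {} {}".format(ip, hostname, domain, hostname, inventory_name)


if __name__ == "__main__":
    name = "c3_cinder_api_container-9182f2dd"
    print(container_hostname(name))        # expected content of /etc/hostname
    print(hosts_entry("127.0.1.1", name))  # expected style of /etc/hosts line
```

Compared with this, a container whose /etc/hostname still contains underscores and whose /etc/hosts line has no domain or dashed name would suggest that the rewrite step did not run.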
cloudnull | i don't think we have any way to disable the alias, but we could make one. | 22:25 |
admin0 | if we all see _ gives an issue, maybe make a variable so that for new installs, only use - | 22:25 |
cloudnull | i'd be happy to ssh in and poke around. see if something is off | 22:25 |
cloudnull | anything with an _ should just be an alias, which shouldn't impact hostname resolution | 22:26 |
cloudnull | but if it is, then there's a bug we need to fix | 22:26 |
odyssey4me | admin0 we have a feature request for that already: https://bugs.launchpad.net/openstack-ansible/+bug/1643680 | 22:26 |
openstack | Launchpad bug 1643680 in openstack-ansible "Shift to using dashes instead of underscores for container names" [Wishlist,Confirmed] | 22:26 |
odyssey4me | the answer for now is, not yet | 22:27 |
odyssey4me | anyway, night night! | 22:27 |
cloudnull | see you tomorrow | 22:27 |
admin0 | see ya ! | 22:27 |
openstackgerrit | Merged openstack/openstack-ansible-openstack_hosts master: Fix cache update after initial apt_repository fail https://review.openstack.org/547003 | 22:29 |
openstackgerrit | Merged openstack/openstack-ansible-nspawn_container_create master: Updated from OpenStack Ansible Tests https://review.openstack.org/547133 | 22:29 |
openstackgerrit | Merged openstack/openstack-ansible-ceph_client master: Fix cache update after initial apt_repository fail https://review.openstack.org/547000 | 22:29 |
openstackgerrit | Merged openstack/openstack-ansible-pip_install stable/pike: Fix cache update after initial apt_repository fail https://review.openstack.org/547021 | 22:29 |
*** esberglu has quit IRC | 22:32 | |
*** ansmith has joined #openstack-ansible | 22:37 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-nspawn_hosts master: add minimal functional tests https://review.openstack.org/547157 | 22:41 |
admin0 | containers have names like c3_nova_scheduler_container-c46a84fd . waiting for it to complete so that i can login and check its hostname | 22:41 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-nspawn_container_create master: Add minimal functional gate https://review.openstack.org/547158 | 22:43 |
cloudnull | yes the name should always have an _ | 22:43 |
cloudnull | however the hostname should be translated to - | 22:43 |
admin0 | cloudnull, c3_cinder_api_container-9182f2dd | 22:44 |
*** dave-mcc_ has quit IRC | 22:44 | |
admin0 | is the hostname | 22:44 |
admin0 | it did not translate | 22:44 |
cloudnull | that's what's been written into the /etc/hostname file ? | 22:45 |
cloudnull | mind pasting /etc/hostname and /etc/hosts? | 22:45 |
admin0 | cat /etc/hostname => c3_cinder_api_container-9182f2dd ; cat /etc/hosts => 127.0.1.1 c3_cinder_api_container-9182f2dd | 22:46 |
cloudnull | well that's not good. | 22:47 |
openstackgerrit | Merged openstack/openstack-ansible-os_panko master: Updated from OpenStack Ansible Tests https://review.openstack.org/547135 | 22:47 |
cloudnull | admin0: https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/tasks/lxc_container_config.yml#L254-L262 | 22:49 |
cloudnull | you should see a domain name | 22:49 |
cloudnull | in /etc/hosts | 22:49 |
cloudnull | and /etc/hostname should have been rewritten https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/tasks/lxc_container_config.yml#L264-L271 | 22:49 |
openstackgerrit | Merged openstack/openstack-ansible-rabbitmq_server master: Fix cache update after initial apt_repository fail https://review.openstack.org/547015 | 22:49 |
admin0 | i am adding your keys | 22:49 |
admin0 | uno momento | 22:49 |
*** jwitko__ has joined #openstack-ansible | 22:50 | |
*** jwitko_ has quit IRC | 22:54 | |
*** jwitko__ has quit IRC | 22:56 | |
*** acormier has joined #openstack-ansible | 22:57 | |
*** acormier has quit IRC | 23:01 | |
*** idlemind has joined #openstack-ansible | 23:02 | |
openstackgerrit | Merged openstack/openstack-ansible-tests master: Set SELinux to permissive mode for tests https://review.openstack.org/546153 | 23:07 |
admin0 | is there an example of magnum, designate, octavia setup and usage ? | 23:10 |
admin0 | and does live-migration (without shared storage) work by default ? | 23:10 |
admin0 | or do i need to override some vars for it ? | 23:10 |
*** jwitko_ has joined #openstack-ansible | 23:14 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-nspawn_hosts master: add minimal functional tests https://review.openstack.org/547157 | 23:19 |
*** dariko has quit IRC | 23:30 | |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible master: [Docs] Migrate security into user guide https://review.openstack.org/547151 | 23:31 |
*** acormier has joined #openstack-ansible | 23:38 | |
*** acormier has quit IRC | 23:38 | |
*** acormier has joined #openstack-ansible | 23:39 | |
*** weezS__ has joined #openstack-ansible | 23:44 | |
*** weezS has quit IRC | 23:45 | |
*** weezS_ is now known as weezS | 23:45 | |
*** weezS is now known as 7JTADKMG9 | 23:45 | |
*** weezS__ is now known as 7GHAB5386 | 23:45 | |
*** abelur has quit IRC | 23:45 | |
*** 7JTADKMG9 has left #openstack-ansible | 23:46 | |
*** 7GHAB5386 has left #openstack-ansible | 23:46 | |
admin0 | ok .. suppose you have 50 lxc containers to create . say 5 on each of 10 machines, and only 1 of them fails .. so you see 49 green and 1 red .. guess what ? the hostname conversion is not applied to the other 49 shown in green | 23:51 |
admin0 | silent bug | 23:51 |
admin0 | so while i see all green, just 1 red on container X that fails .. i do not realize that the hostname thing is not done | 23:52 |
admin0 | and it hits back @ the end | 23:52 |
admin0 | after you have done all | 23:52 |
admin0 | was able to reproduce this | 23:52 |
admin0 | so say i had a container fail on c1 and everything on c3 looked green, the hostname still remained c3_utility_container-d12ce6d9 .. i had to rerun the container create -- bypassing everything that could have failed and then it updated the hostname | 23:54 |
admin0 | so cloudnull , if people complain of this again and we do not know why it happens, this is the case | 23:55 |
admin0 | ask whether, out of XXX containers, any one failed for whatever reason while all the rest were green .. that is when this happens | 23:55 |
cloudnull | ah. that's kinda a bummer | 23:56 |
admin0 | to reproduce i put one host under ironic with OSA which does not work .. so the ironic container will fail while the whole infrastructure will show red and ok .. but the hostname will not change | 23:56 |
admin0 | so when the cinder was run, it fixed cinder only | 23:56 |
admin0 | not others | 23:56 |
cloudnull | yea. that makes sense. | 23:56 |
admin0 | i did a debug and when i bypassed the containers i knew would fail, it updated all | 23:56 |
admin0 | so green != green :D | 23:57 |
cloudnull | i wonder if this is related to https://github.com/openstack/openstack-ansible/blob/master/playbooks/containers-lxc-create.yml#L37 | 23:57 |
admin0 | no idea :) | 23:57 |
cloudnull | IE there was less than 20% failure | 23:58 |
cloudnull | I think we should just dump those lines | 23:58 |
admin0 | the hostname thing should still execute on a per vm basis | 23:58 |
admin0 | maybe the trigger is somewhere else, and it only runs when ALL pass without issues | 23:58 |
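
Editor's sketch (not from the channel): the conversation above says a container name keeps its underscores while the hostname inside the container should have them translated to dashes. Assuming that rule is a plain `_` to `-` substitution, the following Python snippet shows one way to spot containers where the translation was silently skipped. The function names and the `reported` mapping are hypothetical, and collecting the real hostnames from each container is left out.

```python
from typing import Dict, List


def expected_hostname(container_name: str) -> str:
    """Hostname we expect for a container name: underscores become dashes.

    Assumption: the translation is a plain character substitution.
    """
    return container_name.replace("_", "-")


def untranslated(reported: Dict[str, str]) -> List[str]:
    """Return container names whose reported hostname is not the expected one."""
    return sorted(
        name
        for name, hostname in reported.items()
        if hostname != expected_hostname(name)
    )


# Hypothetical data: container name -> contents of its /etc/hostname.
reported = {
    "c3_cinder_api_container-9182f2dd": "c3_cinder_api_container-9182f2dd",
    "c3_utility_container-d12ce6d9": "c3-utility-container-d12ce6d9",
}

print(untranslated(reported))  # ['c3_cinder_api_container-9182f2dd']
```

Running a check like this after the container create play, even when nearly every task shows green, would surface the case admin0 describes, where one failed container leaves the others with untranslated hostnames.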