*** jtriley has joined #openstack-kolla | 00:13 | |
Pavo | if anyone is interested in helping with stress test | 00:14 |
---|---|---|
Pavo | http://ddi.hopto.org:8080/signup_user_complete/?id=3ufq6eprabyx3eqaqkbgwoitow | 00:14 |
*** sdake_ has quit IRC | 00:31 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 00:42 |
*** saneax is now known as saneax-_-|AFK | 00:45 | |
*** zhurong has joined #openstack-kolla | 00:48 | |
*** sdake has joined #openstack-kolla | 00:48 | |
*** zhurong has quit IRC | 00:51 | |
kfox1111 | sbezverk: ping | 01:03 |
*** sayantani01 has quit IRC | 01:04 | |
*** yuanying has joined #openstack-kolla | 01:06 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 01:07 |
sbezverk | kfox1111: hey I wanted to check with you.. | 01:08 |
sbezverk | I consistently see one failure on the ubuntu job when glance gets removed.. on all other jobs glance removal goes smoothly, but on Ubuntu the glance api pod gets stuck in the Terminating phase for more than 300 seconds | 01:10 |
sbezverk | I am suspecting that maybe pre-delete hook is misbehaving in ubuntu | 01:10 |
kfox1111 | weird. | 01:11 |
sbezverk | kfox1111: if you have a couple of minutes please check https://review.openstack.org/#/c/426438/15 | 01:12 |
kfox1111 | sure. will do. | 01:12 |
sbezverk | when it gets completed | 01:12 |
kfox1111 | you had a cinder one too, right? | 01:12 |
*** sdake has quit IRC | 01:13 | |
*** l4yerffeJ has quit IRC | 01:14 | |
*** l4yerffeJ has joined #openstack-kolla | 01:14 | |
sbezverk | kfox1111: yes I just added it too | 01:15 |
kfox1111 | k. yeah, would be curious if cinder has the same issue. | 01:15 |
sbezverk | I will need to step out for a couple of hours. will check later | 01:16 |
kfox1111 | k. I'll try and keep an eye on it. | 01:16 |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla: Add trove and sahara dashboard ubuntu binary https://review.openstack.org/426533 | 01:20 |
*** richwellum has joined #openstack-kolla | 01:22 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 01:22 |
*** sdake has joined #openstack-kolla | 01:31 | |
*** tonanhngo has joined #openstack-kolla | 01:34 | |
*** rhallisey has quit IRC | 01:38 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Periodic job fix 3 https://review.openstack.org/426534 | 01:45 |
*** goldyfruit has quit IRC | 01:55 | |
*** tonanhngo has quit IRC | 01:59 | |
*** richwellum has quit IRC | 02:02 | |
*** l4yerffeJ has quit IRC | 02:02 | |
*** sdake has quit IRC | 02:03 | |
*** sdake has joined #openstack-kolla | 02:12 | |
*** eaguilar has joined #openstack-kolla | 02:13 | |
*** tonanhngo has joined #openstack-kolla | 02:14 | |
*** tonanhngo has quit IRC | 02:19 | |
*** unicell has quit IRC | 02:24 | |
*** msimonin has joined #openstack-kolla | 02:28 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 02:39 |
kfox1111 | kolla-kubernetes reviewers for this please: https://review.openstack.org/#/c/426534/ | 02:40 |
Pavo | if anyone is interested in helping with stress test | 02:41 |
Pavo | http://ddi.hopto.org:8080/signup_user_complete/?id=3ufq6eprabyx3eqaqkbgwoitow | 02:41 |
*** sdake has quit IRC | 02:50 | |
*** sayantani01 has joined #openstack-kolla | 03:07 | |
*** goldyfruit has joined #openstack-kolla | 03:15 | |
*** dave-mccowan has joined #openstack-kolla | 03:27 | |
*** sdake has joined #openstack-kolla | 03:31 | |
*** tonanhngo has joined #openstack-kolla | 03:32 | |
*** eaguilar has quit IRC | 03:36 | |
*** hfu has joined #openstack-kolla | 03:37 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 03:52 |
*** dave-mccowan has quit IRC | 03:57 | |
*** sdake has quit IRC | 03:58 | |
*** sdake has joined #openstack-kolla | 04:02 | |
openstackgerrit | Merged openstack/kolla-kubernetes: Periodic job fix 3 https://review.openstack.org/426534 | 04:15 |
*** tonanhngo has quit IRC | 04:23 | |
*** tonanhngo has joined #openstack-kolla | 04:24 | |
*** tonanhngo has quit IRC | 04:28 | |
*** lamt has joined #openstack-kolla | 04:31 | |
*** adrian_otto has joined #openstack-kolla | 04:51 | |
*** lamt has quit IRC | 05:02 | |
*** adrian_otto has quit IRC | 05:03 | |
*** unicell has joined #openstack-kolla | 05:34 | |
*** goldyfruit has quit IRC | 05:35 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 05:36 |
*** lamt has joined #openstack-kolla | 06:04 | |
*** haplo37 has quit IRC | 06:05 | |
*** g3ek has quit IRC | 06:06 | |
*** haplo37 has joined #openstack-kolla | 06:09 | |
*** g3ek has joined #openstack-kolla | 06:09 | |
*** ipsecguy_ has joined #openstack-kolla | 06:10 | |
*** ipsecguy has quit IRC | 06:11 | |
*** lamt has quit IRC | 06:24 | |
*** lamt has joined #openstack-kolla | 06:34 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 06:35 |
*** saneax-_-|AFK is now known as saneax | 06:37 | |
*** jtriley has quit IRC | 06:54 | |
*** lamt has quit IRC | 06:59 | |
*** yuanying_ has joined #openstack-kolla | 07:06 | |
*** bmace_ has quit IRC | 07:08 | |
*** bmace_ has joined #openstack-kolla | 07:08 | |
*** yuanying has quit IRC | 07:09 | |
*** sayantani01 has quit IRC | 07:12 | |
*** zhubingbing has joined #openstack-kolla | 07:27 | |
zhubingbing | hello guys | 07:28 |
*** tonanhngo has joined #openstack-kolla | 07:34 | |
*** tonanhngo has quit IRC | 07:39 | |
*** saneax is now known as saneax-_-|AFK | 07:54 | |
*** skramaja has quit IRC | 07:55 | |
zhubingbing | hello guys | 07:57 |
*** skramaja_ has joined #openstack-kolla | 07:59 | |
*** zhubingbing has quit IRC | 08:00 | |
*** saneax-_-|AFK is now known as saneax | 08:06 | |
*** sdake_ has joined #openstack-kolla | 08:43 | |
*** sdake has quit IRC | 08:45 | |
*** saneax is now known as saneax-_-|AFK | 08:45 | |
*** sdake has joined #openstack-kolla | 08:49 | |
*** sdake_ has quit IRC | 08:49 | |
*** msimonin1 has joined #openstack-kolla | 08:50 | |
*** msimonin has quit IRC | 08:53 | |
*** mgoddard has joined #openstack-kolla | 09:17 | |
*** matrohon has joined #openstack-kolla | 09:47 | |
*** msimonin1 has quit IRC | 09:58 | |
*** zhubingbing has joined #openstack-kolla | 10:10 | |
*** sdake has quit IRC | 10:28 | |
*** pbourke has quit IRC | 10:31 | |
*** pbourke has joined #openstack-kolla | 10:32 | |
*** matrohon has quit IRC | 10:47 | |
*** saneax-_-|AFK is now known as saneax | 10:54 | |
*** klindgren_ has joined #openstack-kolla | 10:56 | |
*** mgoddard has quit IRC | 10:58 | |
*** klindgren has quit IRC | 10:59 | |
*** mgoddard has joined #openstack-kolla | 11:24 | |
*** yuanying_ has quit IRC | 11:39 | |
*** mgoddard has quit IRC | 11:45 | |
*** mgoddard has joined #openstack-kolla | 12:03 | |
*** mgoddard has quit IRC | 12:11 | |
*** matrohon has joined #openstack-kolla | 12:23 | |
*** sdake has joined #openstack-kolla | 12:28 | |
*** leseb has quit IRC | 12:31 | |
*** leseb has joined #openstack-kolla | 12:36 | |
*** saneax is now known as saneax-_-|AFK | 12:49 | |
*** kristian__ has joined #openstack-kolla | 12:59 | |
*** sdake has quit IRC | 13:24 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 13:37 |
*** matrohon has quit IRC | 13:44 | |
*** hfu has quit IRC | 13:51 | |
openstackgerrit | Merged openstack/kolla: remove /var/log/trove in trove dockerfile https://review.openstack.org/426254 | 13:52 |
*** hfu has joined #openstack-kolla | 13:52 | |
openstackgerrit | Merged openstack/kolla: Fix code format ceilometer_compute dockerfile https://review.openstack.org/426224 | 13:53 |
*** hfu has quit IRC | 13:53 | |
*** hfu has joined #openstack-kolla | 13:55 | |
*** hfu has quit IRC | 13:55 | |
*** crushil has joined #openstack-kolla | 14:00 | |
*** crushil has quit IRC | 14:08 | |
*** kristia__ has joined #openstack-kolla | 14:09 | |
*** kristian__ has quit IRC | 14:12 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 14:21 |
*** kristian__ has joined #openstack-kolla | 14:24 | |
*** kristia__ has quit IRC | 14:27 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 14:29 |
*** crushil has joined #openstack-kolla | 14:40 | |
*** kristian__ has quit IRC | 14:43 | |
*** kristia__ has joined #openstack-kolla | 14:43 | |
*** kristia__ has quit IRC | 14:48 | |
*** crushil has quit IRC | 14:55 | |
*** kristian__ has joined #openstack-kolla | 14:56 | |
*** tonanhngo has joined #openstack-kolla | 14:58 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 15:00 |
*** sayantani01 has joined #openstack-kolla | 15:00 | |
*** tonanhngo has quit IRC | 15:03 | |
*** goldyfruit has joined #openstack-kolla | 15:05 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 15:09 |
*** sdake has joined #openstack-kolla | 15:10 | |
*** sdake has quit IRC | 15:10 | |
*** sdake has joined #openstack-kolla | 15:10 | |
*** sdake has quit IRC | 15:15 | |
*** sdake has joined #openstack-kolla | 15:18 | |
*** kristia__ has joined #openstack-kolla | 15:22 | |
*** kristian__ has quit IRC | 15:25 | |
*** adrian_otto has joined #openstack-kolla | 15:36 | |
*** dims has quit IRC | 15:38 | |
*** adrian_otto has quit IRC | 15:40 | |
sbezverk | kfox1111: if you show up this morning, please ping me. I found the issue with Terminating, need to discuss if you are ok with the proposed solution. | 15:50 |
*** tonanhngo has joined #openstack-kolla | 15:56 | |
*** tonanhngo has quit IRC | 16:00 | |
*** adrian_otto has joined #openstack-kolla | 16:03 | |
*** adrian_otto has quit IRC | 16:07 | |
*** kristian__ has joined #openstack-kolla | 16:09 | |
*** kristia__ has quit IRC | 16:11 | |
*** simon-AS559 has joined #openstack-kolla | 16:17 | |
*** tonanhngo has joined #openstack-kolla | 16:37 | |
*** tonanhngo has quit IRC | 16:41 | |
kfox1111 | sbezverk: ping | 16:45 |
*** v1k0d3n has quit IRC | 16:46 | |
kfox1111 | whats up? | 16:47 |
*** v1k0d3n has joined #openstack-kolla | 16:47 | |
sbezverk | kfox1111: hey please check glance clean up ps | 16:48 |
sbezverk | I end up actually doing some other ;) clean up | 16:48 |
sbezverk | it is all green | 16:48 |
kfox1111 | k. looking | 16:48 |
kfox1111 | the periodic gate worked this time. :) http://tarballs.openstack.org/kolla-kubernetes/gate/containers/ | 16:48 |
kfox1111 | we have tarballs. :) | 16:49 |
sbezverk | kfox1111: nice | 16:49 |
*** unicell has quit IRC | 16:53 | |
*** v1k0d3n has quit IRC | 16:54 | |
kfox1111 | sbezverk: ah. glance-api never got the haproxy disable feature. | 16:55 |
*** v1k0d3n has joined #openstack-kolla | 16:55 | |
kfox1111 | sbezverk: alternately, it would be nice to see if we could figure out how to merge glance-api into the common template. | 16:56 |
kfox1111 | I think we just need an optional "extraThingy"'s kind of thing to make it work. | 16:57 |
*** kristia__ has joined #openstack-kolla | 16:59 | |
sbezverk | kfox1111: since this ps becomes kind of big | 16:59 |
sbezverk | I can do it as a follow up one | 16:59 |
kfox1111 | sure. | 17:00 |
sbezverk | kfox1111: bit the root cause of the issue was termination graceful period | 17:00 |
kfox1111 | found one potential major issue. | 17:00 |
kfox1111 | bit rot, how so? | 17:00 |
kfox1111 | my concern is something is actually broken now, and should be fixed. | 17:00 |
sbezverk | kfox1111: when helm deletes release, pod will be sitting there for 2 days always | 17:01 |
*** kristian__ has quit IRC | 17:01 | |
sbezverk | for the gate we need to use graceful period set to 0 | 17:01 |
kfox1111 | thats a bug for sure then. | 17:01 |
kfox1111 | shutdown should happen properly with the large grace period. if not, its broken. :/ | 17:02 |
sbezverk | kfox1111: what you saw broken? | 17:02 |
kfox1111 | if you have to set the period to 0, it is broken. | 17:02 |
kfox1111 | termination should happen once all connections are dropped, regardless of the termination delay. | 17:02 |
sbezverk | kfox1111: got it | 17:02 |
kfox1111 | the termination delay is an upper bound. | 17:02 |
kfox1111 | usually, if it takes more than a few minutes (user uploading a big image) then somethings wrong. | 17:03 |
kfox1111 | were you seeing it get stuck for multiple minutes? | 17:03 |
sbezverk | more than 5 minutes | 17:05 |
kfox1111 | hmm.... k. yeah, then somethings broken. :/ | 17:06 |
sbezverk | I bumped up to 300 sec wait for termination | 17:06 |
sbezverk | but as soon as I used 0 as grace period it started working | 17:06 |
kfox1111 | yeah. cause it skips all safe shutdown checks. :/ | 17:06 |
kfox1111 | so, do you have a log from when it took too long? | 17:07 |
*** tonanhngo has joined #openstack-kolla | 17:07 | |
sbezverk | I should have logs but remember that we delete by using helm delete | 17:08 |
sbezverk | I cannot use --force keyword | 17:08 |
sbezverk | kfox1111: http://logs.openstack.org/38/426438/13/check/gate-kolla-kubernetes-deploy-ubuntu-binary-2-ceph-nv/459389e/console.html#_2017-01-28_23_46_11_226334 | 17:09 |
kfox1111 | I'm ok with having the option. just not using it in the gate. | 17:09 |
kfox1111 | thanks. looking. | 17:10 |
kfox1111 | have you ever seen it with any of the other deployments, or just glance-api? | 17:10 |
sbezverk | same issue was with cinder | 17:11 |
sbezverk | I fixed first glance | 17:11 |
kfox1111 | k. | 17:11 |
kfox1111 | http://logs.openstack.org/38/426438/13/check/gate-kolla-kubernetes-deploy-ubuntu-binary-2-ceph-nv/459389e/logs/pods/kolla-glance-api-1088032543-g31gb.txt is interesting. | 17:12 |
sbezverk | then cinder started misbehaving | 17:12 |
kfox1111 | the readiness hook stopped working when the shutdown was requested. | 17:12 |
kfox1111 | so most of it worked.... | 17:12 |
*** absubram has joined #openstack-kolla | 17:12 | |
*** tonanhngo has quit IRC | 17:12 | |
kfox1111 | hmm... | 17:14 |
kfox1111 | so, the prestop hook on the main container waits for haproxy to exit. | 17:15 |
kfox1111 | it never kills anything though, so it uses the default termination handler on the main container... | 17:15 |
kfox1111 | haproxy looked like it shutdown ok maybe but not the main container... | 17:16 |
kfox1111 | perhaps the main containers aren't taking the terminate signal. I know some containers don't work with it out of the box. | 17:17 |
*** absubram has quit IRC | 17:17 | |
*** negronjl has joined #openstack-kolla | 17:19 | |
kfox1111 | so, yeah, the logs show the signal making it to haproxy and it shuts down. | 17:22 |
kfox1111 | If I had to guess, I'd say it was the glance-api container failing to respond to the terminate. | 17:24 |
kfox1111 | ok. can you please try adding the following to the main containers: | 17:24 |
kfox1111 | command: [/bin/bash, -c, 'trap "echo I'm dieing" TERM; kolla_start'] | 17:25 |
kfox1111 | I'm guessing that might work, or at least should tell us if the signal's making it. | 17:27 |
*** tonanhngo has joined #openstack-kolla | 17:27 | |
*** lamt has joined #openstack-kolla | 17:29 | |
kfox1111 | uh.. I guess there's a quote problem there. | 17:30 |
kfox1111 | command: [/bin/bash, -c, 'trap "echo Im dieing" TERM; kolla_start'] | 17:30 |
*** kristia__ has quit IRC | 17:31 | |
sbezverk | kfox1111: looks ok | 17:31 |
*** tonanhngo has quit IRC | 17:31 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 17:35 |
sbezverk | kfox1111: what you are trying to catch with this command? cannot say I understand it.. | 17:40 |
kfox1111 | so, normally when a pod is to be terminated, | 17:40 |
kfox1111 | each container' | 17:40 |
kfox1111 | s init process is given the TERM signal. | 17:41 |
kfox1111 | then it waits graceperiod amount of time before just kill -9'ing everything in the container to force it to die. | 17:41 |
kfox1111 | the default is 30 seconds. | 17:41 |
kfox1111 | this gives the container a chance to shutdown safely before it is hard killed. | 17:41 |
kfox1111 | some containers though don't listen to the TERM signal so they just stay alive forever. | 17:42 |
kfox1111 | usually the default 30 second timeout kicks in and shoots it anyway. | 17:42 |
kfox1111 | in the 0 case, you're just forcing it to kill -9 everything right away. | 17:42 |
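
The sequence kfox1111 describes here (TERM first, a bounded wait, then kill -9) can be sketched in Python. This is only an illustration under stated assumptions, not how Kubernetes or Docker implement it: `terminate_with_grace` and `spawn` are hypothetical helper names, the grace values are arbitrary, and a POSIX system is assumed.

```python
import signal
import subprocess
import sys
import time


def terminate_with_grace(proc, grace_seconds):
    """Send SIGTERM, wait at most grace_seconds, then fall back to SIGKILL.

    Returns (outcome, elapsed_seconds). The grace period is an upper bound:
    a process that exits on TERM returns well before it expires.
    """
    start = time.monotonic()
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        outcome = "exited on TERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()
        outcome = "needed KILL"
    return outcome, time.monotonic() - start


def spawn(ignore_term):
    """Start a child that sleeps; optionally make it ignore SIGTERM."""
    code = "import signal, time\n"
    if ignore_term:
        code += "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    code += "print('ready', flush=True)\ntime.sleep(60)\n"
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # block until the child has set up its signal handling
    return proc


if __name__ == "__main__":
    print(terminate_with_grace(spawn(ignore_term=False), 5.0))
    print(terminate_with_grace(spawn(ignore_term=True), 1.0))
```

With a grace period of 0 the wait is skipped in effect and every child is hard killed, which hides whether TERM was ever handled.
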
*** brad[] has quit IRC | 17:45 | |
sbezverk | kfox1111: do you want me to remove from the gate 0 and re-test it? | 17:46 |
kfox1111 | yes please. | 17:47 |
sbezverk | kfox1111: cool | 17:47 |
kfox1111 | it looks like its mostly working. just a minor bug to fix. | 17:47 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 17:49 |
sbezverk | kfox1111: originally I thought that graceful period is not just upper boundary but the time pod always waits before killing itself | 17:50 |
kfox1111 | ah. | 17:50 |
*** simon-AS559 has quit IRC | 18:02 | |
*** tonanhngo has joined #openstack-kolla | 18:08 | |
sbezverk | kfox1111: it did not like that command | 18:09 |
*** sdake has quit IRC | 18:10 | |
kfox1111 | hmm... | 18:11 |
sbezverk | extra ' because of I'm | 18:11 |
kfox1111 | yeah. sorry. | 18:11 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 18:11 |
*** lamt has quit IRC | 18:13 | |
sbezverk | kfox1111: :) np | 18:14 |
sbezverk | kfox1111: if I understood your idea well, you want to see this message, when a container gets sig kill or terminate? | 18:15 |
kfox1111 | yeah. | 18:16 |
kfox1111 | it may just work, if bash passes on the TERM to its children. | 18:17 |
kfox1111 | if not, then we can tweak the trap to kill the kolla_start | 18:17 |
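
As background for the "tweak the trap to kill the kolla_start" idea: bash does not relay TERM to a foreground child by itself, and it runs a trap only after that foreground command returns. A wrapper that forwards the signal explicitly could look like the Python sketch below. It is an assumption-laden illustration, not the change that was proposed in the ps: `run_forwarding_term` is a hypothetical name and the log line is invented. It must run in the main thread on a POSIX system.

```python
import os
import signal
import subprocess
import sys


def run_forwarding_term(argv):
    """Run argv as a child, relay SIGTERM to it, and return its exit status."""
    child = subprocess.Popen(argv)

    def forward(signum, frame):
        # Hypothetical log line, playing the role of the "echo" in the trap.
        print(f"wrapper {os.getpid()} got TERM, forwarding to {child.pid}", flush=True)
        child.send_signal(signum)

    previous = signal.signal(signal.SIGTERM, forward)
    try:
        # wait() is resumed after the handler runs, so it returns once the
        # child has actually exited.
        return child.wait()
    finally:
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    # Usage: python wrapper.py <command> [args...]
    sys.exit(run_forwarding_term(sys.argv[1:]) & 0xFF)
```

If the log line never appears when the pod is deleted, the TERM is not reaching the wrapper at all, which is the question the original trap was meant to answer.
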
SamYaple | kfox1111: first the container gets TERM, then it gets KILL | 18:24 |
SamYaple | as long as it isn't using sudo, it should pass through the dumbinit just fine | 18:24 |
SamYaple | it gets KILL after (by default) 10 seconds of not shutting down from a TERM | 18:24 |
kfox1111 | SamYaple: yeah, but we're not seeing that behavior. | 18:25 |
kfox1111 | k8s defaults to 30 seconds. | 18:25 |
kfox1111 | but we're overriding it to something longer, | 18:25 |
SamYaple | what is crashing (dont know the context) | 18:25 |
kfox1111 | as things like glance image uploads can take longer than 30 seconds. | 18:25 |
kfox1111 | we're seeing a case where it looks like kolla_start isn't getting the term. | 18:26 |
SamYaple | which service? | 18:26 |
kfox1111 | we're doing some more complicated orchestration. | 18:26 |
kfox1111 | so that connections complete before killing the api service. | 18:26 |
kfox1111 | the orchestration seems to be working, but the final term to the api service isnt. | 18:27 |
kfox1111 | we're double checking the logic. | 18:27 |
kfox1111 | and adding tests to the gate to ensure it doesn't break again. | 18:27 |
SamYaple | well, kolla_start doesnt get the signal, because kolla_start isn't running | 18:27 |
kfox1111 | s/kolla_start/whatever it execs. | 18:28 |
SamYaple | dumbinit should be receiving the signal, but frankly i havent ever looked into the dumbinit kolla is running | 18:28 |
SamYaple | so you should double check that | 18:28 |
kfox1111 | yeah. | 18:28 |
SamYaple | i dont know why bash isn't used to be honest | 18:28 |
SamYaple | bash reaps child processes | 18:28 |
kfox1111 | yeah. so wrapping it in bash just might fix it. | 18:29 |
SamYaple | well you dont even need to wrap it in bash | 18:29 |
SamYaple | literally remove exec and dumbinit | 18:29 |
SamYaple | thats it | 18:29 |
kfox1111 | but will python reap its children? | 18:29 |
SamYaple | no! | 18:29 |
kfox1111 | yeah. I didn't think so. | 18:29 |
SamYaple | well kind of. but mostly for what you are talking about, no | 18:30 |
SamYaple | really though, removing dumbinit and exec will mean you can trust the children are getting the proper flag | 18:30 |
kfox1111 | my understanding though, is if a process is init, it generally starts off ignoring TERM, unless it explicitly asks for it. | 18:31 |
sbezverk | kfox1111: glance-api is crashing when that command is used | 18:34 |
*** brad[] has joined #openstack-kolla | 18:34 | |
SamYaple | bash passes all signals properly, by default. maybe it can be configured to ignore signals? i dont know | 18:35 |
kfox1111 | I mean non bash. | 18:36 |
kfox1111 | bash by default always sets a term handler. | 18:36 |
kfox1111 | sbezverk: the log is weird: http://logs.openstack.org/38/426438/26/check/gate-kolla-kubernetes-deploy-ubuntu-binary-2-ceph-nv/fb3e239/logs/pods/kolla-glance-api-845267455-cpwtz-main.txt | 18:36 |
SamYaple | ah. yea. really you guys are curling a random binary from the internet https://github.com/openstack/kolla/blob/master/docker/base/Dockerfile.j2#L315 | 18:37 |
SamYaple | and that freaks me out in general | 18:37 |
kfox1111 | +1. | 18:37 |
SamYaple | i really dont know why that was let in. but i wasnt around for the discussion | 18:38 |
kfox1111 | really, docker should provide a simple init I think. | 18:38 |
kfox1111 | its such a common use case. :/ | 18:38 |
sbezverk | kfox1111: I know but some jobs were successful | 18:40 |
SamYaple | actually, strangely, dockers stance is you don't need an init, you should be restarting/recreating stuff so often that it shouldnt be a problem | 18:40 |
kfox1111 | sbezverk: yeah, really strange. | 18:40 |
SamYaple | not officially, but thats the things ive seen from devs | 18:40 |
sbezverk | and it shows that we had to wait few seconds before the pod gets terminated | 18:40 |
kfox1111 | SamYaple: yeah, but child cleanup. | 18:41 |
*** kristian__ has joined #openstack-kolla | 18:41 | |
kfox1111 | SamYaple: that stance shows a lack of understanding what init does. | 18:41 |
*** tonanhngo has quit IRC | 18:45 | |
SamYaple | kfox1111: im not disagreeing | 18:45 |
*** richwellum has joined #openstack-kolla | 18:45 | |
SamYaple | thats why i stick with bash | 18:45 |
SamYaple | kfox1111: that said, you shouldnt have so many child processes that this becomes a problem if you are recreating the container once a week or so | 18:45 |
SamYaple | i believe is the idea | 18:45 |
*** kristian__ has quit IRC | 18:45 | |
*** tonanhngo has joined #openstack-kolla | 18:46 | |
sbezverk | kfox1111: any other ideas? other than go back to use grace period 0 until figure out more generic solution? | 18:48 |
kfox1111 | sbezverk: weird... | 18:49 |
kfox1111 | SamYaple: yeah, but they still are wrong. :) | 18:49 |
kfox1111 | SamYaple: some things just tend to fork off a lot of children. its pretty common/normal. | 18:50 |
kfox1111 | sbezverk: I think something else is wrong now. | 18:50 |
*** richwellum has quit IRC | 18:50 | |
kfox1111 | it feels like more than a coincidence that the ceph ones are failing with the missing file, | 18:51 |
kfox1111 | and the iscsi ones are all passing now. | 18:51 |
SamYaple | how are you orchestrating this? | 18:51 |
sbezverk | SamYaple: two different orchestrations are used | 18:52 |
kfox1111 | part outside of k8s, part inside. | 18:52 |
sbezverk | one is deployment by microservice and one by service with dependency | 18:52 |
*** tonanhngo has quit IRC | 18:53 | |
kfox1111 | k8s has some built-in orchestration around upgrading things, by | 18:53 |
kfox1111 | having it delete old version pods and create new ones. | 18:53 |
kfox1111 | to be seamless, the delete signal on the pod should block the termination until it has no more traffic. | 18:54 |
kfox1111 | the python services had no mechanism to do that. | 18:54 |
kfox1111 | so I put haproxy in front to do safe shutdowns, as it does support connection tracking/safe shutdown. | 18:55 |
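
The safe-shutdown behaviour kfox1111 wants from the haproxy sidecar (stop taking new connections, let in-flight ones finish, give up only when the grace period runs out) can be sketched as a small Python class. `ConnectionDrainer` and its method names are hypothetical and this is not haproxy's implementation; it only illustrates the idea that the grace period is an upper bound on the wait.

```python
import threading
import time


class ConnectionDrainer:
    """Track in-flight connections and block shutdown until they finish."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0
        self._draining = False

    def try_accept(self):
        """Register a new connection; refuse it once draining has started."""
        with self._cond:
            if self._draining:
                return False
            self._active += 1
            return True

    def release(self):
        """Mark one connection as finished and wake any waiting drain()."""
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def drain(self, grace_seconds):
        """Stop accepting, then wait until idle or until the grace period ends.

        Returns True if all connections finished, False if the wait timed out.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._active == 0,
                                       timeout=grace_seconds)


if __name__ == "__main__":
    drainer = ConnectionDrainer()
    drainer.try_accept()
    # Simulate a request that completes shortly after shutdown is requested.
    threading.Timer(0.2, drainer.release).start()
    started = time.monotonic()
    print("drained cleanly:", drainer.drain(grace_seconds=30.0))
    print("waited %.1fs of a 30s grace period" % (time.monotonic() - started))
```
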
SamYaple | kfox1111: best way to do that is at the LB | 18:55 |
kfox1111 | essentially, yeah. | 18:55 |
SamYaple | haproxy/nginx _could_ do it in front, but that seems really heavy | 18:55 |
kfox1111 | a couple birds with one stone. | 18:56 |
kfox1111 | k8s provides a native l3 load balancer based on tcp. | 18:56 |
kfox1111 | iptables. | 18:56 |
kfox1111 | I then pair a haproxy listening on that, the python service on 127.0.0.1. | 18:57 |
SamYaple | yea. i am building an nginx one | 18:57 |
kfox1111 | when tied in with ssl, | 18:57 |
kfox1111 | you can do ssl termination and connection tracking at the haproxy, and then pass the unencrypted traffic safely within the node. | 18:57 |
SamYaple | so the problem with your last statement is that wont fly for any banks | 18:58 |
SamYaple | even though you are passing "encrypted" unencrypted traffic | 18:58 |
kfox1111 | why? it never crosses the network. | 18:58 |
SamYaple | we just went through this with a customer | 18:58 |
*** lamt has joined #openstack-kolla | 18:58 | |
SamYaple | "then pass the unencrypted traffic safely within the node"? | 18:58 |
SamYaple | am i misundertsanding your statement? | 18:59 |
kfox1111 | within a single node, ssl termination happens on the haproxy sidecar for that pod, then is passed via loopback to the pod's python process. | 18:59 |
SamYaple | ah. yea that would probably be ok | 19:00 |
SamYaple | im still building the nginx LB. i dont see how upgrades are going to happen without it | 19:00 |
SamYaple | and i dont want to run haproxy/nginx per service | 19:00 |
kfox1111 | for k8s, the connection tracking plus k8s's native lb are enough for all api rolling upgrades, | 19:01 |
kfox1111 | except horizon. | 19:01 |
kfox1111 | and I've got a different ps for that in flight: | 19:01 |
kfox1111 | https://review.openstack.org/#/c/333996/ | 19:01 |
*** unicell has joined #openstack-kolla | 19:02 | |
kfox1111 | the sidecar haproxy + k8s lb gives a fully distributed lb. very scalable. | 19:02 |
kfox1111 | and the lb only has to know about the service its built into, so no tweaking it to add more services. | 19:03 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 19:03 |
kfox1111 | sbezverk: I think we were pretty close before. | 19:06 |
SamYaple | kfox1111: i mean you make a good point about the distributed nature, but the lack of connection tracking and other abilities (like L7 balancing) make it a tough pill to swallow | 19:08 |
SamYaple | and you arent going to get ssl termination either. so youll still need to throw in an LB somewhere | 19:09 |
kfox1111 | connection tracking's done distributed, right at the api service. as well as optionally, ssl termination. | 19:09 |
kfox1111 | different tools for different jobs. | 19:09 |
kfox1111 | theres two types of ssl termination too. | 19:10 |
kfox1111 | some sites want both for extra security. | 19:10 |
kfox1111 | ie, | 19:10 |
SamYaple | sure, but you can't loadbalance to ssl'd hosts | 19:11 |
kfox1111 | ssl termination layer 1, centralized, with the main certs, so that the attack surface for getting the certs is smaller. | 19:11 |
kfox1111 | they ressl with an internal cert that goes to the endpoints and unterminated there. | 19:11 |
SamYaple | so you are talking ssl termination, unencrypted to the nodes, then ssl'd again? | 19:11 |
kfox1111 | no. ssl both parts. | 19:11 |
SamYaple | ok, but you can't load balance to ssl'd hosts | 19:12 |
kfox1111 | say, you have a k8s managed service, with 3 haproxy frontend tls terminators. | 19:12 |
SamYaple | so is the kubernetes LB doing ssl termination on both the external and internal side? | 19:12 |
kfox1111 | they then have https configured to talk to the backends. | 19:12 |
*** lamt has quit IRC | 19:12 | |
kfox1111 | then you have a k8s service and api pods, each doing ssl termination of their own with haproxy sidecars. | 19:12 |
kfox1111 | kubernetes's native lb only does tcp lb. | 19:13 |
kfox1111 | which is ok for this use case. | 19:13 |
*** lamt has joined #openstack-kolla | 19:13 | |
kfox1111 | I've got a graph, while wrongly labeled for this use case, might help. | 19:13 |
SamYaple | this will only be ok for stateless services. anything stateful like horizon is going to give you trouble | 19:14 |
SamYaple | unless you use a shared bakend | 19:14 |
kfox1111 | yeah. ceph. | 19:14 |
kfox1111 | horizon is a separate issue. | 19:14 |
kfox1111 | https://review.openstack.org/#/c/333996/28/doc/source/kolla-kubernetes-horizon.png,unified | 19:14 |
kfox1111 | and thats where the graphic comes from. horizon's solution. :) | 19:14 |
kfox1111 | in the non horizon case, imagine the diagram with: | 19:15 |
kfox1111 | * no backend version 2 services. | 19:15 |
kfox1111 | just the backend version 1 block. | 19:15 |
kfox1111 | "horizon frontend A/B" is really 'haproxy' | 19:16 |
SamYaple | i see what you are trying to say, and im pretty sure there is still an issue there for you | 19:16 |
SamYaple | i think kubernetes lack of LB flexibility is going to hurt you very quickly here | 19:16 |
SamYaple | but time will reveal all things | 19:16 |
kfox1111 | and horizon backend 1a/b is a "haproxy + python-api pair" | 19:17 |
SamYaple | i won't comment on horizon. i don't exactly look to them for architecture advice. they have a historically bad track record there | 19:18 |
kfox1111 | I think they provide the bare minimum flexibility to build much more flexible systems on top. | 19:18 |
kfox1111 | SamYaple: I didn't either. I assumed they will for sure break things between versions, | 19:18 |
kfox1111 | and designed a system that could handle that. | 19:18 |
kfox1111 | the only way that would work is with running multiple versions of horizon in parallel, and targeting the browser to the version it logged in on. | 19:19 |
*** tonanhngo has joined #openstack-kolla | 19:19 | |
SamYaple | i thought you said that graphic was from horizon? | 19:19 |
SamYaple | locking clients to singular horizon servers is strange | 19:20 |
kfox1111 | was using that graph as an example of how the other kind of looks. its a bit different but similar. | 19:20 |
kfox1111 | locking clients to clusters of horizon servers. | 19:20 |
kfox1111 | until they are drained. | 19:21 |
kfox1111 | so eventually the sessions will timeout and will hit a new server. | 19:21 |
*** kristian__ has joined #openstack-kolla | 19:23 | |
*** tonanhngo has quit IRC | 19:24 | |
*** lamt has quit IRC | 19:25 | |
kfox1111 | sbezverk: I'm pretty confused by the result you got. the ps doesn't seem to change anything that would make glance.keyring not show up. but it was consistently doing so. | 19:26 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Pull container tallballs into the gate https://review.openstack.org/426598 | 19:28 |
*** sdake has joined #openstack-kolla | 19:29 | |
*** lamt has joined #openstack-kolla | 19:29 | |
sbezverk | kfox1111: I had all gates green this morning, I am trying to get back to green state | 19:31 |
sbezverk | so I am reverting these changes | 19:32 |
sbezverk | and then go 1 step at the time | 19:32 |
kfox1111 | k. | 19:33 |
kfox1111 | oh... | 19:38 |
kfox1111 | looks like the selenium test fails sometimes as the browser pool is empty as its not fully up yet... | 19:39 |
kfox1111 | so, I need to wait for that... | 19:39 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 19:40 |
SamYaple | idk kfox1111. i just cant get on board with this. a random client chosen from the available hosts. no way for least-connection type load balancing. no retries for failed tcp. no L7 anything | 19:42 |
SamYaple | im trying to wrap my brain around it, but you lose a lot of assurances that make this difficult to swallow | 19:43 |
kfox1111 | SamYaple: most big sites seperate their l3 from their l7 lb's. | 19:44 |
SamYaple | ok, but there is no option for an l7 lb here | 19:45 |
SamYaple | and this isn't an l3 lb | 19:45 |
*** sayantani01 has quit IRC | 19:45 | |
kfox1111 | k8s's services are l3 lb's. | 19:45 |
SamYaple | they are not... they are l4 | 19:45 |
kfox1111 | haproxy is an l7. | 19:45 |
SamYaple | tcp/udp is l4 | 19:45 |
kfox1111 | well, ok. under that definition, | 19:46 |
kfox1111 | k8s's is an l4 lb. | 19:46 |
SamYaple | 'As of Kubernetes v1.0, Services are a "layer 4" ' | 19:46 |
SamYaple | from k8s site | 19:46 |
kfox1111 | yeah. | 19:46 |
SamYaple | ok. my point here is you lose a lot of assurances and safety with this iptables lb | 19:47 |
kfox1111 | yeah. but I don't think it matters for rest services or web sites much. | 19:47 |
SamYaple | its more distributed, sure. but its more fragile too. i cant imagine an nginx/haproxy container that spins up with a service will be _worse_ in most situations | 19:48 |
kfox1111 | its really mostly a problem if you really really must have one client hit one exact server always with multiple streams. | 19:49 |
SamYaple | and even with those, you can't make guarantees about lost packets. i know you should architect your application for retries, etc, but you should architect your architecture to not _lose_ packets in the first place | 19:49 |
kfox1111 | and thats really a bad use case. :/ | 19:49 |
kfox1111 | packets get lost. fact of life. | 19:49 |
kfox1111 | thats why tcp exists in the first place. | 19:49 |
kfox1111 | we all have accepted tcp as a fact of life too. | 19:50 |
SamYaple | which brings me to another point, this doesnt do udp either | 19:50 |
SamYaple | which something like designate would need | 19:50 |
kfox1111 | I don't know if k8s has a udp variant of svc's. | 19:50 |
SamYaple | no reason to lose more packets than needed either | 19:50 |
kfox1111 | yeah. I can see it needing it. | 19:50 |
kfox1111 | agreed. | 19:50 |
SamYaple | im just having a hard time swallowing that the iptables with haproxy sidecar is a better solution than haproxy per service is all | 19:51 |
SamYaple | thats my main hangup | 19:51 |
kfox1111 | there isn't one big reason I think but a set of smaller reasons that add up to a good reason. | 19:52 |
kfox1111 | the ssl termination right at the pod is one. | 19:52 |
kfox1111 | the seamless rolling upgrade support it adds is another. | 19:52 |
SamYaple | which you would get from haproxy per service | 19:52 |
kfox1111 | the pretty much linear scalability is another. | 19:52 |
SamYaple | the linear scalability is the part that makes me question | 19:53 |
kfox1111 | there's no orchestration though around a single haproxy. | 19:53 |
kfox1111 | the ssl termination is not cheap. having it at the pod, allows it to scale with the pods. | 19:53 |
SamYaple | yea, youre right. it would be another piece. but it can exist | 19:53 |
SamYaple | thats a good point | 19:53 |
kfox1111 | its free though in k8s land when done this way. | 19:53 |
SamYaple | distributing the load there | 19:53 |
kfox1111 | yeah. | 19:54 |
SamYaple | would this also allow for client certs? | 19:54 |
kfox1111 | the ssl termination load per pod is relatively constant. then scaled out with the pods. | 19:54 |
SamYaple | yea thats a really solid point, i agree | 19:54 |
kfox1111 | Not sure. I think you can pass the client cert info through the lb's. | 19:54 |
sdake | haproxy takes 1.5% of the total cpu utilization during 123 node scale testing | 19:54 |
sdake | on the controller nodes | 19:54 |
sdake | most of the network utilization is in the networking stack itself | 19:55 |
SamYaple | kfox1111: still, not having ssl configured at the service itself is questionable | 19:55 |
sdake | i think latency is what is important not cpu utilization | 19:55 |
sdake | carry on :) | 19:55 |
SamYaple | kfox1111: i know thats going to raise an eyebrow | 19:55 |
sdake | colocating haproxy with the pod may reduce latency significantly | 19:56 |
sdake | not really sure on that point | 19:56 |
kfox1111 | SamYaple: I think ultimately we will support both use cases. | 19:56 |
SamYaple | kfox1111: we've been talking about a lot. what is "both use cases" to you? | 19:56 |
kfox1111 | SamYaple: going back to the, microservices provide building blocks allowing you to assemble them however you want. | 19:56 |
sdake | we measured a 20% impact on typical operations - all the way up to 300% for operations like spamming keystone token gen | 19:56 |
kfox1111 | SamYaple: back to the ssl termination case we were talking about. | 19:57 |
kfox1111 | some sites want it centralized for extra security of their main ssl certs. | 19:57 |
sdake | ok gotta jet - have big house backlog :) | 19:57 |
kfox1111 | some want that, plus ssl termination on all network connections. | 19:57 |
kfox1111 | so need a secondary termination at the node. | 19:57 |
kfox1111 | so haproxy side car solves the second case. | 19:57 |
kfox1111 | l7 lb for the service, solves the first issue. | 19:58 |
kfox1111 | both are complementary. | 19:58 |
kfox1111 | and necessary for that one use case. | 19:58 |
kfox1111 | so I think we provide the building blocks to do that, and let the user decide if they want extreme security, maximum performance (latency) or somewhere in between. | 19:59 |
SamYaple | kfox1111: if connection draining could be achieved without an haproxy sidecar, with ssl termination in the app, would that be preferable? | 20:04 |
SamYaple | kfox1111: a thought, use haproxy/nginx in tcp mode. they do not do the ssl termination. they pass that tcp info through to the backend. they can still do connection draining | 20:12 |
*** matrohon has joined #openstack-kolla | 20:12 | |
sbezverk | kfox1111: I reverted the changes and glance api crash is gone.. | 20:13 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 20:17 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 20:26 |
kfox1111 | SamYaple: I'd be ok with that. ssl termination in python has thus far been kind of... bad. though maybe its gotten better. | 20:27 |
kfox1111 | SamYaple: yeah. when not configured for tls, it totally could do l4 instead of l7 and still do the job. | 20:28 |
kfox1111 | sbezverk: hmm... | 20:29 |
SamYaple | kfox1111: remember, the ssl termination is not in python. its in apache, or uwsgi | 20:29 |
SamYaple | kfox1111: but the important part is, its done at the running service itself | 20:30 |
kfox1111 | SamYaple: that just switches out haproxy for apache. | 20:30 |
kfox1111 | not sure its any less complex. | 20:30 |
kfox1111 | and its for sure less tested? | 20:30 |
SamYaple | its ok. then lets talk about your actual goal. connection draining without a massive 10 hour timeout (just to be sure) | 20:31 |
SamYaple | the ssl termination was just something you could _do_ in that setup, correct? | 20:31 |
SamYaple | or is the goal to abstract the ssl termination | 20:31 |
kfox1111 | right. | 20:32 |
kfox1111 | the initial implementation was an easy way to implement connection tracking, | 20:32 |
kfox1111 | with the ability to easily add ssl termination in the future. | 20:32 |
kfox1111 | its easy to switch it out with other things though, as the templates share one common master template where the logic all resides. | 20:32 |
SamYaple | ok. then really, i think I would lean toward a pre-stop task that changes readiness to false (so no new connections) and then watches the connections drain itself. remove nginx/haproxy entirely | 20:33 |
SamYaple | that would be the simplest, most tested solution | 20:33 |
kfox1111 | it can't do that. | 20:34 |
kfox1111 | there's no connection tracking built into k8s that you can get at. | 20:34 |
kfox1111 | hence using haproxy for that. | 20:34 |
SamYaple | pre-stop task | 20:34 |
SamYaple | it runs in the container | 20:34 |
kfox1111 | which does what? | 20:34 |
SamYaple | `ss` or any number of ways of checking if a process still has a connection | 20:35 |
kfox1111 | theres not a clean way to separate users connections from the processes own connections? | 20:35 |
kfox1111 | the python-api process might keep long running connections to mariadb or keystone that shouldn't be tracked. | 20:36 |
kfox1111 | the haproxy solution can track the connections that pass through it, but not the rest, solving that problem. | 20:37 |
kfox1111 | (the haproxy solution was a quick/easy solution to the problem. not opposed to other solutions) | 20:37 |
SamYaple | no. youre thinking of it the wrong way | 20:41 |
SamYaple | keystone listens on 5000, make sure that is drained | 20:41 |
SamYaple | you dont care about outgoing | 20:41 |
SamYaple | you can absolutely reliably make sure that 5000 has no connections | 20:41 |
kfox1111 | hm... that might work. | 20:42 |
SamYaple | that would be great to remove an external thing like that | 20:42 |
kfox1111 | yeah. | 20:43 |
kfox1111 | we already allow haproxy to be conditional. | 20:43 |
kfox1111 | in the else case, if we could figure out an ss hook that would work, we could add it to that side. | 20:44 |
kfox1111 | then when haproxy is disabled, connection tracking would still work. | 20:44 |
SamYaple | well at that point haproxy would just be overhead with no added benefit, no? | 20:44 |
kfox1111 | haproxy would still be needed for the ssl termination case. | 20:45 |
kfox1111 | which should be pretty easy to add to that side. | 20:45 |
kfox1111 | so I think its probably ok leaving the option there. | 20:45 |
SamYaple | wait, why is ssl termination an issue? all the things do ssl termination | 20:46 |
SamYaple | i thought the _issue_ was connection draining, and ssl termination was just an extra | 20:46 |
kfox1111 | yes. | 20:46 |
SamYaple | but im not seeing how haproxy ssl termination is better than apache or uwsgi | 20:46 |
kfox1111 | you're saying you've figured out reliable ssl termination in python? | 20:46 |
SamYaple | no one does ssl termination in python..... | 20:46 |
kfox1111 | because ssl termination with haproxy is much better tested than sticking python api services in apache I think. | 20:47 |
SamYaple | oh you are talking about eventlet | 20:47 |
SamYaple | isnt that deprecated everywhere? | 20:47 |
kfox1111 | yeah, eventlet. | 20:47 |
SamYaple | isnt uwsgi the recommended for most services? | 20:47 |
kfox1111 | not that I've heard. | 20:47 |
kfox1111 | the only ones I've seen not recommend eventlet are | 20:47 |
kfox1111 | keystone recommends apache explicitly for the auth plugins, | 20:48 |
kfox1111 | and horizon is commonly only deployed with apache too. | 20:48 |
kfox1111 | I think barbican is uwsgi by default, for those deploying that. | 20:48 |
kfox1111 | pretty much everything else I've seen has been eventlet. | 20:48 |
sdake | bmace_ are you about? | 20:48 |
stevemar | kfox1111: yep, you are correct | 20:49 |
SamYaple | kfox1111: keystone recommends uwsgi. | 20:49 |
kfox1111 | I'd rather use the most tested cases. so haproxy + ssl termination. | 20:49 |
SamYaple | if you _need_ the auth crap they say apache | 20:49 |
SamYaple | but here https://review.openstack.org/#/c/419706/ | 20:49 |
SamYaple | its not everywhere, but it will be in Pike. wsgi | 20:49 |
sdake | SamYaple my understanding is eventlet isn't ported to python 3.5 | 20:49 |
kfox1111 | SamYaple: ok. they may have changed that stance since I talked to them last. | 20:50 |
sdake | SamYaple and there is no intent to do so | 20:50 |
SamYaple | which means apache2 | 20:50 |
kfox1111 | SamYaple: we're far away from pike. ;) | 20:50 |
sdake | SamYaple therefore, openstack is moving away from eventlet | 20:50 |
SamYaple | kfox1111: nah its been uwsgi for like 3 cycles | 20:50 |
sdake | SamYaple i think this was on the mailing list - but i could have imagined it | 20:50 |
kfox1111 | SamYaple: ok. | 20:50 |
SamYaple | sdake: check the review i just posted. youll expect wsgi/uwsgi in pike across the board. with alot of services already having it | 20:51 |
sdake | SamYaple right - i think this is driven by the python 3.5 work | 20:51 |
stevemar | SamYaple: keystone tests with both apache (with mod_wsgi) and uwsgi itself, but still recommend apache for auth plugins | 20:51 |
SamYaple | so apache/uwsgi which are both good at ssl termination. with apache2 being much better than haproxy (which is still relatively new to the ssl termination game) | 20:51 |
sdake | i rather like the ssl termination we have at present | 20:52 |
sdake | its fast and efficient | 20:52 |
kfox1111 | SamYaple: operators tend to resist change. until its widely deployed, a bunch of operators won't be terribly convinced to switch until they have seen proof its stable. | 20:52 |
sdake | although it doesn't do internal ssl termination and i'm not sure it could be done at all | 20:52 |
kfox1111 | while eventlet's scary as heck really, | 20:53 |
SamYaple | stevemar: http://docs.openstack.org/developer/keystone/devref/development_best_practices.html#running-keystone this is what im referring to | 20:53 |
kfox1111 | its known to work pretty well for openstack. | 20:53 |
kfox1111 | uwsgi, not nearly so proven for that code base. :/ | 20:53 |
SamYaple | stevemar: most of the documentation points to uwsgi (including the best practice). so documentation is wrong | 20:53 |
sdake | SamYaple - quick q, were you planning to submit salt for pike? | 20:53 |
sdake | SamYaple the reason i ask is you will need to recruit a team :) | 20:54 |
SamYaple | sdake: no. i disagree with kolla building its own strange deliverables thing and wont be following that structure | 20:54 |
sdake | SamYaple you mean the file that inc0 put in the repo? | 20:54 |
sdake | I disliked that as well | 20:54 |
sdake | SamYaple or some other structural thing? | 20:55 |
SamYaple | i mean kolla-salt need not be related to kolla. nor kolla-ansible at this point (though im certainly not suggesting name changes) | 20:55 |
sdake | kolla needs at least one deployment tool to fulfill our mission - or we could potentially be removed from the big tent | 20:56 |
stevemar | SamYaple: ah yeah, that should be updated :) last updated almost a year ago, damn docs | 20:56 |
sdake | our mission says the word "tools" though - so multiple are fine | 20:56 |
SamYaple | stevemar: fair enough. but it has been that way for a while. I personally use apache2 because i use saml (i think weve discussed this before) | 20:57 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 20:57 |
kfox1111 | sdake: yeah, I think the assumption that there's shared containers, and separate tools is kind of wrong. | 20:57 |
kfox1111 | i see kolla kubernetes microservice packages very much like kolla's containers. | 20:57 |
stevemar | SamYaple: yep, i think so. looks like the change was just to show how to spin up keystone quickly, so it works in that example :P | 20:57 |
SamYaple | sdake: whatever you say man. but lots of good points were brought up in that ML thread. and there is no valid reason to couple the deployment with the container config at this point in time (with regards to big-tent and deliverables) | 20:57 |
kfox1111 | I think its likely that there may be multiple deployment tools that consume them. | 20:58 |
kfox1111 | so there may be sub deliverables for kolla-kubernetes. | 20:58 |
sdake | kfox1111 we have to have something working first before we shave off kolla-kubernetes into a separate big tent project | 20:59 |
sdake | kfox1111 and a ptl to lead it | 20:59 |
kfox1111 | yeah, not saying we should split it at all. | 20:59 |
sdake | kfox1111 i agree multiple deployment tools can use the kolla images -that is the whole point | 20:59 |
kfox1111 | just saying, the assumption that kolla-kubernetes is one single deliverable may be wrong. | 20:59 |
sdake | SamYaple the main reason i see is it is our mission to do so - as per the commitment the 500+ people that have worked on kolla agreed to when committing to the repo | 21:00 |
sdake | SamYaple however, I can get you don't want to play in the kolla project sandbox - works for me | 21:00 |
kfox1111 | it maybe should be one deliverable for the microservices, and separate deliverables for the orchestration tooling. | 21:00 |
sdake | kfox1111 ya we can always split stuff out later - we did it with kolla and kolla-ansible ;-) | 21:01 |
sdake | its disruptive to do so though - as a big chunk of ocata was spent making that split work | 21:01 |
sdake | so rather than suffer disruption of a repo split all through pike, I'd like to get to 1.0.0 | 21:01 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Pull container tallballs into the gate https://review.openstack.org/426598 | 21:02 |
kfox1111 | sdake: yeah, and we waited too long on that one. and got a bunch of black eyes for it I think. :/ | 21:02 |
SamYaple | this is all politics that im not going to get involved with. circling back around | 21:02 |
sdake | kfox1111 no we did not wait too long | 21:02 |
SamYaple | kfox1111: in Pike, all projects will be deployable via wsgi and uwsgi | 21:02 |
kfox1111 | if we did it earlier, it would have been such a large lift. | 21:02 |
sdake | kfox1111 the timing was perfect - especially for a four month cycle | 21:02 |
SamYaple | so that seems like the direction you would want to head for ssl termination | 21:03 |
kfox1111 | SamYaple: great. then by Q, it will probably be stable. | 21:03 |
sdake | kfox1111 disagree earlier could have potential to kill the project | 21:03 |
kfox1111 | sdake: yeah. but could have made a lot more friends too. | 21:03 |
sdake | I am off to dinner - ttyl | 21:03 |
sdake | kfox1111 we have lots o friends | 21:03 |
kfox1111 | same logic as the helm thing. | 21:03 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Add a script to build an example cloud yaml https://review.openstack.org/426417 | 21:05 |
sdake | kfox1111 your irssi client is busted again - ok gotta jet - enjoy the day folks :) | 21:07 |
kfox1111 | sdake: you keep renaming your user. ;) | 21:08 |
sdake | kfox1111 blame cisco | 21:08 |
sdake | kfox1111 vpn ftl | 21:08 |
kfox1111 | yup. I don't blame anyone. just saying, your vpn and my irc client don't play well. :/ | 21:11 |
kfox1111 | neither one's fault. :/ | 21:12 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 21:18 |
sbezverk | kfox1111: it looks like the issue is when I started re-using localVals | 21:18 |
sbezverk | as soon as I switch to re-use it I see crash in glance api | 21:19 |
sbezverk | probably it does not get re-initialized properly | 21:19 |
sbezverk | really weird, in some cases re-init works, in some it does not.. | 21:20 |
*** lamt has quit IRC | 21:23 | |
kfox1111 | sbezverk: got an example of what you mean? | 21:25 |
kfox1111 | not quite sure what you mean. | 21:26 |
*** sdake has quit IRC | 21:26 | |
sbezverk | kfox1111: localVals is used to get ceph_backend | 21:26 |
kfox1111 | do you have a before/after ps numbers? | 21:27 |
kfox1111 | so I can see what you are trying? | 21:27 |
sbezverk | I found the issue just need one thing to confirm. | 21:28 |
sbezverk | $_ := set $c1 "retDict" $localVals | 21:28 |
sbezverk | this set appends retDict value | 21:28 |
kfox1111 | it sets the retDict key in c1 to have the value localVals. | 21:29 |
sbezverk | or there could be only 1 key value passed to macro? | 21:29 |
kfox1111 | the set macro only seems to allow one key/value to be set at a time currently. :/ | 21:29 |
sbezverk | can I add 2 keys to search in one macro call? | 21:29 |
kfox1111 | not currently. | 21:29 |
sbezverk | I see in this case I must use another dict I cannot use localVals | 21:29 |
sbezverk | because it is already used to get ceph_backend | 21:30 |
kfox1111 | have to search both, then use a |default b | 21:30 |
kfox1111 | local values should be sharable. | 21:30 |
kfox1111 | just don't share c | 21:30 |
sbezverk | c is not shared | 21:30 |
kfox1111 | can you show me the code thats broken? | 21:30 |
kfox1111 | I'm doing shared local values elsewhere already I think. | 21:30 |
sbezverk | http://paste.openstack.org/show/596828/ | 21:31 |
sbezverk | I think localVals gets screwed up by second call | 21:31 |
sbezverk | and as a result glance api goes bananas | 21:31 |
kfox1111 | line 6 is killing localvalues. | 21:31 |
kfox1111 | just remove the line. | 21:32 |
*** sayantani01 has joined #openstack-kolla | 21:32 | |
kfox1111 | {{ .. := dict }} | 21:32 |
kfox1111 | dict creates an empty dict. | 21:32 |
kfox1111 | if no args. | 21:32 |
sbezverk | k that is what I was trying to get if I can extend existing or need to use new re-initialized one | 21:32 |
kfox1111 | yeah. | 21:32 |
kfox1111 | if you reuse the old one, it should work I think. | 21:33 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 21:33 |
*** Jeffrey4l_ has quit IRC | 21:33 | |
sbezverk | kfox1111: we will see soon | 21:33 |
sbezverk | when it is sorted out, what is the plan for catching term signal? | 21:34 |
*** Jeffrey4l_ has joined #openstack-kolla | 21:34 | |
kfox1111 | the previous ps you had was really close I think. | 21:38 |
kfox1111 | I'm puzzled by the missing file though. | 21:39 |
kfox1111 | we just need to figure that bit out I think. | 21:39 |
*** rhallisey has joined #openstack-kolla | 21:40 | |
*** richwellum has joined #openstack-kolla | 21:42 | |
*** rwellum has joined #openstack-kolla | 21:44 | |
*** matrohon has quit IRC | 21:45 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Pull container tallballs into the gate https://review.openstack.org/426598 | 21:47 |
*** richwellum has quit IRC | 21:47 | |
kfox1111 | sbezverk: this should be ready: https://review.openstack.org/#/c/426417/ | 21:52 |
*** lamt has joined #openstack-kolla | 21:52 | |
*** kristian__ has quit IRC | 22:01 | |
cliles | I'm looking at 60d24417df69b05c506e073bcc187c7b50edc501 and finding multiple backward incompatible changes,,,,how is this tested that is a gap between my env?? | 22:01 |
*** kristian__ has joined #openstack-kolla | 22:01 | |
*** breitz has quit IRC | 22:02 | |
*** breitz has joined #openstack-kolla | 22:03 | |
*** kristian__ has quit IRC | 22:05 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Pull container tallballs into the gate https://review.openstack.org/426598 | 22:06 |
*** lamt has quit IRC | 22:07 | |
SamYaple | cliles: can you provide a link to what you are referring to? | 22:09 |
cliles | https://github.com/openstack/kolla-ansible/commit/60d24417df69b05c506e073bcc187c7b50edc501 | 22:09 |
cliles | this has broken cinder deployment | 22:09 |
cliles | I am reverting 1 change at a time | 22:09 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Gate fix 4 https://review.openstack.org/426609 | 22:09 |
SamYaple | can you give more specifics about how its not compatible and broken? | 22:10 |
SamYaple | this "optimization" change has been rolled through most all the services now | 22:10 |
cliles | yeah, that is how I am wondering how it passed | 22:11 |
cliles | I'm just doing a git pull and pip install . | 22:11 |
cliles | then trying to reinstall latest :/ | 22:11 |
cliles | SamYaple - let me get you some pastebins | 22:12 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 22:15 |
sbezverk | kfox1111: it looks like the main container will wait for file /var/lib/kolla-kubernetes/event/shutdown to appear. 1. I saw that even helm delete was executed haproxy container was still running, as a result it would not create this file 2. If haproxy is not running at all since it is optional what will 'touch' this file for the main container to exit?? | 22:16 |
kfox1111 | sbezverk: last I looked in the review logs, the logs for the haproxy container said that haproxy exited. | 22:17 |
kfox1111 | so the file should have been touched. | 22:17 |
sbezverk | kfox1111: possibly but what about the case without haproxy? | 22:18 |
*** sdake has joined #openstack-kolla | 22:18 | |
*** sdake has quit IRC | 22:19 | |
kfox1111 | in that case, nothing touches the file, but I think you killed the prestop check in that case. | 22:19 |
kfox1111 | I'm guessing its what I think is the real cause. | 22:19 |
kfox1111 | that both ways, the main container is ignoring the TERM signal. | 22:19 |
kfox1111 | when you put in the command: .. check in there, it worked for the iscsi case. | 22:19 |
sbezverk | kfox1111: I do not kill pre-stop at least I doubt helm uses --force | 22:19 |
kfox1111 | I thought you put a conditional around it in your ps. | 22:20 |
sbezverk | right now it works for all gate jobs | 22:20 |
kfox1111 | yeah. cause its forcing delete. | 22:21 |
sbezverk | kfox1111: right my bad I did put a condition | 22:21 |
kfox1111 | interestingly, you don't force it in the iscsi case? | 22:21 |
sbezverk | nope | 22:22 |
sbezverk | all is the exactly the same way | 22:22 |
kfox1111 | thats even stranger. | 22:22 |
sbezverk | both gates call the same cleanup script | 22:22 |
kfox1111 | that in theory shouldn't be any different between ceph and iscsi. | 22:22 |
kfox1111 | oh. no, they aren't quite the same: | 22:23 |
kfox1111 | you have: | 22:23 |
kfox1111 | https://review.openstack.org/#/c/426438/31/tests/bin/ceph_workflow.sh | 22:23 |
kfox1111 | and | 22:23 |
kfox1111 | https://review.openstack.org/#/c/426438/31/tests/bin/ceph_workflow_services.sh | 22:23 |
kfox1111 | but no iscsi_workflow.sh one. | 22:23 |
sbezverk | it was there :( I must have missed it when rolled back | 22:24 |
kfox1111 | oh. ok. | 22:24 |
sbezverk | ok I will remove 0 from ceph to be the same with iscsi | 22:24 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: WIP test https://review.openstack.org/426611 | 22:26 |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 22:26 |
kfox1111 | sbezverk: just a ps to kick off a few more runs. trying to gather some more data points. | 22:26 |
sbezverk | kfox1111: ok | 22:27 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 22:39 |
*** adrian_otto1 has joined #openstack-kolla | 22:40 | |
*** sdake has joined #openstack-kolla | 22:41 | |
*** lamt has joined #openstack-kolla | 22:52 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 22:53 |
sbezverk | kfox1111: it seems to be working | 22:54 |
kfox1111 | interesting. | 22:54 |
kfox1111 | https://review.openstack.org/#/c/426438/32/tests/bin/ceph_workflow_service.sh | 22:56 |
kfox1111 | hmm. but the ceph_workflow.sh one is gone. | 22:56 |
kfox1111 | so does seem to be working. | 22:56 |
sbezverk | kfox1111: damn I guess it getting late :( I thought I removed it everywhere, well another try then.. | 22:59 |
*** ipsecguy_ has quit IRC | 22:59 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 22:59 |
*** ipsecguy has joined #openstack-kolla | 23:00 | |
kfox1111 | one little ws issue too. | 23:00 |
*** jrobinson has joined #openstack-kolla | 23:03 | |
sbezverk | kfox1111: in which file, I went through could not see it? | 23:06 |
kfox1111 | all_values | 23:08 |
kfox1111 | line 47 | 23:08 |
*** lamt has quit IRC | 23:08 | |
*** ipsecguy has quit IRC | 23:08 | |
sbezverk | yeah, thanks. for some reason it was not highlighted with red | 23:10 |
sbezverk | hard to spot in green | 23:10 |
kfox1111 | yeah. really weird. :/ | 23:10 |
kfox1111 | gerrit's pretty cool, but still has some things it needs to do better. | 23:11 |
*** newmember has joined #openstack-kolla | 23:13 | |
*** ipsecguy has joined #openstack-kolla | 23:13 | |
*** adrian_otto1 has quit IRC | 23:13 | |
*** adrian_otto has joined #openstack-kolla | 23:16 | |
*** sdake has quit IRC | 23:22 | |
openstackgerrit | Jeffrey Zhang proposed openstack/kolla-ansible: Optimize reconfigure action for horizon https://review.openstack.org/422719 | 23:24 |
*** adrian_otto has quit IRC | 23:24 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes: PS adds glance cleanup service https://review.openstack.org/426438 | 23:29 |
*** sdake has joined #openstack-kolla | 23:31 | |
*** rwellum has quit IRC | 23:32 | |
*** lrensing has joined #openstack-kolla | 23:39 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Check to see if Horizon is working in the gate. https://review.openstack.org/426025 | 23:42 |
*** sdake has quit IRC | 23:53 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes: Pull container tallballs into the gate https://review.openstack.org/426598 | 23:59 |
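
The pre-stop drain check SamYaple proposes between 20:33 and 20:41 (watch only the service's listening port, ignore the process's outgoing connections) could look roughly like the sketch below. It is an illustration only, not code from kolla-kubernetes: the function names `count_established` and `wait_for_drain` are hypothetical, it assumes the Linux `/proc/net/tcp` table layout instead of shelling out to `ss`, and it leaves out the readiness flip that would stop new connections from arriving.

```python
import time

# State code that /proc/net/tcp uses for an ESTABLISHED socket (0A is LISTEN).
TCP_ESTABLISHED = "01"


def count_established(proc_net_tcp_text, port):
    """Count ESTABLISHED sockets whose *local* port equals `port`.

    Outgoing connections (for example to mariadb) use an ephemeral local
    port, so filtering on the local port skips them.
    """
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" for the local end of the socket.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == TCP_ESTABLISHED:
            count += 1
    return count


def wait_for_drain(read_table, port, timeout_s=300.0, interval_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until no client is connected to `port` or the timeout expires.

    `read_table` is any callable returning the current table text, e.g.
    lambda: open("/proc/net/tcp").read(). Returns True once drained.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if count_established(read_table(), port) == 0:
            return True
        sleep(interval_s)
    return False
```

For keystone on port 5000, a hook script might call `wait_for_drain(lambda: open("/proc/net/tcp").read(), 5000)` and exit once it returns. IPv6 listeners would also need `/proc/net/tcp6`, which this sketch does not read.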