lxkong | strigazi: hi, i'm working on how to accelerate the launch time of magnum k8s cluster and i found there are several software deployments that are executed in sequence after kubemaster is alive. | 09:42 |
lxkong | strigazi: that's why i asked in heat channel | 09:42 |
strigazi | lxkong: give me a sec, I'll explain | 09:43 |
lxkong | sure | 09:43 |
strigazi | The improvement I have in mind is to parallelize the deployment of the master and worker nodes. Running the SDs in parallel won't help a lot. | 09:49
strigazi | lxkong: the major bottleneck is that the nodes wait for the master to pull everything. The deployment of the k8s manifests is async, so it is fast. | 09:50
strigazi | lxkong: I'm writing a PoC for this now | 09:52 |
lxkong | strigazi: in my devstack deployment, the core_dns_service takes 80s and kubernetes_dashboard takes 171s somehow | 10:00 |
lxkong | in sequence | 10:01 |
lxkong | for master and worker, i have created this https://storyboard.openstack.org/#!/story/2004573 for HA cluster | 10:01 |
lxkong | but for non-ha, the workers could start as soon as the master is created and has an ip address | 10:02
lxkong | rather than waiting until everything inside the master is finished | 10:02
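A minimal sketch of that idea in Heat template terms, with illustrative resource names (not the actual Magnum templates): the minion group depends only on the master servers being up, and takes just the master's address as input:

    kube_minions:
      type: OS::Heat::ResourceGroup
      depends_on: kube_masters   # servers only, not the master's software deployments
      properties:
        count: {get_param: number_of_minions}
        resource_def:
          type: kubeminion.yaml  # hypothetical nested template
          properties:
            kube_master_ip: {get_attr: [kube_masters, resource.0, kube_master_ip]}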
brtknr | lxkong: isn't that to do with your internet connection speed as it is mostly the time required to download containers? | 10:03 |
brtknr | strigazi: i like the idea of launching the master and workers in parallel! | 10:04 |
lxkong | brtknr: no, besides the container installation time, we also want to decrease the provisioning time for everything else | 10:04
lxkong | another k8s cluster was just created in my devstack, i can see: | 10:05 |
lxkong | https://www.irccloud.com/pastebin/xKUT4VOM/ | 10:06 |
lxkong | most of them are just `kubectl apply`, i don't know why they take so long to finish | 10:06
ricolin | IMO try decreasing the polling interval, it might help if you want it to run in a shorter time, just remember to change it back after you're done | 10:10
ricolin | https://github.com/openstack/heat-templates/blob/master/hot/software-config/boot-config/templates/fragments/os-collect-config.conf#L10 | 10:10
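For reference, the linked os-collect-config fragment is a small ini file that looks roughly like this (exact contents vary by release); polling_interval is the knob under discussion:

    [DEFAULT]
    command = os-refresh-config
    # Seconds between polls of the Heat metadata. A lower value picks up new
    # software deployments sooner, at the cost of more load on the Heat API.
    polling_interval = 30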
ricolin | strigazi, have you tried that yet? | 10:10
lxkong | ricolin: 'change it back after you're done' sounds like a hack instead of a solution :-( | 10:13 |
ricolin | lxkong, actually not a hack IMO, just a performance improvement for the vm | 10:14
ricolin | lxkong, you can always leave it with a quick polling interval | 10:14
lxkong | ricolin: you mean, in the magnum scenario, we give it a small value and revert it after setup is done? | 10:15
lxkong | but won't a small polling interval increase the api load? | 10:15
lxkong | ricolin: that may work, but not a perfect solution | 10:16 |
ricolin | lxkong, it will increase the api load to some extent. Not sure there's a better design than polling, maybe we can try to figure one out :) | 10:17
ricolin | lxkong, and yes, you should be able to change it after you're done | 10:18 |
lxkong | ricolin: actually, my original requirement is to execute some scripts inside the vm in parallel, currently we are using SDs, but it seems they don't support parallelism | 10:19
lxkong | unless we write all the scripts in one SD rather than using multiple SDs for a single vm | 10:20
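A minimal sketch of that "everything in one SD" workaround, using hypothetical fragment names (not the actual Magnum scripts):

    #!/bin/bash
    # Run the independent configuration steps concurrently inside a single
    # SoftwareDeployment, then wait for all of them before signalling Heat.
    /opt/magnum/enable-coredns.sh &
    /opt/magnum/enable-dashboard.sh &
    /opt/magnum/enable-calico.sh &
    wait  # exit (and signal Heat) only after every background step finishes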
ricolin | you can use a shell script to do that in an SD right? | 10:20
lxkong | ricolin: exactly and ugly :-) | 10:20 |
lxkong | i guess in magnum, we separate the scripts into different SDs to gain some modularity | 10:21
ricolin | lxkong, we can start discussing whether we should have that in the agent, we just have to figure out how to do it across a dozen agents | 10:22
ricolin | shell, ansible, puppet, etc | 10:22 |
lxkong | ricolin: sounds like a plan | 10:22 |
ricolin | lxkong, happy to help review and trigger a discussion | 10:23
ricolin | and it would be better if we got people to help with it | 10:23
ricolin | like partial success | 10:24 |
ricolin | we need people to do it for sure | 10:24 |
* ricolin hopes he gets more time on that | 10:24 | |
lxkong | ricolin: btw, not sure if you know anybody in the heat team who is familiar with Octavia and could help with this https://storyboard.openstack.org/#!/story/2004564? | 10:25
ricolin | ramishra, is the one | 10:26 |
lxkong | ramishra: it'd be much appreciated if you are available for this feature | 10:28 |
ricolin | lxkong, btw, it's not like everything listed in the etherpad will happen, but it would be much appreciated if you could leave your feedback in the etherpad with your name, so we can give more voice to that issue https://etherpad.openstack.org/p/heat-user-berlin | 10:29
ramishra | lxkong: you asking me to implement it? ;) I can review if someone works on it, have no time to do it myself. | 10:40 |
lxkong | ramishra: that's totally fine, thanks | 10:41 |
lxkong | strigazi: hi, do you have some time to discuss https://review.openstack.org/#/c/497144/? i'd be very happy to hear your suggestions | 10:48
brtknr | lxkong: i am currently trying to set up a simple lbaas using neutron, octavia seemed like overkill for what i was trying to do | 10:49
lxkong | but neutron-lbaas is already deprecated | 10:50
lxkong | but using neutron-lbaas is ok for this feature, we just need to add another hook for it | 10:50
brtknr | lxkong: yes, but it should still work... i think magnum uses neutron lbaas by default, you need to enable octavia explicitly in /etc/magnum/magnum.conf | 10:51 |
brtknr | I got as far as setting up the lbaas but my controller manager is complaining: http://paste.openstack.org/show/736864/ | 10:51 |
brtknr | I am currently digging around for solutions | 10:51 |
lxkong | brtknr: nope, if Octavia is deployed in your cloud, it will be used as default | 10:51
lxkong | brtknr: https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342 | 10:53 |
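A simplified sketch of the selection logic behind that line; the real implementation lives in magnum/drivers/heat/template_def.py and differs in detail, and the helper name here is illustrative:

    from keystoneauth1 import exceptions as ks_exc

    # Illustrative helper: prefer Octavia whenever the service catalog
    # advertises a 'load-balancer' endpoint, otherwise fall back to
    # neutron-lbaas. `session` is a keystoneauth1 Session.
    def default_lb_provider(session):
        try:
            session.get_endpoint(service_type='load-balancer')
            return 'octavia'
        except ks_exc.EndpointNotFound:
            return 'neutron-lbaas'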
brtknr | oh okay | 10:55 |
strigazi | sorry, I was in a meeting | 10:55 |
brtknr | what about loadbalancer for services though? | 10:55 |
brtknr | strigazi: wb | 10:56 |
brtknr | my https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342 | 10:56
brtknr | oops sorry | 10:56 |
strigazi | I'm reading the thread | 10:56 |
strigazi | lxkong: the time that matters is how much time the SDs take after the apiserver reports ok. | 10:58
brtknr | https://github.com/openstack/magnum/blob/53a1840d68382fd7bd6cc1f7c6752a37a632b50b/magnum/drivers/heat/k8s_template_def.py#L104 | 10:58 |
brtknr | looks like you're right lxkong | 10:58 |
strigazi | putting everything in one script might not help if the node performance is poor. The biggest gain is pulling everything in parallel. | 11:00
lxkong | strigazi: `pull everything in parallel` you mean for atomic images or heat sd? | 11:01 |
lxkong | strigazi: from the logs, all the SDs are executed in sequence, so the total time is about 4min after the apiserver is up and running | 11:03
lxkong | but given what those scripts do, 4min is not reasonable | 11:03
lxkong | and from ricolin's explanation, there is a param for os-collect-config `polling-interval` with default value 30, that may be the reason, but i'm not sure | 11:04 |
lxkong | os-collect-config/os-refresh-config/os-apply-config....so many things | 11:05 |
strigazi | maybe your bottleneck is somewhere else. Check what these scripts do. The times you present don't make sense to me. In our cloud the agent starts at 12:18:53 and the last SD finishes at 12:19:29 | 11:08
lxkong | strigazi: are you using the latest magnum code? | 11:09 |
strigazi | lxkong: just a warning, if you reduce the polling time, you will ddos your heat api. Just measure exactly where the time is spent. | 11:10
lxkong | my environment is a devstack with latest code for all the projects | 11:10 |
brtknr | a little while ago, when i was testing changes to magnum on devstack, the VM that i was using had its ip address throttled by openstack servers... I changed the floating ip and the problem was solved. | 11:10
strigazi | let me check my devstack, which is on master. | 11:10
lxkong | strigazi: yeah, i know | 11:10 |
strigazi | in devstack master: start Dec 07 18:29:55, end Dec 07 18:31:47 | 11:12 |
lxkong | hmm... | 11:13 |
lxkong | strigazi: do you mind pasting your heat-container-agent service log? | 11:15
lxkong | from service start until all the phases finish | 11:16
strigazi | http://paste.openstack.org/raw/736889/ | 11:21 |
lxkong | strigazi: in your cluster, you don't have calico_service and kubernetes_dashboard, right? | 11:25 |
lxkong | strigazi: btw, have you tried running `atomic install` in a multi-threaded or multi-process manner when deploying the k8s cluster? | 11:34
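What that might look like, sketched with illustrative image names and flags:

    # Prefetch the k8s system containers concurrently instead of one-by-one.
    for image in kubelet kube-apiserver kube-controller-manager kube-scheduler; do
        atomic pull --storage ostree "docker.io/openstackmagnum/${image}:${KUBE_TAG}" &
    done
    wait  # continue only once every pull has completed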
* lxkong has to get some sleep, will continue testing tomorrow | 11:38 | |
strigazi | lxkong: we don't use calico, maybe calico is slow | 12:11 |
strigazi | lxkong: I have the dashboard too | 12:13 |
strigazi | lxkong: paste.openstack.org cut my paste | 12:22
strigazi | lxkong: here it is https://paste.fedoraproject.org/paste/6-B--pO2cvkvx73-RDbXRg/raw | 12:22 |
strigazi | lxkong: https://paste.fedoraproject.org/paste/HfYsaoQLuNDpKsSJ80VCbw/raw with calico | 13:08 |
strigazi | lxkong: less than two mins | 13:09 |
brtknr | strigazi: have you started using cloud-controller-manager by any chance? | 13:53 |
openstackgerrit | Merged openstack/magnum stable/rocky: Add support for www_authenticate_uri in ContextHook https://review.openstack.org/623679 | 13:57 |
brtknr | anyone here started using cloud-controller-manager service? | 15:09 |
brtknr | with magnum? | 15:09 |
brtknr | looks like there is a service for it | 15:09 |
brtknr | https://hub.docker.com/r/openstackmagnum/openstack-cloud-controller-manager/ | 15:09 |
brtknr | is this wip? | 15:12 |
brtknr | looks like this change was abandoned in July and restored recently: https://review.openstack.org/#/c/577477/3/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh | 15:18
strigazi | brtknr: the PS i pushed works | 15:18 |
strigazi | brtknr: you can test | 15:19 |
strigazi | brtknr: i pushed PS version 3 | 15:19 |
brtknr | strigazi: nice, what prompted you to look at occm? | 15:22 |
brtknr | there is nothing with the tag v0.2.0 on dockerhub btw | 15:23 |
brtknr | occm is tagged to download v0.2.0 | 15:23 |
strigazi | brtknr: http://paste.openstack.org/show/736905/ | 15:31 |
strigazi | brtknr: I was testing things upstream for the lbaas delete hook written by lxkong. At CERN we are still not interested. | 15:32 |
strigazi | brtknr: well we might need it for the autoscaler :) | 15:33 |
strigazi | brtknr: we need it for the autoscaler, but i didn't have it in mind xD | 15:33 |
brtknr | oops, i was inspecting the wrong provider | 15:34 |
brtknr | i'm currently looking at neutron-lbaasv2, using cloud-provider=openstack hasn't worked for me. i'm checking to see if cloud-provider=external will | 15:35
strigazi | brtknr: with octavia "It works in devstack"TM | 15:35 |
brtknr | lol | 15:36 |
brtknr | are you using rocky at cern or still queens? | 15:36 |
strigazi | 40% rocky | 15:36 |
strigazi | or more | 15:36 |
strigazi | we cherry-pick only what we need, it is difficult to rebase. Our network is too special | 15:37 |
brtknr | i see | 15:38 |
strigazi | we plan to upgrade in Jan, after holidays. | 15:40 |
strigazi | brtknr: we might be in queens but we use k8s v1.12.3. | 15:41 |
brtknr | oh, how are you using v1.12.3? the openstackmagnum docker hub only goes up to v1.11.5? or are you using the upstream k8s image? | 15:44 |
brtknr | strigazi: do you need kubelet running on master for occm to work? | 15:51 |
strigazi | brtknr yes | 16:27 |
brtknr | cool neutron lbaas v2 is working now :) | 16:41 |
brtknr | with cloud-provider=external | 16:42 |
brtknr | strigazi: just tested https://review.openstack.org/#/c/577477/3 and it works great! | 16:43 |
brtknr | can we cherry-pick https://review.openstack.org/#/c/571190 to queens? | 16:51 |
brtknr | i get merge conflict when i try to do it | 16:52 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:01 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:09 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:10 |
openstackgerrit | Roberto Soares proposed openstack/magnum master: [k8s] Add vulnerability scanner https://review.openstack.org/598142 | 20:10 |
mordred | anybody know off the top of their head - using magnum with atomic hosts, is there a particular directory on each node that should be used for files that should persist across reboots? | 20:37 |
cbrumm_ | According to Red Hat's docs, /var is writable and persists through reboots | 21:01
lxkong | strigazi: hi, are you still here? Could you please give more suggestions on https://review.openstack.org/#/c/497144? I didn't fully understand your comment 'This is per coe. All other features are written per driver'. I want to make sure we agree on the plugin mechanism before I start writing docs for how to configure and test it. | 21:10
roukoswarf | does magnum allow you to set the deployment kubernetes version anywhere? | 21:15 |
lxkong | roukoswarf: `--labels kube_tag=YOUR_VERSION_HERE` | 21:16 |
lxkong | make sure you can find the version in the openstackmagnum account on docker hub if you are using the upstream images | 21:17
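For example (assuming the tag exists under the openstackmagnum Docker Hub namespace and your magnumclient supports overriding labels at create time):

    openstack coe cluster create my-cluster \
        --cluster-template k8s-fedora-atomic \
        --labels kube_tag=v1.11.5 \
        --node-count 2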
roukoswarf | ah okay, forgot about labels and didn't know where it was sourcing it | 21:17
roukoswarf | no versions beyond 1.11? is there a specific update frequency? | 21:18 |
roukoswarf | so even with kube_tag set to v1.11.5-1, i end up with nodes using v1.10.3+coreos.0, is there something i'm missing? | 22:18
mnaser | lxkong: strigazi ricolin i actually took a shot at improving the deploy time | 22:56
mnaser | it is failing, i don't know why, but if someone knows the right thing to do, it might pass | 22:57
mnaser | it involves using softwaredeploymentgroups | 22:57 |
mnaser | https://review.openstack.org/623724 | 22:58 |
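A minimal sketch of the SoftwareDeploymentGroup approach (resource names are illustrative, not the actual patch): one group references a single static SoftwareConfig and fans out one deployment per server, created in parallel:

    master_config_deployment:
      type: OS::Heat::SoftwareDeploymentGroup
      properties:
        config: {get_resource: master_config}          # one shared, static SoftwareConfig
        servers: {get_attr: [kube_masters, refs_map]}  # map of all master servers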
mnaser | btw, anyone going to be at kubecon? | 22:58 |
lxkong | mnaser: not me :-( | 23:05 |
mnaser | lxkong: aw bummer | 23:06 |
mnaser | btw, i would love reviews on this stack here -- https://review.openstack.org/#/c/623628/ | 23:07
mnaser | everything with the functional: prefix brings back functional tests for magnum and runs them in an env with nested virt! | 23:07
mnaser | also https://review.openstack.org/#/c/619643/ would be nice, because the issue it fixes breaks things for a lot of users :( | 23:09
lxkong | mnaser: thanks for proposing this https://review.openstack.org/#/c/623724/, i've added it to my review list and will test when i'm available | 23:09
mnaser | lxkong: yeah, i don't have the tools to repro right now but i think that's the direction we wanna go | 23:12
lxkong | mnaser: definitely | 23:12 |
mnaser | that way in non-ha deploys, the servers get deployed quickly and the minions can start spinning up already while softwaredeploys happen for masters | 23:12 |
mnaser | but i haven't had time to set up a dev env to hack on it, so yay for working functional tests so we can actually just see that change in CI :) | 23:12
lxkong | mnaser: what i thought is to spin up the workers just after the vm is created successfully | 23:13
lxkong | because what the workers need is just an ip address | 23:13
mnaser | lxkong: yep! and that's why ha goes up faster, because they take the lb ip address | 23:13
lxkong | in an ha cluster, it's the lb vip; for a non-ha cluster, it's master0's private ip | 23:13
mnaser | vs in non-ha we have to wait for the vm to go up | 23:13 |
mnaser | yup :D | 23:13 |
mnaser | also with the approach i took there | 23:14 |
lxkong | yeah | 23:14 |
mnaser | you reduce the # of resources you have | 23:14 |
mnaser | no need to have a softwareconfig+softwaredeployment per node | 23:14
mnaser | esp when the softwareconfigs are actually static, so it is less load on heat | 23:14
lxkong | i'll give it a try after review and testing this one https://review.openstack.org/#/c/561783/ | 23:15 |
mnaser | lxkong: awesome! :) | 23:29 |
mnaser | please do keep me updated | 23:29 |
lxkong | sure, i'll | 23:29 |
mnaser | if you need anything please feel free to ping! | 23:30
lxkong | mnaser: yep, thanks :-) | 23:30 |