*** rcernin has joined #openstack-containers | 00:08 | |
*** rcernin has quit IRC | 00:55 | |
*** rcernin has joined #openstack-containers | 00:55 | |
*** xinliang has quit IRC | 02:18 | |
*** ricolin has joined #openstack-containers | 02:19 | |
*** xinliang has joined #openstack-containers | 02:48 | |
*** rcernin has quit IRC | 03:06 | |
*** rcernin has joined #openstack-containers | 03:23 | |
*** udesale has joined #openstack-containers | 04:05 | |
*** ramishra has joined #openstack-containers | 04:09 | |
*** vishalmanchanda has joined #openstack-containers | 04:13 | |
*** rcernin has quit IRC | 04:16 | |
*** rcernin has joined #openstack-containers | 04:16 | |
*** ramishra has quit IRC | 04:32 | |
*** dave-mccowan has quit IRC | 04:43 | |
*** ramishra has joined #openstack-containers | 04:46 | |
*** ykarel|away is now known as ykarel | 04:58 | |
*** rcernin is now known as rcernin|lunch | 05:01 | |
*** goldyfruit has quit IRC | 05:58 | |
*** goldyfruit has joined #openstack-containers | 05:58 | |
*** goldyfruit has quit IRC | 06:30 | |
*** goldyfruit has joined #openstack-containers | 06:30 | |
*** elenalindq has joined #openstack-containers | 06:42 | |
*** ivve has quit IRC | 07:24 | |
*** lpetrut has joined #openstack-containers | 07:31 | |
*** ianychoi has joined #openstack-containers | 07:46 | |
*** ykarel is now known as ykarel|lunch | 07:55 | |
*** xinliang has quit IRC | 08:00 | |
*** ivve has joined #openstack-containers | 08:45 | |
openstackgerrit | Feilong Wang proposed openstack/magnum master: [k8s] Fix instance ID issue with podman and autoscaler https://review.opendev.org/707336 | 08:47 |
*** xinliang has joined #openstack-containers | 08:51 | |
*** flwang1 has joined #openstack-containers | 08:57 | |
flwang1 | strigazi: brtknr: meeting in 3 mins | 08:57 |
flwang1 | #startmeeting magnum | 09:00 |
openstack | Meeting started Wed Feb 12 09:00:39 2020 UTC and is due to finish in 60 minutes. The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot. | 09:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 09:00 |
*** openstack changes topic to " (Meeting topic: magnum)" | 09:00 | |
openstack | The meeting name has been set to 'magnum' | 09:00 |
flwang1 | #topic roll call | 09:00 |
*** openstack changes topic to "roll call (Meeting topic: magnum)" | 09:00 | |
flwang1 | o/ | 09:00 |
brtknr | o/ | 09:02 |
flwang1 | brtknr: hey, how are you | 09:03 |
brtknr | good thanks, and you? | 09:03 |
flwang1 | very good | 09:03 |
flwang1 | let's wait for strigazi a bit | 09:03 |
flwang1 | brtknr: did you see my comments on your csi patch? | 09:04 |
brtknr | yes thanks for reviewing | 09:05 |
brtknr | did you see the issue on devstack? | 09:06 |
brtknr | can you leave a comment about what k8s version you used and whether it was podman or coreos etc. | 09:06 |
brtknr | i havent seen the same issue locally | 09:06 |
flwang1 | ok, will do | 09:06 |
flwang1 | i'm using v1.16.3 with podman | 09:06 |
flwang1 | and coreos | 09:06 |
flwang1 | brtknr: let's go through the agenda? | 09:08 |
brtknr | sounds good | 09:08 |
brtknr | btw instead of --kubelet-insecure-tls, maybe i can use --kubelet-certificate-authority | 09:08 |
brtknr | and use the ca for the cluster? | 09:09 |
flwang1 | brtknr: that would be great, otherwise enterprise users won't like it | 09:09 |
flwang1 | 1. Help with removing the constraint that there must be a minimum of 1 worker in a given nodegroup (including default-worker). | 09:09 |
flwang1 | have you already got any idea for this? | 09:09 |
brtknr | i can manually specify count as 0 and get the cluster to reach CREATE_COMPLETE, but i havent been able to override the value of node_count to 0; at the moment, it defaults to 1 | 09:10 |
brtknr | i havent been able to figure out where exactly this constraint is applied. any pointer would be appreciated but i realise there is no easy answer without properly digging underneath | 09:12 |
flwang1 | what do you mean manually specify? like openstack coe cluster create xxx --node-count 0 ? | 09:12 |
brtknr | that will not work because there is api level constraint | 09:12 |
brtknr | when i remove the api level constraint, the node-count still defaults to 1 | 09:12 |
flwang1 | brtknr: i see. so you mean you hacked the code to set it to 0? | 09:12 |
brtknr | i can override count in the kubecluster.yaml file | 09:13 |
brtknr | and only then cluster reaches CREATE_COMPLETE | 09:13 |
flwang1 | ah, i see. | 09:13 |
flwang1 | i would say if it works at the Heat level, then the overall idea should work | 09:13 |
flwang1 | we can manage it in magnum scope | 09:13 |
flwang1 | it would be nice if you can dig and propose a patch so that we can start to review from there | 09:14 |
brtknr | sounds good | 09:14 |
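For context on where such a constraint typically lives: Magnum's API layer validates node_count before Heat ever sees it, so allowing empty nodegroups means relaxing a check along these lines. This is a hypothetical sketch — the function names and messages are illustrative, not Magnum's actual code:

```python
# Hypothetical sketch of the API-level constraint being discussed;
# names are illustrative, not Magnum's actual validators.
def validate_node_count(node_count):
    # Assumed current behaviour: anything below 1 is rejected at the API.
    if node_count is not None and node_count < 1:
        raise ValueError("node_count must be >= 1")


def validate_node_count_allow_zero(node_count):
    # Relaxed variant: permit empty nodegroups while still
    # rejecting negative values.
    if node_count is not None and node_count < 0:
        raise ValueError("node_count must be >= 0")
```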
flwang1 | 2. metrics server CrashLoopBackOff | 09:15 |
flwang1 | is there anything we should discuss about this? | 09:15 |
brtknr | flwang1: i saw your comment | 09:17 |
brtknr | i will see if there is a way to do this without insecure-tls | 09:17 |
brtknr | there is a --kubelet-certificate-authority and --tls-cert-file option which i havent explored | 09:18 |
flwang1 | cool, i appreciate your work on this issue | 09:18 |
flwang1 | 3. the heat agent log | 09:19 |
brtknr | we're still on roll call topic btw | 09:19 |
brtknr | lol | 09:19 |
flwang1 | ah, sorry | 09:19 |
flwang1 | #topic heat agent log | 09:19 |
*** openstack changes topic to "heat agent log (Meeting topic: magnum)" | 09:19 | |
flwang1 | with this one, we probably need to wait for the investigation results from strigazi | 09:19 |
flwang1 | so let's skip it for now? | 09:19 |
brtknr | i was digging into the heat agent log yesterday as it's sometimes impossible to see what is happening while the cluster is creating | 09:20 |
brtknr | the main problem is subprocess.communicate does not provide an option to stream output | 09:20 |
flwang1 | :( | 09:21 |
brtknr | on the other hand, there may be a way to redirect stdout and stderr to a file | 09:21 |
brtknr | from wherever it's being executed | 09:22 |
brtknr | but happy to wait for what strigazi has to say | 09:22 |
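brtknr's point about subprocess.communicate() is the standard Python limitation: it buffers everything until the child exits. A minimal sketch of the file-redirection idea mentioned above — the script path and log path are illustrative, not the heat agent's real locations:

```python
import subprocess

# communicate() returns nothing until the process exits, so long-running
# fragments look silent. Handing the child a real file for stdout/stderr
# makes output land on disk as it is produced.
with open("/var/log/heat-agent-fragment.log", "ab") as log:  # illustrative path
    proc = subprocess.Popen(
        ["/bin/bash", "/var/lib/heat-config/fragment.sh"],   # illustrative cmd
        stdout=log,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
    )
    rc = proc.wait()
```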
flwang1 | cool, thanks | 09:22 |
flwang1 | move on? | 09:22 |
brtknr | sure | 09:25 |
flwang1 | #topic volume AZ | 09:27 |
*** openstack changes topic to "volume AZ (Meeting topic: magnum)" | 09:27 | |
flwang1 | brtknr: did you get a chance to review volume AZ fix https://review.opendev.org/705592 ? | 09:27 |
brtknr | yes, the logic is complicated | 09:30 |
flwang1 | brtknr: yep, i don't like it TBH, but i can't figure out a better way to solve it | 09:31 |
brtknr | how has it worked fine until now? | 09:32 |
flwang1 | sorry? | 09:32 |
flwang1 | you mean why it was working? | 09:32 |
brtknr | until now, how have we survived without this patch is what im asking | 09:33 |
flwang1 | probably because most of the companies are not using multi AZ | 09:33 |
flwang1 | without multi AZ, user won't run into this issue | 09:33 |
flwang1 | as far as i know, Nectar (jakeyip) is hacking the code | 09:34 |
flwang1 | i don't know if cern is using multi az | 09:34 |
elenalindq | Hi there, may I interrupt with a question? I tried openstack coe cluster update $CLUSTER_NAME replace node_count=3 and it failed because of lack of resources, so my stack is in UPDATE_FAILED state. I fixed the quota and tried to rerun the update command hoping it would kick it off again, but nothing happens. If I try openstack stack update <stack_id> --existing it will kick off the update, which succeeds (heat shows | 09:35 |
elenalindq | UPDATE_COMPLETE), but openstack coe cluster list still shows my stack in UPDATE_FAILED. Is there a way to rerun the update from magnum? Using OpenStack Train. | 09:35 |
brtknr | flwang1: can we not get a default value for az in the same way nova does? | 09:36 |
brtknr | elenalindq: if you are using train with an up to date CLI, you can rerun `openstack coe cluster resize <cluster_name> 3` | 09:37 |
flwang1 | brtknr: cinder can handle the "" for az | 09:37 |
brtknr | and this will reupdate the heat stack | 09:37 |
elenalindq | thank you brtknr! | 09:37 |
brtknr | flwang1: but not nova? | 09:37 |
flwang1 | brtknr: cinder can NOT handle the "" for az | 09:37 |
flwang1 | but nova can | 09:37 |
flwang1 | cinder will just return a 400 IIRC | 09:37 |
brtknr | flwang1: can we look into nova code to see how they infer a sensible default availability zone? | 09:38 |
flwang1 | you can easily test this without using multi az | 09:38 |
flwang1 | are you trying to solve this issue in cinder? | 09:38 |
brtknr | i think magnum should have an internal default for availability zone | 09:39 |
brtknr | rather than "" | 09:39 |
flwang1 | like a config option? | 09:40 |
flwang1 | then how can you set the default value for this option? | 09:40 |
flwang1 | and this may break the backward compatibility :( | 09:40 |
*** ykarel|lunch is now known as ykarel | 09:41 | |
brtknr | flwang1: hmm will cinder accept None? | 09:43 |
flwang1 | brtknr: no, based on what i tried | 09:43 |
flwang1 | you can give the patch a try and we can discuss offline | 09:44 |
flwang1 | it's a small issue but it just makes the template complicated, i understand that | 09:44 |
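The actual patch handles this inside the Heat template, but the underlying workaround is easy to state in Python: since cinder returns a 400 on an empty availability zone while nova tolerates "", omit the key entirely when it is blank. A sketch with python-cinderclient — the session and variable values are assumptions, not the patch's code:

```python
from cinderclient import client as cinder_client

cinder = cinder_client.Client("3", session=keystone_session)  # assumed session

params = {"size": 10, "name": "k8s-volume"}  # illustrative values
if availability_zone:
    # cinder rejects "" with a 400, so only pass the key when it is set;
    # nova, by contrast, treats "" as "pick a default".
    params["availability_zone"] = availability_zone
volume = cinder.volumes.create(**params)
```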
flwang1 | brtknr: let's move on | 09:46 |
flwang1 | #topic docker storage for fedora coreos | 09:46 |
*** openstack changes topic to "docker storage for fedora coreos (Meeting topic: magnum)" | 09:46 | |
flwang1 | docker storage driver for fedora coreos https://review.opendev.org/696256 | 09:46 |
flwang1 | brtknr: can you pls revisit above patch? | 09:47 |
flwang1 | strigazi: ^ | 09:47 |
brtknr | flwang1: my main issue with that patch is that we should try and use the same configure-docker-storage.sh and add a condition in there for fedora coreos | 09:51 |
brtknr | it is hard to factor out common elements when they are in different files | 09:51 |
flwang1 | brtknr: we can't use the same script. i'm kind of using the same one from coreos | 09:52 |
flwang1 | because the logic is different | 09:52 |
flwang1 | brtknr: see https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_coreos_v1/templates/fragments/configure-docker.yaml | 09:53 |
brtknr | ah ok i see what you mean | 09:54 |
brtknr | my bad | 09:54 |
brtknr | when i tested it, it worked for me | 09:55 |
flwang1 | all good, please revisit it, because we do need it for fedora coreos driver to remove the TODO :) | 09:55 |
flwang1 | let's move on, we only have 5 mins | 09:56 |
brtknr | i just realised that atomic has its own fragment | 09:56 |
brtknr | so this pattern makes sense | 09:56 |
flwang1 | #topic autoscaler podman issue | 09:56 |
*** openstack changes topic to "autoscaler podman issue (Meeting topic: magnum)" | 09:56 | |
brtknr | i am happy to take this patch as is | 09:57 |
brtknr | on the topic of autoscaler, are you guys planning to work on supporting nodegroups? | 09:57 |
flwang1 | this is a brand new bug, see https://github.com/kubernetes/autoscaler/issues/2819 | 09:57 |
flwang1 | brtknr: i'm planning to support /resize api first and then nodegroups, not sure if cern guys will take the node groups support | 09:58 |
flwang1 | brtknr: and here is the fix https://review.opendev.org/707336 | 09:58 |
flwang1 | we just need to add the volume mount for /etc/machine-id | 09:58 |
flwang1 | the bug reporter has confirmed that works for him | 09:59 |
brtknr | flwang1: excellent, looks reasonable to me | 09:59 |
brtknr | we havent started using coreos in prod yet but this kind of bug is precisely the reason why | 09:59 |
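For readers following along, the fix in https://review.opendev.org/707336 amounts to bind-mounting the host's /etc/machine-id into the autoscaler pod so it can identify the instance it runs on under podman. Sketched here with the kubernetes Python client purely for illustration — the actual change lives in Magnum's deployment template:

```python
from kubernetes import client

# Expose the host's machine-id to the pod (read-only) so the autoscaler
# can resolve which instance it is running on.
machine_id_volume = client.V1Volume(
    name="machine-id",
    host_path=client.V1HostPathVolumeSource(path="/etc/machine-id"),
)
machine_id_mount = client.V1VolumeMount(
    name="machine-id",
    mount_path="/etc/machine-id",
    read_only=True,
)
```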
brtknr | what do you mean support the resize api and then nodegroups? | 10:00 |
brtknr | doesnt it already support resize? | 10:00 |
flwang1 | brtknr: it's using the old way https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/magnum/magnum_manager_heat.go#L231 | 10:01 |
flwang1 | in short, it calls heat to remove the node and then calls the magnum update api to set the correct number, which doesn't make sense given we now have the resize api | 10:02 |
brtknr | flwang1: ah okay thats true | 10:03 |
flwang1 | i'd like to use resize to replace it | 10:03 |
brtknr | why did we not update node count on magnum directly? | 10:03 |
flwang1 | to drop the dependency with heat | 10:03 |
flwang1 | because the magnum update api can't specify which node to delete | 10:04 |
brtknr | flwang1: oh i see | 10:04 |
flwang1 | :) | 10:04 |
brtknr | but if you remove the node from heat stack and update node count, magnum doesnt remove any extra nodes? | 10:04 |
flwang1 | magnum just updates the number because the node has already been deleted | 10:05 |
flwang1 | it just "magically work" :) | 10:05 |
brtknr | i really want support for nodegroup autoscaling | 10:06 |
flwang1 | anyway, i think we all agree resize will be the right one to do this | 10:06 |
brtknr | happy to work on this but will need to learn golang first | 10:06 |
flwang1 | brtknr: then show me the code :D | 10:06 |
flwang1 | let's move on | 10:06 |
flwang1 | i'm going to close this meeting now | 10:06 |
flwang1 | we can discuss the out of box storage class offline | 10:07 |
brtknr | #topic out of box storage class? | 10:07 |
flwang1 | brtknr: it's related to this one https://review.opendev.org/676832 | 10:07 |
flwang1 | i proposed before | 10:07 |
brtknr | i see dioguerra lurking in the background | 10:07 |
flwang1 | :) | 10:09 |
flwang1 | let's end the meeting first | 10:09 |
flwang1 | #endmeeting | 10:09 |
*** openstack changes topic to "OpenStack Containers Team | Meeting: every Wednesday @ 9AM UTC | Agenda: https://etherpad.openstack.org/p/magnum-weekly-meeting" | 10:09 | |
openstack | Meeting ended Wed Feb 12 10:09:13 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 10:09 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.html | 10:09 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.txt | 10:09 |
openstack | Log: http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.log.html | 10:09 |
brtknr | flwang1: ah the post install manifest would solve a few problems for us! | 10:09 |
flwang1 | i'm so happy you like it :) | 10:09 |
flwang1 | yep, it can resolve a lot of vendor specific requirements | 10:09 |
flwang1 | actually, the design has been agreed by strigazi as well | 10:10 |
flwang1 | but we didn't push it hard | 10:10 |
flwang1 | now, we have users asking for that again, so i'm trying to pick it up and try again | 10:10 |
brtknr | flwang1: i'd really like the option to also provide this as a label | 10:11 |
brtknr | rather than just magnum.conf | 10:12 |
flwang1 | you mean put the URL as a label? | 10:13 |
flwang1 | so try to get the label first and, if that fails, fall back to magnum.conf? | 10:14 |
brtknr | flwang1: yep | 10:15 |
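The lookup order just agreed — per-cluster label first, magnum.conf as the fallback — is a one-liner; the label key and config option names here are hypothetical, not necessarily what the patch will use:

```python
# Hypothetical names: the real label/option names depend on the patch.
def get_post_install_manifest_url(cluster, conf):
    # A per-cluster label wins; the operator-wide default in magnum.conf
    # is only used when no label was supplied.
    return (cluster.labels.get("post_install_manifest_url")
            or conf.kubernetes.post_install_manifest_url)
```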
flwang1 | please leave a comment, then you will get a new patch set ;) | 10:15 |
brtknr | ive already done it | 10:16 |
flwang1 | wonderful | 10:16 |
flwang1 | i have to off now | 10:16 |
flwang1 | thank you for joining the meeting, brtknr | 10:17 |
brtknr | okay, it was good to catch up! | 10:17 |
brtknr | goodnight flwang1 | 10:17 |
flwang1 | the best thing i did in 2019 was inviting you to join the magnum team | 10:17 |
flwang1 | just wanna say thank you for all you have done for Magnum | 10:18 |
brtknr | flwang1: aw :) i am humbled! i have learnt a lot from you and strigazi! | 10:19 |
flwang1 | brtknr: cheers, ttyl | 10:19 |
*** rcernin|lunch has quit IRC | 10:22 | |
dioguerra | brtknr: i always lurk from the background, but always late :( | 10:35 |
*** rcernin|lunch has joined #openstack-containers | 10:37 | |
*** udesale has quit IRC | 11:06 | |
*** vishalmanchanda has quit IRC | 11:12 | |
brtknr | dioguerra: :) can i ask you some questions re node_count? i am wondering what forces the node_count to be a minimum of 1 | 11:25 |
*** ianychoi has quit IRC | 12:25 | |
*** udesale has joined #openstack-containers | 12:34 | |
*** ykarel is now known as ykarel|afk | 13:14 | |
*** dave-mccowan has joined #openstack-containers | 13:19 | |
*** dave-mccowan has quit IRC | 13:26 | |
*** rcernin|lunch has quit IRC | 13:50 | |
*** lbragstad has joined #openstack-containers | 14:02 | |
*** ykarel|afk is now known as ykarel | 14:06 | |
*** lbragstad has quit IRC | 14:08 | |
*** vishalmanchanda has joined #openstack-containers | 14:11 | |
*** jmlowe has joined #openstack-containers | 15:08 | |
*** lbragstad has joined #openstack-containers | 15:15 | |
*** jmlowe has quit IRC | 15:21 | |
*** lpetrut has quit IRC | 15:24 | |
*** jmlowe has joined #openstack-containers | 15:34 | |
*** ykarel is now known as ykarel|afk | 15:49 | |
*** udesale has quit IRC | 16:02 | |
*** udesale has joined #openstack-containers | 16:02 | |
*** jmlowe has quit IRC | 16:13 | |
*** ivve has quit IRC | 16:23 | |
*** ramishra has quit IRC | 16:49 | |
*** udesale has quit IRC | 16:51 | |
*** udesale has joined #openstack-containers | 16:52 | |
*** udesale has quit IRC | 17:08 | |
*** flwang has quit IRC | 17:30 | |
*** ivve has joined #openstack-containers | 17:44 | |
*** vishalmanchanda has quit IRC | 17:51 | |
*** lbragstad has left #openstack-containers | 18:15 | |
*** ykarel|afk is now known as ykarel|away | 18:39 | |
*** jmlowe has joined #openstack-containers | 18:40 | |
*** flwang1 has quit IRC | 19:45 | |
*** jmlowe has quit IRC | 20:08 | |
*** elenalindq has quit IRC | 21:01 | |
*** rcernin has joined #openstack-containers | 21:18 | |
*** jmlowe has joined #openstack-containers | 21:27 | |
*** ivve has quit IRC | 21:31 | |
*** jmlowe has quit IRC | 22:40 | |
*** jmlowe has joined #openstack-containers | 22:44 | |
*** jmlowe has quit IRC | 22:53 | |
*** jmlowe has joined #openstack-containers | 22:56 | |
*** jmlowe has quit IRC | 23:12 | |
*** goldyfruit has quit IRC | 23:45 | |
*** goldyfruit has joined #openstack-containers | 23:45 |