*** dave-mccowan has quit IRC | 01:01 | |
*** flwang1 has quit IRC | 03:34 | |
*** ykarel|away is now known as ykarel | 04:10 | |
*** udesale has joined #openstack-containers | 05:12 | |
*** udesale has quit IRC | 05:14 | |
*** udesale has joined #openstack-containers | 05:14 | |
*** vishalmanchanda has joined #openstack-containers | 05:22 | |
brtknr | flwang1: hey | 06:10 |
---|---|---|
*** irclogbot_1 has quit IRC | 06:49 | |
*** spotz has quit IRC | 06:51 | |
*** irclogbot_1 has joined #openstack-containers | 06:52 | |
*** jhesketh has quit IRC | 06:52 | |
*** irclogbot_1 has quit IRC | 06:53 | |
*** jhesketh has joined #openstack-containers | 06:53 | |
*** irclogbot_3 has joined #openstack-containers | 06:53 | |
*** dtomasgu has quit IRC | 06:54 | |
*** dtomasgu has joined #openstack-containers | 06:54 | |
*** rcernin has quit IRC | 07:15 | |
*** belmoreira has joined #openstack-containers | 07:16 | |
*** ttsiouts has joined #openstack-containers | 07:46 | |
openstackgerrit | Bharat Kunwar proposed openstack/magnum master: fcos: Upgrade etcd to v3.4.6, use quay.io/coreos/etcd https://review.opendev.org/714719 | 08:21 |
*** ykarel is now known as ykarel|lunch | 08:30 | |
openstackgerrit | Bharat Kunwar proposed openstack/magnum master: fcos: Upgrade default flannel_tag to v0.12.0-amd64 https://review.opendev.org/714720 | 08:41 |
*** flwang1 has joined #openstack-containers | 08:49 | |
openstackgerrit | Merged openstack/magnum master: [k8s] Upgrade calico to the latest stable version https://review.opendev.org/705599 | 08:59 |
flwang1 | #startmeeting magnum | 09:01 |
openstack | Meeting started Wed Apr 1 09:01:00 2020 UTC and is due to finish in 60 minutes. The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot. | 09:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 09:01 |
*** openstack changes topic to " (Meeting topic: magnum)" | 09:01 | |
openstack | The meeting name has been set to 'magnum' | 09:01 |
flwang1 | #topic roll call | 09:01 |
*** openstack changes topic to "roll call (Meeting topic: magnum)" | 09:01 | |
flwang1 | o/ | 09:01 |
strigazi | o/ | 09:01 |
flwang1 | brtknr: | 09:02 |
brtknr | / | 09:02 |
brtknr | o/ | 09:02 |
flwang1 | cool | 09:02 |
flwang1 | before we go through the topics, just want to let you guys know, i will be serving as Magnum PTL for the V cycle | 09:03 |
flwang1 | thank you guys for your support in the last 2 cycles | 09:03 |
brtknr | thank you! | 09:04 |
strigazi | cheers | 09:04 |
flwang1 | strigazi: cheers | 09:04 |
flwang1 | #topic train 9.3.0 | 09:05 |
*** openstack changes topic to "train 9.3.0 (Meeting topic: magnum)" | 09:05 | |
flwang1 | anything else we need to discuss about the 9.3.0? | 09:05 |
brtknr | not really, just that we should add release notes for important fixes | 09:05 |
brtknr | i think there were quite a few in this release | 09:05 |
ttsiouts | ο/ | 09:05 |
brtknr | oh hi ttsiouts ! | 09:06 |
strigazi | nothing from me | 09:06 |
flwang1 | brtknr: i agree we should provide more detailed release note | 09:06 |
strigazi | I usually -1 for reno | 09:06 |
strigazi | what we missed | 09:06 |
strigazi | brtknr: explain | 09:06 |
flwang1 | brtknr means we need more detailed release note, is it? | 09:06 |
strigazi | I mean, which fixed we didn't add reno for, specifically | 09:07 |
strigazi | s/fixed/fixes/ | 09:07 |
brtknr | not more detailed, but rather add them for various fixes we merged | 09:07 |
brtknr | i will need to go search | 09:07 |
brtknr | will add it to etherpad | 09:07 |
brtknr | not worth further discussion here | 09:08 |
flwang1 | like this https://review.opendev.org/#/c/715661/ :) ? | 09:09 |
strigazi | 17c6034e Add fcct config for coreos user_data | 09:09 |
flwang1 | https://review.opendev.org/#/c/715410/ :( | 09:09 |
brtknr | flwang1: thats right | 09:09 |
brtknr | mostly me :P | 09:10 |
flwang1 | brtknr: those are your baby | 09:10 |
strigazi | 17c6034e Add fcct config for coreos user_data, if this needs a reno then every single commit needs a reno. | 09:10 |
brtknr | strigazi: more of the user facing stuff | 09:10 |
brtknr | i would disagree that 17c6034e Add fcct config for coreos user_data needs a reno | 09:10 |
strigazi | you pasted it in etherpad :) | 09:11 |
flwang1 | ok, i think we're aware of the issue now and we just need to pay more attention and case by case? | 09:11 |
brtknr | strigazi: i am deleting them one by one, | 09:12 |
brtknr | my brain can only work so fast | 09:12 |
strigazi | let's move on | 09:12 |
flwang1 | #topic health status update | 09:13 |
*** openstack changes topic to "health status update (Meeting topic: magnum)" | 09:13 | |
strigazi | anything to discuss? | 09:13 |
strigazi | brtknr: needs to review it first | 09:13 |
flwang1 | do you guys have new comments about this? | 09:13 |
strigazi | I don't | 09:14 |
brtknr | strigazi: i already added my comments once, need to retest it with the magnum_auto_healer label, havent found time, sorry | 09:14 |
flwang1 | ok, let's move on then | 09:14 |
flwang1 | #topic Multi AZ support for master nodes https://review.opendev.org/714347 | 09:14 |
*** openstack changes topic to "Multi AZ support for master nodes https://review.opendev.org/714347 (Meeting topic: magnum)" | 09:14 | |
flwang1 | strigazi: brtknr: did you get a chance to review this idea? | 09:14 |
flwang1 | strigazi: i know cern is using multi az, does it sound right to you? | 09:15 |
strigazi | without fixing add/del etcd members from etcd not heat, this makes little sense. | 09:15 |
strigazi | we don't use multi az for masters | 09:16 |
flwang1 | i know you don't, but why 'add/remove etcd' is a blocker for this one? | 09:16 |
strigazi | multi-az without NGs sounds like overengineering to me, but we discussed this before. | 09:16 |
flwang1 | this is for masters, we already have the multi az support for workers based on NG | 09:17 |
strigazi | you need to remove a dead member from etcd | 09:17 |
flwang1 | we can use auto healing | 09:17 |
strigazi | no | 09:17 |
flwang1 | why | 09:18 |
strigazi | as you want guys, do it with autohealing | 09:18 |
strigazi | If it is opt in, lgtm | 09:18 |
flwang1 | why don't you explain more to help us understand? | 09:19 |
flwang1 | i know this is not the perfect solution, but i can see the benefits | 09:20 |
strigazi | etcd uses the raft algorithm for consensus | 09:20 |
strigazi | It starts with one master and more members are added | 09:20 |
strigazi | each member contacts all the other members and they elect a leader. | 09:21 |
flwang1 | i understand your concern, you mean users may lose 1/2 etcd members in 1 az? | 09:21 |
strigazi | In magnum, we use discovery.etcd.io to advertise the IPs of all members so they can elect a leader | 09:21 |
flwang1 | let's say there are 3 az and 3 master nodes or 5 master nodes | 09:22 |
flwang1 | can you please help me understand why it doesn't work? | 09:22 |
strigazi | When you change the number of members, smth needs to drop or add the old/new member from the etcd API/leader | 09:22 |
strigazi | ^^ | 09:23 |
flwang1 | why do we have to change the number of members? when losing an AZ? | 09:23 |
strigazi | What if an etcd member is running there? | 09:24 |
strigazi | Who is going to delete it? | 09:24 |
flwang1 | sorry, why do we have to delete the member? in current design, when do we have to delete a member? | 09:25 |
brtknr | strigazi: where does an etcd member need to be deleted from? | 09:25 |
strigazi | And then, when a new member is added, how will it be added? with the discovery url it is not possible, the url is used only for bootstrap | 09:25 |
strigazi | sorry, why do we have to delete the member? in current design, when do we have to delete a member? | 09:26 |
strigazi | never ^^ we don't change master nodes | 09:26 |
strigazi | strigazi: where does an etcd member need to be deleted from? | 09:26 |
strigazi | etcd ^^ | 09:26 |
strigazi | https://etcd.io/docs/v3.3.12/op-guide/runtime-configuration/#remove-a-member | 09:27 |
brtknr | so the master_az_list should work with fixed number of masters but not when we have scalable masters | 09:27 |
strigazi | yes | 09:27 |
flwang1 | strigazi: yep, i can see where you are coming from, but i think i'm trying to fix this based on current design | 09:27 |
brtknr | does the leader election fail if a member is dead? | 09:27 |
flwang1 | and even if we could support the resized master later, my current way should also work | 09:28 |
brtknr | or will the remaining members elect a leader and carry on? | 09:28 |
strigazi | so you plan to leave it dangling | 09:28 |
strigazi | I haven't tried, but the docs don't say: stop the member and everything is good. | 09:28 |
flwang1 | strigazi: the list is automatically generated based on the new number | 09:28 |
strigazi | ok | 09:29 |
*** belmoreira has quit IRC | 09:29 | |
flwang1 | strigazi: see https://review.opendev.org/#/c/714347/2/magnum/drivers/heat/k8s_fedora_template_def.py@225 | 09:29 |
flwang1 | it's not a fixed list | 09:29 |
strigazi | ok | 09:29 |
flwang1 | are we on the same page now? do you think my current solution could work ? | 09:30 |
strigazi | I don't knwo | 09:30 |
strigazi | I don't know | 09:30 |
strigazi | let's move on, when it is complete we can test | 09:30 |
flwang1 | TBH, i haven't done full testing, but the idea is there | 09:31 |
brtknr | what i dont understand is what the multi_az_list has to do with removing etcd member | 09:31 |
flwang1 | ok, i will keep polishing it | 09:31 |
brtknr | seems like two separate issues | 09:31 |
flwang1 | brtknr: +1 | 09:31 |
strigazi | let's move on | 09:31 |
brtknr | might be missing a point here | 09:31 |
flwang1 | #topic Allowed CIDRs for master LB https://review.opendev.org/#/c/715747/ | 09:31 |
*** openstack changes topic to "Allowed CIDRs for master LB https://review.opendev.org/#/c/715747/ (Meeting topic: magnum)" | 09:31 | |
flwang1 | here is a new feature introduced in Octavia stein release | 09:31 |
flwang1 | to allow setting a CIDR list for lb | 09:32 |
brtknr | this is conditional on heat merging the equivalent change right? | 09:32 |
brtknr | i.e. its not supported in heat yet | 09:32 |
brtknr | i.e. its not supported in stable/train branch yet | 09:32 |
flwang1 | it's supported in heat master branch | 09:32 |
brtknr | yep | 09:32 |
flwang1 | i'm trying to cherrypick to train | 09:32 |
brtknr | ah i see | 09:32 |
flwang1 | since it's a very useful feature | 09:33 |
flwang1 | especially when you want to open the lb on public | 09:33 |
brtknr | i'm happy with this patch | 09:33 |
flwang1 | strigazi: i need your help https://review.opendev.org/#/c/715747/1..2/magnum/drivers/common/templates/lb_api.yaml | 09:33 |
flwang1 | how should we deal with this case? | 09:34 |
strigazi | this was added where? | 09:34 |
flwang1 | we're using old heat template version, but how can we use a good new feature in latest heat version | 09:34 |
strigazi | the feature in heat | 09:34 |
flwang1 | https://review.opendev.org/#/c/715747/2/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml | 09:35 |
brtknr | can we not upgrade the template version? | 09:35 |
strigazi | this was added where? the feature in heat | 09:35 |
flwang1 | strigazi: wait a sec | 09:35 |
flwang1 | https://review.opendev.org/#/c/715748/ | 09:35 |
strigazi | Which version we need change to | 09:35 |
strigazi | so train | 09:36 |
flwang1 | heat support this https://review.opendev.org/#/c/715748/1/heat/engine/resources/openstack/octavia/listener.py@137 | 09:36 |
strigazi | or ussuri | 09:36 |
flwang1 | but i haven't tested if heat can automatically respect the version | 09:36 |
flwang1 | yep, Ussuri now | 09:36 |
flwang1 | and i'm trying to cherry pick it to train | 09:36 |
flwang1 | let's assume it can't be cherrypicked | 09:37 |
flwang1 | strigazi: any idea? | 09:37 |
strigazi | so change to ussuri, then heat ussuri/train is a hard dep for magnum when API LB is required | 09:37 |
flwang1 | that's my concern | 09:38 |
flwang1 | that's why i'm asking you if there is a better solution | 09:38 |
flwang1 | to avoid the hard dependency | 09:38 |
strigazi | Add one more template? | 09:38 |
strigazi | fyi, at CERN it is not issue, we upgrade heat and we drop this part anyway. | 09:39 |
flwang1 | yep, another template may work | 09:39 |
brtknr | drop what part? | 09:40 |
flwang1 | i will think about it | 09:40 |
strigazi | Maybe it will be an issue, but we will add one more local patch | 09:40 |
flwang1 | will cern be interested in this feature? | 09:40 |
brtknr | is there a way to set allowed_cidr to the loadbalancer | 09:41 |
brtknr | is there a way to set allowed_cidr to the loadbalancer directly? | 09:41 |
strigazi | we don't have LBaaS. It is a very new feature with tungsten. | 09:41 |
flwang1 | brtknr: should be, but it's not handy | 09:41 |
flwang1 | are you talking about set it by python code by calling octavia client? | 09:42 |
flwang1 | brtknr: ^ | 09:42 |
brtknr | we already have train heat requirement for train magnum | 09:42 |
brtknr | because of fedora coreos | 09:42 |
strigazi | Maybe we take this in gerrit? Is there anything else for the meeting? | 09:43 |
flwang1 | right | 09:43 |
flwang1 | nothing else | 09:43 |
flwang1 | i'm good | 09:43 |
flwang1 | anything else you guys want to discuss? | 09:43 |
brtknr | ttsiouts: dioguerra said you are working on the labels | 09:43 |
strigazi | we don't have a spec yet | 09:44 |
strigazi | when can discuss it then | 09:44 |
strigazi | we can discuss it then | 09:44 |
brtknr | ok sure | 09:44 |
ttsiouts | brtknr: I'm drafting the spec | 09:44 |
ttsiouts | I'll try to upload it today | 09:44 |
brtknr | ttsiouts: sounds good :) | 09:45 |
brtknr | 1 more thing | 09:47 |
brtknr | flwang1: strigazi: what are the current limitations for enabling resizable masters | 09:47 |
flwang1 | etcd | 09:47 |
brtknr | is that it? | 09:48 |
brtknr | because we dont have a way of adding/removing members dynamically? | 09:48 |
strigazi | have you ever tried it? | 09:48 |
strigazi | you can use the heat api to change the number of masters | 09:49 |
*** osmanlicilegi has quit IRC | 09:49 | |
brtknr | strigazi: havent tried yet | 09:49 |
*** osmanlicilegi has joined #openstack-containers | 09:50 | |
flwang1 | strigazi: i'm a bit confused, do you mean calling heat api directly to resize master could work? | 09:50 |
strigazi | it will work towards failure | 09:51 |
flwang1 | right | 09:51 |
brtknr | strigazi: how do you resize directly using heat api? | 09:51 |
flwang1 | let's end the meeting | 09:51 |
openstackgerrit | Merged openstack/magnum master: Update hacking for Python3 https://review.opendev.org/716347 | 09:52 |
flwang1 | we can continue the discussion | 09:52 |
flwang1 | #endmeeting | 09:52 |
*** openstack changes topic to "OpenStack Containers Team | Meeting: every Wednesday @ 9AM UTC | Agenda: https://etherpad.openstack.org/p/magnum-weekly-meeting" | 09:52 | |
openstack | Meeting ended Wed Apr 1 09:52:13 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 09:52 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-04-01-09.01.html | 09:52 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-04-01-09.01.txt | 09:52 |
openstack | Log: http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-04-01-09.01.log.html | 09:52 |
flwang1 | thank you team | 09:52 |
*** k_mouza has joined #openstack-containers | 09:53 | |
flwang1 | brtknr: if you're interested in the resize master work, i'm happy to support you | 09:54 |
strigazi | flwang1: aren't you doing this already? | 09:54 |
brtknr | strigazi: openstack stack update --parameter number_of_masters=3 <stackid>? | 09:55 |
flwang1 | i did some investigation before, but haven't done any code | 09:55 |
strigazi | yes | 09:55 |
strigazi | Are brtknr are going to write the code? | 09:55 |
flwang1 | strigazi: i'm trying to convince him ;) | 09:56 |
brtknr | lol | 09:56 |
strigazi | brtknr: are you going to write the code? | 09:56 |
flwang1 | strigazi: i can see you're interested in this too | 09:56 |
strigazi | it seems I can't type. | 09:56 |
flwang1 | strigazi: why don't you do it? you're the best person who can do this job | 09:56 |
brtknr | I need to convince my company to pay me to do this first | 09:56 |
flwang1 | me and brtknr don't have enough knowledge to finish it | 09:57 |
strigazi | I try to not overengineer magnum and then even for me it is impossible to patch | 09:57 |
flwang1 | i will leave this topic to you guys, i have to go | 09:57 |
flwang1 | thank you guys, good to have you in the team | 09:58 |
flwang1 | stay safe | 09:58 |
brtknr | looks like the command to use is: openstack stack update --parameter number_of_masters=2 k8s-flannel-coreos-r6k223a4cny3 --existing | 09:58 |
strigazi | good night | 09:58 |
strigazi | yes | 09:58 |
brtknr | etcd has failed | 09:58 |
flwang1 | brtknr: you can get the new master node, but the etcd member may fail to join | 09:58 |
*** ykarel|lunch is now known as ykarel | 09:59 | |
strigazi | so it works as I said :) | 09:59 |
strigazi | failure reached successfully ;) | 09:59 |
strigazi | brtknr: so you are working on it | 10:00 |
brtknr | strigazi: i just wanted to see what would happen | 10:00 |
brtknr | i have no idea how to deal with this | 10:01 |
strigazi | but you can't work on it yet | 10:01 |
strigazi | Anyway, see you later, I have a meeting in a bit. | 10:02 |
strigazi | have a nice day | 10:02 |
brtknr | you too strigazi | 10:02 |
openstackgerrit | Merged openstack/magnum master: fcos: Upgrade etcd to v3.4.6, use quay.io/coreos/etcd https://review.opendev.org/714719 | 10:06 |
*** rcernin has joined #openstack-containers | 10:35 | |
cosmicsound | brtknr , i was refering to the docker service daemon if it needs to advertise the etcd | 10:35 |
openstackgerrit | Merged openstack/magnum master: fcos: Upgrade default flannel_tag to v0.12.0-amd64 https://review.opendev.org/714720 | 10:45 |
*** ttsiouts has quit IRC | 11:42 | |
*** ttsiouts has joined #openstack-containers | 11:44 | |
*** ricolin has quit IRC | 11:54 | |
*** sapd1_x has quit IRC | 12:03 | |
*** spotz has joined #openstack-containers | 12:07 | |
openstackgerrit | Theodoros Tsioutsias proposed openstack/magnum-specs master: [WIP] Magnum Labels Override https://review.opendev.org/716571 | 12:36 |
ttsiouts | strigazi, brtknr: ^^ | 12:37 |
ttsiouts | it's still wip, but all comments are more than welcome | 12:37 |
cosmicsound | brtknr , latest heat agent log | 12:38 |
cosmicsound | it fails with: Create Failed | 12:38 |
cosmicsound | Resource Create Failed: Error: Resources[0].Resources.Master Config Deployment: Deployment To Server Failed: Deploy Status Code: Deployment Exited With Non-Zero Status Code: 1 | 12:38 |
cosmicsound | while the journalctl -u heat-agent gives me: Apr 01 12:39:06 kubernetes-bi25yp3cc4ta-master-0.novalocal runc[1622]: /var/lib/os-collect-config/local-data not found. Skipping | 12:39 |
*** dave-mccowan has joined #openstack-containers | 12:50 | |
*** dave-mccowan has quit IRC | 12:54 | |
*** sapd1_x has joined #openstack-containers | 12:57 | |
k_mouza | good afternoon all! Quick question please. I'm trying to deploy a K8s cluster via Openstack Magnum (stable/rocky), based on fedora-atomic-29 images. The environment I'm working on doesn't have internet access and I've set up a local registry for the K8s components. I've set that registry in the container_infra_prefix label, but it's insecure and the stack fails when trying to pull images from | 12:59 |
k_mouza | it. I've also tried the insecure_registry flag in the template, but that's for the app images from what I can tell. Any help please? Is it possible to have a Magnum K8s deployed in an air-gapped environment? Thanks! | 12:59 |
*** ricolin_ has joined #openstack-containers | 13:01 | |
*** udesale_ has joined #openstack-containers | 13:09 | |
*** udesale has quit IRC | 13:11 | |
guilhermesp | hey brtknr flwang1 ! morning. I just applied the latest commit of master on our deployment! I'm about to try to create both v1.17 and v1.18 clusters and run conformance, but I have a little question for ya | 13:13 |
guilhermesp | how do you guys deal with upgrades? I mean, sometimes we have new labels to add in the cluster templates ( i.e https://github.com/openstack/magnum/commit/454b0f55ec1b3e1cb561317bc0d52949916078ab ) | 13:13 |
guilhermesp | and it happens that we have some public cluster templates for our customers, but I noticed that, if a cluster template is being used we can neither update it nor delete it. Which means we need to create another public cluster template to offer the latest magnum features | 13:14 |
cosmicsound | Do I need Octavia to run Magnum? | 13:16 |
guilhermesp | for multimaster, yes cosmicsound | 13:17 |
cosmicsound | I use only one master | 13:17 |
cosmicsound | although even that one keeps on failing | 13:17 |
cosmicsound | I was thinking it is mandatory | 13:19 |
cosmicsound | Since our cluster keeps on failing at start | 13:19 |
*** mgariepy has quit IRC | 13:24 | |
*** mgariepy has joined #openstack-containers | 13:25 | |
brtknr | guilhermesp: new features == new template | 13:31 |
brtknr | its by design | 13:32 |
*** ttsiouts has quit IRC | 13:38 | |
*** sapd1_x has quit IRC | 13:42 | |
*** ykarel is now known as ykarel|afk | 13:43 | |
*** ttsiouts has joined #openstack-containers | 13:48 | |
*** sapd1_x has joined #openstack-containers | 13:55 | |
guilhermesp | i see, thanks for clarifying brtknr | 13:55 |
brtknr | guilhermesp: in train, you can upgrade user cluster to a new template, after which the old cluster can be deleted | 13:57 |
guilhermesp | hum interesting... we've been trying to find ways to do this, coz nowadays we just say to customer: delete your cluster and create a new one with this new template, which makes them a bit sad | 13:58 |
*** ykarel|afk is now known as ykarel | 14:02 | |
guilhermesp | im currently using Fedora Atomic 29 [2019-08-20], should we use a newer version for current master? | 14:10 |
guilhermesp | after bumping shas to current master it seems my cluster is stuck on api_lb creation | 14:10 |
*** ttsiouts has quit IRC | 14:17 | |
*** sapd1_x has quit IRC | 14:17 | |
brtknr | guilhermesp: not sure why that would be | 14:22 |
brtknr | from using a new image | 14:22 |
guilhermesp | yeah not sure. My cluster is stuck on creating master and as I can see journalctl shows me | 14:24 |
guilhermesp | abr 01 14:24:22 v17-conformance-hfxchcas2wl6-master-0.vexxhost.local bash[3631]: E0401 14:24:22.602980 3694 kubelet.go:2263] node "v17-conformance-hfxchcas2wl6-master-0" not found | 14:24 |
guilhermesp | and | 14:25 |
guilhermesp | https://www.irccloud.com/pastebin/4qRIBg0P/ | 14:25 |
guilhermesp | i was using a relatively old sha 1115672e7284bdd77a5951e93d47b22495c33d91 | 14:27 |
guilhermesp | upgraded to 0fffdd1956af06c0f9bbcb325cbef2773816799a | 14:27 |
*** sapd1_x has joined #openstack-containers | 14:30 | |
guilhermesp | and finally the labels Im using :) | 14:32 |
guilhermesp | https://www.irccloud.com/pastebin/hFmbTHMC/ | 14:32 |
*** sapd1_x has quit IRC | 14:41 | |
*** sapd1_x has joined #openstack-containers | 14:46 | |
*** ttsiouts has joined #openstack-containers | 15:03 | |
*** sapd1_x has quit IRC | 15:15 | |
*** ykarel is now known as ykarel|away | 15:22 | |
guilhermesp | huum ok i removed that label etcd_tag': u'3.4.3' | 15:23 |
guilhermesp | now I have a cluster completed | 15:23 |
*** sapd1_x has joined #openstack-containers | 15:28 | |
*** pcaruana has quit IRC | 15:32 | |
guilhermesp | huum but UNHEALTHY . There is no kubectl command as well. Do we have to set an specific etcd_tag? | 15:37 |
brtknr | guilhermesp: are you using fedora atomic with use_podman=true? | 15:41 |
guilhermesp | yep, i remember once you guys saying etcd needs to be >= 3.4.3 right? | 15:43 |
guilhermesp | I removed etcd_tag as above and it failed, which i think is expected as it is getting an old version by default | 15:43 |
guilhermesp | I'm about to test 3.4.6 | 15:43 |
brtknr | ok | 15:43 |
brtknr | 3.4.3 should work but no harm in testing newer | 15:44 |
brtknr | whats in the error logs? | 15:44 |
brtknr | ssh into the master | 15:44 |
brtknr | now, there should be log inside /var/log/heat-config | 15:44 |
brtknr | if you are using ussuri-dev heat_container_agent tag which is the default if you are using train | 15:45 |
brtknr | 1.17.4 is the latest kube_tag | 15:46 |
brtknr | v1.17.4 | 15:46 |
brtknr | im looking at this | 15:46 |
brtknr | https://www.irccloud.com/pastebin/hFmbTHMC/ | 15:46 |
brtknr | you dont need kube_version label | 15:46 |
*** pcaruana has joined #openstack-containers | 15:48 | |
guilhermesp | hum which means then i should be using etcd_tag=3.4.3 or higher and kube_tag=1.17.4 | 15:51 |
guilhermesp | i will give a try, using these tags above + etcd_tag=3.4.6 also fails. | 15:52 |
guilhermesp | i can see the heat-config logs here | 15:52 |
guilhermesp | and HEAT_CONTAINER_AGENT_TAG=ussuri-dev | 15:52 |
brtknr | that etcd registry is unofficial and lags behind latest etcd release | 15:52 |
brtknr | i think the latest supported there is 3.4.4 | 15:53 |
guilhermesp | i will change my template to use kube_tag=1.17.4 and etcd_tag=3.4.3 | 15:53 |
brtknr | are you using 9.2.0 train? | 15:53 |
brtknr | or master? | 15:53 |
guilhermesp | current master | 15:53 |
brtknr | ah okay | 15:53 |
guilhermesp | to apply the fixes for conformance | 15:53 |
brtknr | latest master? | 15:53 |
brtknr | or recent master? | 15:53 |
guilhermesp | and also add support to v1.18 | 15:53 |
guilhermesp | 0fffdd1956af06c0f9bbcb325cbef2773816799a this hsa | 15:54 |
brtknr | this is important because we merged a change for etcd | 15:54 |
guilhermesp | sha* | 15:54 |
brtknr | ah okay | 15:54 |
brtknr | so you should use v3.4.6 | 15:54 |
brtknr | because we've changed the etcd registry | 15:55 |
brtknr | which follows the same release tag convention as main etcd development repo | 15:55 |
brtknr | v1.18 is already supported | 15:55 |
guilhermesp | yeah so currently my cluster template is using etcd_tag=3.4.3. But I guess I also need to change the kube_tag to 1.17.4 right? | 15:56 |
brtknr | in fact, you dont need to specify etcd_tag :) | 15:56 |
brtknr | its already using the latest | 15:56 |
brtknr | if you are using fedora coreos | 15:57 |
guilhermesp | fedora-atomic | 15:57 |
brtknr | i have not tested atomic with podman | 15:57 |
brtknr | atomic is eol, no more security updates | 15:57 |
guilhermesp | huuum that might be an answer then, We've been using fedora-atomic | 15:57 |
brtknr | actually if you are using use_podman, v3.4.6 etcd_tag should work | 15:58 |
brtknr | as it uses the same common script | 15:58 |
guilhermesp | for atomic? | 15:58 |
brtknr | both atomic and coreos | 15:58 |
brtknr | yes | 15:58 |
guilhermesp | hum | 15:58 |
brtknr | its just not the default value | 15:58 |
brtknr | so you have to set it manually | 15:58 |
brtknr | because use_podman is an opt-in feature in atomic | 15:58 |
brtknr | use_podman=true by default in coreos | 15:58 |
guilhermesp | i thought podman was a requirement for both atomic and coreos prior to v1.17 | 15:59 |
guilhermesp | https://www.irccloud.com/pastebin/C3HIyHnk/ | 16:01 |
guilhermesp | ok currently those are my labels that are not working with fedora atomic | 16:01 |
guilhermesp | i will remove kube_version and edit kube_tag to 1.17.4 | 16:01 |
brtknr | guilhermesp: do not remove the v from 3.4.6 | 16:03 |
brtknr | it should be v3.4.6 | 16:03 |
brtknr | not 3.4.6 | 16:03 |
guilhermesp | ah ok, fixing this | 16:03 |
guilhermesp | ok testing this one now | 16:04 |
guilhermesp | https://www.irccloud.com/pastebin/4nDJXB7E/ | 16:05 |
guilhermesp | ok create complete. Let me check how the cluster is | 16:19 |
guilhermesp | https://www.irccloud.com/pastebin/Zi5FVGxq/ | 16:20 |
guilhermesp | watching to see if they become Ready | 16:20 |
guilhermesp | for now the cluster status is UNHEALTHY | 16:21 |
guilhermesp | https://www.irccloud.com/pastebin/gMK3372i/ | 16:22 |
guilhermesp | yeah i guess it might be better for me to start using fedora coreos. Do you have an image handy for me to test brtknr ? | 16:50 |
guilhermesp | think i found it http://beta.release.core-os.net/amd64-usr/current/coreos_production_openstack_image.img.bz2 | 17:09 |
*** udesale_ has quit IRC | 17:29 | |
*** k_mouza has quit IRC | 18:03 | |
*** k_mouza has joined #openstack-containers | 18:06 | |
*** k_mouza has quit IRC | 18:19 | |
*** k_mouza has joined #openstack-containers | 18:33 | |
*** k_mouza has quit IRC | 18:34 | |
*** k_mouza has joined #openstack-containers | 18:35 | |
*** k_mouza has quit IRC | 18:39 | |
*** sapd1_x has quit IRC | 18:43 | |
brtknr | guilhermesp: not coreos, fedora coreos | 19:04 |
brtknr | you can try this script: https://github.com/stackhpc/magnum-terraform/blob/master/upload-coreos.sh | 19:05 |
guilhermesp | yeah I havent had much success with that image | 19:05 |
* guilhermesp looking | 19:05 | |
brtknr | guilhermesp: you need jq installed | 19:05 |
brtknr | it always uploads the latest stable image | 19:06 |
guilhermesp | what ive noticed in that failed cluster above is that it is using some train tags for calico, coreos, but as I'm running the latest master commit, I think those values should default to ussuri? | 19:06 |
guilhermesp | ok i will give a try with that image | 19:06 |
guilhermesp | it seems that fedora-atomic will no longer be an option in the future ( or now actually ) ? | 19:07 |
* guilhermesp uploads using script :) | 19:10 | |
guilhermesp | ok so as you said earlier no need to set label use_podman as it is by default for fedora-coreos right? | 19:21 |
guilhermesp | hum yeah it seems that image is not booting up in our cloud | 19:29 |
flwang1 | guilhermesp: it would be nice if you can share your template | 19:34 |
guilhermesp | https://www.irccloud.com/pastebin/LoODmw2E/ | 19:35 |
guilhermesp | there it is flwang1 | 19:35 |
guilhermesp | but yeah, not sure but instances with that image are stuck on cloud-init | 19:35 |
guilhermesp | ps: first time I use a fedora-coreos image | 19:36 |
guilhermesp | ive been using atomic till now so, I think fedora-coreos is preferable from now on | 19:36 |
flwang1 | guilhermesp: what's your heat version? | 19:36 |
guilhermesp | let me grab it here | 19:36 |
flwang1 | guilhermesp: fedora atomic has been EOL since last November | 19:37 |
guilhermesp | btw some logs of cloud-init of master node ( with coreos ) http://paste.openstack.org/show/791484/ | 19:37 |
guilhermesp | yeah, that was a good point, I want to switch to coreos from now on | 19:37 |
flwang1 | fedora Coreos doesn't support cloud-init, it's using Ignition | 19:38 |
guilhermesp | huum ok so that explains | 19:39 |
guilhermesp | btw these are the parameters we use for heat | 19:39 |
guilhermesp | https://www.irccloud.com/pastebin/tvqbCOgr/ | 19:40 |
guilhermesp | so flwang1 we are not able then to run fedora-coreos with cloud-init? | 19:40 |
flwang1 | guilhermesp: there is no cloud-init inside fedora-coreos | 19:41 |
flwang1 | guilhermesp: you probably have to upgrade heat to train or cherrypick patch like this https://review.opendev.org/#/c/696327/ | 19:42 |
guilhermesp | i think upgrading heat to train would be better. As I understood, with train or this patch I will be able to create the nodes with fedora-coreos through heat right? | 19:46 |
flwang1 | guilhermesp: yes | 19:48 |
*** k_mouza has joined #openstack-containers | 19:48 | |
flwang1 | guilhermesp: are you responsible for taking care of k8s/magnum at vexxhost? | 19:49 |
*** k_mouza has quit IRC | 19:50 | |
guilhermesp | yep flwang1 | 19:50 |
guilhermesp | from time to time | 19:50 |
guilhermesp | when I have a stable cluster template | 19:50 |
guilhermesp | i start testing new k8s version | 19:50 |
flwang1 | nice, then i'd like to invite you to join magnum team and our weekly meeting | 19:51 |
guilhermesp | would be important i guess. I'm missing a lot of stuff :P | 19:51 |
flwang1 | so that you can know what's happening ;) | 19:51 |
guilhermesp | yep | 19:52 |
guilhermesp | is there some sort of reminder for the meeting? | 19:52 |
flwang1 | https://etherpad.openstack.org/p/magnum-weekly-meeting guilhermesp | 19:53 |
guilhermesp | thanks flwang1 | 19:54 |
guilhermesp | well yeah I'm discussing here with mohammed what patch we should follow | 19:54 |
guilhermesp | upgrade our heat to train or just cherry pick | 19:54 |
guilhermesp | the thing is we offer magnum for some customers, and there are clouds that are still stein | 19:55 |
guilhermesp | so maybe a cherry pick would be a quick option to test fedora-coreos | 19:55 |
guilhermesp | flwang1: did you see the cluster template above? i guess the labels might be ok with fedora-coreos | 19:56 |
flwang1 | i saw that | 20:05 |
flwang1 | it shouldn't be a problem | 20:05 |
flwang1 | i think cherrypick is an easier option | 20:06 |
flwang1 | guilhermesp: ^ | 20:11 |
guilhermesp | yeah i will try to cherry pick here in the env Im working and then i will use that cluster template to see what happens :P | 20:13 |
guilhermesp | i will tell you my results | 20:13 |
flwang1 | guilhermesp: no problem | 20:17 |
flwang1 | brtknr: re the health status update patch, what's the document you would like to see in the user guide? | 20:18 |
brtknr | flwang1: The effect magnum auto healer has on the poller | 20:19 |
flwang1 | brtknr: ok, i see. | 20:19 |
flwang1 | i will add a section to explain the current health monitoring | 20:19 |
brtknr | flwang1: are you able to provide an auto healer Docker image that I can deploy to test this feature? | 20:24 |
*** flwang1 has quit IRC | 20:25 | |
*** flwang1 has joined #openstack-containers | 20:28 | |
flwang1 | https://hub.docker.com/r/k8scloudprovider/magnum-auto-healer/tags | 20:28 |
flwang1 | brtknr: ^ | 20:28 |
guilhermesp | btw flwang1 brtknr is this going to be backported to train? https://review.opendev.org/#/c/716420/ | 20:35 |
flwang1 | i need to think about it, but i can't see why not | 20:37 |
guilhermesp | yeah, as I was getting a v1.17 + fedora atomic with train 9.2.0, i think if we have this commit in train I can keep using magnum train and have support for v1.17 ( + conformance ) until we decide to upgrade our envs | 20:39 |
*** vishalmanchanda has quit IRC | 20:39 | |
flwang1 | guilhermesp: yep, but are you sure your customer will be happy using an EOL operating system for their k8s? | 20:40 |
flwang1 | so my personal suggestion is starting to consider the upgrade :) | 20:40 |
guilhermesp | yeah that would be good to take into consideration | 20:41 |
guilhermesp | i still need some of the mohammed's time to discuss :P | 20:41 |
guilhermesp | just thinking in some option, but yeah good point flwang1 | 20:41 |
flwang1 | :) let me know if you need any help | 20:42 |
*** bline has joined #openstack-containers | 21:09 | |
*** N3l1x has joined #openstack-containers | 21:15 | |
*** ttsiouts has quit IRC | 21:22 | |
*** ttsiouts has joined #openstack-containers | 21:54 | |
*** ttsiouts has quit IRC | 21:59 | |
*** aspiers has quit IRC | 22:08 | |
*** k_mouza has joined #openstack-containers | 22:15 | |
*** flwang1 has quit IRC | 22:22 | |
*** ttsiouts has joined #openstack-containers | 22:36 | |
*** aspiers has joined #openstack-containers | 22:40 | |
*** k_mouza has quit IRC | 23:30 | |
*** flwang1 has joined #openstack-containers | 23:43 | |
*** flwang1 has quit IRC | 23:45 |
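
Note on the multi-AZ master discussion in the meeting above (09:14 to 09:31): brtknr asked whether the remaining etcd members can still elect a leader when one member is dead, and flwang1 asked about 3 AZs with 3 or 5 masters. Raft needs a strict majority of the configured members, so the arithmetic can be sketched as below. This is only an illustration: the round-robin placement, the zone names, and all function names are assumptions made for this example, not what the patch under review does. It also says nothing about strigazi's separate point that a dead member still has to be removed from etcd and a replacement added through the etcd API.

```python
from collections import Counter
from typing import Sequence


def quorum(members: int) -> int:
    """Smallest strict majority of a Raft group with `members` voting members."""
    return members // 2 + 1


def spread_round_robin(masters: int, zones: Sequence[str]) -> Counter:
    """Count masters per zone, assuming a round-robin placement (hypothetical policy)."""
    return Counter(zones[i % len(zones)] for i in range(masters))


def survives_zone_loss(masters: int, zones: Sequence[str]) -> bool:
    """True if losing any single zone still leaves a majority of etcd members."""
    placement = spread_round_robin(masters, zones)
    worst_loss = max(placement.values())  # the zone holding the most masters
    return masters - worst_loss >= quorum(masters)


if __name__ == "__main__":
    zones = ["az1", "az2", "az3"]  # hypothetical zone names
    for n in (1, 3, 5):
        print(n, quorum(n), survives_zone_loss(n, zones))
```

Under these assumptions, 3 or 5 masters spread over 3 zones keep a majority after losing one zone, while 3 masters spread over only 2 zones do not, because one zone then holds 2 of the 3 members.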