21:04:15 #startmeeting magnum
21:04:16 Meeting started Tue Aug 27 21:04:15 2019 UTC and is due to finish in 60 minutes. The chair is flwang. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:04:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:04:19 The meeting name has been set to 'magnum'
21:04:29 #topic roll call
21:04:33 o/
21:04:45 brtknr: jakeyip:
21:04:51 anyone else online?
21:05:36 strigazi: ok, let's start first
21:05:48 #topic flannel conformance
21:06:02 strigazi: did you see my email?
21:06:32 The nic patch is definitely an issue for master branch
21:06:42 after that I get internal IPs
21:06:42 strigazi: yes
21:06:49 ok, cool
21:06:57 I'm looking into sec groups now
21:07:02 have you completed another sonobuoy test run?
21:07:06 and the iptables patch we dropped
21:07:28 I'm just checking one group of tests regarding DNS
21:07:44 sonobuoy run --e2e-focus "DNS"
21:07:56 strigazi: ok, did you see my last comment on https://review.opendev.org/#/c/668163/?
21:07:57 this covers the network usually
21:08:19 when this passes the rest should work
21:08:32 at least based on my testing, the iptables patch doesn't help
21:08:48 strigazi: so you also got 10 test cases failed, right?
21:09:07 yes, do the same pass for calico?
21:09:12 yes
21:09:14 for master branch?
21:09:17 yes
21:09:27 so calico is the issue :)
21:09:35 :D
21:09:41 http://paste.openstack.org/show/763160/
21:09:53 can you pls check if you got the same 10 cases?
21:10:07 I haven't left one to finish
21:10:19 but the DNS one I have them
21:10:25 *ones
21:10:57 ok
21:11:29 you mean this one [Fail] [sig-network] DNS [It] should provide DNS for the cluster [Conformance] ?
21:11:41 yes
21:11:52 and for services
21:12:24 right
21:12:53 anyway, tomorrow I guess I'll have it working.
21:13:23 why is calico working? is it not affected by the NIC patch?
21:13:25 fantastic
21:13:36 strigazi: it's also blocked by the nic
21:13:50 so it doesn't work for master
21:14:02 when i said calico working, i mean the test i did about several weeks ago
21:14:12 at that moment, the nic patch hadn't merged yet
21:14:51 ok
21:15:07 strigazi: this patch should be able to fix the regression issue https://review.opendev.org/678067
21:15:18 i will check with brtknr if it's ready for testing
21:16:32 strigazi: shall we move to next topic?
21:16:48 ok
21:16:57 #topic fedora coreos 30
21:17:25 yesterday, i managed to get the ssh key, hostname and openstack-ca working for the new fedora coreos 30 image
21:17:37 today i will work on the heat-container-agent part
21:17:45 ok
21:18:35 o/
21:18:41 btw, i can't remember how the cfn-init-data is written into the instance, can you pls remind me?
21:18:53 brtknr: hey
21:19:00 heat appends them in cloud-init user-data
21:19:01 apologies, i was at the cinema
21:19:29 strigazi: ah, i see. so we may have to inject it by ignition "manually"?
21:19:45 in our case this file will need to be crafted and injected as user data
21:19:48 flwang: it's ready for testing
21:20:01 flwang: https://review.opendev.org/#/c/678067/ this patch
21:20:04 strigazi: i see. i will try
21:20:11 brtknr: thanks for the confirmation
21:22:21 i will update the fedora coreos 30 work with you guys later when there is any progress
21:22:36 #topic rolling upgrade
21:23:18 so far the rolling upgrade patch for node operating system has passed my testing, https://review.opendev.org/669593 it would be nice if you guys can start reviewing it
21:23:57 flwang: I'll test it tomorrow
21:24:03 the other thing i'd like to test is, if it can support migrating from fedora atomic 29 to fedora coreos 30, given they're all based on (rpm-)ostree
21:24:12 strigazi: ^ any comments?
21:24:31 flwang: I don't know if it is possible
21:25:01 maybe it is
21:25:37 strigazi: anyway, we still need this upgrade to support user upgrade for fedora atomic
21:25:56 flwang: I remember seeing on #fedora-coreos channel that they recommend users to rebuild instances instead of trying to upgrade
21:26:00 no matter whether it's fedora atomic 27 -> 29 or a small upgrade based on fedora atomic 29
21:26:37 brtknr: i understand that, just thinking aloud, i know it's not a recommended way :)
21:26:37 rebuild is the best in all scenarios IMO
21:27:12 strigazi: but for rebuild, we can't resolve the downtime issue now
21:27:47 unless we have a better way to orchestrate the upgrade progress
21:27:59 depends on the pattern of usage
21:28:07 yes, i know
21:28:22 if the pattern is cloudy, rebuild works
21:28:59 assume the cluster is created in a private network, magnum control plane can't reach the cluster, then there is no good way to control the rebuild process
21:29:06 anyway, depending on flannel I'll test upgrade
21:29:18 strigazi: thank you
21:31:46 strigazi: brtknr: anything else you want to discuss?
21:32:13 yes, i wanted to talk about whether you guys have kube_tag=v1.15.x working?
21:32:32 i see there are images but i can only get up to 1.14 working on master
21:33:03 brtknr: for flannel we need to update the manifest and a pod security policy
21:33:11 after that it works
21:33:23 i see there's a patch for supporting 1.16 from Richardo
21:33:56 this is for the apis
21:34:09 we need a better debug output for heat-container-agent... it's currently incomprehensible
21:35:02 brtknr: we need set +x before every source of heat-params
21:35:27 and before when we write files to disk
21:35:54 i can see that heat-container-agent:stein-stable has a readable output to debug log but since train, it is hard to see what is failing
21:36:47 brtknr: it's related to the py3 support i think
21:37:00 it's a formatting issue i would say
21:37:22 in other words, we still get the same output, but current format is bad
21:38:40 flwang: okay i'll create a story for this as a reminder to investigate
21:38:50 we can use logging into a file?
21:39:02 but journal is better IMO
21:39:19 strigazi: that's also a good idea... like /var/log/heat-container-agent-output.log?
21:39:31 yeap
21:39:47 os-collect-config should have something
21:40:09 if it is more readable than how it currently is, i'd like that.. but i also prefer journalctl
21:40:12 before we fix the formatting issue, redirect to a file doesn't help, IMHO
21:42:14 i think the entire debug is getting written to the journal at once upon failure at the moment: https://github.com/openstack/magnum/blob/master/dockerfiles/heat-container-agent/scripts/55-heat-config#L153
21:42:37 brtknr: yes
21:42:59 it needs to be written atomically
21:43:11 in pretty format :)
21:43:28 i don't understand how it looked pretty before
21:43:30 strigazi: did cern do any security review for magnum deployed k8s ?
21:44:05 brtknr: basically convert \n to a real line break
21:45:03 flwang: only from the outside of the cluster. And it is fine
21:45:15 we have also used kube-hunter
21:46:41 shall we wrap?
21:46:49 strigazi: cool
21:46:52 anything else to discuss?
21:46:55 i'm good
21:46:57 1 last question about nodegroups
21:46:59 brtknr: anything else?
21:47:02 any progress?
21:47:06 brtknr: i asked yesterday :D
21:47:09 or plans to?
21:47:56 it is in good shape but the author had some family priorities :)
21:48:06 next week he is back
21:49:01 ok, let's wrap this one
21:49:08 ah yes I heard about the paternity :) please send him my congratulations!
21:49:15 thank you for joining, strigazi, brtknr
21:49:18 #endmeeting
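
Note on the heat-container-agent debug output discussed at 21:34-21:44: the suggestion at 21:44:05 was to turn each "\n" in the logged output into a real line break. A minimal sketch of that idea follows, assuming the captured script output reaches the logger as one string containing literal backslash-n sequences. The function names and the logger name are hypothetical and are not taken from the 55-heat-config script.

```python
import logging
from typing import List


def split_escaped_output(raw: str) -> List[str]:
    """Split captured script output into lines, treating a literal
    backslash-n (two characters) the same as a real newline."""
    # Handle the escaped CR/LF pair first so no stray "\r" text is left behind.
    text = raw.replace("\\r\\n", "\n").replace("\\n", "\n")
    # splitlines() also copes with real "\n" and "\r\n"; blank lines are dropped.
    return [line for line in text.splitlines() if line.strip()]


def log_output(log: logging.Logger, stream_name: str, raw: str) -> int:
    """Emit one log record per output line and return how many were written."""
    lines = split_escaped_output(raw)
    for line in lines:
        log.debug("%s: %s", stream_name, line)
    return len(lines)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    demo_log = logging.getLogger("heat-config-demo")  # hypothetical logger name
    log_output(demo_log, "stdout", "+ echo start\\nstart\\n+ exit 1\\n")
```

Emitting one record per line means each line shows up as its own entry in whatever the logger writes to, journal or file, instead of one long escaped string.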