21:04:15 <flwang> #startmeeting magnum
21:04:16 <openstack> Meeting started Tue Aug 27 21:04:15 2019 UTC and is due to finish in 60 minutes.  The chair is flwang. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:04:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:04:19 <openstack> The meeting name has been set to 'magnum'
21:04:29 <flwang> #topic roll call
21:04:33 <strigazi> o/
21:04:45 <flwang> brtknr: jakeyip:
21:04:51 <flwang> anyone else online?
21:05:36 <flwang> strigazi: ok, let's start first
21:05:48 <flwang> #topic flannel conformance
21:06:02 <flwang> strigazi: did you see my email?
21:06:32 <strigazi> The nic patch is definitely an issue for master branch
21:06:42 <strigazi> after that I get internal IPs
21:06:42 <flwang> strigazi: yes
21:06:49 <flwang> ok, cool
21:06:57 <strigazi> I'm looking into sec groups now
21:07:02 <flwang> have you completed another sonobuoy testing?
21:07:06 <strigazi> and the iptables patch we dropped
21:07:28 <strigazi> I'm just checking one group of tests regarding DNS
21:07:44 <strigazi> sonobuoy run --e2e-focus "DNS"
21:07:56 <flwang> strigazi: ok, did you see my last comment on https://review.opendev.org/#/c/668163/?
21:07:57 <strigazi> this covers the network usually
21:08:19 <strigazi> when this passes the rest should work
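(For anyone reproducing this, a minimal sketch of the DNS-focused conformance run being discussed, assuming the retrieve/results subcommands exist in the sonobuoy version in use:)

    # run only the DNS-related e2e conformance tests against the current kubeconfig
    sonobuoy run --e2e-focus "DNS" --wait
    # pull the results tarball and print the pass/fail summary
    tarball=$(sonobuoy retrieve)
    sonobuoy results "$tarball"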
21:08:32 <flwang> at least based on my testing, the iptables patch doesn't help
21:08:48 <flwang> strigazi: so you also got 10 test cases failed, right?
21:09:07 <strigazi> yes, do the same tests pass for calico?
21:09:12 <flwang> yes
21:09:14 <strigazi> for master branch?
21:09:17 <flwang> yes
21:09:27 <strigazi> so calico is the issue :)
21:09:35 <flwang> :D
21:09:41 <flwang> http://paste.openstack.org/show/763160/
21:09:53 <flwang> can you pls check if you got the same 10 cases?
21:10:07 <strigazi> I haven't, I left one run to finish
21:10:19 <strigazi> but the DNS ones I have
21:10:57 <flwang> ok
21:11:29 <flwang> you mean this one [Fail] [sig-network] DNS [It] should provide DNS for the cluster  [Conformance]   ?
21:11:41 <strigazi> yes
21:11:52 <strigazi> and for services
21:12:24 <flwang> right
21:12:53 <strigazi> anyway, tomorrow I guess I'll have it working.
21:13:23 <strigazi> why is calico working? is it not affected by the NIC patch?
21:13:25 <flwang> fantastic
21:13:36 <flwang> strigazi: it's also blocked by the nic
21:13:50 <strigazi> so it doesn't work for master
21:14:02 <flwang> when i said calico was working, i meant the test i did several weeks ago
21:14:12 <flwang> at that moment, the nic patch hadn't merged yet
21:14:51 <strigazi> ok
21:15:07 <flwang> strigazi: this patch should be able to fix the regression issue https://review.opendev.org/678067
21:15:18 <flwang> i will check with brtknr if it's ready for testing
21:16:32 <flwang> strigazi: shall we move to next topic?
21:16:48 <strigazi> ok
21:16:57 <flwang> #topic fedora coreos 30
21:17:25 <flwang> yesterday i managed to get the ssh key, hostname and openstack-ca working for the new fedora coreos 30 image
21:17:37 <flwang> today i will work on the heat-container-agent part
21:17:45 <strigazi> ok
21:18:35 <brtknr> o/
21:18:41 <flwang> btw, i can't remember how the cfn-init-data is written into the instance, can you pls remind me?
21:18:53 <flwang> brtknr: hey
21:19:00 <strigazi> heat appends it to the cloud-init user-data
21:19:01 <brtknr> apologies, i was at the cinema
21:19:29 <flwang> strigazi: ah, i see. so we may have to inject it by ignition "manually"?
21:19:45 <strigazi> in our case this file will need to be crafted and injected as user data
21:19:48 <brtknr> flwang: its ready for testing
21:20:01 <brtknr> flwang: https://review.opendev.org/#/c/678067/ this patch
21:20:04 <flwang> strigazi: i see. i will try
21:20:11 <flwang> brtknr: thanks for the confirmation
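(A rough sketch of the injection strigazi describes, not the eventual Magnum implementation: base64-encode cfn-init-data and embed it in an Ignition config passed as user data. The target path and spec version here are assumptions.)

    # hypothetical: craft an Ignition (spec 3.0.0) user_data that writes cfn-init-data at first boot
    B64=$(base64 -w0 cfn-init-data)
    printf '{"ignition":{"version":"3.0.0"},"storage":{"files":[{"path":"/var/lib/cloud/data/cfn-init-data","mode":420,"contents":{"source":"data:;base64,%s"}}]}}\n' "$B64" > user_data.ign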
21:22:21 <flwang> i will update the fedora coreos 30 work with you guys later when there is any progress
21:22:36 <flwang> #topic rolling upgrade
21:23:18 <flwang> so far the rolling upgrade patch for the node operating system has passed my testing, https://review.opendev.org/669593 - it would be nice if you guys could start reviewing it
21:23:57 <brtknr> flwang: I'll test it tomorrow
21:24:03 <flwang> the other thing i'd like to test is whether it can support migrating from fedora atomic 29 to fedora coreos 30, given they're both based on (rpm-)ostree
21:24:12 <flwang> strigazi: ^ any comments?
21:24:31 <strigazi> flwang: I don't know if it is possible
21:25:01 <strigazi> maybe it is
21:25:37 <flwang> strigazi: anyway, we still need this upgrade path so users can upgrade fedora atomic
21:25:56 <brtknr> flwang: I remember seeing on the #fedora-coreos channel that they recommend users rebuild instances instead of trying to upgrade
21:26:00 <flwang> whether it's fedora atomic 27->29 or a minor upgrade within fedora atomic 29
21:26:37 <flwang> brtknr: i understand that, just thinking aloud, i know it's not a recommended way :)
21:26:37 <strigazi> rebuild is the best in all scenarios IMO
21:27:12 <flwang> strigazi: but for rebuild, we can't resolve the downtime issue now
21:27:47 <flwang> unless we have a better way to orchestrate the upgrade progress
21:27:59 <strigazi> depends on the pattern of usage
21:28:07 <flwang> yes, i know
21:28:22 <strigazi> if the pattern is cloudy, rebuild works
21:28:59 <flwang> assume the cluster is created in a private network and the magnum control plane can't reach the cluster, then there is no good way to control the rebuild process
21:29:06 <strigazi> anyway, depending on flannel I'll test upgrade
21:29:18 <flwang> strigazi: thank you
21:31:46 <flwang> strigazi: brtknr: anything else you want to discuss?
21:32:13 <brtknr> yes, i wanted to talk about whether you guys have kube_tag=v1.15.x working?
21:32:32 <brtknr> i see there are images but i can only get up to 1.14 working on master
21:33:03 <strigazi> brtknr: for flannel we need to update the manifest and a pod security policy
21:33:11 <strigazi> after that it works
21:33:23 <brtknr> i see there's a patch for supporting 1.16 from Ricardo
21:33:56 <strigazi> this is for the apis
21:34:09 <brtknr> we need better debug output for heat-container-agent... it's currently incomprehensible
21:35:02 <strigazi> brtknr: we need to set +x before every source of heat-params
21:35:27 <strigazi> and before we write files to disk
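(Roughly what that would look like in the fragment scripts, as a sketch assuming the usual /etc/sysconfig/heat-params location:)

    # stop command tracing while the secrets-bearing params file is sourced,
    # then turn tracing back on for the rest of the script
    set +x
    . /etc/sysconfig/heat-params
    set -x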
21:35:54 <brtknr> i can see that heat-container-agent:stein-stable has readable output in the debug log but since train, it is hard to see what is failing
21:36:47 <flwang> brtknr: it's related to the py3 support i think
21:37:00 <flwang> it's a formatting issue i would say
21:37:22 <flwang> in other words, we still get the same output, but the current format is bad
21:38:40 <brtknr> flwang: okay i'll create a story for this as a reminder to investigate
21:38:50 <strigazi> we can use logging into a file?
21:39:02 <strigazi> but journal is better IMO
21:39:19 <brtknr> strigazi: that's also a good idea... like /var/log/heat-container-agent-output.log?
21:39:31 <strigazi> yeap
21:39:47 <strigazi> os-collect-config should have something
21:40:09 <brtknr> if it is more readable than how it currently is, i'd like that.. but i also prefer journalctl
21:40:12 <flwang> before we fix the formatting issue, redirecting to a file doesn't help, IMHO
21:42:14 <brtknr> i think the entire debug is getting written to the journal at once upon failure at the moment: https://github.com/openstack/magnum/blob/master/dockerfiles/heat-container-agent/scripts/55-heat-config#L153
21:42:37 <flwang> brtknr: yes
21:42:59 <brtknr> it needs to be written atomically
21:43:11 <flwang> in pretty format :)
21:43:28 <brtknr> i don't understand how it looked pretty before
21:43:30 <flwang> strigazi: did cern do any security review for magnum deployed k8s ?
21:44:05 <flwang> brtknr: basically convert \n to a real line break
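(A possible stop-gap for reading the current output, assuming the agent runs under a systemd unit named heat-container-agent: expand the literal "\n" escapes so the captured script output becomes readable again.)

    journalctl -u heat-container-agent --no-pager | sed 's/\\n/\n/g'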
21:45:03 <strigazi> flwang: only from the outside of the cluster. And it is fine
21:45:15 <strigazi> we have also used kube-hunter
21:46:41 <strigazi> shall we wrap?
21:46:49 <flwang> strigazi: cool
21:46:52 <strigazi> anything else to discuss?
21:46:55 <flwang> i'm good
21:46:57 <brtknr> 1 last question about nodegroups
21:46:59 <flwang> brtknr: anything else?
21:47:02 <brtknr> any progress?
21:47:06 <flwang> brtknr: i asked yesterday :D
21:47:09 <brtknr> or plans to?
21:47:56 <strigazi> it is in good shape but the author had some family priorities :)
21:48:06 <strigazi> next week he is back
21:49:01 <flwang> ok, let's wrap this one
21:49:08 <brtknr> ah yes I heard about the paternity :) please send him my congratulations!
21:49:15 <flwang> thank you for joining, strigazi, brtknr
21:49:18 <flwang> #endmeeting