*** bobh has joined #openstack-infra | 00:02 | |
*** sgw has quit IRC | 00:02 | |
*** aaronsheffield has quit IRC | 00:05 | |
*** jamesmcarthur has joined #openstack-infra | 00:06 | |
*** jamesmcarthur has quit IRC | 00:11 | |
*** markvoelker has joined #openstack-infra | 00:11 | |
ianw | clarkb: if you're in a base-jobs sort of mood, https://review.opendev.org/#/c/676120/ would help for arm64 functional test on dib | 00:11 |
---|---|---|
*** markvoelker has quit IRC | 00:15 | |
*** bnemec has joined #openstack-infra | 00:16 | |
*** pkopec has quit IRC | 00:20 | |
*** markvoelker has joined #openstack-infra | 00:26 | |
*** zhurong has quit IRC | 00:27 | |
*** bobh has quit IRC | 00:35 | |
*** bnemec has quit IRC | 00:35 | |
guilhermesp | thanks fungi! I will let you know when I'm done with it | 00:41 |
*** gyee has quit IRC | 00:41 | |
openstackgerrit | Merged opendev/base-jobs master: Fix stable branch path https://review.opendev.org/677555 | 00:46 |
corvus | ianw, clarkb: 677584 lgtm; i'm not around -- we can approve it if ianw can keep an eye on it, or i can do so tomorrow | 00:46 |
ianw | corvus: sure, i can watch it for the rest of the day and make sure nothing obviously explodes | 00:47 |
corvus | ianw: thanks! +w | 00:48 |
*** bobh has joined #openstack-infra | 00:52 | |
*** exsdev has quit IRC | 00:54 | |
openstackgerrit | Merged zuul/zuul-jobs master: upload-logs-swift: fix keystoneauth1 exceptions https://review.opendev.org/677584 | 00:57 |
*** ianychoi has quit IRC | 00:57 | |
*** ianychoi has joined #openstack-infra | 00:59 | |
*** spsurya has joined #openstack-infra | 00:59 | |
*** hongbin has joined #openstack-infra | 01:04 | |
*** ricolin has joined #openstack-infra | 01:05 | |
*** rlandy|ruck has quit IRC | 01:06 | |
*** gregoryo has joined #openstack-infra | 01:07 | |
*** exsdev has joined #openstack-infra | 01:14 | |
*** zzehring has quit IRC | 01:24 | |
*** zzehring has joined #openstack-infra | 01:24 | |
*** sgw has joined #openstack-infra | 01:29 | |
*** igordc has quit IRC | 01:35 | |
openstackgerrit | Merged openstack/diskimage-builder master: dracut-regenerate: catch failures and exit code https://review.opendev.org/676032 | 01:48 |
*** liuyulong has quit IRC | 01:51 | |
*** bobh has quit IRC | 01:54 | |
*** igordc has joined #openstack-infra | 01:58 | |
*** jamesmcarthur has joined #openstack-infra | 02:20 | |
*** michael-beaver has quit IRC | 02:21 | |
openstackgerrit | Merged openstack/diskimage-builder master: block-device-efi : expand disk size calculation https://review.opendev.org/676354 | 02:30 |
*** rh-jelabarre has quit IRC | 02:37 | |
*** bhavikdbavishi has joined #openstack-infra | 02:40 | |
*** jamesmcarthur has quit IRC | 02:41 | |
*** bhavikdbavishi1 has joined #openstack-infra | 02:42 | |
*** jamesmcarthur has joined #openstack-infra | 02:44 | |
*** bhavikdbavishi has quit IRC | 02:44 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 02:44 | |
*** igordc has quit IRC | 02:49 | |
*** bobh has joined #openstack-infra | 02:53 | |
*** bobh has quit IRC | 02:56 | |
*** bhavikdbavishi has quit IRC | 03:03 | |
guilhermesp | appreciate more votes here https://review.opendev.org/#/c/677538/ :) | 03:08 |
*** larainema has joined #openstack-infra | 03:09 | |
fungi | config-core: ^ | 03:14 |
*** bhavikdbavishi has joined #openstack-infra | 03:20 | |
openstackgerrit | Merged openstack/project-config master: Add os_murano project https://review.opendev.org/677538 | 03:25 |
*** jamesmcarthur has quit IRC | 03:26 | |
fungi | guilhermesp: it could be up to an hour from when that merged until the repository exists in gerrit/gitea but after that it should be safe to recheck the corresponding governance change | 03:27 |
guilhermesp | thanks for the time estimate, fungi. I was going to wait a bit, but it's late here hahaha, so I will do it tomorrow morning. But if you're around, feel free to recheck it too! | 03:30 |
fungi | sure, happy to | 03:31 |
guilhermesp | nice thanks! | 03:31 |
*** igordc has joined #openstack-infra | 03:33 | |
*** psachin has joined #openstack-infra | 03:33 | |
*** janki has joined #openstack-infra | 03:41 | |
*** diga has joined #openstack-infra | 03:57 | |
*** hongbin has quit IRC | 04:06 | |
AJaeger | infra-root, Zuul shows a config error with "Gerrit error executing git-upload-pack openstack/openstack-ansible-os_murano" | 04:09 |
fungi | it just got created | 04:11 |
fungi | may be a race in zuul configuration getting applied sooner than manage-projects runs | 04:11 |
AJaeger | ah, thanks! | 04:13 |
AJaeger | infra-root, could you delete the extra directory https://docs.openstack.org/python-tripleoclient/stable/stein/ from our servers, please? I just checked and all looks fine, so https://docs.openstack.org/python-tripleoclient/stein/ was just updated | 04:27 |
openstackgerrit | Merged opendev/base-jobs master: mirror-info: add ubuntu-ports https://review.opendev.org/676120 | 04:30 |
ianw | AJaeger: np, removed | 04:31 |
ianw | speaking of afs volumes, i've unlocked yum-puppetlabs and it is currently running a release; it was the same problem as the others | 04:32 |
*** jaosorior has quit IRC | 04:34 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: AFS grafana : add yum-puppetlabs https://review.opendev.org/677601 | 04:38 |
*** ramishra has joined #openstack-infra | 04:41 | |
*** soniya29 has joined #openstack-infra | 04:44 | |
AJaeger | thanks, ianw | 04:47 |
*** ykarel|away has joined #openstack-infra | 04:48 | |
AJaeger | yeah, config error disappeared - thanks, fungi | 04:53 |
AJaeger | config-core, we now can use promote for the infra jobs, please review https://review.opendev.org/677540 | 04:57 |
*** kopecmartin|off is now known as kopecmartin | 05:08 | |
*** udesale has joined #openstack-infra | 05:10 | |
*** udesale has quit IRC | 05:14 | |
*** raukadah is now known as chkumar|rover | 05:17 | |
*** odicha has joined #openstack-infra | 05:19 | |
*** dave-mccowan has quit IRC | 05:21 | |
*** odicha has quit IRC | 05:23 | |
*** soniya29 has quit IRC | 05:24 | |
*** odicha has joined #openstack-infra | 05:34 | |
*** ykarel|away is now known as ykarel | 05:35 | |
*** jaosorior has joined #openstack-infra | 05:49 | |
*** dpawlik has joined #openstack-infra | 06:20 | |
*** jaosorior has quit IRC | 06:21 | |
*** igordc has quit IRC | 06:30 | |
*** dciabrin has joined #openstack-infra | 06:30 | |
*** ianychoi has quit IRC | 06:34 | |
*** ianychoi has joined #openstack-infra | 06:34 | |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Add starlingx promote jobs https://review.opendev.org/677647 | 06:41 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Add starlingx promote jobs https://review.opendev.org/677647 | 06:44 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: AFS grafana : add yum-puppetlabs https://review.opendev.org/677601 | 06:45 |
AJaeger | ianw: one more pasto ^ | 06:49 |
AJaeger | I initially marked the wrong line - got confused, sorry | 06:49 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: AFS grafana : add yum-puppetlabs https://review.opendev.org/677601 | 06:50 |
ianw | AJaeger: ah, makes sense :) thanks for checking | 06:50 |
*** kjackal has joined #openstack-infra | 06:53 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Overriding max. starting builds. https://review.opendev.org/670461 | 07:00 |
*** ianychoi has quit IRC | 07:00 | |
*** ianychoi has joined #openstack-infra | 07:01 | |
*** jaosorior has joined #openstack-infra | 07:04 | |
*** trident has quit IRC | 07:10 | |
*** rcernin has quit IRC | 07:14 | |
openstackgerrit | Merged openstack/openstack-zuul-jobs master: Use promote for publish-tox-docs-infra https://review.opendev.org/677540 | 07:15 |
openstackgerrit | Merged openstack/project-config master: AFS grafana : add yum-puppetlabs https://review.opendev.org/677601 | 07:17 |
*** trident has joined #openstack-infra | 07:17 | |
*** udesale has joined #openstack-infra | 07:24 | |
openstackgerrit | Andreas Jaeger proposed openstack/openstack-zuul-jobs master: Revert "Use promote for publish-tox-docs-infra" https://review.opendev.org/677656 | 07:32 |
AJaeger | ianw, frickler, tox-docs needs updating first - sorry. ^ | 07:33 |
yoctozepto | another case of no private ipv4 address: https://7ee223d0d079934adc99-de5ae15935168409da4576fce7897429.ssl.cf5.rackcdn.com/677228/2/check/kolla-ansible-centos-source/982f9e3/ara-report/ :-( | 07:36 |
yoctozepto | sad it hits the retry limit | 07:37 |
openstackgerrit | Andreas Jaeger proposed openstack/openstack-zuul-jobs master: Use promote job for infra https://review.opendev.org/677657 | 07:38 |
AJaeger | so, this is the proper way ^ | 07:38 |
* AJaeger will self approve the revert 677656 - and then you can review 677657 at leisure... | 07:39 | |
*** rpittau|afk is now known as rpittau | 07:40 | |
*** jpena|off is now known as jpena | 07:47 | |
openstackgerrit | Andreas Jaeger proposed opendev/base-jobs master: Make names in promote job unique https://review.opendev.org/677662 | 07:49 |
*** jtomasek has joined #openstack-infra | 07:51 | |
openstackgerrit | Andreas Jaeger proposed opendev/base-jobs master: Make names in promote job unique https://review.opendev.org/677662 | 07:51 |
*** ralonsoh has joined #openstack-infra | 07:52 | |
openstackgerrit | Merged openstack/openstack-zuul-jobs master: Revert "Use promote for publish-tox-docs-infra" https://review.opendev.org/677656 | 07:55 |
*** e0ne has joined #openstack-infra | 07:56 | |
*** zbr is now known as zbr|ooo | 07:56 | |
openstackgerrit | Fatih Degirmenci proposed opendev/glean master: Test glean gates https://review.opendev.org/677665 | 07:56 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Make direct-push configurable on project-level https://review.opendev.org/677109 | 07:58 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement push job in merger https://review.opendev.org/677110 | 07:58 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled https://review.opendev.org/677111 | 07:58 |
*** jaosorior has quit IRC | 08:00 | |
*** lucasagomes has joined #openstack-infra | 08:01 | |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement push job in merger https://review.opendev.org/677110 | 08:08 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled https://review.opendev.org/677111 | 08:08 |
*** jtomasek has quit IRC | 08:09 | |
*** dtantsur|afk is now known as dtantsur | 08:11 | |
*** pkopec has joined #openstack-infra | 08:12 | |
*** jtomasek has joined #openstack-infra | 08:15 | |
*** gfidente|afk is now known as gfidente | 08:15 | |
*** ykarel is now known as ykarel|lunch | 08:15 | |
cgoncalves | hey Infra folks! this patch https://review.opendev.org/#/c/676120/ broke Octavia gate | 08:18 |
cgoncalves | https://object-storage-ca-ymq-1.vexxhost.net/v1/86bbbcfa8ad043109d2d7af530225c72/logs_55/639155/8/check/octavia-v2-dsvm-scenario/175ad61/controller/logs/dib-build/amphora-x64-haproxy.qcow2_log.txt.gz | 08:18 |
cgoncalves | /etc/ci/mirror_info.sh: line 54: NODEPOOLMIRROR_HOST: unbound variable | 08:18 |
rm_work | :( | 08:18 |
rm_work | yeah, missing underscore | 08:18 |
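The unbound variable error above is the kind of typo a simple pre-merge lint could flag. The following is a minimal sketch, not the check OpenDev actually runs: the shell fragment in `SAMPLE_SCRIPT` is hypothetical (the real contents of `/etc/ci/mirror_info.sh` are not shown in this log), and `undefined_references` is an illustrative helper name. It only understands plain `NAME=` assignments and `$NAME` / `${NAME:-default}` references.

```python
import re

# Hypothetical fragment, written only to reproduce the class of typo
# discussed above; it is not the real mirror_info.sh.
SAMPLE_SCRIPT = """\
export NODEPOOL_MIRROR_HOST=${NODEPOOL_MIRROR_HOST:-mirror.example.org}
export NODEPOOL_UBUNTU_PORTS_MIRROR=${NODEPOOL_UBUNTU_PORTS_MIRROR:-http://$NODEPOOLMIRROR_HOST/ubuntu-ports}
"""

ASSIGNMENT = re.compile(r"\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=")
# Group 1: optional opening brace, group 2: variable name,
# group 3: a ":-" or "-" default operator directly after the name.
REFERENCE = re.compile(r"\$(\{)?([A-Za-z_][A-Za-z0-9_]*)(:?-)?")


def undefined_references(script, known=()):
    """Return variable names that are read before being assigned.

    References of the form ${NAME:-default} are treated as safe, because
    they do not trip `set -u` when NAME is unset.
    """
    defined = set(known)
    missing = []
    for line in script.splitlines():
        for match in REFERENCE.finditer(line):
            braced, name, default_op = match.groups()
            has_default = bool(braced and default_op)
            if name not in defined and not has_default and name not in missing:
                missing.append(name)
        assigned = ASSIGNMENT.match(line)
        if assigned:
            defined.add(assigned.group(1))
    return missing


if __name__ == "__main__":
    print(undefined_references(SAMPLE_SCRIPT))
```

Run against the sample, the sketch reports the misspelled name because nothing earlier in the fragment assigns it. A real script would need a proper shell parser or a tool such as shellcheck rather than regular expressions.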
cgoncalves | right. unless someone is pushing a patch right now I'll do it | 08:19 |
rm_work | i assume you should just do it | 08:19 |
rm_work | then more of them can +2 :) | 08:19 |
AJaeger | rm_work: can you send a patch? | 08:20 |
AJaeger | cgoncalves: ah, please go ahead... | 08:20 |
openstackgerrit | Chandan Kumar (raukadah) proposed opendev/base-jobs master: Fixed NODEPOOL_MIRROR_HOST typo https://review.opendev.org/677669 | 08:25 |
chkumar|rover | ianw: AJaeger: https://review.opendev.org/#/c/677669/ can we get this merged? | 08:25 |
*** gregoryo has quit IRC | 08:25 | |
chkumar|rover | it broke the tripleo ci | 08:25 |
openstackgerrit | Carlos Goncalves proposed opendev/base-jobs master: Fix NODEPOOL_UBUNTU_PORTS_MIRROR default value https://review.opendev.org/677670 | 08:26 |
AJaeger | cgoncalves: chkumar|rover was faster ^ | 08:26 |
AJaeger | cgoncalves, rm_work, chkumar|rover: I approved 677669 | 08:26 |
AJaeger | Thanks - and sorry for the breakage! | 08:27 |
chkumar|rover | AJaeger: thanks :-) | 08:27 |
cgoncalves | abandoned | 08:27 |
AJaeger | cgoncalves: thanks! | 08:29 |
*** tkajinam has quit IRC | 08:29 | |
*** piotrowskim has joined #openstack-infra | 08:30 | |
openstackgerrit | Merged opendev/base-jobs master: Fixed NODEPOOL_MIRROR_HOST typo https://review.opendev.org/677669 | 08:32 |
*** ianychoi has quit IRC | 08:32 | |
*** ianychoi has joined #openstack-infra | 08:33 | |
AJaeger | cgoncalves, chkumar|rover, rm_work, change is merged ^ | 08:35 |
*** ociuhandu has joined #openstack-infra | 08:35 | |
cgoncalves | cool, thanks | 08:35 |
rm_work | \o/ | 08:45 |
*** elod_off is now known as elod | 08:47 | |
*** Lucas_Gray has joined #openstack-infra | 08:51 | |
*** yamamoto has joined #openstack-infra | 08:53 | |
*** Lucas_Gray has quit IRC | 09:10 | |
*** jtomasek has quit IRC | 09:11 | |
*** Lucas_Gray has joined #openstack-infra | 09:12 | |
*** jaosorior has joined #openstack-infra | 09:19 | |
*** ykarel|lunch is now known as ykarel | 09:25 | |
*** derekh has joined #openstack-infra | 09:26 | |
*** yamamoto has quit IRC | 09:28 | |
ykarel | Hi which job publishes deploy-guide to docs.openstack.org/project-deploy-guide | 09:51 |
ykarel | is it publish-openstack-docs-pti or publish-deploy-guide, background:- https://review.opendev.org/#/c/677661 | 09:52 |
AJaeger | ykarel: publish-deploy-guide | 09:52 |
ykarel | ramishra, ^^ | 09:52 |
ykarel | AJaeger, Thanks | 09:52 |
AJaeger | use this: https://docs.openstack.org/infra/openstack-zuul-jobs/project-templates.html#project_template-deploy-guide-jobs | 09:53 |
AJaeger | ykarel, ramishra ^ | 09:53 |
ykarel | AJaeger, yes proposed this only https://review.opendev.org/#/c/677661/2/.zuul.yaml | 09:54 |
ykarel | but had some doubts so confirming here | 09:54 |
*** rpittau is now known as rpittau|bbl | 10:14 | |
*** sshnaidm|afk is now known as sshnaidm | 10:23 | |
*** gfidente has quit IRC | 10:35 | |
*** yamamoto has joined #openstack-infra | 10:36 | |
*** gfidente has joined #openstack-infra | 10:41 | |
*** kjackal has quit IRC | 10:43 | |
*** jchhatbar has joined #openstack-infra | 10:47 | |
openstackgerrit | Roman Gorshunov proposed openstack/openstack-zuul-jobs master: Fix propose-translation-update job failure on wrong po format https://review.opendev.org/677696 | 10:47 |
*** janki has quit IRC | 10:49 | |
*** ociuhandu has quit IRC | 10:49 | |
jrosser | does anyone know what this means? https://zuul.opendev.org/t/openstack/build/4867c6cab168436e92db7753836c3a5d/log/job-output.txt#2947 | 10:53 |
jrosser | /etc/ci/mirror_info.sh: line 54: NODEPOOLMIRROR_HOST: unbound variable | 10:54 |
chkumar|rover | jrosser: it got fixed | 10:54 |
jrosser | ooooh - recheck then | 10:54 |
chkumar|rover | jrosser: https://review.opendev.org/#/c/677669/ | 10:54 |
*** yamamoto has quit IRC | 10:55 | |
pabelanger | I guess we didn't test that in base-test first? | 10:56 |
*** owalsh is now known as owalsh|away | 11:08 | |
*** ociuhandu has joined #openstack-infra | 11:20 | |
*** kjackal has joined #openstack-infra | 11:23 | |
roman_g | AJaeger: can't see change in i18n :) | 11:23 |
*** ociuhandu has quit IRC | 11:25 | |
*** udesale has quit IRC | 11:26 | |
*** udesale has joined #openstack-infra | 11:27 | |
AJaeger | roman_g: nobody will be able to find the warning you add. And your implementation is broken, it REMOVES files. | 11:27 |
*** dayou has quit IRC | 11:28 | |
*** yikun has joined #openstack-infra | 11:28 | |
AJaeger | ianychoi: any feedback on https://review.opendev.org/677696 ? I'm completely opposed since I think it makes the situation worse (if the implementation would do what it should do) | 11:28 |
roman_g | AJaeger: yeah, I saw that I actually remove whole translation. Will update PS. | 11:29 |
AJaeger | roman_g: I still don't like the approach since it is impossible to find those WARNINGS. | 11:30 |
roman_g | > Let's fix the i18n repo to warn about "Leading/trailing newline (\n)" - so that translators see that specific error. I just did that ;) | 11:35 |
roman_g | AJaeger: where? | 11:35 |
roman_g | could i have a look? | 11:35 |
*** rh-jelabarre has joined #openstack-infra | 11:37 | |
*** jpena is now known as jpena|lunch | 11:39 | |
*** dayou has joined #openstack-infra | 11:39 | |
AJaeger | roman_g: I just updated translate.openstack.org - that's an option in the project and it was not enabled. It's there now.. | 11:47 |
AJaeger | I guess that's something for admins only ( ianychoi and myself are admins) | 11:47 |
*** jamesmcarthur has joined #openstack-infra | 11:49 | |
*** rlandy has joined #openstack-infra | 11:50 | |
*** rlandy is now known as rlandy|ruck | 11:50 | |
roman_g | AJaeger: I actually saw that error in the old UI. But I was working in the new UI, where it was not showing, so I made this typo with additional newlines | 11:52 |
AJaeger | roman_g: you got a warning - now it's an ERROR ;) | 11:53 |
AJaeger | and AFAIK zanata will not export ERROR strings ;) | 11:53 |
roman_g | AJaeger: cool. so it shouldn't let me save translation if format in translation differs? | 11:53 |
AJaeger | not sure - but AFAIK the import would ignore that entry | 11:55 |
AJaeger | you would need to experiment to be sure ;) | 11:55 |
* AJaeger will be offline for some time now | 11:56 | |
*** rlandy|ruck is now known as rlandy|ruck|mtg | 11:58 | |
roman_g | AJaeger: works very well. Thanks! Need to have the same ERROR for other possible errors, e.g. "Unexpected variable: %s" and similar | 11:58 |
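The "Leading/trailing newline (\n)" validation discussed above boils down to comparing where the source string and its translation start and end. A minimal sketch of that comparison follows; `newline_mismatches` is an illustrative name, and this is not Zanata's own implementation, only the rule as described in the conversation.

```python
def newline_mismatches(msgid, msgstr):
    """Report leading/trailing newline differences between a source
    string (msgid) and its translation (msgstr).

    An empty msgstr means "not translated yet" and is not reported.
    """
    problems = []
    if not msgstr:
        return problems
    if msgid.startswith("\n") != msgstr.startswith("\n"):
        problems.append("leading newline mismatch")
    if msgid.endswith("\n") != msgstr.endswith("\n"):
        problems.append("trailing newline mismatch")
    return problems


if __name__ == "__main__":
    # A translation that gained an extra trailing newline.
    print(newline_mismatches("Deleted image", "Bild geloescht\n"))
```

The same shape of check could be extended to other validations mentioned above, such as comparing the set of `%s`-style placeholders in both strings.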
*** udesale has quit IRC | 12:02 | |
*** jamesmcarthur has quit IRC | 12:07 | |
*** jamesmcarthur has joined #openstack-infra | 12:08 | |
*** rpittau|bbl is now known as rpittau | 12:10 | |
fdegir | infra-core: are you aware of issues with glean gates? | 12:13 |
fdegir | infra-core: opensuse job for this change https://review.opendev.org/#/c/652238/ fails, but the failure is not related to what changed: I tried a dummy change with no code impact and it failed for that too | 12:14 |
*** Lucas_Gray has quit IRC | 12:20 | |
*** jamesmcarthur has quit IRC | 12:28 | |
*** larainema has quit IRC | 12:29 | |
*** udesale has joined #openstack-infra | 12:32 | |
*** rfolco has quit IRC | 12:33 | |
*** sgw has quit IRC | 12:33 | |
*** jchhatba_ has joined #openstack-infra | 12:39 | |
*** rfolco has joined #openstack-infra | 12:39 | |
*** jpena|lunch is now known as jpena | 12:40 | |
*** jchhatbar has quit IRC | 12:42 | |
*** jchhatba_ has quit IRC | 12:43 | |
*** larainema has joined #openstack-infra | 12:43 | |
*** rlandy|ruck|mtg is now known as rlandy|ruck | 12:43 | |
*** ociuhandu has joined #openstack-infra | 12:46 | |
*** ociuhandu has quit IRC | 12:47 | |
*** ociuhandu has joined #openstack-infra | 12:47 | |
AJaeger | roman_g: there are some more options - best discuss with ianychoi... | 12:49 |
AJaeger | dirk, evrardjp , can you help fdegir , please? | 12:50 |
*** jamesmcarthur has joined #openstack-infra | 12:50 | |
guilhermesp | btw fungi is this the correct hold ssh root@162.242.235.61 ? Seems that the uptime of the instance is 22 days | 12:53 |
*** sgw has joined #openstack-infra | 12:53 | |
guilhermesp | and it is a xenial server | 12:53 |
guilhermesp | just as a reminder | 12:53 |
guilhermesp | that's the autohold | 12:53 |
guilhermesp | zuul autohold --tenant openstack --project openstack/openstack-ansible-os_heat --job openstack-ansible-deploy-aio_distro_metal-ubuntu-bionic --change 672948 --reason "guilhermesp debugging mysterious failure which won't reproduce locally" --count 1 | 12:53 |
*** aaronsheffield has joined #openstack-infra | 12:54 | |
guilhermesp | but seems that this is not the correct hold maybe | 12:54 |
guilhermesp | is there any kind of caching of old instances and maybe it is pointing to an old instance that is shown as deleted but actually still exists? | 12:54 |
guilhermesp | also, I can't see something related to the openstack-ansible deployment in there so... i think that this might be a wrong instance | 12:55 |
*** eharney has joined #openstack-infra | 12:55 | |
*** ociuhandu has quit IRC | 13:00 | |
*** sthussey has joined #openstack-infra | 13:10 | |
frickler | guilhermesp: yes that looks like the wrong instance, 104.130.127.203 seems to be yours | 13:10 |
mordred | infra-root: I'm going to be basically out again today - sorry for the lack of warning. we're in the transition between locations for the first time and it's turned out to need much more time than we budgeted. | 13:10 |
frickler | guilhermesp: can you point me to your ssh key then I can set it up for you | 13:11 |
guilhermesp | sure frickler a sec | 13:11 |
guilhermesp | there it is https://github.com/guilhermesteinmuller.keys | 13:11 |
frickler | guilhermesp: ok you should have access now | 13:14 |
guilhermesp | cool frickler Im in | 13:14 |
guilhermesp | :) | 13:14 |
mnaser | when is the next rename scheduled? | 13:16 |
mnaser | https://review.opendev.org/#/c/669298/ -- this has been sitting around for 1.5 months now, it's only because it's also blocking a governance change that's been up as long | 13:17 |
*** tesseract has joined #openstack-infra | 13:22 | |
*** tesseract has quit IRC | 13:22 | |
AJaeger | config-core, could you review https://review.opendev.org/#/c/677547/ and https://review.opendev.org/677657 , please? | 13:24 |
*** Lucas_Gray has joined #openstack-infra | 13:29 | |
fungi | guilhermesp: sorry about that! i must have mis-copied from another hold someone hadn't cleaned up yet | 13:37 |
*** mriedem has joined #openstack-infra | 13:39 | |
*** eharney has quit IRC | 13:40 | |
*** Goneri has joined #openstack-infra | 13:42 | |
*** zbr|ooo is now known as zbr | 13:47 | |
*** udesale has quit IRC | 13:47 | |
fungi | fdegir: i don't know much specifically about opensuse, but i think the error may be related to a missing etc/resolv.conf (or missing etc entirely?) in the chroot: https://zuul.opendev.org/t/openstack/build/707eff03f5b2484b958a1837a5b08264/log/nodepool/builds/test-image-0000000001.log#3579 | 13:48 |
mnaser | question - has anyone ever wanted a ci node *without* unbound? | 13:48 |
mnaser | i'm having issues deploying k8s in ci because coredns seems to have some loop protection (it grabs whatever is in /etc/resolv.conf as forwarders) | 13:48 |
fungi | fdegir: this may be a recent regression, all our tumbleweed images in production are over 1.5 days old now too so should have been rebuilt more recently | 13:48 |
mnaser | and given that it takes 127.0.0.1 -- it configures itself to forward to that address and then turns off to avoid a loop | 13:49 |
fungi | mnaser: you could probably uninstall unbound, but then you'll have no local caching (unless kubernetes provides that functionality for you) | 13:49 |
mnaser | im just trying to think from a software pov on how to best workaround this | 13:52 |
fungi | fdegir: hrm, though our production tumbleweed image builds seem to be failing on a missing package in our mirror instead: | 13:52 |
fungi | 2019-08-21 13:36:10.291 | File './x86_64/openSUSE-release-ftp-20190815-227.1.x86_64.rpm' not found on medium 'http://mirror.dfw.rax.openstack.org/opensuse/tumbleweed/repo/oss/' | 13:52 |
*** ykarel is now known as ykarel|afk | 13:53 | |
fungi | mnaser: i think we must have a workaround somewhere because there are other jobs installing kubernetes on test nodes, and i recall this challenge coming up | 13:53 |
*** bhavikdbavishi has quit IRC | 13:54 | |
mnaser | i think if we actually just set the /etc/resolv.conf servers to $host_ip it might work around it | 13:54 |
guilhermesp | fungi: no worries! | 13:55 |
mnaser | because then when it slurps it on start up, it won't be looping | 13:55 |
fungi | mnaser: if unbound is listening on that interface i guess, and if iptables isn't preventing stuff on the node from reaching it | 13:55 |
*** bnemec has joined #openstack-infra | 13:55 | |
* mnaser is just trying to take into consideration people that might do the same thing inside the software | 13:56 | |
mnaser | so not just solve it for zuul but working around for users who might be doing the same thing | 13:56 |
fungi | right, i haven't personally been very involved in the way any of the kubernetes testing jobs are designed, so off the top of my head i don't know what workaround(s) got implemented | 13:57 |
*** ykarel|afk has quit IRC | 14:02 | |
Shrews | we use minikube, iirc, in our k8s test jobs in nodepool (so not a production k8s install). not sure if that makes any difference | 14:04 |
clarkb | mnaser: the workaround others have used is to read unbound's config and configure other resolvers to use that same config, or in the case of docker just always use google dns | 14:05 |
clarkb | I don't think you should disable unbound as the host still needs dns | 14:05 |
clarkb | lxc is the best when it comes to this imo | 14:06 |
clarkb | they set up a dnsmasq to forward to host resolver if it is there | 14:06 |
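The loop described above comes from a forwarder copying a loopback nameserver out of `/etc/resolv.conf`. The following is a rough sketch of the idea mnaser floats (pointing resolv.conf at the host address instead), assuming unbound is actually listening on that address and that firewall rules allow the traffic, as fungi notes. The function names and the `203.0.113.10` address are illustrative only; this is not what any existing job does.

```python
import ipaddress


def _is_loopback(address):
    # Drop an IPv6 zone id such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_nameservers(resolv_conf_text):
    """Nameservers that a forwarding resolver would loop back onto."""
    return [s for s in parse_nameservers(resolv_conf_text) if _is_loopback(s)]


def replace_loopback(resolv_conf_text, host_ip):
    """Rewrite loopback nameserver lines to use host_ip instead."""
    out = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and _is_loopback(fields[1]):
            out.append(f"nameserver {host_ip}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "nameserver 127.0.0.1\noptions edns0\n"
    print(loopback_nameservers(original))
    print(replace_loopback(original, "203.0.113.10"), end="")
```

The alternative clarkb describes, reading unbound's own forwarder configuration and handing the same upstreams to the other resolver, avoids depending on which interfaces unbound listens on.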
fungi | dirk: evrardjp: i'm not quite sure what to make of the tumbleweed image build failures we're seeing in production... seems zypper is looking for opensuse/tumbleweed/repo/oss/x86_64/openSUSE-release-ftp-20190815-227.1.x86_64.rpm on our mirrors, but we mirror from http://mirror.us.leaseweb.net/opensuse/tumbleweed/repo/oss/x86_64/ and i don't see it listed there either | 14:07 |
fungi | looks like we've been unable to build tumbleweed images for at least 12 hours | 14:07 |
dirk | fungi: we're currently renaming the opensuse products, some fallout is expected I guess | 14:07 |
dirk | I'll take a closer look after meeting madness ( ~ 3 hours) | 14:08 |
fungi | okay, should we disable our ci jobs which test building tumbleweed too? | 14:08 |
dirk | no | 14:08 |
dirk | we're just going to lose the opensuse prefix (so distros will be called leap and tumbleweed) | 14:08 |
fungi | seeing a different failure for those very recently and can't merge glean changes because the job that tests we can build a tumbleweed image with dib breaks | 14:08 |
Shrews | did opensuse-423 get fixed? | 14:09 |
dirk | can you point me to the review that you need? | 14:09 |
clarkb | Shrews: I think 42 is being removed | 14:09 |
clarkb | fungi what glean change? | 14:09 |
fungi | dirk: https://review.opendev.org/652238 is failing dib-nodepool-functional-openstack-opensuse-tumbleweed-src because it can't build a tumbleweed image | 14:10 |
Shrews | clarkb: i put up a change to remove it, but AJaeger pointed me to a change to fix it | 14:10 |
dirk | Shrews: the 423 thing should be fixed by https://review.opendev.org/#/c/677188/ but it looks like I need to reiterate this | 14:10 |
Shrews | dirk: *nod* | 14:11 |
fungi | dirk: though the failure there seems to be about a missing etc/resolv.conf in the chroot dib is building | 14:11 |
fungi | dirk: https://zuul.opendev.org/t/openstack/build/707eff03f5b2484b958a1837a5b08264/log/nodepool/builds/test-image-0000000001.log#3579 | 14:11 |
dirk | certainly anything using opensuse-423 should be replaced by opensuse-15 | 14:11 |
fungi | dirk: the closer i look at that error, the more i suspect etc is entirely missing in the chroot even | 14:12 |
fungi | so there may be something breaking earlier in the image build still | 14:12 |
clarkb | fungi: fwiw I feel like we've tested that sync behavior and it's been fine? | 14:13 |
clarkb | I think when python exits it flushes? maybe the early close affects that? | 14:13 |
*** rfolco is now known as not_rlandy | 14:13 | |
clarkb | that was a suspected issue when we had the fn ipv6 issues | 14:14 |
clarkb | but ruled out | 14:14 |
*** not_rlandy is now known as folco | 14:14 | |
*** folco is now known as rfolco | 14:14 | |
fungi | clarkb: yeah, i'm not sure under what circumstances fdegir is seeing it | 14:14 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add uwsgi role to gerritbot https://review.opendev.org/677738 | 14:14 |
*** portdirect has quit IRC | 14:14 | |
*** sdoran has quit IRC | 14:14 | |
*** coreycb has quit IRC | 14:14 | |
*** eharney has joined #openstack-infra | 14:15 | |
*** aprice has quit IRC | 14:15 | |
*** dougwig has quit IRC | 14:15 | |
*** kmalloc has quit IRC | 14:15 | |
*** evgenyl has quit IRC | 14:15 | |
*** petevg has quit IRC | 14:15 | |
*** tiffanie has quit IRC | 14:15 | |
*** sthussey has quit IRC | 14:16 | |
*** mordred has quit IRC | 14:16 | |
*** _Cyclone_ has quit IRC | 14:16 | |
*** zxiiro has quit IRC | 14:16 | |
*** cmurphy has quit IRC | 14:16 | |
*** sparkycollier has quit IRC | 14:16 | |
*** jamespage has quit IRC | 14:16 | |
*** philroche has quit IRC | 14:16 | |
*** jbryce has quit IRC | 14:16 | |
*** mgagne has quit IRC | 14:16 | |
*** csatari has quit IRC | 14:16 | |
*** aaronsheffield has quit IRC | 14:16 | |
*** logan- has quit IRC | 14:16 | |
*** crodriguez has quit IRC | 14:16 | |
*** irclogbot_2 has quit IRC | 14:17 | |
*** adriancz has joined #openstack-infra | 14:17 | |
*** _Cyclone_ has joined #openstack-infra | 14:17 | |
*** logan_ has joined #openstack-infra | 14:17 | |
*** mgagne has joined #openstack-infra | 14:17 | |
*** irclogbot_2 has joined #openstack-infra | 14:17 | |
*** dougwig has joined #openstack-infra | 14:18 | |
*** logan_ is now known as logan- | 14:18 | |
*** tiffanie has joined #openstack-infra | 14:18 | |
*** portdirect has joined #openstack-infra | 14:18 | |
*** coreycb has joined #openstack-infra | 14:18 | |
*** sthussey has joined #openstack-infra | 14:19 | |
*** crodriguez has joined #openstack-infra | 14:19 | |
*** sparkycollier has joined #openstack-infra | 14:19 | |
*** philroche has joined #openstack-infra | 14:19 | |
*** aaronsheffield has joined #openstack-infra | 14:19 | |
*** kmalloc has joined #openstack-infra | 14:19 | |
*** aprice has joined #openstack-infra | 14:19 | |
*** csatari has joined #openstack-infra | 14:19 | |
*** sdoran has joined #openstack-infra | 14:19 | |
*** rosmaita has joined #openstack-infra | 14:19 | |
*** evgenyl has joined #openstack-infra | 14:19 | |
*** jbryce has joined #openstack-infra | 14:19 | |
*** Jeffrey4l_ has quit IRC | 14:19 | |
*** jamespage has joined #openstack-infra | 14:19 | |
*** zxiiro has joined #openstack-infra | 14:20 | |
*** Jeffrey4l has joined #openstack-infra | 14:21 | |
*** mordred has joined #openstack-infra | 14:26 | |
*** petevg has joined #openstack-infra | 14:26 | |
*** ykarel|afk has joined #openstack-infra | 14:27 | |
*** ykarel|afk is now known as ykarel | 14:28 | |
*** rlandy|ruck is now known as rlandy|ruck|mtg | 14:29 | |
mnaser | hmm | 14:35 |
mnaser | we have the systemd-resolved service running in our ci images | 14:36 |
mnaser | yet we kinda override /etc/resolv.conf directly | 14:36 |
*** _Cyclone_ has quit IRC | 14:36 | |
*** jeliu_ has joined #openstack-infra | 14:37 | |
*** _Cyclone_ has joined #openstack-infra | 14:39 | |
*** ociuhandu has joined #openstack-infra | 14:39 | |
*** chkumar|rover is now known as raukadah | 14:40 | |
*** mattw4 has joined #openstack-infra | 14:40 | |
*** jaosorior has quit IRC | 14:41 | |
mnaser | so i think we should configure systemd-resolved to point towards it instead? | 14:42 |
sshnaidm | clarkb, is it ok that in some logs storage logs files are not gzipped? https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_70/664170/13/check/tripleo-ci-centos-7-scenario012-multinode-oooq-container/c59c8dc/ | 14:43 |
sshnaidm | clarkb, like job-output.txt instead of job-output.txt.gz | 14:43 |
clarkb | mnaser: I believe systemd-resolved already does the correct thing in this case (it will ask unbound if it gets a request over dbus iirc) | 14:44 |
mnaser | clarkb: afaik usually what happens is /etc/resolv.conf should point to 127.0.0.53 and then systemd-resolved will do all the magic it needs | 14:44 |
mnaser | and then /run/systemd/resolve/resolv.conf is the 'content' -- right now it seems like its overridden | 14:44 |
clarkb | sshnaidm: they are compressed in transit and storage by openstacksdk and swift aiui | 14:44 |
clarkb | sshnaidm: if you wget the file you get it back in deflate encoding | 14:45 |
sshnaidm | clarkb, ok | 14:45 |
*** rlandy|ruck|mtg is now known as rlandy|ruck | 14:45 | |
clarkb | mnaser: well we dont want systemd-resolved to be the primary resolver | 14:46 |
mnaser | right, in that case i believe the service is doing nothing? | 14:46 |
clarkb | perfect :) | 14:46 |
mnaser | so we should probably shut it down then to avoid confusion in ci (or configure it, and point it towards unbound as a forwarder) | 14:46 |
openstackgerrit | Jeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator https://review.opendev.org/677315 | 14:46 |
mnaser | (as someone building deployment tooling against nodepool vms, its a "broken scenario" if systemd-resolved is running but /etc/resolv.conf isn't pointing to it) | 14:47 |
*** cmurphy has joined #openstack-infra | 14:48 | |
clarkb | why? isnt resolv.conf canonical? | 14:49 |
clarkb | that is what libc will look at | 14:49 |
clarkb | I guess if you want dns over dbus it might be confusing if systemd-resolved doesnt look at resolv.conf (I thought it did) | 14:49 |
*** raissa has joined #openstack-infra | 14:49 | |
mgoddard | Hi there infra, is there a gerrit restart planned? I'm waiting on a project rename (x/kayobe -> openstack/kayobe) | 14:50 |
clarkb | mgoddard: I had been expecting more rename requests after the opendev reorg so was waiting. But we continue to not have those so we should probably go ahead and schedule something with the list we do have | 14:52 |
mgoddard | clarkb: would be appreciated :) | 14:52 |
pabelanger | mnaser: clarkb: I want to say we did stop systemd-resolve before, but reverted it | 14:53 |
mnaser | clarkb: resolv.conf should be a symlink to /run/systemd/resolve/resolv.conf when using systemd-resolved | 14:54 |
mnaser | which will include all the forwarders configured in systemd-resolved there | 14:54 |
mnaser | https://www.irccloud.com/pastebin/WhfQ4ScW/ | 14:55 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add caching of autohold requests https://review.opendev.org/663412 | 14:55 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold-info CLI command https://review.opendev.org/662487 | 14:55 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Record held node IDs with autohold request https://review.opendev.org/662498 | 14:55 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests https://review.opendev.org/663762 | 14:55 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Mark nodes as USED when deleting autohold https://review.opendev.org/664060 | 14:55 |
mnaser | if we make it a symlink, then the behaviour won't change, but at least systemd-resolved will be doing the right thing(tm) | 14:55 |
clarkb | your paste looks exactly as I expected | 14:55 |
mnaser | imho /etc/resolv.conf should be a symlink to /run/systemd/resolve/resolv.conf | 14:56 |
clarkb | libc will resolve against unbound; anything that goes to systemd-resolved will too (dbus) | 14:56 |
pabelanger | mnaser: https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/nodepool-base/finalise.d/89-unbound#L203 | 14:56 |
clarkb | mnaser: why? systemd-resolved is not the primary resolver | 14:56 |
clarkb | it is largely there for dbus I think | 14:56 |
mnaser | right, but if you see the 'forwarder' for systemd-resolved to be 1.1.1.1 (or whatever), then /etc/resolv.conf will update to that value | 14:57 |
clarkb | but we dont want that | 14:57 |
mnaser | so if you set the systemd-resolved forwarder to 127.0.0.1 (aka unbound) then the resolv.conf will update automatically to point to it | 14:57 |
clarkb | you should make that change the other way around | 14:57 |
mnaser | but the behaviour will be correct | 14:57 |
AJaeger | config-core, could you review https://review.opendev.org/#/c/677547/ and https://review.opendev.org/677657 , please? Two more changes for promote... | 14:57 |
clarkb | you update unbound's config and have systemd-resolved resolve against unbound | 14:58 |
mnaser | i agree with you | 14:58 |
mnaser | if we symlink /etc/resolv.conf to /run/systemd/resolve/resolv.conf ... nothing will change, but at least if we change forwarders in systemd-resolved, it will update resolv.conf correctly | 14:58 |
clarkb | I dont understand that | 14:58 |
clarkb | why would we change forwarders in systemd-resolved? | 14:59 |
clarkb | we use unbound as a forwarding caching resolver | 14:59 |
mnaser | the /run/systemd/resolve/resolv.conf file is a dynamically generated file that contains a list of all the 'forwarders' or 'dns' servers configured in systemd-resolved | 14:59 |
fdegir | fungi: clarkb: fyi - glean opensuse job fails for a change that has no code impact: https://review.opendev.org/#/c/677665/ | 14:59 |
mnaser | if you tell systemd-resolved to use 127.0.0.1, it will update /run/systemd/resolve/resolv.conf with 'nameserver 127.0.0.1' | 14:59 |
mnaser | if you tell it to use 1.1.1.1, it will update /run/systemd/resolve/resolv.conf with 'nameserver 1.1.1.1' | 14:59 |
mnaser | so, if we do the symlink, and systemd-resolved is configured to use 127.0.0.1 -- nothing will change, the resolv.conf file will simply be dynamic instead of static | 15:00 |
clarkb | but we want it static | 15:00 |
clarkb | its doing exactly what we want right now | 15:00 |
mnaser | it will continue to be static. until someone like me needs to change it because it doesn't work for me | 15:01 |
mnaser | then i need to update both /etc/resolv.conf and systemd-resolved. | 15:01 |
clarkb | why do you need to change it? | 15:01 |
mnaser | because kubeadm deployments pull in /etc/resolv.conf into the pods | 15:01 |
mnaser | and so the coredns pod goes like 'oh no i am looping' | 15:01 |
mnaser | because /etc/resolv.conf contains 127.0.0.1 and then it tries to do lookups on itself | 15:01 |
clarkb | aiui you configure that on the k8s side | 15:01 |
mnaser | and im trying to build ansible playbooks that operate under the assumption that a system is _properly_ configured | 15:01 |
clarkb | you say dns is at $server | 15:02 |
clarkb | it is properly configured | 15:02 |
mnaser | in this case, a running systemd-resolved service + static /etc/resolv.conf is a combo that doesnt make sense | 15:02 |
mnaser | _literally_ nothing changes if we setup a symlink | 15:02 |
mnaser | i dont see why its an issue | 15:02 |
mnaser | its how distros ship by default | 15:02 |
mnaser | the only difference will be the file will be dynamically created by systemd-resolved instead of us manually plopping it down | 15:02 |
mnaser | the same exact content | 15:02 |
mnaser | see https://www.irccloud.com/pastebin/To0qKIVM/ | 15:03 |
clarkb | lets back up | 15:03 |
clarkb | can we accept it is perfectly normal to use a local resolver for caching purposes? in which case a resolv.conf containing localhost is fine? | 15:03 |
mnaser | yes, that's the behaviour i want to keep | 15:03 |
*** jamesmcarthur has quit IRC | 15:04 | |
clarkb | great. Then it is up to anything using namespacing to deal with the fact that a new network namespace's localhost is not necessarily going to have access to that local resolver? | 15:04 |
mnaser | correct. in my case, i'd want to change the host DNS so that it propagates properly right off the bat | 15:04 |
*** bobh has joined #openstack-infra | 15:05 | |
clarkb | mnaser: well you want to bypass host dns and configure remote resolvers directly | 15:05 |
clarkb | eg 1.1.1.1 | 15:05 |
*** jtomasek has joined #openstack-infra | 15:05 | |
mnaser | in my playbook, i'd have two possible paths: a) systemd-resolved is not running, /etc/resolv.conf is configured statically -- so i will override /etc/resolv.conf | 15:05 |
mnaser | or b) systemd-resolved *is* being used and running, /etc/resolv.conf is configured dynamically, so i reconfigure systemd-resolved | 15:05 |
mnaser | but the case we have in nodepool vms is systemd-resolved *is* running, but /etc/resolv.conf is configured statically | 15:06 |
clarkb | ok that helps. I think you are neglecting that b) has two subcases | 15:06 |
clarkb | 1) systemd-resolved is primary resolver and 2) systemd-resolved is simply part of chain to primary resolver | 15:06 |
clarkb | you want everyone to run systemd-resolved in the 1) case | 15:07 |
clarkb | but we run it in the 2) case | 15:07 |
clarkb | because unbound is a better resolver | 15:07 |
*** priteau has joined #openstack-infra | 15:07 | |
*** Lucas_Gray has quit IRC | 15:07 | |
mnaser | i think you can tell systemd-resolved to do nothing more than 'point' the system to the resolvers it's given (which is where i was thinking at) | 15:07 |
mnaser | so the systemd-resolved would be configured with 127.0.0.1 and it will dynamically generate a resolv.conf file with 'nameserver 127.0.0.1' | 15:08 |
mnaser | and in that case, it is _not_ acting as a dns server | 15:08 |
mnaser | just a /etc/resolv.conf-igurator | 15:08 |
clarkb | mnaser: that is basically what we have right now | 15:08 |
clarkb | mnaser: what you are suggesting is to bypass unbound | 15:08 |
clarkb | if I'm understanding how this will make the containers work | 15:09 |
mnaser | ok so | 15:09 |
clarkb | basically you want to tell systemd-resolved that 1.1.1.1 is now the resolver which will write that to /etc/resolv.conf | 15:09 |
clarkb | then containers will use 1.1.1.1 | 15:09 |
mnaser | "systemd-resolved maintains the /run/systemd/resolve/stub-resolv.conf file for compatibility with traditional Linux programs. This file may be symlinked from /etc/resolv.conf. This file lists the 127.0.0.53 DNS stub (see above) as the only DNS server." | 15:09 |
mnaser | "systemd-resolved maintains the /run/systemd/resolve/resolv.conf file for compatibility with traditional Linux programs. This file may be symlinked from /etc/resolv.conf and is always kept up-to-date, containing information about all known DNS servers." | 15:09 |
mnaser | i dont want us to use stub-resolv.conf | 15:10 |
mnaser | but if we use the second method, then yes, we can configure systemd-resolved to use 127.0.0.1 and the behaviour won't change | 15:11 |
mnaser | "Note that /run/systemd/resolve/resolv.conf should not be used directly by applications, but only through a symlink from /etc/resolv.conf. If this mode of operation is used local clients that bypass any local DNS API will also bypass systemd-resolved and will talk directly to the known DNS servers." | 15:11 |
*** Lucas_Gray has joined #openstack-infra | 15:11 | |
clarkb | how does that help your container case? it will be equivalent to today with 127.0.0.1 | 15:11 |
mnaser | i can update systemd-resolved to point to $some_other_resolver and everything will work properly | 15:12 |
clarkb | mnaser: and /etc/resolv.conf will contain for example 1.1.1.1 right? | 15:12 |
clarkb | (because that is where containers are reading the info) | 15:12 |
mnaser | in this case kubeadm is smart enough to actually read /run/systemd/resolve/resolv.conf and pull resolvers from there | 15:13 |
mnaser | if it sees systemd-resolved is running | 15:13 |
clarkb | and that is the file you want to symlink /etc/resolv.conf to | 15:14 |
clarkb | so /etc/resolv.conf will also have 1.1.1.1 in it | 15:14 |
clarkb | this is what I would like to avoid | 15:14 |
mnaser | right, but im not asking to change default behaviour, by default /etc/resolv.conf _will_ have 127.0.0.1 | 15:14 |
clarkb | mnaser: sure but its still a problem for your running jobs for a couple reasons | 15:14 |
mnaser | its only if i change systemd-resolved settings, i can assume /etc/resolv.conf will have the right value (as it would be in a _real_ deployment i guess) | 15:15 |
clarkb | basically in your jobs the entire host will stop using the dns resolver we've setup to add caching and ensure we are resolving against the correct IPs for the current cloud region | 15:15 |
fungi | reminder: i'm disappearing for a few hours to run errands on another island, but should be back in time for the storyboard meeting | 15:16 |
clarkb | we've had to be very careful particularly where NAT is involved to do this otherwise your dns requests start failing a lot | 15:16 |
mnaser | so what would be a good solution in my case if i want to continue to use the on-host unbound | 15:16 |
clarkb | on ipv6 only clouds we very specifically use ipv6 resolvers to ensure we don't get NAT'd to avoid problems in NAT and being blacklisted for too many requests on the remote side | 15:17 |
mnaser | i dont think unbound listens on the primary ip, does it? | 15:17 |
clarkb | mnaser: it does not because we don't want to be used to ddos | 15:17 |
mnaser | i figured as much | 15:17 |
* mnaser thinks | 15:17 | |
clarkb | mnaser: what I've seen other people do are tell k8s to use the resolvers that we've configured unbound to use | 15:17 |
clarkb | because that ensures we are avoiding NAT if we have to. You don't get caching though | 15:18 |
clarkb | the LXC solution is the best one but I'm not sure how hard it would be to implement for k8s | 15:18 |
clarkb | LXC sets up a localhost to localhost port 53 bridge | 15:18 |
clarkb | (its actually container network IP to host localhost not localhost to localhost) | 15:19 |
AJaeger | dtroyer: could you review https://review.opendev.org/677647 for starlingx as well, please? | 15:19 |
clarkb | docker does the worst thing and tells everything to use 8.8.8.8 | 15:19 |
clarkb | which is still better than forcing the entire host to use 8.8.8.8 | 15:19 |
mnaser | clarkb: yeah.. thing is i dont want to start overriding too much of the default behaviour of kubeadm | 15:20 |
mnaser | so im trying to figure out a way to make it work without it being too nodepool-opinionated | 15:20 |
mnaser | (my solution to k8s in gate is to build a library that deploys against nodepool vms, then use said library in magnum but deploy against heat vms) | 15:21 |
mnaser | it won't be tested with integration in magnum but we'll at least know the library works | 15:21 |
mnaser | it's.. an interesting workaround for the nested virt issue that i'm toying with | 15:21 |
*** gyee has joined #openstack-infra | 15:22 | |
clarkb | kubeadm docs imply this is configurable | 15:22 |
clarkb | but don't directly say how | 15:22 |
mnaser | i looked everywhere here but no avail https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2 | 15:22 |
mnaser | https://github.com/kubernetes/kubernetes/blob/efcb62abff0ee6511cb5f82b9f25d28b17d81912/cmd/kubeadm/app/phases/addons/dns/manifests.go#L305-L330 | 15:24 |
* mnaser hmms | 15:24 | |
clarkb | looks like you supply a different corefile to coredns service | 15:24 |
clarkb | proxy . /etc/resolv.conf is what it does by default but I think you change that? | 15:25 |
clarkb | ya looks like you use that forward directive | 15:25 |
mnaser | clarkb: ok one final thing to bother you on, whats the best way of getting the best dns servers for the nodepool vm | 15:26 |
clarkb | mnaser: right now reading the unbound config is probably best | 15:26 |
clarkb | we might be able to expose that in a more general way then have the unbound role consume that? | 15:26 |
mnaser | so i cant use any of the zuul ansible vars? :< | 15:26 |
clarkb | https://opendev.org/opendev/base-jobs/src/branch/master/roles/configure-unbound is what we currently use to configure unbound | 15:27 |
clarkb | https://opendev.org/opendev/base-jobs/src/branch/master/roles/configure-unbound/tasks/main.yaml#L12-L41 and it uses that logic | 15:27 |
mnaser | so i guess i can borrow that | 15:28 |
clarkb | the major thing is using ipv6 if possible though that reminds me k8s can't ipv6 | 15:28 |
clarkb | so that may be a complete waste of effort here :/ | 15:28 |
mnaser | poop | 15:29 |
clarkb | in that case I would tell coredns to use 8.8.8.8 and 1.1.1.1 (and if you can only pick one do so randomly?) | 15:30 |
clarkb | then we at least distribute the potential for NAT blacklisting across a couple services | 15:30 |
clarkb | https://opendev.org/opendev/base-jobs/src/branch/master/roles/configure-unbound/defaults/main.yaml google and cloudflare are the two services we'll distribute across by default | 15:31 |
*** mattw4 has quit IRC | 15:33 | |
*** efried is now known as efried_afk | 15:33 | |
clarkb | you'd think that a google backed service would support ipv6 out of the box since google is all ipv6 | 15:34 |
logan- | i learned this week that gcp compute instances don't support ipv6 yet... also surprising | 15:34 |
clarkb | logan-: I'm also told all the gcp ipv4 traffic is tunneled over ipv6 | 15:35 |
pabelanger | logan-: docker hub also doesn't | 15:35 |
clarkb | because google only knows how to ipv6 | 15:35 |
clarkb | its too bad that we don't have a more localhost version of localhost that is namespace wide | 15:38 |
clarkb | that would likely require updates to ip rfcs so no hope of that any time soon :) | 15:39 |
*** kjackal has quit IRC | 15:40 | |
clarkb | oh coredns will do caching at least | 15:40 |
clarkb | that is good | 15:40 |
clarkb | so ya the big gap there then is ipv6 only clouds and forcing all requests through NAT which is probably ok if we avoid it as much as possible, have caching, and distribute across providers | 15:40 |
*** jamesmcarthur has joined #openstack-infra | 15:41 | |
mnaser | now to figure out how to feed this config to kubeadm.. | 15:43 |
*** tdasilva has joined #openstack-infra | 15:44 | |
clarkb | mnaser: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/ something like that maybe? | 15:45 |
mnaser | clarkb: coredns runs as a 'deployment' on top of kubernetes so it would have to edit the config it uses by default | 15:46 |
mnaser | im going through the k8s code and it seems to try and read the kube-dns config to grab that info from.. | 15:46 |
mnaser | https://github.com/kubernetes/kubernetes/blob/efcb62abff0ee6511cb5f82b9f25d28b17d81912/cmd/kubeadm/app/phases/addons/dns/dns.go#L175-L202 | 15:47 |
*** mattw4 has joined #openstack-infra | 15:48 | |
clarkb | heh the open issue for this in k8s was closed because there was a crashloop issue in kubeadm that they decided to track instead except the fix for the crashloop issue in kubeadm was to update docs | 15:49 |
openstackgerrit | Jeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator https://review.opendev.org/677315 | 15:50 |
clarkb | https://github.com/kubernetes/kubernetes/issues/71705#issuecomment-444128684 discussion almost gets to the point of addressing this problem then trickles off | 15:51 |
clarkb | mnaser: reading more it sounds like you can get the config map of the running coredns, modify it, then apply those updates. I suppose that only works if it has working dns already? | 15:53 |
clarkb | though coredns should bootstrap without that given its place in the system | 15:53 |
mnaser | clarkb: yeah and it starts feeling a little hacky at that point, plus maybe a kubeadm upgrade would override it again | 15:53 |
mnaser | im thinking if i override resolv-conf though | 15:53 |
mnaser | that might just fix it for me | 15:53 |
mnaser | (and it would also ensure that i don't have to worry about different hosts with different resolv.conf) | 15:54 |
clarkb | to me that seems super heavy handed because now all services on the host too must use the wrong resolver | 15:54 |
*** ykarel is now known as ykarel|afk | 15:54 | |
clarkb | (which is the one saving grace of what docker does, it doesn't modify the host) | 15:54 |
*** mgagne has quit IRC | 15:54 | |
*** mgagne has joined #openstack-infra | 15:55 | |
clarkb | this makes me wonder how the nodepool jobs work | 15:56 |
*** igordc has joined #openstack-infra | 15:56 | |
*** jtomasek has quit IRC | 15:56 | |
*** raissa has quit IRC | 15:57 | |
*** tdasilva has quit IRC | 15:57 | |
*** diablo_rojo has joined #openstack-infra | 15:58 | |
*** larainema has quit IRC | 15:59 | |
clarkb | those jobs use minikube which should run coreDNS too | 16:00 |
clarkb | seems like they are going to use localhost too if I'm reading minikube correctly. How does that work? | 16:03 |
clarkb | maybe the jobs don't resolve anything external from within the container | 16:05 |
*** rpittau is now known as rpittau|afk | 16:05 | |
*** ricolin has quit IRC | 16:05 | |
clarkb | Shrews: ^ do you know if that is the case in the nodepool functional jobs? | 16:07 |
*** markvoelker has quit IRC | 16:08 | |
clarkb | --extra-config=kubelet.resolv-conf= is a flag we can set on minikube to specify the addrs though | 16:08 |
Shrews | clarkb: i don't know. minikube is entirely self contained and the k8s commands "just work". nothing outside of the vm is used | 16:08 |
clarkb | which kubeadm seems to lack | 16:09 |
clarkb | Shrews: ya I think that must be it then. If we ran workload that wanted to talk to say github or opendev or whatever that would fail | 16:09 |
mnaser | we can override kubelet resolv-conf apparently | 16:09 |
Shrews | clarkb: i suspect so | 16:09 |
mnaser | i mean my idea was to have resolvers configured _for_ kubernetes so that my roles/playbooks don't touch the host system | 16:09 |
mnaser | so something lik /etc/resolv-k8s.conf or something | 16:10 |
mnaser | but seems hacky :< | 16:10 |
clarkb | mnaser: I think that is how I would do it | 16:10 |
clarkb | mnaser: which makes sense since k8s runs its own dns resolver | 16:11 |
clarkb | nothing says it has to use the host setup | 16:11 |
clarkb | I'm going to write that change for zuul-jobs/install-kubernetes now while I'm thinking about it | 16:13 |
*** mattw4 has quit IRC | 16:19 | |
*** ykarel|afk is now known as ykarel | 16:20 | |
*** lucasagomes has quit IRC | 16:21 | |
*** markvoelker has joined #openstack-infra | 16:22 | |
*** tdasilva has joined #openstack-infra | 16:22 | |
*** Lucas_Gray has quit IRC | 16:25 | |
*** tdasilva has quit IRC | 16:27 | |
*** armax has joined #openstack-infra | 16:29 | |
*** e0ne has quit IRC | 16:30 | |
*** jpena is now known as jpena|off | 16:30 | |
*** dtantsur is now known as dtantsur|afk | 16:37 | |
openstackgerrit | Clark Boylan proposed zuul/zuul-jobs master: Allow for overriding dns resolvers in install-kubernetes https://review.opendev.org/677787 | 16:37 |
clarkb | mnaser: Shrews ^ something like that for the minikube case | 16:37 |
*** diablo_rojo has quit IRC | 16:40 | |
*** igordc has quit IRC | 16:41 | |
AJaeger | config-core, could you review https://review.opendev.org/#/c/677547/ and https://review.opendev.org/677657 and a cosmetic https://review.opendev.org/677662, please? more changes for promote... | 16:42 |
clarkb | AJaeger: on it | 16:42 |
AJaeger | thanks | 16:42 |
*** spsurya has quit IRC | 16:43 | |
*** Garyx has quit IRC | 16:45 | |
clarkb | yoctozepto: to follow up on the ipv4 problems in limestone I believe the next round of centos image builds should address that | 16:46 |
clarkb | let me see if I should manually trigger builds | 16:46 |
clarkb | hrm we haven't updated diskimage-builder on the builders yet /me checks on that first | 16:47 |
clarkb | we install diskimage-builder when we update nodepool | 16:48 |
clarkb | I think in this case I'll manually update diskimage-builder then trigger centos 7 rebuild | 16:49 |
*** Garyx has joined #openstack-infra | 16:49 | |
pabelanger | has that been the case the whole time? I thought we managed that independently | 16:50 |
clarkb | I think it must've changed at some point | 16:51 |
clarkb | #status log updated diskimage-builder to 2.26.0 on nodepool builders to pick up centos network manager ipv4 fix | 16:52 |
openstackstatus | clarkb: finished logging | 16:52 |
*** noorul has joined #openstack-infra | 16:52 | |
clarkb | I've also kindly asked nodepool to build centos 7 but it is already building some suse stuff so might be a little while | 16:52 |
*** derekh has quit IRC | 17:00 | |
*** bobh has quit IRC | 17:01 | |
clarkb | centos7 is now building on nb01 | 17:02 |
*** psachin has quit IRC | 17:04 | |
clarkb | fdegir: ya fungi was prodding people to see what is necessary to fix that | 17:04 |
clarkb | fdegir: sounds like tumbleweed is having a repo reorg so we may just have to wait | 17:05 |
clarkb | fdegir: that said I recently tested if fsyncing was necessary and couldn't find a case where it was | 17:05 |
clarkb | fdegir: once python exits it should flush and sync all the things to disk | 17:05 |
clarkb | fdegir: if you need those files on disk before glean exits then maybe that is the problem? | 17:05 |
*** noorul has quit IRC | 17:08 | |
*** noorul has joined #openstack-infra | 17:13 | |
*** mattw4 has joined #openstack-infra | 17:13 | |
guilhermesp | infra-root: the following hold can be deleted 104.130.127.203 | 17:15 |
clarkb | out of curiosity what was the problem (and any idea why it wasn't reproducible locally?) | 17:17 |
*** diga has quit IRC | 17:18 | |
clarkb | guilhermesp: ^ | 17:19 |
*** mattw4 has quit IRC | 17:19 | |
clarkb | delete has been submitted to nodepool | 17:19 |
*** noorul has quit IRC | 17:20 | |
*** priteau has quit IRC | 17:21 | |
clarkb | fedora mirror is now in same size range as ubuntu \o/ | 17:22 |
clarkb | hopefully that helps make releases for it more reliable | 17:22 |
guilhermesp | clarkb: seems that we are having issues with mixed setup (source/distro) installs with debian based OS and having duplication of tempest plugins for heat. We are rebasing the broken patch to see if recent changes regarding the issue fixed the job :) | 17:23 |
*** noorul has joined #openstack-infra | 17:25 | |
melwitt | can anyone point me to docs or otherwise give me a pointer as to whether a non-zero return value from a post test hook script is expected to result in a POST_FAILURE rather than a FAILED job status? | 17:26 |
melwitt | example: https://zuul.opendev.org/t/openstack/build/707d40fa5ad54efd8e8ea4ea9d10812e | 17:27 |
clarkb | melwitt: https://zuul-ci.org/docs/zuul/user/jobs.html#build-status should be POST_FAILURE | 17:28 |
AJaeger | melwitt: if the run playbook does not generate the content that is required by the post-run, it fails with post-failure. | 17:29 |
clarkb | melwitt: https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_40/672840/9/check/nova-next/707d40f/ara-report/reports/53598c9a-c84c-4374-934c-f14acf081790.html that is why you get that return result | 17:29 |
AJaeger | So, might be that the post-run is not handling missing content correctly | 17:29 |
fdegir | clarkb: here is the background for that glean fsync | 17:29 |
fdegir | clarkb: as you know, network configuration is done by glean via a linux service (one registration for each interface) | 17:30 |
fdegir | clarkb: the pre part is to run glean to generate the needed files, and the main part is to run “ifup”. | 17:30 |
fdegir | clarkb: however, glean does not attempt to do os.fsync after writing files, which is not safe in python, | 17:30 |
*** noorul has quit IRC | 17:30 | |
fdegir | clarkb: as python will not guarantee the file being written even if it is closed in the python level. | 17:30 |
fdegir | clarkb: if we are lucky, ifup may complain unknown interface as the file is still in cache. | 17:30 |
AJaeger | clarkb, melwitt run-post-test-hook runs in post-run, so gets POST_FAILURE. So, question is: Should it run there? ;) | 17:31 |
clarkb | fdegir: it is my understanding that when the python process exits it flushes though? | 17:31 |
clarkb | I mean your change won't hurt anything but I spent some time trying to reproduce that recently and wasn't able to | 17:32 |
*** ociuhandu has quit IRC | 17:32 | |
fdegir | clarkb: that is the luck part | 17:32 |
clarkb | but also even if the file is still in cache other readers should see it via the cache | 17:32 |
fdegir | clarkb: it happens randomly and when we patched glean locally and rebuilt the images with the patched version, we never experienced this issue | 17:32 |
clarkb | it would only be a problem if you hard rebooted under it I think | 17:32 |
clarkb | fdegir: ya could also be a difference in filesystem behavior. I was testing on ext4 | 17:33 |
melwitt | clarkb, AJaeger: thanks! looks like POST_FAILURE is always expected in this case then (good). I think I just hadn't seen a failure before and was surprised when it didn't say FAILED. I shall crunch on the info you linked | 17:35 |
*** noorul has joined #openstack-infra | 17:35 | |
fdegir | clarkb: regarding tumbleweed issue - is there any place where i can check to see if it is resolved and then try recheck? | 17:35 |
fdegir | rather than doing random rechecks unnecessarily? | 17:35 |
clarkb | fdegir: it sounded like fungi was going to try and shepherd your change in? we can probably ping you here when we believe tumbleweed has stabilized | 17:36 |
clarkb | fdegir: reading more on python stuff. closing the file is supposed to be sufficient | 17:36 |
fdegir | clarkb: that would be great | 17:36 |
clarkb | makes me wonder if there is some funny bug in either python or the system that this is hitting a corner case of | 17:37 |
*** noorul has quit IRC | 17:38 | |
fdegir | if i remember correctly, we faced this both on ubuntu1604 and centos7 | 17:38 |
*** noorul has joined #openstack-infra | 17:38 | |
fdegir | but since it's been a while since the change was submitted and we continued using the image we built with locally patched glean, i've forgotten the details | 17:39 |
clarkb | ya thats fine. As mentioned your change won't hurt anything and is just more explicit (which is why it makes me wonder of a corner case) | 17:39 |
clarkb | the close and process exit (if clean) should handle it for us but maybe in some cases on some filesystems or whatever it doesn't | 17:39 |
clarkb | and in those cases being explicit is fine | 17:40 |
fdegir | in any case, i'll talk to my colleagues to see if they remember more and bring the info back for the people | 17:41 |
*** odicha has quit IRC | 17:44 | |
*** efried_afk is now known as dansmith1 | 17:46 | |
*** dansmith1 is now known as efried | 17:46 | |
*** e0ne has joined #openstack-infra | 17:49 | |
aspiers | clarkb, fungi, corvus: in case you missed it, this got uploaded yesterday (one year later!) https://www.youtube.com/watch?v=kM7dxm1O1jg | 17:51 |
*** bobh has joined #openstack-infra | 17:52 | |
openstackgerrit | Dirk Mueller proposed openstack/diskimage-builder master: zypper-minimal: Don't get confused by etc/resolv.conf symlink https://review.opendev.org/677796 | 17:57 |
openstackgerrit | Dirk Mueller proposed opendev/glean master: Sync when writing the file https://review.opendev.org/652238 | 17:58 |
*** e0ne has quit IRC | 18:01 | |
*** igordc has joined #openstack-infra | 18:03 | |
*** dave-mccowan has joined #openstack-infra | 18:06 | |
*** gfidente is now known as gfidente|afk | 18:06 | |
donnyd | How much divergence is there between software-factory and the openstack infra aside from the obvious parts | 18:10 |
*** bobh has quit IRC | 18:10 | |
*** dave-mccowan has quit IRC | 18:11 | |
clarkb | donnyd: they often run patches to zuul and nodepool for unmerged changes to add in features (not sure if that has slowed down as more code has merged to zuul) | 18:11 |
clarkb | other than that and the obvious bits like bug tracker etc I think it is pretty close | 18:12 |
AJaeger | infra-root, a couple of more delete request for wrong content from ages ago: http://files.openstack.org/docs/project-deploy-guide/OpenStack-Ansible/ http://files.openstack.org/docs/project-deploy-guide/kolla-ansible/draft/ http://files.openstack.org/docs/project-deploy-guide/kolla-ansible/html/ http://files.openstack.org/docs/project-deploy-guide/openstack-ansible/draft/ | 18:12 |
AJaeger | http://files.openstack.org/docs/project-deploy-guide/openstack-ansible/html/ - thanks | 18:12 |
*** diablo_rojo has joined #openstack-infra | 18:13 | |
donnyd | It looked like it was pretty close. I only ask for the purpose of 3P CI wanting to be as close as possible... and short of rolling their own it looks to be a great place to start | 18:13 |
*** pkopec has quit IRC | 18:14 | |
clarkb | donnyd: there are also differences in the test node images | 18:14 |
clarkb | I want to say they run a lot more distro packages than we do | 18:15 |
clarkb | for things like tox | 18:15 |
donnyd | You mean the images nodepool builds? Can't that be customized in the SF deployment? or is it baked in | 18:16 |
*** markvoelker has quit IRC | 18:16 | |
clarkb | ya the images nodepool builds. I don't think SF lets users customize those | 18:17 |
clarkb | but I may be wrong | 18:17 |
donnyd | Thanks for filling in the gaps for me clarkb | 18:17 |
clarkb | you could do runtime changes though | 18:17 |
tristanC | clarkb: donnyd: most of our patches are now merged, only three left, see: https://softwarefactory-project.io/cgit/scl/zuul-distgit/tree/ and https://softwarefactory-project.io/cgit/scl/nodepool-distgit/tree/ | 18:18 |
tristanC | donnyd: SF lets you build the same image as upstream, using diskimage-builder and custom elements, see: https://softwarefactory-project.io/cgit/config/tree/nodepool/rdo-cloud.yaml | 18:20 |
*** markvoelker has joined #openstack-infra | 18:22 | |
tristanC | donnyd: and SF also supports container-based systems and virt-customize images too | 18:23 |
tristanC | but for opendev 3P CI, it's easier to use the same dib elements as openstack-infra | 18:24 |
*** ramishra has quit IRC | 18:24 | |
donnyd | Right, but I could potentially just plug in the same as infra and it should work the same | 18:24 |
tristanC | donnyd: yes, that's what we do for review.rdoproject.org 3P CI | 18:25 |
openstackgerrit | Akihiro Motoki proposed openstack/openstack-zuul-jobs master: Add support for building PDFs https://review.opendev.org/664555 | 18:28 |
*** markvoelker has quit IRC | 18:30 | |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Add more promote jobs https://review.opendev.org/677799 | 18:31 |
*** jtomasek has joined #openstack-infra | 18:32 | |
clarkb | tristanC: what is a virt-customize image in the context of nodepool? Is that running a container to execute virt-customize, then executing it with libvirt? | 18:32 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Add more promote jobs https://review.opendev.org/677799 | 18:33 |
*** kopecmartin is now known as kopecmartin|off | 18:33 | |
clarkb | or maybe running virt-customize within dib? | 18:33 |
donnyd | I would think it would be virt-customize in lieu of dib | 18:34 |
clarkb | donnyd: ya but nodepool doesn't support that | 18:34 |
clarkb | maybe that is a change they are carrying? I didn't look at the patch list | 18:34 |
*** ralonsoh has quit IRC | 18:35 | |
tristanC | clarkb: it's using a fake disk-image-create script injected with the PATH env... | 18:35 |
clarkb | (fwiw in the past it's been suggested that tools like that and libguestfs could be used as a dib build mechanism and keep all the existing elements and other driver compat, but I don't think anyone added that to dib either. Could be there are patches there) | 18:35 |
fungi | fdegir: clarkb: yeah, it looks like dirk pushed up some dib fixes while i was running errands so hopefully we can get those confirmed and added almost as easily as temporarily setting the job non-voting | 18:35 |
*** ralonsoh has joined #openstack-infra | 18:35 | |
pabelanger | clarkb: http://libguestfs.org/virt-dib.1.html | 18:36 |
pabelanger | never tried it | 18:36 |
tristanC | e.g.: https://softwarefactory-project.io/cgit/config/tree/nodepool/elements/virt-customize/disk-image-create | 18:36 |
clarkb | pabelanger: ah ok so they did it the other way around | 18:36 |
clarkb | pabelanger: the suggestion we made was that running disk-image-create --guestfs or whatever would build the image in a libguestfs VM and run elements there | 18:36 |
pabelanger | but for ansible network, we have some vendors doing virt-customize (vyos for example). So plan on trying it out someday. | 18:36 |
clarkb | pabelanger: basically something to replace the chroot builder that exists today if you don't mind nested virt or want other features that libguestfs provides | 18:37 |
clarkb | tristanC: seems like we should be able to support that better than subverting a testing mechanism that could be changed in nodepool? | 18:37 |
clarkb | that definitely isn't part of any supported user contract | 18:37 |
pabelanger | yah, big thing we need, is boot appliance and interact with some console port first | 18:38 |
pabelanger | to setup SSH | 18:38 |
pabelanger | for now, we qemu-img directly and pepect | 18:38 |
pabelanger | pexpect* | 18:38 |
pabelanger | I think virt-customize has a cleaning interface there | 18:38 |
tristanC | clarkb: oh yes, that would be easier, nodepool could directly use a non dib create process | 18:38 |
pabelanger | I'm waiting for https://review.opendev.org/672196/ before trying it, just drop down different .sh files | 18:39 |
tristanC | clarkb: but in the meantime, that has been working quite well, the main reason is to be able to use the same cloud image as published by the distrib | 18:40 |
clarkb | pabelanger: do you have nested virt for that or is it just expected to be slow? | 18:40 |
openstackgerrit | Andreas Jaeger proposed openstack/openstack-zuul-jobs master: Switch deploy-guides to promote publishing https://review.opendev.org/677803 | 18:40 |
clarkb | tristanC: just so you know dib supports that already | 18:40 |
clarkb | tristanC: the centos element for example works that way | 18:40 |
clarkb | (centos-minimal builds the image from scratch instead) | 18:40 |
pabelanger | clarkb: yah, we need nested virt for a few appliances. Cisco specifically, like 12GB / 24GB, not fun | 18:40 |
clarkb | same is true with ubuntu vs ubuntu-minimal elements | 18:40 |
pabelanger | smaller ones, we don't | 18:40 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Remove now obsolete publish jobs https://review.opendev.org/677804 | 18:42 |
clarkb | AJaeger: for your afs deletes what makes the content wrong? | 18:43 |
AJaeger | clarkb: those were errors in publishing years ago | 18:43 |
clarkb | oh I see the timestamps are 2016 | 18:43 |
fungi | mis-published to wrong path? | 18:43 |
AJaeger | nothing should link to them | 18:43 |
AJaeger | yes, mispublished | 18:43 |
fungi | and so never cleaned up. got it | 18:43 |
AJaeger | So, instead of http://files.openstack.org/docs/project-deploy-guide/OpenStack-Ansible/ we have http://files.openstack.org/docs/project-deploy-guide/openstack-ansible/ | 18:44 |
AJaeger | and the draft and html directories are wrong | 18:44 |
clarkb | AJaeger: the correct one is the lowercase version? | 18:44 |
AJaeger | yes | 18:44 |
*** e0ne has joined #openstack-infra | 18:44 | |
AJaeger | And it's also /draft and /html in those only - last change in 2016, 2017, or 2018 | 18:45 |
AJaeger | /OpenStack-Ansible/draft is wrong | 18:45 |
AJaeger | and nothing else in /OpenStack-Ansible | 18:45 |
AJaeger | fungi, if you have time later, could you review https://review.opendev.org/#/c/677547/ and https://review.opendev.org/677657, https://review.opendev.org/677799 and a cosmetic https://review.opendev.org/677662, please? more changes for promote | 18:47 |
*** kjackal has joined #openstack-infra | 18:47 | |
rlandy|ruck | hi - we used to have a footer in our (tripleo) upstream logs and we lost that with the new logging change (https://bugs.launchpad.net/tripleo/+bug/1840818). Is there any expectation that the footer will be re-enabled or is this something we should place in our own log docs now? | 18:51 |
openstack | Launchpad bug 1840818 in tripleo "Headers and footers no longer appear in upstream logss" [Medium,Triaged] - Assigned to Ronelle Landy (rlandy) | 18:51 |
clarkb | rlandy|ruck: that might be a better question for #zuul as we'd need to update the zuul dashboard to support that I think | 18:52 |
*** factor has quit IRC | 18:52 | |
clarkb | rlandy|ruck: I expect zuul would be open to that. Could have a special file in a dir that zuul dashboard renders as part of the dashboard? | 18:52 |
rlandy|ruck | clarkb: thanks - will ask there | 18:53 |
rlandy|ruck | we could even add a new log file | 18:53 |
rlandy|ruck | the footer was just a convenient place to point users at | 18:53 |
clarkb | rlandy|ruck: that would be doable in the existing system | 18:54 |
clarkb | a README file per dir or something | 18:54 |
rlandy|ruck | so we have that in places | 18:54 |
clarkb | and then maybe later we update zuul to render that in the dashboard automatically | 18:54 |
rlandy|ruck | some people don't look there though :) | 18:54 |
*** mriedem has quit IRC | 18:58 | |
*** mriedem has joined #openstack-infra | 18:59 | |
clarkb | centos-7 image in limestone updated just under an hour ago | 19:00 |
clarkb | yoctozepto: sshnaidm ^ I think that means we should have more reliable ipv4 now. However this is a race between network manager and the kernel. Please do let us know if you see it persist | 19:00 |
*** ralonsoh has quit IRC | 19:05 | |
openstackgerrit | Akihiro Motoki proposed openstack/openstack-zuul-jobs master: Add support for building PDFs https://review.opendev.org/664555 | 19:12 |
*** jtomasek has quit IRC | 19:15 | |
*** jtomasek has joined #openstack-infra | 19:17 | |
openstackgerrit | Sorin Sbarnea proposed opendev/bindep master: Expose base python version as an atom https://review.opendev.org/639951 | 19:19 |
fdegir | thanks to dirk, opensuse job works now and glean change https://review.opendev.org/#/c/652238/ passed the check | 19:26 |
yoctozepto | clarkb: thanks! :D | 19:27 |
fungi | awesome, thanks for the heads up fdegir! | 19:28 |
fungi | and thanks for fixing that, dirk! | 19:28 |
fungi | dirk: i think the rm is fine since it's conditional on a test that it is actually a link anyway | 19:31 |
fungi | so unless we're running that on a tumbleweed server and the main /etc/resolv.conf is a symlink already and $TARGET_ROOT is for some reason empty, i don't think it'll end up causing any collateral damage | 19:32 |
*** e0ne has quit IRC | 19:32 | |
*** e0ne has joined #openstack-infra | 19:33 | |
clarkb | fungi: it's the tee, not the rm, that is a problem aiui | 19:35 |
clarkb | the rm is added so the tee writes in the chroot and not out of it | 19:36 |
fungi | right, i just meant as a response to his question about doing that vs a sudo chroot | 19:37 |
openstackgerrit | Sorin Sbarnea proposed opendev/bindep master: Expose base python version as an atom https://review.opendev.org/639951 | 19:39 |
clarkb | heading out on the bike now. back in a bit | 19:40 |
*** markvoelker has joined #openstack-infra | 19:44 | |
*** markvoelker has quit IRC | 19:49 | |
*** eernst has joined #openstack-infra | 19:52 | |
*** factor has joined #openstack-infra | 19:55 | |
*** eernst has quit IRC | 19:57 | |
*** eernst_ has joined #openstack-infra | 19:57 | |
*** noorul has quit IRC | 19:57 | |
*** noorul has joined #openstack-infra | 19:58 | |
sshnaidm | clarkb, thanks, will keep eye | 19:58 |
openstackgerrit | Akihiro Motoki proposed openstack/openstack-zuul-jobs master: Add support for building PDFs https://review.opendev.org/664555 | 20:00 |
*** noorul has quit IRC | 20:06 | |
*** e0ne has quit IRC | 20:10 | |
*** e0ne has joined #openstack-infra | 20:12 | |
*** ykarel has quit IRC | 20:13 | |
openstackgerrit | Akihiro Motoki proposed openstack/openstack-zuul-jobs master: Add support for building PDFs https://review.opendev.org/664555 | 20:14 |
*** kjackal has quit IRC | 20:15 | |
*** ykarel has joined #openstack-infra | 20:16 | |
*** eernst_ has quit IRC | 20:21 | |
*** noorul has joined #openstack-infra | 20:21 | |
*** jtomasek has quit IRC | 20:21 | |
*** jamesmcarthur has quit IRC | 20:23 | |
*** noorul has quit IRC | 20:26 | |
*** noorul has joined #openstack-infra | 20:41 | |
openstackgerrit | Merged openstack/diskimage-builder master: zypper-minimal: Don't get confused by etc/resolv.conf symlink https://review.opendev.org/677796 | 20:43 |
openstackgerrit | Akihiro Motoki proposed openstack/openstack-zuul-jobs master: Add support for building PDFs https://review.opendev.org/664555 | 20:43 |
*** jamesmcarthur has joined #openstack-infra | 20:44 | |
*** e0ne has quit IRC | 20:44 | |
*** noorul has quit IRC | 20:47 | |
*** ykarel has quit IRC | 20:48 | |
*** noorul has joined #openstack-infra | 20:52 | |
*** bobh has joined #openstack-infra | 20:55 | |
mnaser | is the zuul console a lil broken? | 20:55 |
mnaser | i got two builds and i just have "--- END OF STREAM ---" listed there | 20:56 |
clarkb | if the jobs just started that usually indicates the job isn't far enough along to be running the port 19885 console streaming service on the remote node | 20:56 |
*** noorul has quit IRC | 20:56 | |
clarkb | if it happens well after job has started I think that means something prevents that network connection? | 20:57 |
*** bobh has quit IRC | 21:00 | |
*** eernst has joined #openstack-infra | 21:00 | |
*** jamesmcarthur has quit IRC | 21:01 | |
*** noorul has joined #openstack-infra | 21:02 | |
openstackgerrit | Sean McGinnis proposed opendev/irc-meetings master: Switch release team to 1600 UTC https://review.opendev.org/677831 | 21:02 |
*** noorul has quit IRC | 21:07 | |
mnaser | clarkb: in my case, it seems like something is off, i had a job eventually fail | 21:08 |
mnaser | (so it was running ok) | 21:08 |
mnaser | in my case http://zuul.opendev.org/t/opendev/stream/4842112d931c4836ab78645823b744f3?logfile=console.log has been running for a while but still giving end of stream | 21:08 |
fungi | yeah, if something kills the streaming daemon (oom, reboot) or blocks access to its socket (firewall or networking changes) then that can cause it | 21:11 |
fungi | logs should still be collected and reported at the end of the build, as long as log collection via rsync isn't also broken | 21:12 |
*** noorul has joined #openstack-infra | 21:12 | |
*** Goneri has quit IRC | 21:12 | |
fungi | maybe there will be something in there which explains the log streamer not working | 21:12 |
clarkb | that job is running on ze10 | 21:14 |
clarkb | I cannot ssh onto the test node as root | 21:15 |
*** jamesmcarthur has joined #openstack-infra | 21:16 | |
clarkb | that implies to me that the host is unhappy? | 21:16 |
*** altlogbot_3 has quit IRC | 21:16 | |
*** noorul has quit IRC | 21:17 | |
clarkb | oh maybe the job is done? | 21:17 |
clarkb | it doesn't seem to have logged its completion | 21:17 |
clarkb | but those test nodes are unknown by nodepool right now | 21:17 |
clarkb | oh nevermind I can't grep by hostname anymore in nodepool output | 21:18 |
*** mriedem has quit IRC | 21:18 | |
clarkb | all three nodes have a process listening at 19885 | 21:20 |
corvus | zuul 21479 0.0 0.0 0 0 ? Z Aug09 2:19 [zuul-executor] <defunct> | 21:22 |
*** noorul has joined #openstack-infra | 21:22 | |
corvus | clarkb: ^ the proxy on the executor died | 21:22 |
clarkb | ah | 21:22 |
clarkb | fwiw if I nc to the hosts port 19885 I get nothing either (do I need to use finger protocol to that service too?) | 21:23 |
clarkb | looks like there was an OOM on the 17th but zuul-executor was last restarted on the 9th | 21:23 |
clarkb | that may be where the proxy was killed I guess | 21:23 |
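The port check clarkb describes above can be approximated with a short Python probe. This is only a sketch: the helper name `port_accepts_connections` and its timeout are hypothetical, and a successful connect shows only that something is listening on the port, not that the log streamer behind it is actually serving data (clarkb's nc connected but got nothing back).

```python
import socket


def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; 19885 is the console
    streaming port mentioned in the log, but any port works here.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

A call such as `port_accepts_connections(node_ip, 19885)` against each test node would reproduce the "process listening at 19885" check, where `node_ip` stands in for an address not given in the log.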
clarkb | looks like 09 and 10 are in that same situation. Should we restart the zuul-executor service on those hosts? | 21:27 |
clarkb | also should zuul consider waitpid'ing its child processes and restarting them if necessary? | 21:27 |
*** noorul has quit IRC | 21:27 | |
*** rcernin has joined #openstack-infra | 21:27 | |
clarkb | (I can restart the zuul-executor service on 09 and 10 if that is what we want to do there) | 21:28 |
*** mattw4 has joined #openstack-infra | 21:30 | |
*** bobh has joined #openstack-infra | 21:32 | |
clarkb | infra-root ^ any objection to that? I doubt there is much we can do to debug in its current state. I've confirmed in 09's dmesg output that OOMKiller killed the child zuul-executor process | 21:32 |
*** noorul has joined #openstack-infra | 21:32 | |
ianw | ++ | 21:33 |
clarkb | ianw: did you see the dib fix for tumbleweed and the followup to make fdegir's glean usage happy? | 21:33 |
clarkb | (we might want to do releases of those two to close that out if possible) | 21:33 |
corvus | clarkb: ++ also waitpid++ | 21:34 |
clarkb | I'll do 09 and 10 sequentially to reduce any potential load on the other executors that might trigger the same thing there | 21:34 |
ianw | clarkb: will check on it | 21:35 |
*** altlogbot_2 has joined #openstack-infra | 21:37 | |
*** noorul has quit IRC | 21:37 | |
*** altlogbot_2 has quit IRC | 21:38 | |
*** sshnaidm is now known as sshnaidm|afk | 21:40 | |
*** altlogbot_2 has joined #openstack-infra | 21:41 | |
ianw | clarkb / fungi : ahh, so did 677796 (deleting outside chroot) break our builders? | 21:41 |
fungi | clarkb: sounds fine to me | 21:41 |
clarkb | ianw: you know I didn't check but other builds seemed to be working? | 21:42 |
clarkb | ianw: possible that puppet is fixing it for us? | 21:42 |
*** altlogbot_2 has quit IRC | 21:42 | |
fungi | ianw: tumbleweed changed to having /etc/resolv.conf symlink into /run so when editing it outside of a chroot the tee was unable to write into that symlink as it was a dead link | 21:42 |
*** noorul has joined #openstack-infra | 21:42 | |
clarkb | 09 zuul-executor is done | 21:42 |
clarkb | doing 10 now | 21:42 |
fungi | ianw: by deleting the symlink from the filesystem subtree, tee now creates a normal file at that location as intended | 21:43 |
ianw | cool; i just released 2.26.1 with just that fix | 21:45 |
ianw | fdegir: ^ | 21:46 |
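The symlink problem fungi explains above can be illustrated with a small Python sketch. It is not the actual diskimage-builder change (that is a shell element using rm and tee); the function name `write_resolv_conf` is hypothetical. It assumes, as in the discussion, that etc/resolv.conf under the target root may be a symlink whose target does not exist from outside the chroot, and it refuses an empty root, which is the case fungi flagged as the only risky one.

```python
import os
import tempfile


def write_resolv_conf(target_root, contents):
    """Write etc/resolv.conf under target_root as a regular file.

    Hypothetical helper for illustration only.
    """
    if not target_root:
        # An empty root would make us touch the host's own /etc/resolv.conf.
        raise ValueError("target_root must not be empty")
    path = os.path.join(target_root, "etc", "resolv.conf")
    # Only remove the entry when it really is a symlink, mirroring the
    # conditional rm discussed in the log.
    if os.path.islink(path):
        os.unlink(path)
    # With the link gone, this creates a normal file inside the subtree
    # instead of following the link out of it.
    with open(path, "w") as f:
        f.write(contents)
    return path
```

Without the `os.unlink` step, opening the path for writing would follow the link, and either fail on a dead target or write outside the image tree.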
*** jamesmcarthur has quit IRC | 21:46 | |
fdegir | ianw: thx - i'm waiting for this to go in as well - https://review.opendev.org/#/c/652238/ | 21:46 |
*** jamesmcarthur has joined #openstack-infra | 21:47 | |
*** noorul has quit IRC | 21:47 | |
ianw | fdegir: np, just having a look now ... ISTR very similar issues, i wonder if it was an old change or we just missed this | 21:48 |
clarkb | and now 10 is restarted | 21:51 |
*** jamesmcarthur has quit IRC | 21:51 | |
*** jamesmcarthur has joined #openstack-infra | 21:51 | |
ianw | i'm thinking of the issues we had @ ~ http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-07-25.log.html#t2018-07-25T00:37:51 with the centos ssh key being written but not available | 21:51 |
*** noorul has joined #openstack-infra | 21:52 | |
*** rlandy|ruck is now known as rlandy|ruck|bbl | 21:57 | |
ianw | i'm not sure why we didn't sync at that time too, anyway, LGTM | 21:57 |
*** noorul has quit IRC | 21:58 | |
ianw | looks like that will basically be it from 1.14.1 in glean -- fdegir do you need a release? | 21:58 |
*** markvoelker has joined #openstack-infra | 21:59 | |
*** bnemec has quit IRC | 21:59 | |
fdegir | ianw: yes if it is not too much to ask | 21:59 |
*** jeliu_ has quit IRC | 22:01 | |
*** jeliu_ has joined #openstack-infra | 22:01 | |
*** noorul has joined #openstack-infra | 22:03 | |
*** diablo_rojo has quit IRC | 22:04 | |
*** noorul has quit IRC | 22:08 | |
*** jeliu_ has quit IRC | 22:09 | |
openstackgerrit | Clark Boylan proposed zuul/zuul master: Restart log streamer if it dies https://review.opendev.org/677846 | 22:09 |
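The waitpid idea that clarkb raised and corvus seconded can be sketched roughly as follows. This is an illustration only, not the content of change 677846: the names `supervise`, `max_restarts`, `poll_interval` and `DEMO_CMD` are hypothetical, and a real daemon would likely restart without a cap and log each exit.

```python
import subprocess
import sys
import time

# Hypothetical short-lived child used only to demonstrate the loop.
DEMO_CMD = [sys.executable, "-c", "pass"]


def supervise(cmd, max_restarts=3, poll_interval=0.1):
    """Run cmd as a child process and restart it each time it exits.

    Returns the number of restarts performed once the cap is reached.
    """
    restarts = 0
    child = subprocess.Popen(cmd)
    while True:
        # On POSIX, Popen.poll() does a non-blocking waitpid(), which also
        # reaps the child so it does not linger as a <defunct> entry.
        if child.poll() is None:
            time.sleep(poll_interval)
            continue
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        child = subprocess.Popen(cmd)
```

Calling `supervise(DEMO_CMD, max_restarts=2)` starts the child three times in total and returns after the third exit.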
*** noorul has joined #openstack-infra | 22:13 | |
*** jamesmcarthur has quit IRC | 22:15 | |
*** noorul has quit IRC | 22:18 | |
*** rh-jelabarre has quit IRC | 22:21 | |
*** bobh has quit IRC | 22:22 | |
*** bobh has joined #openstack-infra | 22:22 | |
*** dchen has joined #openstack-infra | 22:23 | |
*** noorul has joined #openstack-infra | 22:23 | |
*** dchen has quit IRC | 22:23 | |
*** markvoelker has quit IRC | 22:24 | |
*** dchen has joined #openstack-infra | 22:25 | |
*** bobh has quit IRC | 22:27 | |
*** noorul has quit IRC | 22:27 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Switch ansible_default to 2.8 https://review.opendev.org/676695 | 22:29 |
openstackgerrit | Paul Belanger proposed zuul/zuul master: WIP: Support Ansible 2.9 https://review.opendev.org/674854 | 22:29 |
*** noorul has joined #openstack-infra | 22:33 | |
*** threestrands has joined #openstack-infra | 22:34 | |
openstackgerrit | Merged zuul/zuul master: Add Tristan to Zuul Maintainers https://review.opendev.org/677308 | 22:35 |
*** markvoelker has joined #openstack-infra | 22:35 | |
*** noorul has quit IRC | 22:37 | |
*** rcernin has quit IRC | 22:40 | |
*** rosmaita has quit IRC | 22:40 | |
*** markvoelker has quit IRC | 22:40 | |
*** mattw4 has quit IRC | 22:41 | |
*** eharney has quit IRC | 22:41 | |
*** noorul has joined #openstack-infra | 22:43 | |
*** rcernin has joined #openstack-infra | 22:43 | |
*** eernst has quit IRC | 22:44 | |
*** eernst has joined #openstack-infra | 22:45 | |
*** noorul has quit IRC | 22:48 | |
openstackgerrit | Merged opendev/glean master: Sync when writing the file https://review.opendev.org/652238 | 22:51 |
*** eernst has quit IRC | 22:56 | |
*** tkajinam has joined #openstack-infra | 22:56 | |
*** eernst has joined #openstack-infra | 22:56 | |
*** jamesmcarthur has joined #openstack-infra | 22:57 | |
*** aaronsheffield has quit IRC | 23:13 | |
donnyd | who owns windmill? | 23:15 |
ianw | donnyd: pabelanger i would say | 23:16 |
ianw | fdegir: glean 1.15.0 should have that change, thanks! | 23:16 |
donnyd | I wasn't sure which channel messages from windmill get posted to | 23:16 |
ianw | there is an #openstack-windmill | 23:17 |
donnyd | I just looked and I don't see anything... but i may have been a second too late | 23:17 |
pabelanger | yup, still kicking | 23:20 |
pabelanger | we use it to deploy zuul.ansible.com | 23:20 |
donnyd | hrm, can't resolve that | 23:25 |
corvus | we really should see if we can do something about that | 23:25 |
corvus | dashboard.zuul.ansible.com | 23:25 |
donnyd | I added windmill.ansible to the requirements because it was not there | 23:26 |
donnyd | https://review.opendev.org/#/c/677850/ | 23:26 |
pabelanger | yah, noticed that too. gets pulled in via windmill.ops | 23:26 |
pabelanger | was trying to convert it to a collection | 23:26 |
donnyd | I am trying to get it running, but i am not sure if I am doing it right | 23:27 |
*** bobh has joined #openstack-infra | 23:28 | |
pabelanger | yah, it is poorly documented. I really need to fix that | 23:28 |
pabelanger | basically, you need to also setup windmill-config | 23:28 |
pabelanger | https://github.com/ansible-network/windmill-config/ is an example for ansible zuul | 23:29 |
donnyd | Well all the playbooks ran successfully | 23:29 |
donnyd | I just ran it as is with windmill-config | 23:31 |
donnyd | but I am not sure i am doing it right | 23:31 |
pabelanger | we can move to openstack-windmill if you want | 23:31 |
*** sgw has quit IRC | 23:34 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Add arm64 based functional test https://review.opendev.org/676111 | 23:39 |
ianw | ^ not sure if anyone has any thoughts on adding that ... it would have caught the recent "efi blows out disk usage calculations and build runs out of space" issues | 23:40 |
ianw | rsync: link_stat "/repositories/Virtualization:/containers/openSUSE_Leap_42.3/." (in opensuse) failed: Input/output error (5) | 23:41 |
ianw | is this our end (afs) or the remote end i wonder? | 23:41 |
donnyd | ianw: do the providers need to support efi boot for VM's ? | 23:45 |
ianw | donnyd: with arm64 they only support efi. i have had an x86-64 efi based vm that dib created booting in qemu, but we've never tried it in a provider afaik | 23:46 |
donnyd | ok | 23:47 |
donnyd | :) | 23:47 |
*** sthussey has quit IRC | 23:50 | |
ianw | ok, so that opensuse mirror error, i think the afs part is a red herring and it's the remote end | 23:52 |
ianw | http://paste.openstack.org/show/761422/ | 23:52 |
ianw | same thing happens if i just try it to /tmp | 23:52 |
ianw | so it looks like rsync://provo-mirror.opensuse.org is having issues | 23:53 |
ianw | fdegir AJaeger dirk: ^ | 23:53 |
ianw | logs at http://files.openstack.org/mirror/logs/rsync-mirrors/opensuse.log | 23:53 |
clarkb | ya I think dirk said they are reorging repo structure | 23:54 |
ianw | infra-root: can we merge https://review.opendev.org/#/c/671963/ so those logs come up in the browser for people looking | 23:54 |
ianw | clarkb: cool, when i see i/o errors + afs + new mirror-update server i just get a little worried :) | 23:54 |
*** jamesmcarthur has quit IRC | 23:56 |