*** ysandeep|out is now known as ysandeep | 05:05 | |
ysandeep | o/ good morning | 05:06 |
---|---|---|
ysandeep | reviewbot, please add in reviewlist: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/871493 | 05:06 |
reviewbot | I have added your review to the Review list | 05:06 |
ysandeep | akahat, hi o/ Once you wrote doc - How to deploy standalone using ansible.. do you still have that handy? | 06:05 |
marios | o/ | 06:09 |
ysandeep | hey marios o/ | 06:16 |
marios | o/ | 06:25 |
*** amoralej|off is now known as amoralej | 07:13 | |
Tengu|rover | hello there | 07:21 |
*** ysandeep is now known as ysandeep|lunch | 07:39 | |
akahat | ysandeep, o/ https://amolkahat.github.io/deploy-standalone-tripleo-using-tripleo-operator-ansible.html | 07:39 |
Tengu|rover | oh, there were promotions over the night. gooood | 07:46 |
*** ysandeep|lunch is now known as ysandeep | 08:10 | |
ysandeep | akahat, thanks! | 08:10 |
Tengu|rover | ok,. what's missing in wallaby for a promotion.... | 08:22 |
marios | bhagyashris|ruck: o/ | 08:31 |
*** jpena|off is now known as jpena | 08:35 | |
*** ysandeep is now known as ysandeep|afk | 08:55 | |
*** ysandeep|afk is now known as ysandeep | 09:07 | |
ysandeep | rlandy|out, marios Tengu|rover :) huff.. finally found out what's the issue with c8 train internal job: "nameserver 127.0.0.1" entry in /etc/resolv.conf is causing the issue. Took me a while to figure that out because before the job finishes we were correcting the dns entry, and when I logged in on the reproducer node the dns entry was correct and I couldn't reproduce the issue. | 09:10 |
ysandeep | rlandy_, marios Tengu|rover https://privatebin.corp.redhat.com/?655ab6f811d57f58#ECzq7cxaopuRvi1w9CLWeR1jJ14Fpts24j9PnfBg6RJ6 | 09:11 |
Tengu|rover | ysandeep: linked to networkmanager? | 09:11 |
ysandeep | Tengu|rover, https://privatebin.corp.redhat.com/?655ab6f811d57f58#ECzq7cxaopuRvi1w9CLWeR1jJ14Fpts24j9PnfBg6RJ6 , with "nameserver 127.0.0.1" as first entry in resolv.conf -> node was not able to resolve internal links | 09:12 |
Tengu|rover | ysandeep: care to check the content of /etc/NetworkManager/NetworkManager.conf and ensure there's a "dns=none" in the [main] section? | 09:13 |
Tengu|rover | ysandeep: I can work with you on that matter, if needed - I nudged things in upstream infra repo for that already. | 09:13 |
ysandeep | Tengu|rover, http://pastebin.test.redhat.com/1089214 | 09:14 |
ysandeep | dns=none not present | 09:14 |
Tengu|rover | ok - so it's missing the setting :) | 09:14 |
marios | thnks for digging ysandeep - cant quickly find where that comes from (127) just looked a bit with codesearch | 09:14 |
Tengu|rover | -> need to inifile it, and reload the NetworkManager.service systemd unit | 09:15 |
marios | probably default and not something we set | 09:15 |
ysandeep | yes, looks like coming from default c8 node.. | 09:15 |
ysandeep | Tengu|rover, trying your suggestion locally | 09:16 |
marios | ysandeep: this isnt ipa job but noting that the ipa job has explicit task to remove that there https://opendev.org/openstack/tripleo-quickstart-extras/src/commit/c13eb508987b853c93bff024c54402ee605aef09/roles/ipa-multinode/tasks/ipaserver-undercloud-setup.yml#L156 | 09:16 |
Tengu|rover | marios: sounds sooo wrong actually. | 09:16 |
Tengu|rover | that should be done via the correct networkmanager setting. | 09:17 |
Tengu|rover | else, it will override the /etc/resolv.conf at any time. | 09:17 |
Tengu|rover | marios: worth pushing a change request against that task file in order to ensure NM is correctly configured? | 09:19 |
Tengu|rover | marios, ysandeep https://opendev.org/openstack/tripleo-quickstart-extras/src/branch/master/playbooks/baremetal-full-freeipa.yml#L85-L97 | 09:20 |
Tengu|rover | ysandeep: you want the same -^^ | 09:20 |
ysandeep | Tengu|rover, didn't help, could you please check if I missed something: http://pastebin.test.redhat.com/1089216 | 09:22 |
Tengu|rover | ysandeep: hmm. So, you want to: ensure NM is properly configured, and reload it. and then only you'll be able to publish the /etc/resolv.conf | 09:23 |
Tengu|rover | ysandeep: pretty sure the resolv.conf is edited before NM is configured/reloaded, and that the resolv.conf is edited in an "append" way. Since I don't know what you're actually running, I can't say for sure. Do you happen to have a playbook and related things? | 09:24 |
Tengu|rover | ysandeep: we can even jump in a meet if you want. | 09:25 |
* pojadhav afk for ~1hr | 09:27 | |
ysandeep | Tengu|rover, sure.. lets meet on gmeet | 09:27 |
ysandeep | meet.google.com/rix-wdxh-vuk | 09:27 |
Tengu|rover | wallaby on cs9 will promote shortly! | 09:29 |
Tengu|rover | marios: -^ | 09:29 |
marios | nice | 09:29 |
Tengu|rover | and we should get a clean resolution for that resolver issue pointed by ysandeep. | 09:36 |
Tengu|rover | marios: ah, quick question related to the dashboard: the second square, named "RDO promotion", shows a large gap - last promotions being months ago, while last builds are usually far closer to our current time. Is this normal? iirc you told me something about it, but I don't remember. | 09:48 |
Tengu|rover | (talking about http://dashboard-ci.tripleo.org/d/mhV51gdVk/upstream-and-rdo-promotions?orgId=1 - sorry) | 09:48 |
marios | no longer need to worry about current-tripleo-rdo so can ignore | 09:52 |
marios | Tengu|rover: ^ | 09:52 |
Tengu|rover | marios: ok! is this something to check whenever a new release is cut or something? | 09:52 |
Tengu|rover | or is it really just a dead topic | 09:53 |
marios | no used to be part of the prod chain but not for a while now (used to go current-tripleo then current-tripleo-rdo ) | 09:53 |
Tengu|rover | ok | 09:53 |
Tengu|rover | woot! promoted! | 10:04 |
Tengu|rover | all stable + master are fresh from either yesterday or today! | 10:04 |
Tengu|rover | (though train doesn't have new content for some days, now, so... it's "old", though up-to-date) | 10:05 |
Tengu|rover | lovely | 10:05 |
marios | :) | 10:05 |
marios | stop.saying.that. | 10:05 |
marios | o_O | 10:05 |
marios | :D | 10:05 |
Tengu|rover | :] | 10:05 |
Tengu|rover | sorry for being enthusiastic about that silly thing called promotion ;) | 10:06 |
marios | i am joking of course like don't jinx it ;) | 10:06 |
Tengu|rover | oh, well, it will eventually blow anyway, jinxed or not ;) | 10:06 |
marios | forgive him oh zuul, he knows not what he says! | 10:07 |
ysandeep | Tengu|rover, marios: Alternative solution - we are using dib to build that c8 image, in c8 we use unbound local resolver in upstream as well.. we were missing correct forwarders in unbound - configuring correct downstream nameservers as forwarders solved the issue as well: http://pastebin.test.redhat.com/1089222 | 10:07 |
Tengu|rover | ysandeep: soooo. it's a bit more complicated, but yeah | 10:08 |
marios | ysandeep: thanks what was the primary (network manager?) | 10:08 |
Tengu|rover | ysandeep: in case you want to actually use the unbound service, you'll still need to ensure rc-manager=unmanaged is present in the NM config + service reload. | 10:08 |
Tengu|rover | ysandeep: else, NM may override the whole file in a way you don't want. | 10:09 |
marios | ysandeep: really my question is is there a reason we want to do the alternative did the primary fix not work/other blocker to do it? | 10:09 |
Tengu|rover | ysandeep: so, "whatever", but you want to ensure NM isn't editing the /etc/resolv.conf under any circumstances. | 10:09 |
ysandeep | As this is c8, I think we should keep using unbound with correct forwarders to be in sync with upstream jobs. | 10:09 |
Tengu|rover | marios: now that ysandeep said it, I remember my whole work on the upstream infra was to actually ensure we were using unbound without any of the NM interference.... | 10:10 |
Tengu|rover | ysandeep: yeah, that's probably the best approach. so pushing the correct forwarder, while ensuring NM doesn't touch the /etc/resolv.conf - this means reloading both services | 10:11 |
Tengu|rover | but that's for the better. | 10:11 |
ysandeep | marios, iirc.. for c8 - unbound is the default local resolver and that's why we have "nameserver 127.0.0.1" entry in the first place | 10:11 |
Tengu|rover | and... yeah, this explains why it can't resolve, actually. If unbound was down, it would fall back on the second or third nameserver set in the config. | 10:12 |
Tengu|rover | but since unbound is running, and answers, it will throw some NXDOMAIN which is a valid answer, thus... crash | 10:12 |
Tengu|rover | ysandeep++ for the digging! | 10:12 |
Tengu|rover | and dumb me for not remembering that very same topic for the upstream - it was during the end of last year. | 10:13 |
marios | ysandeep: Tengu|rover: thanks i | 10:14 |
ysandeep | Tengu|rover, so I think these 3 things 1) Configure correct forwarders in unbound for downstream case + unbound reload 2) rc-manager=unmanaged is present in the NM config 3) NM service reload | 10:17 |
* ysandeep checking in which pre we can include above ^^ | 10:18 | |
Tengu|rover | ysandeep: yep, that sounds like the right plan | 10:18 |
ysandeep | Tengu|rover, marios thanks! | 10:18 |
Tengu|rover | ysandeep: lemme know when you have reviews up | 10:28 |
reviewbot | Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. | 10:28 |
Tengu|rover | hmm we may have an issue with master. There are 2 issues with Tempest, one for periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master and the other for periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master. Not really sure what to look for. One is related to network settings, the other one to cinder (snapshot). It seems | 11:05 |
Tengu|rover | to loop on those 2. | 11:05 |
Tengu|rover | read timeout. Maybe infra is overloaded? | 11:05 |
Tengu|rover | #lunch | 11:07 |
*** rlandy|out is now known as rlandy | 11:09 | |
rlandy | ysandeep: ok to w+ https://review.opendev.org/c/openstack/tripleo-heat-templates/+/871493? | 11:13 |
rlandy | marios: bhagyashris|ruck: hi not sure if you saw ping slack | 11:13 |
ysandeep | rlandy, yes | 11:13 |
rlandy | done | 11:14 |
ysandeep | thanks | 11:14 |
rlandy | frenzy_friday: is there a way to test https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/438919 - if not, pls comment | 11:14 |
rlandy | and then we can merge and try it out | 11:15 |
rlandy | if you are around to watch it and revert | 11:15 |
frenzy_friday | rlandy, commented. Yep, I think we can merge and I'll revert if stuff breaks | 11:16 |
rlandy | frenzy_friday: ok - I am going to workflow - pls keep your eyes on the board and let me know before your EoD if we are ok | 11:17 |
frenzy_friday | rlandy, cool, thanks | 11:17 |
*** dviroel|out is now known as dviroel | 11:18 | |
rlandy | done | 11:18 |
rlandy | bhagyashris|ruck: ^^ fyi if you are watching the downstream dashboard | 11:19 |
*** ysandeep is now known as ysandeep|afk | 11:22 | |
dpawlik | marios: o/ | 11:24 |
dpawlik | https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-containers-multinode-master/b6b75d6/job-output.txt | 11:24 |
dpawlik | I check logs on quay | 11:24 |
dpawlik | that contains that two ip address: 38.102.83.94 and 38.102.83.39 | 11:25 |
dpawlik | and there is nothing related to those ips | 11:26 |
dpawlik | marios: could you recheck some job to get new logs? | 11:28 |
marios | dpawlik: the issue is resolved now so we won't get the issue reproduced any more with recheck ... but you can check newer run there for example (2 top results are from last periodic runs/green) | 11:35 |
marios | rlandy: which one there were a couple 13:13 < rlandy> marios: bhagyashris|ruck: hi not sure if you saw ping slack | 11:35 |
marios | rlandy: replied in slack... :) | 11:36 |
marios | dpawlik: there https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-containers-multinode-master&skip=0 | 11:37 |
rlandy | marios: bhagyashris|ruck: can we meet for 5 re: 17.1? | 11:38 |
bhagyashris|ruck | rlandy, sure | 11:38 |
marios | rlandy: k | 11:38 |
rlandy | https://meet.google.com/uvk-qdgj-uro?pli=1&authuser=0 | 11:39 |
*** ysandeep|afk is now known as ysandeep | 12:11 | |
ysandeep | Tengu|rover, https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/438957 | 12:26 |
Tengu|rover | ysandeep: checking | 12:27 |
ysandeep | I have added a condition to only add 'rc-manager=unmanaged' for c8 case only to limit the breakage scope of this patch as this is config patch and we can't speculatively test | 12:28 |
Tengu|rover | ysandeep: reviewed. It's... well. I'd just configure NM in any cases, especially seeing the other tasks present in this file. | 12:34 |
ysandeep | Tengu|rover, ++ thanks for the review, I will update/comment after mtgs.. | 12:54 |
ysandeep | arxcruz++ great demo \o/ | 12:56 |
arxcruz | \o/ | 12:56 |
*** amoralej is now known as amoralej|off | 13:11 | |
*** amoralej|off is now known as amoralej|lunch | 13:11 | |
Tengu|rover | can anyone help me on debugging a couple of tempest issue? :) | 13:13 |
rlandy | ysandeep: thanks for debugging the train ovb failures | 13:16 |
ysandeep | rlandy, :) happy to help, I will fix the review comment from Tengu|rover and send it back. | 13:16 |
ysandeep | Tengu|rover, tempest failures - are those consistent? just a heads-up that sometimes infra acts up and we see random tempest failures. | 13:19 |
Tengu|rover | ysandeep: I think they are consistent, yes. the 2 same jobs are being constantly failing as far as I can see. | 13:19 |
Tengu|rover | http://dashboard-ci.tripleo.org/d/mhV51gdVk/upstream-and-rdo-promotions?orgId=1&viewPanel=14 | 13:20 |
ysandeep | dashboard is very slow here.. what is the job name? | 13:21 |
Tengu|rover | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master&pipeline=openstack-periodic-integration-main&skip=0 | 13:21 |
Tengu|rover | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master&pipeline=openstack-periodic-integration-main&skip=0 | 13:21 |
Tengu|rover | hmm. | 13:21 |
Tengu|rover | it shows something else here than the grafana | 13:22 |
Tengu|rover | fun..... grafana shows more runs o_O | 13:22 |
Tengu|rover | maybe I should just rekick both jobs? | 13:24 |
Tengu|rover | hmm. yeah... let's see. | 13:25 |
ysandeep | Tengu|rover, failing tests are different on both jobs.. and as per build history they failed on other issues earlier | 13:27 |
ysandeep | so yeah lets recheck | 13:27 |
Tengu|rover | nudged. | 13:28 |
Tengu|rover | if we can get a master promotion today, that would be 2 in a row :). | 13:28 |
ysandeep | Tengu|rover, what kind of templating you have in mind for https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/438957/1/playbooks/tripleo-rdo-base/configure-nameserver.yaml#22 | 13:30 |
ysandeep | * we first find any IP address, replace with internal nameservers? | 13:31 |
Tengu|rover | ysandeep: I don't know the actual file format expected by unbound, but... basically, something that ensure file consistency, being an actual ansible.builtin.template, or an ansible.builtin.copy with content: | full file content | 13:31 |
Tengu|rover | at least, something ensure a consistency even if the nameservers are changed at some point. | 13:31 |
ysandeep | forwarding.conf don't have many entries - https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-train/c47a58f/logs/undercloud/etc/unbound/forwarding.conf , so copy/template sound good. | 13:33 |
Tengu|rover | what I thought. yeah. | 13:35 |
Tengu|rover | so that we override the whole file, and we're sure we don't have any weird content in there. | 13:35 |
Tengu|rover | ysandeep: maybe beware of the IPv6 resolvers? | 13:35 |
Tengu|rover | not sure if Red Hat provides any... ? | 13:35 |
ysandeep | may be we should just use ipv4 resolver for downstream case | 13:36 |
Tengu|rover | ah. hmm. VPN seems to provide an IPv6, meaning there should be some v6 resolver. | 13:36 |
Tengu|rover | ysandeep: yeah, we can start with pure v4 resolvers first. | 13:36 |
Tengu|rover | and if need be, we may dig further into that. | 13:36 |
ysandeep | Tengu|rover, ack for pure ipv4 resolver first.. left a link for you in internal channel about dns info I found | 13:40 |
*** ysandeep is now known as ysandeep|afk | 13:43 | |
*** dasm|off is now known as dasm | 13:58 | |
dasm | o/ | 13:58 |
*** amoralej|lunch is now known as amoralej | 14:05 | |
bhagyashris|ruck | rlandy, fs001 passed rhos17.1 https://code.engineering.redhat.com/gerrit/c/testproject/+/438532/6#message-a46a1bebeb0376240d5a1bd2ff86480acd794074 | 14:14 |
rlandy | Tengu|rover: sorry - still need help? | 14:15 |
Tengu|rover | rlandy: I think I'm good for now | 14:15 |
rlandy | k | 14:15 |
Tengu|rover | just need to get back to the 9.2 testproject you created | 14:15 |
* Tengu|rover all over the place | 14:16 | |
bhagyashris|ruck | we can merge skip patch | 14:16 |
rlandy | Tengu|rover: yeah - wanted to give that some focus in my afternoon | 14:21 |
rlandy | and get those review in | 14:21 |
rlandy | you are busy with rr | 14:21 |
Tengu|rover | rlandy: apparently we need to nudge the qcow2 image link for 9.2 | 14:21 |
rlandy | Tengu|rover: ok - let's touch base before you are EoD | 14:22 |
Tengu|rover | rlandy: nudged qcow2 in both patches (tripleo-environment + tripleo-ci-internal-jobs) | 14:26 |
Tengu|rover | we therefore should be able to re-kick your testproject. | 14:26 |
* bhagyashris|ruck leaving for the day | 14:48 | |
pojadhav | Community Call in 5 mins : arxcruz, rlandy, marios, ysandeep, bhagyashris|ruck , svyas, soniya29, pojadhav, akahat, chandankumar, frenzy_friday, anbanerj, dviroel, dasm, Tengu, jgilaber | 14:55 |
pojadhav | https://meet.google.com/igc-nxwj-gws?authuser=0 | 14:55 |
Tengu|rover | already in :) | 14:56 |
pojadhav | https://hackmd.io/iraYQWGBT4qPCKH0VNG31A#2023-01-24-Community-Call | 14:56 |
pojadhav | folks, please add agenda if any | 14:56 |
*** dviroel is now known as dviroel|lunch | 15:19 | |
* pojadhav afk | 15:22 | |
*** ysandeep|afk is now known as ysandeep|out | 15:24 | |
* ysandeep|out out, see everyone tomorrow o/ | 15:24 | |
dasm | ysandeep|out: o/ | 15:25 |
*** dviroel|lunch is now known as dviroel | 16:30 | |
*** dviroel is now known as dviroel|doc_appt | 16:43 | |
*** amoralej is now known as amoralej|off | 16:53 | |
Tengu|rover | OK Folks - going offline. See you tomorrow! | 16:57 |
*** jpena is now known as jpena|off | 17:21 | |
dasm | Tengu|rover: o/ | 17:25 |
*** dviroel|doc_appt is now known as dviroel | 19:13 | |
frenzy_friday | rlandy, dasm looks like the cockpit patch didnt mess anything up : http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/MmX0tFSVk/osp-component-ci?orgId=1 - We still have data | 21:37 |
frenzy_friday | We dont need to revert it I think | 21:38 |
rlandy | frenzy_friday++ nice - thanks for checking | 21:38 |
rlandy | frenzy_friday: it's late for you ... tomorrow pls see slack - requesting your review on 9.2 patches on review list | 21:38 |
dasm | frenzy_friday++ thanks for checking that. | 21:39 |
dasm | frenzy_friday: i would be surprised if it would affect other views. But you never know with software :D | 21:39 |
*** dviroel is now known as dviroel|out | 22:44 | |
*** rlandy is now known as rlandy|out | 23:01 | |
* dasm => offline | 23:09 | |
*** dasm is now known as dasm|off | 23:09 |
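
Note on the resolver plan agreed around 10:17 (correct forwarders in unbound, rc-manager=unmanaged in the NM config, reload both services): the two file changes can be sketched roughly as below. This is only an illustration in Python, not the content of the actual review; the function names are made up, the forwarder addresses are documentation placeholders (192.0.2.x), and the real change would more likely be an Ansible copy/template task as discussed at 13:31. The forwarding file is rendered whole, so stale entries cannot survive a change of nameservers.

```python
import configparser
import io


def render_forwarding_conf(forwarders):
    """Return the complete content of an unbound forward-zone file.

    The whole file is generated each time (never appended to), so it stays
    consistent even if the list of nameservers changes later.
    """
    lines = ["forward-zone:", '  name: "."']
    lines += [f"  forward-addr: {addr}" for addr in forwarders]
    return "\n".join(lines) + "\n"


def ensure_rc_manager_unmanaged(nm_conf_text):
    """Return NetworkManager.conf text with rc-manager=unmanaged in [main].

    Assumption: the file is simple enough for configparser to parse.
    Comments in the original file are not preserved by this sketch.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(nm_conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    parser.set("main", "rc-manager", "unmanaged")
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # Placeholder IPv4 forwarders; the real downstream nameservers are not
    # given in the log.
    print(render_forwarding_conf(["192.0.2.1", "192.0.2.2"]))
    print(ensure_rc_manager_unmanaged("[main]\nplugins=ifcfg-rh\n"))
```

After writing both files, unbound and NetworkManager still need to be reloaded, as noted in the discussion; that step is left out of the sketch on purpose.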