ianw | it's rather nail-biting having 25 system-config jobs in gate | 00:29 |
---|---|---|
opendevreview | Merged opendev/system-config master: Base work for exporting encrypted logs https://review.opendev.org/c/opendev/system-config/+/828810 | 00:29 |
opendevreview | Merged opendev/system-config master: run-production-playbook: return encrypted logs https://review.opendev.org/c/opendev/system-config/+/829147 | 00:29 |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-playbook: default false when encrypting logs https://review.opendev.org/c/opendev/system-config/+/830104 | 00:45 |
NeilHanlon | i'm not particularly opinionated on where it's fixed as i mentioned in the comments for the swap change for dib. I think ultimately either is fine and if something happens down the line we can always fix it then | 01:11 |
opendevreview | Merged opendev/system-config master: run-production-playbook: default false when encrypting logs https://review.opendev.org/c/opendev/system-config/+/830104 | 01:54 |
opendevreview | Ian Wienand proposed opendev/system-config master: infra-prod: bump codesearch playbook https://review.opendev.org/c/opendev/system-config/+/830108 | 01:58 |
ianw | https://zuul.opendev.org/t/openstack/build/fe22f777ef24414b8799e3e33a62c4ae has run post ^^ so we're back to working on the regular path | 01:59 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: bootloader: fix arm64 install path https://review.opendev.org/c/openstack/diskimage-builder/+/830111 | 02:52 |
opendevreview | Merged opendev/system-config master: infra-prod: bump codesearch playbook https://review.opendev.org/c/opendev/system-config/+/830108 | 03:07 |
*** pojadhav is now known as pojadhav|ruck | 03:32 | |
ianw | hrm, i guess ^ has fallen behind the periodic jobs, so it may be a while before it runs. hopefully i can check on it | 03:35 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: encrypt-file: become when installing packages https://review.opendev.org/c/zuul/zuul-jobs/+/830112 | 03:51 |
*** frenzy_friday is now known as frenzyfriday|ruck | 03:59 | |
*** frenzyfriday|ruck is now known as frenzyfriday|rover | 04:00 | |
opendevreview | Merged zuul/zuul-jobs master: encrypt-file: become when installing packages https://review.opendev.org/c/zuul/zuul-jobs/+/830112 | 04:08 |
*** ysandeep|out is now known as ysandeep | 04:38 | |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Update fedora element testing to F35 https://review.opendev.org/c/openstack/diskimage-builder/+/830113 | 04:40 |
*** prometheanfire is now known as Guest2 | 04:59 | |
*** Guest2 is now known as prometheanfire | 05:09 | |
ianw | clarkb / fungi : not really going to argue much over the rocky fix, if you still prefer the dib side feel free to merge. if we do though, i think we've got two more for probably a 3.18.1 release | 05:23 |
ianw | https://review.opendev.org/c/openstack/diskimage-builder/+/829978 as gentoo is currently failing | 05:23 |
ianw | and https://review.opendev.org/c/openstack/diskimage-builder/+/830111 (or something like it) to fix a regression in arm64 bootloader path i should have realised when we cleaned that up | 05:24 |
ianw | i'm not 100% sure of the status of stevebaker's changes | 05:26 |
*** amoralej|off is now known as amoralej | 07:01 | |
*** ysandeep is now known as ysandeep|afk | 07:07 | |
opendevreview | Ian Wienand proposed opendev/system-config master: infra-prod: bump codesearch playbook (again) https://review.opendev.org/c/opendev/system-config/+/830120 | 08:21 |
*** jpena|off is now known as jpena | 08:36 | |
*** ysandeep|afk is now known as ysandeep | 08:47 | |
*** sshnaidm|afk is now known as sshnaidm | 08:55 | |
opendevreview | Merged opendev/system-config master: infra-prod: bump codesearch playbook (again) https://review.opendev.org/c/opendev/system-config/+/830120 | 09:16 |
dpawlik | fungi, Clark[m]: hey, could you tell me in which region you spawned the logscraper instance? I would like to have that information in my notes | 09:31 |
ianw | i found it, it's vexxhost ca-ymq-1. we just wanted to hard reboot it | 09:39 |
ianw | it looks like permissions errors with the gpg signing. have to think about that one, but i think it can wait | 09:39 |
dpawlik | thanks ianw++ | 09:43 |
*** rlandy|out is now known as rlandy|ruck | 11:13 | |
*** dviroel_ is now known as dviroel | 11:16 | |
*** ysandeep is now known as ysandeep|afk | 11:57 | |
*** rcastillo|rover is now known as rcastillo | 12:35 | |
*** arxcruz|ruck is now known as arxcruz | 12:38 | |
*** ysandeep|afk is now known as ysandeep | 12:48 | |
*** amoralej is now known as amoralej|lunch | 13:09 | |
fungi | ianw: gpg signing of what? | 13:15 |
fungi | clarkb: this is worth keeping an eye on from a "where's setuptools going next" perspective: https://discuss.python.org/t/13821 | 13:48 |
*** amoralej|lunch is now known as amoralek | 13:59 | |
fungi | infra-root: if the gitea links in gerrit check out for you (they're looking fine to me so far) we should be able to proceed with https://review.opendev.org/829975 to block public access to the gitiles plugin | 14:01 |
opendevreview | Pierre Riteau proposed opendev/irc-meetings master: Remove inactive IRC chairs https://review.opendev.org/c/opendev/irc-meetings/+/830177 | 14:15 |
amusil | gtema: Hi, is there any plan about when openstacksdk 0.62.0 will be released? | 14:25 |
gtema | next planned is 1.0, but that will take a little bit more time | 14:28 |
amusil | Ok, thanks | 14:30 |
*** pojadhav|ruck is now known as pojadhav|dinner | 14:51 | |
dpawlik | fungi, Clark[m]: it took some time today to confirm that the logscraper workflow is ok and the system is also ok (after an upgrade), but logstash seems to be "frozen". Some time ago I had the same issue with the logstash service. That prompts me to wonder whether the current log workflow is right and whether some services can be dropped, if that is possible. | 14:56 |
*** dviroel is now known as dviroel|lunch | 15:26 | |
*** ysandeep is now known as ysandeep|out | 15:31 | |
*** pojadhav|dinner is now known as pojadhav|ruck | 15:44 | |
clarkb | dpawlik: I'm not sure I understand what you mean by reduce some services. Do you mean index less data? | 16:00 |
*** ykarel is now known as ykarel|away | 16:03 | |
dpawlik | clarkb: since the logstash service has frozen once again, maybe the whole log workflow can be improved: https://softwarefactory-project.io/etherpad/p/gearman-replacement | 16:05 |
dpawlik | clarkb: if you are checking that etherpad, feel free to comment | 16:10 |
dpawlik | clarkb: I will write an email today/tomorrow | 16:10 |
clarkb | dpawlik: thanks, left a couple thoughts for things I noticed. | 16:17 |
dpawlik | clarkb: yep, I saw. Thanks for commenting. I will write an email and send it to the mailing list tomorrow | 16:17 |
dpawlik | if nobody is against that, I will start working on the improvements | 16:18 |
dpawlik | clarkb: TBH I would leave it as it is, but logstash has frozen a second time, and if nobody is able to restart it or check what's going on, that becomes a problem | 16:19 |
clarkb | dpawlik: yes, problems like this are why we haven't been able to maintain and operate the service let alone upgrade it. It needs a lot of care | 16:19 |
dpawlik | that's why I was thinking with tristanC about improving the log workflow | 16:20 |
clarkb | because the volume of logs is non trivial | 16:20 |
dpawlik | yup, and there will be a lot of them. I'm hoping that it will be just "download" and in a few minutes it will be processed by another service that sends it to elasticsearch, and then the log should be deleted | 16:21 |
clarkb | fungi: I think what I'm reading re setuptools is that more and more of what made PBR useful is getting consumed by upstream. Seems like most of what PBR would be doing for us is versioning? I half wonder if a good approach here is to replace PBR with a git versioning specific plugin and then rely on setuptools-scm and setuptools and pepwhateveritis to get the tools installed via the toml spec | 16:21 |
fungi | specifically what pbr is doing for us is pep-440 compliant semver versioning based on git tags, recording of git commit info in custom package metadata, and generation of changelog and authors files based on git history | 16:23 |
*** dviroel|lunch is now known as dviroel | 16:26 | |
clarkb | fungi: one of the things I don't quite understand with modern setuptools is how you are expected to hook into it these days. But ya it seems like hooking into setuptools for that subset of functionality should be doable then we don't have to worry about the other bits | 16:26 |
fungi | for the versioning, pbr is solving a couple of problems i haven't seen setuptools-scm cover: scalar dev versions, and determination of upcoming major/minor/patch level increment | 16:26 |
clarkb | -scm also doesn't handle the git hashes safely iirc. Maybe that is what you meant by scalar dev versions | 16:27 |
fungi | that, yes | 16:29 |
fungi | setuptools-scm creates non-pep440-compliant dev versions | 16:29 |
clarkb | frickler: did you have a chance to test the mergeability update on our held gerrit yet? I'm happy to help with that if I can just let me know | 16:30 |
fungi | pbr solves it by putting the commit id in separate metadata and recording commit count in the dev version string, relying on clues in commit messages within the commit history to disambiguate upcoming versions across concurrent branches | 16:31 |
fungi | that comes in really handy for projects like openstack which release from multiple branches | 16:31 |
clarkb | fungi: ianw NeilHanlon I have approved https://review.opendev.org/c/openstack/project-config/+/830101 to address the coreutils rocky problem outside of dib. Once that lands we can unpause image builds and see what happens next :) | 16:32 |
fungi | sgtm, thx! | 16:32 |
clarkb | infra-root https://review.opendev.org/q/topic:retirement+status:open that large set of changes is the next step for OpenDev project retirements. Once those land I'll push up a change to remove them from zuul and we can abandon their open changes | 16:33 |
clarkb | fungi: it's also useful when you are on a single branch as it makes clear that things aren't sortable as expected if you are testing two bugfixes to a single branch without stacking them | 16:34 |
clarkb | fungi: re your question to ianw about gpg signing I think this is the error and context https://zuul.opendev.org/t/openstack/build/9d1da184e98947a4b11e989cbe3605c9 | 16:36 |
fungi | aha, thanks | 16:39 |
opendevreview | Merged openstack/project-config master: infra-package-needs: don't require coreutils for Rocky Linux 8 https://review.opendev.org/c/openstack/project-config/+/830101 | 16:42 |
*** amoralek is now known as amoralej|off | 16:43 | |
fungi | clarkb: ianw: i think that's a red herring. "gpg: can't create '/var/log/ansible/service-codesearch.yaml.log.gpg': Permission denied" seems more likely to indicate that the /var/log/ansible/ path isn't writeable by the zuul user (try adding become:true?) | 16:44 |
clarkb | that makes sense | 16:44 |
clarkb | or stage and then copy with perms | 16:44 |
fungi | the logs we write there on production bridge.o.o are created by root | 16:45 |
clarkb | fungi: ya and I want to say the dir is 775 and zuul isn't in the root group | 16:48 |
clarkb | the testing updated it to 755 too maybe even | 16:49 |
*** jpena is now known as jpena|off | 17:43 | |
frickler | clarkb: didn't get to it today, feel free to take over if you have time | 17:44 |
clarkb | ok will see | 17:48 |
*** pojadhav|ruck is now known as pojadhav|out | 17:49 | |
opendevreview | Merged opendev/irc-meetings master: Remove inactive IRC chairs https://review.opendev.org/c/opendev/irc-meetings/+/830177 | 17:51 |
NeilHanlon | clarkb: ack, thank you! | 17:53 |
clarkb | the updated element made it onto the builders and I've run `nodepool image-unpause rockylinux-8` | 18:19 |
clarkb | nb02 is building the image now so we should be able to follow that log and see if there are more things to look at | 18:20 |
clarkb | + dnf -y install --enablerepo=epel haveged then Error: Unknown repo: 'epel' | 18:54 |
clarkb | I'll run the pause against that image next | 18:54 |
fungi | oof | 18:54 |
fungi | thanks | 18:55 |
fungi | NeilHanlon: ^ next iteration | 18:55 |
clarkb | and pause is failing on a 503 from vexxhost trying to get their client profile. I'll try again in a few minutes in case this is not persistent | 18:55 |
opendevreview | Merged openstack/diskimage-builder master: dhcp-all-interfaces: opt let NetworkManager doit. https://review.opendev.org/c/openstack/diskimage-builder/+/825983 | 18:55 |
clarkb | yup rerunning a minute later seems to have worked | 18:56 |
NeilHanlon | clarkb, fungi: i have some feeling that is due to something jrosser was doing the other day with nodepool and epel | 19:15 |
clarkb | NeilHanlon: the root of the problem is we install haveged on the VMs to increase entropy on the VMs. To do that on other rhel likes we have to install it from epel. I wonder if we just need to add epel as a repo source to the images | 19:16 |
* jrosser hopes there is not another repo to have to opt out of…. | 19:17 | |
clarkb | jrosser: I don't think so. This is our rocky image builds failing because it cannot install haveged from epel | 19:19 |
clarkb | I think we either need to find somewhere to install haveged from or don't install it and accept less entropy on rocky | 19:19 |
clarkb | I believe starting with 9 we expect haveged to not be necessary due to kernel changes? | 19:19 |
NeilHanlon | I think jrosser made a change to not auto-enable epel; but i may be misremembering | 19:22 |
NeilHanlon | all that needs doing is installing epel-release first (or otherwise enabling it) which is available from the extras repo on the image | 19:23 |
clarkb | I suspect that the centos elements may be preinstalling epel for us | 19:23 |
clarkb | and that is why it has caught us out here | 19:23 |
clarkb | ah nope I see it, one sec | 19:24 |
jrosser | so long as whatever happens matches what we find in the wild with a bare metal host, I’m happy :) | 19:26 |
clarkb | jrosser: I think that ship has already sailed on coreutils | 19:26 |
clarkb | but this would be setup to mimic what centos is doing | 19:26 |
opendevreview | Merged openstack/diskimage-builder master: update gpg / file verification for Gentoo https://review.opendev.org/c/openstack/diskimage-builder/+/829978 | 19:33 |
opendevreview | Merged openstack/diskimage-builder master: Make growvols config path platform independent https://review.opendev.org/c/openstack/diskimage-builder/+/827557 | 19:33 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: Add rocky support to the epel element https://review.opendev.org/c/openstack/diskimage-builder/+/830278 | 19:33 |
clarkb | I think ^ is needed to make the epel role do what we want? | 19:35 |
clarkb | the role is included it just doesn't know how to handle rocky yet | 19:35 |
clarkb | https://paste.opendev.org/show/bo4vZh09V1WdtKypeOSL/ is the relevant bit of the build log and I think 830278 should address that by ending up in the other block in the if else | 19:37 |
NeilHanlon | that seems sane to me clarkb | 19:51 |
clarkb | thank you for looking | 19:51 |
ianw | thanks for looking in on the rocky bits, i'll double check but i think the epel element is the way to go for our images | 20:46 |
ianw | yes the gpg signing was referring to the codesearch deploy job | 20:46 |
ianw | it's probably better to open the permissions on the log dir more than run as root, i'm thinking? | 20:47 |
tristanC | is https://opendev.org unreachable? | 20:47 |
ianw | tristanC: hrm, yes it could be ... i got an error | 20:48 |
clarkb | yes it's been discussed in #openstack-infra | 20:48 |
clarkb | the hypervisor hosting the instance is apparently dead dead and is being recovered now | 20:48 |
clarkb | I've mentioned to mnaser that we can rebuild the host if necessary and to give us an indication if we should do that (the assumption being recovering the instance on another hv will be quicker right now though) | 20:48 |
mnaser | clarkb: its back now | 20:49 |
ianw | ++ can confirm from here :) | 20:50 |
ianw | well that was a 1m 30s of excitement :) | 20:50 |
clarkb | yup confirmed and I see in the haproxy log that gitea01 was marked up | 20:50 |
clarkb | it was the other server I noticed that may have had trouble. We may need to resync gitea01 from gerrit though | 20:50 |
mnaser | fwiw, gitea-lb01 is a spof so i guess everything will go down if it does | 20:51 |
clarkb | ya confirmed gitea01 uptime is short | 20:51 |
mnaser | i can confirm both gitea01 and gitea-lb01 were moved off | 20:51 |
clarkb | mnaser: yup, iirc because the only way to run haproxy with neutron and openstack is via octavia? | 20:51 |
clarkb | we're happy to run an lb pair ourselves but iirc the networking in openstack doesn't make this very viable. But maybe there are layer 7 workarounds we could make use of | 20:52 |
mnaser | clarkb: i've actually been toying with some potential ideas without octavia but yes, that's the straight forward one | 20:52 |
clarkb | our previous experiences with some of the managed services have made us cautious of going that route | 20:52 |
clarkb | but ya I think octavia is available now in vexxhost? | 20:52 |
mnaser | yep it's been for a while :) | 20:53 |
clarkb | infra-root I have manually disabled gitea01 now. I'm going to go eat lunch, but the next thing there is likely to tell gerrit replication to replicate to gitea01 | 20:54 |
clarkb | I can do that after lunch if no one beats me to it | 20:54 |
ianw | i can get gerrit started on that | 20:55 |
clarkb | ianw: I think it is the url flag that allows you to do something like gerrit replication start --url gitea01 and it will only replicate to gitea01 | 20:55 |
clarkb | ianw: the show queue command will also show you all of those inflight and queued tasks once requested | 20:55 |
ianw | i ran replication start --url 'ssh://git@gitea01.opendev.org' | 20:58 |
fungi | that sounds right | 20:58 |
ianw | 2160 queue jobs for gitea01 which seems about right | 20:58 |
fungi | it'll probably match whatever the entry is in our replication plugin config | 20:58 |
ianw | currently the job is doing "gpg2 --encrypt --output /var/log/ansible/service-codesearch.yaml.log.gpg --recipient=0x9615aec8 /var/log/ansible/service-codesearch.yaml.log" | 21:01 |
ianw | we can either open the permissions on /var/log/ansible so zuul can write there, or update the role so that it can output to a different path | 21:01 |
ianw | or maybe just copy it and do it all in a tmpdir | 21:05 |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-playbook: encrypt logs in temporary staging directory https://review.opendev.org/c/opendev/system-config/+/830288 | 21:19 |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-playbook: encrypt logs in temporary staging directory https://review.opendev.org/c/opendev/system-config/+/830288 | 21:25 |
fungi | ianw: yes, any of those seems like a fine solution to me | 21:27 |
clarkb | gerrit show queue looks empty now. Should I reenable gitea01? | 21:41 |
clarkb | ianw: doesn't look like the rocky functests are run by the change I pushed to update the epel element: https://zuul.opendev.org/t/openstack/build/0959c5b20047453586ed9f8959db556b/logs ? | 21:42 |
clarkb | is there another way to test that? | 21:42 |
clarkb | oh I see I can just update the change to run them in functests I think | 21:45 |
clarkb | then remove if people don't want them permanently after we've seen them be happy | 21:45 |
fungi | or push a follow-on addition | 21:45 |
fungi | which would run the additional testing optionally without blocking merge of the fix | 21:45 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: Add rocky support to the epel element https://review.opendev.org/c/openstack/diskimage-builder/+/830278 | 21:47 |
clarkb | ya I think having the extra checks shouldn't be too much extra burden on the test runs so I figure keep it if this generates data showing it isn't a big impact | 21:47 |
clarkb | otherwise I'll split it out | 21:47 |
clarkb | I'm reenabling gitea01 now that 830278's update shows up on it | 21:50 |
*** dviroel is now known as dviroel|out | 21:52 | |
ianw | sorry back now, yeah we don't run that test in gate | 22:09 |
ianw | i'd probably prefer to not have it in functests too, because we already have so much going on running tests on the same element twice seems a bit wasteful | 22:10 |
clarkb | ianw: I don't know that we are using epel anywhere in the tests? But I may be wrong about that | 22:10 |
ianw | probably not, i mentioned in the change we could think about adding it to the nodepool tests | 22:12 |
clarkb | ah yup looks like we support an extra element argument. I can update for that | 22:13 |
ianw | my only concern with that though is that it ties us more to epel reliability with gate testing | 22:14 |
clarkb | I guess this comes down to how willing dib is to accept changes to the epel element without testing | 22:15 |
clarkb | it looks like a year ago we updated the role for centos-stream-9. Not sure how that was tested if at all | 22:15 |
ianw | we just have to trade off the cost of building a whole distro for one element, or the extra dependencies if adding it to the extant tests | 22:16 |
ianw | obviously the best solution would be to only test it when it changes, but i'm not sure how we could achieve that | 22:16 |
ianw | that is a bit of a general problem with dib, in that it throws everything at most every change | 22:17 |
clarkb | ya | 22:17 |
clarkb | we could modify the nodepool tests to only run when key bits change, but the key bits tend to be modified often enough that may not help as much as we hope | 22:17 |
clarkb | (things like the partitioning and so on) | 22:18 |
fungi | we could put some separate jobs in experimental | 22:19 |
clarkb | oh thats a thought | 22:19 |
ianw | it's probably easiest to do the follow-on idea and have one run that double-checks the match works, given it is unlikely to ever change after that it's, to me, an ok trade-off | 22:19 |
ianw | fungi / clarkb: if you could just double-check https://review.opendev.org/c/opendev/system-config/+/830288, i'd like to get that in to make sure the codesearch prod job starts working | 22:20 |
fungi | oh, yep, i already had that one pulled up to see how the job did | 22:21 |
fungi | lgtm | 22:21 |
ianw | thanks; it's an annoying path because it really only shows up in deploy | 22:22 |
fungi | and for future reference, those gpg messages about trust levels are just noise | 22:22 |
ianw | fungi: which messages are they? | 22:27 |
fungi | the ones you thought indicated bad signatures earlier | 22:27 |
clarkb | looking | 22:28 |
ianw | fungi: umm, sorry still not sure which ones? do you mean where we're updating the trustdb in the encrypt-file role? | 22:30 |
ianw | clarkb: if you have time too, https://review.opendev.org/c/openstack/diskimage-builder/+/830111 in dib can go in with the epel fix, if you can just double check I haven't missed anything in the matching/logic path there | 22:30 |
ianw | it fixes a regression in arm64 introduced by recent bootloader cleanup | 22:31 |
fungi | 09:39 <ianw> it looks like permissions errors with the gpg signing. have to think about that one, but i think it can wait | 22:31 |
clarkb | one thing on that change | 22:31 |
ianw | fungi: oh sorry, i meant on-disk file permissions | 22:31 |
fungi | well, there was also no signing going on there | 22:32 |
fungi | hence my confusion about what you meant | 22:32 |
ianw | clarkb: oh, doh, good catch | 22:34 |
clarkb | ianw: for https://zuul.opendev.org/t/openstack/build/146dd72e3f914332ad6b8f63dda8cb00/log/logs/rocky-container_build-succeeds.FAIL.log does that imply the elements list for build-succeeds is insufficient? | 22:34 |
clarkb | I'd like to figure out how to fix that but split it out into a child change | 22:35 |
clarkb | I see the fedora one has block-device-gpt. I'll try that | 22:35 |
ianw | ahh yeah, that will need block-device-*, mbr or gpt will work | 22:36 |
ianw | because it has the vm element | 22:36 |
ianw | now it says remote_src doesn't work with mode setting | 22:37 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: Add rocky support to the epel element https://review.opendev.org/c/openstack/diskimage-builder/+/830278 | 22:39 |
clarkb | something like that maybe | 22:39 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: DNM Follow on commit to test rocky + epel https://review.opendev.org/c/openstack/diskimage-builder/+/830291 | 22:39 |
clarkb | ianw: hrm | 22:39 |
clarkb | ianw: I find this aspect of ansible to be extremely confusing. Maybe we shell out a cp? | 22:40 |
ianw | i just tried it, and it works as i expected | 22:40 |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-playbook: encrypt logs in temporary staging directory https://review.opendev.org/c/opendev/system-config/+/830288 | 22:41 |
clarkb | https://docs.ansible.com/ansible/latest/collections/ansible/builtin/copy_module.html indicates that src is not remote unless remote_src is set | 22:41 |
ianw | clarkb: the other interesting thing with that change is that we probably do not setup podman to be able to run a containerfile element on the functest host | 22:44 |
ianw | so i wouldn't be surprised if it maybe fails trying to get the base image | 22:45 |
clarkb | ianw: ya that is why I also went with the nodepool functest update too :) | 22:45 |
clarkb | I figured I had twice the opportunity to get something working that way | 22:46 |
clarkb | ianw: for `&& ! -d /usr/lib/grub/*-efi` what are we trying to accomplish there? I'm comparing against the code that added the regression and it isn't clear | 22:46 |
clarkb | ianw: seems like before we always set the i386-pc target but now we're adding an additional spot where we don't? | 22:47 |
ianw | if "/usr/lib/grub/<arch>-efi" is there, it means the grub efi packages are installed. so this is trying to say "we are on a system that doesn't have grub-efi installed" | 22:47 |
clarkb | but in both cases don't we need to set the target for when it updates the grub install? Or maybe we just don't care about updating if it is already there? | 22:48 |
ianw | no, but now i look again, perhaps we should move this check into the section below | 22:50 |
ianw | actually, no, that doesn't work | 22:50 |
clarkb | ianw: I think the way it works is you always get both bios and uefi compat the way it was written | 22:51 |
clarkb | but now I think we may only get uefi depending on whether or not /usr/lib/grub/*-efi exists | 22:51 |
ianw | we added something so that uefi would make bios compatible images | 22:53 |
ianw | https://review.opendev.org/c/openstack/diskimage-builder/+/743243/3/diskimage_builder/elements/bootloader/finalise.d/50-bootloader | 22:54 |
ianw | this bit | 22:54 |
clarkb | ianw: that sets the same flag the code you are updating modifies | 22:55 |
clarkb | is the code you are modifying entirely redundant? | 22:55 |
ianw | no i don't think so, because we need to explicitly set the target (i think) in the case where you're building on a BIOS only image on an EFI system -- to avoid it trying to guess from /sys | 22:56 |
clarkb | I suspect it is since both seem to check for use of efi then set the --target=i386-pc flag | 22:56 |
ianw | if we fall into the mbr/gpt bits we want that flag set | 22:57 |
clarkb | why are we checking /sys and /usr then? | 22:57 |
clarkb | Shouldn't we just check if this is mbr/gpt? | 22:57 |
clarkb | I think that is what I'm confused about. We seem to be checking the same thing multiple different ways and setting the same flags either way. If efi then set --target=i386-pc | 22:58 |
ianw | we may be able to refactor it to that, yes | 22:58 |
ianw | i think history might show that we have added the gpt/mbr path well after this check, and never gone through and read it top-to-bottom | 23:00 |
clarkb | I also may be getting a bit confused by the values in DIB_BLOCK_DEVICE | 23:01 |
clarkb | since efi can boot mbr and gpt (though apparently on arm they neglected the part of the spec that requires they be backward compatible?) | 23:01 |
clarkb | and I thought bios only did mbr? But maybe if you have a newer bios it does both too | 23:01 |
clarkb | anyway I see now where it is different as mbr or gpt doesn't seem to imply efi. Even though we check if efi is used then set the value for mbr or gpt :) | 23:03 |
ianw | i believe that efi can only boot from gpt | 23:03 |
clarkb | ianw: I think I understand this better and it seems like if DIB_BLOCK_DEVICE == gpt or mbr then we always want to set i386-pc because those imply non efi systems. I think the efi check is because grub will automatically determine i386-pc properly unless booted with efi | 23:04 |
ianw | so in our code, efi implies gpt, but gpt does not imply efi | 23:04 |
clarkb | ianw: to simplify this I think you can just add a check for x86 in the mbr/gpt block and set the flag there. Then it is a lot more direct and makes sense | 23:04 |
clarkb | you shouldn't need the efi check in there since we've already determined we are not efi? | 23:04 |
ianw | ok, so this check came in with | 23:06 |
ianw | https://review.opendev.org/c/openstack/diskimage-builder/+/36861 | 23:06 |
ianw | which interestingly gives a gerrit 500 error | 23:06 |
clarkb | really I think the confusing part is checking if we want to manually set the flag because we're on efi and grub won't autodetect. But instead we can just always set the target if not efi regardless of the host situation | 23:07 |
ianw | this *way* predates any of the block-device gpt/mbr/efi stuff | 23:07 |
ianw | it seems like we've added those bits, but not really pulled out the old check | 23:07 |
clarkb | I think those gerrit errors are related to comments that didn't make the migration :/ | 23:07 |
ianw | so i tend to agree a full refactor will work | 23:08 |
clarkb | *notedb migration | 23:08 |
clarkb | ianw: I'm happy to approve it now that I understand the tangle there a bit better :) | 23:08 |
clarkb | or would you prefer to try and refactor first? | 23:08 |
ianw | perhaps i'll propose a follow-on with the refactor to keep it separate | 23:08 |
clarkb | the !-d /usr/.... check still seems unnecessary since we should be able to set the target regardless in this case | 23:08 |
ianw | right; now we actually know what we are building, in the 36861 days we didn't | 23:09 |
clarkb | jentoio: does tomorrow afternoon work for syncing up on the container stuff? we can do a jitsimeet call? | 23:18 |
clarkb | I've updated the meeting agenda. Please edit to add or fixup any topics in the next little bit then I'll send it out | 23:20 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: bootloader: clean up EFI checking https://review.opendev.org/c/openstack/diskimage-builder/+/830292 | 23:22 |
ianw | clarkb: thanks, fresh eyes on bits are always good! | 23:22 |
clarkb | cool +2'd both but didn't approve the first one in case you wanted to land them together or squash | 23:24 |
ianw | let's make sure it passes separately and see | 23:26 |
*** rlandy|ruck is now known as rlandy|ruck|bbl | 23:30 | |
clarkb | ianw: https://zuul.opendev.org/t/openstack/build/5af09c236f6745af8f5e9811e62d64e6/log/nodepool/builds/test-image-0000000001.log#924-969 I think that shows the epel for rocky change working | 23:39 |
ianw | thanks, lgtm | 23:45 |
opendevreview | Merged opendev/system-config master: run-production-playbook: encrypt logs in temporary staging directory https://review.opendev.org/c/opendev/system-config/+/830288 | 23:46 |
clarkb | Gerrit is doing a hackathon in May with in person and remote attendance. The in-person is super limited and I'm not sure I'm up for the travel anyway. So then I look at the remote stuff and wonder if I can be away from 09:00 - 17:00 London time | 23:52 |
clarkb | I suspect that would be very difficult :) | 23:52 |
clarkb | python3.10 adds additional determinism to python thread scheduling. I guess we'll want to keep testing an old python with anything that might have races | 23:54 |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-playbook: fix path typo https://review.opendev.org/c/opendev/system-config/+/830294 | 23:57 |
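
The 21:01 to 21:05 exchange settles on copying the log and encrypting it in a temporary directory, so the unprivileged user never has to write into /var/log/ansible. The following is only a rough Python sketch of that idea, not the actual Ansible role: `stage_log` and `build_encrypt_command` are hypothetical helper names, and the gpg2 arguments simply mirror the command ianw quoted at 21:01.

```python
import shutil
import tempfile
from pathlib import Path


def stage_log(source_log: Path, staging_dir: Path) -> Path:
    """Copy the log's contents into a directory the calling user can write to."""
    staged = staging_dir / source_log.name
    shutil.copyfile(source_log, staged)
    return staged


def build_encrypt_command(staged_log: Path, recipient: str) -> list:
    """Build the gpg2 argv so the .gpg output lands next to the staged copy."""
    output = staged_log.with_name(staged_log.name + ".gpg")
    return [
        "gpg2",
        "--encrypt",
        "--output", str(output),
        f"--recipient={recipient}",
        str(staged_log),
    ]


if __name__ == "__main__":
    # Illustration only: a throwaway file stands in for the real production log.
    with tempfile.TemporaryDirectory() as log_dir, tempfile.TemporaryDirectory() as staging:
        log = Path(log_dir) / "service-codesearch.yaml.log"
        log.write_text("example log line\n")
        staged = stage_log(log, Path(staging))
        print(" ".join(build_encrypt_command(staged, "0x9615aec8")))
```

The command is only built and printed here; actually running it would need gpg2 installed and the recipient key imported, which is outside this sketch.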
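
fungi's description at 16:26 to 16:31 of how pbr builds dev versions (commit count in the version string, commit id kept in separate metadata, next increment guessed from hints in the commit history) can be illustrated with a small sketch. This is an assumption-laden toy, not pbr's implementation: `next_dev_version` and its `bump` parameter are hypothetical, and real pbr derives the bump level from commit messages rather than taking it as an argument.

```python
import re

# Position of each component in a three-part version (hypothetical helper table).
_BUMP_INDEX = {"major": 0, "minor": 1, "patch": 2}


def next_dev_version(last_tag: str, commits_since_tag: int, bump: str = "patch") -> str:
    """Return a scalar dev version for the release that would follow last_tag."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", last_tag)
    if match is None:
        raise ValueError(f"expected an X.Y.Z tag, got {last_tag!r}")
    if commits_since_tag == 0:
        # Sitting exactly on the tag: the version is the tag itself.
        return last_tag
    parts = [int(piece) for piece in match.groups()]
    index = _BUMP_INDEX[bump]
    parts[index] += 1
    # Components to the right of the bumped one reset to zero.
    for position in range(index + 1, 3):
        parts[position] = 0
    return "{}.{}.{}.dev{}".format(*parts, commits_since_tag)


if __name__ == "__main__":
    print(next_dev_version("3.18.0", 5))
    print(next_dev_version("3.18.0", 2, bump="minor"))
```

Because the dev segment is a plain commit count with no hash in it, the resulting strings sort as ordinary version numbers, which is the "scalar dev versions" property mentioned in the discussion.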