Wednesday, 2022-05-18

ianw"The warning appears both in the host and guest kernel logs." -- it's possible we're flooding the logs on the ovh side with this too00:01
ianwFor reference, 0x4 is SSBD mitigation, and 0x2 is the STIBP mitigation, and in the original bug report which was related to SSBD only00:06
ianwthe write was 0x4, but here it is 0x6 indicating that guest wants to enable both.00:06
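Note: the values quoted above can be read as a bit mask. The sketch below is illustrative only; it assumes the standard low-bit layout of the IA32_SPEC_CTRL MSR (bit 0 IBRS, bit 1 STIBP, bit 2 SSBD), and the names `SPEC_CTRL_BITS` and `decode_spec_ctrl` are hypothetical helpers, not anything from the bug report.

```python
# Low three bits of the IA32_SPEC_CTRL MSR (assumed layout, see note above).
SPEC_CTRL_BITS = {
    0x1: "IBRS",
    0x2: "STIBP",
    0x4: "SSBD",
}


def decode_spec_ctrl(value):
    """Return the names of the mitigation bits set in a SPEC_CTRL write."""
    return [name for bit, name in sorted(SPEC_CTRL_BITS.items()) if value & bit]


print(decode_spec_ctrl(0x4))  # the value from the original, SSBD-only report
print(decode_spec_ctrl(0x6))  # the value quoted here: STIBP and SSBD together
```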
opendevreviewMerged opendev/system-config master: Add testing for jammy openafs
ianwIgn:4 jammy-backports InRelease00:28
ianwthis might be something else to look into.  as we figured out, bionic images don't have backports enabled.  i'm not sure what our policy is on it, but we should probably make the images ~ the same00:29
ianwmodel name: Intel Core Processor (Haswell, no TSX)00:30
ianwthis is what ovh is reporting.  so it doesn't seem like amd is involved here00:31
ianw[    2.290166] kernel: unchecked MSR access error: WRMSR to 0x48 (tried to write 0x0000000000000004) at rIP: 0xffffffffabc90af4 (native_write_msr+0x4/0x20)00:32
ianwinteresting, this time it was 0x400:32
ianwoh, wait, it was before too00:32
ianwSSBD mitigation00:32
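Note: to check which value a guest tried to write without reading the hex by eye, the kernel message can be parsed mechanically. This is a minimal sketch that assumes the message keeps the exact wording of the line pasted above; `MSR_WRITE_RE` and `parse_msr_write` are hypothetical names, and only the WRMSR form is handled.

```python
import re

# Matches the "unchecked MSR access error" line as pasted in this log.
MSR_WRITE_RE = re.compile(
    r"unchecked MSR access error: WRMSR to 0x(?P<msr>[0-9a-fA-F]+) "
    r"\(tried to write 0x(?P<value>[0-9a-fA-F]+)\)"
)


def parse_msr_write(line):
    """Return (msr_address, value_written) as ints, or None if no match."""
    match = MSR_WRITE_RE.search(line)
    if match is None:
        return None
    return int(match.group("msr"), 16), int(match.group("value"), 16)


LINE = (
    "[    2.290166] kernel: unchecked MSR access error: WRMSR to 0x48 "
    "(tried to write 0x0000000000000004) at rIP: 0xffffffffabc90af4 "
    "(native_write_msr+0x4/0x20)"
)

msr, value = parse_msr_write(LINE)
print(hex(msr), hex(value))
```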
Clark[m]Re backports I'm not concerned about enabling them but iirc we do configure our sources lists explicitly with Ansible so someone likely decided not to include them on our servers00:33
ianw      SSBD: speculative store bypass disable   = false00:34
ianw      virtualized SSBD                         = false00:34
ianw      SSBD fixed in hardware                   = false00:34
ianwand another weird thing -- this is an OVH vm, but it has rax mirrors?01:11
fungiare you...sure?01:14
fungiif so, is it the rax mirror we use when building images?01:15
fungimaybe we're not resetting it correctly with the base job01:15
fungiyeah, dfw is what we have in our images, so that's likely01:16
ianwfungi: as sure as i ever am, which means there is a large possibility i have something dramatically wrong :)01:17
ianwso using a mainline 5.17 kernel on this ovh jammy image the problem doesn't happen01:18
ianwergo there is something that can be backported to fix it01:18
fungias long as it doesn't also need to support openafs ;)01:20
fungier, with 5.17 i mean01:20
fungiwhatever might get backported probably wouldn't have anything to do with my openafs build problem01:21
ianwoh, there's usually an upstream patch to openafs for more recent kernels pretty quickly after release?  but it may not have made it into openafs releases yet01:24
fungior debian may need to update its openafs package01:36
fungithe lkm for won't build on debian with linux 5.17 kernel headers:
fungierror: implicit declaration of function ‘complete_and_exit’01:40
ianwyeah it is fixed
fungioh, awesome01:42
ianwnot that i know much, but i've filed
ianwi've got an ovh vm and am trying to bisect it there; we'll see if that works ...01:57
Clark[m]You are bisecting between latest mainline and 5.15? That might take a while02:28
ianw5.17 and 5.15; yeah it says ~10 steps02:31
ianwi haven't actually managed to build a kernel yet, so ... yeah :)02:32
fungimaybe you'll get lucky and it'll be near an early bisection point02:44
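Note: the "~10 steps" estimate follows from bisection halving the candidate range at each test. The sketch below is a simplified linear model, not git's actual algorithm (which works on a commit graph). It searches for the first fixed commit rather than the first bad one, as in this discussion. `first_fixed` and `is_fixed` are hypothetical names, and the range of 1024 commits is made up for illustration.

```python
def first_fixed(commits, is_fixed):
    """Binary-search an ordered commit list for the first fixed commit.

    Assumes the last commit is fixed and that once a commit is fixed,
    every later commit is fixed as well.
    Returns (commit, number_of_builds_tested).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is fixed
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1  # one kernel build and boot test per step
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid + 1  # the fix is after mid
    return commits[lo], steps


# Made-up example: 1024 candidate commits, with the fix landing at index 700.
commits = list(range(1024))
commit, steps = first_fixed(commits, lambda c: c >= 700)
print(commit, steps)
```

With 1024 candidates the loop always takes 10 steps, which is where an estimate of about ten builds would come from for a range of that size.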
ianwoh of course, i haven't run configure-mirrors on this manually setup node, doh.  so that's why that is pointing at rax02:51
*** pojadhav- is now known as pojadhav05:39
*** ysandeep|out is now known as ysandeep|rover06:04
*** ysandeep|rover is now known as ysandeep|rover|lunch07:24
*** ysandeep|rover|lunch is now known as ysandeep|rover08:38
ianw2f46993d83ff4abb310ef7b4beced56ba96f0d9d is the first fixed commit09:11
ianwamorin: ^ 09:16
ianwif we can understand why this changes fixes things, then hopefully we can request a backport09:19
ianws/this changes/this change/09:19
*** jpena|off is now known as jpena09:46
fricklerianw: nice find. according to that commit it looks like we could have its effect by setting proper kernel cmdline options?10:01
*** rlandy|out is now known as rlandy10:22
ianwyeah, that might be a workaround, i'll have to really parse what's going on tomorrow.  when the changelog is a couple of orders of magnitude bigger than the change there's something going on :)10:27
*** dviroel_ is now known as dviroel11:19
*** sfinucan is now known as stephenfin12:04
*** arxcruz_ is now known as arxcruz12:34
mnaserinfra-root: could someone have a look at the zuul logs to see why this is getting +2'd by zuul but not merged?
mnaser(with no reports as to why)12:44
fricklermnaser: it needs a rebase, note the red "merged" entries in the relation chain12:52
mnaserfrickler: but the parent change was in the same state also but that merged12:53
mnaseroh nvm, i see it was rebased12:53
fricklermnaser: also, if you expanded the full change info on the left with "show all", you could see the parent info with a circled "I" that shows a popup note "Not current - rebase possible"12:54
*** pojadhav is now known as pojadhav|afk14:05
*** ysandeep|rover is now known as ysandeep|dinner15:17
*** rlandy is now known as rlandy|mtg15:31
*** dviroel is now known as dviroel|lunch15:39
*** ysandeep|dinner is now known as ysandeep15:52
*** ysandeep is now known as ysandeep|out15:57
johnsomFollow up on the fips/reboot/unbound issue from yesterday: We have a patch for devstack that works:
johnsomHowever frickler feels this should not be in devstack, but in the zuul level.16:13
johnsomPersonally I think it should be in devstack as I think it's good to stop devstack early with a clear error rather than have it run down to error out with missing packages.16:13
johnsomGiven the DNS issues we have had in the past, I don't think this only applies to the FIPS jobs.16:14
johnsomWondering if anyone here has additional thoughts to add to that patch.16:14
*** dviroel|lunch is now known as dviroel16:31
*** rlandy|mtg is now known as rlandy16:35
*** jpena is now known as jpena|off16:52
*** rlandy is now known as rlandy|mtg18:28
*** rlandy|mtg is now known as rlandy19:08
*** dviroel is now known as dviroel|out20:28
*** rlandy is now known as rlandy|bbl22:11
