Wednesday, 2023-09-20

opendevreviewMerged openstack/ironic-inspector master: CI: re-add genade job to normal CI queues  https://review.opendev.org/c/openstack/ironic-inspector/+/89586300:11
opendevreviewJake Hutchinson proposed openstack/bifrost master: Bifrost NTP configuration  https://review.opendev.org/c/openstack/bifrost/+/89569108:45
dtantsurJayF, could we squeeze https://review.opendev.org/c/openstack/ironic-inspector/+/881463 in the release? The IPA deprecation part has been there for some time.09:34
*** vanou is now known as Guest73710:33
opendevreviewMerged openstack/ironic-inspector master: Handle bracketed IPv6 redfish_address  https://review.opendev.org/c/openstack/ironic-inspector/+/89573410:57
iurygregorymorning ironic o/11:15
opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic master: RedfishFirmware Interface  https://review.opendev.org/c/openstack/ironic/+/88542511:35
iurygregoryok, funny thing I still get the Failed to set node power state to power on. =( 11:36
iurygregorymaybe is a bad firmware I'm updating? <thinking>11:36
opendevreviewMerged openstack/ironic-inspector master: Support LLDP data coming in the new field  https://review.opendev.org/c/openstack/ironic-inspector/+/88146312:38
opendevreviewHarald Jensås proposed openstack/ironic-inspector stable/2023.1: Handle bracketed IPv6 redfish_address  https://review.opendev.org/c/openstack/ironic-inspector/+/89590612:49
TheJuliaBad firmware or firmware not ready13:03
TheJuliaLet’s move forward, and just be mindful to backport fixes13:04
mmalchukgood day Ironic o/13:06
iurygregoryTheJulia, yeah agree, trying to understand if there is something else I could do to handle the case... like re-try to power on after some time maybe...13:16
iurygregoryif the error was ironic.common.exception.PowerStateFailure: Failed to set node power state to power on.13:16
iurygregorybut in general, the code updates the firmware, the information is updated in the DB (in case of the power failure I haven't seen the versions being updated)13:18
TheJuliaI think, if it does fail to power on, and our expected state is power on, we'll fix that eventually13:19
TheJuliabut you can only try/except/retry ;) so many different cases with minimal data13:20
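The try/except/retry idea discussed above could look roughly like the sketch below. It is a minimal illustration, not the code in the RedfishFirmware patch: `PowerStateFailure` is a local stand-in for `ironic.common.exception.PowerStateFailure`, and `power_on_with_retry`, `set_power_on`, `attempts` and `delay` are hypothetical names chosen for this example.

```python
import time


class PowerStateFailure(Exception):
    """Local stand-in for ironic.common.exception.PowerStateFailure."""


def power_on_with_retry(set_power_on, attempts=3, delay=5.0, sleep=time.sleep):
    """Call set_power_on() until it succeeds or attempts run out.

    Returns the attempt number that succeeded. Re-raises the last
    PowerStateFailure if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            set_power_on()
            return attempt
        except PowerStateFailure as exc:
            last_error = exc
            # Give the BMC time to settle (for example, firmware still
            # applying) before asking it to power on again.
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

Passing `sleep` as a parameter keeps the waiting behaviour replaceable, which helps when the only data points come from a handful of machines and the right delay is still a guess.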
mmalchukfolks, please review https://review.opendev.org/c/openstack/diskimage-builder/+/895486 and trivial fixes in the related chain 13:21
iurygregoryTheJulia, yeah =( I wish things could be more deterministic lol13:22
TheJuliaiurygregory: you need lots of data points13:22
iurygregoryyup13:23
TheJuliaiurygregory: if you're super worried about it, add a release note indicating we would love feedback since it is a new feature and we as a developer community have limited hardware access13:23
TheJuliaor something along those lines13:23
TheJuliawe don't have everything everyone is using13:23
iurygregory++ let me update the release note, I'm also pushing a separate patch with the docs about the feature13:24
iurygregorywill also mention there13:24
TheJulia++13:26
TheJulia"warning: may have sharp edges... and dull ones too. please reach out to the ironic developer community with any unexpected behavior."13:27
iurygregory oh perfect CI doesn't seem in a good shape in ironic :D or maybe is just my patch13:34
TheJuliaiurygregory: link?13:40
TheJuliahelp me see what you see!13:40
dtantsurmmalchuk, I think only TheJulia has any rights on DIB, and she has already reviewed13:41
TheJuliastevebaker[m] does as well13:41
TheJuliaand there is the #openstack-dib channel13:41
iurygregoryhttps://zuul.opendev.org/t/openstack/build/529fc30cdf5f4abeb014352780de5ac3 https://zuul.opendev.org/t/openstack/build/f7028e211ccb4c2e87b7c4bdb3d95ee5  https://zuul.opendev.org/t/openstack/build/8a71f08821f64da4a1186465c7e96351 I'm focusing in the standalone first13:42
mmalchukdtantsur thank you13:42
iurygregorymetalsmith seems unhappy and bifrost also .-.13:42
mmalchukTheJulia thanks for link13:42
TheJuliareally?!?13:43
TheJulia(that was for iurygregory )13:43
iurygregory:D13:43
iurygregoryok, failed to start uefi doesn't seem like a good sign13:44
iurygregoryhttps://zuul.opendev.org/t/openstack/build/529fc30cdf5f4abeb014352780de5ac3/log/controller/logs/ironic-bm-logs/node-3_console_log.txt 13:44
TheJuliaiurygregory: I think it is your patch maybe13:51
TheJuliaor maybe not13:51
TheJuliahttps://paste.opendev.org/show/bxrj0E82McAM8RMLIcpe/13:53
iurygregoryyeah, trying to figure out what could have caused this; until patch set 4, CI was green on it13:53
TheJuliaperhaps a dummy change on CI just to see its current health14:09
iurygregoryyeah pushing now14:15
opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic master: [DNM] Testing  https://review.opendev.org/c/openstack/ironic/+/89593814:23
TheJuliaanyone have the ptg etherpad link handy?14:49
TheJuliahttps://etherpad.opendev.org/p/ironic-ptg-october-202314:50
iurygregoryon my patch the functional job is hitting timed_out, in the dnm is green =(14:54
iurygregoryyeah standalone is also green15:08
TheJuliaI think I see what is going on15:36
TheJuliaiurygregory: want to jump on a call and talk through it?15:37
iurygregorysure15:37
iurygregorylet me get a meet link15:37
TheJuliaok15:38
iurygregoryTheJulia, https://meet.google.com/afo-jzvj-vpj 15:39
JayFFYI we (well, openstack telemetry project) are adding an OSC plugin to talk to prometheus15:48
JayFpretty neat15:48
JayFhttps://review.opendev.org/c/openstack/governance/+/89491515:48
clarkbJayF: this is the query data in prometheus?15:49
clarkb"communication with prometheus" is a bit ambiguous :)15:49
JayFhttps://github.com/infrawatch/python-observabilityclient15:51
JayF> observabilityclient is an OpenStackClient (OSC) plugin implementation that implements commands for management of Prometheus.15:51
JayFthat's where it's being imported from15:51
JayFI'15:51
JayF**I am going to begin cutting releases and branches for all Ironic things except Ironic proper15:51
clarkbside note: All those osc plugins you are installing are what make osc performance terrible15:51
clarkbthe plugin registration system adds significant overhead to python process startup times15:52
JayFTheJulia: iurygregory: /me will be more appreciative of those tests in the future :D 16:03
TheJulia:)16:03
iurygregory:D16:04
opendevreviewMark Goddard proposed openstack/bifrost master: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/89594816:11
opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic master: RedfishFirmware Interface  https://review.opendev.org/c/openstack/ironic/+/88542516:13
JayFmgoddard: I suspect it's being used as a proxy for sqlite16:25
JayFmgoddard: and we don't support sqlite migrations16:25
JayFmgoddard: just a hunch16:25
JayFfyi; I'm tracking my work on making Ironic releases for 2023.2: https://etherpad.opendev.org/p/ironic-bobcat-releases in case any of the release managers on the team want to review my work16:30
iurygregoryJayF, ack16:33
iurygregoryif you want to share the overhead I can take care of some releases16:33
JayFYou have one job16:34
* JayF kicks the redfish bmc16:34
JayFlol16:34
JayFI don't mind overviewing all the releases, if you want to offer peace of mind you can check my work16:34
iurygregoryI'm checking the patches you have open o/16:37
JayFiurygregory: I was all worried about us cutting a bifrost release outta master and not doing bugfix/ or stable/ branching at the time16:38
JayFI ran around for a little while worried until it occurred to me: that is the happy path for all other projects16:38
JayFwe are the weirdos who love a good branch lol16:38
iurygregorydidn't we make a decision about not having the bugfix in bifrost?16:38
JayFin this situation, it's more that when we cut libraries a month ago16:39
JayFwe didn't branch16:39
JayFwhich is something to look out for next time16:39
iurygregoryyeah16:39
iurygregoryJesus https://review.opendev.org/q/project:openstack/bifrost+status:open+branch:master16:40
iurygregorya lot of merge conflict :D 16:40
iurygregoryhttps://review.opendev.org/c/openstack/bifrost/+/884198 this would be good to include in bifrost, but we can have this as backport also16:41
JayFI have no problem handling that as a backport16:41
iurygregoryJayF, fyi molteniron we don't have releases16:42
iurygregory:D16:42
JayFack16:42
JayFthat is just a list from parsing projects.yaml16:42
JayFwhich I do because I sometimes forget things we manage LOL16:42
iurygregoryjust to be sure, the list you have in the etherpad are you planning to cut the stable branch right?16:42
JayFthat is my list of things to check16:42
JayFif it has a bullet under it, it's been checked that's the status16:42
JayFiurygregory: fwiw I usually use the release cycle boundary as a good oppo to see if the independent projects need a release, it's just easy to do +4 more :D 16:51
iurygregoryagree16:52
iurygregoryis just a reminder that we don't have stable branches for them16:52
JayFiurygregory: TheJulia: others; any objection to me making sushy-tools next release 1.0.0?16:53
* JayF really dislikes the 0.x.y styling of releases16:53
JayFI'm pretty sure it works :D 16:53
iurygregoryit's a good point16:54
* JayF JFDI16:54
iurygregoryI don't have objections tbh16:54
JayFmgoddard: tenks hasn't been released since 2019; and has a weird branching model. We should probably get some understanding of how releases there should be managed (or maybe just ... no more releases and change the model?)17:02
iurygregoryJayF, I've added a comment in the metalsmith one17:02
JayFupdated17:02
JayFyeah that's gone17:03
JayFmgoddard: /me sends you an inside slack about this since I don't know how much you IRC :)17:04
JayFI'm cutting inspector now, that only leaves Ironic17:12
JayFwhich I will wait to cut until Iury's patch gets along or at some point tomorrow if it's not gonna make it (but it will)17:12
JayFNote that at this point; any changes landed in master to any Ironic project except openstack/ironic will miss the release unless you contact me to update the releases patch17:15
iurygregoryJayF, I have the feeling that ngs is broken...17:26
JayFiurygregory: it is until neutron branches17:26
JayFiurygregory: I have a note on the PTG etherpad about it; I think in the future we need to branch and release it at library time17:26
iurygregoryok, so I probably missed something lol17:26
JayFiurygregory: it's all requirements-shifting-bs17:26
iurygregorygotcha17:26
JayFiurygregory: line 76, it talks about networking-bm but I expect ngs to have similar problems17:27
JayFhttps://etherpad.opendev.org/p/ironic-ptg-october-202317:28
iurygregoryJayF, tks!17:28
iurygregoryJayF, ironic-lib I don't think we need, like metalsmith 17:33
JayFiurygregory: ack; got it17:34
TheJuliayeah, anything which uses neutron lib can break, but I think ngs was okay last I looked sans the dlm testing is failing because of something with etcd. I don't quite grok where things went sideways though17:45
iurygregoryhumm18:03
* iurygregory checks something18:03
TheJuliait happens, we've had to fix it post-release in the past18:13
TheJuliaiurygregory: +2'ed, "short order" in my book is the next couple of weeks.18:19
TheJuliathree things, two minor, one I think you'll be spending some time on, but given the amount of testing I'm comfortable at this time18:19
iurygregoryack18:20
TheJuliaSo, any odds on if my car will be done today? :)18:21
iurygregorydepends on what they are fixing I would say18:21
JayFbased on my recent experience with repair professionals of all kinds18:22
JayFprobably not18:22
JayFdid they tell you it'd be done last week sometime? if so, maybe18:22
iurygregoryhttps://zuul.opendev.org/t/openstack/build/9fd16425ab244fb8ac4959ffe2594578/log/job-output.txt#10170 seems like ovs is mad in ngs18:23
TheJuliaJayF: "most likely today", which really means tomorrow18:25
JayFmost likely today means they are starting on it today18:26
JayFand maybe tomorrow if there's no shenanigans18:26
JayFbut shenanigans pay well so good luck :P18:26
TheJuliaheh18:27
iurygregoryhttps://stackoverflow.com/questions/48577019/not-able-create-ports-in-ovs18:28
iurygregoryI'm wondering if this is what we are hitting in ngs18:29
JayFDoes NGS' CI do anything that Ironic's doesn't?18:29
JayFmeaning in terms of environmental setup18:29
JayF(I'm curious if that is the breakage, would we see it in Ironic; if so, can you steal the fix from ironic/devstack/etc)18:30
iurygregorywell the job I was looking is pure tempest testing..18:30
TheJuliait does some additional networking setup18:30
iurygregorythinking of giving a try in https://opendev.org/openstack/networking-generic-switch/src/branch/master/devstack/plugin.sh#L12818:30
TheJuliabut https://60870c86ba00a6b46654-40c40653085e805e0ea6c3df0bb43128.ssl.cf2.rackcdn.com/886404/1/check/networking-generic-switch-tempest-dlm/9fd1642/controller/logs/screen-q-svc.txt is downright fatal18:31
JayFiurygregory: TheJulia: I'm going to go -1 my releases patch for ngs while you are digging on this18:31
iurygregoryright etcd is angry 18:31
JayFNo I'm not, just kidding, it's just a branch create not a release18:32
TheJuliayeah, I've been trying to get people to look/chime in for weeks on this18:32
JayFTheJulia: this is in johnthetubaguy's changes that merged this cycle, yes?18:32
TheJuliano18:32
TheJuliaI don't think so18:32
JayFI thought that's what used etcd18:32
TheJuliathe dlm code has been there for ages, but for some reason it can't find the entry18:33
TheJulianow, maybe he touched it, dunno18:33
TheJuliabut it seems to have started failing after that s well18:33
TheJulias/\ s\ /\ as\ /18:34
TheJuliawell, https://review.opendev.org/c/openstack/networking-generic-switch/+/743283 was the last patch to merge18:41
opendevreviewJay Faulkner proposed openstack/networking-generic-switch master: DNM: Revert "Support batching up commands"  https://review.opendev.org/c/openstack/networking-generic-switch/+/89591518:42
iurygregoryinteresting18:42
JayFjust some science18:42
JayFif we nail it down to that patch it'll make it easier18:42
iurygregory++18:42
TheJuliai don't think it is though, it raises the exception in the dlm code18:42
TheJuliawhich is not what that patch touched18:42
iurygregoryhummm18:43
iurygregorybut the patch added the etcd3gw requirement18:44
iurygregoryhttps://review.opendev.org/c/openstack/networking-generic-switch/+/743283/11/requirements.txt18:44
TheJuliahmmm18:45
TheJuliayeah18:45
TheJuliainteresting18:45
TheJuliasince i thought the dlm code was previously using it18:45
JayFI bet it's something like18:45
iurygregory*magic*18:45
JayFif etcd3gw is installed18:45
JayFtooz wants to use it as the backend18:45
opendevreviewJulia Kreger proposed openstack/networking-generic-switch master: DNM: Revert "Support batching up commands"  https://review.opendev.org/c/openstack/networking-generic-switch/+/89591618:45
iurygregory2 reverts? lol18:46
JayFjust for science18:46
JayFnot actually going to land them18:46
TheJuliaoh, doh18:46
iurygregory:D 18:46
iurygregoryagree, two showing the same result is better than one18:46
TheJuliaI abandoned mine18:46
iurygregorygoing to the gym now, be back in about 2hrs18:47
TheJuliahave fun, heading back to the house to wait for car in air conditioning18:47
JayFTheJulia: iurygregory: https://github.com/openstack/networking-generic-switch/blob/master/networking_generic_switch/devices/__init__.py#L173 returns true in all cases, I think18:50
JayFwhich leads to etcd/tooz coordination getting returned in every case18:50
JayFwhen in CI we should have it disabled until we do the work to enable it18:50
JayFhmm or backend_url is set18:53
JayFline 145, so it is the elif case there18:53
JayFthat code /should not be running/18:53
JayFhttps://60870c86ba00a6b46654-40c40653085e805e0ea6c3df0bb43128.ssl.cf2.rackcdn.com/886404/1/check/networking-generic-switch-tempest-dlm/9fd1642/controller/logs/etc/neutron/plugins/ml2/ml2_conf_genericswitch.ini18:57
JayFwe tell it etcd is there, but it's not or is not configured18:57
JayFhttps://60870c86ba00a6b46654-40c40653085e805e0ea6c3df0bb43128.ssl.cf2.rackcdn.com/886404/1/check/networking-generic-switch-tempest-dlm/9fd1642/controller/logs/screen-etcd.txt and etcd is running18:58
JayFSep 14 00:37:08.318687 np0035244654 etcd[23648]: advertise client URLs = http://10.209.64.96:237918:58
JayFthis is correct, too18:59
JayFfirewalling, perhaps? 18:59
JayFsince it's going to ip:port and not localhost:port?18:59
JayFand etcd had been running for 2 minutes at the point in which it's called19:00
JayFWTF19:00
opendevreviewJay Faulkner proposed openstack/networking-generic-switch master: CI fix: Ensure we use the same ETCD version tooz CI does  https://review.opendev.org/c/openstack/networking-generic-switch/+/89597319:07
JayFtrying ^ since I came to the conclusion the most likely explanation is that etcd3gw and etcd were incompatible, given they were all configured correctly19:12
JayFiurygregory: I'm approving firmware interface; please ensure you get the followup with TheJulia's issues resolved pushed up ASAP; I'd prefer just have that and not need to backport (I wanna cut Ironic by EOD tomorrow so I think we have time)19:13
JayFif that etcd version lock works; I might make a -nv version of the job on master after branch is cut that uses etcd and a voting version that doesn't; just to help us isolate failures (since I anticipate "etcd version mismatch" might be a recurring pain)19:14
JayFyes, confirmed19:19
JayFhttps://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.4.md?plain=1#L811 + https://github.com/openstack/tooz/blob/master/tooz/drivers/etcd3gw.py#L20419:19
JayFI think that tells the story19:20
JayFand my version lock should work19:20
TheJuliaInteresting19:21
TheJuliaSeems like tooz needs a fix then too19:21
TheJuliaI guess that can just be next cycle too19:21
JayFI just mentioned in #openstack-oslo19:21
JayFand also you can force api version in config via url19:21
JayFso it's not that bad19:21
TheJuliagreat find!19:22
JayFalso I'll note: NGS is fine19:22
TheJulia\o/19:22
JayFthis is just operational shenanigans in CI19:22
JayF(assuming this is correct)19:22
JayFwhich is relatively safe at this point, I think19:22
JayFthat etcd version 404'd20:07
TheJuliawell, etcd, if we got a newer etcd by default via the gate, it would break then20:08
JayFI'm just looking at releases page to find one that works20:08
JayFto get this passing for now, until we talk to oslo about fixing tooz20:09
JayFI'd also rather not code in a default version to our devstack20:09
clarkbJan Gutter had a thread on openstack-discuss about upgrading etcd20:09
clarkbstarts on August 10 according to my email client20:09
clarkbhttps://review.opendev.org/c/openstack/tooz/+/891355 seems this is on peoples radar just stalled out maybe?20:10
JayFack, makes sense20:11
JayFI will try to just fix our job locally then20:11
clarkboh maybe it is waiting on the pifpaf release20:11
opendevreviewJay Faulkner proposed openstack/networking-generic-switch master: CI fix: Use the un-deprecated v3 etcd API  https://review.opendev.org/c/openstack/networking-generic-switch/+/89597320:14
JayFclarkb: thank you for that pointer, I never explicitly said that :D20:19
clarkbyou're welcome20:19
JayFI think for our purposes; we just need to fix CI and help (if not too many cooks) getting that stuff landed in tooz20:19
JayFall the knobs are there for the operators to ensure ngs works with their installed etcd version20:20
JayFthat failing NGS job is passing with that new patchset20:36
JayFthere's still another job running but that's even more confirmation our release is good and works \o/ 20:36
opendevreviewMerged openstack/ironic master: RedfishFirmware Interface  https://review.opendev.org/c/openstack/ironic/+/88542521:05
iurygregoryI'm back21:44
iurygregoryso we need to add "?api_version=v3"21:44
iurygregoryand this will magically fix ngs?21:44
TheJuliaseems like it21:45
iurygregorywhat kind of sorcery is that?!21:46
iurygregoryI'm ok with hardcoding (we should probably add some note regarding this so we don't forget to change it or something)21:46
JayFtooz defaults to v3alpha21:46
JayFwhich is gone since 3.321:46
iurygregoryperfect!21:46
JayFtooz parses that arg and uses the new one21:46
JayFthey have a patch to fix it up but it's delayed on deps21:47
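The fix discussed above amounts to adding an `api_version=v3` query argument to the coordination backend URL that tooz is given. The sketch below shows that URL change in isolation, using only the standard library. The helper name `with_etcd_api_version` is hypothetical, the `etcd3+http` scheme is assumed to be the one used for the tooz etcd3gw driver, and the address is the one etcd advertised in the CI log quoted earlier.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_etcd_api_version(backend_url, api_version="v3"):
    """Return backend_url with its api_version query argument set.

    Other query arguments are kept; an existing api_version is replaced.
    """
    parts = urlsplit(backend_url)
    query = dict(parse_qsl(parts.query))
    query["api_version"] = api_version
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example with the client URL seen in the CI log:
print(with_etcd_api_version("etcd3+http://10.209.64.96:2379"))
```

Setting the version explicitly in the URL keeps the workaround in job configuration, so it can be dropped once the tooz default changes.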
iurygregorygot it21:47
iurygregorywe should keep on our radar21:47
iurygregory:D21:47
iurygregoryand maybe see if we can land other patches in ngs other than the fixes if we are ok with that...21:47
JayFThey will not be a part of the branch/release unless the patch for those are updated21:48
JayFmy preference would be to land things as if they are going in caracal and evaluate for backport21:49
JayFrather than rushing in more changes to a project we just got warm-fuzzies about the CI of :D21:49
TheJuliaif we don't, the release is nearly identical to 2023.121:51
TheJuliajust as a data point21:51
iurygregoryupdated?21:51
iurygregorymaybe I'm lost here21:51
TheJuliaJay just wants to ship the release, and not have to update the release branch21:51
TheJuliathe branch to be cut21:51
TheJuliathat is21:51
JayFRight now, the HEAD of ngs master is tagged in a releases change as the sha for stable/2023.221:52
JayFchanging that and/or backporting things for another release are ~trivial?21:52
TheJuliachanging it is super trivial, but we have to reach consensus21:52
JayFlet me put it this way21:52
JayFlets not talk about imaginary patches21:53
JayFlets talk about what's up and eligible, potentially21:53
TheJuliabasically everything sitting there looks like features21:53
iurygregoryhttps://review.opendev.org/c/openstack/networking-generic-switch/+/874793  https://review.opendev.org/c/openstack/networking-generic-switch/+/88640521:53
TheJuliaso we wouldn't backport them21:53
iurygregoryhttps://review.opendev.org/c/openstack/networking-generic-switch/+/84759221:53
iurygregorythese 3 would need a recheck after we land the fix21:53
iurygregoryunless they have conflict between them21:53
JayFafaict 793 is CI/testing related, I'm not sure having it in a release would matter to folks using it in that context but IMBW21:54
iurygregoryand doesn't seem like they do21:54
JayF405 is a bugfix that should land21:54
JayFand would be backportable21:54
TheJuliai guess 592 is also a fix21:55
TheJuliaso we can backport that21:55
iurygregorymakes sense21:55
JayF592 is theoretically a fix21:55
JayFI do not know NGS well enough to make that judgement if it's OK or not21:55
iurygregorygreen https://review.opendev.org/c/openstack/networking-generic-switch/+/89597321:55
JayFchanging when something happens is worrisome in terms of "will the status quo keep working" but I should trust CI :)21:56
JayFTheJulia: iurygregory: To be clear: my concern is not "I don't wanna change a handful of characters in a PR", it's more about going "are we sure this works?" to "lets land a bunch of changes now that we are" is a little whiplash21:56
JayFnote that I indicated not landing stuff pre-release was my preference21:57
iurygregoryI see =)21:57
JayFnot trying to dictate just like ... a little skittish :)21:57
TheJuliaalmost nobody reviews n-g-s21:57
JayFI don't because I don't trust myself to, bluntly21:57
TheJuliaso... all I can do is recheck and trust CI 21:57
JayFwhich is playing into the conservative preference I have21:57
TheJuliaif we don't trust CI though... that is a whole other issue21:57
iurygregoryin CI we trust21:57
TheJulia.... it is our community ethos21:58
JayFtrue21:58
JayFbut 90% of the community that has that ethos already has their rc releases in ;) 21:58
JayFlol21:58
iurygregorywhen we don't have CI we trust from the data points from testing in real hardware :D21:58
JayFI trust you all to make the right decision21:58
TheJuliaiurygregory: heh21:58
JayFI have a preference but it's not based in real technical proof just feelings21:58
JayFIf "trust CI" is the consensus; land my CI fix and rebase stuff tomorrow and we'll see where we are21:59
iurygregoryTheJulia, feel free to +W https://review.opendev.org/c/openstack/networking-generic-switch/+/895973 :D 21:59
TheJuliai just did21:59
JayFiurygregory: TheJulia: FWIW; I have an email in my inbox from HPE21:59
iurygregorywe don't need rebase *I think* we just wait for the promote job to finish and recheck21:59
JayFasking if we'd remove the ilo driver if they stopped 3rd party CI21:59
iurygregoryoh wow22:00
iurygregoryjesus22:00
JayFthey want to change to a "validate at the end of the cycle" model22:00
JayFI'm unsure how to respond, and I want to put it on the PTG and invite HPE to the session about it; does that sound alright to folks?22:00
* iurygregory is not aware of this ...22:00
TheJuliaYeah, lets see if we can talk through the mechanics of what they are thinking and the risks22:00
TheJuliabecause... we *cannot* delay for them22:00
iurygregorythinking a bit, we still support some drivers that stopped 3rd Party CI22:01
TheJuliaeh, we're kind of in a "if we get a report it is broken, out it goes"22:01
iurygregoryor maybe patches didn't trigger their CI...22:01
TheJuliamode22:01
iurygregoryhaven't seen FJ or Dell reporting in patches...22:01
TheJuliaBut their 3rd party CI has been broken for a while because they have local changes22:01
TheJuliayeah22:01
iurygregoryso maybe they only run if we change files related to their driver22:02
JayFMy major concern is not policy/precedent/etc22:02
TheJuliaI think I could be fine with a validate at the end of the cycle model, but I don't want to be in a situation like this22:02
JayFit's that ilo is about to release a substantial change22:02
JayFand it's a really, really crummy timing for them to pull 3rd party ci22:02
JayFbut I guess that's baked in the likely-invalid assumption that they'd version-bump22:02
iurygregoryilo6 is already out 22:02
TheJuliailo folks *are* better about bumping and versioning22:03
TheJuliabut yeah, that too is a risk22:03
TheJuliaLets set aside an hour, and try to get them to bring their thoughts/concerns22:03
JayFMy thought is that the best way to handle it would be to try and get them to come to a PTG session22:03
TheJuliaand discuss the risks22:03
JayFHmm you have more exp working with them so if you think a private meeting is better I can do that too22:03
TheJuliaMaybe time for perception to be revisited on our part as well, its not a bad thing22:03
JayFthen we can take the output to the PTG Session on hardware drivers22:03
iurygregory++22:04
TheJulia++22:04
JayFWho wants in on it?22:04
iurygregoryfeel free to add me, it will depend when its ofc =)22:04
TheJuliasure!22:05
JayFI assumed it might be a 4am/9pm meeting for me22:05
JayFhonestly I don't hate the idea of qualification22:08
JayFbut if we went that route we'd have to make our release process have a good spot for that tbh22:08
JayFcycle-with-intermediary-and-rc-and-all-of-the-things /s22:09
TheJulialets maybe not try and reflect this in the community release modeling22:10
* TheJulia *really* hates the RC model22:10
TheJulialike... unreasonably so22:10
JayFI'm more or less saying, you start pulling in third party QA people to help validate releases, you start flirting with things like "freezes" and the like potentially being needed22:12
JayFI'm not saying "these things are tied together" I'm saying "Ironic's special release framework gives us an extra thing to think about with consideration of that" :)22:12
TheJuliaoh, very much so22:14
* iurygregory also hates the RC model :D 22:15
TheJuliaerr, and they were not able to verify the major issue *why* I took the car to the dealer22:16
TheJuliaanother aspect is intermediate releases22:17
JayFlolyep22:17
JayFbugfix releases, do they get validated? etc22:17
TheJuliayup22:17
JayFand plus this model is extra rough for anyone community implementing anything22:17
JayFthey can only hope it works against theirs means it works in a general case22:17
JayFwhich I guess is same for 3rd party CI except it's whatever boxes they have running there22:18
TheJuliaYeah, and then the other concern is what patches get applied for the validation22:18
TheJuliaI think in every third party CI, we've seen one or more patches carried locally they rebase over, often it is just config/devstack stuffs, but... yeah22:19
TheJuliajust makes it harder for us to be aware22:19
JayFthat's a good thought as well22:20
iurygregorymaybe they need to reduce when they run things 22:28
iurygregoryat least would help a bit, or have jobs in experimental 22:29
iurygregorysome points we should consider when talking with them22:29
JayFHonestly it plays into something else I was going to suggest during bug triage topic at PTG 22:32
JayFif we had a rolling role, like cinder/neutron's bug deputy22:32
JayFwe could make "ensure CI is sane" part of that role too and try to reduce interrupts across the team (and maybe actually monitor/identify CI breakages when they happen to make them easier to root cause)22:32
JayFI think the difficulty of something like that is you need real commitment from folks to be able to dedicate time upstream periodically, and that can be tough22:33
TheJuliathey have been reporting broken for a while22:42
TheJuliaunfortunately22:42
JayFwho has been reporting broken?22:42
JayFHPE? They fixed it very recently22:42
TheJuliaoh, did they finally fix it?22:42
JayFlike a 48 hour turnaround from an email from me, it wasn't that bad once I told em22:42
TheJuliathe merge conflict22:42
TheJuliano, not bad at all22:42
JayFI've had good experiences with them, I just have to poll for them myself22:42
JayFthey do not subscribe to the Ironic events feed so to speak :D22:43
TheJuliayeah22:43
JayFHPE CI is 100% passing, it was passing on Iury's firmware change, for instance22:43
JayFin fact it passed even with the functional test working forever22:43
JayFs/work/hang/22:43
TheJuliaokay, I've just not seen it recently22:43
TheJuliaI just went looking for dell/FJ22:44
JayFit's easy to get sorta notice-blind to it22:44
JayFDell/FJ is gone-gone-gone afaict22:44
TheJuliayeah22:44
JayFand have been... my entire PTL tenure, I believe?22:44
TheJulia*looks* like both went gone gone pulled plugs at EOY22:44
JayFso a year?22:44
JayFyeah that more or less lines up22:44
TheJuliaFJ commented on stuff months later it looks like, but kind of hard to hunt down the accounts22:44
TheJuliadell was definitely end of year, I think FJ might have been up until May22:45
JayFIn a weird way, it's sorta a victory for Ironic and hardware?22:45
JayFbecause like, the third party CI is not as needed, and it flipped at some point from being "make sure Ironic works right" to "make sure the hardware works right"22:46
JayFwith redfish I think we have a stronger sense that it's going to work 22:46
TheJuliayeah, sort of I guess. The challenge is folks out there default into $vendor thinking it is the best/perfect option22:47
JayFI'm trying to think of the right way to think about this22:47
TheJuliaI think we need to work at providing feedback, in terms of "that won't work" or "you may want to try another driver"22:47
JayFyes22:47
JayFexactly22:47
JayFwhich makes me think of like, QVLs from motherboards22:47
JayFwhere like "most stuff should work; we promise this stuff works" with an actual list of model numbers, etc22:47
JayFe.g. right now, someone comes to us and says "I have an HPE server, what driver do I use?" We need more information22:48
TheJuliayeah, but nobody is going to actually do that for Open Source, so all we can do today is "detect and try to shunt" or "detect bad case, and provide guidance"22:48
JayFa real process like that would probably be DMTF/redfish centered22:49
JayFwe test against a redfish standard way of doing things, hw vendors get certified good for redfish (or like; a matrix of features or whatever)22:49
JayFI want it to exist but I don't think we're the people to do it :)22:50
TheJuliabut then what do we do about the $super_weird_vendory_thing22:50
JayFHow many customers care about $super_weird_vendory_thing? 22:50
TheJulia"insert only one token, with one ethernet port connected... port is magically cloned because *magic*, and deploy the thing"22:50
TheJuliaonly $vendor can *really* tell us22:51
TheJuliawhich means, the *right* way is the vendor works with us to begin to tear down their driver22:52
JayFoh, I see what you mean now22:52
TheJuliawhich sounds awful, but maybe that is the right path22:52
JayFI thought you meant like, vendor features outside of redfish22:52
JayFyou mean deconstructing $brandName drivers into the redfish driver22:52
TheJuliaoh, no, like... FJ virtual media *has* to use SMBFS22:52
TheJuliathere is no other option, which makes it wholesale incompatible with stock redfish-virtual-media22:53
JayFfj virtual media mounts a samba share?!22:53
TheJuliawell, we don't manage it, but that is the design default in their firmware22:53
TheJuliayou configure a location AIUI, it gets the filename from there22:53
JayFwe'd almost have to like ... have some concept of "quirks" and mapping those quirks to hardware automatically or manually22:54
JayFlike maybe in this world, you point an FJ server at the redfish driver, it does some kind of redfishy "what are you?" question, we activate it into "fj-virtual-media-quirk" mode, and put that in like node[driver_info][quirks] (or make a top level idea for it)22:57
JayFtry to detach the idea of how the hardware is weird from who mnfr'd that hardware22:57
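One way to picture the "quirks" idea floated above is a small registry keyed by behaviour rather than by manufacturer. Everything in this sketch is hypothetical: Ironic has no such registry, and the quirk name, the `virtual_media_transport` field and both function names are invented for illustration. Only the `node[driver_info][quirks]` location comes from the discussion.

```python
# Hypothetical registry: each quirk names a hardware behaviour and carries a
# predicate over whatever the BMC reports about itself.
QUIRK_DETECTORS = {
    "virtual-media-requires-smb-share": (
        lambda info: info.get("virtual_media_transport") == "smb"
    ),
}


def detect_quirks(hardware_info):
    """Return the sorted names of all quirks whose predicate matches."""
    return sorted(
        name for name, matches in QUIRK_DETECTORS.items()
        if matches(hardware_info)
    )


def apply_quirks(node, hardware_info):
    """Record detected quirks under node['driver_info']['quirks']."""
    node.setdefault("driver_info", {})["quirks"] = detect_quirks(hardware_info)
    return node
```

Because the keys describe behaviour, a second vendor with the same oddity would pick up the same quirk without needing its own driver.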
TheJuliaI think the special driver features likely need to be cataloged as well. Like FJ wants to do a new driver, but I don’t *think* they have posted a spec yet.23:01
TheJuliaOh, posted 2 weeks ago23:03
TheJuliaErr a week ago23:03
TheJuliaEverything is scrambled for me23:03
JayFyeah, I'm taking my EOD now, I'll see you in the morning23:07
JayFstevebaker[m]: if you're working NZ TZ tomorrow and want to do a solid, please recheck https://review.opendev.org/c/openstack/networking-generic-switch/+/874793 https://review.opendev.org/c/openstack/networking-generic-switch/+/886405 https://review.opendev.org/c/openstack/networking-generic-switch/+/847592 once CI fix merges23:07
JayFthank you in advance o/23:08
stevebaker[m]I'm on it23:08
TheJuliaHeh, the FJ driver spec says they will use third party ci.23:15
opendevreviewMerged openstack/ironic-specs master: Fix linter error in past spec which blocks new spec  https://review.opendev.org/c/openstack/ironic-specs/+/89357623:21
TheJuliavanou: o/23:23
opendevreviewMerged openstack/networking-generic-switch master: CI fix: Use the un-deprecated v3 etcd API  https://review.opendev.org/c/openstack/networking-generic-switch/+/89597323:25
