Friday, 2024-04-19

opendevreviewJulia Kreger proposed openstack/ironic master: DNM: Break tempest on purpose to figure out what tests might need to be addressed  https://review.opendev.org/c/openstack/ironic/+/91612800:14
iurygregorywow00:16
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Fix vif tests  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91629900:19
TheJuliaiurygregory: huh?00:20
iurygregoryTheJulia, I liked the the title in your change :D I was like wow 00:20
TheJuliaThat DNM is likely going to be a permenant enablement of tempest and rabbit on that job00:20
TheJuliajust need to see since we set some slightly different config to match flat networking as well00:21
iurygregoryoh nice \o/00:21
TheJuliaalso https://review.opendev.org/c/openstack/ironic-specs/+/91612600:21
* TheJulia whistles innocently00:22
iurygregoryI saw that =)00:22
iurygregoryI gave a heads-up to lucas (since he might be interested)00:22
stevebaker[m]TheJulia: I'm still getting the cinder detach error, you mentioned it requires a service token, which I think is the case. So I'm not sure what to try next https://zuul.opendev.org/t/openstack/build/2902d34980b841859ac8229bcb2e99ac/log/controller/logs/screen-ir-cond.txt#1645 02:54
stevebaker[m]https://review.opendev.org/c/openstack/ironic/+/90026502:54
TheJuliastevebaker[m]: ill look in the morning02:55
stevebaker[m]thanks, I'll revisit on Monday :)02:55
TheJuliaIā€™m unfortunately in line for post dinner desert at the moment02:55
TheJuliaHave a great weekend!02:55
stevebaker[m]fortunately, surely02:55
opendevreviewOpenStack Proposal Bot proposed openstack/ironic-ui master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/ironic-ui/+/91636003:24
opendevreviewcid proposed openstack/ironic master: wip: Remove special treatment of .json for API objects  https://review.opendev.org/c/openstack/ironic/+/91346703:34
opendevreviewJulia Kreger proposed openstack/ironic master: Run neutron for the functional test job  https://review.opendev.org/c/openstack/ironic/+/91612804:20
TheJuliastevebaker[m]: took a look, I think I see the confusion there, the huge block of logic actually sort of holds the secret, but I linked to the more openstacksdk-ey example on how to do it. Because of the user request coming in, you want to start with sourcing the user authentication, and as you likely noticed if we didn't have it, then we just used our own credentials and did the endpoint massaging. There was reason for my 04:34
TheJuliamadness04:34
stevebaker[m]There is a version of that change which did the exact same as the neutron get_client, same result04:36
stevebaker[m]TheJulia: It was Patchset 9 which just copied what you did for neutron get_client, which had the same result https://review.opendev.org/c/openstack/ironic/+/900265/9/ironic/common/cinder.py04:44
TheJuliabut neutron doesn't have the service auth requirement04:46
TheJuliaand I think we basically unwound the neutron stuff, but I'd had to take a look when I'm actually awake and not on the verge of heading to bed04:47
stevebaker[m]yeah I'll absorb those connection.py docs and try again. Maybe not today though, I'm outta spoons04:49
TheJuliaunderstand, it is a bit weird of a thing to wrap your head around since it is a whole *other* dual token bearer thing04:49
opendevreviewSteve Baker proposed openstack/ironic master: Replace cinderclient usage with openstacksdk  https://review.opendev.org/c/openstack/ironic/+/90026505:13
rpittaugood morning ironic! happy friday o/07:21
opendevreviewMerged openstack/ironic stable/2024.1: Add states.SERVICING and SERVICEWAIT to _FASTTRACK_HEARTBEAT_ALLOWED  https://review.opendev.org/c/openstack/ironic/+/91608508:42
opendevreviewMerged openstack/ironic stable/2023.2: Add states.SERVICING and SERVICEWAIT to _FASTTRACK_HEARTBEAT_ALLOWED  https://review.opendev.org/c/openstack/ironic/+/91608608:55
iurygregorygood morning Ironic11:45
opendevreviewRiccardo Pittau proposed openstack/ironic stable/2024.1: Fix redfish detach generic vmedia device method  https://review.opendev.org/c/openstack/ironic/+/91534112:57
rpittauiurygregory, TheJulia, when you have a moment can you please check ^ ? thanks!12:58
iurygregoryrpittau, sure12:58
iurygregorytbh I would love to see a unit test with the case of multiple devices (do we already have one?)13:00
* iurygregory checks13:00
rpittauiurygregory: we don't, I'm working on a patch to add them13:04
rpittauwell, specifically for the redfish attach/detach13:04
iurygregoryrpittau, oh ok ^^13:04
iurygregoryack13:04
rpittauthanks for the review13:05
iurygregoryyw13:10
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Fix vif tests  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91629913:40
TheJuliahey folks, I'd appreciate reviews on ^^^13:41
TheJuliaI've tagged it as ironic-week-prio. It is required to land a patch on ironic to enable neutron on the functional test jobs13:42
dtantsurTheJulia: I see an issue in the fact that we use microversion 1.31 in a test that is literally testing microversion 1.213:55
TheJuliathen our only other option is to delete the test13:56
TheJuliabecause otherwise the test is purely reliant upon those interfaces being default to not fail13:56
dtantsurTheJulia: restore the microversion after changing the interfaces?13:56
dtantsurit's still hacky, but something I can live with13:56
TheJuliawhich is a crap behavior, but it is what it is13:57
TheJuliadtantsur: I think I did that13:57
TheJuliaoh, wait13:57
dtantsurTheJulia: not for inherited tests13:57
TheJuliayeah, there are two other classes13:57
TheJuliayeah13:57
TheJuliagive me a minute and I'll do that13:57
TheJuliaI completely forgot about the inhereted tests13:57
* TheJulia makes more money13:57
TheJuliaerr13:57
dtantsurTheJulia: let's just set it again right in setUp13:57
TheJuliacoffee13:57
TheJuliayeah13:57
dtantsurboth money and coffee are not too bad13:58
TheJuliaI did that originally but through refacctors hit the other issue13:58
dtantsurI assume, update_node does not accept a different microversion?13:58
opendevreviewMerged openstack/ironic master: Fix device_type in attach/detach vmedia for Redfish  https://review.opendev.org/c/openstack/ironic/+/91630613:59
TheJuliadtantsur: afaik, it does not, I sifted through the code yesterday and it didn't look like it off hand with the pattern taken14:03
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Fix vif tests  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91629914:05
TheJuliaThat should do it. I did it originally, but i later realized because what we're really testing is start node state behavior based on the enrolled version at the end of the day14:06
TheJulianot re-adding the depends on patch to turn neutron on14:06
TheJuliasince... those tests are disjointed from neutron14:07
dtantsurOkay, leaving the review open, will need to check some other things quickly14:08
TheJuliasure, sure14:08
opendevreviewRiccardo Pittau proposed openstack/ironic-specs master: [WIP] Add work items for 2024.2 Dalmatian development cycle  https://review.opendev.org/c/openstack/ironic-specs/+/91629514:47
opendevreviewRiccardo Pittau proposed openstack/ironic-specs master: [WIP] Add work items for 2024.2 Dalmatian development cycle  https://review.opendev.org/c/openstack/ironic-specs/+/91629515:02
TheJuliadtantsur: I'll need to rev that patch one more time to remove the dependency on it, but I'll let it finish up first before doing so15:10
opendevreviewRiccardo Pittau proposed openstack/ironic-specs master: Add work items for 2024.2 Dalmatian development cycle  https://review.opendev.org/c/openstack/ironic-specs/+/91629515:20
dtantsurTricky question, TheJulia, janders. If a node in active state and power *off* goes through servicing, must it return to power off without ever trying to boot into the instance?15:32
dtantsurSame if power_off special step is used as the last step.15:33
TheJuliaoh jeeze15:33
TheJuliaI had always assumed you would only service things which are powered on15:34
dtantsuryeah, I'm sitting with a confused face in front of the BMO code15:34
TheJuliaactive and off is... hmm15:34
dtantsurCan there be reasons to not power on a node until it undergoes servicing?15:34
TheJuliaI guess the key is to have ironic record the state, and return it to that state and not assume "active == on"15:34
dtantsuryeaaaaaah15:35
TheJuliaI don't think there are reasons to service before powering on15:35
TheJuliasince service is all about being in the active state15:35
dtantsurI'll make a mental note about it, but it's not my largest priority now to fix it.15:35
TheJuliaok15:37
* dtantsur needs to demo a proof of concept of servicing with metal3 by end of month15:39
opendevreviewMerged openstack/ironic stable/2024.1: Fix redfish detach generic vmedia device method  https://review.opendev.org/c/openstack/ironic/+/91534115:52
opendevreviewMerged openstack/ironic master: Follow-up: Use ``microversion-parse`` to parse version headers in API requests  https://review.opendev.org/c/openstack/ironic/+/91614915:52
rpittaubye everyone, have a great weekend! o/16:22
TheJuliahave a great weekend16:23
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Fix vif tests  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91629916:32
opendevreviewcid proposed openstack/ironic master: wip: Remove special treatment of .json for API objects  https://review.opendev.org/c/openstack/ironic/+/91346716:45
opendevreviewcid proposed openstack/ironic-tempest-plugin master: Patch to enforce json extension works in existing API behaviour  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91392616:48
JayFI'm trying really hard to think of a valid use case for service+power off17:13
JayFmost of what I can come up with are bad ideas; e.g. baremetal node service --steps {'name': 'omghax_recovery'}17:14
TheJuliastoring and restoring a state is not really unheard of in our code base, we've got a state or two which does it, but then what if you explicitly want to power on or off as a step17:16
TheJuliamaybe that is best done afterward, dunno17:16
TheJulia... it is just... weird since we don't really think of nodes as deployed && off17:17
JayFIt's 100% supported, and even exposed via the nova server start/stop api17:17
JayFso we at least have to have sane, documented behavior17:17
JayFwhich imo means we either need to stick to: "if you desire a different power state than you started in, use a power_{on,off} step explicitly at the end" or "we always return you to the same power state you started with"17:18
JayFI think #2 is the sanest way to go17:18
JayFtaking active+power off -> servicing -> active+power on ... this would be unexpected behavior imo17:18
clarkbwould it be valid for having a reserved node in that is fully deployed and ready to be started should power costs go down or demand go up?17:19
clarkbthen you don't have to worry about there not being any available nodes when you're in a position of "I need this now"17:19
TheJuliaclarkb: sort of an operator's use case and call17:20
TheJuliaI mean, you could , but you'd like want patches to be applied and whatnot17:21
TheJuliaso maybe it also is not the best idea... dunno17:21
JayFThe question is not if it's a good idea17:21
TheJuliatrue17:21
JayFthe question is if there's a reasonable use case our API can be used for that we have to honor17:21
JayFand I think we're clearly in "yes" territory 17:21
JayF(w/r/t ACTIVE+power_off)17:22
TheJuliaIt is a case we didn't anticipate because most folks don't operate with something in that state17:22
TheJuliasince deploying is relatively quick17:22
JayFI'll note that nova stop + nova lock is how we handled abuse at rackspace17:22
JayFso powered off + active was something we used there for some case at all17:22
TheJuliayeah, and I guess starting from off state *could* make sense if you had steps which... snapshooted it17:27
JayFthat's exactly what I was thinking re: omghax_recovery17:27
JayFas a fakey step17:27
JayFor lets say, major vuln in $firmware17:28
JayFyou can update via redfish but is only exploitable from OS17:28
JayFpower whole fleet off -> service to update them -> power on as they update17:28
JayF(for the 'I want servicing to change power state' use case) ... but in that case i'd be OK with just saying "make an extra power status call"17:28
JayFs/status/control/17:29
TheJuliapower_on and power_off *are* available steps, we just don't have any sort of guard to prevent an operator from using it as the very last step and then powering the machine back on, but we anticipate you'd always want to return to the prior state of on17:31
TheJuliarecording and restoring is what we have around some of the power code for other cases, makes sense to do it around servicing to me17:32
JayFagreed, would probably be valuable to indicate in docs that though -- so  people don't think "ending service steps with power off/on" has the expected effect17:40
TheJuliayeah17:46
JayFTheJulia: you have a few minutes, perhaps?18:43
JayFTheJulia: warning: IPA+WSGI discussion18:43
JayF(and it's not great)18:43
TheJuliaa couple of moments please18:43
JayFTheJulia: when you're ready: https://us06web.zoom.us/j/81363153778?pwd=iQBvg6gAJPE5sXghTY2bmfpbBOWibA.118:44
TheJuliaok18:44
JayFty Julia for knocking us over the hump there19:21
* JayF holds a funeral for a stateless IPA19:21
TheJuliaAwwwww19:37
TheJuliacrazy idea19:38
TheJuliause a socket19:38
TheJuliabetween the runner and wsgi app19:38
JayFheh19:39
JayFpick your poison19:39
JayFjson-rpc or sqlite19:39
JayFadamcarthur5: ^19:39
TheJulia"hey, here is a thing", and "hey, here is your thing back"19:40
JayFJSON-RPC is at least ... the right tool for the job19:40
TheJuliakind of yeah19:40
JayFany issues we'd have there should mimic issues we'd have IRL19:40
TheJuliaallows that call to get funneled into the actual source, if there is a clean way to serve it up19:40
JayFtoday, IPA gets a DB or RPC19:41
JayFtomorrow, IPA is just an ironic-api we spin up on a node /s19:41
* TheJulia twitches and falls over19:41
JayFthe JSON-RPC route is nice because we could keep the wsgi/api application side stateless, too19:42
TheJuliaall it would need to start is the results from the lookup19:42
JayFhonestly this is going to be more work than I anticipated19:42
TheJuliaso it could filter/guard/validate requests at a high level and have that so it could heartbeat19:42
JayFhowever it's going to end up being cleaner19:43
JayFheh19:43
JayFJSON-RPC heartbeat AND heartbeats to ironic-api19:43
* JayF adds two more decoder rings to his order form19:43
adamcarthur5I really don't like the socket-to-socket idea? That seems like an anti-pattern.19:43
* TheJulia looks at the ephemeral order form and feels confused19:43
adamcarthur5But that is almost certainly me not thinking in reference to Openstack enough 19:44
JayFadamcarthur5: well, we'd likely take the model from ironic-api <> ironic-conductor comms19:44
JayFwhich can use amqp OR json-rpc19:44
JayFobviously amqp is a terrible idea19:44
TheJuliaplease no19:44
JayFso we'd end up with (1-N) API app instances doing JSON-RPC to a single agent backend that did all the work19:44
JayFbasically the app at that point would just be doing similar to what ironic-api does for 90% of requests: validate the args, then pass them on to the backend19:45
JayFs/backend/conductor/19:45
TheJuliaover a socket would at least disjoint it so we're not worrying about wsgi and the such19:46
JayFadamcarthur5: let me ask the question this way: What is a more appropriate form of IPC for this use case19:46
JayFand socket can be unix socket or IP socket19:46
TheJuliafor the record, my please no was in regards to ampq19:47
JayFrabbitmq is the most stable, reliable part of openstack and has never been an operational pain point ever! what are you talking about!19:48
JayF;) 19:48
TheJuliahttps://www.pinterest.com/pin/side-eye--332140541273152294/19:49
* TheJulia go obtains corgi cuddles for a few minutes19:51
TheJuliaMy corgi gave me the same look19:52
adamcarthur5Probably sharing info through a socket ;((( 19:54
JayFI suspect you'd be surprised how often this happens :)19:54
adamcarthur5You are right. But I will write it up, we can make a final decision later 19:54
JayFI already added the alternateive to the etherpad19:54
adamcarthur5I think the bit that is more surprising is I'm having to deal with it. šŸ˜…19:55
adamcarthur5(can I send unicode characters in this without it breaking?)19:55
JayFyes19:55
JayFAnother perspective would be: wow, I've levelled up to the problems that are shaped weird and require low level solutions19:55
adamcarthur5Yes... Always think positive right :)) 19:56
JayFImplementing something that's awesome and a little gross is kinda a rite of passage when dealing with hardware19:56
JayFand tbh I don't even think a local unix socket with some RPC is really that gross19:56
JayFbetter than COM on windows :D 19:56
TheJuliaDmitry noted they switched ironic to use a socket for inbound requests and that it worked perfectly19:59
TheJuliawith apache in front 19:59
JayF...is this chicken<>egg? Would the JSON-RPC backend service just take a command and stop listening while it's going?20:02
JayFthat can't be true20:02
JayFwe still need asyncio or some kinda async comms, yeah?20:02
TheJulialikely, but it would disjoint it form wsgi at least20:40
opendevreviewJulia Kreger proposed openstack/ironic master: docs: update redfish docs to detail swift url issues  https://review.opendev.org/c/openstack/ironic/+/91593121:42
opendevreviewJulia Kreger proposed openstack/ironic master: redfish: change default virtual media storage to local storage  https://review.opendev.org/c/openstack/ironic/+/91593221:43
opendevreviewJulia Kreger proposed openstack/ironic master: docs: Cleanup/revise Secure Boot docs  https://review.opendev.org/c/openstack/ironic/+/91538621:50
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: CI: Increment stable jobs for 2024.1/drop Zed  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/91493121:54
jandersdtantsur TheJulia JayF in my ops life I've seen disasters caused by unintended power-on of a mission critical node that's been powered off due to issue (think old filesystem metadata server, a member of an HA group of sorts, cold-spare, whatever). Yeah there has to be pre-existing mess for that to happen and have bad impact but that doesn't mean it21:55
janderscan't happen. TL;DR: I reckon it would be good to make sure we don't boot the actual instance as a part of servicing and certainly power state pre and post servicing should match. If sorting out "ensuring not booting of the instance" is on the hard side we can handle that through documentation for starters ("DO NOT service a node that for some21:55
jandersreason must not be booted into the image deployed to it"). That's my perspective - sharing it in case it's useful.21:55
JayFyep, that maps to my understanding ; although that being said21:55
JayFif you're in that case, and you run servicing on your server21:56
JayFyou are a missed dhcp request away from breaking your env (so don't do that!) lol21:56
JayF(at least in noop network interface)21:56
jandersyeah being in that scenario means someone is already in a situation sufficient to cause a disaster21:56
jandersit would just be good if Ironic doesn't happen to be the trigger that tips it over from a potential disaster to an actual one21:57
opendevreviewJulia Kreger proposed openstack/ironic master: docs: update ilo docs regarding status -> use redfish  https://review.opendev.org/c/openstack/ironic/+/91538721:58
opendevreviewJulia Kreger proposed openstack/ironic master: docs: document stance on partition image use  https://review.opendev.org/c/openstack/ironic/+/91538822:02
jandersOK Saturday here so time to wrap things up for the weekend. Have a great weekend everyone, see you Monday!22:04
JayFo/22:10
opendevreviewMerged openstack/ironic master: Docs: Remove outdated RBAC content  https://review.opendev.org/c/openstack/ironic/+/91538522:10
opendevreviewJulia Kreger proposed openstack/python-ironicclient master: Add Service Steps to client  https://review.opendev.org/c/openstack/python-ironicclient/+/89114022:41

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!