opendevreview | Verification of a change to openstack/ironic-python-agent master failed: Trivial: fix reference of unusable i18n prefix https://review.opendev.org/c/openstack/ironic-python-agent/+/940096 | 01:16 |
---|---|---|
opendevreview | Merged openstack/ironic-python-agent master: Replace crypt module https://review.opendev.org/c/openstack/ironic-python-agent/+/937175 | 03:00 |
opendevreview | Merged openstack/ironic-python-agent master: Trivial: fix reference of unusable i18n prefix https://review.opendev.org/c/openstack/ironic-python-agent/+/940096 | 04:45 |
cardoe | I’ll review the rules stuff tomorrow. Got released from working earlier so wandered off. | 04:50 |
cid | Thanks cardoe | 07:14 |
rpittau | good morning ironic! o/ | 08:01 |
*** tkajinam is now known as Guest7334 | 09:05 | |
opendevreview | Verification of a change to openstack/ironic master failed: Add lsblk output to metal3 logs https://review.opendev.org/c/openstack/ironic/+/939976 | 10:50 |
opendevreview | Merged openstack/ironic-python-agent-builder master: Deprecate ironic-lib https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/939288 | 11:03 |
TheJulia | good morning | 13:57 |
opendevreview | Merged openstack/ironic master: Add lsblk output to metal3 logs https://review.opendev.org/c/openstack/ironic/+/939976 | 14:11 |
cardoe | Good morning. | 14:40 |
cardoe | cid: I was gonna ask you to weigh in on https://review.opendev.org/c/openstack/ironic/+/940277 cause I think that’ll be useful for rules as well. | 14:41 |
cardoe | I kinda hate how we have the inspection hooks vs default_hooks option. | 14:41 |
cardoe | I’m also thinking we need another config file or something for the plugins. Or maybe each plugin can have its own section prefixed with inspect_ ? | 14:42 |
TheJulia | brraaains | 14:42 |
* TheJulia sighs | 14:42 | |
cardoe | Some of these like the PCI devices are awkward as heck to configure. | 14:42 |
TheJulia | which devices? | 14:43 |
cardoe | Just brain dumping while I sit on this bus for my next assignment. | 14:43 |
TheJulia | :) | 14:43 |
cardoe | https://opendev.org/openstack/ironic/src/commit/81b3612046b6b4de9a3d106c0f86ef7641c29973/ironic/drivers/modules/inspector/hooks/pci_devices.py#L33 | 14:46 |
cardoe | In ironic-inspector each hook had its own section but now we’ve collapsed it into 1 [inspector] section. So we collapsed stuff into parsing JSON from strings. | 14:47 |
cardoe | And on some we threw in the towel and the config option is a path to another file that’s the config for the hook. | 14:48 |
* TheJulia blinks | 14:48 | |
cardoe | So part of me feels like we should just have a path and each hook gets $path/$hook.conf (or yaml) as its config. | 14:50 |
cardoe | Like [inspector]/hook_config_path | 14:50 |
cardoe | So what I’m trying to do with inspection hooks and rules is that we have hardware flavors. They are detailed with CPU, RAM, PCI, and disk info. I want to detect those and set it appropriately. | 14:53 |
cardoe | I’m happy to write up a post somewhere with a concrete example. Maybe guest blog something for jamesdenton | 14:54 |
cardoe | It feels like a correct operator use case but I could be wrong. | 14:54 |
opendevreview | Merged openstack/ironic-specs master: enable sphinx-lint on priorities and specs https://review.opendev.org/c/openstack/ironic-specs/+/940199 | 14:54 |
TheJulia | A concrete example would help, but that kind of makes sense, especially if you're driving some of the input from other processes which need to land in those configurations | 14:55 |
*** Guest7381 is now known as bbezak | 14:56 | |
cid | cardoe, re: 'weigh in on ...', sure thing! | 15:00 |
opendevreview | Julia Kreger proposed openstack/ironic-specs master: OCI Container Registry Image Source https://review.opendev.org/c/openstack/ironic-specs/+/933612 | 15:06 |
TheJulia | JayF: fixed your nit | 15:06 |
TheJulia | so phase might make sense to be an enum if the phases are well documented, scope I doubt it based upon the 2nd patch in the inspection rules patches | 15:32 |
TheJulia | At the same time, we likely need to ask ourselves "what is being migrated from" and "did anyone do anything we don't quite expect" (They always do!) | 15:33 |
JayF | +2A on the OCI spec; was going to leave it for more reviews but patches exist for a reason :) | 16:18 |
TheJulia | Indeed :) | 16:19 |
TheJulia | Now! Bootable Containers! Muahahahaha | 16:21 |
TheJulia | Oh wait, got that working :) | 16:21 |
* JayF goes back to shoveling tech debt | 16:22 | |
JayF | form of: broken inspector grenade and ironic-lib | 16:22 |
* JayF away | 16:22 | |
JayF | (actually I need to do a devstack for cid and I to use to poke at kea later, then I'm going to do that) | 16:22 |
opendevreview | Merged openstack/ironic-python-agent stable/2024.1: Fix RAID volume name https://review.opendev.org/c/openstack/ironic-python-agent/+/939959 | 16:23 |
TheJulia | heh | 16:23 |
TheJulia | JayF: such is life :) | 16:23 |
* cid Kea, finally :D | 16:25 | |
JayF | ^^^ in order for us to make good progress on kea, we need help reviewing and polishing off inspector rules | 16:26 |
JayF | It's been at like 99% for a while | 16:26 |
kubajj | I have an issue with ipa-tempest-uefi-redfish-vmedia-src, https://zuul.opendev.org/t/openstack/build/ba95a977708d4eb2a607d68aeaa36bbb , it says that there is conflicting versions wanted for pbr, this doesn't look just like some anomaly that could go away with recheck, right? | 16:33 |
opendevreview | Merged openstack/ironic-specs master: OCI Container Registry Image Source https://review.opendev.org/c/openstack/ironic-specs/+/933612 | 16:34 |
TheJulia | JayF: I reviewed the inspector rules patches this morning, I think they are in relatively good shape although I feel like there is some missing context behind the scope value. Having not really been a user of the feature in inspector, I've only really just noticed something which made my eyebrow raise up like any typical vulcan science officer | 16:34 |
JayF | that has been a repeated point of WTF for us as well, quite frankly | 16:35 |
TheJulia | is it in the data model on the inspector side? | 16:35 |
TheJulia | just not exposed? | 16:35 |
JayF | it's documented in the spec | 16:36 |
JayF | cid: if you're around, do you recall the deal around inspector scope? | 16:36 |
opendevreview | cid proposed openstack/ironic master: Apply Rules: inspection rules migration https://review.opendev.org/c/openstack/ironic/+/939218 | 16:36 |
TheJulia | ... oh, okay | 16:37 |
opendevreview | cid proposed openstack/ironic master: DB: inspection rules migration https://review.opendev.org/c/openstack/ironic/+/939318 | 16:37 |
TheJulia | There was another field which cardoe was wanting to be an enum instead, and I looked back at the structure and I don't think an enum fits exactly given the defaults | 16:37 |
TheJulia | it might actually be scope, but there was a second value he raised it on | 16:38 |
TheJulia | But aside from that, they generally look good from me | 16:38 |
TheJulia | FWIW | 16:38 |
cid | Yeah. I just posted a response to the comment and updated the commit message. | 16:40 |
TheJulia | I'm personally of the "doesn't need to be perfect" to be "good" camp | 16:41 |
cid | The scope field is a bit confusing, initially I thought it was a node uuid, but it's just a way to associate a node with a rule, just like traits. | 16:41 |
TheJulia | that whole, perfection is the enemy of good thing | 16:41 |
TheJulia | cid: ... I didn't see it getting invoked anywhere, is it just foundation to build upon later? | 16:42 |
cid | Yeah, just foundation to build upon later. | 16:42 |
cid | From a little meet we had with Dmitry, there's a possibility it gets dropped for 'traits' | 16:43 |
TheJulia | Are we building now, or later? | 16:43 |
TheJulia | I guess I'm wondering if we shouldn't include extra foundation because of present uncertainty | 16:44 |
TheJulia | which stinks, but... forward progress is better than no progress | 16:44 |
TheJulia | and adding a field is easy | 16:44 |
JayF | honestly, so is removing a field we've never populated | 16:44 |
cid | We think it will be best as a follow-up, alongside other stuff too (I tried to capture those in the commit message so that I will always be reminded) | 16:44 |
JayF | my biggest thing is I want those patches to stop churning and be good enough to land | 16:44 |
TheJulia | Yeah, I commented on the nullable stuff | 16:45 |
TheJulia | because it's fine, and matches the existing pattern | 16:45 |
TheJulia | ... (and frankly the requirement for the informational fields to be set values also makes unit testing WAY more complicated at higher levels...) | 16:46 |
TheJulia | (speaking from experience here....) | 16:46 |
keekz | i'm trying to set up automated cleaning with deletion and recreation of the raid. that part works fine according to logs. then it performs erase device metadata, which ipa also reports has completed. but then it gets stuck and appears to do nothing indefinitely. ipa keeps heartbeating successfully so it thinks it's doing something, but i'm not sure what. snip of my config and logs: | 16:50 |
keekz | https://gist.github.com/nicholaskuechler/e3f18fda3d3e675f9ca717b98e4b34c3 | 16:50 |
JayF | keekz: what does the node object look like? I'd also check conductor logs for traceback | 16:51 |
JayF | keekz: basically when a node heartbeats, conductor checks node fields to see what's next. This kinda behavior makes me think something got stuck *on that side* | 16:51 |
keekz | there's no tracebacks. what do you mean about the node object? | 16:51 |
JayF | openstack baremetal node show --detail mynode | 16:51 |
JayF | as someone admin-y enough to see driver_internal_info (secrets not required) | 16:51 |
keekz | hmmm i don't have --detail | 16:52 |
JayF | I might have placed it in the wrong place in the chain | 16:52 |
keekz | yeah i can see driver info, i'll add it to the gist, give me a sec | 16:52 |
JayF | or is it --long? | 16:52 |
JayF | I may very well just have the command slightly wrong | 16:52 |
cid | I think it's long | 16:52 |
keekz | i added it to https://gist.github.com/nicholaskuechler/e3f18fda3d3e675f9ca717b98e4b34c3 | 16:54 |
JayF | keekz: node.clean_step = ? and node.provision_state+target_provision_state? | 16:55 |
keekz | i put all the node details in there :) | 16:56 |
TheJulia | I... guess I'm curious also what is in the conductor log in response to the heartbeat | 16:57 |
TheJulia | since it should be triggering the next step in the list... | 16:57 |
keekz | updated with the heartbeat snippet from conductor logs and the most recent clean state change in conductor logs | 17:02 |
JayF | This is a pretty patched up Ironic, I'm assuming? | 17:03 |
keekz | no, it's pretty basic ironic | 17:04 |
TheJulia | it doesn't look like the conductor is actually polling command state based upon the logs | 17:06 |
keekz | it's ironic via openstack-helm fwiw. config is mostly in https://github.com/rackerlabs/understack/blob/main/components/ironic/aio-values.yaml excluding my automated cleaning override changes from that gist | 17:06 |
rpittau | good night! o/ | 17:07 |
TheJulia | can the conductor host curl https://10.4.50.97:9999 ? | 17:07 |
JayF | yeah, that's EXACTLY why I asked | 17:08 |
JayF | because it's a weird, weird situation that it's not polling there | 17:08 |
keekz | yes it can reach it: https://gist.github.com/nicholaskuechler/e3f18fda3d3e675f9ca717b98e4b34c3#file-gistfile1-txt-L155-L158 | 17:10 |
JayF | this is super weird | 17:12 |
JayF | I need to prep for a meeting soon, so I can't dig further, but this is either something really dumb we're missing or maybe a nasty bug | 17:12 |
keekz | was looking through bug reports but didn't find anything. i'm wondering if i'm doing the config wrong, as it does not appear to be very well documented and there aren't many examples | 17:14 |
TheJulia | keekz: try curling /v1/commands on that api surface | 17:14 |
TheJulia | keekz: you're using two different override options... It should be working though | 17:15 |
opendevreview | Merged openstack/ironic-python-agent master: switch from 'not is' to 'is not' for pep8 https://review.opendev.org/c/openstack/ironic-python-agent/+/940181 | 17:15 |
opendevreview | Merged openstack/ironic-python-agent master: avoid f-strings in logging per flake8 https://review.opendev.org/c/openstack/ironic-python-agent/+/940182 | 17:15 |
opendevreview | Merged openstack/ironic-python-agent master: fix sphinx errors with incorrect backticks https://review.opendev.org/c/openstack/ironic-python-agent/+/940183 | 17:16 |
keekz | v1/commands/: https://gist.github.com/nicholaskuechler/e3f18fda3d3e675f9ca717b98e4b34c3#file-gistfile1-txt-L193 | 17:16 |
keekz | yeah the clean_step_priority_override is multiopt in oslo-config i guess. i had never seen that before in openstack before working on this. | 17:17 |
TheJulia | keekz: most of the time we see people comma separating mapping values, fwiw | 17:18 |
keekz | kinda related, i see clean steps take arguments, but is it possible to pass arguments in via the override options syntax? i took a quick look around and it seems like it's not possible | 17:19 |
keekz | do you mean like: `clean_step_priority_override = raid.delete_configuration:124,raid.create_configuration:123,deploy.erase_devices_express:95` is that the syntax? i couldn't find any examples of that either, the docs just say use multiple clean_step_priority_override lines | 17:20 |
TheJulia | options via override, I don't believe so | 17:24 |
TheJulia | hey, what version of ironic is this? | 17:25 |
JayF | That's a really good feature suggestion though, I've thought about that before | 17:26 |
JayF | being able to map "step_with_args" to "new_step_with_args_preset" with config only | 17:26 |
TheJulia | oh, so I think I see it | 17:28 |
TheJulia | https://github.com/openstack/ironic/blob/master/ironic/drivers/modules/agent_base.py#L574-L584 | 17:28 |
TheJulia | basically, the value is true | 17:28 |
TheJulia | so we never call continue_cleaning to update status | 17:29 |
TheJulia | in this case, the step requires the agent to be polled | 17:29 |
TheJulia | which explains why it is locked out | 17:29 |
keekz | 26.1.2.dev12 | 17:34 |
TheJulia | okay, so I think the bug is that we don't seem to clear the value after a step completes execution where a reboot is required | 17:43 |
TheJulia | so next step executes, and boom | 17:43 |
TheJulia | except none of the steps requested it | 17:45 |
TheJulia | curious | 17:45 |
TheJulia | keekz: can you look in your conductor logs for anything related to "reboot_to_finish_step", also "clean reboot into the agent" | 17:48 |
keekz | sure, 1 sec | 17:48 |
TheJulia | or any reboots of the node... | 17:49 |
TheJulia | doesn't look like that should have necessarily triggered, but super weird | 17:49 |
keekz | oh, it didn't reboot the node/agent, i'm ssh'd in to it still | 17:49 |
keekz | but let me grab the logs if there are any | 17:50 |
keekz | TheJulia: there are no mentions of "reboot_to_finish_step" or "clean reboot" | 17:52 |
TheJulia | hmm, I'll have to look a little later then | 17:58 |
TheJulia | you should file a bug in launchpad in the mean time | 17:58 |
TheJulia | Okay, I think I found it while waiting | 18:02 |
TheJulia | so it is set by the idrac raid interface | 18:03 |
TheJulia | we... likely... need to nuke that flag any time we go to do a thing if it is already set | 18:04 |
TheJulia | hmm | 18:05 |
TheJulia | it is redfish, hmmmm | 18:05 |
TheJulia | most of the raid interfaces set it, it seems | 18:07 |
JayF | honestly, consistently broken somehow seems better than inconsistently broken | 18:08 |
keekz | i made https://bugs.launchpad.net/ironic/+bug/2096938 | 18:11 |
JayF | Is anyone interested in a inspector rules demo/review session? Maybe tomorrow? | 18:35 |
keekz | i'd listen... i haven't really used inspector much yet. | 18:39 |
opendevreview | Merged openstack/ironic-python-agent master: Add support for burnin-gpu https://review.opendev.org/c/openstack/ironic-python-agent/+/925415 | 19:20 |
TheJulia | keekz: by chance, can you find the last time the agent client connected to that node to check the status ? | 20:03 |
keekz | what do you mean, the conductor logs or agent logs? | 20:05 |
TheJulia | nvmd, I think you actually already provided it | 20:05 |
keekz | it's still actively heartbeating fwiw. the ipa has been booted for 4h40m and it's been 'finished' but still heartbeating for 4h now | 20:07 |
JayF | yeah that's going to continue ~forever unless you abort it | 20:11 |
JayF | and you'll never be able to patch it mid-process because it'll be a conductor restart to put in a patch so it'll kick you outta cleaning | 20:11 |
JayF | so just do the abort unless there's more info to be gained | 20:11 |
TheJulia | I think I have a fix | 20:13 |
TheJulia | ... it *might* not actually | 20:14 |
TheJulia | worth a try once I have a fix posted in a few minutes | 20:14 |
JayF | I thought conductor restart would abort .... oooooh it's not a -ing state | 20:14 |
JayF | it's a wait state | 20:14 |
JayF | nice | 20:14 |
TheJulia | exactly | 20:14 |
TheJulia | ... it might still if there was a lock being held | 20:15 |
TheJulia | but if you just kindly ask the process to exit and wait like 30 seconds | 20:15 |
TheJulia | it should be clear | 20:15 |
TheJulia | if you kill -9 it, oh boy | 20:15 |
TheJulia | I think the issue is the raid steps basically are all modeled around the use of driver side polling | 20:15 |
JayF | back in juno/onmetal days we used to just have conductors randomly segfault python | 20:15 |
TheJulia | but none of them clean it up | 20:15 |
JayF | thinking about how breaky that environment was now stresses me lol | 20:16 |
TheJulia | so when a non-raid step executes, it triggers it, but it doesn't actually ever resume the next step to check status | 20:16 |
TheJulia | it's actually a really bad bug | 20:16 |
TheJulia | sort of a weird lockout that never gets reset | 20:17 |
TheJulia | running unit tests again | 20:19 |
keekz | if you can tell the bosses it's not my fault and it's a really bad bug, that would be great :) | 20:19 |
JayF | I checked launchpad, it said you created the bug | 20:20 |
JayF | sorry I won't cover up for you /s | 20:20 |
TheJulia | keekz: added a comment | 20:26 |
TheJulia | to the bug that is | 20:26 |
TheJulia | and you get to say something like "I got the chair of the board to work on fixing it" | 20:26 |
frickler | sorry for interrupting with a different question: supermicro seems to have two levels of redfish licensing, with the smaller license (OOB), some things work, but I get regular errors in the conductor log, is that a known issue? | 20:27 |
TheJulia | frickler: three levels afaik | 20:28 |
TheJulia | What level license do you have and what error are you getting? | 20:28 |
keekz | lol thanks, TheJulia | 20:29 |
frickler | TheJulia: I have the OOB license (I think) and the error talks about the DCMS license iiuc https://paste.opendev.org/show/baw6wEZaZtR22wjnyxaC/ | 20:32 |
TheJulia | ... ugh supermicro | 20:32 |
TheJulia | Someone get someone on the line from there and shame them | 20:33 |
TheJulia | not exposing secure boot status | 20:33 |
TheJulia | ... SHAME! | 20:33 |
frickler | deploying using redfish-virtual-media seems to work fine, which is all I want really, so the node status is fine, I just have these errors regularly in the conductor log | 20:34 |
JayF | Until they drop that crazy policy, I'll never use or recommend their hardware. It's just straight up license-gating automation. | 20:34 |
TheJulia | yeah, so... DCMS is their management tooling | 20:35 |
TheJulia | There used to be a basic oob and then an advanced oob license | 20:35 |
JayF | Is this sorta thing severe enough we would add a field to driver_info as a quirk mode to not call, or at least expect failure? | 20:35 |
TheJulia | virtual media was hidden behind the advanced license | 20:35 |
JayF | I'm happy to write that patch, but you're going to need the Apache *3.0* license to be eligible for the bugfix /s | 20:35 |
TheJulia | JayF: Secure Boot is not a required field, so the code is just being loud logging wise | 20:35 |
* TheJulia thinks moving some steel might be more productive ;) | 20:36 | |
JayF | yeah, that's why I asked if frickler was that bothered by the noisy log | 20:36 |
frickler | well a) I don't think ironic should log an ERROR if it is not important, b) it makes finding real errors pretty difficult | 20:37 |
frickler | also maybe ironic could learn after the first attempt that this is not working instead of retrying every 5 minutes or so? | 20:38 |
JayF | yeah, that's a fair point, loglevel is too high | 20:38 |
JayF | probably should be a warning | 20:38 |
frickler | (sorry if this sounds noobish from an ironic pov) | 20:38 |
JayF | traditionally in the past (IPMI) how we've handled this is by adding something to driver_info basically indicating a quirk for that hardware. | 20:39 |
JayF | we don't hit this much in redfish so I don't know what prior art is there | 20:39 |
TheJulia | typically vendor detection | 20:40 |
TheJulia | and from that know if we can suppress or raise an error | 20:40 |
TheJulia | like if this was HPE or Dell, I'd get a whirling klaxon going | 20:40 |
frickler | ok, I can look into patching this a bit tomorrow, I was just wondering whether you'd say: you need to buy the DCMS license or don't bother with using ironic | 20:41 |
TheJulia | I don't think they sell the dcms license publicly | 20:41 |
TheJulia | ... I don't think | 20:41 |
TheJulia | anyway, you can use ironic, just... as JayF noted, it is gating the sort of features most people need to be able to touch behind a license | 20:42 |
frickler | seems they do https://store.supermicro.com/us_en/supermicro-server-manager-dcms-license-key-sft-dcms-single.html | 20:42 |
JayF | we should 100% lower the loglevel of that message | 20:42 |
JayF | according to logging guidelines, log.error is something that should be severe enough to page ops | 20:43 |
TheJulia | ++ | 20:43 |
JayF | honestly even warn is too high, because it's not really "fixable" | 20:43 |
TheJulia | yeah | 20:43 |
frickler | well it is also not only that message, I omitted copying the 20 line or so backtrace that goes with it | 20:43 |
TheJulia | I'd info it, tbh | 20:43 |
JayF | frickler: can I get the full output into a bug? | 20:43 |
TheJulia | ++ | 20:43 |
JayF | cid: ^ if you wanted a low hanging fruit bug, this is it | 20:43 |
JayF | we were just talking today about how he wanted something straightforward to get a quick win since inspector rules have been a slog | 20:43 |
JayF | frickler can be the beneficiary lol | 20:44 |
frickler | sure, will do this my morning, thx | 20:44 |
TheJulia | ++ | 20:44 |
JayF | yeah I think he's closer to your TZ so that should still catch him next time he works | 20:44 |
TheJulia | keekz: just need to make sure the tests pass, write a reno, and do appropriate bug updating | 20:44 |
TheJulia | keekz: and if you want you can try the patch out | 20:44 |
opendevreview | Steve Baker proposed openstack/ironic master: Mask all driver_internal_info in node output https://review.opendev.org/c/openstack/ironic/+/939504 | 20:47 |
keekz | sounds good, thanks | 20:47 |
cardoe | JayF: I’m interested in an inspector rules session but I’m not around until next week so record it? Or make a blog post? | 20:52 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix agent from being locked out with complex steps https://review.opendev.org/c/openstack/ironic/+/940413 | 20:53 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix agent from being locked out with complex steps https://review.opendev.org/c/openstack/ironic/+/940413 | 20:53 |
TheJulia | Had to rebase ^ | 20:53 |
TheJulia | keekz: ^ | 20:53 |
JayF | seems reasonable to me, +2 with the obvious understanding we'll land it when it's confirmed as a fix | 20:54 |
JayF | just don't want you to be waitin on me | 20:54 |
cardoe | TheJulia/cid: I was more just asking questions. I’m trying to grok the spec and the change by flipping between tabs on the bus. So my asks might not be 100% correct. Still trying to get the end to | 20:55 |
TheJulia | cardoe: totally got that, and they are valid questions too | 20:55 |
JayF | the reality is we need *more* input/review from folks who use this and are familiar with it | 20:56 |
cardoe | So keekz is messing with that RAID stuff I was talking about afaik. I’m still not sure we want to clean the RAID each time. More just “ensure it’s created” | 20:57 |
TheJulia | cardoe: yeah, there is a definite bug there. Create is *supposed* to be modeled to just make it so | 20:58 |
TheJulia | but at the same time there are gray areas all around each driver too | 20:58 |
keekz | yeah that sounds more like a security ops question... not my area of expertise. i don't know if it matters if the raid is deleted and recreated before each deploy, from a data wipe perspective | 20:58 |
keekz | or what settings we want to use for better cleaning | 20:59 |
keekz | i just wanted to give it a try and see how long it would take, but then i got wrapped around the axle :) | 21:00 |
cardoe | Well create without an option fails if there’s RAID on there. | 21:00 |
cardoe | I think that’s what keekz and I found. | 21:01 |
keekz | TheJulia: it will take me a bit to prep everything on my side. i don't think i'll know if it works until tomorrow | 21:04 |
TheJulia | keekz: oh yeah, definitely an ops question at some point | 21:09 |
cardoe | keekz / TheJulia: https://etherpad.opendev.org/p/ironic-ensure-raid | 21:15 |
cardoe | work in progress... just sat down | 21:15 |
cardoe | But if you read roughly what I have, does that make sense? | 21:22 |
TheJulia | I feel like we need a ptg level call to discuss | 21:35 |
TheJulia | but I get where you're going at a high level | 21:35 |
cardoe | jamesdenton: look at that ensure RAID thing as well. | 21:35 |
cardoe | Since that's really for you. | 21:35 |
cardoe | TheJulia: Sure thing. I was kinda thinking of somehow gathering up some example use cases into the docs somewhere. | 21:37 |
cardoe | Not to pick winners and losers in use cases. But sometimes it helps people see everything end to end with some use cases. | 21:38 |
cardoe | So like gather up metal3's use cases and how they drive ironic which is a bit more stand alone. Gather up some big-operator "it's all OpenStack down" use cases. Which will be different but have overlap. | 21:39 |
JayF | cardoe: the big thing that I'm banging into thinking about your idea is a big fat *why* | 21:39 |
cardoe | Why for which part? | 21:40 |
JayF | cardoe: delete_config / erase_devices / create_config gives you a consistent result with seemingly only minimal extra cleaning time | 21:40 |
keekz | i'd be more willing to update docs and add examples, if i could get them to run and rst format didn't give me a headache | 21:40 |
cardoe | JayF: it takes 30 minutes for Dell gear. | 21:40 |
JayF | keekz: try setting basepython in the `tox -edocs` environment to python3.11 | 21:40 |
JayF | keekz: assuming you have it installed | 21:40 |
JayF | keekz: if you share whatever you hit with `tox -e docs` I can work it out with ya | 21:40 |
cardoe | JayF: delete_config / erase_devices / create_config also doesn't work with Dell PERC cards. Cause they won't expose any disks after the RAID is deleted. You'd need to change BIOS settings to flip them to JBOD so that erase_devices works. | 21:42 |
JayF | those are details that'd be great to put in that etherpad, especially if this might come up at the PTG | 21:43 |
JayF | .o(it's always f#%&$ Dell) | 21:43 |
cardoe | I added it. | 21:45 |
cardoe | That's what the old "dude you're getting a Dell" ad should have been. | 21:45 |
cardoe | working on my hooks and rules use case now as well | 21:47 |
keekz | JayF: hey thanks! the `tox -e docs` in python 3.11 did the trick. although i saw this sphinx-build error pop up: https://gist.github.com/nicholaskuechler/c1c6f37e5b32a27c93f0d848799b3cfd | 21:51 |
JayF | that's weird and new | 21:53 |
JayF | and implies it didn't work? | 21:53 |
TheJulia | heh, | 21:54 |
keekz | well the docs built and i can browse them. it seems to be running multiple builds i guess, so maybe there's a section that failed to build | 21:55 |
TheJulia | or a missing link | 21:57 |
TheJulia | Metrics was moved in but without a toc update recently | 21:57 |
TheJulia | multitenancy might have been dropped in refactoring changes which have been merged | 21:57 |
keekz | i think i may be missing modules/requirements too, i see the oslo.reports one and also this: WARNING: dot command 'dot' cannot be run (needed for graphviz output), check the graphviz_dot setting | 21:58 |
keekz | browsing around though, it seems like it's all there | 22:00 |
JayF | weird | 22:02 |
cardoe | It's cause "dot" isn't in allowlist_externals for the "docs" target | 22:20 |
JayF | oh, nice | 22:21 |
cardoe | keekz: try to add "allowlist_externals = dot" under the [testenv:docs] section | 22:21 |
JayF | I will JFDI approve a change to do that fix if you wanna write it | 22:21 |
cardoe | Yeah wanted some confirmation from him. I can't seem to run it here. Our internet doesn't allow wireguard and they've got some port mangling so I'm getting EOF's on a bunch of ports. | 22:25 |
JayF | I don't know about for privacy purposes, but for "I want the US streaming services when I'm travelling" purposes, nordvpn has a great linux client lol | 22:25 |
keekz | yes, adding `allowlist_externals = dot` generated without that warning 👍 | 22:25 |
JayF | I've used that to get around crazy network restrictions | 22:25 |
cardoe | It looks like any traffic outside of port 443 and 80 is getting mangled. | 22:26 |
JayF | nord uses 443 | 22:26 |
opendevreview | Doug Goldstein proposed openstack/ironic master: allow docs targets to run dot https://review.opendev.org/c/openstack/ironic/+/940416 | 22:29 |
opendevreview | Doug Goldstein proposed openstack/ironic-python-agent master: migrate lints to pre-commit https://review.opendev.org/c/openstack/ironic-python-agent/+/940185 | 22:30 |
cardoe | That should enable pre-commit inside of CI for IPA | 22:31 |
cardoe | TheJulia: I'm gonna minimize function calls inside of f-strings. There's just a few exceptions. One of them being len() like... f'[{len(elements)}]' | 22:34 |
cardoe | The result here is the string is "[4]" for example. That's just how you do it. There's no short hand to get the length of a list. | 22:34 |
cardoe | The replaced code was.... '[%s]' % len(elements) | 22:35 |
cardoe | So it's basically the same thing | 22:35 |
cardoe | I had previously asked if he'd prefer a temp variable. | 22:36 |
cardoe | re: https://review.opendev.org/c/openstack/sushy/+/934916 | 22:37 |
TheJulia | Sounds okay to me | 22:42 |
opendevreview | Doug Goldstein proposed openstack/ironic master: doc: define the shape of inspection inventory https://review.opendev.org/c/openstack/ironic/+/940277 | 22:46 |
opendevreview | Merged openstack/ironic master: allow docs targets to run dot https://review.opendev.org/c/openstack/ironic/+/940416 | 23:09 |
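
On the `clean_step_priority_override` syntax question raised at 17:17-17:20: the sketch below is only an illustration of how `interface.step:priority` entries could be folded into one mapping, whether they arrive as repeated option lines or as a single comma-separated value. It is not Ironic's or oslo.config's actual parser, and the function name `parse_overrides` is hypothetical. The step names and priorities are the ones keekz quoted.

```python
def parse_overrides(values):
    """Fold 'interface.step:priority' entries into a dict.

    `values` is a list of raw option strings. Each string may hold a
    single entry or several comma-separated entries.
    """
    overrides = {}
    for raw in values:
        for entry in raw.split(","):
            entry = entry.strip()
            if not entry:
                continue
            # Split on the last colon so the step name stays intact.
            step, sep, priority = entry.rpartition(":")
            if not sep or not step:
                raise ValueError(
                    f"expected 'interface.step:priority', got {entry!r}")
            overrides[step] = int(priority)
    return overrides


# Repeated option lines, one entry each.
multi_line = parse_overrides([
    "raid.delete_configuration:124",
    "raid.create_configuration:123",
    "deploy.erase_devices_express:95",
])

# One option line with comma-separated entries.
single_line = parse_overrides([
    "raid.delete_configuration:124,raid.create_configuration:123,"
    "deploy.erase_devices_express:95",
])
```

In this sketch both spellings produce the same mapping, which matches TheJulia's comment that the two forms "should be working". Whether a given Ironic release accepts both is something to confirm against its own configuration reference.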
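
The cleaning lockout TheJulia describes between 17:28 and 20:17 can be pictured with a toy model. This is a rough sketch under stated assumptions, not Ironic code: the key `skip_agent_polling`, the function `run_cleaning`, and the step set are all hypothetical stand-ins. The assumption is that RAID-style steps set a flag telling the conductor that the driver polls for completion, nothing clears it afterwards, and a later agent-side step then never has its status checked on heartbeat.

```python
# Hypothetical flag name; Ironic's real driver_internal_info key may differ.
POLLING_FLAG = "skip_agent_polling"

# Steps assumed (for this sketch only) to be completed by driver-side polling.
DRIVER_POLLED_STEPS = {"raid.delete_configuration", "raid.create_configuration"}


def run_cleaning(steps, clear_stale_flag, max_heartbeats=5):
    """Simulate a cleaning run and report which steps completed."""
    info = {}  # stands in for node.driver_internal_info
    completed = []
    for step in steps:
        if clear_stale_flag:
            # Possible remedy: drop any leftover flag before starting a step.
            info.pop(POLLING_FLAG, None)
        if step in DRIVER_POLLED_STEPS:
            # The driver asks the conductor not to poll the agent and
            # notices completion through its own periodic task.
            info[POLLING_FLAG] = True
            completed.append(step)
            continue
        # Agent-side step: it only advances when a heartbeat leads the
        # conductor to check command status on the agent.
        for _ in range(max_heartbeats):
            if info.get(POLLING_FLAG):
                continue  # heartbeat accepted, but status is never checked
            completed.append(step)
            break
        else:
            return completed, "stuck"
    return completed, "done"


steps = ["raid.create_configuration", "deploy.erase_devices_metadata"]
buggy = run_cleaning(steps, clear_stale_flag=False)
fixed = run_cleaning(steps, clear_stale_flag=True)
```

With the stale flag left in place the model stops after the RAID step while heartbeats keep arriving, which is the symptom keekz reported. Clearing the flag before each step is one plausible remedy in this toy model; the actual change proposed in review 940413 may take a different approach.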