TheJulia | ugh, looks like indirection and remotable stuffs are horribly broken | 00:14 |
---|---|---|
TheJulia | a fresh mind tomororw | 00:14 |
opendevreview | Adam McArthur proposed openstack/ironic master: api: Add schema for inspection rules API (versioning) https://review.opendev.org/c/openstack/ironic/+/954038 | 03:22 |
opendevreview | Adam McArthur proposed openstack/ironic master: api: Add schema for inspection rules API (requests) https://review.opendev.org/c/openstack/ironic/+/954039 | 03:22 |
adamcarthur5 | stephenfin I have tried to move all your openapi changes forward in ironic. I'll also hopefully find time to do more myself this month. Cheers! | 03:23 |
adamcarthur5 | TheJulia (+ cc JayF) I have written another change to the spec, but I made the mistake of not writing down more notes during our meeting. I know we talked about thinking about things through the conductor, but I cannot remember how we were thinking about the database table layout. I have attempted to remember as best I can on these things but lets | 03:26 |
adamcarthur5 | see what you think. I am sure I will have a few other things wrong so I apologies for that. | 03:26 |
adamcarthur5 | On a high-level, I have re-worked it to talk about instance_name and a location field together being the base for how to talk to this API. I think potentially starting with instance_name since that feature is written already. Potentially we need another spec to make some sort of location field a top-level thing? | 03:27 |
opendevreview | Adam McArthur proposed openstack/ironic-specs master: Add emergency bulk operations https://review.opendev.org/c/openstack/ironic-specs/+/952533 | 03:27 |
opendevreview | Tatiana Kholkina proposed openstack/ironic master: Fix logging for verification steps https://review.opendev.org/c/openstack/ironic/+/956531 | 05:26 |
rpittau | good morning ironic! o/ | 06:33 |
queensly[m] | Good morning o/ | 07:13 |
abongale | good morning ironic! | 07:43 |
opendevreview | Verification of a change to openstack/ironic master failed: Populate switch_info with lldp system name https://review.opendev.org/c/openstack/ironic/+/956471 | 08:30 |
dtantsur | JayF: ugh, I actually tried to put some logic there to catch your IPv6 case, but apparently it was not enough.. | 09:06 |
opendevreview | Johannes Kulik proposed openstack/python-ironicclient master: Fix parallel initial version negotiation https://review.opendev.org/c/openstack/python-ironicclient/+/956542 | 09:15 |
jhorstmann | Hi, I need some help with this: The ironic state machine diagram shows nodes moving to available after successful cleaning (https://docs.openstack.org/ironic/2024.2/_images/states.svg). I am testing burnin with 2024.2 and the node moves back to manageable afterwards. What is the actual expected behaviour in this case? | 12:29 |
dtantsur | jhorstmann: there is a duality in the cleaning process. Read on about manual vs automated cleaning, they end in different states. | 12:30 |
dtantsur | Manual cleaning is what you're using, and indeed it ends back in manageable | 12:30 |
dtantsur | and yeah, an arrow is missing in the diagram :( | 12:31 |
jhorstmann | alright, thanks for clarifying. I was not aware that there are different kinds of cleaning | 12:32 |
dtantsur | jhorstmann: https://docs.openstack.org/ironic/latest/admin/cleaning.html | 12:32 |
dtantsur | it's pretty brief but there is a short overview | 12:32 |
opendevreview | Verification of a change to openstack/ironic master failed: Populate switch_info with lldp system name https://review.opendev.org/c/openstack/ironic/+/956471 | 12:33 |
jhorstmann | dtantsur: thanks for the link, it makes sense now | 12:38 |
rpittau | TheJulia: just playing around with ironic-website, we should merge this https://github.com/OpenStackweb/ironic-website/pull/72 to fix the CI | 12:45 |
TheJulia | rpittau: the foundation's website contractor was supposed to look at it this week | 12:59 |
TheJulia | Also, good morning | 12:59 |
rpittau | good morning! :) | 12:59 |
janders | TheJulia : I have servicing/hold question: I am trying to establish a process of recovering from failed firmware updates with Ironic (and downstream operators using Ironic, but that's probably a separate topic). Based on our past discussions I came up with a method using a hold step on the node in "servicing failed" state: | 13:05 |
janders | https://paste.openstack.org/show/bDQFJGZRPBkRjQfsg5dN/ | 13:05 |
janders | my question is: is this working as expected? Just from the sound of it holding may suggest that we will be putting the node in a hold state as opposed to going back to ACTIVE - but for me it does exactly that, bringing the node back to ACTIVE with no further steps needed. | 13:06 |
TheJulia | janders: *blink* *blink* | 13:07 |
TheJulia | hold shouldn't move the node to active | 13:07 |
TheJulia | wut?! | 13:07 |
janders | Is the "hold" step essentially a "noop" step in this context? Is this intended behaviour and use of it (as opposed to me finding a way how to use it for something it wasn't designed for)? | 13:08 |
janders | it's been a while but I think I came up with this after discussing it with you and Jay after realising I can't run servicing with empty steps | 13:08 |
TheJulia | hold should kick the node to a hold state, not active. something is very wrong | 13:09 |
janders | oops it sounds like the "using it for something it is not designed for" case | 13:09 |
janders | I will try to find that conversation | 13:09 |
janders | may be useful reference to find out what were we trying to do back then | 13:10 |
cardoe | good morning ironic | 13:11 |
cardoe | dtantsur: would really like to land https://review.opendev.org/c/openstack/ironic/+/955536 which is just a fix for the bad mocks for something that landed this cycle. The tests pass but not cause they're actually testing anything. This makes the mocks return the right data and checks for the right things. It fixes up a couple of other series that I've got running. | 13:14 |
cardoe | Honestly anyone can review that. It just makes the mock of sushy's ethernet data behave how sushy behaves. It also returns the same data from the two different ways we can call it in sushy. | 13:15 |
cardoe | btw metal3-integration test is timing out for a number of patches. | 13:16 |
dtantsur | cardoe: oh, property mocks, the dark side of mocking in Python.. | 13:18 |
cardoe | Nothing is actually failing. it's just hitting the max runtime. | 13:18 |
cardoe | Unfortunately it's a property in sushy. So we should have the same behavior. | 13:18 |
TheJulia | dtantsur: by dark side, do we mean cookies dark side, or holocrons and red light sabers ? | 13:19 |
cardoe | I've got another series that gets rid of using the summary entirely. | 13:20 |
dtantsur | cardoe: yeah, I know. I'd rather use something that cleans up the mock after the test though. Your version, IIUC, leaves the mock in place forever. | 13:20 |
dtantsur | (getting rid of summary is probably not a bad idea) | 13:20 |
cardoe | That function is the function that creates the entirely mocked system | 13:21 |
cardoe | It gets passed a mock for `get_system()` and then populates it. | 13:21 |
cardoe | So that's why it leaves it | 13:21 |
dtantsur | You're changing type(system_mock.ethernet_interfaces), which is what? mock.Mock? | 13:22 |
cardoe | Yep | 13:22 |
dtantsur | So you're changing a global class, the one coming from Python even | 13:22 |
cardoe | hrm I see what you mean now | 13:22 |
TheJulia | janders: now that I'm waking up and becoming less cranky, hold is modeled out to be an entire state, a holding state, which is why I'm going "wut?!?", your starting in service fail which means overall it appears that the hold request is getting ignored or just not being honored by the conductor | 13:23 |
dtantsur | >>> type(mock.Mock()) is mock.Mock | 13:24 |
dtantsur | False | 13:24 |
dtantsur | Python, you're so frustrating sometimes.... | 13:24 |
cardoe | @mock.patch.object(redfish_utils, 'get_system', autospec=True) is where system_mock comes from | 13:26 |
dtantsur | Interesting, Gemini claims that each Mock object has its own private subclass | 13:26 |
dtantsur | in which case, what you're doing is acceptable | 13:27 |
cardoe | yay | 13:28 |
dtantsur | sigh https://github.com/python/cpython/blob/main/Lib/unittest/mock.py#L450-L459 | 13:28 |
dtantsur | I've +2'ed your patch; fix the mock situation if you feel like | 13:28 |
cardoe | So I honestly just wanna commit a JSON file to ironic which is a fake system and just use sushy to return data based on that JSON file | 13:30 |
cardoe | and not try to mock out sushy. | 13:30 |
dtantsur | I'd accept such a change, especially since this is how sushy itself does its unit tests | 13:31 |
cardoe | yep sushy ships that helper function for that too | 13:31 |
* TheJulia is still curious: cookies or light saber... | 13:37 | |
TheJulia | adamcarthur5: I'll try to look today, but likely won't be until later this week. I envisioned it as a single table with a master operation of short and each conductor which is running. | 13:43 |
TheJulia | adamcarthur5: I think adding location is easy, its just another field to add which is relatively easy and can be included unless your thinking more logic around it. At the end of the day, I was thinking of it as something free form which could just be a string we could then enable matching to so operators can mentally and quickly map nodes to physical structure because if you know the PDU on row 3 needs to go down, its | 13:46 |
TheJulia | starts to become very easy to say "shutdown row 3" | 13:46 |
adamcarthur5 | TheJulia a okay, so just something that is set by operators, a bit like the shard API in layout? | 13:47 |
dtantsur | TheJulia: cookie saber! how is this for a turn? | 13:48 |
TheJulia | dtantsur: om nom nom | 13:49 |
TheJulia | adamcarthur5: exactly, just much more informational for the operator until there is something disasterous :) | 13:49 |
TheJulia | Humans like to model things in any number of different ways, so... yeah :) | 13:50 |
adamcarthur5 | Great thanks. Sorry, I'm going to ask a few more questions about the Conductor thing, because I remember I was at a point where I understood it 😅 | 13:50 |
TheJulia | adamcarthur5: no worries, just be warned I'm going to go make more coffeeeeee | 13:51 |
adamcarthur5 | I.e. we use this table, one row per conductor, to track emergency state, and then we use the current tables to track all other state (i.e. special emergency node-states)? I remember we were worried about performance at one point, so I want to check that | 13:51 |
adamcarthur5 | Ack about coffee :)) | 13:52 |
TheJulia | adamcarthur5: I'd think of it as a schema where the UUID is not unique on it's own, but its UUID + conductor, for the original request there is just no conductor assigned, but each conductor would then pickup work and add it's own row as it is running. | 13:53 |
TheJulia | but to be fair, we didn't get into the nitty gritty of that | 13:54 |
adamcarthur5 | Ah! Yes. I think we went down this path because there's no batching operation, this was our solution to that | 13:55 |
adamcarthur5 | Thank you :) I'll roughly write this up in the spec today so that we can discuss in future | 13:56 |
TheJulia | yeah, a periodic could eventually go "hey, this looks done" and maybe do a reconcile count or something. It was always thought of as semi-best-effort but then how do you report/identify/track that is ???? | 13:57 |
TheJulia | *and* opinionated based upon what end users really need | 13:57 |
TheJulia | adamcarthur5: another thought, and likely not as efficient, is we could bypass sync power state or use it to the overall advantage within each conductor. Leveraging it is likely best but it will also mean you'll have a higher swarm of db updates, but hey, maybe that is for the best. | 14:06 |
adamcarthur5 | Okay, thanks for this :) | 14:09 |
JayF | Aisle note that our customer for this feature downstream specifically requested a button that they can push | 14:13 |
JayF | While automatic resolution is a very nice feature and a good thing to do, I think it's culturally very difficult to get people to just stay hands off for 20 minutes for things to fix itself | 14:14 |
JayF | Aisle -> I'll (thanks voice to text) | 14:15 |
opendevreview | Merged openstack/ironic master: Fix logging for verification steps https://review.opendev.org/c/openstack/ironic/+/956531 | 14:26 |
cardoe | dtantsur: btw https://review.opendev.org/c/openstack/ironic/+/933066 works and I believe addresses your review comments through the series as well. that finishes off that release goal for 2025.1 https://specs.openstack.org/openstack/ironic-specs/priorities/2025-1-workitems.html which is unfortunately a release late. | 14:36 |
TheJulia | yeah, they need to quickly see that something is happening, which is sort of critical | 14:38 |
TheJulia | the exact pattern under the hood TBD | 14:39 |
* dtantsur bookmarks the change | 15:10 | |
JayF | dtantsur: fwiw my machine was weirdly broken. v6 connectivity on the public interface; no v6 on loopback. | 15:11 |
JayF | dtantsur: so I wouldn't take it as a valid test of any fallback code | 15:11 |
dtantsur | right | 15:11 |
TheJulia | Is there an award for stupid code tricks? :) | 15:26 |
opendevreview | Verification of a change to openstack/ironic master failed: Populate switch_info with lldp system name https://review.opendev.org/c/openstack/ironic/+/956471 | 15:26 |
queensly[m] | JayF: Hi, and welcome back. | 15:34 |
queensly[m] | I saw your comment in the chat on this https://github.com/openstack/ironic/commit/94948bb1945c83b54a178d4e7e5ea54eb1af6f3e | 15:34 |
queensly[m] | A release note was added before it got merged. I just wanted to know if you're referring to something else. | 15:34 |
JayF | queensly[m]: I clearly just missed the file, a problem that occurs more easily when spending all day reading code. My apologies! | 15:34 |
queensly[m] | No problem :) | 15:36 |
frickler | TheJulia: you mean like https://www.ioccc.org/ ? ;) | 15:46 |
TheJulia | frickler: lol, no. :) | 15:57 |
TheJulia | muahahahaha I've gotten remotable code to work through jsonrpc | 17:41 |
opendevreview | Merged openstack/ironic master: Populate switch_info with lldp system name https://review.opendev.org/c/openstack/ironic/+/956471 | 18:09 |
opendevreview | Merged openstack/ironic-python-agent-builder master: set a maximum systemd journal size https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/954527 | 18:16 |
opendevreview | Julia Kreger proposed openstack/ironic master: WIP: Remove direct mapping from API -> DB https://review.opendev.org/c/openstack/ironic/+/956512 | 19:30 |
opendevreview | Julia Kreger proposed openstack/ironic master: WIP: Optional indirection API use https://review.opendev.org/c/openstack/ironic/+/956504 | 19:30 |
JayF | TheJulia: you mentioned something yesterday about a catchup sync? | 19:36 |
TheJulia | sure, but I'm on calls for the next 2 hours | 19:38 |
JayF | 3p our time? | 19:45 |
TheJulia | sure | 19:52 |
TheJulia | that should give me enough time to try and redirect a customer :) | 19:52 |
cardoe | I love Julia's optimism there. | 20:03 |
TheJulia | worst comes to worst, I'll use dd | 20:03 |
* TheJulia goes back to team retro | 20:05 | |
JayF | hardwaremanager.initialize() is pretty neat | 20:11 |
opendevreview | Adam McArthur proposed openstack/ironic-specs master: Add emergency bulk operations https://review.opendev.org/c/openstack/ironic-specs/+/952533 | 21:01 |
TheJulia | JayF: https://meet.google.com/ido-hsik-fdt?authuser=0 | 22:01 |
janders | TheJulia JayF let me know if/when you have some bandwidth for my service/hold thing, ready to join any time | 22:27 |
TheJulia | janders: come on over! | 22:32 |
TheJulia | okay, where did my brain go | 23:17 |
opendevreview | Ivan Anfimov proposed openstack/ironic-ui master: Fix small mistake in text https://review.opendev.org/c/openstack/ironic-ui/+/956614 | 23:44 |
opendevreview | Ivan Anfimov proposed openstack/ironic-ui master: Fix small mistake in text https://review.opendev.org/c/openstack/ironic-ui/+/956614 | 23:46 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!