opendevreview | Julia Kreger proposed openstack/sushy master: Capture more errors https://review.opendev.org/c/openstack/sushy/+/853209 | 00:14 |
---|---|---|
TheJulia | dtantsur: I'm wondering if ironic should just dump the session on *any* exception | 00:26 |
opendevreview | Vanou Ishii proposed openstack/ironic stable/yoga: Fix iRMC driver to use certification file in HTTPS https://review.opendev.org/c/openstack/ironic/+/852797 | 04:00 |
arne_wiebalck | Good morning, Ironic! | 06:44 |
arne_wiebalck | Great, thanks stevebaker[m] ! | 06:45 |
opendevreview | Jakub Jelinek proposed openstack/ironic-python-agent master: WIP: Enable skipping RAIDS https://review.opendev.org/c/openstack/ironic-python-agent/+/852999 | 07:12 |
kubajj | dtantsur: TheJulia: as discussed on Thursday I think it is the easiest to address the skip list issue for RAID devices by pointing to them by the volume name (I found this in the documentation), however, if I am correct, the name is not used. Therefore, I have created a new change which adds the name to the create raid function. Let me know what you think: https://review.opendev.org/c/openstack/ironic-python-agent/+/853182 | 08:10 |
dtantsur | TheJulia: probably not all HTTP exceptions, but 401 and any connection issues | 10:09 |
dtantsur | kubajj: not sure I'm aware of the issue you're referring to | 10:10 |
opendevreview | Merged openstack/sushy-tools master: remove unicode from code https://review.opendev.org/c/openstack/sushy-tools/+/852622 | 10:27 |
opendevreview | Merged openstack/sushy-tools master: List network interfaces of all types https://review.opendev.org/c/openstack/sushy-tools/+/844284 | 10:37 |
opendevreview | Aija Jauntēva proposed openstack/sushy-tools master: Add Chassis to ServiceRoot https://review.opendev.org/c/openstack/sushy-tools/+/844126 | 10:44 |
iurygregory | good morning Ironic, I'm alive o/ | 11:09 |
dtantsur | yay, congrats iurygregory | 11:11 |
iurygregory | ty! | 11:12 |
iurygregory | dtantsur, quick question, do you think this would be enough to cover all vmedia cases we have https://review.opendev.org/c/openstack/ironic/+/852234 or I would need to update other drivers to support it? (talking in case they are not using redfish driver itself...) | 11:14 |
TheJulia | iurygregory: if I recall correctly, they use the image utils as well | 11:18 |
iurygregory | TheJulia, yeah, just wondering if would need to specify the new field "external_http_url" for driver_info ... | 11:19 |
iurygregory | also, good morning (very early for you...) | 11:19 |
TheJulia | I think it is likely covered, but relatively easy to check. | 11:20 |
TheJulia | (Up early because having some pain that is keeping me from getting back to sleep) | 11:20 |
iurygregory | ouch =( sorry to hear that | 11:20 |
dtantsur | oh, I hope you get better asap | 11:21 |
TheJulia | I am likely going to need to go find a doctor this morning… but none of them are open for 4+ hours and our insurance in the US denied my last visit to an emergency room as “inappropriate class of service for diagnosis” | 11:23 |
kubajj | dtantsur: We discussed with TheJulia how to extend the skip list safeguard to RAID devices. We agreed that including the volume name (user-defined) in the list might be a good approach. Volume name is mentioned in the docs, but not used during the create raid device function execution. That's what the new change addresses. | 11:30 |
dtantsur | kubajj: volume name won't help you | 11:31 |
dtantsur | volume name becomes /dev/md/<volume>, while you need /dev/md5 (we don't work with any symlinks) | 11:31 |
dtantsur | I already hit a similar dead corner when trying to make /dev/disk/by-<something>/<something> work | 11:31 |
kubajj | dtantsur: but couldn't we work with them? | 11:31 |
dtantsur | potentially - yes | 11:32 |
dtantsur | speaking of which, you need to update ironic-inspector since it also deals with root devices | 11:32 |
dtantsur | ironic-inspector processes disks on the server side, making symlink resolution.. interesting | 11:32 |
kubajj | I was planning to get the list of logical devices and then possibly check if the symlink returns something or an error | 11:33 |
dtantsur | kubajj: https://opendev.org/openstack/ironic-inspector/src/branch/master/ironic_inspector/plugins/standard.py#L40-L67 | 11:33 |
dtantsur | kubajj: how do you do it outside of the node? | 11:33 |
kubajj | IPA runs on the node, right? What would we need it outside for? | 11:34 |
dtantsur | kubajj: ironic-inspector runs outside, please see the messages above and the link | 11:34 |
dtantsur | kubajj: I started https://review.opendev.org/c/openstack/ironic-lib/+/845986 but then realized the problem with inspector and gave up | 11:35 |
dtantsur | we could potentially send all aliases as part of the inventory. i.e. check /dev/disk/*, /dev/md/* for symlinks to the device. | 11:36 |
TheJulia | WRT volume names, I was thinking hardware raid, not software raid | 11:36 |
dtantsur | TheJulia: hardware RAID are just normal disks, no? | 11:36 |
TheJulia | Since you may not actually see underlying disks | 11:36 |
dtantsur | you won't even see the volume name from inside the machine? | 11:36 |
TheJulia | dtantsur: depends on the controller, but generally yes | 11:36 |
kubajj | TheJulia: I see, that's my bad then | 11:37 |
TheJulia | dtantsur: most expose the volume name | 11:37 |
dtantsur | but do we collect it? | 11:37 |
TheJulia | Truncated down to 12 chars…. | 11:37 |
dtantsur | or rather: do we have a vendor-neutral way to collect them? | 11:37 |
TheJulia | AFAIK, no: | 11:37 |
dtantsur | kubajj: side note: regardless of this discussion, please do update ironic-inspector, otherwise you're going to have 2 different logics to determine the root device | 11:38 |
TheJulia | Uhhhh there is a field in the SATA/SCSI spec afaik | 11:38 |
kubajj | dtantsur: What does the inspector do? | 11:38 |
TheJulia | I’m not sure how we would easily capture aside from going and picking the field out from the kernel. | 11:38 |
TheJulia | But my perception was always agent side processing, nothing to do with root device hinting | 11:39 |
dtantsur | kubajj: as part of processing the introspection data, it determines the root device and sets the node's local_gb accordingly | 11:39 |
TheJulia | Agent side processing in regards to what to ignore | 11:39 |
TheJulia | Which reminds me… is local_gb even applicable to any scheduling anymore? | 11:40 |
TheJulia | I guess “the old nova scheduler” way still gets used by some | 11:40 |
kubajj | dtantsur: and so I should update it to also ignore the disks on the skip list, am I following correctly? | 11:41 |
dtantsur | kubajj: correct | 11:42 |
dtantsur | TheJulia: we use it for free space calculation, I think | 11:42 |
dtantsur | but you're right, we should start phasing all this properties stuff out (except for cpu_arch) | 11:43 |
TheJulia | It is good we have maintained such long compatibility! | 11:44 |
dtantsur | \o/ | 11:47 |
opendevreview | Merged openstack/ironic-python-agent master: Enable skipping disks for cleaning https://review.opendev.org/c/openstack/ironic-python-agent/+/850861 | 11:49 |
opendevreview | Jakub Jelinek proposed openstack/ironic-python-agent master: Improve function list_block_devices_check_skip_list https://review.opendev.org/c/openstack/ironic-python-agent/+/853284 | 12:39 |
kubajj | Here is the follow-up dtantsur | 12:39 |
dtantsur | Thanks! | 12:46 |
opendevreview | Julia Kreger proposed openstack/sushy master: Capture requests errors https://review.opendev.org/c/openstack/sushy/+/853209 | 13:44 |
TheJulia | dtantsur: ^ | 13:44 |
TheJulia | JayF: I think https://review.opendev.org/c/openstack/ironic/+/852794 is ready to rock and roll | 13:48 |
opendevreview | Jakub Jelinek proposed openstack/ironic-inspector master: Introduce skip list to inspector https://review.opendev.org/c/openstack/ironic-inspector/+/853304 | 13:55 |
kubajj | dtantsur: and this should be the inspector | 13:57 |
frickler | iurygregory: any news on jsonschema? should I nag someone else instead? ;) | 14:29 |
JayF | TheJulia: +2 | 14:48 |
iurygregory | frickler, still trying to figure out what is happening, was able to reproduce issues but not to directly fix, seems like the exceptions changed I'm trying to debug to see how to make it match... | 14:52 |
kubajj | TheJulia: dtantsur: What should I do about the software RAIDs then if I can't use the names? | 14:55 |
kubajj | And what is the local_gb? | 14:56 |
TheJulia | iurygregory: I've taken a look at https://review.opendev.org/c/openstack/ironic/+/852797. mainly release note stuff I think. I'm leaning towards -1, but I could just see fixing the reno stuffs and just pushing the button. Not a fan of their own version determination logic getting added, but a fan of the overall approach since the lower minimum version was never incremented upward as time went on | 15:11 |
TheJulia | kubajj: local_gb is something I think you can ignore, it was mainly used for scheduling a long long time ago representing a sub of available storage on the root disk. | 15:11 |
iurygregory | TheJulia, looking | 15:12 |
TheJulia | kubajj: to dtantsur's point, you can't use a name on a traditional software raid and expect it to be exposed through. You *can* enumerate the devices in use via /proc/mdstat | 15:13 |
TheJulia | so if you have say a /dev/md1 you can see its parent devices might be /dev/sda3 /dev/sdb3 via /proc/mdstat | 15:14 |
kubajj | TheJulia: I just came across it while writing the tests for inspector and had no clue where the numbers are coming from | 15:14 |
iurygregory | TheJulia, the logic you are talking is https://review.opendev.org/c/openstack/ironic/+/852797/7/ironic/drivers/modules/irmc/common.py right? | 15:16 |
TheJulia | iurygregory: yup | 15:16 |
iurygregory | I can say is a bit interesting... | 15:17 |
TheJulia | kubajj: data submitted on the likely root device, I believe | 15:17 |
opendevreview | Jakub Jelinek proposed openstack/ironic-python-agent master: Improve function list_block_devices_check_skip_list https://review.opendev.org/c/openstack/ironic-python-agent/+/853284 | 15:17 |
iurygregory | I do agree with your comment re 0.8.2 | 15:17 |
TheJulia | iurygregory: mixed feelings is my state of mind at the moment | 15:18 |
iurygregory | no worries | 15:26 |
adam-rozman | Hi all! Do I have to do some extra step to enable special operators for IPA root device hints? I have used a "<in> deviceserial" for the "serial" root device hint but my expression gets prepended with "s==" prefix as if IPA is unable to recognize that the expression contains an operator. | 15:41 |
adam-rozman | I got a response at the end such as Image provisioning failed: Deploy step deploy.write_image failed | 15:44 |
adam-rozman | on node 7f8ac1d4-c478-437a-81a3-9b276bbb99a5. No suitable device was found for | 15:44 |
adam-rozman | deployment using these hints {''serial'': ''s== "<in> deviceserial"\n''}' | 15:44 |
dtantsur | adam-rozman: feels like you somehow ended up with extra quotes | 15:45 |
adam-rozman | that could be dtantsur I have put it in the serialNumber field of a BMH object | 15:46 |
dtantsur | adam-rozman: ah metal3. I'm not sure it supports operators. lemme check. | 15:48 |
dtantsur | (I should have guessed of course :) | 15:48 |
adam-rozman | thanks | 15:48 |
adam-rozman | I assume it is because I have tried to "edit" the manifest live and it was always removing my ""s and it was failing as it has only passed <in> but this way it fails because it carries over my ""s to IPA | 15:50 |
dtantsur | adam-rozman: yeah, BMO hardcodes operators: https://github.com/metal3-io/baremetal-operator/blob/main/pkg/provisioner/ironic/devicehints/devicehints.go#L30-L32 | 15:50 |
dtantsur | I don't remember why it was done, but I assume because simplicity | 15:51 |
dtantsur | maybe we need to update it to leave any existing operators alone | 15:51 |
adam-rozman | but it is also done on IPA level | 15:51 |
dtantsur | sorry? | 15:52 |
adam-rozman | https://opendev.org/openstack/ironic-lib/src/commit/95ce746ad101b7af32684c29384237061565a4af/ironic_lib/utils.py#L283 | 15:52 |
dtantsur | adam-rozman: yeah, but ironic-lib takes existing operators into account. BMO does not. | 15:52 |
dtantsur | this part actually checks for a correct operator: https://opendev.org/openstack/ironic-lib/src/commit/95ce746ad101b7af32684c29384237061565a4af/ironic_lib/utils.py#L279-L280 | 15:52 |
dtantsur | now, if you get redundant quotes, chances are high ironic will be confused as well | 15:53 |
adam-rozman | yeah but then I assume as long as BMO hardcodes the "s==" prefix I can't even pass any operator to IPA | 15:54 |
adam-rozman | wouldn't it be okay to just remove the "s==" prefix as IPA would anyhow check if there is operator or not? | 15:54 |
adam-rozman | I mean remove from BMO | 15:54 |
dtantsur | adam-rozman: it will, I seem to remember the argument was "explicit better than implicit" | 15:55 |
adam-rozman | I actually have a usecase where I would need to use the operators in serial device hint in metal3, it feels to me as if a very good Ironic/IPA feature would be artificially blocked. Would it be okay if I would open an issue related to this in BMO? | 16:00 |
adam-rozman | Ofc I wouldn't like to spam the Ironic community with this issue as it turns out to be a BMO thing, so I will transition the conversation there. And thanks for the help dtantsur !!!! | 16:03 |
JayF | adam-rozman: you're using our stuff, and a reasonably advanced feature of it... but it's broken, but it's not our fault it's broken? | 16:09 |
JayF | this is all good news LOL | 16:09 |
adam-rozman | yep exactly as you say JayF | 16:11 |
JayF | :D | 16:12 |
cbouchar | Is there a limit to the number of BM hosts one should manage with a single ironic instance? If so, do share with me the magic number. | 18:05 |
JayF | I don't think we have an official number | 18:09 |
JayF | and realistically, that number can go crazy-high depending on your use case, driver choice, and configuration | 18:09 |
ashinclouds[m] | Like five to six figure high. We generally recommend a starting point around 500 nodes per running conductor with default settings and the ipmi driver (as it has a much higher cpu overhead). Redfish is much more lightweight as we have cached sessions and by default config we will cache up to a thousand sessions in each ironic-conductor (i.e. 1000 nodes per conductor; this can be tuned!) | 18:21 |
cbouchar | Wow! I didn't expect this as a response. I seem to recall in metal3 there was a limitation 100-255. | 18:40 |
cbouchar | Thank you for clarifying! | 18:40 |
TheJulia | Well, depending on the access model, things may change, but we did a lot to improve overall API performance. The costly thing with API performance though is asking the API for all fields on all nodes | 19:02 |
opendevreview | Julia Kreger proposed openstack/ironic master: Allow project scoped admins to create/delete nodes https://review.opendev.org/c/openstack/ironic/+/852794 | 19:07 |
TheJulia | JayF: ^ removed the whitespace. | 19:07 |
TheJulia | ajya: by chance do you watch the openstack mailing list? | 19:14 |
TheJulia | cbouchar: I would be happy to have a deep dive performance discussion at some point | 20:25 |
opendevreview | Julia Kreger proposed openstack/ironic master: Add kickstart template 'url' option https://review.opendev.org/c/openstack/ironic/+/853368 | 21:18 |
opendevreview | Julia Kreger proposed openstack/ironic-tempest-plugin master: WIP: Initial tempest test idea anaconda deploy https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/835917 | 21:20 |
opendevreview | Jay Faulkner proposed openstack/ironic stable/yoga: Fix pxe image lookups https://review.opendev.org/c/openstack/ironic/+/853331 | 22:22 |
opendevreview | Jay Faulkner proposed openstack/ironic stable/yoga: Fix pxe image lookups https://review.opendev.org/c/openstack/ironic/+/853331 | 22:37 |
JayF | TheJulia: ^^ Does that have a prereq, or did I resolve the conflict badly? Tests are failing around the changed code. | 22:37 |
* JayF | is on a mission to ensure all our BC+1 patches are actually backported | 22:37 |
TheJulia | I don't *think* it did, but I can look likely tomorrow | 22:38 |
JayF | both failures are | 22:38 |
JayF | > ironic.common.exception.ImageUnacceptable: 'stage2_id' is missing from the properties of the OS image http://fake.url/path. The anaconda deploy interface requires this to be set with the OS image or in instance_info['stage2']. | 22:38 |
JayF | which makes me wonder if there's a prereq on some of the validation cleanup that I very vaguely remember happening | 22:38 |
JayF | and/or when resolving the conflict, I broke these tests or left them in when they shoulda been removed | 22:39 |
TheJulia | oh... hmmmm | 22:39 |
TheJulia | yes, there is a prereq missing then | 22:39 |
TheJulia | hmmmm | 22:39 |
JayF | I'm going to put this chat in the comments of that merge req, if you have neurons fire in the direction of this change, put the info there and I'll follow-up on it | 22:40 |
TheJulia | it could be https://review.opendev.org/c/openstack/ironic/+/835769 | 22:40 |
JayF | TheJulia: also, a blast from the past now landing in victoria https://review.opendev.org/c/openstack/nova/+/800873 | 22:41 |
TheJulia | I also did https://review.opendev.org/c/openstack/ironic/+/834709 | 22:41 |
JayF | are those both backportage? | 22:41 |
JayF | 834709 is HUGE | 22:41 |
TheJulia | wow.... | 22:41 |
TheJulia | uhhhhhh I'm not sure about 709 | 22:41 |
JayF | if 769 applies without 709, I can try that | 22:42 |
JayF | that at least is sensible | 22:42 |
TheJulia | actually 709 is one I intended to eventually backport | 22:42 |
JayF | Like, trying to think of the right way to ask this | 22:43 |
JayF | is this a good use of my time? | 22:43 |
JayF | aka do we have any KS users in Yoga-and-earlier? | 22:43 |
TheJulia | I think cbouchar is using yoga right now | 22:44 |
JayF | the answer is probably "screw it do it anyway jay" okay I'm convinced lol | 22:44 |
TheJulia | lol | 22:44 |
* JayF | is scared of https://dpaste.com/8SX8M8CVH | 22:45 |
JayF | I'll make a note and revisit this tomorrow, this is gonna be a bloodbath and I don't wanna start it this late lol | 22:45 |
TheJulia | oh... my | 22:46 |
TheJulia | ++++ | 22:46 |
JayF | so to be explicit: 834809 then 835769 then 852201 | 22:46 |
TheJulia | I *think* so, yes | 22:46 |
JayF | and TBH looking at that cherry-pick, might ensure that there's nothing BEFORE 809 that needs to go | 22:46 |
JayF | nothing says "I had a super productive day" more loudly than your todo list being longer at the end of it than at the beginning | 22:47 |
JayF | and I'm kinda serious lol | 22:47 |
TheJulia | lol | 22:48 |
TheJulia | yeah | 22:48 |
opendevreview | Julia Kreger proposed openstack/ironic master: Add kickstart template 'url' option https://review.opendev.org/c/openstack/ironic/+/853368 | 22:58 |
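
Note on the 00:26 and 10:09 messages about dumping the session: the idea is that a cached session should probably be dropped on a 401 or on any connection problem, but not on every HTTP error. A minimal sketch of that decision follows. `HTTPError` and `should_drop_session` are hypothetical names invented for illustration; this is not sushy's or ironic's actual code, and a real client would check its HTTP library's own exception types.

```python
class HTTPError(Exception):
    """Hypothetical stand-in for an HTTP error that carries a status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def should_drop_session(exc):
    """Return True when the cached session should be discarded.

    Assumed policy, following the chat: drop on 401 and on connection
    issues, keep the session for every other error.
    """
    if isinstance(exc, HTTPError):
        return exc.status_code == 401
    # ConnectionError and TimeoutError are built-in OSError subclasses.
    return isinstance(exc, (ConnectionError, TimeoutError))
```

A caller would wrap each request, and on an exception for which this returns True, evict the session from its cache before retrying or re-raising.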
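
Note on the 11:36 suggestion to send all aliases as part of the inventory: the check amounts to walking the symlinks under `/dev/disk/` and `/dev/md/` and keeping those that resolve to the device in question. The sketch below is one way this might look on the agent side. `find_aliases` is a hypothetical helper, and the default glob patterns are an assumption (`/dev/disk/*/*` rather than `/dev/disk/*`, since the entries directly under `/dev/disk` are directories such as `by-id`).

```python
import glob
import os


def find_aliases(device, patterns=("/dev/disk/*/*", "/dev/md/*")):
    """Return the sorted symlink paths that resolve to the given device."""
    target = os.path.realpath(device)
    aliases = []
    for pattern in patterns:
        for path in glob.glob(pattern):
            # Only symlinks count as aliases; compare fully resolved paths.
            if os.path.islink(path) and os.path.realpath(path) == target:
                aliases.append(path)
    return sorted(aliases)
```

For example, `find_aliases("/dev/md5")` would list any `/dev/md/<volume>` link pointing at that array. Shipping the result in the inventory would let a server-side consumer such as ironic-inspector match by alias without resolving symlinks itself.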
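
Note on the 15:13 and 15:14 messages about `/proc/mdstat`: the component devices of each software RAID array can be read from the array's status line. A rough sketch of such a parser is below. `parse_mdstat` is a hypothetical function, not part of ironic-python-agent, and the sample text is an illustrative example of the usual mdstat layout rather than output captured from a real node.

```python
import re

# An array line looks like: "md1 : active raid1 sdb3[1] sda3[0]"
_MD_LINE = re.compile(r"^(md\S+)\s*:\s*(.*)$")
# A member token looks like: "sda3[0]", optionally followed by flags such as "(F)"
_MEMBER = re.compile(r"^([^\[\s]+)\[\d+\]")


def parse_mdstat(text):
    """Map each md array name to the sorted list of its component devices."""
    arrays = {}
    for line in text.splitlines():
        match = _MD_LINE.match(line)
        if not match:
            continue
        members = []
        for token in match.group(2).split():
            member = _MEMBER.match(token)
            if member:
                members.append(member.group(1))
        arrays[match.group(1)] = sorted(members)
    return arrays


SAMPLE = """Personalities : [raid1]
md1 : active raid1 sdb3[1] sda3[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""
```

With the sample above, `parse_mdstat(SAMPLE)` maps `md1` to `sda3` and `sdb3`, which matches the `/dev/md1` example given in the chat.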
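
Note on the 15:50 to 15:55 exchange about hint operators: the behaviour suggested for BMO ("leave any existing operators alone") could look roughly like the sketch below, which only prepends the default `s==` when the value does not already start with an operator. `normalize_hint` and the `OPERATORS` set are assumptions made for illustration. The set is not guaranteed to match the list ironic-lib accepts, and this is neither BMO's nor ironic-lib's actual code.

```python
# Illustrative operator set (an assumption, possibly incomplete).
OPERATORS = frozenset({
    "=", "==", "!=", ">=", "<=", ">", "<",
    "s==", "s!=", "s>=", "s<=", "s>", "s<",
    "<in>", "<or>", "<all-in>",
})


def normalize_hint(value, default_operator="s=="):
    """Prepend the default operator unless the hint already has one."""
    value = value.strip()
    parts = value.split(None, 1)
    if parts and parts[0] in OPERATORS:
        # The user supplied an operator explicitly; pass it through untouched.
        return value
    return f"{default_operator} {value}"
```

Under these assumptions, `normalize_hint("<in> deviceserial")` is returned unchanged, while a bare `deviceserial` becomes `s== deviceserial`, so plain values keep working as before.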