JayF | how is it possible that the default opensuse tumbleweed install in wsl doesn't come with openssl? lol | 00:00 |
---|---|---|
* TheJulia blinks | 00:00 | |
JayF | TheJulia: https://bpa.st/O2NQ is what raw.githubusercontent.com is serving up to me right now | 00:02 |
JayF | to summarize | 00:02 |
JayF | X509v3 Subject Alternative Name: | 00:03 |
JayF | DNS:*.github.io, DNS:github.io, DNS:*.github.com, DNS:github.com, DNS:www.github.com, DNS:*.githubusercontent.com, DNS:githubusercontent.com | 00:03 |
* JayF goes to look at the exact hostname on the failure | 00:03 | |
TheJulia | oh goodie | 00:04 |
TheJulia | ++ | 00:04 |
JayF | okay, so here's what's actually happening | 00:04 |
JayF | > + : curl -Lf -o disk.qcow2 http://download.cirros-cloud.net/0.6.1/cirros-0.6.1-x86_64-disk.img | 00:04 |
TheJulia | oh, look at the logs | 00:04 |
JayF | redirects to github.com releases | 00:04 |
TheJulia | it redirects it with 302 | 00:04 |
JayF | which has the wrong cert | 00:04 |
JayF | so I think our best solution is to directly use the github.com url | 00:04 |
TheJulia | we don't have a choice | 00:05 |
TheJulia | we don't create the 302 redirect, cirros does | 00:05 |
JayF | I'm saying have a human follow it lol | 00:05 |
TheJulia | Anyway, wifey needs to run out, and I can't look at a terminal anymore after trying to figure out what actually has testing, and doesn't | 00:05 |
JayF | it 302s to https://github.com/cirros-dev/cirros/releases/download/0.6.1/cirros-0.6.1-x86_64-disk.img | 00:05 |
JayF | so that seems like a reasonable place to pull it | 00:05 |
JayF | let me see if I can fix it... | 00:06 |
TheJulia | k | 00:06 |
TheJulia | likely | 00:06 |
TheJulia | plus side of writing a test and then writing code to support it, test driven development, but then when you go and go "oh, this needs more testing!", it feels like a slog | 00:06 |
JayF | I think about it this way | 00:07 |
JayF | most people operate maybe like, a dozen clouds at most? | 00:07 |
JayF | and have a lot of hardware | 00:07 |
JayF | we operate hundreds and hundreds in a teacup | 00:07 |
JayF | it's a super hard problem and sometimes I think we forget to appreciate that when it's working LOL | 00:07 |
TheJulia | +++++ | 00:08 |
TheJulia | We operate hundreds in a coffee mug, and the plate has many other coffee mugs | 00:09 |
opendevreview | Jay Faulkner proposed openstack/ironic master: CI: cirros images now 302 to github; correct the URL https://review.opendev.org/c/openstack/ironic/+/888961 | 00:10 |
opendevreview | Jay Faulkner proposed openstack/ironic master: CI: Re-enable voting for Grenade job https://review.opendev.org/c/openstack/ironic/+/888962 | 00:10 |
opendevreview | Jacob Anders proposed openstack/ironic master: [DNM] Retry connecting vmedia through a DVD device if available - alternate approach. https://review.opendev.org/c/openstack/ironic/+/888746 | 04:59 |
opendevreview | Jacob Anders proposed openstack/ironic master: [DNM] Retry connecting vmedia through a DVD device if available - alternate approach. https://review.opendev.org/c/openstack/ironic/+/888746 | 05:00 |
frickler | JayF: TheJulia: https://bugs.launchpad.net/ubuntu/+source/curl/+bug/2028170 don't blame the target if your tool is broken | 05:43 |
Nisha_Agarwal | morning ironic | 06:10 |
opendevreview | yatin proposed openstack/ironic master: [DNM] Attempt source install dnsmasq https://review.opendev.org/c/openstack/ironic/+/888121 | 06:54 |
rpittau | good morning ironic! o/ | 07:01 |
opendevreview | yatin proposed openstack/ironic master: [DNM] Attempt source install dnsmasq 2.89 https://review.opendev.org/c/openstack/ironic/+/888984 | 07:17 |
opendevreview | Verification of a change to openstack/bifrost master failed: Refactor the use of include_vars https://review.opendev.org/c/openstack/bifrost/+/874523 | 10:45 |
*** mohammed_ is now known as mohammed | 11:08 | |
iurygregory | good morning Ironic | 11:51 |
JayF | frickler: I'm going to be completely honest, I basically discarded the possibility that curl was broken because I assumed that it would break everything else in the entire world | 12:20 |
JayF | That bug makes it sound like it did break everything else in the entire world 😂 | 12:20 |
iurygregory | I never expected that curl would be broken lol | 12:21 |
frickler | yes, raises questions on ubuntu's QA | 12:22 |
frickler | at least it got fixed pretty fast | 12:23 |
TheJulia | wow, I would have never expected such a broken package to ship | 13:15 |
jfargen | Was just looking at the quickstart for bifrost and it seemed to install successfully. | 14:24 |
iurygregory | \o/ happy to hear that people can follow our docs and have successful installations | 14:28 |
jfargen | I see directions to enroll the node, but it doesn't mention the import of the node. | 14:43 |
jfargen | And when running enroll, seeing this error. | 14:44 |
jfargen | $ ./bifrost-cli enroll gvs02-sand2.json | 14:44 |
jfargen | [WARNING]: * Failed to parse /home/cloud-user/bifrost/playbooks/inventory/bifrost_inventory.py with script plugin: Inventory script (/home/cloud-user/bifrost/playbooks/inventory/bifrost_inventory.py) had an execution | 14:44 |
jfargen | error: 2023-07-20 10:39:57.384 4305 ERROR __main__ [-] Failed to parse JSON or YAML: while parsing a flow mapping in "/home/cloud-user/bifrost/gvs02-sand2.json", line 5, column 22 expected ',' or '}', but got '<scalar>' | 14:44 |
jfargen | in "/home/cloud-user/bifrost/gvs02-sand2.json", line 9, column 9: yaml.parser.ParserError: while parsing a flow mapping 2023-07-20 10:39:57.384 4305 ERROR __main__ [-] BIFROST_INVENTORY_SOURCE does not define a file | 14:44 |
jfargen | that could be processed: Failed to parse JSON or YAML.Tried JSON and YAML formats: Exception: Failed to parse JSON or YAML | 14:44 |
JayF | without seeing your json file I can't be sure; but it looks like invalid json or yaml in that file | 14:48 |
JayF | the gvs02-sand2.json | 14:49 |
jfargen | Is there an example of a yaml inventory file using redfish? | 14:52 |
jfargen | https://filebin.net/mnu08251w8jofq6i/gvs02-sand2.json | 14:54 |
jfargen | There is a copy of the file. | 14:54 |
JayF | aha so you gotta kill the trailing comma | 14:54 |
JayF | after the redfish_password entry | 14:54 |
JayF | same in nics[mac] as well | 14:54 |
TheJulia | actually, | 14:54 |
TheJulia | redfish_username is missing one | 14:55 |
JayF | oh snap | 14:55 |
TheJulia | password shouldn't have one for json | 14:55 |
JayF | so we allow trailing commas? | 14:55 |
JayF | probably because we are using a json/yaml parser... | 14:55 |
TheJulia | so, strictly i don't remember | 14:55 |
TheJulia | but we're sort of failing on the line before | 14:55 |
TheJulia | because it lacks one | 14:55 |
TheJulia | a json validator should complain about that | 14:56 |
jfargen | Ok, much happier. Thanks folks. | 14:56 |
jfargen | This doc - https://docs.openstack.org/bifrost/latest/user/howto.html#enroll-hardware, is missing the , (comma) after "redfish_username": "admin" . | 14:58 |
JayF | hah | 14:59 |
JayF | on it | 14:59 |
opendevreview | Jay Faulkner proposed openstack/bifrost master: Correct JSON by adding missing comma https://review.opendev.org/c/openstack/bifrost/+/889107 | 15:00 |
jfargen | Okay, the enroll progresses further, now seeing a new message. | 15:04 |
jfargen | https://filebin.net/70gmlebclnj1fmo5/gvs02-bifrost.out | 15:04 |
iurygregory | do you have uuid set on your json file? | 15:07 |
iurygregory | at least from the message this seems to be the problem | 15:07 |
TheJulia | I think our in repo example files all have UUIDs | 15:07 |
JayF | lol :/ | 15:08 |
iurygregory | yup, maybe we shouldn't make it required and generate one, not sure... | 15:08 |
TheJulia | it has been a while, and at one point we tried to name lookup | 15:08 |
jfargen | The doc says "The name, uuid, driver, and properties fields are directly mapped through to ironic. Only the driver is required. | 15:08 |
jfargen | " | 15:08 |
TheJulia | maybe that code got changed in refactoring | 15:08 |
jfargen | okay, I can add a uuid. | 15:09 |
JayF | Gonna be honest, this is the first I'm learning that you can provide a uuid at all for Ironic | 15:09 |
JayF | nodes | 15:09 |
TheJulia | oh, on create! :) | 15:09 |
JayF | yeah, enroll is what is happening | 15:09 |
jfargen | Do we have a way to work around a dreaded self-signed cert? | 15:17 |
jfargen | Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED | 15:18 |
jfargen | Full text https://filebin.net/70gmlebclnj1fmo5/gvs02-bifrost.out | 15:18 |
iurygregory | you can add redfish_verify_ca=False | 15:19 |
iurygregory | in the json under "driver_info" | 15:20 |
iurygregory | "redfish_verify_ca": false, | 15:22 |
iurygregory | something like this if I recall correctly | 15:22 |
jfargen | Yes, the second example helped. | 15:34 |
jfargen | Getting closer! | 15:34 |
TheJulia | Can I borrow a set of eyes? | 15:34 |
jfargen | fatal: [gvs02-sand2 -> localhost]: FAILED! => {"changed": false, "extra_data": {"data": null, "details": "None", "response": "None"}, "msg": "Node 20000000-0000-0000-0000-000000000002 could not reach state manageable: failed to verify management credentials; the last error is Failed to get power state for node 20000000-0000-0000-0000-000000000002. Error: The attribute Actions is missing from the resource | 15:34 |
jfargen | /redfish/v1/Systems"} | 15:34 |
TheJulia | jfargen: what hardware are you trying to remote control via redfish | 15:35 |
jfargen | dell | 15:35 |
jfargen | Dell R750 | 15:35 |
TheJulia | firmware version of the BMC? | 15:35 |
TheJulia | iurygregory: JayF: https://bugs.launchpad.net/ironic/+bug/2028279 I feel like python lost its mind on this one, I'd appreciate someone else's eyes looking at it | 15:36 |
jfargen | Firmware Version: 5.10.10.00 | 15:36 |
jfargen | Firmware Updated: Tue Apr 19 15:01:30 2022 | 15:36 |
TheJulia | what version of python-sushy is installed? | 15:37 |
iurygregory | TheJulia, ack will look at it | 15:38 |
jfargen | @TheJulia Was this question "what version of python-sushy is installed?" for me? | 15:39 |
TheJulia | jfargen: yes, sorry | 15:39 |
jfargen | Seems not to be installed. | 15:40 |
jfargen | (bifrost) [cloud-user@bifrost-0 bifrost]$ sudo rpm -qa | grep python-sushy | 15:40 |
jfargen | (bifrost) [cloud-user@bifrost-0 bifrost]$ | 15:40 |
TheJulia | try pip-freeze looking for sushy | 15:40 |
TheJulia | err pip freeze | 15:42 |
jfargen | sushy==4.5.0 | 15:43 |
TheJulia | so basically latest | 15:45 |
TheJulia | jfargen: I hate to ask this, but could you consider updating your bmc firmware? I suspect you're hitting issues around authentication. You *could* try turning off session based authentication as well, that will change the underlying behavior of the library, not horribly so, but I think things are detonating on the base lookup for where to find to authenticate. If you've got an exception in the logs, I can likely fix it or | 15:46 |
TheJulia | account for it | 15:46 |
TheJulia | iurygregory: have you, by chance seen this on dell hardware? | 15:47 |
TheJulia | https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html look for "redfish_auth_type" | 15:47 |
* TheJulia wonders if base bmc config also changes the ability to use session auth | 15:48 | |
rpittau | we've seen this in Dell and updating firmware did fix that | 15:48 |
rpittau | suggested version is 6.10.30.00 | 15:48 |
jfargen | Shouldn't be an issue, let me see about doing it. | 15:48 |
TheJulia | I did kind of remember something with 5.10.10.10 specifically, but it just wasn't coming to the surface | 15:51 |
TheJulia | Thanks rpittau | 15:51 |
rpittau | no problem | 15:51 |
rpittau | yeah there are some bugged firmware in the 5.x series until 6.10 | 15:51 |
rpittau | in this case though it's indeed a compatibility issue, latest sushy works with 6.x series | 15:55 |
iurygregory | sorry was having lunch | 16:02 |
iurygregory | yeah like rpittau mentioned 5 series seems to have some problems | 16:02 |
JayF | If we know current sushy doesn't work on that version of drac firmware; is there a way to give someone like jfargen that message without the IRC chat round trip? Like can sushy improve the err in those cases? | 16:09 |
TheJulia | I seem to remember 5.10.20.30 being happier, but I might be making that up | 16:10 |
JayF | https://review.opendev.org/c/openstack/sushy/+/888649 can someone land this? | 16:10 |
TheJulia | We don't have a mechanism to detect and return that on specific versions, the idea has been floated before and we had some pretty extreme push back from hardware vendor land on any suggestion of anything "negative" | 16:11 |
rpittau | JayF: approved | 16:11 |
rpittau | aaand see you tomorrow! o/ | 16:11 |
TheJulia | rpittau: you beat me due to the login screen! :) | 16:12 |
TheJulia | Have a wonderful evening! | 16:12 |
JayF | Hardware vendors don't have core review status on sushy, why do they get a vote :) | 16:12 |
JayF | If they don't want versions blocklisted; don't release bad firmwares | 16:12 |
TheJulia | We always attempt to play nice with everyone. :) | 16:12 |
TheJulia | Which can be a mistake at times | 16:12 |
JayF | I never want to be nice to a corp at the expense of our users :) | 16:12 |
TheJulia | I'd +2 a "known broken, report a 'update your bmc'" change | 16:12 |
JayF | of course, all of those cases are 100% black and white | 16:13 |
JayF | with no grey area whatsoever | 16:13 |
JayF | ;) | 16:13 |
* TheJulia notes being a core is a bit like being on the grey council | 16:13 | |
JayF | I've been one ignored shadow away from breaking the staff in two dramatically for years now | 16:14 |
* TheJulia wants the hall, the robes only if they are not "too warm" | 16:14 | |
opendevreview | Jay Faulkner proposed openstack/sushy stable/zed: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889112 | 16:29 |
opendevreview | Jay Faulkner proposed openstack/sushy stable/zed: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889112 | 16:30 |
opendevreview | Jay Faulkner proposed openstack/sushy stable/zed: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889112 | 16:32 |
opendevreview | Jay Faulkner proposed openstack/sushy stable/zed: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889112 | 16:33 |
opendevreview | Jay Faulkner proposed openstack/sushy stable/yoga: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889114 | 16:59 |
opendevreview | Jay Faulkner proposed openstack/sushy stable/xena: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/889115 | 17:01 |
mohammed | TheJulia: Yesterday we tested the metal3 build after merging your fixes (sqlite database is locked) and we have not seen that issue anymore :smile: but we noticed the inspection sometimes fails with the error reported here https://github.com/metal3-io/cluster-api-provider-metal3/issues/1099 | 17:11 |
mohammed | I can see also similar failures on the ironic ci : https://zuul.opendev.org/t/openstack/build/54b3aeff26d44eb5816a485b953ed01f | 17:11 |
mohammed | We are still investigating this issue but I wanted to share this and see if you have any thoughts about it! Not sure if it is a side effect of, or related to, the db issue? | 17:11 |
opendevreview | Merged openstack/sushy stable/2023.1: Requests must always have a read/connect timeout https://review.opendev.org/c/openstack/sushy/+/888649 | 17:23 |
TheJulia | mohammed: okay, I see what is going on, kind of | 17:38 |
TheJulia | it is the retry handler, trying to retry! \o/ | 17:38 |
TheJulia | and failing oddly | 17:38 |
TheJulia | ...weird | 17:38 |
TheJulia | so the unit test works because the logging is mocked to capture the calls. The examples were followed | 17:39 |
TheJulia | the issue is that the examples of that precise use of logging were wrong :( | 17:39 |
TheJulia | okay! easy-ish fix | 17:39 |
TheJulia | mohammed: on a plus side, we have clearly lessened the issue with the other work, the log is not 5 Megabytes! | 17:51 |
TheJulia | today's question, how many times can Julia become distracted | 17:54 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix retry logic logging https://review.opendev.org/c/openstack/ironic/+/889117 | 18:03 |
mohammed | I would change the question to: How many problems has Julia solved today? | 18:04 |
TheJulia | I have one I'm stumped on | 18:05 |
TheJulia | Like, it makes absolutely no sense to me | 18:05 |
TheJulia | like, mentally, I'm starting to wonder *IF* the python in ubuntu is carrying some extra performance patch that makes things unsafe for our code | 18:06 |
TheJulia | wheeee my ipv6 is not working, again | 18:07 |
* TheJulia just lets tox hang for a little while | 18:07 | |
JayF | TheJulia: after lunch, maybe we should look at that one together? I didn't fully understand when I read the bug | 18:11 |
JayF | but one of those things where someone validating your assumptions would help, maybe? | 18:11 |
* JayF off to get said lunch | 18:11 | |
TheJulia | JayF: sure | 18:11 |
TheJulia | JayF: I do have a 1-on-1 starting in 50 minutes, that typically runs an hour, fyi | 18:11 |
JayF | so... 1pm PST then? | 18:11 |
TheJulia | yeah | 18:12 |
JayF | it's on the calendar, you have an inv | 18:12 |
TheJulia | k | 18:14 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix retry logic logging https://review.opendev.org/c/openstack/ironic/+/889117 | 18:37 |
*** Continuity__ is now known as Continuity | 18:54 | |
opendevreview | Jay Faulkner proposed openstack/ironic master: DNM: Testing Nova-Ironic driver lessee change. https://review.opendev.org/c/openstack/ironic/+/889122 | 19:33 |
TheJulia | https://bugs.launchpad.net/ironic/+bug/2026757 | 20:08 |
TheJulia | original change https://review.opendev.org/c/openstack/ironic/+/888121/5 | 20:11 |
TheJulia | https://zuul.opendev.org/t/openstack/build/817a806a3c1042488f7e5b6a1d1c22b0 | 20:12 |
TheJulia | https://caf2f172638deaa44bce-d94eb819944fd5bc8105d713aef77d0b.ssl.cf1.rackcdn.com/888121/5/check/ironic-standalone-redfish/817a806/controller/logs/tempest_log.txt | 20:14 |
jfargen | I've updated the iDRAC firmware to 6.10.80.00 and still seeing the same error. | 20:21 |
jfargen | https://filebin.net/99ragngc5bdw2ljq/gvs02-bifrost.out | 20:22 |
TheJulia | https://github.com/openstack/ironic-tempest-plugin/blob/master/ironic_tempest_plugin/tests/scenario/baremetal_standalone_manager.py | 20:23 |
jfargen | If there is a specific line number I should be interested in I would appreciate the guidance. | 20:27 |
jfargen | It might have been the Redfish API Path / redfish_system_id, it has been updated, waiting for output. | 20:32 |
JayF | yeah I'm not sure jfargen, hopefully someone more dell-minded sees it | 20:35 |
opendevreview | Julia Kreger proposed openstack/ironic-tempest-plugin master: WIP: Raise an error on unknown values being filtered https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/889128 | 20:35 |
JayF | Julia and I are on a call troubleshooting something, you might be able to get her to look with patience :D | 20:35 |
TheJulia | jfargen: o/ uhh, what settings are you using? | 20:37 |
TheJulia | jfargen: looks like you nuked the file | 20:47 |
TheJulia | jfargen: I'll be out tomorrow, but can look/respond in irc | 20:55 |
JayF | https://arstechnica.com/security/2023/07/millions-of-servers-inside-data-centers-imperiled-by-flaws-in-ami-bmc-firmware/ | 21:08 |
TheJulia | well, luckily most of the server BMCs out there in prod for almost everyone are not ami then | 21:15 |
TheJulia | I can't think of the company right now though | 21:15 |
TheJulia | .. (and even then, the major vendors have their own firmware) | 21:15 |
JayF | It's an interesting article even if a little clickbaity though | 21:57 |
JayF | I think anything that raises awareness of how bad, on average, BMC firmwares are is net-good for us | 21:58 |
jfargen | Does bifrost assume pxe and how can it be set to boot images from redfish virtual media? | 22:13 |
JayF | I'm not sure what bifrost's default configuration is | 22:14 |
JayF | but I bet we have the last part documented | 22:14 |
JayF | it should be as easy as, if needed, changing the boot interface | 22:14 |
JayF | set to something like redfish_boot_vmedia (?) | 22:14 |
JayF | https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html#enabling-the-redfish-driver | 22:15 |
JayF | yeah redfish-virtual-media | 22:15 |
JayF | I'm not sure how that looks in bifrost though | 22:15 |
JayF | https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html#virtual-media-boot is a more direct link | 22:15 |
JayF | that second link, you could just see if that works | 22:15 |
JayF | and it's referenced in the instructions | 22:16 |
JayF | so the settings are plumbed through bifrost somehow | 22:16 |
JayF | my hunch would be just setting the boot_interface in the inventory files during enrollment, or running the set command from that virtual-media-boot doc | 22:17 |
TheJulia | It should just pass the values through | 23:54 |
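
Editor's note on the inventory parse failure discussed at 14:44-14:58: the missing comma after "redfish_username" can be caught before running enroll by loading the file with a strict JSON parser. The following is a minimal sketch using only Python's standard `json` module, not bifrost's own parser. The node name `testnode`, the BMC address, the credentials and the MAC are hypothetical placeholders; the `uuid` and `redfish_verify_ca` entries mirror values mentioned in the log.

```python
import json


def check_inventory(text):
    """Return None if text is valid JSON, else 'line X, column Y: message'."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno, colno and msg are attributes of json.JSONDecodeError
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Hypothetical inventory: field values are placeholders, not a real node.
GOOD = """{
  "testnode": {
    "uuid": "20000000-0000-0000-0000-000000000002",
    "driver": "redfish",
    "driver_info": {
      "redfish_address": "https://bmc.example.com",
      "redfish_username": "admin",
      "redfish_password": "secret",
      "redfish_verify_ca": false
    },
    "nics": [{"mac": "00:11:22:33:44:55"}]
  }
}"""

# Same document with the comma after "admin" removed, as in the doc bug.
BAD = GOOD.replace('"admin",', '"admin"')

if __name__ == "__main__":
    for label, text in (("good", GOOD), ("bad", BAD)):
        print(label, "->", check_inventory(text) or "valid JSON")
```

The strict parser reports the problem on the line after the missing comma, which matches the observation in the log that the failure is triggered by the line before the one reported. Running `python -m json.tool yourfile.json` from a shell gives a similar check without writing any code.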