vanou | good morning ironic | 01:08 |
---|---|---|
arne_wiebalck | Good morning vanou and Ironic! | 07:27 |
TheJulia | good morning | 07:28 |
* TheJulia misses sleep | 07:28 | |
* TheJulia is very tired | 07:28 | |
arne_wiebalck | Hey good morning, TheJulia o/ | 07:29 |
* TheJulia waves good morning to arne_wiebalck | 07:31 | |
TheJulia | arne_wiebalck: out of curiosity, at CERN, what is the average OS image size you folks are deploying to machines? | 07:31 |
arne_wiebalck | TheJulia: erm ... | 07:31 |
arne_wiebalck | *needs to check ...* | 07:31 |
* arne_wiebalck needs to check ... and is confused by different chat systems | 07:32 | |
TheJulia | heh | 07:32 |
TheJulia | confusion is a state of being | 07:32 |
TheJulia | and it is a totally valid state at that | 07:32 |
TheJulia | arne_wiebalck: specifically virtual size and actual compressed image file size if you have it | 07:33 |
arne_wiebalck | virtual size is 4G for our CentOS images, 1G for Windows | 07:37 |
arne_wiebalck | we use raw images | 07:37 |
TheJulia | Impressive | 07:37 |
TheJulia | Okay, Thanks. | 07:37 |
arne_wiebalck | np :) | 07:39 |
TheJulia | nobodycam spotted really just a horrible performance with qemu-img convert writing qcows out on nvme devices | 07:39 |
TheJulia | https://storyboard.openstack.org/#!/story/2010397 | 07:39 |
TheJulia | So I was trying to get a sense of what you folks were doing/experiencing as a data point, but with just raw images, you have none of those issues. | 07:40 |
arne_wiebalck | I *think* we had this at some point as well, qemu-img convert driving the controller OOM | 07:59 |
arne_wiebalck | (while it should not convert) | 08:00 |
arne_wiebalck | is directsync what we use in Ironic? | 08:15 |
kubajj | Good morning Ironic! | 08:17 |
arne_wiebalck | hey kubajj o/ | 08:24 |
TheJulia | arne_wiebalck: oh, we fixed that issue in ironic-lib for the most part, then again, in massive concurrency I could see it | 08:25 |
TheJulia | so... I believe o_direct is used | 08:25 |
TheJulia | I don't know what directsync is in this context | 08:25 |
TheJulia | but the tl;dr of nobodycam's issue is it is writing out zeros with o_direct which is painfully slow on nvme's since by default you can't work with the buffer in that case | 08:27 |
TheJulia | since o_direct is basically "go direct to the medium, don't cache" | 08:27 |
Nisha_Agarwal | TheJulia, GM | 08:38 |
TheJulia | good morning Nisha_Agarwal | 08:39 |
Nisha_Agarwal | Isnt it late night for u? | 08:39 |
TheJulia | I'm in Brno, CZ this week | 08:39 |
Nisha_Agarwal | :) ok | 08:39 |
TheJulia | Trying to stay awake :) | 08:39 |
Nisha_Agarwal | :) | 08:39 |
Nisha_Agarwal | TheJulia, When you get some time could you review https://review.opendev.org/c/openstack/ironic/+/860055/5...the anaconda patch | 08:40 |
TheJulia | I might not be able to this week, for what it is worth | 08:40 |
TheJulia | meetings all week | 08:40 |
Nisha_Agarwal | :) np | 08:41 |
opendevreview | Merged openstack/virtualbmc master: remove python-dev from bindep https://review.opendev.org/c/openstack/virtualbmc/+/863818 | 08:42 |
rpittau | good morning ironic! o/ | 09:01 |
rpittau | if any core has a moment please review https://review.opendev.org/c/openstack/sushy/+/863828 thanks! | 09:49 |
dtantsur | TheJulia: I was under impression that derekh has fixed the issue with zeroing in qemu-img... | 10:12 |
TheJulia | I think he partially did | 10:16 |
TheJulia | but nobodycam did mention more options/behavior | 10:16 |
derekh | TheJulia dtantsur https://storyboard.openstack.org/#!/story/2009227 | 10:16 |
dtantsur | ajya: hi! were you able to confirm if changing secure boot returns a task or any way to track the execution? | 10:20 |
derekh | The example in https://storyboard.openstack.org/#!/story/2010397 doesn't have the "-S 0" | 10:20 |
derekh | NobodyCam ^ | 10:21 |
arne_wiebalck | TheJulia: at least in fio, there is direct and sync (one is bypassing the cache, the other is sync'ing after every write I think) | 10:31 |
TheJulia | arne_wiebalck: yeah, sync is a fun one because it is device dependent, example some raid controllers treat sync as "ahh, yes, that is committed to my battery backed buffer" | 10:33 |
ajya | Hi dtantsur iDRAC returns task URI in Location header, only thing it is OEM task (same task id, different URL). From that generic task can be extracted, but it's specific code for iDRAC (sushy-oem-idrac has something like that already for one OEM endpoint). I'm asking firmware team to change this to generic task URL for future versions. | 10:33 |
TheJulia | and still needs to write that out. | 10:33 |
* TheJulia has had much fun with this over the years | 10:33 | |
dtantsur | ajya: ouch :( thank you! | 10:33 |
arne_wiebalck | TheJulia: right ... I think a sync ack *should* | 10:34 |
arne_wiebalck | guarantee persistent storage | 10:34 |
TheJulia | yup | 10:34 |
ajya | dtantsur: also I haven't got to reproducing the issue locally yet to confirm that this is the only issue, will try this week | 10:34 |
arne_wiebalck | TheJulia: all this is a constant source of interesting phenomena :) | 10:35 |
TheJulia | indeed | 10:35 |
arne_wiebalck | TheJulia: since many years, at least for me | 10:35 |
TheJulia | And why I had liquor in my office when I had data centers I could walk to | 10:36 |
arne_wiebalck | heh | 10:36 |
dtantsur | ajya: could you check if SecureBootEnable is changed instantly or only after a reboot? If the latter, we can probably try rebooting and checking the updated value. | 10:36 |
ajya | dtantsur: only after reboot, PATCHing only creates a job in iDRAC that is in Scheduled state. Only during reboot it is started. The workflow is the same as for BIOS attribute update because it is a BIOS attribute change (probably, could do the same with BIOS clean step). | 10:41 |
dtantsur | ajya: okay, so if I update the redfish code to check if the value has changed immediately, reboot if not, and check again, it will work? | 10:57 |
ajya | dtantsur: yes, GET /SecureBoot returns old value until job is finished | 11:06 |
dtantsur | ooookay, time for some ugly hacks \o/ | 11:12 |
dtantsur | thanks again ajya | 11:12 |
iurygregory | morning Ironic | 11:32 |
vanou | Hi arne_wiebalck and all | 11:35 |
opendevreview | Mike Raineri proposed openstack/ironic master: Create 'redfish' driver Redfish Interop Profile https://review.opendev.org/c/openstack/ironic/+/754061 | 11:54 |
dtantsur | ajya: how long do you think the job application can take? | 12:25 |
ajya | dtantsur: looking at the logs - around 6 mins | 12:33 |
dtantsur | wow, okay | 12:34 |
ajya | it includes rebooting system that takes time | 12:36 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] Wait for secure boot state change if it's not immediate https://review.opendev.org/c/openstack/ironic/+/863999 | 12:45 |
dtantsur | ajya: true. Could you take a quick look at the direction here ^^? | 12:45 |
ajya | dtantsur: will this reboot somehow conflict with reboot when setting the boot device and launching for pxe or vm? I think for direct deploy there is one reboot that handles both secure boot and boot device changes and then launches IPA. Now it will reboot twice? | 13:07 |
ajya | or users will have to set 0 to avoid that? | 13:08 |
ajya | What will be the flow for ramdisk deploy? | 13:08 |
dtantsur | ajya: so, I checked PXE and redfish-virtual-media, and both do the secure boot business before setting the boot device | 13:09 |
dtantsur | good question re ramdisk | 13:09 |
ajya | dtantsur: ok at least reboot will not clear boot device because it will be set later, it will only slow down direct deploy when changing secure boot as there will be 2 reboots (both roughly those 6 minutes) | 13:10 |
dtantsur | ajya: yeah, but only if you request the secure boot change | 13:12 |
ajya | yes, users maybe will need to also increase their clean, deploy timeouts because of this addition | 13:13 |
dtantsur | good point, I'll update the docs | 13:14 |
dtantsur | ramdisk deploy goes through the same prepare_instance call, which manages secure boot as one of the first things | 13:14 |
ajya | what I don't know because I haven't tried - why secure boot does not work with ramdisk - there is no reboot because system is already running? | 13:14 |
dtantsur | ajya: my only guess is that it's because BMO uses force_persistent_boot_device=Never | 13:15 |
dtantsur | which is still weird, but maybe the temporary boot device is actually reset? | 13:15 |
ajya | dtantsur: so that would happen after applying secure boot in the middle of booting? Maybe, have to check. Otherwise, the proposed solution would work, but the drawback is making things slower. However, if the boot setting is cleared in the middle of booting there is no other way around it than to reboot again | 13:29 |
ajya | maybe could do the reboot&waiting for force_persistent_boot_device=Never only but I have to confirm if that's the cause | 13:30 |
dtantsur | right... | 13:41 |
dtantsur | it's also complicated since set_secure_boot can be called from the API directly | 13:42 |
*** rcastillo|rover_ is now known as rcastillo | 13:49 | |
*** rcastillo is now known as rcastillo|rover_ | 13:49 | |
*** rcastillo|rover_ is now known as rcastillo|rover | 13:51 | |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] [PoC] A metal3 CI job https://review.opendev.org/c/openstack/ironic/+/863873 | 13:52 |
ajya | yeah, it should be self-contained unless there is a way to determine if additional reboot needed or not based on the context it is called from and other settings | 13:52 |
arozman | Hi Ironic! | 14:30 |
arozman | Hi, I have a question related to Ironic API, I have this request towards a Ironic server curl -u 'username:passwd' -X PATCH -H "Content-Type: application/json" -d '[{"op":"add","path":"/boot_interface","value":"redfish-virtual-media"}]' -k https://172.22.0.1:6385/v1/nodes/96848965-e6ef-47d6-b02a-ca2f5989e4e0 and it returns a 406 error. Is there some option in the Ironic config that I have to enable to allow me to use this | 14:56 |
arozman | endpoint? | 14:56 |
arozman | this is the error message: {"error_message": "{\"faultcode\": \"Client\", \"faultstring\": \"Request not acceptable.\", \"debuginfo\": null}"} | 14:57 |
ashinclouds[m] | arozman: ummm what is your hardware type. Btw, that should be op update | 15:04 |
ashinclouds[m] | Also, have you considered the bare metal command? | 15:05 |
JayF | I think arozman works on metal3... At least based on my interpretation of that nickname 😀 | 15:06 |
ashinclouds[m] | Should still work if you invoke it properly | 15:07 |
JayF | Yeah I'm just guessing that they're trying to figure out the actual API call... Although you're right that using the CLI with verbose will probably get you good results | 15:08 |
arozman | @JayF yeah i am from the metal3 tribe true :D but in this case it is a pure Ironic use-case . I am trying to help some folks downstream with some testing. | 15:09 |
arozman | @JayF, ashinclouds[m] Thanks I was looking for this info! I tried op: replace and add because those were present here https://docs.openstack.org/api-ref/baremetal/?expanded=change-node-boot-mode-detail,update-node-detail,create-node-detail | 15:10 |
arozman | but then I will do the testing first with the baremetal cli | 15:10 |
arozman | and also I will try op: update, thanks a lot ! | 15:10 |
dtantsur | arozman: you're missing the correct API version | 15:30 |
dtantsur | arozman: 1.31 in your case: https://docs.openstack.org/ironic/latest/contributor/webapi-version-history.html#ocata-7-0-0 | 15:30 |
dtantsur | it's provided via X-OpenStack-Ironic-Api-Version header | 15:30 |
arozman | @dtantsur Thank you!!! | 15:30 |
dtantsur | but yeah, I'd recommend you rather use the client (we even have the ironic-client container!) | 15:31 |
arozman | yeah I will use the client for debugging that will help me a lot, but downstream folks will also develop Ironic driver for some closed source infra management service so they also need to know how the raw requests fit together, so thanks again! | 15:33 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] [PoC] A metal3 CI job https://review.opendev.org/c/openstack/ironic/+/863873 | 15:43 |
JayF | arozman: we do have the ironic-staging-drivers OSS repo which stuff like that can live if desired | 15:45 |
JayF | arozman: https://opendev.org/x/ironic-staging-drivers if whoever is developing that driver for a closed source system wants to share the code | 15:46 |
arozman | @JayF Thanks I would prefer everything to be OSS but even the service is not public even I am not allowed to know how does it work. It lives inside the belly of the megacorp :D | 15:47 |
JayF | heh, got it :) | 15:47 |
JayF | arozman: you are from metal3, right? | 15:47 |
arozman | yes | 15:47 |
JayF | I remember your name from that slack channel | 15:47 |
JayF | okay, good stuff :) | 15:47 |
dtantsur | JayF: JFYI there are weekly metal3 meetings on zoom, Wed 14:00 UTC | 15:50 |
JayF | if that was not 6am local time, I might would kibitz it. I'm not sure I'd provide enough value to be worth taking that early of a morning :D | 15:51 |
JayF | but I'm always happy to help or attend if there's something specific I can do | 15:51 |
arozman | @JayF well in that case may I offer you https://www.youtube.com/channel/UC_xneeYbo-Dl4g-U78xW15g | 15:52 |
JayF | I see your youtube link, and raise you a youtube live stream, my OSS Office Hours, starting in 7 minutes https://youtube.com/jayofdoom | 15:53 |
arozman | I fold, well then at least I have now something to listen to while I am working | 15:54 |
JayF | arozman: knikolla[m]: I think I fixed it :| | 16:15 |
arozman | okay | 16:16 |
rpittau | bye o/ | 17:16 |
opendevreview | Merged openstack/sushy master: Increase server side retries https://review.opendev.org/c/openstack/sushy/+/863828 | 17:33 |
opendevreview | Jakub Jelinek proposed openstack/ironic master: Implements node inventory: database https://review.opendev.org/c/openstack/ironic/+/862569 | 17:50 |
kubajj | TheJulia: dtantsur: It seems that the mysql_as_long causes issues because it is LONGTEXT for MySQL and TEXT for PostgreSQL | 18:13 |
TheJulia | kubajj: if I remember correctly, text in Postgres is not size constrained | 18:34 |
kubajj | TheJulia: does it make sense to split the asserts for the two fields into the TestMigrationsMySQL and TestMigrationsPostgresSQL? | 18:39 |
TheJulia | kubajj: yes, I think there is another field that needs that already | 18:40 |
opendevreview | Jakub Jelinek proposed openstack/ironic master: Implements node inventory: database https://review.opendev.org/c/openstack/ironic/+/862569 | 18:44 |
opendevreview | Jakub Jelinek proposed openstack/ironic master: Implements node inventory: database https://review.opendev.org/c/openstack/ironic/+/862569 | 18:45 |
opendevreview | Jakub Jelinek proposed openstack/ironic master: WIP: Get inventory from Inspector https://review.opendev.org/c/openstack/ironic/+/864057 | 19:15 |
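
Note on the 406 discussed at 14:56-15:30: per dtantsur, the raw request was missing the `X-OpenStack-Ironic-Api-Version` header (1.31 for `boot_interface`). The sketch below only builds the request pieces so they can be inspected; it does not send anything. The endpoint and node UUID are copied from the curl command in the log. The helper name `build_boot_interface_patch` is hypothetical, and the `replace` op is an assumption based on the API reference arozman linked, not something confirmed in the channel.

```python
import json

# Endpoint taken from the curl command in the log.
IRONIC_URL = "https://172.22.0.1:6385"


def build_boot_interface_patch(node_uuid, boot_interface, api_version="1.31"):
    """Return (method, url, headers, body) for changing a node's boot interface.

    Hypothetical helper for illustration; it is not part of any Ironic client.
    """
    url = f"{IRONIC_URL}/v1/nodes/{node_uuid}"
    headers = {
        "Content-Type": "application/json",
        # The piece missing from the original curl call: ask for the API
        # version that exposes boot_interface (1.31 according to the log).
        "X-OpenStack-Ironic-Api-Version": api_version,
    }
    # JSON Patch document; "replace" is assumed here (see the note above).
    patch = [{"op": "replace", "path": "/boot_interface", "value": boot_interface}]
    return "PATCH", url, headers, json.dumps(patch).encode("utf-8")


method, url, headers, body = build_boot_interface_patch(
    "96848965-e6ef-47d6-b02a-ca2f5989e4e0", "redfish-virtual-media"
)
print(method, url)
print(headers)
print(body.decode("utf-8"))
```

Authentication (the `-u 'username:passwd'` part of the curl call) is left out on purpose; any HTTP client can send the four returned pieces once credentials are added.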
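
Note on the secure boot thread (10:20-13:52): the flow dtantsur describes is "set the value, check whether it changed immediately, reboot if not, then check again", with ajya reporting that the iDRAC job took around 6 minutes in their logs. A minimal sketch of that loop follows. It is not the code from the WIP patch; `set_secure_boot_and_wait`, `FakeBMC`, and the `timeout`/`interval` defaults are all hypothetical and exist only to make the flow runnable.

```python
import time


def set_secure_boot_and_wait(get_state, set_state, reboot, target,
                             timeout=600, interval=10,
                             sleep=time.sleep, clock=time.monotonic):
    """Request a secure boot state; reboot and poll if it is not immediate.

    Returns False if the change applied at once, True if a reboot was needed.
    Raises TimeoutError if the state never changes within `timeout` seconds.
    """
    set_state(target)
    if get_state() == target:
        return False  # BMC applied the change immediately, no reboot needed
    reboot()  # the pending job only starts during a reboot
    deadline = clock() + timeout
    while clock() < deadline:
        if get_state() == target:
            return True
        sleep(interval)
    raise TimeoutError("secure boot state did not change in time")


class FakeBMC:
    """Stand-in BMC: reports the old value until a few polls after a reboot."""

    def __init__(self, polls_until_applied=3):
        self.state = False
        self.pending = None
        self.polls_left = None
        self.reboots = 0
        self._polls_until_applied = polls_until_applied

    def get_state(self):
        if self.polls_left is not None:
            if self.polls_left == 0:
                self.state, self.pending, self.polls_left = self.pending, None, None
            else:
                self.polls_left -= 1
        return self.state

    def set_state(self, value):
        self.pending = value  # only schedules the change, like the iDRAC job

    def reboot(self):
        self.reboots += 1
        if self.pending is not None:
            self.polls_left = self._polls_until_applied


bmc = FakeBMC()
needed_reboot = set_secure_boot_and_wait(
    bmc.get_state, bmc.set_state, bmc.reboot, True, sleep=lambda _: None
)
print(needed_reboot, bmc.reboots, bmc.state)
```

Keeping the helper self-contained matches ajya's point at 13:52: because `set_secure_boot` can also be called from the API directly, the function cannot rely on a later reboot happening elsewhere in the deploy flow.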