*** zzzeek has quit IRC | 00:08 | |
*** zzzeek has joined #openstack-ironic | 00:10 | |
zer0c00l | the prepare_ramdisk() and prepare_instance() methods kind of look similar. Wonder when each of these methods is called? | 00:25 |
---|---|---|
zer0c00l | is prepare_ramdisk called during "ironic only" deployment? | 00:25 |
zer0c00l | prepare_instance() is called when nova calls ironic to do its job? | 00:26 |
zer0c00l | or an instance boot is involved? | 00:26 |
*** tosky has quit IRC | 00:36 | |
TheJulia | zer0c00l: prepare_ramdisk is basically when network booting is in the mix, and it is the default, whereas prepare_instance is when the machine itself is booting: it may be a network boot by default, or it may be a local boot | 00:48 |
TheJulia | preferably local | 00:48 |
zer0c00l | TheJulia: ack. Thanks. | 02:02 |
*** tzumainn has quit IRC | 02:18 | |
*** rcernin has quit IRC | 02:19 | |
*** rcernin has joined #openstack-ironic | 02:33 | |
*** rcernin has quit IRC | 02:42 | |
*** rcernin has joined #openstack-ironic | 02:42 | |
*** buhman_ is now known as buhman | 02:49 | |
*** uzumaki has quit IRC | 03:11 | |
*** HardCase has joined #openstack-ironic | 03:41 | |
HardCase | Hi all, question, is it possible to introspect a node that has already been deployed? Lost my introspection data and need to rebuild it so I can use NodeDataLookup | 03:42 |
*** alexmcleod has quit IRC | 03:42 | |
*** zzzeek has quit IRC | 03:57 | |
*** zzzeek has joined #openstack-ironic | 03:58 | |
*** k_mouza has joined #openstack-ironic | 04:06 | |
*** k_mouza has quit IRC | 04:11 | |
*** uzumaki has joined #openstack-ironic | 04:11 | |
*** gyee has quit IRC | 04:30 | |
*** zzzeek has quit IRC | 04:43 | |
*** zzzeek has joined #openstack-ironic | 04:45 | |
*** k_mouza has joined #openstack-ironic | 04:54 | |
*** k_mouza has quit IRC | 04:58 | |
*** zzzeek has quit IRC | 05:15 | |
*** zzzeek has joined #openstack-ironic | 05:16 | |
*** HardCase has quit IRC | 06:01 | |
*** zzzeek has quit IRC | 06:03 | |
*** zzzeek has joined #openstack-ironic | 06:04 | |
*** k_mouza has joined #openstack-ironic | 06:07 | |
*** bburns_ has quit IRC | 06:07 | |
*** k_mouza has quit IRC | 06:11 | |
*** zzzeek has quit IRC | 06:13 | |
*** zzzeek has joined #openstack-ironic | 06:14 | |
*** zzzeek has quit IRC | 06:24 | |
*** zzzeek has joined #openstack-ironic | 06:30 | |
*** bburns has joined #openstack-ironic | 06:31 | |
*** zzzeek has quit IRC | 06:37 | |
*** zzzeek has joined #openstack-ironic | 06:37 | |
*** zzzeek has quit IRC | 07:01 | |
*** zzzeek has joined #openstack-ironic | 07:02 | |
*** anuradha1904 has joined #openstack-ironic | 07:02 | |
*** moshiur has joined #openstack-ironic | 07:04 | |
*** rcernin has quit IRC | 07:37 | |
arne_wiebalck | Good morning, ironic! | 07:42 |
*** zzzeek has quit IRC | 07:44 | |
*** zzzeek has joined #openstack-ironic | 07:46 | |
arne_wiebalck | HardCase: Yes, there is a feature called 'active inspection'. It allows running the inspector from within the physical instance. | 07:48 |
*** rpittau|afk is now known as rpittau | 08:02 | |
rpittau | good morning ironic! o/ | 08:02 |
rpittau | happy Friday! | 08:02 |
*** zzzeek has quit IRC | 08:05 | |
*** zzzeek has joined #openstack-ironic | 08:07 | |
*** rcernin has joined #openstack-ironic | 08:18 | |
*** monica_pardhi has joined #openstack-ironic | 08:19 | |
*** tosky has joined #openstack-ironic | 08:24 | |
*** zzzeek has quit IRC | 08:43 | |
*** zzzeek has joined #openstack-ironic | 08:46 | |
*** monica_pardhi has quit IRC | 08:46 | |
*** zzzeek has quit IRC | 08:55 | |
*** zzzeek has joined #openstack-ironic | 08:56 | |
*** zzzeek has quit IRC | 09:05 | |
*** zzzeek has joined #openstack-ironic | 09:07 | |
*** lucasagomes has joined #openstack-ironic | 09:13 | |
*** zzzeek has quit IRC | 09:19 | |
*** zzzeek has joined #openstack-ironic | 09:21 | |
*** tosin has joined #openstack-ironic | 09:25 | |
*** ociuhandu has joined #openstack-ironic | 09:26 | |
*** derekh has joined #openstack-ironic | 09:32 | |
janders | good morning rpittau and arne_wiebalck o/ | 09:35 |
rpittau | hey janders :) | 09:35 |
*** uzumaki has quit IRC | 09:44 | |
*** uzumaki has joined #openstack-ironic | 09:44 | |
*** ociuhandu has quit IRC | 09:45 | |
*** ociuhandu has joined #openstack-ironic | 09:50 | |
*** rcernin has quit IRC | 09:55 | |
janders | https://bluejeans.com/772893798 SPUC anyone? :) | 09:57 |
*** ociuhandu has quit IRC | 10:02 | |
*** k_mouza has joined #openstack-ironic | 10:02 | |
*** rcernin has joined #openstack-ironic | 10:09 | |
*** ociuhandu has joined #openstack-ironic | 10:11 | |
*** ociuhandu has quit IRC | 10:11 | |
*** ociuhandu has joined #openstack-ironic | 10:18 | |
*** rcernin has quit IRC | 10:23 | |
*** dougsz has joined #openstack-ironic | 10:28 | |
*** strigazi has joined #openstack-ironic | 10:30 | |
*** k_mouza has quit IRC | 10:56 | |
*** k_mouza has joined #openstack-ironic | 10:56 | |
*** k_mouza has quit IRC | 11:02 | |
iurygregory | good morning Ironic! happy friday | 11:09 |
* iurygregory grabs more coffee | 11:09 | |
*** ociuhandu has quit IRC | 11:39 | |
*** sshnaidm|afk has quit IRC | 11:41 | |
*** dtantsur|afk is now known as dtantsur | 11:41 | |
dtantsur | morning/afternoon ironic | 11:41 |
*** anuradha1904 has quit IRC | 11:41 | |
dtantsur | janders: sorry, had to miss SPUC because of a doctor apt | 11:41 |
*** ociuhandu has joined #openstack-ironic | 11:42 | |
*** zzzeek has quit IRC | 11:42 | |
iurygregory | morning dtantsur (I hope you are well) | 11:44 |
*** zzzeek has joined #openstack-ironic | 11:45 | |
*** rcernin has joined #openstack-ironic | 11:48 | |
dtantsur | routine stuff, nothing to worry about | 11:49 |
*** sshnaidm|afk has joined #openstack-ironic | 11:49 | |
iurygregory | awesome =) | 11:50 |
*** sshnaidm|afk is now known as sshnaidm|off | 11:50 | |
*** k_mouza has joined #openstack-ironic | 11:53 | |
*** ociuhandu has quit IRC | 11:54 | |
janders | hey iurygregory o/ | 11:55 |
iurygregory | hey janders o/ | 11:55 |
janders | dtantsur no worries! first things first! | 11:55 |
janders | I barely made it to the SPUC myself, storm clouds beat me this time, got pretty wet on the bike | 11:56 |
janders | dtantsur Thank you for reviewing my NVMe patch. I have a question regarding https://review.opendev.org/c/openstack/ironic-python-agent/+/771904/27/ironic_python_agent/hardware.py#1669 (using json output). I actually wanted to do this first, but the json format seems really weird: | 11:59 |
janders | http://paste.openstack.org/show/802599/ | 11:59 |
janders | I would need to figure out how to parse this (I suppose it's a decimal representation of bits indicating different capabilities). This can be done but my other concern is if some NVMe manufacturers don't stick with the spec, my interpretation of these numbers can be completely off | 12:00 |
janders | back in the time when I was trying to support NVMe sanitize in addition to format I learned that some vendors use the designated fields in their sanitize-logs, while others use reserved fields (!) so that the output makes no sense | 12:01 |
* iurygregory is trying to understand why the json format is weird | 12:01 | |
janders | long story short: would you like me to try to figure out how to drive the json output given all this? I ended up just parsing the plain output because it did seem easier and I've seen it done with other tools (e.g. hdparm) | 12:03 |
rpittau | janders, dtantsur: about the json output, just please check from which version that's supported, I think I didn't mention that before because it was not compatible with all the OSes we test with, but I may have made a mistake | 12:06 |
* rpittau goes to lunch | 12:07 | |
*** rcernin has quit IRC | 12:08 | |
*** alexmcleod has joined #openstack-ironic | 12:08 | |
dtantsur | janders: I would expect the textual form to be derived from these numbers anyway | 12:13 |
dtantsur | rpittau: at least centos 8 version supports it | 12:13 |
dtantsur | janders: https://gist.github.com/dtantsur/f5200859799d29f0296005580bbe7b1e | 12:14 |
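For readers without access to the paste or the gist, here is a minimal sketch of what decoding those decimal values could look like. It is not the code from the gist. It assumes the JSON from `nvme id-ctrl -o json` exposes `fna` and `sanicap` as integers, and it uses the bit positions as I read them from the NVMe base specification; the capability names and the sample values are made up for illustration.

```python
import json

# Bit positions as read from the NVMe base specification (Identify Controller).
# Assumption: verify against the spec revision you target before relying on it.
FNA_BITS = {
    0: "format_applies_to_all_namespaces",
    1: "secure_erase_applies_to_all_namespaces",
    2: "crypto_erase_supported",
}
SANICAP_BITS = {
    0: "crypto_erase",
    1: "block_erase",
    2: "overwrite",
}


def decode_bits(value, names):
    """Return {name: bool} for every known bit of the integer value."""
    return {name: bool((value >> bit) & 1) for bit, name in names.items()}


def parse_id_ctrl(raw_json):
    """Decode the FNA and SANICAP fields from id-ctrl style JSON text."""
    data = json.loads(raw_json)
    return {
        "fna": decode_bits(int(data.get("fna", 0)), FNA_BITS),
        "sanicap": decode_bits(int(data.get("sanicap", 0)), SANICAP_BITS),
    }


# Made-up sample values, not taken from the paste above.
sample = '{"fna": 4, "sanicap": 3}'
print(parse_id_ctrl(sample))
```

Unknown higher bits are ignored on purpose, which is one way to limit the damage when a vendor sets reserved bits.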
janders | dtantsur nicely done! | 12:19 |
dtantsur | yeah, I wrote it when we first discussed NVMe cleaning.. not sure why I did not share with you, maybe just forgot? | 12:20 |
janders | I'm happy to switch the cleaning code to this, will just re-read the relevant part of NVMe spec to make sure there are no catches | 12:21 |
dtantsur | sure | 12:21 |
janders | the catches I was concerned about were around sanitize-log | 12:21 |
dtantsur | but I think it did it based on the specs | 12:22 |
*** rcernin has joined #openstack-ironic | 12:22 | |
janders | secure format should be less problematic | 12:22 |
janders | as there is no need to check logs prior to / after cleaning | 12:22 |
janders | (I think :) ) | 12:22 |
janders | thank you dtantsur! :) | 12:23 |
dtantsur | np | 12:23 |
openstackgerrit | Aija Jauntēva proposed openstack/sushy master: Refactor TaskMonitor and update Volume methods https://review.opendev.org/c/openstack/sushy/+/774532 | 12:26 |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic-python-agent stable/victoria: Fix error message with UEFI-incompatible images https://review.opendev.org/c/openstack/ironic-python-agent/+/775340 | 12:37 |
dtantsur | okay, where do we stand with the releases? | 12:38 |
*** ociuhandu has joined #openstack-ironic | 12:42 | |
iurygregory | dtantsur, I've requested bifrost yesterday | 12:43 |
dtantsur | great, thank you! I suspect we can request IPA already (will check ironic in a few) | 12:44 |
iurygregory | any other we are ready I can request | 12:44 |
iurygregory | ack will push IPA | 12:44 |
iurygregory | dtantsur, ipa https://review.opendev.org/c/openstack/releases/+/775378 | 12:52 |
dtantsur | ++ | 12:52 |
dtantsur | iurygregory: ironic should be ready as well, I think | 12:53 |
iurygregory | dtantsur, I can request in about 39min (getting ready for my 1:1) =) | 12:53 |
*** rcernin has quit IRC | 12:53 | |
iurygregory | if we are not in a hurry =) | 12:54 |
*** anuradha1904 has joined #openstack-ironic | 12:54 | |
*** ociuhandu has quit IRC | 12:55 | |
dtantsur | not at all | 12:56 |
janders | see you on Monday Ironic | 12:59 |
janders | have a great weekend everyone o/ | 12:59 |
*** ociuhandu has joined #openstack-ironic | 13:00 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: Remove samples from the hardware test module https://review.opendev.org/c/openstack/ironic-python-agent/+/775163 | 13:02 |
*** k_mouza has quit IRC | 13:04 | |
dtantsur | you too janders | 13:05 |
rpittau | bye janders :) | 13:05 |
*** ociuhandu has quit IRC | 13:07 | |
*** ociuhandu has joined #openstack-ironic | 13:11 | |
*** ociuhandu has quit IRC | 13:15 | |
openstackgerrit | Merged openstack/ironic master: Populate existing policy tests https://review.opendev.org/c/openstack/ironic/+/768136 | 13:20 |
openstackgerrit | Merged openstack/ironic master: Duplicate testing for system scoped ACL testing https://review.opendev.org/c/openstack/ironic/+/770002 | 13:21 |
*** k_mouza has joined #openstack-ironic | 13:21 | |
dtantsur | iurygregory: with ironic, let's probably wait for https://review.opendev.org/c/openstack/ironic/+/768353 | 13:23 |
dtantsur | this is a great addition for this release | 13:23 |
dtantsur | ajya++ | 13:23 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: [WIP] Move some raid specific functions to raid_utils https://review.opendev.org/c/openstack/ironic-python-agent/+/774854 | 13:23 |
*** moshiur has quit IRC | 13:27 | |
*** rcernin has joined #openstack-ironic | 13:27 | |
iurygregory | dtantsur, ack! | 13:32 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: Replace retrying with tenacity https://review.opendev.org/c/openstack/ironic/+/376574 | 13:40 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: Replace retrying with tenacity https://review.opendev.org/c/openstack/ironic/+/376574 | 13:41 |
*** moshiur has joined #openstack-ironic | 13:45 | |
*** zzzeek has quit IRC | 13:48 | |
*** zzzeek has joined #openstack-ironic | 13:49 | |
*** zzzeek has quit IRC | 13:59 | |
dtantsur | rpittau: this ^^ doesn't seem to fix the unit tests issue. or does it? | 13:59 |
*** ociuhandu has joined #openstack-ironic | 13:59 | |
rpittau | dtantsur: mmm no, I ignored the first part of your comment for some reason | 14:00 |
*** rloo has joined #openstack-ironic | 14:03 | |
*** zzzeek has joined #openstack-ironic | 14:04 | |
*** ociuhandu has quit IRC | 14:04 | |
*** lmcgann has joined #openstack-ironic | 14:08 | |
*** ociuhandu has joined #openstack-ironic | 14:08 | |
*** bburns is now known as bburns_afk | 14:08 | |
dtantsur | heh | 14:10 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: Remove samples from the hardware test module https://review.opendev.org/c/openstack/ironic-python-agent/+/775163 | 14:11 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: [WIP] convert lsblk output to json https://review.opendev.org/c/openstack/ironic-python-agent/+/775391 | 14:11 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: [WIP] Use json for lsblk output https://review.opendev.org/c/openstack/ironic-python-agent/+/775391 | 14:11 |
dtantsur | ajya: hi, do you plan on adding deploy_steps support to openstacksdk | 14:12 |
dtantsur | ? | 14:12 |
*** ociuhandu has quit IRC | 14:13 | |
ajya | dtantsur: I can add, will create a task to deploy-steps story | 14:16 |
dtantsur | thanks! | 14:16 |
*** ociuhandu has joined #openstack-ironic | 14:26 | |
dtantsur | ajya: do you know if it works when you have the same step in a template and explicitly? say, with different priority/arguments? | 14:29 |
*** rcernin has quit IRC | 14:30 | |
TheJulia | good morning | 14:30 |
dtantsur | morning TheJulia | 14:30 |
ajya | dtantsur: explicit step overrides template step and has whatever is in explicit step | 14:31 |
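The override rule ajya describes can be sketched roughly as follows. This is an illustration of the behaviour, not Ironic's implementation: `merge_steps` is a hypothetical helper, and it assumes each step is a dict with `interface`, `step`, `args` and `priority` keys.

```python
def merge_steps(template_steps, explicit_steps):
    """Merge deploy steps; an explicit step replaces the template step
    that has the same (interface, step) pair, priority and args included."""
    merged = {(s["interface"], s["step"]): s for s in template_steps}
    for s in explicit_steps:
        merged[(s["interface"], s["step"])] = s
    # Higher priority runs first.
    return sorted(merged.values(), key=lambda s: s["priority"], reverse=True)


template = [
    {"interface": "bios", "step": "apply_configuration",
     "args": {"settings": []}, "priority": 150},
    {"interface": "raid", "step": "apply_configuration",
     "args": {}, "priority": 100},
]
explicit = [
    {"interface": "bios", "step": "apply_configuration",
     "args": {"settings": [{"name": "x", "value": "y"}]}, "priority": 10},
]
for step in merge_steps(template, explicit):
    print(step["interface"], step["step"], step["priority"])
```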
*** ociuhandu has quit IRC | 14:31 | |
ajya | morning TheJulia | 14:31 |
dtantsur | ajya: nice! | 14:31 |
iurygregory | good morning TheJulia =) | 14:37 |
*** moshiur has quit IRC | 14:37 | |
*** tzumainn has joined #openstack-ironic | 14:43 | |
*** ociuhandu has joined #openstack-ironic | 14:43 | |
*** ociuhandu has quit IRC | 14:43 | |
*** ociuhandu has joined #openstack-ironic | 14:43 | |
dtantsur | seeking opinions on https://storyboard.openstack.org/#!/story/2008611 wrt location specification. what is better: | 14:44 |
dtantsur | 1) "/dev/disk/by-label/boot:/some/location": "<content>" | 14:44 |
dtantsur | 2) "/some/location": {"on": "/dev/disk/by-label/boot", "data": "<content>"} | 14:45 |
dtantsur | ? | 14:45 |
openstackgerrit | Verification of a change to openstack/ironic failed: Guard conductor from consuming all of the ram https://review.opendev.org/c/openstack/ironic/+/726483 | 14:45 |
openstackgerrit | Verification of a change to openstack/ironic failed: Set default to prevent out of memory conditions https://review.opendev.org/c/openstack/ironic/+/763107 | 14:45 |
dtantsur | The latter is more explicit, the former is shorter (and thus more convenient for CLI) | 14:45 |
dtantsur | or even 3) [ {"path": "/some/location", "on": "/dev/disk/by-label/boot", "data": "<content>"} ] | 14:52 |
iurygregory | I would vote for 2 and 3 just because they are more explicit | 14:57 |
dtantsur | and between those? | 14:57 |
dtantsur | well, #3 is probably easier to use for statically typed languages | 14:58 |
iurygregory | 3 | 14:58 |
rpittau | I would say 3 as well | 14:58 |
* dtantsur has Go and Rust in mind | 14:58 | |
dtantsur | thanks guys, I think I'm learning towards #3 now too | 14:58 |
dtantsur | learning.. leaning | 14:58 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Introduce common personas for secure RBAC https://review.opendev.org/c/openstack/ironic/+/763255 | 14:59 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Address some rbac review feedback in merged patches https://review.opendev.org/c/openstack/ironic/+/775399 | 14:59 |
*** MentalSiege has joined #openstack-ironic | 15:01 | |
rpittau | bye everyone, have a great weekend! o/ | 15:03 |
*** rpittau is now known as rpittau|afk | 15:03 | |
TheJulia | dtantsur: file path as key sounds like a good idea, data content makes sense, permissions is missing in that we should be able to assert a numeric gid/uid as well as basic unix perms | 15:03 |
TheJulia | Only even suggesting it because some stuff evaluates permissions as first class protection against the machine having been compromised, if the file is too wide open it goes "all the nopes" | 15:04 |
dtantsur | wdyt about allowing URLs for content? | 15:04 |
TheJulia | sounds good to me | 15:04 |
dtantsur | yeah, that's why I had to add `mode`, but good call re owner. | 15:04 |
TheJulia | yeah, it bit me relatively recently :( | 15:05 |
TheJulia | also, folders | 15:05 |
TheJulia | or maybe folders are out of scope? | 15:05 |
TheJulia | or just get the same treatment | 15:05 |
TheJulia | I dunno | 15:05 |
TheJulia | "hi, uncompress this tarball on / | 15:06 |
TheJulia | " | 15:06 |
TheJulia | kthxbai | 15:06 |
dtantsur | directories are automatically created, dirmode is passed to makedirs | 15:06 |
TheJulia | \o/ | 15:07 |
*** MentalSiege has quit IRC | 15:07 | |
dtantsur | https://storyboard.openstack.org/#!/story/2008611 updated, PTAL | 15:07 |
TheJulia | one minor thing, owner/group should be numeric most likely | 15:07 |
TheJulia | unless the actions are in the chroot of the OS that has been written to disk | 15:08 |
dtantsur | yeah, we probably cannot resolve names | 15:08 |
TheJulia | exactly | 15:08 |
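A rough sketch of how one entry of option 3 might be written to disk, pulling together the points above (numeric owner/group, `mode` for the file, `dirmode` passed to makedirs). It is not the code from the eventual patch: `inject_file` is a hypothetical helper, `root` stands in for wherever the device named by `on` is mounted, the default modes are my choice, and `data` is assumed to be plain text.

```python
import os


def inject_file(root, entry):
    """Write one file entry below root (the mounted target partition)."""
    owner, group = entry.get("owner"), entry.get("group")
    for value in (owner, group):
        # Names cannot be resolved outside the image's chroot, so insist on IDs.
        if value is not None and not isinstance(value, int):
            raise ValueError("owner and group must be numeric")

    dest = os.path.join(root, entry["path"].lstrip("/"))
    # Missing parent directories are created, with dirmode passed to makedirs.
    os.makedirs(os.path.dirname(dest), mode=entry.get("dirmode", 0o755),
                exist_ok=True)
    with open(dest, "w") as f:
        f.write(entry["data"])
    os.chmod(dest, entry.get("mode", 0o644))
    if owner is not None or group is not None:
        # -1 leaves the corresponding ID unchanged.
        os.chown(dest, -1 if owner is None else owner,
                 -1 if group is None else group)
    return dest
```

Called with something like `inject_file("/mnt/boot", {"path": "/some/location", "data": "<content>", "mode": 0o600})`, it returns the path it wrote.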
* TheJulia goes back to rbac | 15:08 | |
*** MentalSiege has joined #openstack-ironic | 15:09 | |
dtantsur | iurygregory: please take another look if you have a minute | 15:09 |
iurygregory | dtantsur, sure | 15:11 |
openstackgerrit | Verification of a change to openstack/ironic failed: Guard conductor from consuming all of the ram https://review.opendev.org/c/openstack/ironic/+/726483 | 15:15 |
TheJulia | :( | 15:15 |
dtantsur | for some reason the bot started reporting each failure twice | 15:21 |
TheJulia | Out of curiosity, what are folks seeing on average for unit test runs these days? | 15:29 |
dtantsur | what = ? | 15:29 |
TheJulia | seconds | 15:30 |
TheJulia | runtime seconds | 15:30 |
dtantsur | I nearly never run the whole suite | 15:30 |
TheJulia | I typically see ~80-83 unless I'm streaming the news, in which case it is more like 110 seconds | 15:30 |
dtantsur | the last time I tried it was something around 120-150 seconds | 15:30 |
TheJulia | my old work laptop was running like 200+ seconds | 15:31 |
TheJulia | I guess that means my desktop is still hanging in there | 15:32 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement "system" scoped RBAC for the node endpoint https://review.opendev.org/c/openstack/ironic/+/763257 | 15:36 |
*** ociuhandu has quit IRC | 15:40 | |
*** lbragstad_ has joined #openstack-ironic | 15:46 | |
*** lbragstad has quit IRC | 15:50 | |
iurygregory | dtantsur, lgtm the RFE | 15:54 |
dtantsur | thx! | 15:54 |
openstackgerrit | Merged openstack/ironic master: Add 'deploy steps' parameter for provisioning API https://review.opendev.org/c/openstack/ironic/+/768353 | 16:01 |
openstackgerrit | Merged openstack/python-ironicclient master: Add 'deploy steps' for provisioning API https://review.opendev.org/c/openstack/python-ironicclient/+/768354 | 16:02 |
openstackgerrit | Merged openstack/ironic-inspector stable/train: Remove grenade jobs from old stable branches https://review.opendev.org/c/openstack/ironic-inspector/+/773332 | 16:02 |
dtantsur | w00t, deploy steps! | 16:02 |
ajya | \o/ | 16:03 |
*** MentalSiege has left #openstack-ironic | 16:04 | |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic master: Trivial: update version for deploy steps https://review.opendev.org/c/openstack/ironic/+/775408 | 16:04 |
dtantsur | could I get a quick approval ^^ iurygregory, TheJulia | 16:04 |
*** ociuhandu has joined #openstack-ironic | 16:04 | |
*** ociuhandu has quit IRC | 16:10 | |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic master: Move the IPv6 job to the experimental pipeline https://review.opendev.org/c/openstack/ironic/+/775410 | 16:10 |
openstackgerrit | Merged openstack/ironic-python-agent stable/victoria: Fix error message with UEFI-incompatible images https://review.opendev.org/c/openstack/ironic-python-agent/+/775340 | 16:12 |
TheJulia | approved | 16:12 |
* TheJulia awaits patches | 16:12 | |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement "system" scoped RBAC for ports https://review.opendev.org/c/openstack/ironic/+/763267 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for port groups https://review.opendev.org/c/openstack/ironic/+/763268 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for chassis https://review.opendev.org/c/openstack/ironic/+/763269 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for baremetal drivers https://review.opendev.org/c/openstack/ironic/+/763270 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for node and driver passthru https://review.opendev.org/c/openstack/ironic/+/763271 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for utility APIs https://review.opendev.org/c/openstack/ironic/+/763272 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for volume APIs https://review.opendev.org/c/openstack/ironic/+/763273 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for conductor APIs https://review.opendev.org/c/openstack/ironic/+/763274 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for the allocation APIs https://review.opendev.org/c/openstack/ironic/+/763275 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for the event APIs https://review.opendev.org/c/openstack/ironic/+/763276 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Implement system scoped RBAC for the deploy templates APIs https://review.opendev.org/c/openstack/ironic/+/763277 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: RBAC System Scope: observer -> reader https://review.opendev.org/c/openstack/ironic/+/772450 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Initial Project scoped tests https://review.opendev.org/c/openstack/ironic/+/772451 | 16:13 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Project Scoping Node endpoint https://review.opendev.org/c/openstack/ironic/+/773924 | 16:13 |
TheJulia | kaboom | 16:13 |
dtantsur | boom! | 16:13 |
dtantsur | :) | 16:13 |
TheJulia | I had to do a minor unit test fix, because an earlier change caused the additional tests to be checked, and the filter view changes slightly if scopes are enabled... so the last patch in that chain now passes unit testing | 16:14 |
*** ociuhandu has joined #openstack-ironic | 16:24 | |
*** lbragstad_ is now known as lbragstad | 16:25 | |
*** k_mouza has quit IRC | 16:26 | |
*** uzumaki has quit IRC | 16:27 | |
*** ociuhandu has quit IRC | 16:31 | |
*** ociuhandu has joined #openstack-ironic | 16:35 | |
*** uzumaki has joined #openstack-ironic | 16:45 | |
openstackgerrit | Dmitry Tantsur proposed openstack/python-ironicclient master: Add missing unit tests for provision state commands https://review.opendev.org/c/openstack/python-ironicclient/+/377607 | 16:46 |
*** ociuhandu has quit IRC | 16:48 | |
*** ociuhandu has joined #openstack-ironic | 16:49 | |
*** ociuhandu has quit IRC | 16:51 | |
*** ociuhandu has joined #openstack-ironic | 16:52 | |
*** ociuhandu has quit IRC | 16:52 | |
*** ociuhandu has joined #openstack-ironic | 16:53 | |
*** tosin has quit IRC | 16:54 | |
*** ociuhandu has quit IRC | 16:58 | |
*** ociuhandu has joined #openstack-ironic | 17:00 | |
*** lucasagomes has quit IRC | 17:00 | |
*** tosin has joined #openstack-ironic | 17:02 | |
TheJulia | spuc? | 17:03 |
iurygregory | not in the mood for spuc today .-. | 17:05 |
*** ociuhandu has quit IRC | 17:05 | |
iurygregory | sorry =( | 17:05 |
*** ociuhandu has joined #openstack-ironic | 17:05 | |
TheJulia | :( | 17:06 |
TheJulia | it happens, hopefully everything is okay | 17:06 |
iurygregory | my mother is not feeling very well, I will probably take her to the doctor (I hope it's not covid .-.) | 17:07 |
TheJulia | *hugs* | 17:08 |
iurygregory | ty =) | 17:08 |
dtantsur | iurygregory: best wishes! | 17:12 |
iurygregory | dtantsur, ty! | 17:12 |
arne_wiebalck | cannot join either: we had some network/db outage and Ironic still not back ... | 17:14 |
TheJulia | :( | 17:34 |
*** k_mouza has joined #openstack-ironic | 17:38 | |
*** gmann is now known as gmann_afk | 17:40 | |
*** k_mouza has quit IRC | 17:41 | |
*** ociuhandu has quit IRC | 17:42 | |
* TheJulia digs deep into the port node controller | 17:45 | |
*** ociuhandu has joined #openstack-ironic | 17:58 | |
*** derekh has quit IRC | 18:00 | |
TheJulia | hmm, it is monsooning | 18:03 |
*** ociuhandu has quit IRC | 18:04 | |
openstackgerrit | Merged openstack/ironic master: Guard conductor from consuming all of the ram https://review.opendev.org/c/openstack/ironic/+/726483 | 18:12 |
openstackgerrit | Merged openstack/ironic master: Trivial: update version for deploy steps https://review.opendev.org/c/openstack/ironic/+/775408 | 18:12 |
iurygregory | insert happy dance | 18:15 |
iurygregory | \o/\o/ | 18:15 |
iurygregory | I should probably wait for https://review.opendev.org/c/openstack/ironic/+/763107 since https://review.opendev.org/c/openstack/ironic/+/726483 is merged | 18:17 |
iurygregory | to request the bugfix for ironic | 18:18 |
openstackgerrit | Merged openstack/ironic master: Introduce common personas for secure RBAC https://review.opendev.org/c/openstack/ironic/+/763255 | 18:20 |
openstackgerrit | Verification of a change to openstack/ironic failed: Set default to prevent out of memory conditions https://review.opendev.org/c/openstack/ironic/+/763107 | 18:32 |
dtantsur | iurygregory: I think we're ready for an ironic release. Can be done today or on Monday (it's unlikely that anybody processes it today) | 18:34 |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic-python-agent master: [WIP] New deploy step for injecting arbitrary files https://review.opendev.org/c/openstack/ironic-python-agent/+/775428 | 18:35 |
dtantsur | this ^^ is a complete implementation, although some tests are missing | 18:35 |
dtantsur | it's larger than I hoped, but it also does much more than initially planned | 18:35 |
*** bburns_afk is now known as bburns | 18:36 | |
iurygregory | dtantsur, ack I will wait to push so we can have https://review.opendev.org/c/openstack/ironic/+/763107 on it | 18:38 |
dtantsur | I'm not sure if it was planned to have it in this release or the next one? | 18:38 |
dtantsur | otherwise why splitting the patches? | 18:38 |
* dtantsur will leave it up to TheJulia | 18:39 | |
*** dtantsur is now known as dtantsur|afk | 18:39 | |
dtantsur|afk | have a great weekend, folks! | 18:39 |
TheJulia | hmm | 18:41 |
TheJulia | makes sense to go ahead and release with it merged | 18:41 |
TheJulia | hopefully it will make it through the gate today | 18:41 |
iurygregory | we merged the 1st in chain... | 18:47 |
TheJulia | second is in check queue after failure on merge | 18:47 |
TheJulia | 3rd is just docs | 18:47 |
iurygregory | yeah | 18:47 |
*** gmann_afk is now known as gmann | 18:52 | |
*** k_mouza has joined #openstack-ironic | 19:13 | |
*** anuradha1904 has quit IRC | 19:13 | |
*** k_mouza has quit IRC | 19:17 | |
*** ociuhandu has joined #openstack-ironic | 19:40 | |
*** ociuhandu has quit IRC | 19:46 | |
*** tosin has quit IRC | 20:04 | |
*** rcernin has joined #openstack-ironic | 20:35 | |
*** k_mouza has joined #openstack-ironic | 20:40 | |
*** k_mouza has quit IRC | 20:41 | |
*** k_mouza has joined #openstack-ironic | 20:43 | |
*** uzumaki has quit IRC | 20:47 | |
*** k_mouza has quit IRC | 20:48 | |
arne_wiebalck | TheJulia: we may hit the nova/ironic power sync as a scaling limit now: the time to complete a "node list" times the number of nova nodes requesting it is larger than the power sync interval of 600 secs, so the processes seem to run into each other and melt the database (I am not clear if this is only the case with cold caches as it worked before we had a network outage today) | 21:18 |
*** rcernin has quit IRC | 21:22 | |
TheJulia | arne_wiebalck: are you trying to bring both services online at the same time? | 21:38 |
TheJulia | arne_wiebalck: or is the conductor being given time to get back online and do the first sweep and then the api/nova-compute services getting fired up? | 21:38 |
arne_wiebalck | TheJulia: both at the same time | 21:38 |
arne_wiebalck | TheJulia: you think it is a startup-only issue? | 21:39 |
TheJulia | i suspect | 21:39 |
TheJulia | when you say melt the database, is it just request latency? | 21:39 |
arne_wiebalck | TheJulia: From what I see, it works ok some time after the startup, but when the power sync queries come in, all hell breaks loose ... | 21:40 |
arne_wiebalck | TheJulia: yes | 21:40 |
arne_wiebalck | TheJulia: requests time out at the loadbalancer | 21:40 |
TheJulia | which requests time out? | 21:40 |
TheJulia | nova-compute? | 21:41 |
arne_wiebalck | the requests from nc to the ironic APIs | 21:41 |
arne_wiebalck | yes | 21:41 |
TheJulia | when you say outage? was a backup restored? | 21:42 |
arne_wiebalck | no | 21:42 |
arne_wiebalck | there was a network interruption | 21:42 |
TheJulia | okay | 21:42 |
TheJulia | that was my next question | 21:42 |
arne_wiebalck | the database was unreachable for a while | 21:42 |
arne_wiebalck | once it was back, problems started ... | 21:43 |
arne_wiebalck | I am still a little puzzled why it worked before | 21:43 |
arne_wiebalck | I was thinking all processes were aligned or sth | 21:44 |
TheJulia | without nova-computes running, what is do you see as the response time just requesting a list of nodes? | 21:44 |
TheJulia | /is do/is it that/ | 21:44 |
arne_wiebalck | I have not made this test, but the query takes about 30-40 secs for 8600 nodes. | 21:44 |
arne_wiebalck | this is a node list | 21:45 |
TheJulia | I guess load balancers are killing the query at 60 seconds? | 21:45 |
arne_wiebalck | which I use as a probe | 21:45 |
arne_wiebalck | the timeout is 270 secs | 21:45 |
arne_wiebalck | I increased it at some point | 21:45 |
TheJulia | okay | 21:45 |
TheJulia | so everything seems to be fine, then power sync comes along | 21:46 |
TheJulia | pulls a giant list of nodes, locks one at a time and iterate through | 21:46 |
arne_wiebalck | yes, this is my current thinking | 21:46 |
TheJulia | that shouldn't impact db reads | 21:46 |
arne_wiebalck | pulling the list takes 30-40 secs | 21:46 |
arne_wiebalck | then there are around 20 doing this in parallel | 21:47 |
arne_wiebalck | every 600 secs | 21:47 |
arne_wiebalck | so every 30 secs on avg | 21:47 |
arne_wiebalck | mind you, we just moved from 17 to 20 earlier this week | 21:47 |
TheJulia | sounds like we have a thundering herd problem with the database | 21:48 |
arne_wiebalck | I think we just tipped over the threshold | 21:48 |
arne_wiebalck | yes | 21:48 |
arne_wiebalck | I have increased the interval from 600 to 3600 secs now | 21:48 |
TheJulia | any chance you're able to startup delay the conductors to bring them up over a couple minutes and not all at once? | 21:48 |
arne_wiebalck | things seem stable now | 21:48 |
arne_wiebalck | yes | 21:49 |
TheJulia | I guess we actually have two thundering herd problems in this | 21:49 |
arne_wiebalck | but after increasing the interval, I started them all at once :) | 21:49 |
arne_wiebalck | yes, totally! | 21:49 |
TheJulia | I guess your nova config is still asking ironic for the power state instead of getting a callback? | 21:49 |
arne_wiebalck | I saw processes piling up on the db | 21:49 |
arne_wiebalck | yes :( | 21:50 |
TheJulia | :( | 21:50 |
arne_wiebalck | nova is still on stein | 21:50 |
TheJulia | oh | 21:50 |
arne_wiebalck | ironic on train | 21:50 |
arne_wiebalck | quite ironic that we developed this feature :) | 21:50 |
TheJulia | I wasn't going to say that... but this is ironic | 21:50 |
arne_wiebalck | and have not yet deployed it | 21:50 |
arne_wiebalck | yeah :-D | 21:50 |
TheJulia | The simplest thing I think is we need a randomizer on the power sync thundering herd initiator | 21:51 |
TheJulia | actually, all periodics | 21:51 |
arne_wiebalck | I think this is there already, no? | 21:52 |
TheJulia | I don't think so | 21:52 |
arne_wiebalck | hmm, ok | 21:52 |
arne_wiebalck | I thought there was some random start up delay | 21:52 |
arne_wiebalck | if not, then yes! | 21:52 |
TheJulia | ipa heartbeat | 21:52 |
arne_wiebalck | but we also need to make sure that the #processes x runtime < sync_interval I think | 21:53 |
arne_wiebalck | this is for the nova syny | 21:53 |
arne_wiebalck | sync | 21:53 |
arne_wiebalck | you cannot ask more questions than there are possible answers ;) | 21:53 |
arne_wiebalck | the db becomes the bottleneck | 21:54 |
TheJulia | that would slow that down | 21:54 |
arne_wiebalck | yes | 21:54 |
TheJulia | and all the read activity | 21:54 |
TheJulia | although is there any sign that the row locking for the lock updates is also impacting it? | 21:54 |
arne_wiebalck | I wonder if this is also only a startup problem as we have caches in nova, no? | 21:54 |
TheJulia | yeah, nova builds a cache, so if everything is starting up it is trying to populate the resource tracker cache as well | 21:55 |
TheJulia | and they also each have to build a hash ring | 21:55 |
arne_wiebalck | all I saw were big queries for the nodes in a conductor group | 21:55 |
arne_wiebalck | there is a cache for the power state, no? | 21:55 |
TheJulia | I don't think so | 21:56 |
arne_wiebalck | ah, no? | 21:56 |
TheJulia | but the objects are created via the db | 21:56 |
TheJulia | so there is the db hit there | 21:56 |
TheJulia | were you able to see what the effective db query was/is? | 21:56 |
arne_wiebalck | I was thinking if it worked before since I only added a few nodes at a time while the others had warm caches | 21:56 |
TheJulia | I'm wondering if we've got room to improve a query or three | 21:56 |
TheJulia | well, resource tracker itself would have been warm | 21:57 |
arne_wiebalck | :) | 21:57 |
TheJulia | and the other node's hashrings | 21:57 |
TheJulia | add just a little bit more, then mix in a thundering herd, and I could see everything colliding | 21:57 |
arne_wiebalck | hmm, yeah resource tracker ... | 21:57 |
arne_wiebalck | colliding ... that somehow rings a bell :-D | 21:58 |
TheJulia | I'm sure there is a soundtrack that would match | 21:59 |
arne_wiebalck | things have been up for an hour now with power_sync=3600 | 21:59 |
arne_wiebalck | so, this seems to do the trick | 21:59 |
TheJulia | Yeah, you were hitting the initial crunch | 22:00 |
arne_wiebalck | yes | 22:00 |
arne_wiebalck | bringing up everything at the same time is probably not ideal and should be improved | 22:00 |
arne_wiebalck | for the power sync interval we have to see what the impact is in real life | 22:01 |
TheJulia | if you ever reproduce or capture queries to the db, if we can figure out what is doing the most "work" then we should look into changing those | 22:01 |
arne_wiebalck | I have the query, I think | 22:01 |
TheJulia | yeah, I'm wondering if we need to do an "autosize deployment and try to prevent a thundering herd based on that" | 22:01 |
TheJulia | ohhhhh | 22:01 |
TheJulia | we may have an easy fix to improve performance there as well | 22:01 |
TheJulia | add an index for conductor | 22:02 |
arne_wiebalck | well, not the full query | 22:02 |
arne_wiebalck | "SELECT nodes.created_at AS nodes_created_at, nodes.updated_at AS nodes_updated_at, nodes.version AS" | 22:03 |
arne_wiebalck | only the beginning | 22:03 |
arne_wiebalck | I guess I can get it from the DB colleagues looking at this with me during the past hours | 22:03 |
arne_wiebalck | but the whole problem smells very much like the inspector issue some months ago | 22:03 |
TheJulia | yeah, if they can extract the full query the DB is getting, that would be helpful | 22:03 |
arne_wiebalck | which was solved with the leader election | 22:04 |
arne_wiebalck | ok, will ask them | 22:04 |
arne_wiebalck | thanks for your thoughts, TheJulia, I will have another look at this next week | 22:04 |
arne_wiebalck | have a good week-end o/ | 22:04 |
TheJulia | arne_wiebalck: I'm putting some notes into an item in storyboard | 22:04 |
arne_wiebalck | TheJulia: thanks, I will have a look and update, I think we have a scaling item there somewhere ... | 22:07 |
TheJulia | very much so | 22:08 |
TheJulia | arne_wiebalck: https://storyboard.openstack.org/#!/story/2008626 | 22:11 |
*** lmcgann has quit IRC | 22:12 | |
arne_wiebalck | TheJulia: thanks! | 22:12 |
*** uzumaki has joined #openstack-ironic | 22:35 | |
openstackgerrit | Merged openstack/ironic master: Set default to prevent out of memory conditions https://review.opendev.org/c/openstack/ironic/+/763107 | 22:46 |
iurygregory | merged \o/ | 22:47 |
TheJulia | woot! | 23:07 |
TheJulia | ship it! | 23:07 |
iurygregory | \o/ | 23:25 |
iurygregory | shipped in https://review.opendev.org/c/openstack/releases/+/775456 =) | 23:31 |
*** rloo has quit IRC | 23:36 |
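
The load arithmetic from the discussion above (around 20 processes, each pulling a node list that takes 30-40 seconds, every 600 seconds) can be sketched roughly as follows. This is only an illustration, not Ironic or Nova code: the function names are hypothetical, the 35-second query time is an assumed midpoint of the quoted range, and the model assumes the database serves the node-list queries back to back. The second helper shows one possible form of the "randomizer" idea, a random start offset per process, which is likewise an assumption about how such jitter might look.

```python
import random


def db_busy_fraction(num_processes, query_seconds, sync_interval):
    """Share of one sync interval spent on node-list queries.

    Assumes every process runs one query per interval and the database
    handles them one after another. A value of 1.0 or more means the
    queries cannot all finish before the next round starts.
    """
    return num_processes * query_seconds / sync_interval


def jittered_start_delays(num_processes, sync_interval, rng):
    """Random initial delay per process, spread across one interval.

    Hypothetical helper: spreading the first run keeps processes that
    were started together from querying at the same moment.
    """
    return [rng.uniform(0, sync_interval) for _ in range(num_processes)]


if __name__ == "__main__":
    query_seconds = 35  # assumed midpoint of the 30-40 secs mentioned

    for processes, interval in [(17, 600), (20, 600), (20, 3600)]:
        fraction = db_busy_fraction(processes, query_seconds, interval)
        state = "fits" if fraction < 1 else "overloaded"
        print(f"{processes} processes, interval {interval}s: "
              f"{fraction:.2f} ({state})")

    rng = random.Random(0)  # seeded only to make the example repeatable
    delays = jittered_start_delays(20, 600, rng)
    print("first start offsets:", [round(d) for d in sorted(delays)[:5]])
```

Under these assumptions, 17 processes at a 600-second interval stay just below the limit, 20 processes exceed it, and raising the interval to 3600 seconds leaves ample headroom, which is in line with the "tipped over the threshold" observation.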