*** rcernin has joined #openstack-ironic | 00:08 | |
*** whoami-rajat has joined #openstack-ironic | 00:09 | |
*** igordc has joined #openstack-ironic | 00:09 | |
*** trident has quit IRC | 00:38 | |
*** gyee has quit IRC | 00:48 | |
*** rcernin has quit IRC | 00:58 | |
*** rcernin has joined #openstack-ironic | 01:14 | |
*** hamzy has joined #openstack-ironic | 01:17 | |
*** rh-jelabarre has joined #openstack-ironic | 01:58 | |
*** rloo has quit IRC | 01:58 | |
*** ash2307 has joined #openstack-ironic | 02:13 | |
*** ash2307 has left #openstack-ironic | 02:16 | |
*** whoami-rajat has quit IRC | 02:28 | |
openstackgerrit | Merged openstack/networking-generic-switch master: Add Mellanox MLNX-OS Switch support https://review.opendev.org/642565 | 02:32 |
---|---|---|
*** strigazi has quit IRC | 03:09 | |
*** strigazi has joined #openstack-ironic | 03:10 | |
*** whoami-rajat has joined #openstack-ironic | 03:20 | |
*** ash2307 has joined #openstack-ironic | 03:34 | |
openstackgerrit | Kaifeng Wang proposed openstack/ironic-inspector master: [TEST] Update non-standalone job to use uwsgi https://review.opendev.org/675724 | 03:39 |
*** gkadam has joined #openstack-ironic | 03:41 | |
*** gkadam has quit IRC | 03:41 | |
*** gkadam has joined #openstack-ironic | 03:43 | |
*** gkadam has quit IRC | 04:00 | |
*** absubram has quit IRC | 04:11 | |
*** mkrai has joined #openstack-ironic | 04:17 | |
*** stendulker has joined #openstack-ironic | 04:18 | |
*** dsneddon has quit IRC | 04:24 | |
*** absubram has joined #openstack-ironic | 04:30 | |
*** rh-jelabarre has quit IRC | 04:42 | |
*** igordc has quit IRC | 05:02 | |
*** dsneddon has joined #openstack-ironic | 05:11 | |
openstackgerrit | Digambar proposed openstack/ironic stable/rocky: DRAC: Fix OOB introspection to use pxe_enabled flag in idrac driver https://review.opendev.org/648360 | 05:15 |
*** dsneddon has quit IRC | 05:17 | |
*** dsneddon has joined #openstack-ironic | 05:21 | |
*** dsneddon has quit IRC | 05:26 | |
*** ash2307 has left #openstack-ironic | 05:31 | |
*** mkrai has quit IRC | 05:33 | |
*** mkrai has joined #openstack-ironic | 05:36 | |
*** adrianc has quit IRC | 05:37 | |
openstackgerrit | Kaifeng Wang proposed openstack/ironic-inspector master: [TEST] Update non-standalone job to use uwsgi https://review.opendev.org/675724 | 05:46 |
*** absubram has quit IRC | 05:48 | |
*** dsneddon has joined #openstack-ironic | 05:58 | |
*** dsneddon has quit IRC | 06:03 | |
*** rcernin has quit IRC | 06:16 | |
*** jtomasek has joined #openstack-ironic | 06:20 | |
*** kaifeng has joined #openstack-ironic | 06:21 | |
*** rcernin has joined #openstack-ironic | 06:31 | |
openstackgerrit | Merged openstack/sushy-tools master: Follow-up fixes https://review.opendev.org/674627 | 06:41 |
arne_wiebalck | Good morning, ironic! | 06:44 |
kaifeng | morning arne o/ | 06:44 |
*** dsneddon has joined #openstack-ironic | 06:46 | |
arne_wiebalck | Hey kaifeng o/ | 06:47 |
arne_wiebalck | In case people are interested, we have summarized some of our findings when scaling ironic with nova here: https://techblog.web.cern.ch/techblog/post/nova-ironic-at-scale/ | 06:49 |
*** dsneddon has quit IRC | 06:51 | |
*** dsneddon has joined #openstack-ironic | 07:01 | |
*** trident has joined #openstack-ironic | 07:01 | |
*** tesseract has joined #openstack-ironic | 07:10 | |
*** e0ne has joined #openstack-ironic | 07:22 | |
kaifeng | arne_wiebalck: thanks for the article, it says "if an Ironic instance is already provisioned it can't be moved to a different compute-node and in case of a nova-compute failure the user can't perform any API operation for his instance.", I am wondering why | 07:24 |
kaifeng | if a compute node is offline, the instance won't be shifted to another compute node? | 07:25 |
*** mkrai has quit IRC | 07:29 | |
*** mkrai has joined #openstack-ironic | 07:31 | |
*** mkrai has quit IRC | 07:38 | |
*** rcernin has quit IRC | 07:56 | |
arne_wiebalck | kaifeng: no, the code does not do this (there is now a proposal to change this, though) | 08:00 |
*** dougsz has joined #openstack-ironic | 08:00 | |
*** lucasagomes has joined #openstack-ironic | 08:02 | |
kaifeng | arne_wiebalck: thanks, do you have the link to the proposal? | 08:11 |
*** rpittau|afk is now known as rpittau | 08:18 | |
rpittau | good morning ironic! o/ | 08:18 |
*** tssurya has joined #openstack-ironic | 08:22 | |
*** rcernin has joined #openstack-ironic | 08:23 | |
*** dougsz has quit IRC | 08:27 | |
*** derekh has joined #openstack-ironic | 08:29 | |
*** rcernin has quit IRC | 08:29 | |
*** dsneddon has quit IRC | 08:32 | |
kaifeng | morning rpittau | 08:32 |
rpittau | hey kaifeng :) | 08:33 |
*** rcernin has joined #openstack-ironic | 08:38 | |
*** dougsz has joined #openstack-ironic | 08:40 | |
*** stendulker has quit IRC | 08:41 | |
openstackgerrit | Merged openstack/ironic master: Add deploy steps for Redfish BIOS interface https://review.opendev.org/642060 | 08:42 |
arne_wiebalck | kaifeng: This one looks like it should help: https://review.opendev.org/#/c/671534 | 09:00 |
patchbot | patch 671534 - nova - ironic: take over instances from down compute serv... - 1 patch set | 09:00 |
* arne_wiebalck is upgrading our ironic deployment atm | 09:00 | |
*** dsneddon has joined #openstack-ironic | 09:02 | |
*** rcernin has quit IRC | 09:04 | |
rpittau | arne_wiebalck: good luck :) | 09:08 |
arne_wiebalck | rpittau: thx :) | 09:11 |
kaifeng | arne_wiebalck: thanks! | 09:12 |
openstackgerrit | Merged openstack/ironic-python-agent master: Fixes get_holder disks with nvme drives https://review.opendev.org/675620 | 09:14 |
*** dsneddon has quit IRC | 09:17 | |
*** dsneddon has joined #openstack-ironic | 09:28 | |
*** dsneddon has quit IRC | 09:33 | |
*** kaifeng has quit IRC | 09:43 | |
*** ociuhandu has joined #openstack-ironic | 09:49 | |
openstackgerrit | raphael.glon proposed openstack/ironic-python-agent master: Softraid: partitioning fixes https://review.opendev.org/674819 | 09:50 |
openstackgerrit | raphael.glon proposed openstack/ironic-python-agent master: image extension, install_bootloader improvements https://review.opendev.org/674879 | 09:53 |
openstackgerrit | Shivanand Tendulker proposed openstack/ironic master: Add new method 'apply_configuration' to RAIDInterface https://review.opendev.org/674269 | 10:07 |
*** adrianc has joined #openstack-ironic | 10:07 | |
openstackgerrit | Shivanand Tendulker proposed openstack/ironic master: Add iLO RAID deploy steps https://review.opendev.org/674271 | 10:07 |
openstackgerrit | Shivanand Tendulker proposed openstack/ironic master: WIP: Add iDRAC RAID deploy steps https://review.opendev.org/641731 | 10:08 |
*** ociuhandu has quit IRC | 10:14 | |
*** ociuhandu has joined #openstack-ironic | 10:15 | |
*** alexmcleod has joined #openstack-ironic | 10:16 | |
* arne_wiebalck finished the upgrade to Stein (+ s/w RAID) \o/ | 10:22 | |
*** verma-varsha has joined #openstack-ironic | 10:22 | |
*** verma-varsha1 has joined #openstack-ironic | 10:27 | |
*** dougsz has quit IRC | 10:28 | |
*** verma-varsha has quit IRC | 10:29 | |
*** verma-varsha1 has quit IRC | 10:29 | |
*** bnemec has quit IRC | 10:34 | |
*** bnemec has joined #openstack-ironic | 10:37 | |
*** verma-varsha has joined #openstack-ironic | 10:39 | |
*** dougsz has joined #openstack-ironic | 10:44 | |
*** bnemec has quit IRC | 10:45 | |
*** bnemec has joined #openstack-ironic | 10:49 | |
*** priteau has joined #openstack-ironic | 10:50 | |
*** adrianc has quit IRC | 11:00 | |
*** adrianc has joined #openstack-ironic | 11:03 | |
*** bnemec has quit IRC | 11:04 | |
openstackgerrit | Pradip Kadam proposed openstack/ironic master: DRAC : clear_job_queue clean step to fix pending bios config jobs https://review.opendev.org/674021 | 11:08 |
*** bnemec has joined #openstack-ironic | 11:09 | |
*** bnemec has quit IRC | 11:13 | |
TheJulia | arne_wiebalck: \o/ | 11:14 |
TheJulia | Good morning everyone! | 11:14 |
TheJulia | rpioso: spec merged correct? if so needs-spec is not really needed. I don't see an issue applying the tag though | 11:19 |
TheJulia | arne_wiebalck: very interesting read yesterday. I was very shocked about how long it takes for a single nova-compute to update the list of resources in nova. Like... alarmingly shocked. | 11:21 |
arne_wiebalck | TheJulia: User visible shocking, actually :) | 11:23 |
arne_wiebalck | TheJulia: FWIU, this is only/mostly needed to clean up inconsistencies, though. | 11:24 |
TheJulia | arne_wiebalck: Yeah, I'm suspecting all of those checks are sort of like death by a thousand cuts outside of the deepcopy | 11:25 |
arne_wiebalck | TheJulia: exactly | 11:25 |
arne_wiebalck | TheJulia: perfectly fine for hosts with virtual instances | 11:26 |
*** bnemec has joined #openstack-ironic | 11:26 | |
TheJulia | Sounds like nova/ironic ought to try and discuss this and see if any of us can somehow reduce the thousand cuts and the deepcopy | 11:27 |
TheJulia | efried: ^^^ | 11:27 |
arne_wiebalck | Belmiro (who has done most of the analysis in our team) is in touch with efried. | 11:27 |
*** dsneddon has joined #openstack-ironic | 11:29 | |
TheJulia | okay, awesome | 11:29 |
arne_wiebalck | TheJulia: We have also looked into sharding, but will not do this before nova is on Stein as well. | 11:29 |
arne_wiebalck | TheJulia: Should happen in the coming weeks. | 11:30 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add Redfish Virtual Media Boot support https://review.opendev.org/638453 | 11:30 |
arne_wiebalck | TheJulia: This may be a candidate for backporting: https://review.opendev.org/#/c/650942/ | 11:30 |
patchbot | patch 650942 - ironic - Do not tear down node upon cleaning failure - 6 patch sets | 11:30 |
*** bnemec has quit IRC | 11:31 | |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add set_boot_device hook in `redfish` boot interface https://review.opendev.org/672123 | 11:31 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add `filename` parameter to Redfish virtual media boot URL https://review.opendev.org/671054 | 11:31 |
TheJulia | arne_wiebalck: Okay, I know yahoo folks have made mention of insanely long update loops to reconcile, but their fleet is insanely large, comparable times for you guys with a much smaller enrolled fleet kind of caused me to drop my jaw | 11:32 |
arne_wiebalck | TheJulia: Not sure the resource tracker already exists in the release they run. | 11:34 |
arne_wiebalck | TheJulia: They had issues with power synchronization I think. | 11:34 |
*** dsneddon has quit IRC | 11:34 | |
TheJulia | not anywhere near current day form | 11:34 |
TheJulia | yeah | 11:34 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: [WIP] Add iDRAC boot interface https://review.opendev.org/672498 | 11:35 |
TheJulia | I wonder if nova tried to perform updates in smaller sweeps with ironic. i.e. if we made node searchable by time since last update... if that would at least reduce the set of machines downward... | 11:36 |
*** bnemec has joined #openstack-ironic | 11:38 | |
* arne_wiebalck is somewhat out of his depth on this one | 11:39 | |
arne_wiebalck | I think there are some fundamental (algorithmic) things to improve first. | 11:39 |
TheJulia | Indeed, but if we could change the fundamental access pattern to be more helpful as to what to process through resource tracker... The downside is things eventually disappearing :\ | 11:40 |
arne_wiebalck | Agreed. How about we check with the nova folks to see if/how ironic can help with improving things in this area? Eric and Belmiro have been working on this during the past weeks, so I guess they would have an answer quite quickly. | 11:43 |
TheJulia | ++ | 11:44 |
jroll | morning everyone | 11:44 |
TheJulia | Good morning jroll | 11:44 |
rpittau | hey jroll :) | 11:44 |
* jroll loves this conversation | 11:44 | |
*** bnemec has quit IRC | 11:45 | |
arne_wiebalck | jroll: o/ | 11:45 |
* arne_wiebalck will check with Belmiro | 11:45 | |
TheJulia | arne_wiebalck: thanks | 11:46 |
arne_wiebalck | jroll: do you still work on this one: https://review.opendev.org/#/c/671534? | 11:46 |
patchbot | patch 671534 - nova - ironic: take over instances from down compute serv... - 1 patch set | 11:46 |
arne_wiebalck | jroll: sorry, I guess so (somehow thought the year was 2018) | 11:47 |
jroll | arne_wiebalck: I don't have time to work on it - I put up a similar POC downstream for people to take over and thought I'd push it upstream in case someone found it useful :) | 11:47 |
arne_wiebalck | jroll: ok, I see | 11:48 |
jroll | I'm not sure if our downstream folks even took it yet | 11:48 |
*** bnemec has joined #openstack-ironic | 11:48 | |
arne_wiebalck | jroll: we will need sth like this once we shard | 11:48 |
jroll | they just finished up our ocata upgrade (finally!) and this is a bug in the backlog | 11:48 |
TheJulia | jroll: mind if I comment on that change that you're okay with someone taking it over? | 11:49 |
arne_wiebalck | jroll: you have sharded nova-compute? | 11:49 |
jroll | arne_wiebalck: can you define shard? :) | 11:49 |
jroll | TheJulia: I'll do it | 11:49 |
TheJulia | jroll: awesome, thanks~! | 11:49 |
arne_wiebalck | jroll: multiple nova-computes | 11:49 |
arne_wiebalck | jroll: we currently have one with 3k nodes | 11:49 |
jroll | arne_wiebalck: ah ok, we're still playing with it, but are running multiple nova-computes | 11:50 |
jroll | so a couple things here | 11:50 |
jroll | 1) agree you need something to take over instances, but the existing code without that isn't any worse than what you have today (if a nova-compute service goes down, those instances can't be managed), so you could still go ahead and try it | 11:51 |
arne_wiebalck | jroll: yes | 11:51 |
TheJulia | hmm... failure to find disks has killed the standalone job twice in the last 24 hours | 11:52 |
jroll | 2) we did a hack where one nova-compute process spawns multiple nova-compute services, I highly recommend not doing this, it's been painful :) start multiple instances of the services instead | 11:52 |
arne_wiebalck | jroll: ok, good to know! | 11:52 |
TheJulia | 2 sounds very ouchy | 11:53 |
jroll | 3) restart nova-compute services one at a time, or else the hash ring churn gets insane. any computes starting at the same time see the others as down, and so they take all the nodes | 11:53 |
jroll | I think that's it | 11:53 |
jroll | 2 was brutal, but was a good experiment | 11:53 |
arne_wiebalck | jroll: ai! 3) sounds nasty | 11:53 |
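The churn jroll describes in 3) can be illustrated with a toy consistent hash ring. This is a minimal sketch, assuming a simplified ring written for illustration; it is not nova's implementation, and `HashRing`, `assignment`, the `compute-N` host names and the node names are all hypothetical.

```python
import hashlib
from bisect import bisect


def _hash(key):
    # Stable integer hash so every service computes the same ring.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hash ring (hypothetical, for illustration only)."""

    def __init__(self, hosts, points_per_host=64):
        # Each host gets several points on the ring to even out the split.
        self._ring = sorted(
            (_hash(f"{host}-{i}"), host)
            for host in hosts
            for i in range(points_per_host)
        )
        self._keys = [point for point, _ in self._ring]

    def owner(self, node_id):
        # A node belongs to the first ring point after its own hash.
        idx = bisect(self._keys, _hash(node_id)) % len(self._ring)
        return self._ring[idx][1]


def assignment(hosts, nodes):
    ring = HashRing(hosts)
    return {node: ring.owner(node) for node in nodes}


nodes = [f"node-{i}" for i in range(3000)]

# Steady state: three services that all see each other as up.
full = assignment(["compute-1", "compute-2", "compute-3"], nodes)

# A service that starts while it believes its peers are down builds a
# ring containing only itself, so it claims every node.
alone = assignment(["compute-2"], nodes)

moved = sum(1 for node in nodes if full[node] != alone[node])
print(f"{moved} of {len(nodes)} nodes change owner")
```

In this model, restarting one service at a time only moves the nodes owned by the service that went away, whereas a service that sees no peers takes everything until the others are visible again.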
arne_wiebalck | jroll: what's the main reason you run with multiple nova-computes? | 11:53 |
jroll | arne_wiebalck: HA and reducing resource tracker runtimes | 11:54 |
arne_wiebalck | jroll: w/o take over there is no HA, is there? ;) | 11:55 |
arne_wiebalck | jroll: how many nodes do you have per nova-compute? | 11:55 |
*** bnemec has quit IRC | 11:55 | |
jroll | ooo, that deepcopy is rough, I don't think that's in ocata | 11:55 |
arne_wiebalck | jroll: I think that came later, yes | 11:55 |
jroll | arne_wiebalck: heh, there's HA for new instances at least :P | 11:56 |
arne_wiebalck | jroll: true! | 11:56 |
jroll | I'm not sure how many offhand, I'd have to check | 11:56 |
arne_wiebalck | still 1000s I guess | 11:56 |
jroll | for sure | 11:56 |
arne_wiebalck | jroll: thanks a lot! | 11:57 |
*** belmoreira has joined #openstack-ironic | 11:58 | |
*** dsneddon has joined #openstack-ironic | 11:59 | |
jroll | pagination always confuses me, I knew there had to be more than 1k servers in this deployment :P | 11:59 |
jroll | arne_wiebalck: you're welcome :) | 11:59 |
*** bnemec has joined #openstack-ironic | 11:59 | |
TheJulia | jroll: eh... I think there is hard coded only return 1k nodes unless told otherwise. | 12:00 |
TheJulia | which... likely needs to change | 12:00 |
jroll | yeah | 12:00 |
jroll | well, it's configurable | 12:00 |
* TheJulia doesn't remember it being configurable but then \o/ | 12:01 | |
jroll | max_limit I think | 12:01 |
jroll | --limit 0 will make the client grab all, and also make me wait >:( | 12:01 |
jroll | arne_wiebalck: the first cluster we started having problems with was around 3400 nodes per nova-compute instance | 12:02 |
arne_wiebalck | jroll: ai ... we're at 3k now, but next deliveries arrive in a couple of weeks I believe | 12:02 |
TheJulia | jroll: yeah, maybe a flag to say "forget pagination, just stream the results back to me" ? | 12:02 |
jroll | TheJulia: interesting, could be fun :) | 12:03 |
TheJulia | how much time is that wait presently? | 12:03 |
TheJulia | with pagination trying to do its thing | 12:03 |
TheJulia | roughly | 12:03 |
jroll | heh | 12:03 |
*** rh-jelabarre has joined #openstack-ironic | 12:03 | |
jroll | getting about 10k nodes per minute | 12:04 |
jroll | a little more | 12:04 |
TheJulia | given how the object model works, that is not bad | 12:04 |
jroll | so... 6s per api call | 12:04 |
jroll | yeah | 12:04 |
*** dsneddon has quit IRC | 12:04 | |
*** bnemec has quit IRC | 12:04 | |
jroll | not sure how much of that is on each end, of course, the client takes a bit to format too | 12:04 |
TheJulia | That is true, but I seem to remember the bulk of the enforcement is in the api because of the line return limit for raw clients | 12:05 |
jroll | yeah, pagination happens in the API, I just mean the client takes time to format results | 12:05 |
TheJulia | Anyway, allowing a processor to begin working on a list while being returned might be useful, although I suspect the client handling that may not actually return until the transfer completes. | 12:06 |
TheJulia | yeah | 12:06 |
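The paging behaviour discussed above can be sketched as a marker/limit loop. This is a rough illustration, assuming a page size of 1000 (which would be consistent with jroll's figures of roughly 10k nodes per minute at about 6s per call); `list_nodes_page`, `fetch_page` and `iter_all_nodes` are hypothetical stand-ins, not the ironic client API.

```python
def list_nodes_page(all_nodes, marker=None, limit=1000):
    """Hypothetical stand-in for one API call.

    Returns at most `limit` nodes that come after `marker`.
    """
    start = 0 if marker is None else all_nodes.index(marker) + 1
    return all_nodes[start:start + limit]


def iter_all_nodes(fetch_page, limit=1000):
    """Yield nodes page by page so a consumer can start early."""
    marker = None
    while True:
        page = fetch_page(marker=marker, limit=limit)
        if not page:
            return
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            return
        marker = page[-1]


fleet = [f"node-{i}" for i in range(3400)]
calls = []


def fetch_page(marker=None, limit=1000):
    calls.append(marker)  # record each simulated API round trip
    return list_nodes_page(fleet, marker, limit)


result = list(iter_all_nodes(fetch_page))
print(len(result), "nodes in", len(calls), "calls")
```

Because `iter_all_nodes` is a generator, a caller could begin processing the first page while later pages are still being fetched, which is close to the streaming idea TheJulia raises, at least on the client side.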
*** bnemec has joined #openstack-ironic | 12:10 | |
*** dsneddon has joined #openstack-ironic | 12:10 | |
*** belmoreira has quit IRC | 12:13 | |
*** ociuhandu has quit IRC | 12:14 | |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Follow-up to power sync reno https://review.opendev.org/676400 | 12:14 |
*** dsneddon has quit IRC | 12:16 | |
TheJulia | rpittau: When you have a couple minutes free, please take a look at https://review.opendev.org/#/c/674269 | 12:17 |
patchbot | patch 674269 - ironic - Add new method 'apply_configuration' to RAIDInterface - 6 patch sets | 12:17 |
rpittau | TheJulia: ack | 12:17 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add Redfish Virtual Media Boot support https://review.opendev.org/638453 | 12:19 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add set_boot_device hook in `redfish` boot interface https://review.opendev.org/672123 | 12:20 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: Add `filename` parameter to Redfish virtual media boot URL https://review.opendev.org/671054 | 12:20 |
openstackgerrit | Ilya Etingof proposed openstack/ironic master: [WIP] Add iDRAC boot interface https://review.opendev.org/672498 | 12:21 |
TheJulia | eek... rebased them all | 12:22 |
TheJulia | oh, whew... gerrit did the right thing | 12:23 |
* rpittau getting lost in ephemeral disks | 12:23 | |
TheJulia | rutro! | 12:27 |
TheJulia | hjensas: Any ideas on the networking-baremetal traffic passing failure? | 12:27 |
openstackgerrit | Varsha Verma proposed openstack/sushy-tools master: Add Storage and Storage Controllers resource support https://review.opendev.org/674339 | 12:27 |
*** bnemec has quit IRC | 12:29 | |
hjensas | TheJulia: I'm still looking at it, no clues yet. I'm setting up a local devstack to reproduce the job atm. | 12:30 |
TheJulia | hjensas: ack, I was suspecting that would have to be the next step :( | 12:30 |
TheJulia | let me know if there is anything I can do to help, otherwise I'm going to try and stay out of the way and work on reviewing patches. | 12:30 |
hjensas | TheJulia: ok. | 12:31 |
*** bnemec has joined #openstack-ironic | 12:33 | |
*** rcernin has joined #openstack-ironic | 12:34 | |
openstackgerrit | Julia Kreger proposed openstack/ironic stable/stein: Check for deploy.deploy deploy step in heartbeat https://review.opendev.org/676151 | 12:38 |
*** rcernin has quit IRC | 12:40 | |
*** bnemec has quit IRC | 12:41 | |
*** belmoreira has joined #openstack-ironic | 12:42 | |
*** belmoreira has quit IRC | 12:43 | |
*** bnemec has joined #openstack-ironic | 12:44 | |
*** priteau has quit IRC | 12:48 | |
*** bnemec has quit IRC | 12:49 | |
*** dsneddon has joined #openstack-ironic | 12:49 | |
*** belmoreira has joined #openstack-ironic | 12:51 | |
openstackgerrit | Merged openstack/sushy master: Implements adapter checking https://review.opendev.org/669963 | 12:53 |
*** bnemec has joined #openstack-ironic | 12:54 | |
*** rloo has joined #openstack-ironic | 12:55 | |
*** dsneddon has quit IRC | 12:56 | |
*** dsneddon has joined #openstack-ironic | 13:02 | |
*** jcoufal has joined #openstack-ironic | 13:04 | |
*** bnemec has quit IRC | 13:07 | |
*** dsneddon has quit IRC | 13:08 | |
*** bnemec has joined #openstack-ironic | 13:10 | |
*** bnemec has quit IRC | 13:16 | |
*** beekneemech has joined #openstack-ironic | 13:16 | |
*** belmoreira has quit IRC | 13:28 | |
*** beekneemech has quit IRC | 13:33 | |
*** belmoreira has joined #openstack-ironic | 13:34 | |
*** dsneddon has joined #openstack-ironic | 13:36 | |
TheJulia | arne_wiebalck: would be super appreciative of any IPA reviews if you have time, mainly because there are a couple software raid related changes/fixes. | 13:37 |
*** bnemec has joined #openstack-ironic | 13:39 | |
openstackgerrit | Merged openstack/ironic master: Ansible: fix partition_configdrive for logical root_devices https://review.opendev.org/674643 | 13:44 |
efried | o/ arne_wiebalck TheJulia -- looks like I missed a discussion about perf? | 13:44 |
*** cdent has joined #openstack-ironic | 13:45 | |
*** belmoreira has quit IRC | 13:46 | |
*** bnemec has quit IRC | 13:47 | |
efried | It occurs to me that belmoreira might benefit from this (credit cdent) to narrow down places where we can get the most bang https://docs.openstack.org/nova/latest/contributor/testing/eventlet-profiling.html | 13:47 |
cdent | (that needs a caveat/warning about how it fails to work well when rpc is involved, and workarounds for dealing with that, but for the most part is goodness) | 13:48 |
*** bnemec has joined #openstack-ironic | 13:51 | |
*** dsneddon has quit IRC | 13:52 | |
*** bnemec has quit IRC | 13:55 | |
TheJulia | Neat, it does sound like they went fairly down the rabbit hole and found various issues/pain points. At least that was my perception from reading what they published. | 13:57 |
*** belmoreira has joined #openstack-ironic | 13:58 | |
*** bnemec has joined #openstack-ironic | 13:58 | |
arne_wiebalck | The main question from the ironic side is probably if ironic can do sth to help or if the improvements are mostly needed in the general nova code (i.e. even outside the ironic driver, let alone how ironic provides data to nova). | 14:03 |
*** sthussey has joined #openstack-ironic | 14:04 | |
jroll | arne_wiebalck: oh, I forgot to mention one other way to shard is with the conductor locality feature | 14:04 |
arne_wiebalck | My understanding from a discussion with belmoreira earlier was that the main improvements are expected to come from the shared code in nova (shared code meaning used for virtual and physical instances). | 14:04 |
jroll | in ironic, it shards conductors, but you can also tie nova-compute instances to the group | 14:05 |
arne_wiebalck | jroll: conductor locality? | 14:05 |
cdent | arne_wiebalck: yes, that's my understanding too | 14:05 |
jroll | the main benefit in nova being that it pulls a subset of nodes from ironic to iterate over in the resource tracker, instead of all | 14:05 |
* jroll finds docs | 14:05 | |
jroll | arne_wiebalck: https://docs.openstack.org/ironic/latest/admin/conductor-groups.html | 14:05 |
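The benefit jroll mentions (each nova-compute iterating over a subset of nodes rather than all of them) can be sketched as a simple filter. This is a toy model only: the node dictionaries, the group names and `nodes_for_service` are hypothetical, and only the idea of a per-node `conductor_group` comes from the discussion and the linked docs.

```python
def nodes_for_service(nodes, group=None):
    """Return the nodes one compute service would iterate over.

    With no group configured the service sees every node; with a group
    it only sees nodes whose conductor_group matches.
    """
    if group is None:
        return list(nodes)
    return [node for node in nodes if node["conductor_group"] == group]


# Hypothetical fleet split across three groups.
fleet = [
    {"uuid": f"node-{i}", "conductor_group": f"rack-{i % 3}"}
    for i in range(3000)
]

everything = nodes_for_service(fleet)
subset = nodes_for_service(fleet, group="rack-1")
print(len(everything), "nodes without a group,", len(subset), "with one")
```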
cdent | nova likes to lock over loops that it was designed to think are small but with ironic are big | 14:05 |
arne_wiebalck | cdent: right | 14:06 |
cdent | (they are also big in clustered hypervisors) | 14:06 |
jroll | cdent: ++ | 14:06 |
*** belmoreira has quit IRC | 14:06 | |
arne_wiebalck | cdent: we are considering to have multiple nova-computes to mitigate, but from jroll's (and our) experience this has limits | 14:07 |
* cdent nods | 14:07 | |
jroll | welllll, it has caveats. if you spun an n-cpu per ironic node, you'd be golden :P | 14:07 |
cdent | thousands of tiny n-cpu containers? | 14:08 |
jroll | "tiny" is relative, but yeah | 14:08 |
cdent | (is not actually the worst idea every) | 14:08 |
cdent | ever | 14:08 |
jroll | that would be hard to manage, but not impossible | 14:08 |
cdent | just tells k8s to do it, doesn't it fix everything ;) | 14:08 |
*** dsneddon has joined #openstack-ironic | 14:09 | |
jroll | sure, run my k8s for me and that sounds great :D | 14:09 |
jroll | re: what can ironic do: the one big change that ironic could do is to take over management of placement for ironic nodes. if we were putting the right data in placement as it changes, we could just noop the resource tracker in nova. but this comes with all sorts of problems in the developer world | 14:09 |
cdent | I was gonna say that too, but it feels churlish for me to mention it first, so I'm glad you did | 14:10 |
arne_wiebalck | hmm, isn't the rt a little less important than its name implies? | 14:11 |
* arne_wiebalck maybe does not fully understand its role | 14:11 | |
jroll | in short, the resource tracker is what takes the ironic node data and puts it in placement for the scheduler to use | 14:11 |
arne_wiebalck | but only for new resources, no? | 14:11 |
jroll | existing as well | 14:12 |
arne_wiebalck | I don't think so. | 14:12 |
jroll | e.g. if you put an available node in maintenance mode, it will notice, and drop the available resources | 14:12 |
arne_wiebalck | I thought this was based on explicit events. | 14:12 |
jroll | if you update the resource class on a node, it updates placement accordingly, etc | 14:12 |
jroll | it is not :( | 14:12 |
jroll | it should be, but is not | 14:12 |
cdent | Is it already a given in these discussion that nova needs to be involved at all (presumably for the sake of the api?)? If that's fungible, the options open up broadly. | 14:12 |
jroll | (this model would also work for new resources) | 14:13 |
jroll | cdent: it is not always a given, but a single API for all compute resources is a major benefit that some orgs would like to continue to take advantage of | 14:13 |
*** dsneddon has quit IRC | 14:13 | |
*** absubram has joined #openstack-ironic | 14:15 | |
rpittau | TheJulia: I noticed that in https://review.opendev.org/666591 we're mixing legacy neutron libs, I don't think that is advisable, and I'm actually struggling to understand where that comes from at the moment :/ | 14:17 |
patchbot | patch 666591 - ironic-python-agent-builder - [WIP] Update tinycore from 8.x to 10.x - 12 patch sets | 14:17 |
*** absubram has quit IRC | 14:19 | |
*** ociuhandu has joined #openstack-ironic | 14:21 | |
*** absubram has joined #openstack-ironic | 14:25 | |
*** ociuhandu has quit IRC | 14:26 | |
arne_wiebalck | cdent: jroll: Any suggestions how to best follow up on this discussion and come up with a plan on how to move forward? | 14:26 |
cdent | arne_wiebalck: I missed the start of the conversation so I'm not clear on the full context | 14:26 |
*** belmoreira has joined #openstack-ironic | 14:27 | |
arne_wiebalck | cdent: I think the trigger was the performance issues in the resource tracker for larger ironic deployments. efried and belmoreira are working on this. | 14:27 |
arne_wiebalck | cdent: TheJulia offered help from the ironic side (if needed). | 14:28 |
*** jcoufal_ has joined #openstack-ironic | 14:29 | |
cdent | One thing I think is probably worth exploring is using a deque as the source of things to do during the locked loop and just doing N things per go. Each time you process something, put it back on the end of the deque | 14:29 |
cdent | however | 14:29 |
cdent | in some cases that's going to upset the view of reality | 14:29 |
cdent | and managing that becomes the tricky bit | 14:29 |
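cdent's deque idea might look roughly like the following. This is a minimal sketch under stated assumptions: `process_batch`, the lock, the batch size and the node names are all hypothetical, and it ignores the stale-view problem cdent raises.

```python
import threading
from collections import deque

lock = threading.Lock()


def process_batch(queue, handle, batch_size):
    """Handle up to batch_size items, re-queueing each at the back.

    The lock is held for one batch only, not for the whole fleet.
    """
    processed = []
    with lock:
        for _ in range(min(batch_size, len(queue))):
            item = queue.popleft()
            handle(item)
            queue.append(item)  # rotate to the end for a later pass
            processed.append(item)
    return processed


queue = deque(f"node-{i}" for i in range(10))
seen = []

first = process_batch(queue, seen.append, 4)
second = process_batch(queue, seen.append, 4)
third = process_batch(queue, seen.append, 4)
print(first, second, third)
```

Each pass touches only N items, so the lock is released sooner, but an item that was just processed is not looked at again until the rest of the deque has cycled through, which is where the view of reality can drift.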
cdent | I would think that if belmoreira and efried are on the case in some fashion, that's pretty good odds that things will be improved | 14:30 |
arne_wiebalck | agreed | 14:30 |
cdent | as belmoreira pointed out in his blog post there are some definite problems in the ProviderTree data structure that are probably amenable to improvement | 14:30 |
arne_wiebalck | right | 14:31 |
*** jcoufal has quit IRC | 14:31 | |
rpioso | TheJulia: I tidied up https://storyboard.openstack.org/#!/story/2004592 yesterday by adding a couple of the tags we discussed, changing the project of the task associated with the spec to openstack/ironic-specs, and adding a new task for the implementation. I left the description there, even though it's been superseded by the spec. | 14:32 |
arne_wiebalck | I was just thinking if we take a step back, there may be even more efficient options/ideas (like "does nova need to be involved?"). | 14:32 |
arne_wiebalck | Totally fine with letting efried and belmoreira finish their analysis :) | 14:32 |
TheJulia | rpittau: got a link to a log? | 14:33 |
*** ijw has joined #openstack-ironic | 14:33 | |
arne_wiebalck | TheJulia: I'll try to have a look at the IPA patches. | 14:33 |
*** dsneddon has joined #openstack-ironic | 14:36 | |
jroll | arne_wiebalck: I do think we should have a bigger discussion about "can we eliminate the RT for the ironic driver", which might should start in the ML or a spec | 14:36 |
rpioso | TheJulia: On a different topic, I would like us to discuss prospective homes for sushy OEM extension Git repos during our next weekly meeting. Spoiler alert! I'm thinking of the opendev.org/x namespace. May I add it to the Discussion section of the agenda? | 14:38 |
arne_wiebalck | jroll: That should better happen before too much work goes into optimizing it. | 14:39 |
jroll | arne_wiebalck: good point, I somewhat agree, though it's going to be a long road and doesn't hurt to improve it in the meantime | 14:41 |
rpioso | jroll: You may be interested in ^^^ | 14:41 |
arne_wiebalck | jroll: also true | 14:41 |
jroll | rpioso: the 'x' namespace is for unofficial projects, so I would just go ahead and put them there now. if ironic decides to take them in we can move them :) | 14:42 |
arne_wiebalck | jroll: as major changes will take much longer, I'd think that the PTG might be a good place to start the discussion, but it seems many will skip the Shanghai one | 14:42 |
jroll | arne_wiebalck: agree and agree :( | 14:42 |
*** dsneddon has quit IRC | 14:47 | |
jroll | arne_wiebalck: now that I think about it, I don't think cdent will be there either? so best to start on the ML | 14:47 |
cdent | i will not, no | 14:48 |
cdent | ML helps to filter out a lot of the shared language building prior to in person chat, too | 14:49 |
jroll | ++ | 14:50 |
*** priteau has joined #openstack-ironic | 14:50 | |
rpittau | TheJulia: sorry, got mixed up with other things and inverted the logs, that is actually not using legacy, while other patches are | 14:50 |
arne_wiebalck | cdent: jroll: ok, I'll check with belmoreira (to also check with efried), I think it'd be best if they drive this as they probably have the best overview atm | 14:51 |
efried | arne_wiebalck: I'm not at the PTG either | 14:51 |
efried | but yeah, we can do things in IRC/ML | 14:52 |
jroll | ++ | 14:52 |
arne_wiebalck | efried: ack, thx! | 14:52 |
*** dsneddon has joined #openstack-ironic | 14:55 | |
*** ociuhandu has joined #openstack-ironic | 14:57 | |
*** dsneddon has quit IRC | 15:01 | |
*** absubram has quit IRC | 15:09 | |
*** alexmcleod is now known as alexmcleod|bbl | 15:11 | |
*** ociuhandu has quit IRC | 15:11 | |
*** ociuhandu has joined #openstack-ironic | 15:12 | |
*** jcoufal_ has quit IRC | 15:12 | |
*** ijw_ has joined #openstack-ironic | 15:12 | |
*** jcoufal has joined #openstack-ironic | 15:13 | |
*** ijw has quit IRC | 15:14 | |
TheJulia | rpioso: Is there really a need. It is not like the ironic community has to approve x/ namespace usage and there has been a past desire to not pull specific vendor items into the project's scope. Perhaps one day there might be the case for a separate namespace though.... | 15:16 |
TheJulia | arne_wiebalck: I really think the ML is the best place because the PTG is likely to be a specific slice of contributorship and I suspect most PTG time will actually be more of the "bridge building" and listening to issues/needs sort of time. Anything discussed there would need to go back to the ML anyway. | 15:18 |
TheJulia | rpittau: I'm afraid you have me confused, are you referring to networking-baremetal? | 15:19 |
rpittau | TheJulia: sorry, I'm referring to the migration to tinycore 10.x in ironic-python-agent-builder | 15:20 |
*** ianychoi has joined #openstack-ironic | 15:20 | |
TheJulia | so, we did merge a change to the default template to not use legacy this week, I guess we might need to do it there depending on job config... | 15:21 |
rpittau | TheJulia: no, my bad, I confused two different logs, that is actually using the updated neutron, not legacy | 15:22 |
TheJulia | it happens, that was earlier in the week too so older job logs may be slightly confusing | 15:23 |
arne_wiebalck | TheJulia: Checking with belmoreira, the plan was to upgrade to Stein first, then re-assess (as it brings some patches in this area). | 15:25 |
arne_wiebalck | TheJulia: The (nova) upgrade should happen during the next weeks. | 15:26 |
arne_wiebalck | TheJulia: OK with you if we wait and pick this up then? | 15:27 |
TheJulia | arne_wiebalck: absolutely. One thing though, I'm going to be absent for a good chunk of September. | 15:28 |
* TheJulia needs to send that email | 15:28 | |
rpioso | TheJulia: I thought we could share thoughts like those with the ironic community. Either way, I'm good with proceeding. | 15:28 |
rpioso | jroll, TheJulia: Thank you! | 15:29 |
arne_wiebalck | TheJulia: ok | 15:29 |
TheJulia | rpioso: I think if the project as a whole wants to reconsider, then that might be a good topic. I'm just not sure the outcome would really go anywhere unless we had quorum of others who may be impacted/affected or who may directly benefit. | 15:30 |
TheJulia | rpioso: in other words, it might also be a good mailing list topic :) | 15:30 |
*** dsneddon has joined #openstack-ironic | 15:33 | |
rpioso | TheJulia: Since that one day may be quite some time in the future, we'll proceed to create the repo in 'x'. No need to hold it up while discussing prospective futures. After it exists, an ML topic may be worthwhile to inform the community of its existence and possible paths forward should critical mass be achieved. | 15:38 |
*** ijw has joined #openstack-ironic | 15:39 | |
TheJulia | rpioso: I was thinking the exact same thing! | 15:39 |
*** dsneddon has quit IRC | 15:39 | |
rpioso | TheJulia: \o/ | 15:40 |
rpioso | TheJulia: Is #openstack-infra the place to go to ask about creating it or is TC approval needed first? | 15:42 |
*** ijw_ has quit IRC | 15:42 | |
jroll | there's a guide for this :) | 15:43 |
jroll | rpioso: https://docs.openstack.org/infra/manual/creators.html | 15:43 |
*** ijw_ has joined #openstack-ironic | 15:46 | |
*** absubram has joined #openstack-ironic | 15:48 | |
*** jcoufal has quit IRC | 15:48 | |
*** ijw has quit IRC | 15:49 | |
rpioso | jroll: Now that's a guide! Thanks, again. | 15:50 |
mbuil | hey guys, when using virtualbmc version 1.5, if I don't specify a log file, at some point the vbmc server crashes. Have you ever noticed this? This was not happening in 1.3, so I suspect this was introduced by the new architecture but no idea why :( | 15:50 |
jroll | rpioso: no problem :) | 15:52 |
rpittau | bye all, see you on monday, long weekend here :) | 15:55 |
*** rpittau is now known as rpittau|afk | 15:55 | |
*** tesseract has quit IRC | 15:56 | |
*** tssurya has quit IRC | 16:05 | |
*** tssurya has joined #openstack-ironic | 16:05 | |
*** belmoreira has quit IRC | 16:07 | |
*** cdent has quit IRC | 16:12 | |
*** lucasagomes has quit IRC | 16:15 | |
etingof | mbuil, o/ I've not encountered that, however if you do, your best bet would be to open up an issue on storyboard enclosing your config and traceback | 16:18 |
TheJulia | mbuil: Interesting... I wonder upon detach, since stdout is not a valid option, if python blows up upon trying to write to it for logging.... | 16:21 |
TheJulia | etingof: ^^^ just a crazy idea | 16:22 |
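The idea above (a detached process failing when logging writes to a stdout that is no longer usable) can be illustrated with a minimal sketch. This is not virtualbmc's actual code: `build_logger` and `write_to_closed_stream` are hypothetical helpers, and a closed `io.StringIO` only stands in for the detached stdout. The sketch assumes that, when no log file is configured, falling back to `logging.NullHandler` is acceptable.

```python
import io
import logging


def build_logger(name, log_file=None):
    """Hypothetical helper: never attach a handler that writes to stdout."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False  # keep records away from root handlers
    if log_file:
        handler = logging.FileHandler(log_file)
    else:
        # Assumption: with no log file, discard records instead of
        # writing to a stream that may be gone after detaching.
        handler = logging.NullHandler()
    logger.addHandler(handler)
    return logger


def write_to_closed_stream():
    """Show that writing to a closed stream raises instead of passing silently."""
    stream = io.StringIO()  # stand-in for a stdout that is no longer valid
    stream.close()
    try:
        stream.write("hello")
    except ValueError as exc:
        return type(exc).__name__
    return None
```

Whether this matches the real vbmc failure is unconfirmed; a traceback from the crashing server would show which stream and handler are actually involved.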
*** dsneddon has joined #openstack-ironic | 16:27 | |
mbuil | etingof: ok, I'll do that :). Another thing that I found is that when doing vbmc delete server_A, the config gets deleted but the process that was created when adding it stays there but now as "[vbmc] <defunct>". Looks like a zombie process | 16:28 |
mbuil | 5 0 131170 131097 20 0 0 0 exit Z ? 0:02 [vbmc] <defunct> | 16:28 |
mbuil | I'll add that to the storyboard too | 16:29 |
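For context on the `<defunct>` entry: a child that has exited stays in the process table as a zombie until its parent collects the exit status. The sketch below shows only that general mechanism with the standard `subprocess` module. `run_and_reap` is a hypothetical name, and nothing here is taken from virtualbmc's process handling.

```python
import subprocess
import sys


def run_and_reap():
    """Start a short-lived child and collect its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    # Until the parent calls wait() (or poll()), an exited child is
    # listed as a zombie (shown as <defunct> by ps) on POSIX systems.
    return child.wait()
```

If the parent never waits on a child it has stopped, the entry lingers for as long as the parent keeps running, which would be consistent with what the `ps` line above shows.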
*** ricolin has quit IRC | 16:29 | |
*** dsneddon has quit IRC | 16:33 | |
*** ijw_ has quit IRC | 16:34 | |
*** alexmcleod|bbl has quit IRC | 16:37 | |
*** tssurya has quit IRC | 16:39 | |
*** belmoreira has joined #openstack-ironic | 16:39 | |
*** ociuhandu_ has joined #openstack-ironic | 16:41 | |
*** belmoreira has quit IRC | 16:41 | |
*** fungi has quit IRC | 16:42 | |
*** fungi has joined #openstack-ironic | 16:43 | |
*** ociuhandu has quit IRC | 16:44 | |
*** ociuhandu_ has quit IRC | 16:45 | |
*** dsneddon has joined #openstack-ironic | 16:47 | |
*** adrianc has quit IRC | 16:50 | |
*** e0ne has quit IRC | 16:51 | |
*** adrianc has joined #openstack-ironic | 16:52 | |
*** dsneddon has quit IRC | 16:53 | |
etingof | mbuil, this all sounds very weird... can it be related to python version you are on, I wonder... | 16:55 |
*** ociuhandu has joined #openstack-ironic | 16:56 | |
*** derekh has quit IRC | 17:00 | |
*** ociuhandu has quit IRC | 17:00 | |
openstackgerrit | Merged openstack/ironic stable/stein: Fixes power-on failure for 'ilo' hardware type https://review.opendev.org/674458 | 17:00 |
etingof | mbuil, also, if you could set logging level to debug and see what happens in the log, that can possibly reveal something... or not | 17:02 |
etingof | TheJulia's crazy idea is too reasonable | 17:04 |
*** ijw has joined #openstack-ironic | 17:05 | |
*** jcoufal has joined #openstack-ironic | 17:12 | |
openstackgerrit | Merged openstack/ironic master: Enable testing software RAID in the standalone job https://review.opendev.org/675102 | 17:14 |
*** dsneddon has joined #openstack-ironic | 17:25 | |
TheJulia | etingof: sorry :( | 17:29 |
TheJulia | I'll strive harder for crazy ideas! | 17:29 |
TheJulia | What if it was from an exploding fuel tank flying out of a nearby computer running kerbal space program?!? ;) | 17:30 |
*** dsneddon has quit IRC | 17:30 | |
etingof | do you mean - PSU, not rocket booster perhaps? | 17:31 |
TheJulia | Well, I recently started packing the science packages you could place on far off places... and they do require solar panel packages....... | 17:32 |
etingof | ironic goes beyond the clouds, it seems! | 17:34 |
*** priteau has quit IRC | 17:37 | |
etingof | it seems there is a design problem in sushy OEM class hierarchy, but fixing it clearly might introduce a backward hiccup | 17:38 |
*** dougsz has quit IRC | 17:41 | |
TheJulia | etingof: well, was it not already moderately not working until we just merged a patch recently? | 17:46 |
TheJulia | I guess regardless, it would be a major version bump | 17:46 |
* etingof is taking it as a suggestion not to bother with backward compatibility for OEM! \o/ | 17:47 | |
etingof | otherwise it's quite messy | 17:47 |
*** verma-varsha has joined #openstack-ironic | 18:05 | |
*** gyee has joined #openstack-ironic | 18:06 | |
*** e0ne has joined #openstack-ironic | 18:08 | |
*** dsneddon has joined #openstack-ironic | 18:09 | |
*** e0ne has quit IRC | 18:10 | |
*** ociuhandu has joined #openstack-ironic | 18:11 | |
*** verma-varsha has quit IRC | 18:12 | |
*** dsneddon has quit IRC | 18:14 | |
*** dsneddon has joined #openstack-ironic | 18:28 | |
*** verma-varsha has joined #openstack-ironic | 18:29 | |
*** ociuhandu has quit IRC | 18:34 | |
*** verma-varsha has quit IRC | 18:46 | |
TheJulia | I might be awful for thinking that, and others may disagree, but no real reason to maintain it if it hasn't been used and we just fixed it. Just need to do the major version bump | 19:43 |
etingof | that's my thinking as well | 19:45 |
rpioso | etingof: Are you preparing to cut a new sushy release? | 19:46 |
etingof | rpioso, not yet, we seem to require OEM resource model change in sushy | 19:47 |
rpioso | etingof: Has that change already been proposed or merged? | 19:48 |
etingof | no, I am still experimenting with that | 19:48 |
rpioso | etingof: Gotcha. I'll be on the lookout for it. Thanks! | 19:50 |
*** e0ne has joined #openstack-ironic | 20:14 | |
*** ijw has quit IRC | 20:17 | |
*** ijw has joined #openstack-ironic | 20:24 | |
*** ijw has quit IRC | 20:27 | |
openstackgerrit | Julia Kreger proposed openstack/python-ironicclient master: WIP: Remove the ironic command https://review.opendev.org/676515 | 20:28 |
*** ijw has joined #openstack-ironic | 20:28 | |
*** ijw has quit IRC | 20:30 | |
*** ijw has joined #openstack-ironic | 20:30 | |
openstackgerrit | Matt Riedemann proposed openstack/ironic stable/rocky: CI: remove quotation marks from TEMPEST_PLUGINS variable https://review.opendev.org/676517 | 20:37 |
*** mriedem has joined #openstack-ironic | 20:38 | |
*** ijw has quit IRC | 20:38 | |
*** ijw has joined #openstack-ironic | 20:38 | |
*** efried has quit IRC | 20:45 | |
*** efried has joined #openstack-ironic | 20:46 | |
*** mriedem has left #openstack-ironic | 20:56 | |
*** jtomasek has quit IRC | 21:01 | |
*** jcoufal has quit IRC | 21:10 | |
*** e0ne has quit IRC | 21:11 | |
*** e0ne has joined #openstack-ironic | 21:13 | |
*** e0ne has quit IRC | 21:13 | |
*** dsneddon has quit IRC | 21:56 | |
openstackgerrit | Steve Baker proposed openstack/metalsmith master: Allow reserve_node to backfill from existing node https://review.opendev.org/676525 | 22:32 |
openstackgerrit | Steve Baker proposed openstack/metalsmith master: Allow reserve_node to backfill from existing node https://review.opendev.org/676525 | 22:36 |
*** absubram has quit IRC | 23:29 | |
*** ijw has quit IRC | 23:29 | |
*** sthussey has quit IRC | 23:53 | |