TheJulia | That itself, shouldn't cause this. it is super weird though | 00:53 |
---|---|---|
TheJulia | dtantsur: any resolution regarding https://review.opendev.org/c/openstack/ironic/+/906113 ? | 00:58 |
TheJulia | omg I see it | 01:14 |
TheJulia | take a look at notify_conductor_resume_operation | 01:14 |
TheJulia | we can't remove the methods, for some reason we have a conductor heartbeat to trigger another rpc action code path | 01:15 |
* TheJulia pours whiskey into a glass | 01:15 | |
TheJulia | JayF: cid: ^^^ ironic/conductor/utils.py notify_conductor_resume_operation (so the conductor calls itself.... gah!) | 01:16 |
TheJulia | oh, line 947, at least on my local branch right now | 01:16 |
JayF | I'm very confused as to why that only broke in one job though. That must only be used by the ansible driver | 01:40 |
JayF | I'm curious if there's a forward looking fix we could make, but we would still have to delay the removal of the methods | 01:41 |
JayF | Something to look at tomorrow morning at a computer | 01:41 |
TheJulia | we could likely unwire it from using the call | 01:48 |
TheJulia | I'm not sure why it does it to begin with since the call should already be on the conductor | 01:48 |
TheJulia | ... I didn't realize we had conductor to conductor call logic anywhere | 01:48 |
TheJulia | but... eh | 01:48 |
TheJulia | brraaaains | 01:48 |
JayF | The only thing I can think of is if there was some reason you wanted to allow for a hash ring change | 02:18 |
JayF | Because I bet there are cases where you can RPC to a different conductor to continue the next step | 02:18 |
ashinclouds[m] | That would be a weird case to still be working and pick the work elsewhere | 02:30 |
ashinclouds[m] | But, I could see cases where that might make sense | 02:31 |
opendevreview | Jacob Anders proposed openstack/sushy-tools master: Replace hardcoded BiosVersion with an updatable field https://review.opendev.org/c/openstack/sushy-tools/+/909487 | 06:38 |
opendevreview | Jacob Anders proposed openstack/sushy-tools master: [WIP] Add support for BIOS update emulation https://review.opendev.org/c/openstack/sushy-tools/+/909500 | 06:43 |
opendevreview | Jacob Anders proposed openstack/sushy-tools master: [WIP] Add support for BIOS update emulation https://review.opendev.org/c/openstack/sushy-tools/+/909500 | 06:55 |
dtantsur | TheJulia: re https://review.opendev.org/c/openstack/ironic/+/906113: I did not have time to properly work on it. I did file https://bugs.launchpad.net/ironic/+bug/2049913 as a large piece of work that can help. | 07:46 |
rpittau | good morning ironic! o/ | 08:21 |
*** nfedorov is now known as jingvar | 09:22 | |
dtantsur | JayF: is it expected that I don't have +2 on unmaintained/yoga? | 10:48 |
opendevreview | cid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation https://review.opendev.org/c/openstack/ironic/+/909851 | 10:57 |
opendevreview | cid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation https://review.opendev.org/c/openstack/ironic/+/909851 | 11:33 |
jingvar | Hi folks | 12:58 |
jingvar | I'm having an issue with ironic/ironic-inspector | 12:59 |
jingvar | Ironic was deployed with kayobe/kolla ansible without openstack services on 3 control nodes | 13:00 |
jingvar | Only rabbit+ironic+maria+(finally keystone + glance) | 13:02 |
JayF | dtantsur: no | 13:02 |
jingvar | When I start inspection, Inspector sends a message over rabbitmq to someone | 13:03 |
jingvar | oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID | 13:04 |
jingvar | It breaks inspection process | 13:04 |
jingvar | But all consumers are present in Rabbit | 13:05 |
jingvar | There is a strange thing - inspection is Failed, but Conductor starts to power on the node etc | 13:08 |
jingvar | I've tried Zed and Bobcat | 13:11 |
jingvar | File "/var/lib/kolla/venv/lib64/python3.9/site-packages/ironic_inspector/main.py", line 379, in api_introspection | 13:13 |
jingvar | client.call({}, 'do_introspection', node_id=node_id | 13:13 |
jingvar | This step | 13:13 |
jingvar | It looks like misconfigured exchanges | 13:15 |
SvenKieske | do you happen to run rabbitmq with quorum queues enabled? | 13:29 |
dtantsur | jingvar: I seriously wonder whether (and if yes - WHY) kolla configured inspector in an HA setup.. | 13:33 |
SvenKieske | what do you mean by HA? only because something is deployed on three nodes doesn't make it HA :) | 13:37 |
dtantsur | SvenKieske: inspector does not need rabbitmq in a non-HA scenario | 13:37 |
dtantsur | (also, inspector's HA support is an experimental feature) | 13:37 |
dtantsur | quorum queues at least might be enabled https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/ironic/templates/ironic-inspector.conf.j2#L24-L26 | 13:38 |
TheJulia | good morning | 13:43 |
SvenKieske | it seems it was at least partly enabled without any conditionals on HA usage, but I have to dig up the details: https://review.opendev.org/c/openstack/kolla-ansible/+/632369 | 13:44 |
jingvar | SvenKieske: yes quorum by default | 13:44 |
dtantsur | jingvar: will it work if you revert https://review.opendev.org/c/openstack/kolla-ansible/+/632369 locally? | 13:44 |
SvenKieske | those are not all the commits which enable the rabbitmq stuff for ironic-inspector, I _guess_.. | 13:46 |
jingvar | dtantsur: how does inspector connect to the conductor? | 13:46 |
dtantsur | jingvar: public API (HTTP+JSON, the normal way) | 13:47 |
jingvar | I have several conductors and one inspector | 13:47 |
dtantsur | jingvar: a correction: try setting transport_url=fake:// | 13:47 |
jingvar | will try | 13:48 |
SvenKieske | dtantsur: I guess this is the culprit: https://review.opendev.org/c/openstack/kolla-ansible/+/868305/5/ansible/roles/ironic/templates/ironic-inspector.conf.j2 | 13:53 |
SvenKieske | I mean this was even not really correct before, because there is no check to enable the rabbitmq transport only for HA deployments, it was conditional on TLS instead.. | 13:55 |
SvenKieske | jingvar: would you be so kind to file a bug against kolla-ansible for this? I happen to be a maintainer and would like to fix this. | 13:55 |
dtantsur | Probably, but in the end, you should use fake:// if it's a standalone (no API/worker split) inspector without HA | 13:56 |
jingvar | I want conductor HA | 13:57 |
SvenKieske | dtantsur: do you have any docs around this on your side? if I fix this on our side. I want to do it correctly :) | 13:57 |
TheJulia | good morning | 13:58 |
jingvar | HA for deployment purposes | 13:58 |
jingvar | inspector can be non HA | 13:58 |
dtantsur | yeah, we're only talking about inspector now | 13:58 |
SvenKieske | jingvar: alright, you might need this patch as well: https://review.opendev.org/c/openstack/networking-baremetal/+/903995 | 13:58 |
dtantsur | ironic can and should be deployed with rabbit and stuff | 13:59 |
dtantsur | SvenKieske: https://specs.openstack.org/openstack/ironic-inspector-specs/specs/splitting-service-on-API-and-worker.html might be the best we have | 14:00 |
dtantsur | SvenKieske: but do keep https://specs.openstack.org/openstack/ironic-specs/specs/approved/merge-inspector.html in mind for the future | 14:00 |
SvenKieske | thx, still, a bugreport would be nice. I don't know if I get around to it myself today, as I'm swamped with meetings. | 14:02 |
jingvar | I don't use ssl | 14:03 |
SvenKieske | jingvar: are you referring to the commit dtantsur asked you to revert? I guess this is just a refactoring artifact. I guess this is really a bug on how we deploy ironic-conductor in kolla. we can also move to #openstack-kolla if you like | 14:04 |
jingvar | sorry which one? | 14:06 |
jingvar | transport url to fake | 14:07 |
jingvar | ? | 14:07 |
SvenKieske | no, sorry, this is probably a misunderstanding. I'm in the middle of a meeting, will respond later.. | 14:07 |
jingvar | thx | 14:08 |
SvenKieske | jingvar: I created a bug report regarding the unnecessary transport via rabbitmq for non HA deployments of ironic-inspector, if you are interested: https://bugs.launchpad.net/kolla-ansible/+bug/2054705 | 14:55 |
*** nfedorov is now known as jingvar | 15:04 | |
jingvar | dtantsur: transport_url fake:// in inspector.conf helped me, thanks a lot | 15:12 |
dtantsur | jingvar: okay, so FYI SvenKieske ^^^ | 15:13 |
jingvar | what about inspector HA? | 15:14 |
dtantsur | Do you really *need* it? | 15:14 |
jingvar | on Zed I faced a failed deploy with the broken inspector, but it works on Bobcat | 15:15 |
jingvar | probably it is IPA behavior | 15:15 |
jingvar | on Zed it stops the deploy when it gets a timeout on the introspection update | 15:16 |
jingvar | Bobcat ignores the error and does the work | 15:17 |
SvenKieske | jingvar: could you post your actual error message (trace) and tested openstack release to the above bug report? That would help me a lot. if you don't have an account you can also use paste.openstack.org and I copy it | 15:17 |
jingvar | of course, it will take a minute | 15:18 |
SvenKieske | thank you | 15:19 |
jingvar | first env https://paste.opendev.org/show/bscQiIEnqqfwzW2Vv3yx/ | 15:21 |
jingvar | next one https://paste.opendev.org/show/bTfJ1XRpmkqDw2JisoZr/ | 15:23 |
dtantsur | It looks like an unpatched queue. I wonder if it's the eventlet error that JayF was dealing with recently. | 15:23 |
jingvar | take a look last lines - inspection started | 15:23 |
dtantsur | So, just to clarify: hasn't using fake:// made the problem go away? | 15:24 |
jingvar | dtantsur: As I wrote above, transport_url fake:// in inspector.conf helped me, thanks a lot | 15:26 |
dtantsur | SvenKieske: my recommendation for Kolla would be: use fake:// transport and start only one copy of inspector (as you probably already do). | 15:27 |
dtantsur | Fair for the inspector merge work to happen for HA | 15:27 |
dtantsur | s/Fair/Wait/ (brain y u so) | 15:27 |
SvenKieske | :D alright, thanks! | 15:28 |
jingvar | SvenKieske: stable/2023.2 | 15:28 |
SvenKieske | thank you, I take it these pastes are not set to expire? :) | 15:29 |
jingvar | Kayobe uses one inspector, mgoddard: please comment on it | 15:29 |
SvenKieske | yeah, I noticed that | 15:29 |
SvenKieske | in the code, that is | 15:30 |
jingvar | SvenKieske: I'm not sure, just a random pastebin | 15:31 |
SvenKieske | hum | 15:44 |
SvenKieske | dtantsur: may I kindly ask, why "rabbit://" is the default transport_url then? I guess this is how it ended up in our config, because we try to use upstream defaults where possible. | 15:45 |
SvenKieske | https://docs.openstack.org/ironic-inspector/2023.2/configuration/ironic-inspector.html#DEFAULT.transport_url | 15:45 |
*** nfedorov_ is now known as jingvar | 15:45 | |
jingvar | There is a hole and there is a rabbit | 15:48 |
dtantsur | this ^^^ :) | 15:49 |
dtantsur | SvenKieske: we had different plans back in the day. They never came to reality; instead we have the plan to freeze inspector as it is and migrate its functionality gradually into Ironic. | 15:50 |
SvenKieske | yeah, but maybe that would be a case where the default transport_url should be switched, given it doesn't really seem to work? or am I missing something else here? or at least document that most users don't want the default but instead "fake://"? | 15:52 |
dtantsur | Yeah, the documentation is lacking for sure | 15:55 |
dtantsur | It's designed to work with the API/worker split (which was thought to be a future mode of operation similar to the Ironic's API/conductor split) | 15:56 |
SvenKieske | alright. I don't want to promise a docs patch myself as my backlog is already growing, maybe I get around to it some day.. | 15:57 |
jingvar | :) ^^^ | 16:08 |
* TheJulia needs even more coffee today | 16:23 | |
JayF | dtantsur: I'm not certain it's an eventlet issue at a cursory glance. I won't have time to look in depth this week | 16:24 |
rpittau | good night! o/ | 16:25 |
cid | little troubleshooting, but it's not very clear, did I make a breaking change: https://review.opendev.org/c/openstack/ironic/+/909851 | 17:12 |
TheJulia | cid, an opportunity to learn :) | 17:13 |
cid | honestly, been doing a ton of learning recently. | 17:14 |
JayF | cid: so since you have node options defined inside the loop, it's still getting the same problem: that same node_options is gonna be used for each time around | 17:15 |
JayF | one potential solution would be to keep deploy options outta node_options, and add deploy_options (probably nicer to name node_deploy_options?) to the call on Line 2620-ish | 17:17 |
TheJulia | maybe being mindful of the desire to mix ipmi+redfish might also change the overall approach to fix it, but that is also a bit more of a lift structure/formatting/pattern wise | 17:21 |
JayF | oooh | 17:25 |
JayF | that's a good point | 17:25 |
TheJulia | Just looking at it, we're pre-preparing, maybe we just need to prepare each time | 17:25 |
JayF | make the while loop longer? | 17:26 |
JayF | that makes sense | 17:26 |
TheJulia | yeah | 17:26 |
cid | hmm, makes sense (based on my understanding). | 17:30 |
cid | starts working on patch set 3 | 17:30 |
JayF | I'll note | 17:32 |
JayF | well, nevermind | 17:33 |
JayF | was about to make a bad suggestion :D | 17:33 |
cid | "bad" | 17:34 |
JayF | was going to suggest that you can likely test quickly outside of devstack | 17:35 |
JayF | but realized you'd likely need to set 9000 environment variables | 17:35 |
cid | set 9,000 environment variables once? | 17:36 |
TheJulia | eh, 9000 is not awful, but 9001 is concerning ;) | 17:37 |
JayF | The reason the idea was bad is I thought it'd simplify things, but it wouldn't have been simpler :D | 17:40 |
cid | yea | 17:41 |
cid | because I was about to ask: I would love to update my code based on the test results, it's just the resource intensity that worries me. | 17:42 |
cid | it's the machines that will feel my wrath :) | 17:42 |
cid | hence the nudge needing questions. | 17:42 |
JayF | one of the tricks if you're worried about eating up too many CI resources is you can edit zuul.d/[projectname].yaml and comment out jobs from check/gate; but I wouldn't worry about it much | 17:45 |
cid | owkay, will probably try that. | 17:49 |
JayF | Don't spend too much time on it. CI time is important and limited, but so is your time. | 17:55 |
opendevreview | Julia Kreger proposed openstack/ironic master: WIP/DNM: See what explodes when cleaning is enabled for functional tests https://review.opendev.org/c/openstack/ironic/+/909918 | 17:59 |
TheJulia | Wheeeeee https://bugs.launchpad.net/ironic/+bug/2054722 | 18:00 |
TheJulia | (and by Wheee, I mean a sarcastic sound of joy which is actually pain) | 18:00 |
JayF | if fake: pass | 18:01 |
JayF | did I fix it? /s | 18:01 |
JayF | honestly fake agent in sushy seems more appealing lol | 18:01 |
JayF | *sushy-tools | 18:01 |
TheJulia | might be useful, but sort of also disjointed from the fact the tests expect cleaning disabled on a prod cloud | 18:14 |
opendevreview | cid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation https://review.opendev.org/c/openstack/ironic/+/909851 | 19:22 |
opendevreview | Julia Kreger proposed openstack/ironic master: neutron: do not error if no cleaning/provisioning on launch https://review.opendev.org/c/openstack/ironic/+/909937 | 21:14 |
opendevreview | Julia Kreger proposed openstack/ironic master: fix errors messaging around network mappings https://review.opendev.org/c/openstack/ironic/+/909938 | 21:14 |
TheJulia | so the neutron one is one I care about and want to backport. I went to add new testing and realized the underlying callers/methods were also fairly well tested so only added a note. | 21:15 |
opendevreview | Julia Kreger proposed openstack/ironic stable/2023.1: WIP/DNM: See what explodes when cleaning is enabled for functional tests https://review.opendev.org/c/openstack/ironic/+/909828 | 21:22 |
TheJulia | In today's "did I already fix this?!" | 21:23 |
JayF | those are fun changes to "show blame" on | 21:32 |
JayF | and you can explicitly see we added that *before* validate existed | 21:32 |
TheJulia | yup | 21:34 |
JayF | Sometime around 2pm pacific, I'm going to be talking to adamcarthur5 about finding some tasks to extract eventlet dependencies from IPA | 21:38 |
JayF | if you want in on that conversation, lmk | 21:38 |
* JayF suspects "find a new WSGI server" is likely the winner there | 21:38 | |
JayF | any patches that do substantive changes we'll likely hold until after C is branched | 21:39 |
opendevreview | Julia Kreger proposed openstack/ironic-tempest-plugin master: Invoke allocation tests with 'fake' deploy interface https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/909939 | 21:45 |
TheJulia | JayF: I happen to be available until 3 | 22:01 |
JayF | TheJulia: https://us06web.zoom.us/j/89271237336?pwd=IqQRbboSzDyqhRWsgXUy12Kp9vhqde.1 | 22:03 |
TheJulia | I need to do one thing, but I can do it in the background | 22:03 |
opendevreview | Adam McArthur proposed openstack/ironic-python-agent master: Adding support for viewing individual cpu process info https://review.opendev.org/c/openstack/ironic-python-agent/+/909346 | 22:06 |
opendevreview | Julia Kreger proposed openstack/ironic master: WIP/DNM: Don't setup fake first!!! https://review.opendev.org/c/openstack/ironic/+/909918 | 22:11 |
JayF | https://etherpad.opendev.org/p/ipa-eventlet-wsgi is our notes coming out of that chat; I think the plan will be for adamcarthur5 to do some exploration of possibilities and come to PTG with ideas and maybe (time permitting) prototypes. If you have strong opinions and want to share them, etherpad is a good place | 23:10 |
JayF | I wonder if another low-hanging-fruit for migration could be removing the eventlet monkey_patch and explicitly importing the greened versions of socket for use in image streaming (I suspect this'd have to be stacked after 'get rid of eventlet wsgi') | 23:11 |
TheJulia | so yeah, I think I want to burn down the existence of the pure functional test jobs | 23:42 |
TheJulia | since... they default to run fake which breaks if a non-fake driver gets loaded for... say... deploy_interface | 23:42 |
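
The workaround discussed above for a standalone, non-HA inspector is to set `transport_url` to `fake://` in the `[DEFAULT]` section of the inspector configuration. What follows is only a minimal sketch of that edit using Python's standard `configparser`; the helper name `set_fake_transport` and the sample file contents (credentials, host, the `[ironic]` options) are hypothetical and not taken from the log or from kolla-ansible. Note that `configparser` drops comments when it rewrites a file, so a real deployment would more likely change the template or override file instead.

```python
import configparser
import io


def set_fake_transport(conf_text: str) -> str:
    """Return conf_text with [DEFAULT] transport_url set to fake://.

    Hypothetical helper for illustration; it only rewrites INI text.
    """
    # interpolation=None so '%' characters in URLs are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    parser["DEFAULT"]["transport_url"] = "fake://"
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical sample config; values are placeholders.
sample = """[DEFAULT]
transport_url = rabbit://user:secret@controller:5672/
debug = false

[ironic]
auth_type = password
"""

updated = set_fake_transport(sample)
print(updated)
```

As dtantsur notes in the log, this only applies to the inspector; Ironic itself (API plus conductors) still uses RabbitMQ.
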