*** hwoarang has quit IRC | 00:06 | |
*** hwoarang has joined #openstack-ironic | 00:08 | |
*** sdake has joined #openstack-ironic | 00:09 | |
*** sdake has quit IRC | 00:12 | |
*** sdake has joined #openstack-ironic | 00:12 | |
*** jcoufal has joined #openstack-ironic | 00:22 | |
*** jcoufal has quit IRC | 00:39 | |
*** sdake has quit IRC | 00:41 | |
*** hwoarang has quit IRC | 00:59 | |
*** hwoarang has joined #openstack-ironic | 01:00 | |
*** rloo has quit IRC | 01:00 | |
*** sdake has joined #openstack-ironic | 01:07 | |
*** openstackgerrit has joined #openstack-ironic | 01:16 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/virtualbmc master: [WIP] Add Serial-over-LAN (SOL) support https://review.openstack.org/640888 | 01:16 |
*** sthussey has quit IRC | 01:22 | |
openstackgerrit | Kaifeng Wang proposed openstack/ironic-tempest-plugin master: inspector py3 gate fix https://review.openstack.org/640910 | 01:26 |
openstackgerrit | Kaifeng Wang proposed openstack/ironic-inspector master: Exclude unrelevant files from tempest job https://review.openstack.org/640912 | 01:36 |
*** dmellado has quit IRC | 01:50 | |
*** stevebaker has quit IRC | 01:50 | |
*** hwoarang has quit IRC | 01:55 | |
*** hwoarang has joined #openstack-ironic | 01:57 | |
*** whoami-rajat has joined #openstack-ironic | 02:02 | |
*** sdake has quit IRC | 02:05 | |
*** sdake has joined #openstack-ironic | 02:26 | |
*** hwoarang has quit IRC | 02:33 | |
*** hwoarang has joined #openstack-ironic | 02:35 | |
*** cdearborn has quit IRC | 02:43 | |
*** dmellado has joined #openstack-ironic | 02:55 | |
*** andrein has quit IRC | 02:56 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/ironic master: [WIP] honor ipmi_port in serial console drivers https://review.openstack.org/640928 | 02:56 |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/ironic master: [WIP] honor ipmi_port in serial console drivers https://review.openstack.org/640928 | 02:58 |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/ironic master: [WIP] honor ipmi_port in serial console drivers https://review.openstack.org/640930 | 02:59 |
*** dsneddon has quit IRC | 03:19 | |
*** stevebaker has joined #openstack-ironic | 03:25 | |
*** dsneddon has joined #openstack-ironic | 03:35 | |
*** dsneddon has quit IRC | 03:40 | |
*** stendulker has joined #openstack-ironic | 03:47 | |
*** sdake has quit IRC | 03:51 | |
*** andrein has joined #openstack-ironic | 03:51 | |
*** gyee has quit IRC | 03:54 | |
*** andrein has quit IRC | 04:13 | |
*** dsneddon has joined #openstack-ironic | 04:15 | |
*** dsneddon has quit IRC | 04:20 | |
*** dsneddon has joined #openstack-ironic | 04:50 | |
*** dsneddon has quit IRC | 04:57 | |
openstackgerrit | Kaifeng Wang proposed openstack/ironic-inspector master: Exclude unrelevant files from tempest job https://review.openstack.org/640912 | 05:08 |
openstackgerrit | QianBiao Ng proposed openstack/ironic master: Add Huawei iBMC driver support https://review.openstack.org/639288 | 05:24 |
*** dsneddon has joined #openstack-ironic | 05:30 | |
openstackgerrit | Merged openstack/ironic master: Update the log message for ilo drivers https://review.openstack.org/639989 | 05:31 |
*** dsneddon has quit IRC | 05:40 | |
*** sdake has joined #openstack-ironic | 05:47 | |
*** jhesketh has quit IRC | 05:47 | |
*** jhesketh has joined #openstack-ironic | 05:48 | |
*** sdake has quit IRC | 05:50 | |
*** pcaruana has joined #openstack-ironic | 05:52 | |
*** sdake has joined #openstack-ironic | 05:56 | |
*** pcaruana has quit IRC | 06:07 | |
*** dsneddon has joined #openstack-ironic | 06:10 | |
*** rh-jelabarre has quit IRC | 06:13 | |
*** jtomasek has joined #openstack-ironic | 06:17 | |
*** dims has quit IRC | 06:24 | |
*** e0ne has joined #openstack-ironic | 06:24 | |
*** dims has joined #openstack-ironic | 06:26 | |
*** e0ne has quit IRC | 06:33 | |
*** dims has quit IRC | 06:36 | |
*** dims has joined #openstack-ironic | 06:37 | |
*** e0ne has joined #openstack-ironic | 06:46 | |
*** Chaserjim has quit IRC | 06:47 | |
*** Qianbiao has joined #openstack-ironic | 06:59 | |
Qianbiao | Hello. | 07:04 |
Qianbiao | I am working on https://review.openstack.org/#/c/639288/ | 07:04 |
patchbot | patch 639288 - ironic - Add Huawei iBMC driver support - 6 patch sets | 07:04 |
Qianbiao | the openstack-tox-lower-constraints CI results in an error now | 07:05 |
Qianbiao | The reason is "rfc3986==0.3.1" does not match my code. | 07:05 |
Qianbiao | May I upgrade the lower-constraints? | 07:06 |
Qianbiao | Or do I need to use the old version style of rfc3986? | 07:07 |
*** e0ne has quit IRC | 07:07 | |
openstackgerrit | Nikolay Fedotov proposed openstack/ironic-inspector master: Use getaddrinfo instead of gethostbyname while resolving BMC address https://review.openstack.org/626552 | 07:10 |
*** lekhikadugtal has joined #openstack-ironic | 07:36 | |
arne_wiebalck | good morning, ironic | 07:46 |
*** tssurya has joined #openstack-ironic | 08:16 | |
rpittau|afk | good morning ironic! o/ | 08:16 |
*** lekhikadugtal has quit IRC | 08:16 | |
*** rpittau|afk is now known as rpittau | 08:16 | |
*** rh-jelabarre has joined #openstack-ironic | 08:17 | |
*** e0ne has joined #openstack-ironic | 08:17 | |
*** pcaruana has joined #openstack-ironic | 08:18 | |
*** jtomasek has quit IRC | 08:20 | |
*** yolanda has joined #openstack-ironic | 08:21 | |
*** yolanda has quit IRC | 08:23 | |
*** pcaruana has quit IRC | 08:25 | |
*** yolanda has joined #openstack-ironic | 08:25 | |
*** sdake has quit IRC | 08:36 | |
*** pcaruana has joined #openstack-ironic | 08:37 | |
*** priteau has joined #openstack-ironic | 08:42 | |
*** pcaruana has quit IRC | 08:44 | |
iurygregory | good morning o/ | 08:46 |
iurygregory | morning rpittau =) | 08:46 |
rpittau | hi iurygregory :) | 08:46 |
mgoddard | morning all | 08:50 |
rpittau | hi mgoddard :) | 08:50 |
iurygregory | morning mgoddard | 08:51 |
mgoddard | I'll be at a client's office today and tomorrow, with no internet access. Should be able to push a patch with deploy template nits before my train arrives at the station... | 08:51 |
mgoddard | morning rpittau iurygregory | 08:51 |
*** lekhikadugtal has joined #openstack-ironic | 08:51 | |
*** amoralej|off is now known as amoralej | 08:53 | |
*** e0ne has quit IRC | 08:58 | |
*** iurygregory has quit IRC | 08:58 | |
*** e0ne has joined #openstack-ironic | 09:00 | |
*** iurygregory has joined #openstack-ironic | 09:01 | |
*** pcaruana has joined #openstack-ironic | 09:01 | |
*** mbeierl has quit IRC | 09:02 | |
*** mbeierl has joined #openstack-ironic | 09:04 | |
*** lekhikadugtal has quit IRC | 09:07 | |
openstackgerrit | Mark Goddard proposed openstack/ironic-tempest-plugin master: Deploy templates: add API tests https://review.openstack.org/637187 | 09:08 |
*** andrein has joined #openstack-ironic | 09:09 | |
*** moshele has joined #openstack-ironic | 09:09 | |
openstackgerrit | Mark Goddard proposed openstack/ironic master: Deploy templates: conductor and API nits https://review.openstack.org/640446 | 09:13 |
*** S4ren has joined #openstack-ironic | 09:13 | |
*** stendulker has quit IRC | 09:19 | |
*** dtantsur|afk is now known as dtantsur | 09:22 | |
dtantsur | morning ironic | 09:22 |
*** andrein has quit IRC | 09:24 | |
iurygregory | morning dtantsur o/ | 09:26 |
*** andrein has joined #openstack-ironic | 09:27 | |
*** mariojv has quit IRC | 09:30 | |
Qianbiao | hello | 09:31 |
Qianbiao | May someone look at the story https://storyboard.openstack.org/#!/story/2005140 | 09:32 |
dtantsur | Qianbiao: please join #openstack-requirements and work with them on updating constraints. we have no power over it. | 09:34 |
Qianbiao | ok thanks dtantsur | 09:34 |
dtantsur | oh wait | 09:35 |
dtantsur | Qianbiao: my bad, I thought about upper-constraints. lower-constraints can be updated with your patch. | 09:35 |
Qianbiao | I could update it directly in my patch? | 09:35 |
dtantsur | Qianbiao: just make sure you update requirements.txt to >= x.y.z AND lower-constraints.txt to == x.y.z | 09:35 |
dtantsur | Qianbiao: you should, actually, yes. | 09:36 |
Qianbiao | ok thanks | 09:36 |
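The two-file update dtantsur describes would look roughly like this; `x.y.z` is left as a placeholder since the actual minimum version rfc3986 needs is not stated above:

```text
# requirements.txt — raise the floor:
rfc3986>=x.y.z

# lower-constraints.txt — pin exactly that floor:
rfc3986==x.y.z
```

The openstack-tox-lower-constraints job installs exactly the pinned minimums, which is how a too-old floor like rfc3986==0.3.1 gets caught.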
*** derekh has joined #openstack-ironic | 09:36 | |
* dtantsur needs coffee | 09:36 | |
openstackgerrit | Arkady Shtempler proposed openstack/ironic-tempest-plugin master: Test BM with VM on the same network https://review.openstack.org/636598 | 09:45 |
openstackgerrit | Rachit Kapadia proposed openstack/ironic master: Set boot_mode in node properties during OOB Introspection https://review.openstack.org/639698 | 09:46 |
*** lekhikadugtal has joined #openstack-ironic | 09:47 | |
*** lekhikadugtal has quit IRC | 09:48 | |
S4ren | Good morning ironic, I have a question, if I have an instance deployed on a node in ironic, is it possible to wipe that node and it become available, but for the instance itself to remain active in nova? | 09:50 |
*** moshele has quit IRC | 09:52 | |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/python-ironicclient master: Move to zuulv3 https://review.openstack.org/633010 | 09:54 |
*** sdake has joined #openstack-ironic | 09:55 | |
*** lekhikadugtal has joined #openstack-ironic | 09:55 | |
openstackgerrit | QianBiao Ng proposed openstack/ironic master: Add Huawei iBMC driver support https://review.openstack.org/639288 | 09:55 |
arne_wiebalck | S4ren: I guess you’re describing a situation you see in your deployment? (And not something you’d like to see.) | 09:56 |
S4ren | arne_wiebalck, yes it is a situation I saw and definitely not something I'd like to see | 09:58 |
arne_wiebalck | S4ren: :) | 09:58 |
arne_wiebalck | S4ren: I’d say nova and ironic got out of sync. | 09:58 |
arne_wiebalck | S4ren: Do yo know how the instance got wiped? | 09:58 |
dtantsur | S4ren: you can do it via ironic API | 09:59 |
dtantsur | i.e. you deploy a node via nova but undeploy via ironic | 09:59 |
S4ren | Well thats the thing I do not, Im trying to figure this out | 09:59 |
S4ren | dtantsur, Is that possible? I thought that once an instance is deployed on a node in ironic, ironic won't let you do stuff on the node | 09:59 |
dtantsur | S4ren: "undeploy" is a legitimate operation for a deployed node | 10:00 |
dtantsur | one of few that are allowed | 10:00 |
dtantsur | actually, nova uses precisely this operation when you issue 'openstack server delete' | 10:00 |
* arne_wiebalck thinks of dd’ing to a block device which has a fs on top | 10:00 | |
S4ren | So I can undeploy a node in ironic, but that will leave the nova instance untouched is that right? | 10:01 |
dtantsur | S4ren: yep. we have no way of syncing it back. | 10:02 |
S4ren | dtantsur, Is it possible to undeploy the node via the horizon GUI? I am looking at mine now and there is no option to undeploy in the dropdown | 10:03 |
dtantsur | S4ren: I don't have a lot of experience with horizon. are you using ironic-ui? | 10:03 |
openstackgerrit | Digambar proposed openstack/ironic stable/ocata: Fix OOB introspection to use pxe_enabled flag in idrac driver https://review.openstack.org/640969 | 10:03 |
openstackgerrit | Digambar proposed openstack/ironic stable/ocata: Fix OOB introspection to use pxe_enabled flag in idrac driver https://review.openstack.org/640969 | 10:05 |
S4ren | dtantsur, I am not sure, let me doublecheck | 10:07 |
S4ren | dtantsur, yeah I am using ironic-ui | 10:12 |
arne_wiebalck | S4ren: nova should also notice this inconsistency, it probably won’t do anything about it (it doesn’t for VMs), but it should log it | 10:12 |
*** mbuil has joined #openstack-ironic | 10:13 | |
S4ren | It did, there are logs saying "Instance is unexpectedly not found. Ignore." | 10:13 |
arne_wiebalck | S4ren: that’s what I meant :) | 10:14 |
dtantsur | iurygregory: images updated \o/ but we apparently forgot the branch suffix >_< https://tarballs.openstack.org/ironic-python-agent/tinyipa/files/ | 10:15 |
iurygregory | dtantsur, omg | 10:15 |
iurygregory | well, we're almost there \o/ | 10:15 |
dtantsur | I wonder at which point it disappeared.. wanna take a look? | 10:15 |
dtantsur | yeah, at least it builds | 10:15 |
iurygregory | sure | 10:15 |
iurygregory | looking now | 10:15 |
iurygregory | =) | 10:15 |
dtantsur | iurygregory: the logs from the last build, if you need it: http://logs.openstack.org/dd/dd300fe49e0799936111832a869631b9ea6775f6/post/ironic-python-agent-buildimage-tinyipa/c73025b/ | 10:16 |
iurygregory | its call zuulv3 magic dtantsur XD | 10:16 |
dtantsur | lol | 10:16 |
mbuil | hey guys, what is the login for the Fedora IPA image? I am having some problems and I would like to see the logs of the IPA process | 10:16 |
dtantsur | mbuil: I don't think there is a password set, unless you set it explicitly when building an image | 10:16 |
dtantsur | mbuil: check https://docs.openstack.org/ironic-python-agent/latest/admin/troubleshooting.html#gaining-access-to-ipa-on-a-node | 10:17 |
openstackgerrit | Nisha Agarwal proposed openstack/ironic master: [WIP] Adds graphical console implementation for ilo drivers https://review.openstack.org/640973 | 10:18 |
S4ren | Hmm would setting an active node to available state cause it to self clean? | 10:18 |
mbuil | dtantsur: ok, thanks! any idea why nodes could be stuck for a very long time in "wait call-back"? | 10:21 |
dtantsur | mbuil: usually it's DHCP, PXE or otherwise networking problems | 10:21 |
arne_wiebalck | S4ren: how would you do that? | 10:24 |
mbuil | dtantsur: In the node, I can see a successful heartbeat and I see "ens1f0: state change: disconnected -> prepare". Then, "ens1f0: state change: prepare -> config". Then, "ens1f0: state change: config -> ip-config". Then, "ens1f0: activation: beginning transaction (timeout in 45s)". And finally "ens1f0: dhclient started with pid 3014" | 10:28 |
dtantsur | mbuil: just to double-check: make sure your nodes are not in maintenance | 10:30 |
mbuil | dtantsur: in the server I see "ironic_inspector.pxe_filter.base [-] The PXE filter driver NoopFilter, state=initialized left the fsm_reset_on_error context fsm_reset_on_error /usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/base.py:153" | 10:30 |
mbuil | dtantsur: maintenance is False for both nodes | 10:31 |
mbuil | dtantsur: and I also see in the server "dnsmasq-dhcp[27659]: DHCPDISCOVER(eth0) 5c:b9:01:8b:a6:30 ignored" :( | 10:32 |
dtantsur | mbuil: it may be ironic-inspector DHCP. anyway, if you have IPA running, you're past the DHCP stage. | 10:32 |
mbuil | dtantsur: ok, that's what I was wondering | 10:33 |
dtantsur | mbuil: also check that you're not affected by https://docs.openstack.org/ironic/latest/admin/troubleshooting.html#dhcp-during-pxe-or-ipxe-is-inconsistent-or-unreliable | 10:34 |
dtantsur | (you shouldn't since you're in IPA, but just in case) | 10:34 |
mbuil | dtantsur: I can also see from time to time in the node "cancel DHCP transaction", is that relevant? TBH, I am a bit lost because I am not sure what should be happening when node is in "wait call-back" state. It should contact the server with info about the node through HTTP? | 10:36 |
dtantsur | mbuil: what is happening is that IPA heartbeats into ironic, and as a reaction of these heartbeats ironic tells IPA to do something | 10:36 |
dtantsur | (partition a disk, flash an image, etc) | 10:36 |
dtantsur | so if you see heartbeats, ironic is supposed to react to them | 10:36 |
dtantsur | * successful heartbeats | 10:37 |
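The cycle dtantsur describes (IPA heartbeats into ironic; ironic reacts to each successful heartbeat by telling the agent what to do next) can be modeled in a few lines of Python. None of these classes exist in ironic or IPA; this is purely a sketch of the control flow:

```python
# Illustrative model of the heartbeat/command cycle, not real ironic/IPA code.

class Conductor:
    """Reacts to heartbeats by handing the agent its next pending step."""

    def __init__(self, steps):
        self.steps = list(steps)

    def handle_heartbeat(self, node_uuid):
        # Each successful heartbeat triggers the next pending step
        # (partition a disk, flash an image, etc.).
        if self.steps:
            return self.steps.pop(0)
        return None  # nothing left to do: deployment is finished


class Agent:
    """Heartbeats into the conductor and executes whatever it is told."""

    def __init__(self, node_uuid, conductor):
        self.node_uuid = node_uuid
        self.conductor = conductor
        self.executed = []

    def heartbeat(self):
        command = self.conductor.handle_heartbeat(self.node_uuid)
        if command:
            self.executed.append(command)
        return command


# Hypothetical step names, for illustration only.
conductor = Conductor(["prepare_image", "write_image", "configure_bootloader"])
agent = Agent("e1369efa-5391-5035-8533-3a065c44a584", conductor)

while agent.heartbeat():
    pass

print(agent.executed)  # every step ran, in order
```

A node stuck in "wait call-back" with successful heartbeats, as in mbuil's case, corresponds to the loop above spinning while one long step (such as an image download) is still in progress.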
S4ren | arne_wiebalck, When an instance is active the user has the option of moving it into available state. This will undeploy it/wipe it and this is available in ironic-ui as well. Is this expected behavior? | 10:37 |
dtantsur | yes | 10:38 |
mbuil | dtantsur: I think they are successful: "Mar 05 10:37:13 linux ironic-api[26625]: 2019-03-05 10:37:13.760 26914 INFO eventlet.wsgi.server [req-48bd1c85-bf40-4275-8193-dd5eb5928868 - - - - -] 192.168.122.3 "POST /v1/heartbeat/e1369efa-5391-5035-8533-3a065c44a584 HTTP/1.1" status: 202 len: 298 time: 0.0552812" | 10:38 |
dtantsur | yep | 10:39 |
dtantsur | mbuil: anything in ironic-conductor logs? I wonder what it is doing. | 10:39 |
mbuil | dtantsur: nope, in journalctl I only see logs for ironic-inspector | 10:43 |
dtantsur | mbuil: maybe they're in /var/log/ironic? | 10:44 |
mbuil | dtantsur: nope, nothing. Is ironic-conductor the one supposed to tell IPA to do something? | 10:46 |
S4ren | Follow on point, is there a way to disable this behavior so that the only way to delete the instance is through nova, at least in ironic-ui? | 10:46 |
dtantsur | mbuil: yeah. you may need to turn DEBUG logging on if you haven't already. | 10:47 |
*** v12aml has quit IRC | 10:48 | |
arne_wiebalck | S4ren: “has the option“ you mean the ui offers this? | 10:48 |
dtantsur | S4ren: I don't think so | 10:48 |
S4ren | Yes it does | 10:48 |
S4ren | dtantsur, :( | 10:48 |
arne_wiebalck | dtantsur: what would be the cli equivalent? | 10:48 |
dtantsur | arne_wiebalck: 'openstack baremetal node undeploy'? | 10:49 |
arne_wiebalck | dtantsur: sure, but what I meant is to change the provisioning states | 10:49 |
dtantsur | arne_wiebalck: this is the command for "deleted" action. we no longer have a universal command for them. | 10:50 |
dtantsur | it's $ openstack baremetal node {undeploy,deploy,rebuild,clean,inspect,...} | 10:50 |
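All of those subcommands drive the same provision-state endpoint with a different target verb; undeploy is the "deleted" action dtantsur mentions. A sketch of the request each command builds (the node UUID is made up, and actually sending the request is omitted):

```python
import json

# The bare-metal API drives all of these through one endpoint:
#   PUT /v1/nodes/{node}/states/provision
# with a different "target" verb per CLI subcommand.
COMMAND_TO_TARGET = {
    "undeploy": "deleted",   # what 'openstack baremetal node undeploy' sends
    "deploy": "active",
    "rebuild": "rebuild",
    "clean": "clean",
}


def provision_request(node_uuid, command):
    """Build (url, body) for a provision-state change; sending is omitted."""
    url = "/v1/nodes/%s/states/provision" % node_uuid
    body = json.dumps({"target": COMMAND_TO_TARGET[command]})
    return url, body


url, body = provision_request("e1369efa-5391-5035-8533-3a065c44a584", "undeploy")
print(url, body)
```

This is also why ironic cannot sync the deletion back to nova: the API call succeeds on its own, and nova only notices the instance is gone later.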
*** sdake has quit IRC | 10:51 | |
arne_wiebalck | dtantsur: right, hence my question to S4ren where the user has the option to move a node from active to available | 10:51 |
arne_wiebalck | “When an instance is active the user has the option of moving it into available state.” | 10:51 |
S4ren | That option is definitely there | 10:51 |
arne_wiebalck | S4ren: So, the ui offers this option and under the hood it calls ‘undeploy’? | 10:52 |
S4ren | It seems that way yes | 10:52 |
arne_wiebalck | S4ren: dtantsur: That sounds like a dangerous option … to say the least :) | 10:52 |
S4ren | It is somewhat confusing because normally moving a node to available is done from manageable, and that doesn't have a specific effect apart from triggering cleaning if it is enabled | 10:53 |
S4ren | arne_wiebalck, Agreed | 10:53 |
*** v12aml has joined #openstack-ironic | 10:53 | |
dtantsur | sounds like a UX issue | 10:53 |
arne_wiebalck | dtantsur: ++ | 10:53 |
arne_wiebalck | S4ren: correct | 10:54 |
*** sdake has joined #openstack-ironic | 10:54 | |
S4ren | I might raise this as a bug/request, would the appropriate way be as a storyboard? | 10:56 |
arne_wiebalck | S4ren: yes | 10:57 |
mbuil | dtantsur: after turning on DEBUG logging, these are the logs. I can't see anything strange there but hopefully you can: http://paste.openstack.org/show/747279/ | 10:57 |
*** dougsz has joined #openstack-ironic | 10:58 | |
dtantsur | mbuil: it looks like IPA is still downloading the image. weird. is the image large? | 11:00 |
mbuil | dtantsur: can you tell me where you see that in the logs? The image is 870M | 11:04 |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic-tempest-plugin master: Deploy templates: add API tests https://review.openstack.org/637187 | 11:05 |
dtantsur | mbuil: Status of agent commands for node e1369efa-5391-5035-8533-3a065c44a584: prepare_image: result "None", error "None" | 11:08 |
dtantsur | mbuil: note that if the image is not RAW, you need enough RAM on the target machine to fit it for conversion. | 11:08 |
mbuil | dtantsur: ok. It has just finished: "Image successfully written to node c1a56cb3-fcef-59d5-8105-7e6815154f70" but it took a long time :( | 11:08 |
mbuil | dtantsur: not sure why, this was much faster a few months ago | 11:09 |
dtantsur | mbuil: it can be some network condition OR it can be that ironic started converting images to RAW to stream them (without fitting in RAM) | 11:09 |
dtantsur | mbuil: for the latter, you can play with https://github.com/openstack/ironic/blob/f4576717ba8e37fdd5868370b0cdd5ac84e2b668/ironic/conf/agent.py#L37 | 11:10 |
mbuil | dtantsur: so let me understand this, if stream_raw_images=True, less memory is needed but it takes longer? RAM is 128GB so, more than enough | 11:14 |
dtantsur | mbuil: you're right. you may want to disable this option if so. | 11:22 |
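For reference, the knob dtantsur links sits in the `[agent]` section of ironic.conf; with 128 GB of RAM on the node, turning streaming off trades memory for speed. A sketch only; see the linked source for the authoritative default and help text:

```ini
[agent]
# True (the default) streams raw images straight to disk so they never
# have to fit in RAM; False downloads and converts the image in memory
# instead, which can be faster when the node has RAM to spare.
stream_raw_images = False
```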
arne_wiebalck | etingof: ping | 11:23 |
etingof | arne_wiebalck, o/ | 11:24 |
arne_wiebalck | etingof: hey o/ | 11:24 |
arne_wiebalck | etingof: we discussed some weeks ago to test the parallel power sync | 11:25 |
etingof | right | 11:25 |
arne_wiebalck | etingof: I could give that a try | 11:25 |
etingof | would be interesting! | 11:25 |
arne_wiebalck | etingof: could you point me to patches I’d need? | 11:25 |
arne_wiebalck | etingof: please ;) | 11:26 |
etingof | arne_wiebalck, I think they are all merged already | 11:26 |
etingof | so just current master should have everything in place | 11:26 |
etingof | arne_wiebalck, or do you want to cherry pick? | 11:27 |
arne_wiebalck | etingof: yeah … but I won’t be able to run master that easily | 11:27 |
arne_wiebalck | etingof: yes | 11:27 |
* arne_wiebalck checking master | 11:28 | |
etingof | arne_wiebalck, https://review.openstack.org/#/c/631872/ | 11:28 |
patchbot | patch 631872 - ironic - Parallelize periodic power sync calls (MERGED) - 5 patch sets | 11:28 |
etingof | arne_wiebalck, https://review.openstack.org/#/c/610007/ | 11:29 |
patchbot | patch 610007 - ironic - Kill misbehaving `ipmitool` process (MERGED) - 9 patch sets | 11:29 |
arne_wiebalck | etingof: awesome, thx, I’ll keep you posted | 11:29 |
etingof | arne_wiebalck, the first patch runs up to 8 ipmitool processes concurrently, the second patch kills off the hung ones | 11:30 |
etingof | there are a couple of follow ups, but these are insignificant I think | 11:30 |
arne_wiebalck | etingof: ok. I’ll start with the first one to see how that affects the power sync loop | 11:31 |
*** e0ne has quit IRC | 11:32 | |
*** andrein has quit IRC | 11:32 | |
dtantsur | thanks arne_wiebalck, this will be extremely useful | 11:38 |
arne_wiebalck | dtantsur: :) | 11:39 |
*** andrein has joined #openstack-ironic | 11:40 | |
etingof | arne_wiebalck, watch for long-running ipmitool processes then | 11:41 |
arne_wiebalck | etingof: ok! | 11:41 |
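The shape of the first patch (a bounded worker pool draining the power-sync calls, instead of polling one node at a time) can be modeled like this; `sync_power_state` is a stand-in for the real ipmitool invocation, and everything here is illustrative rather than ironic's actual conductor code:

```python
from concurrent.futures import ThreadPoolExecutor


def sync_power_state(node):
    # Stand-in for one ipmitool power-status call; the real code shells
    # out to ipmitool and can block for seconds per node.
    return node, "power on"


def power_sync_loop(nodes, max_workers=8):
    """Sync every node's power state with at most max_workers in flight."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for node, state in pool.map(sync_power_state, nodes):
            results[node] = state
    return results


# Roughly the scale arne_wiebalck mentions later: ~700 nodes per conductor.
states = power_sync_loop(["node-%d" % i for i in range(700)])
print(len(states))  # → 700
```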
*** e0ne has joined #openstack-ironic | 11:42 | |
openstackgerrit | Merged openstack/python-ironicclient master: Deploy templates: client support https://review.openstack.org/636931 | 11:49 |
dtantsur | \o/ | 11:50 |
*** andrein has quit IRC | 11:53 | |
*** andrein has joined #openstack-ironic | 11:53 | |
*** lekhikadugtal has quit IRC | 11:57 | |
*** amoralej is now known as amoralej|lunch | 12:00 | |
*** jtomasek has joined #openstack-ironic | 12:06 | |
*** mmethot has quit IRC | 12:15 | |
*** bfournie has quit IRC | 12:20 | |
*** early` has quit IRC | 12:26 | |
*** andrein has joined #openstack-ironic | 12:28 | |
*** early` has joined #openstack-ironic | 12:29 | |
jroll | mornings | 12:30 |
Qianbiao | hello | 12:31 |
Qianbiao | if a node deploy failed, how we could re-deploy it. | 12:31 |
Qianbiao | Provisioning State is "deploy failed", Maintenance is False. | 12:32 |
iurygregory | dtantsur, do we need to change this https://github.com/openstack/ironic-python-agent/blob/80be07ae791980a1c444b3b0d685775c1688ca34/imagebuild/tinyipa/common.sh ? | 12:34 |
iurygregory | morning jroll o/ | 12:34 |
iurygregory | BRANCH_PATH will get master i think since we dont have stable/stein https://github.com/openstack/ironic-python-agent/blob/5b6bf0b6c86aad352db271cf530a6321ad4248eb/playbooks/ironic-python-agent-buildimage/run.yaml#L9 | 12:35 |
dtantsur | iurygregory: well, it does not get master, that's the problem. it's just empty. | 12:36 |
dtantsur | probably ZUUL_REFNAME is not a thing | 12:37 |
iurygregory | yeah | 12:38 |
iurygregory | going to check with infra to see | 12:38 |
*** sdake has quit IRC | 12:39 | |
dtantsur | iurygregory: https://github.com/openstack/openstack-manuals/commit/21edbc931fe09eedf8fa4e219fde1b222c9bce93 | 12:47 |
dtantsur | we probably need the ZUUL_BRANCH thingy | 12:47 |
iurygregory | http://git.openstack.org/cgit/openstack-infra/project-config/tree/playbooks/proposal/propose_update.sh#n82 | 12:47 |
iurygregory | i found this too | 12:47 |
iurygregory | should i go and try Ajaeger approach? | 12:49 |
dtantsur | iurygregory: I think so (we only need a branch though, not all of these) | 12:52 |
iurygregory | dtantsur, going to push a patch =) | 12:53 |
openstackgerrit | Harald Jensås proposed openstack/ironic master: Initial processing of network port events https://review.openstack.org/633729 | 12:59 |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/ironic-python-agent master: Replace ZUUL_REFNAME for zuul.branch https://review.openstack.org/641007 | 13:05 |
openstackgerrit | Merged openstack/networking-generic-switch master: Adding python 3.6 unit test https://review.openstack.org/640796 | 13:12 |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/ironic-python-agent master: Replace ZUUL_REFNAME for zuul.branch https://review.openstack.org/641007 | 13:13 |
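The gist of that patch: ZUUL_REFNAME was a Zuul v2 environment variable and comes up empty under Zuul v3, so the branch has to come from the `zuul.branch` job variable instead. Schematically (the exact playbook layout and any filtering of the branch name are in the review, not guaranteed here):

```yaml
# Before (Zuul v2 era): the build script read $ZUUL_REFNAME, which no
# longer exists, so BRANCH_PATH ended up empty and the published tinyipa
# files lost their branch suffix.
# After: pass the branch in explicitly from Zuul v3's job variables.
environment:
  BRANCH_PATH: "{{ zuul.branch }}"
```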
*** sdake has joined #openstack-ironic | 13:20 | |
openstackgerrit | Riccardo Pittau proposed openstack/networking-baremetal master: Adding py36 environment to tox https://review.openstack.org/640794 | 13:26 |
*** PabloIranzoGmez[ has joined #openstack-ironic | 13:28 | |
*** amoralej|lunch is now known as amoralej | 13:28 | |
PabloIranzoGmez[ | hi | 13:30 |
PabloIranzoGmez[ | I'm trying to add a baremetal host to an environment with regular instances | 13:30 |
openstackgerrit | Riccardo Pittau proposed openstack/networking-baremetal master: Supporting all py3 environments with tox https://review.openstack.org/640794 | 13:30 |
*** dtantsur is now known as dtantsur|brb | 13:31 | |
PabloIranzoGmez[ | thing is that aggregateinstancespecfilter removes all 4 hosts (the ones in the virtual-host aggregate), but says nothing about the baremetal-hosts aggregate | 13:32 |
PabloIranzoGmez[ | where I've added the controllers | 13:32 |
PabloIranzoGmez[ | but I cannot add the baremetal host | 13:32 |
PabloIranzoGmez[ | however, baremetal host shows in details that is member of that aggregate | 13:32 |
PabloIranzoGmez[ | scheduling an instance for baremetal fails because the aggregate for baremetal is not reported, and it filters out all the virtual-hosts as expected | 13:32 |
PabloIranzoGmez[ | osp release is queens | 13:33 |
PabloIranzoGmez[ | any hint? | 13:33 |
*** oanson has joined #openstack-ironic | 13:36 | |
openstackgerrit | Ilya Etingof proposed openstack/sushy-tools master: Add memoization to expensive emulator calls https://review.openstack.org/612758 | 13:40 |
*** priteau has quit IRC | 13:41 | |
*** Qianbiao has quit IRC | 13:42 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/virtualbmc master: [WIP] Add Serial-over-LAN (SOL) support https://review.openstack.org/640888 | 13:43 |
*** priteau has joined #openstack-ironic | 13:52 | |
*** jcoufal has joined #openstack-ironic | 13:53 | |
*** sthussey has joined #openstack-ironic | 14:05 | |
*** mmethot has joined #openstack-ironic | 14:05 | |
*** bfournie has joined #openstack-ironic | 14:09 | |
*** sdake has quit IRC | 14:11 | |
*** mjturek has joined #openstack-ironic | 14:17 | |
openstackgerrit | Merged openstack/ironic master: Add option to protect available nodes from accidental deletion https://review.openstack.org/639264 | 14:21 |
*** sdake has joined #openstack-ironic | 14:21 | |
*** sdake has quit IRC | 14:23 | |
arne_wiebalck | Qianbiao: How about moving the node via manage & provide back to available and then retrigger the creation of an instance (if this is via nova)? | 14:29 |
arne_wiebalck | etingof: dtantsur: I tried the 1st power sync patch now on one of our controllers (which has ~700 nodes) | 14:30 |
arne_wiebalck | etingof: power sync loop took 105 seconds before | 14:30 |
arne_wiebalck | etingof: is now down to 42 secs | 14:31 |
arne_wiebalck | etingof: (default number of workers) | 14:31 |
*** rloo has joined #openstack-ironic | 14:32 | |
arne_wiebalck | etingof: I haven’t checked for ipmi outliers yet | 14:33 |
*** sdake has joined #openstack-ironic | 14:33 | |
arne_wiebalck | etingof: but with that patch we could go back to default values for the power sync interval | 14:34 |
TheJulia_sick | o/ | 14:37 |
TheJulia_sick | Hey everyone | 14:37 |
arne_wiebalck | Hey TheJulia_sick o/ | 14:37 |
rpittau | hi TheJulia_sick :) | 14:38 |
TheJulia_sick | rpittau: Hey, regarding python-hardware, do you think we could submit an update to constraints to get the upper-constraint updated since they just released? | 14:38 |
rloo | morning ironicers arne_wiebalck, TheJulia_sick, rpittau. | 14:38 |
arne_wiebalck | rloo o/ | 14:39 |
rloo | TheJulia_sick: you shouldn't be here... | 14:39 |
arne_wiebalck | rloo ++ | 14:39 |
*** dtantsur|brb is now known as dtantsur | 14:40 | |
rpittau | TheJulia_sick, yes definitely | 14:40 |
dtantsur | morning TheJulia_sick, how are you? | 14:40 |
rpittau | hi rloo :) | 14:41 |
dtantsur | also morning rloo | 14:41 |
TheJulia_sick | woot, and the wifey now has it | 14:41 |
rloo | hi dtantsur! | 14:41 |
TheJulia_sick | joy | 14:41 |
rloo | TheJulia_sick: :-( | 14:41 |
dtantsur | :( | 14:41 |
rloo | TheJulia_sick: nice that you are sharing... again :-( | 14:42 |
TheJulia_sick | rloo: she said the same thing... | 14:42 |
TheJulia_sick | I must have picked up the new flu variant that has been hitting the midwest united states | 14:42 |
TheJulia_sick | luckily, if one has had the vaccine, it's still awful, but fairly quick for the flu | 14:42 |
dtantsur | oh, I see | 14:42 |
rpittau | oh gosh :/ | 14:43 |
TheJulia_sick | s/farily quick/fairly rough and quick/ | 14:43 |
TheJulia_sick | I slept most of saturday/sunday | 14:43 |
dtantsur | TheJulia_sick: FYI I've been fast-approving attempts to make our IPA post job back into operation | 14:43 |
TheJulia_sick | dtantsur: oh.. joy | 14:43 |
dtantsur | yeah, it's been down since mid-Jan | 14:44 |
TheJulia_sick | dtantsur: ack ack, anything else I need to be aware of | 14:44 |
TheJulia_sick | sweet! | 14:44 |
* iurygregory zuulv3 magic | 14:44 | |
dtantsur | and I really want it back before FF | 14:44 |
dtantsur | TheJulia_sick: nothing else, go back to bed :) | 14:44 |
TheJulia_sick | dtantsur: eh, FF is only in our minds ;) | 14:44 |
iurygregory | get better TheJulia_sick o/ | 14:44 |
* TheJulia_sick <3s you all | 14:44 | |
rpittau | TheJulia_sick, take care :) | 14:45 |
* TheJulia_sick is still reading email :) | 14:46 | |
TheJulia_sick | dtantsur: btw, I don't know if you saw, the fast track change set actually worked in CI. Just the scenario test I built needs a little more work :) | 14:46 |
TheJulia_sick | s/built/cobbled together/ | 14:46 |
dtantsur | neat! I'll try to get to it after I finish going through TC candidate statements/discussions and finally vote.. | 14:47 |
rpittau | TheJulia_sick, one thing though, the old hardware package is not in the current upper-constraints, I guess it's because it's in plugin-requirements and not in requirements | 14:47 |
etingof | arne_wiebalck, \o/ | 14:47 |
TheJulia_sick | dtantsur: ack, I figure we can rip the depends-on off and take out the test job from running, and add it back in a later patch | 14:48 |
TheJulia_sick | just so we can get it merged and iterate from there | 14:48 |
dtantsur | I think it's a good idea | 14:48 |
*** sdake has quit IRC | 14:48 | |
dtantsur | rloo: if you have a minute today, https://review.openstack.org/#/c/639050/ would make my life somewhat easier | 14:49 |
patchbot | patch 639050 - ironic - Allow building configdrive from JSON in the API - 8 patch sets | 14:49 |
rloo | dtantsur: ok | 14:49 |
*** sdake has joined #openstack-ironic | 14:50 | |
* etingof wishes s/TheJulia_sick/TheJulia/ quickly | 14:54 | |
rloo | hjensas: hi, i'm looking at the neutron event stuff. wondering where you have gotten to. Do you have it working, tested, etc? There are two WIP PRs and I don't know if you will be adding more PRs, so wanted to get an idea. | 14:55 |
dtantsur | hjensas: .. and what's the situation around the networking-baremetal part? | 14:57 |
*** pcaruana has quit IRC | 14:57 | |
hjensas | rloo: I have only the add/remove cleaning network implemented. It's working, but a PoC. It doesn't cover the port update setting dhcp_opts either, so it's very early. Won't be done this week. | 14:58
dtantsur | hjensas: should we press to get it done in Stein then? how likely is it even? | 14:58 |
*** munimeha1 has joined #openstack-ironic | 14:59 | |
etingof | arne_wiebalck, I wonder why the time period reduction is far from being 8-fold though | 15:00 |
hjensas | dtantsur: not likely to finish in weeks yet. We need to have notifications in CI as well, the ironic jobs do not use the baremetal mech driver. So we would need that, or implement the notifier in neutron. | 15:01
arne_wiebalck | etingof: may be it’s dominated by sth uncompressible | 15:01 |
rloo | hjensas: thx for the status. I think that it won't get done in Stein. | 15:01 |
*** e0ne has quit IRC | 15:02 | |
dtantsur | right. then we should plan on early Train. | 15:02 |
dtantsur | This looks scary to rush in.. | 15:02 |
hjensas | rloo: dtantsur: yes, and we may want to have another discussion at PTG regarding mgoddard's idea of integrating it in steps? | 15:03 |
rloo | hjensas, dtantsur: yeah, i don't want to rush this in. I was thinking about what mark said yesterday about using steps (deploy/clean) and i think it makes sense. | 15:03 |
dtantsur | hjensas: totally. have you added it to the etherpad? | 15:03 |
rloo | hjensas: ++ :) | 15:03 |
*** mjturek has quit IRC | 15:03 | |
rloo | hjensas, dtantsur: are you OK if I remove the neutron events stuff from our weekly priorities? | 15:03 |
dtantsur | fine with me | 15:04 |
hjensas | rloo: fine with me as well. | 15:04 |
*** sdake has quit IRC | 15:04 | |
rloo | thx hjensas. | 15:04 |
arne_wiebalck | TheJulia_sick: jroll: dtantsur: a first version of our downstream software RAID patches is now available from: https://review.openstack.org/#/q/topic:software_raid | 15:06 |
dtantsur | arne_wiebalck: nice! what's left to get rid of [WIP]? | 15:06 |
arne_wiebalck | dtantsur: there are a few open points I’d like to discuss (I made corresponding task in the story as well) | 15:07 |
*** sdake has joined #openstack-ironic | 15:07 | |
dtantsur | arne_wiebalck: is it PTG level of discussion or just on the patches? | 15:07 |
etingof | arne_wiebalck, how about trying more power sync workers? ;) | 15:08 |
arne_wiebalck | etingof: doing that already … ;) | 15:09 |
arne_wiebalck | dtantsur: half and half I’d say | 15:09 |
dtantsur | I see. are you coming to the PTG, arne_wiebalck? | 15:10 |
arne_wiebalck | dtantsur: no, sorry | 15:10 |
arne_wiebalck | dtantsur: on the list for next one :) | 15:10 |
arne_wiebalck | dtantsur: the code is woking, btw, I verified it in our QA env with real machines | 15:10 |
arne_wiebalck | it’s not only woking it’s even working | 15:11 |
arne_wiebalck | dtantsur: I can add an item to the next weekly meeting | 15:11
dtantsur | arne_wiebalck: yes please, ideally with specific questions | 15:11 |
arne_wiebalck | dtantsur: just wanted to signal that the code is up in a first version | 15:12 |
arne_wiebalck | dtantsur: I can remove the [WIP] if that attracts more eyes | 15:12 |
*** e0ne has joined #openstack-ironic | 15:13 | |
dtantsur | arne_wiebalck: generally, if you think the code is worth merging, remove [WIP] | 15:13 |
arne_wiebalck | dtantsur: in that case I leave the [WIP] until we discussed some of the points :) | 15:14 |
arne_wiebalck | etingof: going to 32 workers does not change the timing | 15:15 |
arne_wiebalck | etingof: (assuming I successfully changed the no of workers) | 15:15
hjensas | dtantsur: added to the PTG etherpad to discuss neutron events again. | 15:17 |
dtantsur | thanks! | 15:18 |
rloo | dtantsur: was this approved? https://storyboard.openstack.org/#!/story/2005083 | 15:18 |
rloo | dtantsur: jroll seems good with it. | 15:19 |
dtantsur | rloo: apparently not officially. jroll said okay in the comment, TheJulia_sick agreed on the meeting (IIRC) | 15:19 |
dtantsur | if you agree, we can call it approved | 15:19 |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/python-ironicclient master: Move to zuulv3 https://review.openstack.org/633010 | 15:22 |
* iurygregory starts to pray for the CI | 15:23 | |
rloo | dtantsur: trying to parse the description of that story. The configdrive now can be string or gzip-base-64-encoded. You want to accept configdrive as a json with 3? (not 2) keys? | 15:24 |
dtantsur | rloo: up to three keys, yes (did I leave two somewhere still? it's from an older version) | 15:25 |
rloo | dtantsur: ok, got it. with user_data value maybe being json or not, but the other two have jsons as their values. | 15:25 |
etingof | arne_wiebalck, is it plausible that you have one slow BMC on which ipmitool blocks for 42 sec? | 15:26 |
dtantsur | rloo: exactly | 15:26 |
rloo | dtantsur: and the only defaults will be for meta_data. if the others aren't specified we don't do anything. | 15:26 |
dtantsur | rloo: correct | 15:26 |
rloo | ok, got it. thx. will approve. | 15:26 |
dtantsur | thnx! | 15:26 |
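The configdrive-as-JSON shape agreed on above (up to three keys, with only meta_data getting defaults, and user_data allowed to be a plain string or JSON) might look like the sketch below. The key names follow the story being discussed; the field contents are purely illustrative:

```python
# Illustrative only: a configdrive passed to the API as JSON instead of a
# pre-built gzip+base64 blob. All three keys are optional; per the
# discussion, only meta_data gets defaults, the others are simply
# omitted if unspecified.
configdrive = {
    "meta_data": {"name": "node-0"},                     # JSON object
    "user_data": "#cloud-config\nruncmd: ['echo hi']",   # string or JSON
    "network_data": {"links": [], "networks": [], "services": []},
}

print(sorted(configdrive))
```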
* etingof wonders if arne_wiebalck applied the killer patch as well...? | 15:27 | |
*** Chaserjim has joined #openstack-ironic | 15:27 | |
arne_wiebalck | etingof: I verified that I indeed increased the number of threads. | 15:27 |
arne_wiebalck | etingof: no killer patch yet | 15:27 |
arne_wiebalck | etingof: I was thinking switching to debug mode and see if there is a slow “power status” call … not sure if I will find it, though | 15:28 |
*** openstackgerrit has quit IRC | 15:28 | |
* arne_wiebalck sharpens the grep knife | 15:28 | |
arne_wiebalck | etingof: you’d think I should go with the killer patch first? | 15:29 |
*** bfournie has quit IRC | 15:29 | |
etingof | this killer patch is the most reliable part of the whole system, arne_wiebalck | 15:33 |
etingof | so I'd try it sooner or later | 15:33 |
*** openstackgerrit has joined #openstack-ironic | 15:33 | |
openstackgerrit | Merged openstack/ironic-python-agent master: Replace ZUUL_REFNAME for zuul.branch https://review.openstack.org/641007 | 15:33 |
etingof | but debugging what's happening first makes sense as well | 15:33 |
rloo | dtantsur: sorry, another question. are you going to update the client to support that new configdrive format? | 15:37 |
dtantsur | rloo: so, this is something I've been postponing because ironicclient already has decent support for building configdrives. | 15:38 |
rloo | dtantsur: well, just seems inconsistent. would someone using the client, want to pass in json for the configdrive? | 15:39 |
dtantsur | rloo: okay, I can (and I will) fix https://github.com/openstack/python-ironicclient/blob/9cd584548a77492bfa0f41e7ea72546baed4a58d/ironicclient/v1/node.py#L537-L540 to not blow up on a dict | 15:39
dtantsur | but CLI already supports passing a directory, which should be enough | 15:39 |
rloo | dtantsur: i'm fine if you do it in train. i was just wondering if it was something that we wanted to land this week | 15:39 |
dtantsur | rloo: the ironicclient fix will be small, I'll prepare it right now. | 15:40 |
rloo | dtantsur: ok. now i'll look at your ironic PR :) | 15:40 |
*** pcaruana has joined #openstack-ironic | 15:42 | |
openstackgerrit | Merged openstack/ironic-tempest-plugin master: Deploy templates: add API tests https://review.openstack.org/637187 | 15:44 |
*** sdake has quit IRC | 15:49 | |
*** mjturek has joined #openstack-ironic | 15:49 | |
dtantsur | huh, I seem to have found a bug in our old configdrive code.. | 15:52 |
rpittau | just realized gate is broken for ironic-inspector :/ | 15:53
iurygregory | D: | 15:53 |
dtantsur | SIGH | 15:53 |
dtantsur | there was a patch this morning | 15:54 |
dtantsur | this https://review.openstack.org/#/c/640910/ ? | 15:54 |
patchbot | patch 640910 - ironic-tempest-plugin - inspector py3 gate fix - 1 patch set | 15:54 |
rpittau | yes | 15:55 |
rpittau | but I was just rechecking like a blind lemming | 15:55 |
rpittau | I guess I'll add a depends on until that is merged | 15:56 |
openstackgerrit | Arkady Shtempler proposed openstack/ironic-tempest-plugin master: Test BM with VM on the same network https://review.openstack.org/636598 | 15:56 |
mbuil | dtantsur: do you know what network protocol is it used to transmit the image to the nodes? | 15:56 |
dtantsur | mbuil: depends on deploy_interface used. I assume you use "direct", thus HTTP(s) | 15:58 |
iurygregory | rpittau, just remember that depends-on wont trigger merge if your patch have +2 +W XD | 15:58 |
mbuil | dtantsur: that's what I thought but I can't see any HTTP traffic going on :/, need to investigate | 15:59 |
rpittau | iurygregory, yeah, it should be ok | 15:59 |
openstackgerrit | Dmitry Tantsur proposed openstack/python-ironicclient master: Support passing a dictionary for configdrive https://review.openstack.org/641061 | 15:59 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-inspector master: [trivial] removing pagination in Get Introspection Data api-ref https://review.openstack.org/640705 | 16:00 |
*** sdake has joined #openstack-ironic | 16:01 | |
*** ajya[m] has quit IRC | 16:01 | |
*** ajya[m] has joined #openstack-ironic | 16:02 | |
arne_wiebalck | etingof: there are some nodes where the lock for the power sync state is held 6+ seconds | 16:02 |
etingof | arne_wiebalck, should we power sync locked nodes later in time...? | 16:05 |
arne_wiebalck | etingof: the lock is shared, so is the time the lock is taken relevant? | 16:06 |
arne_wiebalck | etingof: it is shared, right? | 16:06 |
arne_wiebalck | etingof: oh, wait … I missed that the call time seems to be logged! | 16:09 |
* arne_wiebalck collects more data | 16:09 | |
openstackgerrit | Dmitry Tantsur proposed openstack/python-ironicclient master: Support passing a dictionary for configdrive https://review.openstack.org/641061 | 16:11 |
dtantsur | rloo: ^^ (even with CLI, it proved straightforward) | 16:13 |
* dtantsur -> exercising | 16:13 | |
rloo | dtantsur: you're faster than me :) | 16:13 |
etingof | arne_wiebalck, yes, lock is shared | 16:14 |
arne_wiebalck | etingof: I see 4 BMCs replying only in 4+ seconds plus 10 or so in 1+ secs | 16:14 |
dtantsur | heh | 16:14 |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/python-ironicclient master: Move to zuulv3 https://review.openstack.org/633010 | 16:14 |
arne_wiebalck | etingof: if ~40 secs is b/c this is simply the sum of the ipmi calls of the “slowest” worker I don’t understand why more workers don’t help | 16:15 |
arne_wiebalck | etingof: but the spectrum of ipmi runtimes varies from 0.03s to 5.1s | 16:17 |
etingof | arne_wiebalck, there are just 8 workers, right? if you have 8 slow BMCs, everything beyond 8 would accumulate time perhaps | 16:17 |
*** bfournie has joined #openstack-ironic | 16:17 | |
arne_wiebalck | etingof: 32 workers atm | 16:17 |
arne_wiebalck | etingof: let me go to 4 :) | 16:18 |
etingof | arne_wiebalck, I think you should sum up all runtimes and divide by 32 | 16:18 |
etingof | arne_wiebalck, you should get the Answer to the Ultimate Question of Life, the Universe, and Everything | 16:19 |
arne_wiebalck | etingof: ha! | 16:19 |
rpittau | lol | 16:20 |
* arne_wiebalck calculates | 16:20 | |
arne_wiebalck | etingof: total time divided by workers is ~6 | 16:22 |
arne_wiebalck | etingof: 201 secs and 32 workers | 16:22 |
arne_wiebalck | etingof: this is the max locking time and close to the max ipmitool runtime | 16:23 |
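The back-of-the-envelope above (total ipmitool time divided by worker count, bounded below by the slowest single call) is the standard makespan lower bound. A minimal sketch, using hypothetical per-BMC runtimes rather than the real 201 s distribution:

```python
# Back-of-the-envelope only: a pool of workers draining per-node
# power-sync calls can never finish faster than either bound.
def lower_bound(runtimes, workers):
    # slowest single call, or total work spread evenly across workers
    return max(max(runtimes), sum(runtimes) / workers)

# hypothetical mix: a few very slow BMCs, some slow, many fast
runtimes = [5.1] * 4 + [1.2] * 10 + [0.03] * 150
print(round(lower_bound(runtimes, 32), 2))
```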
etingof | arne_wiebalck, can it mean that the workers are idling, but you still have a chunk of slow BMCs that line-up to 42 secs? | 16:24 |
arne_wiebalck | etingof: if the slow BMCs go all the same worker you mean? | 16:24 |
etingof | arne_wiebalck, yes, that would be the *slowest* worker meaning there could be more than one | 16:25 |
*** kandi has joined #openstack-ironic | 16:25 | |
etingof | arne_wiebalck, but you see that some workers are waiting on the shared lock? | 16:26 |
etingof | arne_wiebalck, if you have an exclusive lock on a node, that would cause power sync to block, right? | 16:27 |
arne_wiebalck | etingof: the max ipmitool time is 5 secs, the max lock time is 6 secs, and I was assuming these were the same workers | 16:27
etingof | arne_wiebalck, is it 'do_sync_power_state' time that you count? | 16:28 |
arne_wiebalck | etingof: it’s the ipmitool call time which is printed in debug mode | 16:29 |
arne_wiebalck | etingof: the total time is from instrumenting _sync_power_states | 16:30 |
etingof | arne_wiebalck, _sync_power_states is misleading - it's actually instruments (1) spawning N-1 threads and (2) running N-th thread from start to finish | 16:32 |
arne_wiebalck | etingof: correct | 16:32 |
etingof | arne_wiebalck, 'do_sync_power_state' seems to measure a single ipmitool runtime | 16:33 |
arne_wiebalck | etingof: do you know what the METRICS.imer decorator does? | 16:38 |
arne_wiebalck | etingof: METRICS.timer | 16:38 |
etingof | arne_wiebalck, my understanding is that it measures the runtime of the decorated callable, no? | 16:40 |
arne_wiebalck | etingof: right … do you know how to access the generated data? | 16:41 |
arne_wiebalck | etingof: it doesn’t seem to go to the debug output | 16:41 |
arne_wiebalck | etingof: 4 threads is roughly the same as 8 threads | 16:42 |
iurygregory | dtantsur, we will only know if the patch worked tomorrow? | 16:43 |
*** tssurya has quit IRC | 16:45 | |
etingof | arne_wiebalck, statsd? | 16:46 |
etingof | arne_wiebalck, https://docs.openstack.org/ironic/ocata/deploy/metrics.html | 16:46 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-ui master: Supporting all py3 environments with tox https://review.openstack.org/641080 | 16:46 |
arne_wiebalck | etingof: yes … not configured | 16:47 |
etingof | arne_wiebalck, so the runtimes from debug - can we trust them? | 16:49 |
etingof | arne_wiebalck, where in the code is it calculated...? | 16:50 |
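For reference, a METRICS.timer-style decorator can be sketched as below. This is a toy under stated assumptions, not ironic_lib's actual code: the real implementation ships measurements to a configured backend such as statsd (hence "not configured" meaning no visible output), while this sketch just records durations in a dict:

```python
import functools
import time

# Toy stand-in for a METRICS.timer-style decorator (assumption: the real
# one emits to a metrics backend; here we stash durations locally).
timings = {}

def timer(name):
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return fn(*args, **kwargs)
            finally:
                timings[name] = time.monotonic() - start
        return wrapper
    return deco

@timer("power_status")
def power_status():
    time.sleep(0.01)  # pretend BMC round-trip
    return "on"

power_status()
print(timings["power_status"])
```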
etingof | arne_wiebalck, we have also tried 'sa' against ipmitool if memory serves | 16:51 |
arne_wiebalck | etingof: yes we did! | 16:52 |
arne_wiebalck | etingof: let me see if I can find out where the 40s come from | 16:53 |
etingof | arne_wiebalck that would be extremely useful - may be we could squeeze some more time out of the power sync loop | 16:54 |
arne_wiebalck | etingof: my current guess is that the slow BMCs end up with the same worker | 16:54 |
arne_wiebalck | etingof: the nodes are ordered, no? | 16:55 |
etingof | arne_wiebalck, that seems unlikely | 16:55
arne_wiebalck | etingof: fully agree | 16:55 |
etingof | arne_wiebalck, the nodes might be ordered in the shared queue, then the workers pick up one job at a time | 16:56 |
mbuil | dtantsur: after one hour it finished the dump of the image, however, I only saw 8080 traffic two minutes before it worked. I think IPA is stuck somewhere and after an hour, it tries to download the image but no idea where. Any thing that comes to your mind? | 16:56 |
arne_wiebalck | etingof: yes, that was my understanding | 16:56 |
etingof | arne_wiebalck, so how slow nodes can end up in a single worker then? | 16:57 |
openstackgerrit | Harald Jensås proposed openstack/ironic master: Initial processing of network port events https://review.openstack.org/633729 | 16:57 |
openstackgerrit | Harald Jensås proposed openstack/ironic master: WiP - Implement Event Handler in driver interfaces https://review.openstack.org/637840 | 16:57 |
openstackgerrit | Harald Jensås proposed openstack/ironic master: WIP - Cleaning network - events https://review.openstack.org/637841 | 16:57 |
arne_wiebalck | etingof: I don’t know | 16:57 |
etingof | arne_wiebalck, but the answer is still 42 | 16:57 |
arne_wiebalck | etingof: it’s just my theory … no proof | 16:57 |
arne_wiebalck | etingof: yes … more or less independent from the number of workers | 16:58 |
etingof | arne_wiebalck, at least 1/4 of nodes are slow enough to occupy all 4 workers for 42 secs? | 16:59 |
etingof | arne_wiebalck, or can it be locking...? | 16:59 |
*** andrein has quit IRC | 17:00 | |
etingof | (I am saying 4 workers as you say that's the breaking point) | 17:00 |
arne_wiebalck | etingof: I guess I should try 2 and 1 | 17:01 |
etingof | arne_wiebalck, well, I think we started with one... | 17:02 |
arne_wiebalck | etingof: 1 was the old code | 17:02 |
etingof | arne_wiebalck, right, but for 1 worker the new code behaves the same | 17:02 |
dtantsur | iurygregory: periodic tasks can be watched like http://zuul.openstack.org/builds?job_name=ironic-python-agent-buildimage-tinyipa | 17:02 |
dtantsur | iurygregory: I see a new version at https://tarballs.openstack.org/ironic-python-agent/tinyipa/files/ \o/ | 17:03 |
dtantsur | FYI folks watch out for new IPA-related failures, since the version we use in gate was just updated (the previous one was from mid-January) | 17:08 |
-openstackstatus- NOTICE: Gerrit is being restarted for a configuration change, it will be briefly offline. | 17:10 | |
*** jistr|sick is now known as jistr | 17:14 | |
*** e0ne has quit IRC | 17:18 | |
arne_wiebalck | etingof: 2 threads ~60 secs | 17:19 |
arne_wiebalck | : etingof: 2 threads ~120 secs | 17:19 |
arne_wiebalck | etingof: sorry, 1 threads ~120 secs | 17:20 |
arne_wiebalck | etingof: scales nicely at the beginning, then hits 42 secs | 17:21 |
etingof | arne_wiebalck, so current suspects are 1) a subset of slow nodes and 2) exclusively locked nodes | 17:22 |
arne_wiebalck | etingof: the ipmitool runtime output confirms we have some nodes which need ~5 secs to get their power state | 17:23 |
arne_wiebalck | etingof: I can try that by hand to confirm it’s real | 17:23 |
dtantsur | I can easily believe that. I've seen rare cases when it took more than a minute.. | 17:23 |
etingof | arne_wiebalck, will these nodes take up 42 secs combined across 3-4 threads? | 17:23
arne_wiebalck | etingof: well, the total of the ipmi calls is ~200 secs | 17:24 |
arne_wiebalck | etingof: so, the right combination will, yes | 17:24 |
etingof | arne_wiebalck, that does not explain why 32 workers do not beat 42 secs | 17:25 |
arne_wiebalck | etingof: agreed, this should mix up the distribution | 17:25 |
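Whether extra workers should help can be checked with a quick simulation of the shared queue described earlier (each worker picks the next job as soon as it frees up, so slow BMCs cannot all pile onto one worker by ordering alone). The runtimes below are hypothetical:

```python
import heapq

def makespan(runtimes, workers):
    # Each worker's finish time lives on a min-heap; every job goes to
    # whichever worker frees up first, mimicking a shared job queue.
    finish = [0.0] * workers
    heapq.heapify(finish)
    for t in runtimes:
        heapq.heappush(finish, heapq.heappop(finish) + t)
    return max(finish)

# hypothetical mix of slow and fast BMCs
runtimes = [5.1] * 4 + [1.2] * 10 + [0.03] * 150
for n in (4, 8, 32):
    print(n, round(makespan(runtimes, n), 2))
```

With enough workers the makespan plateaus at the slowest single call, so a stable 42 s across worker counts points at something other than queueing.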
dtantsur | arne_wiebalck: how much CPU is ironic-conductor using during power sync? | 17:26 |
dtantsur | maybe we're shelving there? | 17:26 |
arne_wiebalck | dtantsur: looks pretty busy indeed | 17:28 |
dtantsur | we may be just hitting the limitation of a single process.. | 17:28 |
arne_wiebalck | dtantsur: so with 1 thread we should see a significantly lower load … | 15:29
dtantsur | not sure, it's one OS thread anyway | 17:34 |
arne_wiebalck | hmm, so picking one “slow” node and running ipmitool against it to check the power status, it returns in <<1 sec | 17:34 |
dtantsur | with more green threads we spend less time just spinning and waiting, but the CPU load should be comparable? | 17:35 |
*** sdake has quit IRC | 17:35 | |
arne_wiebalck | running ipmitool multiple times, the runtime has basically two values: ~0secs and ~5secs, seems reproducible | 17:40 |
arne_wiebalck | there are two levels | 17:40 |
arne_wiebalck | the 5 secs comes from the ’-N 5’ the code uses | 17:41 |
etingof | arne_wiebalck, may be try enabling debugging in ipmitool? looks like a timeout and retry? | 17:41 |
etingof | or change -N to see if it reflects runtime | 17:42 |
arne_wiebalck | etingof: it does | 17:43 |
arne_wiebalck | etingof: with “-N 10”, it sometimes takes now 10secs | 17:43 |
arne_wiebalck | etingof: still comes back ok | 17:43 |
etingof | arne_wiebalck, how about hacking down min_command_interval in ironic? | 17:45 |
etingof | that's in ipmi config | 17:46 |
* arne_wiebalck checking … | 17:46 | |
arne_wiebalck | etingof: min_command_interval is now 1 sec, workers=8 … let’s see | 17:48 |
* etingof is thrilled | 17:48 | |
arne_wiebalck | lol | 17:48 |
arne_wiebalck | power sync is now 34 secs | 17:50 |
* arne_wiebalck is disappointed | 17:50 | |
etingof | at least arne_wiebalck did not kill the BMCs | 17:51 |
* etingof is trying to cheer arne_wiebalck up | 17:51 | |
arne_wiebalck | etingof: :) | 17:51 |
dtantsur | rloo: thanks for review! I'll think about a tempest test (we don't have API tests for configdrive yet) | 17:51 |
rpittau | good night o/ | 17:52 |
etingof | arne_wiebalck, did you try ipmitool -N 1 by hand? | 17:52 |
arne_wiebalck | etingof: yes | 17:52 |
rloo | dtantsur: ok. i am fine with follow up to that PR, just wanted your comments first. so let me know. | 17:52 |
*** rpittau is now known as rpittau|afk | 17:52 | |
arne_wiebalck | etingof: wait, no | 17:52 |
arne_wiebalck | etingof: I tried -N 10 | 17:52 |
* TheJulia_sick needs lots of coffeee | 17:52 | |
etingof | arne_wiebalck, try -N 1 | 17:52 |
etingof | ipmitool is a smart beast | 17:53 |
arne_wiebalck | etingof: gives a plateau at 1 sec | 17:53 |
arne_wiebalck | etingof: -N clearly influences the total runtime of ipmitool | 17:53 |
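The two plateaus observed (~0 s and ~N s) are consistent with a retry model in which a lost packet costs the full -N interval before ipmitool retries. A toy model of that assumption (not ipmitool's actual internals):

```python
# Toy model (assumption, not ipmitool source): a request either gets an
# immediate reply, or loses a packet and waits the full -N timeout
# before retrying -- giving two runtime plateaus, ~0 s and ~N s.
def expected_runtime(timeout_n, lost_packets):
    return lost_packets * timeout_n  # plus a negligible reply time

print(expected_runtime(5, 0), expected_runtime(5, 1), expected_runtime(10, 1))
```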
*** gyee has joined #openstack-ironic | 17:54 | |
arne_wiebalck | etingof: but not so much the total time to power sync | 17:54 |
etingof | arne_wiebalck, how about trying -N 1 with 1 and 2 workers again? | 17:54 |
arne_wiebalck | etingof: trying … | 17:55 |
etingof | the desire for coffee must be a sign of recovery, TheJulia_sick | 17:56 |
*** S4ren has quit IRC | 17:56 | |
TheJulia_sick | I am feeling decent enough that I did put on clothing that makes me look like I stepped out of the 1950s again | 17:57 |
TheJulia_sick | also... everything is in the laundry | 17:57 |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/python-ironicclient master: Move to zuulv3 https://review.openstack.org/633010 | 17:57 |
arne_wiebalck | etingof: min_command_interval=1 and workers=2 gives 53 secs | 17:58 |
arne_wiebalck | etingof: compared to 60 before | 17:58 |
* arne_wiebalck needs to stop for today | 17:58 | |
etingof | arne_wiebalck, does it mean that the bottleneck is somewhere else? | 17:58 |
arne_wiebalck | etingof: I think so | 17:59 |
arne_wiebalck | etingof: the main contributor is sth else | 17:59 |
etingof | lets sleep on it | 17:59 |
arne_wiebalck | etingof: +1 | 17:59 |
etingof | I will be at the university tomorrow in the morning, back by noon | 17:59 |
arne_wiebalck | etingof: ok, thx for your help today! | 18:00 |
etingof | thank you, arne_wiebalck! | 18:00 |
* etingof is still trying to imagine the fashion of fifties | 18:01 | |
openstackgerrit | Dmitry Tantsur proposed openstack/ironic master: Allow building configdrive from JSON in the API https://review.openstack.org/639050 | 18:08 |
dtantsur | rloo: updated, thanks ^^^ | 18:08 |
rloo | dtantsur: thx | 18:09 |
dtantsur | ugh, I think the huawei CI went insane: https://review.openstack.org/#/c/639050/ | 18:10 |
patchbot | patch 639050 - ironic - Allow building configdrive from JSON in the API - 9 patch sets | 18:10 |
TheJulia_sick | dtantsur: that looks like there are cogs and some smoke coming out of that CI | 18:12 |
dtantsur | I wonder if we can make it stop somehow without waiting for their morning.. | 18:13 |
TheJulia_sick | etingof: I have to take a new bio picture in the next day or two, I'll share when I do | 18:13 |
TheJulia_sick | Their morning is in like 8 hours... | 18:13
openstackgerrit | Dmitry Tantsur proposed openstack/ironic master: Allow building configdrive from JSON in the API https://review.openstack.org/639050 | 18:15 |
dtantsur | rloo: mostly dummy update to try stop the CI madness ^^ | 18:15 |
rloo | dtantsur: ok | 18:15 |
TheJulia_sick | 45 minutes until my next meeting \o/ | 18:15 |
dtantsur | oh my, it became worse now.. | 18:16 |
dtantsur | TheJulia_sick: I'm asking an intervention from infra | 18:16 |
TheJulia_sick | dtantsur: ++ | 18:16 |
* TheJulia_sick is going to have to lay down in a little bit | 18:19 | |
*** e0ne has joined #openstack-ironic | 18:19 | |
dtantsur | ++ | 18:19 |
* etingof would consider trading a meeting for a day-off | 18:20 | |
TheJulia_sick | Yeah, after the meeting I'm likely going to take a nap | 18:23 |
*** amoralej is now known as amoralej|off | 18:28 | |
*** Chaserjim has quit IRC | 18:30 | |
etingof | ++ | 18:31 |
*** baha has joined #openstack-ironic | 18:33 | |
openstackgerrit | Julia Kreger proposed openstack/ironic master: fast tracked deployment support https://review.openstack.org/635996 | 18:35 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Add fast-track testing https://review.openstack.org/641104 | 18:35 |
TheJulia_sick | dtantsur: ^^ split the scenario test apart so the code can be merged as the test still needs some work | 18:35 |
TheJulia_sick | and... it doesn't make sense to require the test to merge before the feature | 18:35 |
dtantsur | yeah :) | 18:36 |
*** andrein has joined #openstack-ironic | 18:56 | |
*** kanikagupta has joined #openstack-ironic | 19:15 | |
*** e0ne has quit IRC | 19:17 | |
*** dtantsur is now known as dtantsur|afk | 19:31 | |
dtantsur|afk | g'night | 19:31 |
TheJulia_sick | sleep sounds good | 19:31 |
*** sdake has joined #openstack-ironic | 19:33 | |
*** e0ne has joined #openstack-ironic | 19:37 | |
larsks | What is responsible for populating iscsi targets? After restarting the controller, a host configured for boot-from-volume is ACTIVE, but there are no iscsi targets defined. I was hoping that 'openstack server stop' followed by 'openstack server start' would re-create the targets, but no luck. | 19:53
*** sdake has quit IRC | 20:17 | |
*** sdake has joined #openstack-ironic | 20:18 | |
*** whoami-rajat has quit IRC | 20:22 | |
*** kanikagupta has quit IRC | 20:25 | |
TheJulia_sick | larsks: no iscsi targets defined where? | 20:35 |
TheJulia_sick | larsks: power off/power on should cause ironic to attempt to update what it has on file from cinder. | 20:36 |
*** andrein has quit IRC | 20:52 | |
*** andrein has joined #openstack-ironic | 20:53 | |
*** anupn has joined #openstack-ironic | 20:56 | |
*** pcaruana has quit IRC | 21:10 | |
*** anupn has quit IRC | 21:13 | |
*** e0ne has quit IRC | 21:13 | |
*** dsneddon has quit IRC | 21:18 | |
*** jcoufal has quit IRC | 21:25 | |
*** e0ne has joined #openstack-ironic | 21:26 | |
*** e0ne has quit IRC | 21:30 | |
*** mjturek has quit IRC | 21:43 | |
*** baha has quit IRC | 21:43 | |
larsks | TheJulia_sick: in answer to the first question: in the kernel. The controller is not offering any iscsi targets. Starting and stopping the server in Nova does not restore the targets. | 21:57 |
larsks | Destroying and re creating the server seems to work. | 21:57 |
*** anupn has joined #openstack-ironic | 22:01 | |
*** priteau has quit IRC | 22:01 | |
*** dougsz has quit IRC | 22:09 | |
*** MattMan_1 has quit IRC | 22:11 | |
*** MattMan_1 has joined #openstack-ironic | 22:11 | |
openstackgerrit | Lin Yang proposed openstack/sushy master: Fix wrong default JsonDataReader() argument https://review.openstack.org/641146 | 22:26 |
*** sdake has quit IRC | 22:28 | |
*** mmethot has quit IRC | 22:34 | |
*** sdake has joined #openstack-ironic | 22:37 | |
*** andrein has quit IRC | 22:40 | |
*** sdake has quit IRC | 22:45 | |
*** sdake has joined #openstack-ironic | 22:50 | |
*** dsneddon has joined #openstack-ironic | 22:51 | |
*** munimeha1 has quit IRC | 22:52 | |
*** dsneddon has quit IRC | 22:56 | |
*** anupn has quit IRC | 22:58 | |
*** sdake has quit IRC | 23:02 | |
*** dsneddon has joined #openstack-ironic | 23:10 | |
*** hwoarang has quit IRC | 23:14 | |
*** dsneddon has quit IRC | 23:15 | |
*** hwoarang has joined #openstack-ironic | 23:18 | |
*** dsneddon has joined #openstack-ironic | 23:25 | |
*** sdake has joined #openstack-ironic | 23:33 | |
*** derekh has quit IRC | 23:36 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!