*** tosky has quit IRC | 00:17 | |
*** k_mouza has joined #openstack-ironic | 00:21 | |
*** k_mouza has quit IRC | 00:25 | |
*** jamesdenton has quit IRC | 00:26 | |
*** jamesden_ has joined #openstack-ironic | 00:26 | |
mnaser | TheJulia, arne_wiebalck: I believe I figured out the culprit, we don't erase metadata on RAID member drives (nor do we wipe them), so it ends up ignoring those devices and then running shred on the RAID array | 00:55 |
mnaser | `smartctl -d ata /dev/md127 -g security` | 00:56 |
mnaser | Running cmd (subprocess): shred --force --zero --verbose --iterations 1 /dev/md127 execute /opt/ironic-python-agent/lib64/python3.6/site-packages/oslo_concurrency/processutils.py:384 | 00:56 |
mnaser | CMD "shred --force --zero --verbose --iterations 1 /dev/md127" returned: 0 in 1848.313s execute /opt/ironic-python-agent/lib64/python3.6/site-packages/oslo_concurrency/processutils.py:423 | 00:56 |
mnaser | 30 minutes to run shred against the RAID array | 00:57 |
mnaser | IMHO, the better approach is to just run delete_configuration, erase_devices_metadata, erase_devices, create_configuration on every cleaning instead? | 00:57 |
mnaser | does automated cleaning not do that by default? | 00:57 |
mnaser | aha! | 01:02 |
mnaser | Synchronous command get_clean_steps completed: {'clean_steps': {'GenericHardwareManager': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': | 01:02 |
mnaser | False, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'hardware_manager_version': {'generic_hardware_manager': '1.1'}} | 01:02 |
*** paras333 has joined #openstack-ironic | 01:06 | |
*** iurygregory has quit IRC | 01:07 | |
*** paras333 has quit IRC | 01:10 | |
mnaser | i guess there's no way of changing the prio of delete_configuration and create_configuration ? | 01:11 |
eandersson | Anyone seeing this error? We believe it started after rebasing stable/victoria | 01:13 |
eandersson | > Unexpected response from the agent for node <uuid>: the running command list does not include prepare_image or its result is malformed | 01:13 |
mnaser | eandersson: did you update your ipa too? | 01:20 |
eandersson | Yea - should be, but can double check. | 01:21 |
mnaser | eandersson: /var/log/ironic/deploy/*.tar.gz might help too? | 01:22 |
eandersson | ah yea good call | 01:22 |
mnaser | that'll also help identify the ipa version too | 01:22 |
eandersson | Could be related to IPA version. I see "deploy_results", but not "command_results" in the logs. | 01:26 |
eandersson | It didn't create a log for some reason under /var/log/ironic/deploy :'( | 01:28 |
*** iurygregory has joined #openstack-ironic | 01:30 | |
eandersson | mnaser does the image building pipeline work for you? :D | 01:35 |
eandersson | > https://bootstrap.pypa.io/3.5/3.5/get-pip.py | 01:35 |
eandersson | It's trying to get this for us and throwing a 404 | 01:35 |
eandersson | oh lol nvm | 01:35 |
mnaser | eandersson: which image building one exactly? | 01:35 |
mnaser | for ipa? | 01:35 |
eandersson | 3.5/3.5 not sure why this is part of it | 01:35 |
eandersson | Yea | 01:35 |
mnaser | i can run a build now | 01:35 |
eandersson | I think I figured it out haha we manually patched a bug in the ipa build process that has been fixed now | 01:36 |
mnaser | eandersson: you using ironic-python-agent-builder? | 01:36 |
mnaser | that's what i use :) | 01:36 |
eandersson | I think we started with that, but switched to just using the disk image builder at some point | 01:40 |
mnaser | eandersson: i mean it pretty much just runs DIB :) | 01:42 |
eandersson | Yea - was gonna say it's probably the same, since we use the elements from the ipa-builder | 01:48 |
*** zzzeek has quit IRC | 01:49 | |
*** zzzeek has joined #openstack-ironic | 01:51 | |
*** paras333 has joined #openstack-ironic | 01:57 | |
*** paras333 has quit IRC | 02:01 | |
openstackgerrit | Merged openstack/ironic-python-agent stable/ussuri: Pin version of ipa-builder when publishing image https://review.opendev.org/c/openstack/ironic-python-agent/+/778021 | 02:18 |
*** rcernin has quit IRC | 02:34 | |
*** rcernin has joined #openstack-ironic | 02:47 | |
*** rloo has quit IRC | 02:50 | |
*** mkrai has joined #openstack-ironic | 03:16 | |
*** jamesden_ is now known as jamesdenton | 03:29 | |
*** zzzeek has quit IRC | 04:32 | |
*** zzzeek has joined #openstack-ironic | 04:33 | |
*** rh-jlabarre has quit IRC | 04:35 | |
*** tzumainn has quit IRC | 05:11 | |
openstackgerrit | Jacob Anders proposed openstack/ironic master: Add support for using NVMe specific cleaning https://review.opendev.org/c/openstack/ironic/+/778134 | 05:27 |
*** anuradha1904 has joined #openstack-ironic | 05:29 | |
openstackgerrit | Jacob Anders proposed openstack/ironic-python-agent master: Remove nvme-cli warning and delay on nvme-format https://review.opendev.org/c/openstack/ironic-python-agent/+/778136 | 05:41 |
openstackgerrit | Yogesh proposed openstack/ironic master: Add idrac HW type IPMI interface support https://review.opendev.org/c/openstack/ironic/+/771862 | 05:53 |
*** lbragstad_ has joined #openstack-ironic | 06:03 | |
*** lbragstad has quit IRC | 06:06 | |
*** zzzeek has quit IRC | 06:10 | |
*** zzzeek has joined #openstack-ironic | 06:11 | |
*** k_mouza has joined #openstack-ironic | 06:21 | |
*** k_mouza has quit IRC | 06:25 | |
openstackgerrit | ankit proposed openstack/ironic master: Adds config parameter kernel_append_param for iLO https://review.opendev.org/c/openstack/ironic/+/755189 | 06:27 |
*** gyee has quit IRC | 06:47 | |
*** rcernin has quit IRC | 06:57 | |
*** paras333 has joined #openstack-ironic | 07:24 | |
*** jawad_axd has joined #openstack-ironic | 07:25 | |
*** moshiur has joined #openstack-ironic | 07:28 | |
arne_wiebalck | mnaser: this sequence is exactly what we do, in our downstream h/w manager so it is part of automated cleaning | 07:37 |
mnaser | arne_wiebalck: i'm working on a patch that allows overriding the prio for delete and create config | 07:38 |
arne_wiebalck | mnaser: I think we always assume cleaning is happening between deploys | 07:40 |
mnaser | arne_wiebalck: right, but cleaning by default has a prio of 0 for delete and create config | 07:41 |
mnaser | and what happens is cleaning ignores raid array drives | 07:41 |
arne_wiebalck | mnaser: RAID cleaning is deliberately not done automatically, so that it behaves the same as for h/w RAID | 07:41 |
mnaser | ah | 07:41 |
mnaser | i guess it makes sense to have a tunable so that it can be run as part of normal cleaning for software raid | 07:41 |
arne_wiebalck | mnaser: we have already discussed several times having this done automatically | 07:41 |
arne_wiebalck | mnaser: I think there is a patch which does what you suggest for deploy steps | 07:42 |
openstackgerrit | Mohammed Naser proposed openstack/ironic master: Allow users to configure priority for {create,delete}_configuration https://review.opendev.org/c/openstack/ironic/+/778145 | 07:43 |
arne_wiebalck | mnaser: I think RAID devices are skipped for erase as a) we assumed they were not there anymore as delete_configuration was run before and b) they would not be able to do fast erase | 07:43 |
mnaser | arne_wiebalck: the RAID devices which are part of the raid array are skipped.. the raid device itself (/dev/md127) ended up running through shred | 07:44 |
arne_wiebalck | mnaser: yeah ... not sure this makes sense as the underlying disks may also run shred | 07:45 |
arne_wiebalck | mnaser: or some sort of erase | 07:45 |
mnaser | yeah ideally what I'd like to do with my patch is run this order: delete_configuration, erase_devices_metadata, erase_devices, create_configuration -- that way, when it reaches erase_devices, there's no raid array, and it will run a quick secure erase | 07:46 |
arne_wiebalck | mnaser: yes, this is how I have the order in our h/w manager | 07:47 |
mnaser | so for me to avoid writing a hardware manager, i'd have those options that i can tweak :P | 07:47 |
arne_wiebalck | mnaser: yes, that makes sense | 07:47 |
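The ordering mnaser describes can be sketched as follows. This is a minimal Python illustration, not Ironic's actual code: it assumes the convention that automated cleaning runs steps in descending priority order and skips steps whose priority is 0. The default priorities are taken from the `get_clean_steps` output pasted earlier in the log; the function name `automated_clean_order` and the override values 120 and 5 are hypothetical.

```python
# Default priorities as reported by get_clean_steps earlier in the log.
DEFAULT_PRIORITIES = {
    "erase_devices": 10,
    "erase_devices_metadata": 99,
    "delete_configuration": 0,
    "create_configuration": 0,
}


def automated_clean_order(priorities, overrides=None):
    """Return step names in the order automated cleaning would run them.

    Assumption: higher priority runs first and priority 0 disables a step.
    """
    effective = dict(priorities)
    effective.update(overrides or {})
    enabled = [(prio, name) for name, prio in effective.items() if prio > 0]
    # Sort by descending priority; ties are broken by name for determinism.
    enabled.sort(key=lambda item: (-item[0], item[1]))
    return [name for _prio, name in enabled]


# With the defaults, the RAID steps are disabled and never run.
print(automated_clean_order(DEFAULT_PRIORITIES))

# With hypothetical overrides, the array is torn down before the erase
# steps and rebuilt afterwards.
print(automated_clean_order(
    DEFAULT_PRIORITIES,
    {"delete_configuration": 120, "create_configuration": 5},
))
```

With the overrides, `erase_devices` only runs after the software RAID array has been deleted, which is the point of the proposed tunables: the member disks are erased directly instead of `shred` running against `/dev/md127`.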
*** Qianbiao has joined #openstack-ironic | 07:50 | |
openstackgerrit | vinay50muddu proposed openstack/ironic master: Add clean/deploy steps to manage certificates https://review.opendev.org/c/openstack/ironic/+/763791 | 07:50 |
openstackgerrit | Arun S A G proposed openstack/ironic master: Add agent_state and agent_status params to heartbeat https://review.opendev.org/c/openstack/ironic/+/778058 | 08:01 |
zer0c00l | The anaconda deploy driver is ready for review - only thing missing is config drive related stuff. https://review.opendev.org/q/topic:%22anaconda-deploy-driver%22+(status:open%20OR%20status:merged) | 08:02 |
zer0c00l | i will be at the review jam tomorrow! | 08:02 |
openstackgerrit | Arne Wiebalck proposed openstack/ironic master: Lazy-load node details from the DB https://review.opendev.org/c/openstack/ironic/+/776930 | 08:12 |
*** mkrai has quit IRC | 08:18 | |
*** mkrai has joined #openstack-ironic | 08:18 | |
*** rpittau|afk is now known as rpittau | 08:22 | |
rpittau | good morning ironic! o/ | 08:22 |
janders | good morning rpittau o/ | 08:25 |
rpittau | hey janders :) | 08:25 |
rpittau | iurygregory: unfortunately https://review.opendev.org/c/openstack/ironic-python-agent/+/778021 doesn't work so I'm going to revert it | 08:26 |
rpittau | iurygregory: https://zuul.opendev.org/t/openstack/build/ab1e98708986424d9b31ab4db4abe6b7 | 08:26 |
*** tosky has joined #openstack-ironic | 08:35 | |
*** ociuhandu has joined #openstack-ironic | 08:44 | |
*** dougsz has joined #openstack-ironic | 08:50 | |
*** jamesdenton has quit IRC | 08:56 | |
*** jamesdenton has joined #openstack-ironic | 08:57 | |
*** lucasagomes has joined #openstack-ironic | 09:01 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 09:15 |
rpittau | ok maybe we don't need to revert with this ^ | 09:15 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: [WIP] Prepare to use tinycore 12 for tinyipa https://review.opendev.org/c/openstack/ironic/+/777342 | 09:18 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa https://review.opendev.org/c/openstack/ironic/+/777342 | 09:20 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa https://review.opendev.org/c/openstack/ironic/+/777342 | 09:29 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent-builder master: Use tinycore 12 to build tinyipa https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/776587 | 09:30 |
moshiur | Hi rpittau: I am able to build the IPA image with opensuse base image. | 09:40 |
*** ociuhandu has quit IRC | 09:50 | |
rpittau | moshiur: great :) | 09:59 |
rpittau | moshiur: patches welcome :) | 10:02 |
*** derekh has joined #openstack-ironic | 10:05 | |
*** paras333 has quit IRC | 10:06 | |
*** ociuhandu has joined #openstack-ironic | 10:06 | |
moshiur | Thanks rpittau: I will try to add two patches each in https://github.com/openstack/diskimage-builder and https://github.com/openstack/ironic-python-agent-builder. | 10:25 |
rpittau | moshiur: not sure how familiar you are with gerrit, but you should use the opendev repositories, not github, the github ones are just mirrors | 10:26 |
moshiur | rpittau: oh, I am not familiar with gerrit, but will give it a try. | 10:31 |
rpittau | moshiur: you can find a lot of info on the internet, you can start from https://www.gerritcodereview.com/ and https://wiki.openstack.org/wiki/How_To_Contribute | 10:33 |
openstackgerrit | Derek Higgins proposed openstack/ironic-python-agent master: Increase the memory limit for qemu-img https://review.opendev.org/c/openstack/ironic-python-agent/+/778035 | 10:33 |
*** mkrai has quit IRC | 10:46 | |
*** mkrai_ has joined #openstack-ironic | 10:46 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 10:53 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 10:54 |
*** k_mouza has joined #openstack-ironic | 10:56 | |
janders | TheJulia when you're online and have the time please let me know if this doco addition for NVMe cleaning would be sufficient to address the doco gap you pointed out: https://review.opendev.org/c/openstack/ironic/+/778134 thanks! :) | 11:01 |
janders | see you tomorrow Ironic o/ | 11:01 |
rpittau | bye janders :) | 11:01 |
janders | see you rpittau | 11:06 |
iurygregory | good morning ironic | 11:12 |
iurygregory | rpittau, hey shouldn't we run on ubuntu-bionic? | 11:12 |
rpittau | iurygregory: yeah, probably better, I'll add it to the patch | 11:13 |
iurygregory | rpittau++ =) | 11:13 |
iurygregory | and using UPPER_CONSTRAINTS_FILE makes a lot of sense! | 11:14 |
iurygregory | going to grab coffee brb | 11:14 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 11:17 |
openstackgerrit | Derek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv https://review.opendev.org/c/openstack/ironic-python-agent/+/778173 | 11:35 |
*** uzumaki has joined #openstack-ironic | 11:38 | |
openstackgerrit | Merged openstack/ironic master: secure-rbac - minor follow-up for project scoped tests https://review.opendev.org/c/openstack/ironic/+/778033 | 11:38 |
openstackgerrit | Merged openstack/ironic-python-agent master: Added comment about IPA logs being uploaded to Ironic https://review.opendev.org/c/openstack/ironic-python-agent/+/778031 | 11:38 |
openstackgerrit | Derek Higgins proposed openstack/ironic-python-agent master: Increase the memory limit for qemu-img https://review.opendev.org/c/openstack/ironic-python-agent/+/778035 | 11:39 |
*** ociuhandu has quit IRC | 11:48 | |
*** hoonetorg has quit IRC | 11:49 | |
*** ociuhandu has joined #openstack-ironic | 11:50 | |
*** ociuhandu has quit IRC | 11:50 | |
*** ociuhandu has joined #openstack-ironic | 11:52 | |
*** ociuhandu has quit IRC | 11:57 | |
*** hoonetorg has joined #openstack-ironic | 12:03 | |
anuradha1904 | Hi everyone, My name is Anuradha and I was an Outreachy intern for December 2020 round for OpenStack. My mentors were iurygregory, and TheJulia, Today is my last day of internship and I want to thank each one of you for this amazing community, I had the most amazing times learning and growing here. I will continue with my contributions and give back as much as possible. I want to thank my amazing mentor | 12:06 |
anuradha1904 | Iurygregory for being the best mentor I could ever ask for. He helped me with the smallest of doubts by helping me with examples, pseudo-codes, and explanations without complaining and with extraordinary patience. It never felt like his first experience as a mentor. Thank you my amazing mentor TheJulia for constantly motivating me and helping me solve my doubts by guiding me with steps such that I self | 12:06 |
anuradha1904 | learn and correct myself, Thank you tosin: for being a great friend, I am ready to grow and learn some more with you and Finally, all the amazing members of the community who reviewed my code, you all were a part of an amazing experience for a beginner who will try to learn and grow. :) | 12:06 |
iurygregory | anuradha1904, thank you for your hard work! you did a great job =) congratulations! | 12:08 |
anuradha1904 | iurygregory, Thank you so much, could not have been possible at all without your help :) | 12:09 |
*** ociuhandu has joined #openstack-ironic | 12:12 | |
*** ociuhandu has quit IRC | 12:17 | |
*** ociuhandu has joined #openstack-ironic | 12:18 | |
*** zzzeek has quit IRC | 12:20 | |
*** ociuhandu has quit IRC | 12:22 | |
*** zzzeek has joined #openstack-ironic | 12:23 | |
*** paras333_ has joined #openstack-ironic | 12:35 | |
*** ociuhandu has joined #openstack-ironic | 12:35 | |
*** mkrai_ has quit IRC | 12:38 | |
*** ociuhandu has quit IRC | 12:44 | |
*** rh-jlabarre has joined #openstack-ironic | 13:04 | |
*** ociuhandu has joined #openstack-ironic | 13:27 | |
*** uzumaki has quit IRC | 13:40 | |
*** lbragstad_ is now known as lbragstad | 13:41 | |
TheJulia | good morning | 13:49 |
rpittau | good morning TheJulia :) | 13:49 |
iurygregory | good morning TheJulia =) | 13:51 |
*** rloo has joined #openstack-ironic | 13:59 | |
*** rloo has quit IRC | 13:59 | |
*** rloo has joined #openstack-ironic | 13:59 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 14:02 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 14:07 |
rpittau | TheJulia: I think we're good to go for the releases, I'm going to review the release notes one more time | 14:09 |
rpittau | oh they were actually already done :D | 14:09 |
rpittau | except metalsmith | 14:10 |
rpittau | I'll request that | 14:10 |
TheJulia | rpittau: thanks | 14:12 |
*** lmcgann has joined #openstack-ironic | 14:14 | |
*** rloo has quit IRC | 14:17 | |
*** rloo has joined #openstack-ironic | 14:18 | |
*** fdegir has joined #openstack-ironic | 14:19 | |
openstackgerrit | Riccardo Pittau proposed openstack/metalsmith master: Fix release versions https://review.opendev.org/c/openstack/metalsmith/+/778194 | 14:20 |
rpittau | maybe we can just quickly merge this before ? ^ | 14:20 |
TheJulia | Approved | 14:24 |
rpittau | thanks | 14:25 |
*** jawad_axd has quit IRC | 14:31 | |
*** jawad_axd has joined #openstack-ironic | 14:32 | |
openstackgerrit | Merged openstack/ironic-python-agent master: Remove nvme-cli warning and delay on nvme-format https://review.opendev.org/c/openstack/ironic-python-agent/+/778136 | 14:36 |
openstackgerrit | Merged openstack/metalsmith master: Fix release versions https://review.opendev.org/c/openstack/metalsmith/+/778194 | 14:39 |
*** tzumainn has joined #openstack-ironic | 14:42 | |
TheJulia | So... March 367th is today? | 14:42 |
* iurygregory - no reference found =( | 14:45 | |
TheJulia | mnaser: So, I feel like maybe the enumeration of priority should look for software raid and reset the available steps as such | 14:45 |
tzumainn | dtantsur, the change required to allow instance_info to override *_interface values turned out to be suspiciously simple | 14:54 |
*** uzumaki has joined #openstack-ironic | 14:58 | |
iurygregory | tzumainn, Dmitry is on PTO this week =) | 15:01 |
iurygregory | but he will be happy to hear this when he comes back :D | 15:01 |
iurygregory | s/hear/read | 15:02 |
tzumainn | haha, okay! | 15:02 |
iurygregory | if you have the change up feel free to add ironic-week-prio in the hashtag field =) | 15:03 |
tzumainn | iurygregory, done, thanks for the heads up! | 15:04 |
iurygregory | ty! | 15:05 |
TheJulia | can we hold off on permission changes until after the new project scoped rbac work merges?? | 15:05 |
iurygregory | I really don't want to look at the possible merge conflicts in https://review.opendev.org/c/openstack/ironic/+/776540 :D | 15:08 |
iurygregory | TheJulia, I think it makes sense | 15:08 |
TheJulia | I ask mainly because I don't want to inadvertently squash something brand new and I'd prefer to limit the delta of changes. Plus a lot of the old style of permissions rules need to be ripped out in the grand scheme of the universe | 15:10 |
*** mkrai has joined #openstack-ironic | 15:11 | |
iurygregory | I'm ok with this approach | 15:12 |
iurygregory | I know it's also a pain to solve merge conflicts etc | 15:13 |
*** jawad_axd has quit IRC | 15:15 | |
*** jawad_axd has joined #openstack-ironic | 15:16 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent master: Remove default parameter from execute https://review.opendev.org/c/openstack/ironic-python-agent/+/778201 | 15:20 |
*** jawad_axd has quit IRC | 15:24 | |
*** Qianbiao has quit IRC | 15:33 | |
*** openstackgerrit has quit IRC | 15:35 | |
mnaser | TheJulia: that is far more resilient. I couldn’t find where the steps get enumerated or generated though :( | 15:53 |
*** mkrai has quit IRC | 15:57 | |
*** moshiur has quit IRC | 15:59 | |
*** openstackgerrit has joined #openstack-ironic | 16:03 | |
openstackgerrit | Merged openstack/ironic-python-agent master: Increase the memory limit for qemu-img https://review.opendev.org/c/openstack/ironic-python-agent/+/778035 | 16:03 |
TheJulia | mnaser: I'll try to take a look between meetings today | 16:05 |
TheJulia | but today is a meeting day | 16:05 |
*** uzumaki has quit IRC | 16:08 | |
*** uzumaki has joined #openstack-ironic | 16:21 | |
openstackgerrit | Riccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder https://review.opendev.org/c/openstack/ironic-python-agent/+/778153 | 16:29 |
rpittau | sometimes my typos really amaze me | 16:29 |
openstackgerrit | Riccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa https://review.opendev.org/c/openstack/ironic/+/777342 | 16:42 |
TheJulia | brain processing is always a fun topic | 16:46 |
openstackgerrit | Derek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv https://review.opendev.org/c/openstack/ironic-python-agent/+/778173 | 16:49 |
*** anuradha1904 has quit IRC | 16:58 | |
*** sshnaidm is now known as sshnaidm|afk | 17:00 | |
openstackgerrit | Jay Faulkner proposed openstack/ironic-specs master: No Conductor to IPA Communication spec https://review.opendev.org/c/openstack/ironic-specs/+/777172 | 17:01 |
*** lucasagomes has quit IRC | 17:06 | |
*** ociuhandu_ has joined #openstack-ironic | 17:09 | |
openstackgerrit | Derek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv https://review.opendev.org/c/openstack/ironic-python-agent/+/778173 | 17:10 |
*** ociuhandu has quit IRC | 17:12 | |
*** ociuhandu_ has quit IRC | 17:13 | |
TheJulia | So my last meeting ran over and I need to run an errand. With regards to the review jam I may be late or we can skip it today | 17:17 |
TheJulia | up for either option, as anyone can start/run it | 17:17 |
eandersson | There is a check that warns if the ipa is too old. Does that check work with custom built versions of the ipa that do not necessarily have a tag (e.g. X.Y.Z.dev5)? | 17:24 |
TheJulia | I don't remember | 17:29 |
TheJulia | I take it you're outside of the supported matrix? | 17:29 |
* TheJulia takes car to mechanic | 17:29 | |
TheJulia | definitely going to be very late for review jam now | 17:30 |
TheJulia | :( | 17:30 |
eandersson | Trying to figure out a weird bug that started hitting us and noticed that we get a warning about not running a Victoria or newer IPA, but we are on the latest stable/victoria. | 17:31 |
*** dougsz has quit IRC | 17:32 | |
TheJulia | hmm | 17:33 |
TheJulia | could be a bug or the settings playing out to the message being logged | 17:33 |
eandersson | This is the version we are running | 17:34 |
eandersson | > ironic-python-agent==6.4.4.dev24 | 17:34 |
eandersson | Guessing it is just a red herring | 17:35 |
JayF | TheJulia: I think zer0c00l was planning on showing up to talk about anaconda, as mentioned yesterday in the upstream meeting | 17:36 |
JayF | I will attend and hope others do in order to get his stuff moving | 17:36 |
rpittau | I can't attend :/ | 17:37 |
rpittau | good night! o/ | 17:42 |
*** rpittau is now known as rpittau|afk | 17:42 | |
arne_wiebalck | bye everyone o/ | 17:43 |
*** bnemec has quit IRC | 17:49 | |
eandersson | > Agent is busy: executing command execute_deploy_step | 17:54 |
eandersson | Whatever issue we are having, it is causing this on all steps.. which also prevents those sweet logs from getting collected ;'( | 17:55 |
eandersson | > Agent command standby.get_partition_uuids for node <uuid> failed. Expected 2xx HTTP status code, got 409. | 17:56 |
*** derekh has quit IRC | 18:01 | |
JayF | Review jam: https://meetpad.opendev.org/ironic | 18:03 |
JayF | you all are the peanut butter, join us :D | 18:04 |
eandersson | Are we really handling 409's correctly here? They seem to map to "AgentIsBusy" | 18:07 |
eandersson | Shouldn't Ironic just retry if the agent is still "busy"? | 18:07 |
JayF | No, the bug is that it's trying to run two commands at the same time. | 18:08 |
JayF | I don't know why/how, you could call GET /v1/commands on the agent if you can to see what is still running | 18:08 |
JayF | but generally speaking, there should never be a case when a conductor is trying to issue a command to a running agent that already has a command in progress | 18:08 |
JayF | I don't know anything about why it's happening in your case; but it usually indicates some metadata was never written to a node, or perhaps (guessing here) something going upside-down with the agent fast track support | 18:09 |
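JayF's suggestion to check what the agent is still running can be sketched like this. It is a minimal Python sketch under stated assumptions: the agent API listens on port 9999, `GET /v1/commands` returns a JSON object with a `commands` list, and each entry carries `command_name` and `command_status` fields, with `RUNNING` marking a command in progress. The address `192.0.2.10` and both helper names are hypothetical.

```python
import json
import urllib.request


def running_commands(payload):
    """Return the names of commands the agent reports as still running."""
    return [
        cmd.get("command_name")
        for cmd in payload.get("commands", [])
        if cmd.get("command_status") == "RUNNING"
    ]


def fetch_commands(agent_url, timeout=10):
    """Fetch the command list from the agent via GET /v1/commands."""
    url = f"{agent_url}/v1/commands"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Hand-written payload in the assumed response shape, so the filter can be
# exercised without a live agent.
sample = {
    "commands": [
        {"command_name": "get_deploy_steps", "command_status": "SUCCEEDED"},
        {"command_name": "execute_deploy_step", "command_status": "RUNNING"},
    ]
}
print(running_commands(sample))

# Against a live agent (hypothetical address):
# print(running_commands(fetch_commands("http://192.0.2.10:9999")))
```

If the list is non-empty when the conductor sends another command, the 409 is expected, and the question becomes why the conductor issued a second command at all.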
eandersson | Thanks helps a lot. | 18:13 |
eandersson | My current best guess is that this is a new bug with deploy and reboot_requested. | 18:16 |
*** bnemec has joined #openstack-ironic | 18:16 | |
JayF | I'm not generally familiar with deploy steps, but if I recall I absolutely saw that (years ago) with clean steps + reboot requested | 18:17 |
JayF | so you may be on the right track | 18:17 |
*** k_mouza has quit IRC | 18:36 | |
*** k_mouza_ has joined #openstack-ironic | 18:36 | |
eandersson | Interesting. It looks like it fails when it does not reboot as requested. | 18:36 |
*** paras333_ has quit IRC | 18:39 | |
TheJulia | eandersson: uhh, we should be. This seems really familiar | 18:53 |
TheJulia | well heartbeats still occur which can trigger the next step | 18:53 |
eandersson | The only thing we have found so far is that it fails with the above message when the reboot for some unknown reason isn't triggered. | 18:59 |
eandersson | And our script is dead simple. It's literally a bash script with exit 0. | 19:00 |
eandersson | That gets executed by the IPA | 19:00 |
TheJulia | would you be up for talking through it? | 19:01 |
* TheJulia thinks more coffee is needed | 19:01 | |
lbragstad | TheJulia o/ it looks like y'all are making some good progress on the secure rbac patches - i only see a few left? | 19:01 |
JayF | TheJulia: at least for cleaning, heartbeat won't trigger the next step unless the previous one has completed | 19:01 |
TheJulia | lbragstad: yup, we need to do cleanup and likely look at db stuffs later, but yeah | 19:02 |
TheJulia | the problem I think is it is getting the 409 on getting the command status | 19:02 |
TheJulia | but I'm trying to grok eandersson's exact case because I thought we fixed the bug | 19:02 |
JayF | Agent command standby.get_partition_uuids for node <uuid> failed. Expected 2xx HTTP status code, | 19:02 |
JayF | | got 409. | 19:02 |
JayF | Sorry, didn't mean to paste that before cleaning it up, but you see ^^ it's not calling for command status | 19:03 |
JayF | and calling /v1/commands, while a command is running, succeeds (at least in any agent I've tried it on, up to ussuri) | 19:03 |
eandersson | http://paste.openstack.org/show/OzMCHhFjjCmN0mD3EcV4/ | 19:04 |
*** k_mouza_ has quit IRC | 19:04 | |
*** k_mouza has joined #openstack-ironic | 19:05 | |
eandersson | This is the log from two runs, one successful and one failure. | 19:05 |
eandersson | You can see that it properly reboots the node when it is successful, but for some reason does not the second time we deploy it. | 19:05 |
TheJulia | This is sounding super deja-vu'ey | 19:05 |
lbragstad | TheJulia ok - https://review.opendev.org/q/topic:secure-rbac+project:openstack/ironic+status:open is still an accurate list of what needs to land for system-admin, system-reader, project-member, project-reader? | 19:06 |
*** tzumainn has quit IRC | 19:07 | |
TheJulia | lbragstad: it is, system-[admin, member, reader] are done, it is all project stuffs right now and we've got one more to go which is still in development | 19:07 |
TheJulia | lbragstad: keep in mind, we don't have to have everything merged by m3 | 19:07 |
lbragstad | oh - sweet | 19:07 |
TheJulia | at least, in ironic we don't have to | 19:08 |
lbragstad | yeah - and ironic is pretty much an admin-only API | 19:08 |
lbragstad | right? | 19:08 |
TheJulia | well, becoming less and less admin only, especially with this work | 19:08 |
TheJulia | plus ironic operates by different release rules | 19:08 |
lbragstad | ok | 19:08 |
*** tzumainn has joined #openstack-ironic | 19:09 | |
TheJulia | blarg | 19:16 |
TheJulia | eandersson: I see what is going on :( | 19:19 |
eandersson | Something easy to fix? :D | 19:22 |
TheJulia | maybe | 19:23 |
TheJulia | are you getting a "Conductor attempted to process deploy step" error? | 19:24 |
TheJulia | https://github.com/openstack/ironic/blob/6e0682377ce433e1f9e6acf863e2bf73728a75ae/ironic/conductor/deployments.py#L273 | 19:24 |
eandersson | I don't but this is Victoria | 19:25 |
eandersson | and I don't think that exists in Victoria | 19:25 |
eandersson | Maybe > Expected 2xx HTTP status code, got 409. is the same message in Victoria | 19:26 |
TheJulia | https://github.com/openstack/ironic/blob/6e0682377ce433e1f9e6acf863e2bf73728a75ae/ironic/drivers/modules/agent_base.py#L380 | 19:27 |
TheJulia | are you getting that error that could be raised? | 19:28 |
eandersson | I don't see it in the logs at least | 19:28 |
*** k_mouza has quit IRC | 19:29 | |
eandersson | This causes logs to not be shipped to the ironic server as well | 19:29 |
eandersson | and having a difficult time catching it while it is happening | 19:29 |
TheJulia | I think our hard failure of things is more all on the ironic side of the universe | 19:29 |
TheJulia | we should handle the 409 | 19:29 |
JayF | Is there ever a valid case where we should get a 409 from the agent? | 19:30 |
TheJulia | yes, when the agent is still working but is heartbeating | 19:30 |
JayF | Can you lay out that case explicitly? I can 100% hit /v1/commands while a command is in progress on the agent. | 19:31 |
TheJulia | somewhere we're failing things fairly hardcore and I think I know where | 19:31 |
JayF | Which is the only agent endpoint that should be hit while a command is running | 19:31 |
TheJulia | well, you can hit it, but you can't ask it to execute the command status command | 19:31 |
TheJulia | since that is a separate command | 19:31 |
TheJulia | and only one command can run at a time | 19:31 |
JayF | Uh. Let me look at the code | 19:31 |
TheJulia | ++ | 19:31 |
JayF | IIRC we hit /v1/commands (the list endpoint) and take the first value | 19:31 |
eandersson | The error message was confusing to me. "Is busy" to me just sounds like "hey, I am still working on this". | 19:32 |
TheJulia | eandersson: it likely is | 19:32 |
JayF | https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/agent_client.py#L253 I see no evidence we ever hit /v1/commands/{command_uuid} | 19:33 |
JayF | and /v1/commands works when commands are in progress | 19:33 |
JayF | I'm fairly certain this has to be running multiple actual commands, even if the commands are merely informational (like get_clean_steps, for example) | 19:33 |
JayF | I'm out this afternoon; but I would love to know where this all leads -- I'll read scrollback but if there are any patches/stories filed, please feel free to ping them to me directly | 19:34 |
TheJulia | erbarr: just to confirm, the failure is https://github.com/openstack/ironic/blob/8604f84fd7bda4e30d3f07005c4901f3662303a7/ironic/common/exception.py#L628 | 19:35 |
openstackgerrit | Julia Kreger proposed openstack/ironic stable/victoria: Handle agent still doing the prior command https://review.opendev.org/c/openstack/ironic/+/778237 | 19:37 |
* TheJulia whistles | 19:37 | |
TheJulia | that is why it is deja vu | 19:37 |
JayF | [-] Tried to execute standby.get_partition_uuids, agent is still executing Command name: | 19:37 |
JayF | execute_deploy_step, params: {'step': {'interface': 'deploy', 'step': 'write_image', | 19:37 |
JayF | that is 100% two commands at the same time | 19:37 |
JayF | not just getting a command status | 19:37 |
JayF | (from the original patch to master) | 19:37 |
TheJulia | yup | 19:39 |
TheJulia | eandersson: if you can try out 778237 and see if that clears up your issue, that would be good | 19:40 |
JayF | thanks for that, I'll review that older master patch and if the victoria one isn't landed after that, I'll vote on it | 19:41 |
eandersson | We will try it out today | 19:42 |
TheJulia | okay | 19:42 |
eandersson | Building the container now :D | 19:46 |
TheJulia | Wow, the mobile command center's AC finally turned on | 19:50 |
TheJulia | (it is hooked up outside the home office window) | 19:51 |
eandersson | Rebuilding 4 nodes now. Fingers crossed. | 20:06 |
*** juanoterocas has joined #openstack-ironic | 20:10 | |
eandersson | Still failing unfortunately | 20:12 |
TheJulia | *sigh* | 20:12 |
eandersson | but it looks different now | 20:13 |
stevebaker | morning | 20:13 |
*** mcarden has joined #openstack-ironic | 20:14 | |
TheJulia | yay 6 rbac tests failing | 20:15 |
TheJulia | eandersson: oh?!? | 20:15 |
TheJulia | as much detail as possible would be greatly appreciated | 20:15 |
eandersson | It failed and then just stopped this time. Before that patch it kept going. | 20:15 |
TheJulia | *sigh* | 20:15 |
eandersson | I think it's just completely stuck now. | 20:22 |
eandersson | It just failed and gave up | 20:23 |
eandersson | Yea - the state engine got messed up and just booted back into the original OS | 20:24 |
eandersson | (since we don't do the clean step at the moment the old OS was still there) | 20:25 |
eandersson | I'll try to get you some logs | 20:26 |
TheJulia | much appreciated | 20:26 |
TheJulia | I've got calls the next 1.5 hours | 20:26 |
TheJulia | fwiw | 20:26 |
TheJulia | in a moment of something surprising, we actually don't call the conductor on patching an allocation | 20:26 |
iurygregory | <surprise face> O.o | 20:28 |
eandersson | http://paste.openstack.org/show/hHCvcChRvxTL1xkaZo8W/ | 20:30 |
eandersson | I noticed this showing up twice. Not sure if it has any significance. | 20:31 |
eandersson | > Agent on node <node-uuid> returned deploy command success, moving to next step | 20:31 |
eandersson | It almost feels like it is meant to wait for the reboot, but does not wait long enough and just moves on to the next state. | 20:58 |
openstackgerrit | Verification of a change to openstack/ironic failed: Project Scoping Node endpoint https://review.opendev.org/c/openstack/ironic/+/773924 | 20:59 |
eandersson | 90% sure that is what is happening here. It goes to DEPLOYWAIT and then the agent instantly moves to deploying due to some race condition. | 21:04 |
eandersson | I almost feel like we are missing something in the state machine to protect it here. | 21:21 |
*** hoonetorg has quit IRC | 21:21 | |
openstackgerrit | Jacob Anders proposed openstack/ironic master: Add support for using NVMe specific cleaning https://review.opendev.org/c/openstack/ironic/+/778134 | 21:25 |
janders | ^ NVMe cleaning doco fixes | 21:25 |
janders | good morning Ironic o/ | 21:25 |
*** k_mouza has joined #openstack-ironic | 21:29 | |
iurygregory | good morning janders o/ | 21:30 |
eandersson | Maybe we need to disable heartbeats before it goes into the waiting state. | 21:31 |
janders | good morning iurygregory o/ | 21:33 |
*** k_mouza has quit IRC | 21:34 | |
*** hoonetorg has joined #openstack-ironic | 21:42 | |
*** gyee has joined #openstack-ironic | 21:43 | |
TheJulia | brraaains | 22:03 |
TheJulia | eandersson: Uhh... hmm | 22:05 |
TheJulia | eandersson: I think I understand what is going on and I think it is a variation | 22:06 |
TheJulia | for my context, update_firmware is running before the deployment as it is priority 70 correct? | 22:06 |
TheJulia | and that is *still* in progress | 22:06 |
eandersson | Yea | 22:12 |
eandersson | If you look at my logs you can see that there are two different reqs and they happen at almost the same moment (unfortunately I removed timestamp) | 22:12 |
*** frigo has joined #openstack-ironic | 22:13 | |
eandersson | One puts it into DEPLOY-WAITING and the other almost instantly moves it back into ACTIVE | 22:13 |
eandersson | So it isn't in progress (at least not the firmware call), but the transition from ACTIVE -> DEPLOY-WAITING -> REBOOT -> ACTIVE is still in progress | 22:14 |
eandersson | but because it moves DEPLOY-WAITING to ACTIVE it never has time to trigger the REBOOT | 22:14 |
*** lmcgann has quit IRC | 22:15 | |
TheJulia | err | 22:17 |
TheJulia | That seems like a distinctly different issue from what I'm thinking | 22:17 |
TheJulia | but I think we've got two issues playing together in not fun ways | 22:18 |
eandersson | I could be wrong as I base this on reading the logs | 22:18 |
TheJulia | I can toss up a patch a little later for what I think I see in the code, but I need to get the current context out of my head first | 22:18 |
eandersson | Gonna dig into it today as well. We didn't see this in the beginning so not sure what changed. We are thinking maybe rebasing the victoria branch a few weeks ago caused it, but that is just the only known change. | 22:21 |
* TheJulia heats up lunch | 22:21 | |
TheJulia | what was the beginning? | 22:21 |
TheJulia | meaning ipa/ironic versions | 22:21 |
eandersson | Pretty sure the beginning was like Victoria rc1 or 2 of Ironic | 22:22 |
eandersson | Before Victoria was actually released | 22:23 |
TheJulia | hm, we don't do RC's | 22:23 |
TheJulia | ahh | 22:23 |
TheJulia | so maybe a point release before | 22:23 |
TheJulia | hmmm | 22:23 |
eandersson | We add our own features to Ironic so we just build based on the stable branches | 22:26 |
eandersson | The version I deployed now does not have any custom features though | 22:27 |
TheJulia | hmm, could there have been overlapping changes maybe? | 22:29 |
TheJulia | I'm thinking there were some last minute things to victoria, but the stable branch hasn't really changed much | 22:29 |
TheJulia | and stables are always based on our final cycle release unless "something else" has to happen | 22:30 |
eandersson | A possibility is that we just got really lucky as well. As this isn't happening to 100% of the rebuilds. | 22:30 |
TheJulia | hmm, could be | 22:31 |
TheJulia | okay, let me get my current thing out of my head and then I'll put the other patch up | 22:31 |
TheJulia | that I think is needed | 22:31 |
*** rcernin has joined #openstack-ironic | 22:33 | |
*** juanoterocas has quit IRC | 22:50 | |
*** frigo has quit IRC | 23:00 | |
*** pmannidi has joined #openstack-ironic | 23:07 | |
*** pmannidi_ has quit IRC | 23:08 | |
*** zzzeek has quit IRC | 23:13 | |
*** zzzeek has joined #openstack-ironic | 23:17 | |
*** k_mouza has joined #openstack-ironic | 23:30 | |
*** k_mouza has quit IRC | 23:34 | |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Project Scoping Node endpoint https://review.opendev.org/c/openstack/ironic/+/773924 | 23:45 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Port/Portgroup project scoped access https://review.opendev.org/c/openstack/ironic/+/775465 | 23:45 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Volume targets/connectors Project Scoped RBAC https://review.opendev.org/c/openstack/ironic/+/776314 | 23:45 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Project scope driver vendor pass-through https://review.opendev.org/c/openstack/ironic/+/776767 | 23:45 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: Follow-up on project scoped trait tests https://review.opendev.org/c/openstack/ironic/+/776768 | 23:45 |
openstackgerrit | Julia Kreger proposed openstack/ironic master: WIP: Allocation support for project scoped RBAC https://review.opendev.org/c/openstack/ironic/+/778340 | 23:45 |
TheJulia | stevebaker: please take a look at the allocation patch above, still a wip, but it is a drastically different endpoint so more eyes the better | 23:45 |
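
Editor's sketch of the agent-busy case discussed earlier in the log, where a second command (`standby.get_partition_uuids`) arrives while the agent is still executing `execute_deploy_step` and is refused with a 409. This is a toy model, not Ironic or ironic-python-agent code: `AgentBusy`, `FakeAgent`, and `on_heartbeat` are hypothetical names invented for illustration, and the "wait for the next heartbeat" behaviour is an assumption about what handling the 409 could look like, not a description of change 778237.

```python
class AgentBusy(Exception):
    """Hypothetical stand-in for a 409 Conflict reply from the agent."""


class FakeAgent:
    """Toy agent that accepts only one command at a time."""

    def __init__(self):
        self.running = None  # name of the command in progress, if any

    def execute(self, name):
        # Refuse a new command while another one is still running.
        if self.running is not None:
            raise AgentBusy(
                f"Tried to execute {name}, agent is still executing "
                f"{self.running}"
            )
        self.running = name

    def finish(self):
        # Mark the in-progress command as completed.
        self.running = None


def on_heartbeat(agent, next_command):
    """Try to dispatch next_command; treat 'busy' as 'try again later'.

    Returns "started" if the command was accepted, or "wait" if the
    agent was busy, so the caller can retry on a later heartbeat
    instead of failing the whole operation.
    """
    try:
        agent.execute(next_command)
    except AgentBusy:
        return "wait"
    return "started"
```

In this sketch a busy agent is a normal, recoverable condition: the caller leaves the state unchanged and retries on the next heartbeat once `finish()` has been called on the agent.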