14:06:24 #startmeeting PowerVM Driver Meeting
14:06:24 Meeting started Tue Jul 17 14:06:24 2018 UTC and is due to finish in 60 minutes. The chair is edmondsw. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:06:24 edmondsw: while testing the iscsi changes on devstack, there were some issue could
14:06:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:06:28 The meeting name has been set to 'powervm_driver_meeting'
14:06:47 #link https://etherpad.openstack.org/p/powervm_driver_meeting_agenda
14:07:04 ping efried gman-tx mdrabe mujahidali chhagarw
14:07:24 o/
14:07:26 getting started a bit late... I got too caught up in reviewing the device passthrough spec
14:07:27 ō/
14:07:33 heh, good news
14:07:35 I'm sure efried will mind that :)
14:07:48 #topic In-Tree Driver
14:08:04 #link https://etherpad.openstack.org/p/powervm-in-tree-todos
14:08:18 I don't know of anything to discuss here... anyone else?
14:09:03 I don't believe we've made any more progress on the TODOs there
14:09:12 everything on hold as we focus on other priorities
14:09:21 #topic Out-of-Tree Driver
14:09:35 #link https://etherpad.openstack.org/p/powervm-oot-todos
14:10:20 chhagarw I saw that you posted a new patch set for your iSCSI work... is that based on something you found with devstack?
14:10:33 I think you were starting to say that right as I started this meeting
14:10:57 in which case... good example of why we wanted devstack testing, and thank you for doing that
14:11:15 yeah, while testing with devstack there were a couple of issues found
14:11:33 i am re-verifying on pvc now, will keep u posted
14:11:49 as an aside on that... I am trying to spend spare cycles here and there improving our example devstack local.conf files in nova-powervm based on things I've been learning from chhagarw's environment and the CI
14:12:12 chhagarw I think the last patch set I saw still had pep8 errors, so make sure you iron those out
14:12:43 yeah i am updating
14:12:59 I had a conversation with mdrabe about the MSP work. I hope he's getting to that here soon
14:13:18 mdrabe any comments there?
14:13:56 edmondsw: I'll be syncing the conf options, but the migration object in pvc will remain the same
14:14:13 we can talk about pvc in other forums
14:15:08 I added a section in our TODO etherpad about docs
14:15:27 basically, readthedocs builds are failing since stephenfin's changes
14:16:23 efried I also figured out how to register in readthedocs to be notified when a docs build fails... thought you might be interested to do the same
14:16:50 edmondsw: I would rather get us into docs.o.o. Is that possible?
14:17:06 I believe so, and it's on the TODO list
14:17:25 in fact, I think that may be the only way to solve the current build issues, short of reverting some of what stephenfin did
14:17:30 which I'd rather not do
14:18:00 that will probably be the next thing I try
14:18:16 that == moving to docs.o.o
14:18:56 while we're talking about docs builds... I also noticed that one of our stable docs builds is broken, and all of our EOL tagged docs builds are broken
14:19:18 lower priority, but also need to be fixed
14:19:19 I hope we can also move them to docs.o.o but I'm not sure on that
14:19:30 edmondsw: I want the updated code to be reviewed once from the LPM change perspective
14:19:32 docs.o.o is latest only, I thought.
14:19:43 efried no, it has older stuff too
14:20:27 chhagarw I'll try to look later today
14:21:02 anything else to discuss OOT?
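For reference, the example local.conf files mentioned above use devstack's plugin mechanism; a minimal sketch of that kind of stanza follows (passwords and most settings omitted, and the exact options in the nova-powervm examples may differ):

    [[local|localrc]]
    # Enable the out-of-tree PowerVM compute driver via its devstack plugin;
    # the actual nova-powervm examples set additional services and credentials.
    enable_plugin nova-powervm https://git.openstack.org/openstack/nova-powervm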
14:21:44 #topic Device Passthrough
14:21:55 Eric has a couple of things up for review:
14:22:14 #link https://review.openstack.org/#/c/579359
14:22:26 #link https://review.openstack.org/#/c/579289
14:22:42 efried I've started commenting on both in parallel
14:22:54 efried what do you want to add here?
14:23:12 (i.e. I'm done talking, take it away)
14:23:50 Reshaper work is proceeding apace. Once that winds down, I'll probably be looking in nova (resource tracker and report client) to make sure nrp support is really there; as well as working through more of that series ^
14:24:23 mdrabe: We're counting on you to be our second core reviewer for this series, in case you didn't have enough to do.
14:25:10 mdrabe I know you have other things to focus on atm... probably let me get my comments up today first, and then look at it
14:25:42 sounds good
14:26:49 efried anything else?
14:26:53 no
14:26:57 mdrabe: if you can have a look at https://review.openstack.org/#/c/576034/ I want you to check that this change does not impact NPIV LPM
14:27:10 #topic PowerVM CI
14:27:34 #link https://etherpad.openstack.org/p/powervm_ci_todos
14:27:58 we've been having some CI stability issues that mujahidali is working on
14:28:03 chhagarw: will do
14:28:08 I've helped some there, as has esberglu
14:28:24 Here's what I think is going on with CI. The underlying systems are a mess. Filesystems are full, vios issues, etc.
14:28:30 yes
14:28:31 Everything else is just a symptom of that
14:28:36 agreed
14:28:49 question is how best to fix it
14:28:53 I looked into neo-21 and found that pvmctl was not working, so I restarted the neo, followed by pvm-core and pvm-res, and after that pvmctl worked for neo-21
14:29:21 I have cleared the other neo systems as suggested by esberglu but still no luck, so I decided to manually clear the ports. But it seems after cleaning them manually they are not coming back to an active state
14:29:44 The ports are just going to keep failing to delete until the underlying issues are resolved
14:30:03 Are the /boot/ directories still full on some of the systems?
14:31:10 management nodes and most of the neos have only 30% filled /boot/ directories
14:32:04 esberglu: do we need to redeploy the CI again after the cleanup and neo restart?
14:32:44 Have you been restarting neos? If you restart them you need to redeploy them
14:33:00 And yes if they are broken because of full filesystems they need to be redeployed
14:33:35 esberglu is it possible to redeploy a single neo, or do they all have to be redeployed as a group?
14:33:43 so, redeploying the cloud_nodes or only management_nodes will do the work?
14:34:31 You should deploy the compute_nodes and the management_nodes
14:34:40 okay
14:35:05 You can redeploy single nodes using the --limit command, I've given mujahidali instructions on that before, but let me know if you need help with that
14:35:25 At this point it's probably better to redeploy all of them though
14:35:26 sure
14:36:05 mujahidali have we fixed the VIOS issues?
14:36:27 and you said "most" of the neos have only 30% filled in /boot... what about the others?
14:36:40 for neo-26 and neo-30 ??
14:36:45 yes
14:36:57 Eric Fried proposed openstack/networking-powervm master: Match neutron's version of hacking, flake8 ignores https://review.openstack.org/582686
14:36:57 Eric Fried proposed openstack/networking-powervm master: Use tox 3.1.1 and basepython fix https://review.openstack.org/582404
14:36:59 I want to address as much as we can before you redeploy to increase our chances of that fixing things
14:37:32 I am not getting what exactly went wrong with neo-26 and 30
14:38:13 ok, let's try to look at that together after this meeting, before you redeploy
14:38:19 they (neo-26 and 30) have sufficient /boot/ space as well
14:38:43 edmondsw: sure
14:38:52 anything else to discuss here?
14:39:12 Yeah I have some stuff
14:39:27 mujahidali: Have you created all of the zuul merger nodes?
14:39:37 So that I can stop maintaining mine soon?
14:40:23 I want to try them with today's deployment for prod
14:41:16 so let me deploy the prod with the new zuul mergers and if all goes right then you can free yours
14:42:30 mujahidali: Please propose a patch with the changes
14:42:47 sure
14:43:11 mujahidali: edmondsw: What's the status on vSCSI CI for stable branches? I think last I heard ocata was still broken there. I gave some suggestions
14:43:21 Is it still broken with those?
14:43:43 Is it worth moving forward with vSCSI stable CI for pike and queens only and skipping ocata for now?
14:43:55 I thought we were going to split that commit into 1) pike and newer 2) ocata so that we could go ahead and merge #1
14:44:01 but I haven't seen that done yet
14:44:19 I am able to stack it now with changes esberglu suggested
14:44:24 yay!
14:44:38 but there are 3 tempest failures
14:45:03 mujahidali ping me the details after the meeting
14:45:09 and we can work through that
14:45:22 after we work through the other thing
14:45:57 okay
14:46:44 edmondsw: mujahidali: There were a bunch of additional neo systems that we had slated for the CI pool. Did those ever get set up?
14:46:57 no
14:47:31 because we've been focused on other things, or is there another reason?
14:47:54 we were hitting CI breakage very frequently, so I didn't get a chance to look at it.
14:48:47 I think that's understandable... keeping the CI running takes priority
14:49:02 Last thing on my list was multinode CI. Any questions for me there mujahidali? I'm guessing not much work has happened there either with the CI stability issues
14:49:57 I redeployed the staging CI using the changes suggested by esberglu for multinode
14:51:03 and?
14:52:01 the jenkins job failed. can I paste the log link here?
14:52:24 no, that's another thing we can talk about in slack
14:52:30 sure
14:52:52 I think that's it for CI?
14:53:14 All for me
14:53:38 #topic Open Discussion
14:54:02 I will be OOO next Monday.
14:54:07 I meant to bring this up when we were talking about OOT, but efried has fixed our specs so they build now. Thanks efried
14:54:17 mujahidali got it, tx
14:55:18 #endmeeting
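For reference on the single-node redeploy discussed under PowerVM CI, a minimal sketch assuming the CI nodes are deployed with ansible-playbook; the inventory and playbook names below are placeholders, not the actual CI file names:

    # Redeploy only one compute node (e.g. neo-21) instead of the whole pool
    ansible-playbook -i <ci-inventory> <ci-deploy-playbook>.yml --limit neo-21
    # Dropping --limit redeploys everything the playbook targets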
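Likewise, for the pep8 failures and docs builds discussed earlier, a minimal sketch assuming the project's standard OpenStack tox environments:

    # Run the style checks that flagged the iSCSI patch set
    tox -e pep8
    # Build the docs locally to confirm they still render after recent changes
    tox -e docs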