14:01:27 #startmeeting cinder
14:01:27 Meeting started Wed Aug 11 14:01:27 2021 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:27 The meeting name has been set to 'cinder'
14:01:32 #topic roll call
14:01:36 hi
14:01:38 hi
14:01:41 hi
14:01:43 hi
14:01:43 hi
14:01:47 hi
14:01:49 Hi
14:02:25 hi
14:02:42 hi
14:03:00 good turnout
14:03:09 #link https://etherpad.opendev.org/p/cinder-xena-meetings
14:03:52 ok, let's get started
14:03:56 #topic announcements
14:04:05 os-brick Xena release next week
14:04:21 2 things happening in os-brick, both around the nvmeof
14:04:30 agent
14:04:39 #link https://review.opendev.org/c/openstack/os-brick/+/802691
14:04:48 nvmeof multiple volume support
14:04:58 #link https://review.opendev.org/c/openstack/os-brick/+/800014
14:05:17 those are the highest priority reviews right now
14:05:26 so, please review accordingly
14:05:51 also, it would be good if each of the teams working on the two patches mentioned reviewed each other's work
14:06:47 and, of course, it would be good if cinder people in general review the two mentioned patches
14:07:11 i think that's it for announcements
14:07:12 800014 shows a merge conflict, fyi
14:07:24 yeah, i noticed that
14:07:45 pro tip for driver developers: merge conflicts are a real turn-off
14:07:54 :)
14:08:00 important to get those resolved quickly or people won't review
14:08:45 ok, that's all for announcements, i think
14:09:02 #topic How to solve retype error with nas_secure_file_operations
14:09:06 shoffmann: that's you
14:09:15 Hi, thanks.
14:09:47 We saw the issue that retype of available volumes fails when using an nfs share, if nas_secure options are used.
14:10:03 meaning root is not allowed to write to that share.
14:10:17 #link https://bugs.launchpad.net/cinder/+bug/1938196
14:10:34 My question is how we can solve this in a nice way.
14:11:11 my -2 was against adding a new config option, so i think that no longer applies
14:11:12 use the existing nas_secure_file_operations option but also in volume/manager.py (where it is currently not defined for all backends) or introduce a new option.
14:11:25 Maybe there are also other ideas.
14:11:41 #link https://review.opendev.org/c/openstack/cinder/+/802882
14:12:35 Right now, i check if the option is defined, but maybe this is a bit ugly, because the nas_secure option is related to some backend types only.
14:12:42 i'm not sure that the current patch will work in all scenarios
14:13:07 IIRC root access for _qemu_img_info is needed when volumes are attached, regardless of the nas_secure options
14:13:19 (because they may be owned by the qemu user via nova)
14:14:06 In our setup, we configured nova to not change the owner of the file. Not sure if it is possible to enable nas_secure on an NFS share and let nova change the owner.
14:14:19 Because on NetApp only one user and one group can be allowed.
14:14:28 So we use cinder:nova
14:14:47 this code is in the generic nfs driver
14:15:24 years ago i think we generally felt that the nas_secure options weren't fully implemented correctly, do we know if there are other fixes needed like this to successfully use nas_secure?
14:16:33 So far we haven't seen other issues. But maybe others know more.
14:17:32 Is it possible for generic/other nfs drivers to define nas_secure with many allowed users?
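For reference, the two nas_secure options under discussion are per-backend settings in cinder.conf. A minimal generic-NFS backend section might look like the sketch below; the backend name, shares file path, and the explicit 'true' values are placeholders for illustration, not a recommendation.

```ini
# Hypothetical cinder.conf excerpt -- backend name and paths are placeholders.
[DEFAULT]
enabled_backends = nfs-1

[nfs-1]
volume_backend_name = nfs-1
volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_shares
# 'auto' (the default) decides based on how the share is already set up;
# 'true' means file operations run as the cinder user (no root/sudo) and
# volume files on the share are not owned by root.
nas_secure_file_operations = true
nas_secure_file_permissions = true
```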
14:17:39 I haven't seen any other bug related to nas_secure, but I'll check
14:17:46 enriquetaso: there have been many
14:18:02 yes, i think generic nfs configurations commonly do that
14:18:58 these changes are also risky because there's a matrix of configuration options (two nas_secure* options), the generic nfs driver, netapp nfs drivers, whether you configure root_squash, etc.... it's hard to know that a change like this won't break something (which i think it does)
14:19:31 what was the new option (that eric doesn't like)?
14:20:04 Something like "secure_file_operation" or so.
14:20:34 Adding a NetApp-specific option also sounds bad.
14:20:44 what was it going to do that you couldn't do with the current options?
14:20:47 the purpose of the new option overlapped with the existing option
14:21:04 and yea, we want to avoid vendor-specific options
14:21:51 On the reported bug side: i think i can start tagging all these bugs with 'Tags: nas_secure' or something like that to have them all together
14:22:02 enriquetaso: ++
14:22:07 i wonder if the easier fix for the qemu_img_info case wouldn't be to just attempt running it both as root and non-root, or maybe look at the files being inspected to decide which to do
14:22:12 rather than trying to tie it to config options
14:23:07 Are there use cases using the nas_secure_* options but still allowing root to write to the nfs share?
14:23:43 eharney: I can also implement this
14:23:59 same for _connect_device in volume/manager.py?
14:25:25 we may need a list of use cases for all 4 combinations of the existing nas_secure options, i'm not sure if we even know that they are all valid combinations
14:27:09 and what's the deal with 'auto'? seems like we would always be using True for both on fresh installs
14:28:00 whether auto ends up as true or false depends mostly on how the permissions on the share are configured
14:28:19 oh, ok
14:28:46 we basically let deployers decide if they want to have NFS volume files owned as root or as a different user
14:29:11 so that's what you were saying about inspecting the files to decide what to do
14:29:18 which is the decision that the nas_secure_file_permissions option is about
14:29:34 the nas_secure_file_operations option is about whether to run commands as root via sudo or just as the cinder user
14:29:58 yes
14:31:18 slightly off topic, we have a devstack-plugin-nfs-tempest-full zuul job ... i wonder how hard it would be to reconfigure it to test this change
14:31:40 we could do that, but we'd also need to examine what coverage we get from tempest tests
14:31:47 should not be hard to do, though
14:32:28 good point about what exactly tempest is checking
14:33:03 but it looks like we will really need a "live" test of this change to see what happens
14:33:14 yeah, there are a lot of details about file permissions with volumes that are attached, migrating, etc
14:33:42 then I will try to implement checking file permissions. If I should add/change tests in tempest, please tell me (e.g. on the change or bug)
14:34:19 Or should I wait to see what we will do with all the nfs share issues?
14:35:11 i think checking the file permissions is a good start
14:35:37 but it would be helpful if you think about the nfs share issues in general, especially because you work a lot with nfs
14:36:27 anybody want to volunteer to see what tempest and/or cinder-tempest-plugin tests will be relevant for this change?
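A rough sketch of the direction eharney suggests above: decide per file whether root is needed for qemu-img info, instead of keying it off the config options. The helper name and the os.access() check are illustrative assumptions, not the actual remotefs/NFS driver code, which is structured differently.

```python
# Minimal sketch, assuming a helper inside an NFS-style driver.
import os

from cinder.image import image_utils


def _qemu_img_info_for_share(path):
    """Run qemu-img info, escalating to root only when needed.

    If the cinder user can read the file, run unprivileged; if not
    (e.g. the file is owned by the qemu user via nova while attached),
    fall back to running as root.
    """
    run_as_root = not os.access(path, os.R_OK)
    return image_utils.qemu_img_info(path, run_as_root=run_as_root)
```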
14:37:12 shoffmann: it could be helpful if you could list some scenarios where this bug surfaces, as a start
14:37:35 sure, I can help
14:37:50 What is a good place for this? mailing list or next meeting?
14:38:17 shoffmann: i think put some scenarios into an etherpad and then send a note to the mailing list
14:38:25 i need to get familiar with this, i'd like to group all the related bugs first and list the scenarios we are currently having, but I can help with the tempest coverage as well :)
14:38:33 rosmaita++
14:38:40 sounds better
14:38:56 but shoffmann we should continue to follow this at the cinder meetings too
14:39:06 (it's not an either/or is what i mean)
14:39:23 ok, thanks, let's move on
14:39:23 enriquetaso: yes, me too, but I can check the nas_secure option with the netapp nfs driver
14:39:24 fyi all of this interacts w/ cinder-backup too
14:39:52 eharney: good point! we don't want to break backups
14:39:59 shoffmann: ^^
14:40:25 #topic Cinder matrix - Clarification on Snapshot Attachment
14:40:29 walshh_: that's you
14:40:34 Apologies for bringing this one up again. Differing opinions
14:40:37 #link https://review.opendev.org/c/openstack/cinder/+/800015
14:40:55 This functionality is exposed via rpcapi (as a cinder backup enhancement) and not the cinder API, so maybe it should not be in the matrix at all.
14:41:14 Only a couple of drivers can attach snapshots; most cannot.
14:41:44 Just wondering if removing it might be the best option, as it is certainly causing confusion with customers
14:41:49 there was some confusion about this on the mailing list, too
14:42:04 and confusing us also :-)
14:42:05 i think you may be right about removing it from the matrix
14:42:11 i think that makes sense, since it isn't actually a feature that is exposed to users
14:42:18 If this is only for cinder-backup use then I will remove my +1 and agree that it be removed completely from the matrix
14:42:34 And customers do not like to see x's against features
14:42:39 maybe move it over somewhere in the driver development docs? so people at least know to think about it
14:42:45 even if it is not possible to implement in a driver
14:42:49 where by 'people' i mean driver developers
14:43:19 sure I can do that if I am pointed in the correct direction
14:44:04 ok, why don't you go ahead and revise your patch to just remove it, and if someone finds a good place to add a statement, they can tell you on the review
14:44:13 so if there are no other strong opinions for it to remain, I will remove it
14:44:30 sounds good ... if anyone has 2nd thoughts, leave comments on the review
14:44:34 thanks helen
14:44:38 thanks all
14:44:49 #topic RBD fixes that we really need
14:44:53 enriquetaso: you're up
14:45:11 I'd like to point out these two patches here that I think are ready for review. (1) Currently the Ceph backend fails when restoring non-ceph volumes.
14:45:24 #link https://review.opendev.org/c/openstack/cinder/+/750782
14:45:30 and..
14:45:36 (2) When creating an encrypted volume from an encrypted image, the volume gets truncated and is inaccessible after the operation.
14:45:44 that is a bad one
14:45:47 #link https://review.opendev.org/c/openstack/cinder/+/801522
14:46:03 yes, i think rbd really needs both asap
14:46:14 ok, reminder that RBD is a community driver, so it deserves extra love
14:46:22 :)
14:46:27 thanks!
14:46:27 thanks sofia
14:46:41 #topic Fix for PowerVC
14:46:47 simondodsley: you have the floor
14:46:52 Thanks - This is an XS patch that we need to merge so we can create an urgent backport to Wallaby. PowerVC will use Wallaby for its next release. PowerVC uses `provider_id` heavily and so we need it to be correctly associated with all volumes in consistency groups.
14:47:14 https://review.opendev.org/c/openstack/cinder/+/803046
14:47:51 ok, reminder to all driver maintainers -- it is helpful to review other driver patches
14:48:02 I try to :)
14:48:09 (hint to anyone asking in #openstack-cinder for reviews)
14:48:35 simondodsley: that wasn't aimed at you, i have noticed your name showing up on all sorts of reviews
14:48:57 thanks simone
14:49:03 sorry about the type
14:49:06 typo
14:49:09 lol
14:49:23 #topic mypy inline comments
14:49:37 i just saw these on https://review.opendev.org/c/openstack/cinder/+/802882/7/cinder/volume/manager.py
14:49:54 iirc, pylint did this too for a while but we turned it off somehow
14:50:23 just want to see what people's opinions are about what we as a team want here
14:50:49 inline comments are certainly in your face and motivate fixing things
14:51:11 so maybe leave them?
14:51:52 ok, not hearing an outburst of opinions here, let's leave them for now
14:52:03 fwiw, the comments on https://review.opendev.org/c/openstack/cinder/+/802882/7/cinder/volume/manager.py look kind of odd
14:52:13 although, let's review and merge this patch that fixes the particular problem
14:52:16 are they on the right lines?
14:52:27 #link https://review.opendev.org/c/openstack/cinder/+/784453
14:52:33 i don't know, looking now
14:52:47 personal opinion from somebody whose colleagues can sometimes get tired of "please also run this check before pushing to Gerrit": I like advice about improvements, and I like to keep my code as clean as possible, but then I suppose that's common among people gathered here :)
14:53:57 the comments also seem to be reported twice
14:54:03 eharney: line 980 doesn't seem to make sense, and the duplication is not nice
14:54:29 i guess they appeared twice because there was a recheck and the job ran twice
14:55:00 but yeah, in this particular case the line numbers in https://zuul.opendev.org/t/openstack/build/8b0af55849674691ac9b5c032ac2ec81 and the quoted "return True" statements do not seem to correspond to what is shown in Gerrit
14:55:09 just as a side note, there's a patch listed in https://tiny.cc/CinderPriorities where mypy would have helped avert a bug
14:55:22 so i encourage everyone to continue to review eharney's patches
14:55:23 is there any chance mypy was running on the wrong files?!
14:55:34 Roamer`: this could be an artifact of zuul doing a merge before the mypy run, need to check the logs for that
14:56:11 eharney, oh, of course
14:56:27 eharney, yeah, don't know why I didn't realize that when I fight with Zuul merges in our CI weekly :)
14:56:42 hm, so this means that the line numbers will *almost certainly* be wrong
14:57:16 that would seem to indicate that inline comments won't be helpful
14:57:21 right
14:57:28 i can't remember if zuul does that in the check queue or just the gate queue
14:58:40 eharney, I think it has to do it in the check queue, too...
I don't see how things would work without that (it is quite often that the Gerrit change is not made against the tip of the branch)
14:58:47 (at least it does in our CI)
14:58:57 i think it does: https://zuul.opendev.org/t/openstack/build/8b0af55849674691ac9b5c032ac2ec81/log/job-output.txt#334
14:59:27 because it's running on a merge commit of 802882 and not the commit itself... so yeah i'm not sure how the line number thing would actually work
15:00:10 yeah, if the merge is kinda-sorta-trivial, then one could think of writing some tool that parses the Python AST and figures out the line number deltas, but even that could go wrong
15:00:32 ok, looks like we should turn off the inline comments
15:00:58 i will follow up on that
15:01:06 all right, we are out of time
15:01:14 everybody: os-brick reviews!!!
15:01:31 BTW something that I didn't add to the etherpad (I thought it was too trivial, maybe I should have)..... what are the chances for a StorPool trivial performance improvement - implement revert-to-snapshot natively? https://review.opendev.org/c/openstack/cinder/+/680889
15:01:48 (and yes, I realize that I should do more reviews myself, sorry about that)
15:01:56 :)
15:02:33 seems ok ... file a launchpad BP because we are coming up on M-3 feature freeze soon
15:02:42 that will help us prioritize
15:02:45 thanks everyone!
15:02:46 I did, it's at https://blueprints.launchpad.net/openstack/?searchtext=storpool-revert-to-snapshot
15:02:52 but I only did it the day before :)
15:02:54 and thanks
15:03:03 ok, thanks, i will make sure it's in the xena mix
15:03:08 #endmeeting
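For context on the revert-to-snapshot item above: a driver can opt into a native, backend-side revert by implementing the optional revert_to_snapshot method; if it does not, cinder falls back to a slower generic path that copies the snapshot's data back into the volume. A minimal sketch follows; the backend client and its revert_volume() call are made-up placeholders, not the actual StorPool API in the patch under review.

```python
# Illustrative only: ExampleDriver and self._client are hypothetical.
from cinder.volume import driver


class ExampleDriver(driver.VolumeDriver):

    def revert_to_snapshot(self, context, volume, snapshot):
        """Revert a volume to a snapshot directly on the backend.

        Implementing this optional method lets the backend do the revert
        in place instead of cinder copying data through the host.
        """
        self._client.revert_volume(volume.name, snapshot.name)
```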