*** Liang__ has joined #openstack-glance | 01:18 | |
*** gyee has quit IRC | 01:23 | |
openstackgerrit | Ghanshyam Mann proposed openstack/glance stable/stein: Make greande jobs n-v for EM and oldest stable https://review.opendev.org/737417 | 02:06 |
---|---|---|
*** rcernin has quit IRC | 02:41 | |
*** rcernin has joined #openstack-glance | 02:42 | |
*** rchurch has quit IRC | 04:23 | |
*** rchurch has joined #openstack-glance | 04:24 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-glance | 04:33 | |
*** udesale has joined #openstack-glance | 04:41 | |
*** ratailor has joined #openstack-glance | 05:08 | |
*** Liang__ has quit IRC | 05:17 | |
*** Liang__ has joined #openstack-glance | 05:18 | |
*** m75abrams has joined #openstack-glance | 05:31 | |
abhishekk | dansmith, ack, thank you | 05:31 |
*** belmoreira has joined #openstack-glance | 06:49 | |
*** rcernin has quit IRC | 08:18 | |
*** priteau has joined #openstack-glance | 08:57 | |
abhishekk | dansmith, ran two scenarios, added comment on the bug | 09:38 |
abhishekk | https://bugs.launchpad.net/glance/+bug/1884596/comments/1 | 09:38 |
openstack | Launchpad bug 1884596 in Glance "image import copy-to-store will start multiple importing threads due to race condition" [Undecided,New] | 09:38 |
abhishekk | jokke, ^^ | 09:38 |
abhishekk | dansmith, I will be back around 1430 UST | 09:39 |
*** udesale_ has joined #openstack-glance | 09:46 | |
*** udesale has quit IRC | 09:49 | |
*** ratailor_ has joined #openstack-glance | 09:51 | |
*** ratailor has quit IRC | 09:54 | |
*** tkajinam has quit IRC | 10:17 | |
*** Liang__ has quit IRC | 10:23 | |
*** jawad_axd has joined #openstack-glance | 11:43 | |
*** lpetrut has joined #openstack-glance | 11:51 | |
openstackgerrit | Dirk Mueller proposed openstack/glance master: Switch from unittest2 compat methods to Python 3.x methods https://review.opendev.org/737512 | 12:17 |
*** ratailor_ has quit IRC | 12:44 | |
dansmith | abhishekk: ack | 13:34 |
dansmith | abhishekk: my admin context hack *appears* to not have worked: https://zuul.opendev.org/t/openstack/build/72f025316f5344dba473f4f63c24a6ec/log/controller/logs/screen-g-api.txt#7103 | 13:35 |
dansmith | which surprises me from reading the code, so I must be missing something | 13:35 |
abhishekk | oh | 13:36 |
abhishekk | jokke, this is another issue we found yesterday | 13:36 |
abhishekk | https://bugs.launchpad.net/glance/+bug/1884587 | 13:38 |
openstack | Launchpad bug 1884587 in Glance "image import copy-to-store API should reflect proper authorization" [Undecided,New] | 13:38 |
dansmith | abhishekk: can you confirm that a user should be able to do copy-to-store if they can read the image? | 13:40 |
*** m75abrams has quit IRC | 13:40 | |
dansmith | if not, I need to add admin credentials to nova for glance (which would be unfortunate), else I can help try to figure out getting users access to it in glance | 13:41 |
abhishekk | dansmith, they should | 13:41 |
dansmith | ack sweet | 13:41 |
jokke | dansmith: only if they own the image | 13:44 |
dansmith | jokke: wait, really? why? | 13:44 |
jokke | user x cannot copy public or shared image owned by user y | 13:45 |
dansmith | but I can download it and re-upload it and then copy it wherever I want | 13:45 |
jokke | dansmith: as it consumes resources on user y's name | 13:45 |
dansmith | resources for the second location? | 13:45 |
jokke | yep | 13:46 |
dansmith | so, then if nova uses admin credentials for copying these images, they'll be charging the original user | 13:46 |
jokke | Especially now that the limits are coming in, this is getting important. If there is a storage quota, we cannot let just anyone consume the quota'd resources of some other user | 13:46 |
jokke | dansmith: correct | 13:46 |
dansmith | this completely breaks the whole "nova copies the image to the correct rbd for you before boot" thing we discussed | 13:47 |
dansmith | except for user-uploaded private images | 13:47 |
dansmith | like, if a cloud provides a public image anyone can use, that has to be pre-pushed to all the locations by the admin before a regular user can boot something that requires one of those locations | 13:48 |
jokke | dansmith: the visibility is not a factor, the owner is. | 13:48 |
jokke | dansmith: correct, you would expect the user that is maintaining public images to do that. | 13:48 |
dansmith | but that means every image has to be pushed to every location ahead of time | 13:49 |
jokke | as long as you're the owner the visibility doesn't matter, you can copy your images and whatnot. But as a normal user you cannot just copy someone else's image all over the place | 13:49 |
dansmith | I understand what you're saying, I understand the visibility has nothing to do with the mechanics | 13:49 |
dansmith | what I'm saying is, a user that can see a public image and expect to boot it at edge site 2 can't do that, unless the image has been pre-pushed there | 13:50 |
dansmith | or they can download/re-upload the image so they own the new one, and then it will work | 13:50 |
jokke | dansmith: correct | 13:50 |
dansmith | hopefully that confirmation also comes with a realization of that not being a very good experience :) | 13:51 |
jokke | dansmith: and it should make sense that if you're maintaining public images for the cloud, you will make sure they are in the sites where they are supposed to be booted | 13:51 |
dansmith | so in an edge cloud with hundreds of functions and thousands of sites, that pretty much defeats the whole purpose here | 13:52 |
dansmith | especially if it leads to users duplicating images in your cloud so they can be sure a function is available at a given site | 13:53 |
dansmith | anyway, I realize there's a missing piece here to make this work | 13:53 |
dansmith | so just to be clear, | 13:53 |
dansmith | if I upload an image of Xgb, I'm charged for Xgb of image space and if I copy that to another store, I start getting charged for 2Xgb right? | 13:54 |
jokke | that's what I would expect if I'm charged for storage | 13:54 |
jokke | It's resource consumed | 13:55 |
dansmith | I understand | 13:55 |
jokke | Same if I spin 100 vm of same image I get charged 100 vms not one | 13:55 |
dansmith | it's a little different for VMs of course, because people don't start VMs and then make them publicly usable like an image | 13:56 |
dansmith | like, if I'm not the owner, but the owner has copied the image to edge-2 and is being charged for it, I can boot an instance there "for free", | 13:57 |
dansmith | and if they later remove it from that store, I will still consume the storage they were charged for because I have an image backed by that base image, | 13:57 |
dansmith | which can't be deleted until I leave the site | 13:57 |
jokke | yeah ... This might be worthy of trying to get hold of the telcos and ask if it's a problem to them or if their automation will just deal with it. | 14:00 |
dansmith | the probably more expensive problem here is that nova will not be able to determine that an image isn't usable at a site until it schedules a VM there, starts a build and gets a 403 when it tries to copy the image | 14:00 |
dansmith | I definitely know it won't work for several scenarios | 14:01 |
jokke | If it's a problem for big enough usecase it might make sense to have that as configurable. But we can't just let that be the default behavior | 14:01 |
dansmith | so, here are some options I can think of: | 14:01 |
dansmith | 1. Add an owner to a location so we can charge the right person. This will be just as leaky as the current thing, where the first person to boot it at a site gets unlucky and charged for the image | 14:01 |
dansmith | 2. Add a property to the image that says "this can always be copied" and have nova use admin creds to do the copy if that property is set (or glance allow it by non-owner if set) | 14:02 |
dansmith | at the very least we need to check proper auth in the API so nova gets a 403 instead of a timeout waiting for something that will never happen, and glance starting a task that will just fail | 14:03 |
dansmith | I'll see if I can cook that up | 14:03 |
dansmith | without either 1 or 2, we basically can't run tempest against a setup like this because all tempest tests run under their own generated tenant, which will never own the image | 14:09 |
dansmith | so we'll need a new test that does the download/upload and tries it | 14:10 |
abhishekk | yes :( | 14:10 |
dansmith | that's super unfortunate because it's a lot more work and also means we don't just magically get more coverage by running the existing tests, which is what gave us the hint about the race condition :/ | 14:11 |
abhishekk | yes, really | 14:13 |
dansmith | abhishekk: any opinions on which optional solution above is most palatable? | 14:13 |
openstackgerrit | Dan Smith proposed openstack/glance master: WIP: Check authorization before import for image https://review.opendev.org/737548 | 14:14 |
abhishekk | For me 2nd one sounds reasonable | 14:15 |
dansmith | okay I like that less muchly but it's something | 14:16 |
dansmith | does glance have a reporting mechanism by which a user's resource usage is summarized? | 14:17 |
abhishekk | nope | 14:19 |
dansmith | oh, so.. how does a user get charged for multiple locations in an image? | 14:20 |
dansmith | does something external just have to look at locations and know to charge extra? | 14:20 |
dansmith | or was jokke talking about quota? | 14:20 |
abhishekk | sorry got distracted, we have some config options around quota's but nothing about reporting mechanism | 14:44 |
dansmith | abhishekk: okay does quota include the space consumed by an image in multiple locations? | 14:45 |
abhishekk | yes | 14:46 |
dansmith | okay | 14:46 |
dansmith | abhishekk: if I remove an image from an rbd store, and that image is in use by instances booted from it, what happens? does glance refuse to delete the location? does it get deleted from glance but persist in rbd? | 14:47 |
abhishekk | dansmith, I never checked this in rbd but I guess image gets deleted from glance | 14:49 |
abhishekk | the glance_store rbd driver has an ImageInUse exception, but I have never seen it raised | 14:49 |
dansmith | abhishekk: when would the base image be deleted then? if glance stops tracking it does it go away automatically if the last vm is deleted? or do we leak that image? | 14:50 |
*** jawad_axd has quit IRC | 14:51 | |
abhishekk | dansmith, https://github.com/openstack/glance_store/blob/master/glance_store/_drivers/rbd.py#L444 | 14:51 |
*** priteau has quit IRC | 14:52 | |
dansmith | abhishekk: okay so the image won't get deleted if it's in use? | 14:52 |
abhishekk | that should be the case, but I never used/seen this exception raised | 14:53 |
dansmith | it's raised on that line you just linked.. or do you mean you've never seen it happen in real life? | 14:54 |
abhishekk | I mean, I have never booted an instance from the rbd store and then deleted it :( | 14:55 |
dansmith | okay, so, let's assume that the code works, and the image won't be deleted: | 14:55 |
dansmith | if I own an image, I copy it to the edge-2 store, start being charged quota (and maybe money) | 14:55 |
dansmith | another user boots an instance from that image in edge-2, for free | 14:56 |
dansmith | now until that other user deletes their instance, I can't stop being charged for the image in that location | 14:56 |
dansmith | that seems bad right? | 14:56 |
abhishekk | yes | 14:57 |
dansmith | another question... is it true that public images can only be owned by admins? | 14:57 |
abhishekk | I guess that's not the case | 15:03 |
dansmith | okay | 15:04 |
dansmith | I couldn't really find anything in the docs | 15:05 |
dansmith | anyway, I'm concerned about the leakiness of this model, I guess specifically when multiple stores are used | 15:05 |
dansmith | but it sounds like going forward with option #2 above will at least alleviate some of the problem | 15:06 |
abhishekk | give me some time, my laptop crashed in afternoon :/, I am still setting it up | 15:07 |
dansmith | no problem, sorry for all the questions, just trying to get up to speed | 15:07 |
*** m75abrams has joined #openstack-glance | 15:08 | |
abhishekk | it's not documented, but looking at the policies, it seems true that public images can only be owned by admins | 15:09 |
*** belmoreira has quit IRC | 15:09 | |
dansmith | is that policy changeable though? | 15:10 |
dansmith | anyway, if only admin images can be public, it surely seems like allowing images to be copied if they're public would be reasonable, since they're....public :) | 15:10 |
abhishekk | yes, that policy is changeable | 15:11 |
dansmith | ack | 15:12 |
dansmith | did we lose jokke due to timezone or something else? | 15:12 |
abhishekk | no idea | 15:12 |
abhishekk | he might be in meeting, his calendar shows that | 15:14 |
abhishekk | dansmith, still around? | 15:19 |
dansmith | yeah | 15:19 |
abhishekk | I booted a vm from an image which is in the rbd store | 15:20 |
abhishekk | deleted it | 15:20 |
dansmith | s/it/image/ ? | 15:20 |
abhishekk | yes | 15:20 |
abhishekk | image got deleted from glance and not listed in rbd as well (sudo rbd ls images) | 15:20 |
abhishekk | nova shows below line in output of "nova show" | 15:21 |
abhishekk | | image | Image not found (f8940ac5-5488-46c3-b6a4-185beab7711f) | 15:21 |
dansmith | that from nova is what is expected to happen if you delete the image, yeah | 15:21 |
dansmith | so the image is deleted and the storage is still consumed by the base, but presumably when you delete the last VM using the base, the refcount will go to zero and the storage will come back? | 15:22 |
dansmith | so this is a hidden storage consumption that we can't expose anywhere | 15:22 |
dansmith | that the user never paid for, if they didn't own the image | 15:22 |
dansmith | but certainly better that the original user doesn't keep getting charged | 15:23 |
abhishekk | yes | 15:23 |
abhishekk | so that raises another question for me: when will that InUse exception be raised? | 15:24 |
abhishekk | dansmith, going for dinner, will be back in 20 mins | 15:26 |
dansmith | yeah | 15:26 |
dansmith | abhishekk: ack | 15:26 |
dansmith | abhishekk: jokke: I wonder if we should be granting copy ability to the user so that you can say things in policy like "anything public can be copied" or "if the image is shared to you as a member, you can copy", etc ? | 15:33 |
*** lpetrut has quit IRC | 15:44 | |
*** gyee has joined #openstack-glance | 15:59 | |
*** m75abrams has quit IRC | 16:33 | |
*** udesale_ has quit IRC | 17:05 | |
abhishekk | dansmith, sorry was out due to power outage | 17:06 |
dansmith | abhishekk: np | 17:18 |
dansmith | the gate has made no progress testing my auth patch since you left anyway :) | 17:18 |
abhishekk | ohh, release traffic :) | 17:25 |
dansmith | abhishekk: did you see my question about policy? | 17:32 |
abhishekk | missed, looking now | 17:33 |
abhishekk | need to explore, but the second one (if the image is shared with you as a member, then you can copy) should be reasonable | 17:34 |
dansmith | presumably if it's defined by policy then an admin could also grant he first right? | 17:36 |
dansmith | *the first | 17:36 |
dansmith | and if we were to gain this ability, then we could configure a devstack to allow copying public images and get the desired test coverage, as well as enable the case where a user and operator want this to "just work" | 17:37 |
dansmith | without breaking the default of it not working | 17:37 |
dansmith | I would think this would be more flexible and desirable than just a flag on an image, even if a little bit more complicated | 17:38 |
abhishekk | in glance policy check is done at different different layers, so going to be much complicated | 17:38 |
dansmith | policy check is done at *multiple* different layers, or at a different layer than we'd need for this to work? | 17:39 |
abhishekk | policy check is done at *multiple* different layers | 17:42 |
dansmith | okay, so it would be okay to do a policy check at the top/API layer, but might be defeated by another check somewhere down in the stack that doesn't know what we're doing.. is that your concern? | 17:43 |
abhishekk | yes | 17:45 |
dansmith | okay | 17:46 |
dansmith | is that something that is unchangeable in glance, or is there a desire to consolidate policy checks up to the higher layer (as we have done in nova)? | 17:47 |
abhishekk | it has been in planning for 3-4 cycles, but we didn't have many contributors to do it | 17:50 |
dansmith | meaning, is the distributed checking in lots of places the desired approach by design, or just a consequence of things being added by different authors? | 17:50 |
dansmith | what has been, moving policy checks up to the api? | 17:50 |
abhishekk | also there are some known issues with the location API | 17:51 |
abhishekk | https://review.opendev.org/#/c/528021/2/specs/rocky/approved/glance/policy-refactor.rst | 17:55 |
dansmith | oye okay | 17:55 |
dansmith | my takeaway from that is that refactoring this path to check policy at the API is probably reasonable in _approach_, the feasibility of it notwithstanding | 17:56 |
dansmith | I would rather do the refactoring of the bits needed to enable this to be a policy knob, than punt on "it's too hard" and add another image flag, unless it's *really* hard | 17:57 |
dansmith | but we should probably hear from jokke for an opinion right? | 17:57 |
abhishekk | yes | 17:57 |
dansmith | what is the best way to do that? glance meeting? | 17:58 |
abhishekk | yes | 17:58 |
abhishekk | I will add this topic in discussion agenda | 17:58 |
dansmith | okay, I put it on my calendar | 17:59 |
abhishekk | ack | 18:00 |
openstackgerrit | Abhishek Kekane proposed openstack/glance master: WIP: Fix race condition in copy image operation https://review.opendev.org/737596 | 18:28 |
*** jawad_axd has joined #openstack-glance | 18:50 | |
abhishekk | signing out for the day | 19:25 |
openstackgerrit | Cyril Roelandt proposed openstack/python-glanceclient master: Do not use the six library. https://review.opendev.org/735670 | 19:57 |
*** rchurch has quit IRC | 20:28 | |
*** rchurch has joined #openstack-glance | 20:30 | |
*** tkajinam has joined #openstack-glance | 22:53 | |
*** rcernin has joined #openstack-glance | 23:02 | |
*** rcernin has quit IRC | 23:08 | |
*** rcernin has joined #openstack-glance | 23:08 | |
*** kgz has quit IRC | 23:29 | |
*** freerunner has quit IRC | 23:30 | |
*** freerunner has joined #openstack-glance | 23:31 | |
*** kgz has joined #openstack-glance | 23:32 |
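
The copy-to-store rules discussed in this log can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Glance code: `Image`, `CopyPolicy`, `Forbidden`, `can_copy`, `copy_to_store` and `quota_usage_gb` are all hypothetical names. It assumes the owner-only default jokke describes, the two optional policy knobs dansmith suggests ("anything public can be copied", "members can copy"), an authorization check done before any work starts so the caller gets an immediate refusal instead of a timeout, and quota that counts one full copy per store.

```python
from dataclasses import dataclass, field


class Forbidden(Exception):
    """Stands in for an HTTP 403 response; hypothetical, not a Glance class."""


@dataclass
class Image:
    owner: str
    size_gb: int
    visibility: str = "private"  # "private", "shared" or "public"
    members: set = field(default_factory=set)
    stores: set = field(default_factory=set)


@dataclass
class CopyPolicy:
    # Both knobs are off by default, matching the owner-only default.
    allow_public: bool = False
    allow_members: bool = False


def can_copy(user, image, policy, is_admin=False):
    """Return True if this user may copy the image to another store."""
    if is_admin or user == image.owner:
        return True
    if policy.allow_public and image.visibility == "public":
        return True
    if policy.allow_members and user in image.members:
        return True
    return False


def copy_to_store(user, image, store, policy, is_admin=False):
    """Check authorization first, then record the new location."""
    if not can_copy(user, image, policy, is_admin):
        raise Forbidden(f"{user} may not copy this image to {store}")
    image.stores.add(store)


def quota_usage_gb(image):
    """Space counted against the owner: one full copy per store."""
    return image.size_gb * len(image.stores)
```

With a 5 GB public image owned by one user and held in one store, the owner copying it to a second store doubles the counted usage to 10 GB. A different user is refused under the default policy and succeeds only when `allow_public` is enabled, in which case the extra copy is still counted against the owner, which is the leak discussed above.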