*** spotz has joined #openstack-glance | 00:25 | |
*** rcernin_ has joined #openstack-glance | 02:58 | |
*** rcernin has quit IRC | 02:59 | |
*** rcernin_ has quit IRC | 03:16 | |
*** rcernin_ has joined #openstack-glance | 03:32 | |
*** Liang__ has joined #openstack-glance | 03:34 | |
*** rcernin_ has quit IRC | 03:45 | |
*** rcernin has joined #openstack-glance | 03:45 | |
*** ratailor has joined #openstack-glance | 04:16 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-glance | 04:33 | |
*** Liang__ has quit IRC | 04:49 | |
*** Liang__ has joined #openstack-glance | 04:51 | |
*** rcernin has quit IRC | 05:32 | |
*** udesale has joined #openstack-glance | 05:33 | |
*** rcernin has joined #openstack-glance | 05:40 | |
*** m75abrams has joined #openstack-glance | 06:16 | |
openstackgerrit | Abhishek Kekane proposed openstack/glance_store master: Release notes for Victoria Milestone 1 https://review.opendev.org/737218 | 06:41 |
---|---|---|
openstackgerrit | Rajat Dhasmana proposed openstack/glance_store master: Don't allow image creation with encrypted nfs volumes https://review.opendev.org/732506 | 07:05 |
openstackgerrit | Rajat Dhasmana proposed openstack/glance_store master: Don't allow image creation with encrypted nfs volumes https://review.opendev.org/732506 | 07:08 |
*** bhagyashris is now known as bhagyashris|lunc | 07:27 | |
*** lpetrut has joined #openstack-glance | 07:40 | |
*** rcernin_ has joined #openstack-glance | 07:47 | |
*** rcernin has quit IRC | 07:47 | |
*** rcernin_ has quit IRC | 07:54 | |
*** bhagyashris|lunc is now known as bhagyashris | 08:34 | |
*** rajivmucheli has joined #openstack-glance | 08:52 | |
*** rajivmucheli has quit IRC | 09:13 | |
*** tkajinam has quit IRC | 09:21 | |
*** Liang__ has quit IRC | 09:59 | |
*** ratailor has quit IRC | 12:21 | |
*** udesale_ has joined #openstack-glance | 12:25 | |
*** udesale has quit IRC | 12:27 | |
*** rosmaita has joined #openstack-glance | 12:51 | |
*** jv_ has joined #openstack-glance | 13:52 | |
openstackgerrit | Markus Hentsch proposed openstack/glance-specs master: Spec for the Glance part of Image Encryption https://review.opendev.org/609667 | 14:48 |
*** m75abrams has quit IRC | 15:44 | |
*** gyee has joined #openstack-glance | 15:55 | |
*** udesale_ has quit IRC | 16:21 | |
*** lpetrut has quit IRC | 16:50 | |
dansmith | abhishekk: jokke: Need some help figuring out why the copy-to-store isn't working in my test | 17:40 |
dansmith | I see the task starting, but eventually it looks like they fail like this: | 17:40 |
dansmith | https://zuul.opendev.org/t/openstack/build/18e4701c1a374bf09269778479160f25/log/controller/logs/screen-g-api.txt#7466 | 17:40 |
dansmith | specifically: glance.common.exception.Forbidden: You are not permitted to modify 'os_glance_importing_to_stores' on this image. | 17:41 |
dansmith | nova just waits for the full timeout for the copy to finish but never sees that happen, I'm guessing because glance isn't updating those meta properties? | 17:41 |
abhishekk | dansmith, never faced this issue before | 17:46 |
abhishekk | Can I see the test? | 17:47 |
dansmith | it's just a test trying to create an instance, | 17:47 |
dansmith | with my nova patch that requests a copy-to-store from file to rbd before progressing | 17:48 |
dansmith | it works locally | 17:48 |
dansmith | abhishekk: what would cause glance to not be able to update that status property on the image? | 17:49 |
dansmith | shouldn't that be using an admin context or something such that it can always do that? | 17:49 |
abhishekk | yes it should | 17:49 |
dansmith | abhishekk: but it looks like it's failing even in just the plugin loading process right? | 17:54 |
dansmith | is that loading the copy-to-store plugin? | 17:54 |
abhishekk | No plugin is loaded at the run-time i.e. when service starts | 17:55 |
abhishekk | this is where we build the actual flow of image creation | 17:56 |
dansmith | okay, I don't even see any context usage in that path really | 17:56 |
dansmith | abhishekk: you have the curse of being helpful so I'm pinging you, but if someone else is the better point of contact, let me know | 17:58 |
abhishekk | dansmith, jokke is the one but I guess at the moment he is facing some connectivity issues | 17:59 |
dansmith | okay | 17:59 |
abhishekk | does multiple import test worked> | 18:00 |
abhishekk | ? | 18:00 |
abhishekk | or you just trying copying scenario first? | 18:01 |
dansmith | the import conversion from qcow to raw worked fine, | 18:04 |
dansmith | this is now trying to do a copy-to-store | 18:04 |
dansmith | seems to be failing here: | 18:04 |
dansmith | image.extra_properties[ | 18:04 |
dansmith | 'os_glance_importing_to_stores'] = ','.join((store for store in | 18:04 |
dansmith | stores if | 18:04 |
dansmith | store is not None)) | 18:04 |
abhishekk | yes, but this is the same flow which ran successfully for conversion thing | 18:05 |
dansmith | but with different owners | 18:05 |
abhishekk | But in that case you are creating image with single store I guess | 18:05 |
dansmith | is that why? it's letting me do a copy-to-store, but then the task fails later async? | 18:05 |
abhishekk | I guess so | 18:06 |
dansmith | that seems broken to me | 18:07 |
dansmith | (a) if I can read the image, why can't I copy it to a store I have access to? | 18:07 |
dansmith | (b) the import should fail with 403, | 18:07 |
dansmith | and not just fail later | 18:07 |
dansmith | and (c) if it fails later, it needs to add to failed_stores or whatever so I know to stop looking | 18:07 |
abhishekk | the C part is not executing as it is failing before actual taskflow execution starts | 18:08 |
abhishekk | what two different owners are you talking about? | 18:09 |
abhishekk | Jun 22 15:57:24.812816 ubuntu-bionic-rax-iad-0017311577 glance-api[14039]: WARNING keystonemiddleware.auth_token [-] Authorization failed for token: keystonemiddleware.auth_token._exceptions.InvalidToken: Token authorization failed | 18:09 |
dansmith | devstack uploads the image as one tenant, but tempest tests all run under a different one | 18:10 |
abhishekk | I can see this warning in between as well, but I guess it doesn't related | 18:10 |
dansmith | re: (c) above, if you returned 20x to me when you started the task, but the task fails later, there needs to be some way for my polling loop to determine that it has failed and I should stop polling | 18:11 |
dansmith | ideally, you would start the task running before returning from the initial POST, so I know that 20x means it's actually loaded and runnable, but what matters more is that I have some way of knowing | 18:13 |
abhishekk | yeah | 18:14 |
dansmith | because I'm also seeing what looks like a race to start the conversion.. if two clients ask for copy-to-store simultaneously, it kinda looks like two tasks will start with the same goals | 18:14 |
abhishekk | yes | 18:16 |
dansmith | that's a pretty big problem, if so | 18:17 |
abhishekk | IMO second operation will fail in that case | 18:18 |
dansmith | presumably tasks are tracked in the database such that we could determine if more than one is started with the same goal before running right? | 18:18 |
dansmith | will currently or should be made to fail? | 18:18 |
abhishekk | looking for confirmation | 18:20 |
dansmith | okay, my next question will be... if I make nova do this with admin credentials, will that fix the ability to update the image? | 18:21 |
dansmith | and/or do you think it's a bug that this is failing in the first place with another tenant asking for the copy? | 18:22 |
abhishekk | For first question, if two clients ask for copy-to-store same image simultaneously then there will problem | 18:23 |
abhishekk | if I make nova do this with admin credentials, will that fix the ability to update the image? >>> this should work ok | 18:24 |
dansmith | okay, so admin should fix me quickly, but.. what *should* be the case? I would expect using the user's token is better so that any auditing takes place as the user, etc. | 18:26 |
openstackgerrit | Merged openstack/glance_store master: Release notes for Victoria Milestone 1 https://review.opendev.org/737218 | 18:26 |
abhishekk | not sure at the moment, need to go through it again :( | 18:29 |
dansmith | okay | 18:31 |
dansmith | I just want to know whether it's expected to use admin for this, which means the bug is "I should have gotten 403 instead of 20x", or if it should work which means the bug is "admin shouldn't be required but is" | 18:32 |
dansmith | abhishekk: either way, it sounds like there's also an important bug around preventing two nodes from copying the image at the same time right? | 18:32 |
abhishekk | two nodes, means two controller nodes? | 18:33 |
dansmith | no, compute nodes | 18:33 |
dansmith | or of course, two separate users who do this manually | 18:35 |
abhishekk | possibility is more if that is not HA setup (i.e. only single glance is running then its 100% sure) | 18:35 |
dansmith | okay I'm not sure what you mean | 18:36 |
abhishekk | I am saying staging is local to api node | 18:36 |
dansmith | even with only one controller node running glance standalone, there will be multiple worker processes right? | 18:36 |
abhishekk | we are using file store for staging which will be only one for all the processes | 18:37 |
dansmith | okay, so are you saying that if we weren't hitting this permission issue, that one of the threads would "win" and the rest would never start, but only if they're all threads on one machine? | 18:37 |
dansmith | and that if there are multiple API worker nodes, that "bad things would happen" in that case? | 18:37 |
abhishekk | what I am saying is | 18:38 |
abhishekk | 1. there are two requests for copying image A to store 2 and 3 | 18:38 |
abhishekk | 2. first request will copy image in staging area /whatever/os_glance_staging_dir | 18:39 |
abhishekk | 3. Second request will also try to copy iamge in staging area but it will see that file is present there already so it will exit that task and proceed to import | 18:39 |
abhishekk | 4. In this case there might be possibility that when Import task starts for request 2 staging of image from request 1 is not completed | 18:40 |
dansmith | so it will start copying the partial image? | 18:41 |
abhishekk | which leads to corrupt or partial image upload | 18:41 |
dansmith | gotcha, so that will happen even in the single-glance-api case | 18:41 |
abhishekk | yes, can be | 18:42 |
dansmith | okay, so that's a bug to file for sure right? | 18:42 |
abhishekk | yes | 18:43 |
dansmith | can you do that and get me the bug number, since you understand that path? | 18:43 |
dansmith | and I think another bug is the "report 20x but fail with permissions later, should have returned 403" issue right? I can file that one | 18:43 |
abhishekk | Is it ok if I do it tomorrow morning my time? | 18:44 |
dansmith | NO, DO IT NOW! | 18:44 |
dansmith | (yes of course :) | 18:44 |
abhishekk | :D | 18:44 |
dansmith | but confirm, the above 403 is a bug and I will file, yeah? | 18:44 |
dansmith | i.e. if it is going to fail because admin is required, it should fail as 403 immediately and not 20x and fail later | 18:45 |
abhishekk | I am still confused about this case, but I think it as a bug | 18:46 |
dansmith | okay I will write it up as a bug and you can look tomorrow and maybe get a better idea with complete prose instead of my IRC ramblings | 18:46 |
abhishekk | We need couple of more error handling around these new plugins | 18:47 |
abhishekk | like if import operation is in progress and someone sends copy-image request for the same image | 18:47 |
dansmith | supposedly there is a known error for that, according to the API | 18:48 |
dansmith | and I'm checking for it | 18:48 |
dansmith | however, I think it doesn't ever happen because the credential thing never allows the other one to start | 18:48 |
abhishekk | it will happen | 18:49 |
abhishekk | We have some flags around import operation to set image to active while importing to multiple stores | 18:49 |
dansmith | okay, I'm checking for it, but I don't log anything, so maybe it is | 18:49 |
dansmith | abhishekk: right, but because of the permissions thing, the image never gets the "importing_to_stores=$newstore" set on it | 18:50 |
dansmith | so if you're checking that, you won't see it in progress | 18:50 |
dansmith | but maybe there is other accounting? | 18:50 |
abhishekk | I will test this scenario tomorrow which is in my mind | 18:51 |
dansmith | ack | 18:51 |
abhishekk | But sounds like "importing_to_stores=$newstore this will prevent second one | 18:51 |
dansmith | meaning prevent the race between two competing imports? | 18:52 |
smcginnis | abhishekk: I updated https://review.opendev.org/#/c/735735/ to pick up that release note. If you could take a quick look when you have a moment, that would be great. | 18:52 |
abhishekk | smcginnis, ack, will look in 5-10 minutes | 18:52 |
smcginnis | Thanks! | 18:53 |
abhishekk | dansmith, I will try to reproduce this issue in my local environment first before filing this | 18:53 |
dansmith | okay | 18:53 |
abhishekk | Just to be on same page, the scenario I will test will be | 18:54 |
abhishekk | 1. create image in single store | 18:54 |
abhishekk | 2. send one request to copy it in other stores | 18:54 |
abhishekk | 3. Send 2nd request to do the same | 18:54 |
abhishekk | almost immediately | 18:54 |
dansmith | yes, but, | 18:55 |
dansmith | unless the task checks that no other task is also doing the import *after* it has succeeded in updating the image state, you can get a double insert race | 18:55 |
dansmith | meaning you get this pattern: | 18:55 |
dansmith | 1. task1 checks to see if already importing, false | 18:55 |
dansmith | 2. task2 checks to see if already importing, false | 18:56 |
dansmith | 3. task1 updates to importing | 18:56 |
dansmith | 4. task2 updates to importing | 18:56 |
dansmith | 5. chaos | 18:56 |
dansmith | it's hard to prove that can't happen by testing externally | 18:56 |
abhishekk | In case of copy operation iamge status does not change it remains active only | 18:56 |
dansmith | I know, by "checks" I mean "checks to see if the store is in os_glance_importing_to_stores" | 18:57 |
abhishekk | aha | 18:57 |
dansmith | to prevent the above, | 18:57 |
dansmith | each task needs a second check, after it succeeds in updating, to make sure that it and only it is importing | 18:57 |
abhishekk | hmm | 18:58 |
abhishekk | smcginnis, what is the significance of 'No' an empty file? | 19:00 |
smcginnis | Huh, no idea where that came from. Let me fix that quick. | 19:01 |
abhishekk | smcginnis, ack | 19:01 |
abhishekk | dansmith, I am going offline now, its quiet late here (please let me know if you found something else) | 19:02 |
smcginnis | abhishekk: Cleaned that up. | 19:02 |
dansmith | abhishekk: just filed: https://bugs.launchpad.net/glance/+bug/1884587 | 19:02 |
openstack | Launchpad bug 1884587 in Glance "image import copy-to-store API should reflect proper authorization" [Undecided,New] | 19:02 |
smcginnis | abhishekk: Have a good night. | 19:02 |
dansmith | abhishekk: no problem we can circle back tomorrow | 19:02 |
abhishekk | smcginnis, looking | 19:02 |
abhishekk | dansmith, looking | 19:02 |
dansmith | abhishekk: seriously, it's fine to go and look tomorrow | 19:03 |
abhishekk | yes, thank you and have a good day ahead | 19:03 |
abhishekk | smcginnis, for you too ^^ | 19:03 |
dansmith | o/ | 19:03 |
smcginnis | Thanks abhishekk | 19:04 |
abhishekk | o/~ | 19:04 |
dansmith | abhishekk: for your morrow, I filed this bug in anticipation of additional looking you might do tomorrow, just to try to clearly lay out the process by which multiple threads may get started: https://bugs.launchpad.net/glance/+bug/1884596 | 19:53 |
openstack | Launchpad bug 1884596 in Glance "image import copy-to-store will start multiple importing threads due to race condition" [Undecided,New] | 19:53 |
dansmith | if that is not possible for some reason, then we can close it as invalid, but ... I think it's valid :) | 19:53 |
openstackgerrit | Dan Smith proposed openstack/glance master: WIP: Create image import factory with admin context https://review.opendev.org/737382 | 21:29 |
*** rcernin_ has joined #openstack-glance | 22:32 | |
*** rcernin_ has quit IRC | 22:47 | |
*** tkajinam has joined #openstack-glance | 22:51 | |
*** rcernin_ has joined #openstack-glance | 23:02 | |
*** rcernin_ has quit IRC | 23:16 | |
*** rcernin has joined #openstack-glance | 23:18 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!