*** yangyapeng has quit IRC | 00:03 | |
*** mtanino has quit IRC | 00:11 | |
*** pgbridge has quit IRC | 00:20 | |
*** akerr_ is now known as akerr_away | 00:33 | |
*** akerr_away is now known as akerr_ | 00:33 | |
*** kaisers_ has joined #openstack-manila | 00:34 | |
*** kaisers_ has quit IRC | 00:38 | |
*** Suyi_ has quit IRC | 00:40 | |
*** liyifeng has quit IRC | 00:57 | |
*** yangyapeng has joined #openstack-manila | 01:16 | |
*** akshai has joined #openstack-manila | 01:23 | |
*** akshai has quit IRC | 01:35 | |
*** vbellur has joined #openstack-manila | 01:50 | |
*** liyifeng has joined #openstack-manila | 01:50 | |
*** yangyape_ has joined #openstack-manila | 02:03 | |
*** yangyapeng has quit IRC | 02:04 | |
*** akerr_ has quit IRC | 02:16 | |
*** kaisers_ has joined #openstack-manila | 02:23 | |
*** gouthamr has joined #openstack-manila | 02:24 | |
*** gouthamr_ has joined #openstack-manila | 02:27 | |
*** kaisers_ has quit IRC | 02:27 | |
*** gouthamr has quit IRC | 02:30 | |
*** yangyapeng has joined #openstack-manila | 02:34 | |
*** yangyape_ has quit IRC | 02:37 | |
*** cknight has joined #openstack-manila | 03:26 | |
*** gouthamr_ has quit IRC | 03:29 | |
*** nkrinner_afk is now known as nkrinner | 03:47 | |
*** liyifeng has quit IRC | 04:02 | |
*** liyifeng has joined #openstack-manila | 04:17 | |
*** kaisers_ has joined #openstack-manila | 04:23 | |
*** kaisers_ has quit IRC | 04:23 | |
*** furlongm has quit IRC | 04:43 | |
*** furlongm_ has joined #openstack-manila | 04:43 | |
*** vbellur has quit IRC | 05:02 | |
*** furlongm_ has quit IRC | 05:05 | |
*** furlongm_ has joined #openstack-manila | 05:06 | |
*** cknight has quit IRC | 05:06 | |
*** yangyape_ has joined #openstack-manila | 05:42 | |
*** yangyapeng has quit IRC | 05:42 | |
*** vbellur has joined #openstack-manila | 05:55 | |
*** lpetrut has joined #openstack-manila | 06:19 | |
openstackgerrit | Daniel Gonzalez Nothnagel proposed openstack/manila: Add binding_profile option for backends https://review.openstack.org/284034 | 06:35 |
---|---|---|
openstackgerrit | Daniel Gonzalez Nothnagel proposed openstack/manila: Add multi-segment support https://review.openstack.org/277731 | 06:35 |
openstackgerrit | Daniel Gonzalez Nothnagel proposed openstack/manila: Add neutron driver for binding https://review.openstack.org/283494 | 06:35 |
*** tovchinnikova has joined #openstack-manila | 06:41 | |
*** liyifeng has quit IRC | 07:01 | |
*** pcaruana has joined #openstack-manila | 07:21 | |
*** zengyingzhe has quit IRC | 07:22 | |
*** zengyingzhe has joined #openstack-manila | 07:22 | |
*** ekarlso has quit IRC | 07:29 | |
*** lpetrut has quit IRC | 07:36 | |
*** ekarlso has joined #openstack-manila | 07:37 | |
*** furlongm_ has quit IRC | 07:38 | |
*** furlongm_ has joined #openstack-manila | 07:42 | |
openstackgerrit | Alexey Ovchinnikov proposed openstack/manila: Container driver https://review.openstack.org/308930 | 07:44 |
*** lpetrut has joined #openstack-manila | 07:46 | |
*** yangyapeng has joined #openstack-manila | 07:59 | |
*** liyifeng has joined #openstack-manila | 07:59 | |
*** yangyap__ has joined #openstack-manila | 08:01 | |
*** dsariel has joined #openstack-manila | 08:02 | |
*** yangyape_ has quit IRC | 08:03 | |
*** yangyapeng has quit IRC | 08:04 | |
*** lpetrut has quit IRC | 08:04 | |
*** yangyap__ has quit IRC | 08:05 | |
openstackgerrit | Béla Vancsics proposed openstack/manila: Use more specific asserts in tests https://review.openstack.org/349926 | 08:35 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [Tempest] Run tempest with the only plugin - manila's https://review.openstack.org/338371 | 08:39 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [Tempest] Run tempest with the only plugin - manila's https://review.openstack.org/338371 | 08:42 |
*** yangyapeng has joined #openstack-manila | 08:42 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [debug] do not merge https://review.openstack.org/350168 | 08:44 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/manila: Updated from global requirements https://review.openstack.org/348635 | 09:02 |
*** amitkqed has quit IRC | 09:04 | |
*** amitkqed has joined #openstack-manila | 09:04 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [Tempest] Run tempest with the only plugin - manila's https://review.openstack.org/338371 | 09:24 |
*** lpetrut has joined #openstack-manila | 09:26 | |
openstackgerrit | Marc Koderer proposed openstack/manila: Fix issue with snapshot creation in NetApp driver https://review.openstack.org/350473 | 09:30 |
openstackgerrit | Béla Vancsics proposed openstack/manila: Use more specific asserts in tests https://review.openstack.org/349926 | 09:37 |
*** furlongm_ has quit IRC | 09:37 | |
*** furlongm_ has joined #openstack-manila | 09:37 | |
*** gaurangt has joined #openstack-manila | 09:40 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [Tempest] Run tempest with the only plugin - manila's https://review.openstack.org/338371 | 09:41 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [debug] do not merge https://review.openstack.org/350168 | 09:42 |
*** kambiz has joined #openstack-manila | 09:44 | |
openstackgerrit | zhongjun proposed openstack/python-manilaclient: Add snapshot instances admin APIs https://review.openstack.org/304449 | 10:07 |
*** amit213 has quit IRC | 10:10 | |
*** amit213 has joined #openstack-manila | 10:12 | |
openstackgerrit | zhongjun proposed openstack/python-manilaclient: Add snapshot instances admin APIs https://review.openstack.org/304449 | 10:15 |
*** yangyapeng has quit IRC | 10:26 | |
*** yangyapeng has joined #openstack-manila | 10:27 | |
*** yangyapeng has quit IRC | 10:46 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG] bug 1566815 https://review.openstack.org/350174 | 11:10 |
openstack | bug 1566815 in Manila "share manager fails to remove access rules on replicated share" [Critical,In progress] https://launchpad.net/bugs/1566815 - Assigned to Rodrigo Barbieri (rodrigo-barbieri2010) | 11:10 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG2] bug 1566815 https://review.openstack.org/350175 | 11:11 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG3] bug 1566815 https://review.openstack.org/350176 | 11:11 |
*** jcsp has joined #openstack-manila | 11:28 | |
*** alyson_ has joined #openstack-manila | 11:28 | |
*** yangyapeng has joined #openstack-manila | 11:42 | |
*** tpsilva has joined #openstack-manila | 11:48 | |
*** ganso has joined #openstack-manila | 11:53 | |
ganso | vponomaryov: ping | 11:54 |
vponomaryov | ganso: pong | 11:54 |
ganso | vponomaryov: Hi Valeriy | 11:54 |
vponomaryov | ganso: Hello Rodrigo | 11:54 |
ganso | vponomaryov: answering your yesterday's messages: I also suspect the bug is fixed | 11:55 |
vponomaryov | ganso: I am not anymore | 11:55 |
ganso | vponomaryov: looking at your debug patches, they are incomplete | 11:55 |
ganso | vponomaryov: so the failures you reported are expected | 11:55 |
vponomaryov | ganso: what is lost you think? | 11:55 |
ganso | vponomaryov: let me link you just a sec | 11:55 |
*** gaurangt has left #openstack-manila | 11:56 | |
ganso | vponomaryov: this: https://review.openstack.org/#/c/332267/37/manila/db/sqlalchemy/api.py | 11:56 |
ganso | vponomaryov: I did not look at ALL the logs you posted, but looking at some of them I immediately noticed that the code above ^ is missing and is necessary to complete the fix | 11:57 |
ganso | vponomaryov: part of the errors were related to "ResourceNotFound" when querying an instance access rule | 11:57 |
ganso | vponomaryov: the code above addresses that | 11:57 |
ganso | vponomaryov: either way | 11:58 |
ganso | vponomaryov: I am going to separate the patch | 11:58 |
ganso | vponomaryov: I need part of that fix in order for migration to work, which is not completely related to the other errors observed | 11:59 |
ganso | vponomaryov: so I will rebase on top | 11:59 |
ganso | vponomaryov: If the fix, once separated, does not address the bug you filed entirely, I will file another bug related only to migration which I am addressing. Unfortunately I cannot invest time right now finding the root cause of all the other errors | 12:02 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG3] bug 1566815 https://review.openstack.org/350176 | 12:07 |
openstack | bug 1566815 in Manila "share manager fails to remove access rules on replicated share" [Critical,In progress] https://launchpad.net/bugs/1566815 - Assigned to Rodrigo Barbieri (rodrigo-barbieri2010) | 12:07 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG2] bug 1566815 https://review.openstack.org/350175 | 12:07 |
*** jcsp has quit IRC | 12:07 | |
*** jcsp has joined #openstack-manila | 12:08 | |
*** timcl has joined #openstack-manila | 12:08 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: [DEBUG] bug 1566815 https://review.openstack.org/350174 | 12:08 |
openstack | bug 1566815 in Manila "share manager fails to remove access rules on replicated share" [Critical,In progress] https://launchpad.net/bugs/1566815 - Assigned to Rodrigo Barbieri (rodrigo-barbieri2010) | 12:08 |
*** nidhimittalhada has joined #openstack-manila | 12:10 | |
ganso | vponomaryov: also, I did not fully understand your changes on access.py, why must the lock be in the update_access method instead of only in the remove_access_rules as in my patch? | 12:10 |
vponomaryov | ganso: "update access" is the only interface per share instance, it consists of several logical parts | 12:12 |
vponomaryov | ganso: so, the most uber fix would be to make lock on that single interface and reload rules inside of it | 12:12 |
vponomaryov | reload means reread from db | 12:12 |
ganso | vponomaryov: it will have a bigger impact on performance | 12:14 |
vponomaryov | ganso: lock is made on per-instance basis | 12:14 |
vponomaryov | ganso: what performance are you talking about? | 12:14 |
ganso | vponomaryov: my lock is per share_id | 12:15 |
ganso | vponomaryov: because rules are read from share_access_mapping | 12:15 |
ganso | vponomaryov: it acquires what rules each instance should have based on what is registered in the share_access_mapping | 12:15 |
vponomaryov | ganso: "update_access" handles one specific share instance | 12:16 |
ganso | vponomaryov: so the race is when one thread is removing the last access rule from one instance | 12:16 |
ganso | vponomaryov: and then the other instance thinks it should apply that access rule, while in fact it should not, because it is being removed | 12:16 |
*** xyang1 has joined #openstack-manila | 12:16 | |
vponomaryov | I don't mind, let it be share_id | 12:17 |
ganso | vponomaryov: it does, but refresh procedure at the end messes it up | 12:17 |
vponomaryov | ganso: just propose your change | 12:17 |
ganso | vponomaryov: in fact, the lock should include remove and refresh refresh as well | 12:17 |
vponomaryov | ganso: gates are broken because of it | 12:18 |
ganso | vponomaryov: are they broken? wasn't this bug random? | 12:18 |
vponomaryov | ganso: just look here - https://review.openstack.org/#/c/338371/ | 12:18 |
vponomaryov | ganso: migration tests in tempest fail too | 12:18 |
vponomaryov | ganso: for generic and lvm drivers | 12:19 |
vponomaryov | ganso: which are voting | 12:19 |
ganso | vponomaryov: I am confused, what change broke it? it was voting +1 yesterday | 12:19 |
vponomaryov | ganso: this bug has started to appear very often | 12:20 |
*** narayrak has joined #openstack-manila | 12:21 | |
*** gouthamr has joined #openstack-manila | 12:30 | |
*** rraja has joined #openstack-manila | 12:35 | |
*** dustins has joined #openstack-manila | 12:36 | |
*** narayrak has quit IRC | 12:41 | |
*** akerr has joined #openstack-manila | 12:41 | |
*** liyifeng has quit IRC | 12:44 | |
*** narayrak has joined #openstack-manila | 12:52 | |
*** narayrak has quit IRC | 12:54 | |
*** narayrak has joined #openstack-manila | 12:54 | |
openstackgerrit | Alexey Ovchinnikov proposed openstack/manila: [DNM] Tests for scenario tests https://review.openstack.org/340236 | 13:01 |
*** nidhimittalhada has quit IRC | 13:14 | |
*** porrua has joined #openstack-manila | 13:17 | |
*** zengyingzhe_ has joined #openstack-manila | 13:22 | |
*** rraja has quit IRC | 13:24 | |
*** zengyingzhe has quit IRC | 13:25 | |
*** nkrinner is now known as nkrinner_afk | 13:29 | |
*** faiz89 has joined #openstack-manila | 13:33 | |
openstackgerrit | Rodrigo Barbieri proposed openstack/manila: Fix update_access race condition and stale data https://review.openstack.org/350588 | 13:38 |
ganso | vponomaryov: ^ | 13:39 |
tpsilva | ganso, vponomaryov: we just stumbled on an issue testing one of our drivers, while adding several rules, some of them are just skipped and never come to the driver | 13:39 |
tpsilva | ganso, vponomaryov: we'll cherry pick this and test again as soon as jenkins approves :) | 13:40 |
ganso | tpsilva: I remember you attempted to fix this in your latest update_access patch back in Mitaka, using the generic driver... our test case was that missing rules when adding several rules at once | 13:41 |
ganso | tpsilva: this is possibly a related issue | 13:41 |
tpsilva | ganso: yep, I fixed it with the refresh if I recall correctly, but no locks were used | 13:42 |
*** vbellur has quit IRC | 13:42 | |
ganso | tpsilva: yep, now we cannot avoid locks anymore :( | 13:42 |
tpsilva | ganso: why don't you lock the entire update_access though? | 13:43 |
ganso | tpsilva: that's one alternative. My first attempt was https://review.openstack.org/#/c/332267/37/manila/share/access.py | 13:44 |
ganso | tpsilva: ^ I did not see the error anymore after that | 13:44 |
*** eharney has joined #openstack-manila | 13:54 | |
*** akshai has joined #openstack-manila | 13:54 | |
*** faiz89 has quit IRC | 13:58 | |
bswartz | vponomaryov: https://review.openstack.org/350597 | 13:58 |
*** merooney has joined #openstack-manila | 13:58 | |
*** timcl1 has joined #openstack-manila | 13:59 | |
*** faiz89 has joined #openstack-manila | 14:00 | |
*** krot_sickleave is now known as krotscheck | 14:00 | |
*** akshai has quit IRC | 14:00 | |
*** esker has joined #openstack-manila | 14:01 | |
*** akshai has joined #openstack-manila | 14:02 | |
*** timcl has quit IRC | 14:02 | |
*** eharney_ has joined #openstack-manila | 14:04 | |
*** vbellur has joined #openstack-manila | 14:04 | |
*** eharney has quit IRC | 14:04 | |
vponomaryov | tpsilva: I proposed Rodrigo to lock whole module, you can read above in history | 14:09 |
vponomaryov | tpsilva: so, you are not alone | 14:10 |
openstackgerrit | zhongjun proposed openstack/python-manilaclient: Add snapshot instances admin CLIs https://review.openstack.org/304449 | 14:13 |
tpsilva | vponomaryov: locking the whole module would also remove that weird method | 14:17 |
vponomaryov | tpsilva: yes | 14:18 |
vponomaryov | tpsilva: were debugging it with following: https://review.openstack.org/#/c/350174/3/manila/share/access.py | 14:19 |
ganso | vponomaryov, tpsilva : Just testing, I will submit another patch anyway to address the share_instance model comment. If it passes the first time I will submit several debug patches, one of them with the whole method locked | 14:23 |
vponomaryov | ganso: submit in parallel | 14:24 |
vponomaryov | ganso: it is gates stability related | 14:24 |
ganso | vponomaryov: I'm waiting because I have not tested the one I submitted at all | 14:25 |
vponomaryov | ganso; got working debug - updated commit message and that's it | 14:25 |
vponomaryov | ganso: nice | 14:25 |
tpsilva | vponomaryov: I did not understand that last part on your patch | 14:25 |
vponomaryov | tpsilva: update of "remove_rules"? | 14:25 |
tpsilva | vponomaryov: you're ensuring that remove_rules contains only rules that exist on DB? | 14:25 |
tpsilva | yes | 14:25 |
vponomaryov | tpsilva: exactly | 14:25 |
tpsilva | vponomaryov: looks nice | 14:26 |
vponomaryov | tpsilva: because previous locked operation could update it | 14:26 |
tpsilva | vponomaryov: makes sense | 14:27 |
tpsilva | I'll keep an eye on that | 14:27 |
vponomaryov | tpsilva: it already passed tempest jobs | 14:27 |
vponomaryov | tpsilva: just not updated unit tests failed | 14:28 |
*** mtanino has joined #openstack-manila | 14:40 | |
*** alkhodos has joined #openstack-manila | 14:42 | |
*** toabctl has quit IRC | 14:46 | |
*** toabctl has joined #openstack-manila | 14:47 | |
*** akapil has joined #openstack-manila | 14:49 | |
*** eharney_ is now known as eharney | 14:51 | |
*** esker has quit IRC | 15:00 | |
*** pgbridge has joined #openstack-manila | 15:02 | |
*** nkrinner_afk has quit IRC | 15:05 | |
*** akapil has quit IRC | 15:07 | |
*** akshai has quit IRC | 15:09 | |
*** vbellur has quit IRC | 15:10 | |
tpsilva | vponomaryov: I'm fine with that implementation | 15:16 |
vponomaryov | tpsilva: it requires update, there is dead lock | 15:16 |
vponomaryov | tpsilva: on recursive call part | 15:16 |
tpsilva | vponomaryov: I see that some tests failed, but I thought it was not related to that change | 15:16 |
*** nkrinner_afk has joined #openstack-manila | 15:16 | |
vponomaryov | tpsilva: unit test timeout talks about it | 15:17 |
*** akshai has joined #openstack-manila | 15:18 | |
*** akapil has joined #openstack-manila | 15:19 | |
tpsilva | vponomaryov: changing to share_id fixes it? | 15:20 |
vponomaryov | tpsilva: it is not related to name of lock | 15:21 |
vponomaryov | tpsilva: there is recursive call of locked func | 15:21 |
tpsilva | vponomaryov: ooooh right | 15:21 |
vponomaryov | tpsilva: with such fix | 15:21 |
tpsilva | vponomaryov: makes sens | 15:21 |
*** akapil has quit IRC | 15:22 | |
*** akapil has joined #openstack-manila | 15:22 | |
tpsilva | vponomaryov: quick fix is make update_access private and create a public method just for the initial call and put the lock there | 15:24 |
tpsilva | vponomaryov: but it's ugly | 15:24 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: Fix concurrency issue updating access having 2+ share instances https://review.openstack.org/350647 | 15:25 |
vponomaryov | tpsilva ^ , ganso: you and goutham set as co-authors | 15:25 |
vponomaryov | ganso: we really should fix it asap | 15:25 |
tpsilva | oh, you're locking only the recursive call | 15:26 |
tpsilva | nice | 15:26 |
tpsilva | no you're not, nvm hehe | 15:27 |
*** vbellur has joined #openstack-manila | 15:27 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: Fix concurrency issue updating access having 2+ share instances https://review.openstack.org/350647 | 15:30 |
*** timcl1 has quit IRC | 15:32 | |
*** timcl has joined #openstack-manila | 15:37 | |
tpsilva | vponomaryov: https://review.openstack.org/#/c/350647/2/manila/tests/share/test_access.py | 15:39 |
tpsilva | vponomaryov: this is weird | 15:39 |
vponomaryov | tpsilva: it is not | 15:39 |
tpsilva | vponomaryov: doesn't your patch break fallback mode? | 15:39 |
vponomaryov | tpsilva: it is related to that filtering of updated DB | 15:40 |
vponomaryov | tpsilva: look at test | 15:40 |
vponomaryov | tpsilva: its delete_rules do not cross oribinal rules | 15:40 |
vponomaryov | s/oribinal/original/ | 15:42 |
tpsilva | vponomaryov: yes, looks like the test is not testing properly, but I'm ok with that... fallback mode should be removed soon | 15:43 |
tpsilva | vponomaryov: original_rules should have delete_rules, since they are deleted after update_access is executed | 15:44 |
tpsilva | vponomaryov: but I'm ok with that, not the scope of your commit | 15:44 |
*** akapil has quit IRC | 15:49 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: Fix concurrency issue updating access having 2+ share instances https://review.openstack.org/350647 | 15:51 |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: Fix concurrent usage of update_access method for share intances https://review.openstack.org/350647 | 15:56 |
*** JoseMello has joined #openstack-manila | 15:57 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/manila: Fix concurrent usage of update_access method for share intances https://review.openstack.org/350647 | 15:57 |
vponomaryov | ganso: 5th patch should be final -> https://review.openstack.org/#/c/350647/5 | 15:58 |
*** akshai has quit IRC | 15:59 | |
vponomaryov | ganso: your commit message + lock by share_id as you wanted | 15:59 |
*** tovchinnikova has quit IRC | 16:26 | |
*** faiz89 has quit IRC | 16:27 | |
*** vbellur has quit IRC | 16:27 | |
*** akshai has joined #openstack-manila | 16:30 | |
*** akshai_ has joined #openstack-manila | 16:32 | |
*** narayrak has quit IRC | 16:33 | |
*** akshai has quit IRC | 16:35 | |
*** dsariel has quit IRC | 16:39 | |
*** vbellur has joined #openstack-manila | 16:43 | |
*** Suyi_ has joined #openstack-manila | 16:45 | |
tpsilva | vponomaryov: new implementation is cleaner | 16:46 |
tpsilva | vponomaryov: but apparently it's still breaking on our driver | 16:46 |
tpsilva | :( | 16:46 |
tpsilva | vponomaryov: probably different bug | 16:46 |
*** furlongm_ has quit IRC | 16:49 | |
*** furlongm_ has joined #openstack-manila | 16:50 | |
*** vbellur has quit IRC | 16:51 | |
*** vbellur has joined #openstack-manila | 16:51 | |
*** krotscheck is now known as kro_focused | 16:53 | |
*** zhongjun_ has quit IRC | 16:53 | |
*** zhongjun_ has joined #openstack-manila | 16:55 | |
tpsilva | vponomaryov: yep, definitely still broken here | 16:55 |
tpsilva | vponomaryov: it stops adding the rules | 16:55 |
*** zengyingzhe_ has quit IRC | 16:58 | |
*** pcaruana has quit IRC | 17:00 | |
*** zengyingzhe has joined #openstack-manila | 17:00 | |
*** openstackgerrit_ has joined #openstack-manila | 17:06 | |
*** openstackgerrit_ has quit IRC | 17:08 | |
tpsilva | vponomaryov: still there? | 17:10 |
tpsilva | vponomaryov: problem is in refresh | 17:10 |
*** akshai_ has quit IRC | 17:11 | |
*** lpetrut has quit IRC | 17:11 | |
*** eharney has quit IRC | 17:20 | |
*** lpetrut has joined #openstack-manila | 17:37 | |
tpsilva | gouthamr: ping | 17:39 |
gouthamr | tpsilva: Hey Tiago | 17:39 |
tpsilva | gouthamr: hey Goutham | 17:39 |
tpsilva | gouthamr: you're checking the update access bug as well, right? | 17:40 |
tpsilva | gouthamr: I found another issue, not sure if it's the same bug or other | 17:40 |
tpsilva | gouthamr: probably other one, mine is API related | 17:40 |
gouthamr | tpsilva: oh.. | 17:40 |
tpsilva | gouthamr: but we may need to use the same lock proposed by vponomaryov on API | 17:41 |
tpsilva | gouthamr: let me explain | 17:41 |
gouthamr | tpsilva: sure.. | 17:41 |
tpsilva | gouthamr: it's pretty simple and I guess someone already raised that possible issue when the refresh was proposed... API is changing status to updating multiple right after update_access checks it for refresh | 17:43 |
tpsilva | gouthamr: after that the whole access allow/deny are stalled | 17:43 |
gouthamr | tpsilva: when the driver is applying rules? | 17:44 |
tpsilva | gouthamr: yes | 17:44 |
tpsilva | gouthamr: we may need to add that same lock on allow_access_to_instance | 17:44 |
tpsilva | gouthamr: on API | 17:44 |
tpsilva | gouthamr: not sure about the performance impact though | 17:45 |
gouthamr | tpsilva: so, two separate API requests might try to set stuff to 'status_updating_multiple' | 17:45 |
tpsilva | gouthamr: no no | 17:45 |
tpsilva | gouthamr: only one request | 17:45 |
tpsilva | gouthamr: I made a simple for, adding 100 rules to a share | 17:45 |
tpsilva | gouthamr: while manager is still processing a rule, another one comes to the API | 17:46 |
tpsilva | gouthamr: manager has just checked if it needs refresh, status says no | 17:46 |
tpsilva | gouthamr: then API changed status to "updating multiple" | 17:46 |
tpsilva | gouthamr: but manager has already checked, so it does not refresh | 17:46 |
*** faiz89 has joined #openstack-manila | 17:46 | |
tpsilva | gouthamr: not sure if this would happen on any driver but it certainly happens on HNAS driver | 17:47 |
tpsilva | gouthamr: a simple for can reproduce this issue | 17:47 |
gouthamr | tpsilva: ah.. i see the problem now. | 17:50 |
tpsilva | gouthamr: adding the same lock on allow_access_to_instance would affect the performance? | 17:50 |
gouthamr | tpsilva: we can't certainly apply the same lock on the API method.. | 17:50 |
tpsilva | gouthamr: why? these locks don't work across two processes? | 17:51 |
gouthamr | tpsilva: that lock will make the API wait. | 17:52 |
*** lpetrut has quit IRC | 17:52 | |
tpsilva | gouthamr: right, API must be available | 17:53 |
tpsilva | gouthamr: I tried updating the share_instance model, but that's not enough | 17:53 |
tpsilva | gouthamr: and that's not the correct way of solving concurrency issues :) | 17:53 |
gouthamr | tpsilva: yes.. bswartz and I were discussing the locks we currently have.. we have gigantic critical sections when driver methods are being invoked.. | 17:54 |
gouthamr | tpsilva: don't we check if the access rules changed when the manager returns and checks for refresh? | 17:56 |
tpsilva | gouthamr: the refresh method does that | 17:56 |
*** harlowja has quit IRC | 17:58 | |
*** harlowja has joined #openstack-manila | 18:00 | |
tpsilva | gouthamr: lock on API solved it, but there's the availability issue now | 18:01 |
tpsilva | gouthamr: we may need to check other ways of solving that | 18:01 |
gouthamr | tpsilva: what's the use of _check_needs_refresh? | 18:03 |
tpsilva | gouthamr: it solves and older bug | 18:03 |
tpsilva | gouthamr: let me find it | 18:03 |
*** lpetrut has joined #openstack-manila | 18:03 | |
tpsilva | gouthamr: point is, it solved one bug, but added this concurrency one | 18:03 |
tpsilva | https://review.openstack.org/#/c/287758/ | 18:04 |
gouthamr | tpsilva: IIRC, we had an API regression with the first update_access patch.. we weren't allowing more than one non-active rule at a time | 18:04 |
tpsilva | check cknight comments on PS 15 | 18:04 |
tpsilva | gouthamr: ^ | 18:06 |
gouthamr | tpsilva: not to restart that discussion all over again, but we were probably okay with per access rule status so that we could correctly reflect what was applied, and what was not.. and while all of them were not applied correctly, the share instance's access_rule_status could be something other than 'available' | 18:06 |
tpsilva | gouthamr: yes | 18:06 |
tpsilva | gouthamr: one of the issued that we would solve on Newton was to add back the per access rule status | 18:07 |
tpsilva | gouthamr: it was discussed on summit IIRC, but no one voluteered | 18:07 |
gouthamr | tpsilva: we never got to that :( | 18:07 |
tpsilva | gouthamr: I want to fix that, but cannot do that on Newton, just ocata :( | 18:07 |
gouthamr | tpsilva: i think if we had that, we would drip the check in the API, allow access rule changes to come in, perform the necessary reconciliation in the manager. | 18:08 |
gouthamr | s/drip/drop | 18:08 |
* tpsilva gouthamr: yes... | 18:08 | |
tpsilva | gouthamr: don't know why it send it that way... weird irc client lol | 18:09 |
tpsilva | gouthamr: we're trying to fix this update_access since late mitaka and we never got it working | 18:10 |
ganso | gouthamr, tpsilva: I am following the discussion here | 18:10 |
ganso | it seems the problem is mostly related to states | 18:10 |
ganso | since we have a patch that adds a huge lock to update_access | 18:10 |
tpsilva | gouthamr: we need to add back per rule status, but I'm afraid it's kinda late to get this done in Newton | 18:10 |
ganso | I believe we could use that to set the state, decide whether the rule is present in DB or not, and always query the correct state | 18:11 |
gouthamr | bswartz: ? | 18:11 |
tpsilva | gouthamr: I think I can do that, but that would mean no mountable snapshots | 18:11 |
tpsilva | since it classifies as a bugfix, its deadline is past FF | 18:12 |
ganso | tpsilva: yes | 18:12 |
bswartz | gouthamr: ? | 18:13 |
*** MikeG451 has quit IRC | 18:13 | |
ganso | bswartz, gouthamr, tpsilva: ? | 18:13 |
gouthamr | bswartz: per access rule status not being there is issues in the API.. we've lived with it through Newton; we knew we had to fix it... | 18:14 |
tpsilva | if ganso continues what I started on mountable snapshots, I can start adding per rule status tomorrow :P | 18:15 |
gouthamr | ganso: tpsilva: in a meeting, sorry for the lag (bswartz is here as well) | 18:15 |
tpsilva | gouthamr: no problem | 18:15 |
* gouthamr looks at all the patches ganso has and tells tpsilva, 'fat chance' | 18:15 | |
tpsilva | gouthamr: is he following the issue? | 18:15 |
ganso | tpsilva: definitely no way | 18:15 |
tpsilva | help me out here guys :P | 18:16 |
gouthamr | tpsilva: related question for you, i think mountable snapshots has come a long way, but do you think we can have first party driver support? | 18:17 |
tpsilva | gouthamr: I was discussing that with ganso these days | 18:17 |
tpsilva | gouthamr: it's probably really quick to implement on LVM driver | 18:18 |
gouthamr | tpsilva: nice, thanks.. | 18:20 |
*** timcl1 has joined #openstack-manila | 18:24 | |
tpsilva | well that's probably topic for the weekly meeting | 18:26 |
*** timcl has quit IRC | 18:27 | |
*** MikeG451 has joined #openstack-manila | 18:29 | |
*** vbellur has quit IRC | 18:30 | |
*** dustins has quit IRC | 18:41 | |
*** dustins has joined #openstack-manila | 18:45 | |
*** vbellur has joined #openstack-manila | 18:49 | |
*** lpetrut has quit IRC | 18:55 | |
*** lpetrut has joined #openstack-manila | 18:55 | |
*** eharney has joined #openstack-manila | 18:58 | |
*** porrua has quit IRC | 19:06 | |
*** timcl has joined #openstack-manila | 19:10 | |
bswartz | guys what's up? | 19:12 |
bswartz | I'm dividing my attention between 3 things so not reading all the chat here | 19:12 |
*** timcl1 has quit IRC | 19:13 | |
tpsilva | bswartz: caught a bug on update access, pretty easy to reproduce | 19:13 |
tpsilva | bswartz: just by adding several rules... eventually they stop being added because of our refresh mechanism | 19:14 |
bswartz | yes it's a race condition | 19:15 |
bswartz | if you add them too fast | 19:15 |
tpsilva | point is, it's not really that fast | 19:16 |
tpsilva | i'm just adding a bunch of rules using a shell script | 19:16 |
tpsilva | this is enought to get it to fail | 19:16 |
tpsilva | bswartz: ^ | 19:16 |
bswartz | yeah but scripts are fast | 19:19 |
bswartz | if you did it interactively it probably wouldn't trigger | 19:19 |
bswartz | if you see the bug even with long delays between adds then I'd be really worried | 19:20 |
bswartz | nevertheless it's a race we shodl fix | 19:20 |
ganso | bswartz: a script to add access rules is a valid use case for a user while there is no share groups feature | 19:20 |
bswartz | ganso: I'm not disagreeing with that | 19:20 |
ganso | bswartz: access groups | 19:20 |
tpsilva | bswartz: the script is a simple for | 19:20 |
tpsilva | for i in `seq 1 100`; do manila access-allow share ip 5.5.5.$i; done | 19:21 |
tpsilva | bswartz: and after the issue, the whole allow/deny api becomes stuck | 19:21 |
ganso | bswartz: ^ it becomes a blocking issue | 19:21 |
tpsilva | bswartz: that's what worries me the most | 19:21 |
bswartz | my only point was that the issue is known and shouldn't affect interactive users | 19:22 |
tpsilva | bswartz: access_rules_status are stuck in "updating_multiple" | 19:22 |
bswartz | obvious the fact that it affects automation scripts means we need to fix it anyways | 19:22 |
ganso | bswartz: before, we assumed that the next rule would fix the problem of a rule not added if refresh wouldn't be able to do so | 19:22 |
tpsilva | bswartz: and we cannot get it back to anything else without removing the share | 19:22 |
tpsilva | even restarting the services won't fix it | 19:23 |
bswartz | honestly I'd hoped that process on tooz locking would have gone faster | 19:23 |
bswartz | this is fixable with any kind of multiprocess lock though | 19:23 |
tpsilva | bswartz: yes, I fixed with extending the lock that vponomaryov proposed to the API | 19:23 |
tpsilva | bswartz: but the API is locked then | 19:24 |
tpsilva | the lock proposed here: https://review.openstack.org/#/c/350647/ | 19:24 |
bswartz | tpsilva: forgive me I don't see any locks | 19:32 |
bswartz | is that the right patch? | 19:32 |
bswartz | oh wait I found it | 19:32 |
* bswartz has poor vision | 19:32 | |
tpsilva | bswartz: https://review.openstack.org/#/c/350647/5/manila/share/access.py@49 | 19:32 |
tpsilva | bswartz: I extended it to this method | 19:33 |
tpsilva | bswartz: https://github.com/openstack/manila/blob/master/manila/share/api.py#L1225 | 19:33 |
tpsilva | bswartz: and refreshed the share_instance model inside the method | 19:33 |
tpsilva | bswartz: worked for me | 19:33 |
*** gouthamr has quit IRC | 19:35 | |
*** vbellur has quit IRC | 19:35 | |
bswartz | tpsilva: okay so does this fix coimpletely address the issue or are there still problems with it | 19:38 |
bswartz | it looks like it might be the correct fix but I don't have time to properly review it | 19:38 |
tpsilva | bswartz: it worked for me, I didn't have any issues with it | 19:39 |
tpsilva | bswartz: I will discuss that with vponomaryov tomorrow | 19:39 |
*** timcl has quit IRC | 19:39 | |
tpsilva | bswartz: since he is fixing the original bug | 19:39 |
bswartz | tpsilva: so then the only thing you're asking me is to review that patch, right? | 19:40 |
tpsilva | bswartz: not exactly... I'm afraid this would cover some other possible issue with that, also, that locks the API, which is not quite nice | 19:41 |
tpsilva | but I don't know what I'm asking either :P | 19:41 |
bswartz | okay | 19:41 |
bswartz | so my stance is that the api service absolutely needs to have locks | 19:41 |
bswartz | however we need to ensure nothing blocking happens while holding a lock | 19:41 |
bswartz | at least, no network blocking (DB calls and file I/O would be fine) | 19:42 |
bswartz | I'm not sure why we've avoided locks in the API service so far | 19:42 |
tpsilva | bswartz: okay... well since locks on API are OK, I'll discuss that solution with vponomaryov tomorrow | 19:43 |
tpsilva | hopefully we can get this fixed tomorrow | 19:43 |
bswartz | yes | 19:44 |
tpsilva | alright | 19:44 |
tpsilva | thanks Ben | 19:44 |
bswartz | btw if anyone take the viewpoint that locks in the API are not okay, then I'd *really* like to know what the recommended alternative for fixing these kinds of issues is | 19:44 |
tpsilva | bswartz: goutham is against locks on API :P | 19:44 |
bswartz | still I want to read the code and validate that the lock is only held for a minimum time | 19:45 |
tpsilva | think he already left though | 19:45 |
bswartz | I talked with gouthamr yesterday | 19:45 |
bswartz | hopefully cleared up any misunderstandings | 19:45 |
*** gouthamr has joined #openstack-manila | 19:45 | |
bswartz | I think he was basing his opinion on tradition | 19:45 |
ganso | bswartz: I have an idea but I am not sure if it fixes the problem. tpsilva already tested his idea and it works | 19:48 |
*** vbellur has joined #openstack-manila | 19:48 | |
bswartz | ganso: can you outline it? | 19:48 |
bswartz | I have to run in about 2 minutes | 19:48 |
ganso | bswartz: my idea is that now that we cannot avoid to use a huge lock on update_access, we would remove those additional statuses, and allow every RPC request to go to the manager | 19:48 |
ganso | bswartz: and then the whole manager code is inside the lock (which almost already is) | 19:49 |
bswartz | aha | 19:49 |
bswartz | that's exactly what we shouldn't do | 19:49 |
bswartz | because it creates deadlock situations | 19:49 |
ganso | bswartz: we are kinda doing that | 19:49 |
ganso | bswartz: with the proposed fix | 19:49 |
bswartz | okay that's why I want to look more closely at the proposed fix | 19:49 |
tpsilva | ganso: no we're not | 19:49 |
bswartz | the general goal should be: prevent conflicting access to resources using transitional states, and protect state changes with locks | 19:50 |
bswartz | that approach allows us to ensure locks aren't held for more than a few microseconds, and transional states should exist for maybe hundreds of milliseconds | 19:51 |
ganso | tpsilva: when I said "we are kinda doing that" I referred to the huge lock. I know we are not forwarding everything to the manager right now | 19:51 |
bswartz | okay I'm AFK for a bit | 19:52 |
ganso | bswartz: my suggestion would serialize everything | 19:52 |
ganso | bswartz: because it seems we cannot do it parallel without locks in the API | 19:52 |
bswartz | ganso, tpsilva: I added a topic for tomorrow: https://wiki.openstack.org/wiki/Manila/Meetings#Next_meeting | 19:54 |
tpsilva | bswartz: awesome, thanks | 19:54 |
openstackgerrit | Merged openstack/manila: Updated from global requirements https://review.openstack.org/348635 | 19:55 |
openstackgerrit | Merged openstack/manila: Check for usage of same Cephx ID as manila service https://review.openstack.org/349718 | 19:56 |
*** permalac has quit IRC | 20:02 | |
*** permalac has joined #openstack-manila | 20:03 | |
alkhodos | xyang1, vponomaryov, gouthamr, zhongjun_ : Hi, please take a look at PS 31 https://review.openstack.org/#/c/309286/31 | 20:08 |
*** porrua has joined #openstack-manila | 20:09 | |
*** lpetrut has quit IRC | 20:10 | |
gouthamr | alkhodos: responded | 20:14 |
*** merooney has quit IRC | 20:16 | |
gouthamr | alkhodos: so, you can't remove access rules? | 20:16 |
gouthamr | alkhodos: s/you/your driver | 20:17 |
*** akapil has joined #openstack-manila | 20:28 | |
*** dustins has quit IRC | 20:30 | |
*** porrua has quit IRC | 20:30 | |
alkhodos | gouthamr: I can | 20:33 |
alkhodos | gouthamr: forgot to remove that commented test... | 20:34 |
gouthamr | alkhodos: i see that you're not adding any logic to read deleted rules | 20:34 |
alkhodos | gouthamr: :param access_rules: All access rules for given share. This list is enough to update the access rules for given share. | 20:36 |
alkhodos | gouthamr: I just use this param for both add and delete | 20:36 |
*** akapil has quit IRC | 20:36 | |
alkhodos | gouthamr: basicly pass all access rules the share should have every time | 20:37 |
*** timcl has joined #openstack-manila | 20:40 | |
gouthamr | alkhodos: access_rules will not contain "deny" rules. | 20:40 |
openstackgerrit | Tiago Pasqualini da Silva proposed openstack/manila: Add Hitachi HSP driver https://review.openstack.org/329134 | 20:41 |
gouthamr | alkhodos: so, the thing is.. when a user uses ``manila access_allow``, update_access() is called on the driver with this access rule set in "add_rules" and in "access_rules" | 20:42 |
gouthamr | alkhodos: when a user uses ``manila access_deny`` , update_access() is called on the driver with this access rule set in "delete_rules" | 20:43 |
gouthamr | alkhodos: access_rules will only contain those rules that need to be on the share.. | 20:43 |
alkhodos | gouthamr: hmm, let me retest this | 20:43 |
*** akapil has joined #openstack-manila | 20:44 | |
*** dustins has joined #openstack-manila | 20:44 | |
gouthamr | alkhodos: the use of that is really in the 'recovery' case, when, for instance, you bounce the share manager service and the driver needs to update any un-updated rules, it will receive the ``update_access`` call on driver startup with all the rules that *need* to be on the share.. and in this case add_rules and delete_rules will be None | 20:45 |
gouthamr | ganso: ^ anything wrong with that interpretation? | 20:45 |
openstackgerrit | Tiago Pasqualini da Silva proposed openstack/manila: Add Hitachi HSP driver https://review.openstack.org/329134 | 20:46 |
ganso | gouthamr: reading | 20:46 |
ganso | gouthamr, alkhodos: if I am reading it right, his driver works like my driver, which has a list of rules that is basically a string. In order words, a huge list of rules that can be updated atomically. His driver does not need to care about adding or removing individual rules. "access_rules" list always has the list of rules the driver should have, so he | 20:50 |
ganso | just atomically applies the whole list to his backend. | 20:50 |
ganso | gouthamr: I don't see any problem wit hit | 20:51 |
gouthamr | ganso alkhodos: nice, that was my question in teh review.. if that's the case, it'd be nice to see this explanation in the docstring | 20:52 |
*** akerr has quit IRC | 20:52 | |
alkhodos | gouthamr: yes, that what i was trying top say) | 20:52 |
alkhodos | gouthamr: will update the docstring | 20:54 |
gouthamr | alkhodos: good stuff. thank you! | 20:54 |
*** vbellur has quit IRC | 20:56 | |
*** timcl has quit IRC | 20:57 | |
*** furlongm has joined #openstack-manila | 20:57 | |
*** furlongm_ has quit IRC | 20:57 | |
alkhodos | gouthamr: probably a silly question, but can't find a way to make it work: when I pass required=True to an opt with no default, how do I pass the unit tests? It throws an error on setUp | 21:05 |
alkhodos | gouthamr: here's the error http://paste.openstack.org/show/547855/ | 21:06 |
gouthamr | alkhodos: how are you setting and passing this config | 21:06 |
gouthamr | alkhodos: your driver's init method will expect it if required=True.. | 21:07 |
*** dustins has quit IRC | 21:07 | |
gouthamr | alkhodos: an example: https://github.com/openstack/manila/blob/master/manila/tests/share/drivers/zfsonlinux/test_driver.py#L29 | 21:07 |
gouthamr | alkhodos: and https://github.com/openstack/manila/blob/master/manila/tests/share/drivers/zfsonlinux/test_driver.py#L97 | 21:08 |
alkhodos | gouthamr: I wonder how it does not fail here https://github.com/openstack/manila/blob/master/manila/tests/share/drivers/zfsonlinux/test_driver.py#L94 ? | 21:09 |
alkhodos | gouthamr: I wonder how it does not fail here https://github.com/openstack/manila/blob/master/manila/tests/share/drivers/zfsonlinux/test_driver.py#L94 ? | 21:09 |
gouthamr | alkhodos: self.mock_object(zfs_driver.CONF, '_check_required_opts') :) | 21:09 |
alkhodos | gouthamr: ah nvm | 21:09 |
alkhodos | gouthamr: got it ) | 21:09 |
gouthamr | alkhodos: smart man vponomaryov | 21:09 |
alkhodos | gouthamr: ye, I was trying to figure that out, but failed lol | 21:10 |
*** JoseMello has quit IRC | 21:10 | |
openstackgerrit | Rodrigo Barbieri proposed openstack/manila: Fix Share Migration improper behavior for drivers https://review.openstack.org/332267 | 21:14 |
*** akapil has quit IRC | 21:14 | |
*** akapil has joined #openstack-manila | 21:15 | |
*** faiz89 has quit IRC | 21:23 | |
*** gouthamr has quit IRC | 21:24 | |
*** alkhodos has quit IRC | 21:25 | |
*** jcsp has quit IRC | 21:29 | |
*** akapil has quit IRC | 21:36 | |
openstackgerrit | Alexey Khodos proposed openstack/manila: Nexenta: adding share drivers for NexentaStor https://review.openstack.org/309286 | 21:40 |
*** jcsp has joined #openstack-manila | 21:41 | |
*** eharney has quit IRC | 22:02 | |
*** eharney has joined #openstack-manila | 22:03 | |
*** alyson_ has quit IRC | 22:03 | |
*** vbellur has joined #openstack-manila | 22:28 | |
*** jcsp has quit IRC | 22:41 | |
*** xyang1 has quit IRC | 22:56 | |
*** tpsilva has quit IRC | 23:08 | |
*** sjjfowler has quit IRC | 23:15 | |
*** adrianofr has quit IRC | 23:21 | |
*** hoonetorg has quit IRC | 23:39 | |
*** hoonetorg has joined #openstack-manila | 23:44 | |
*** zhongjun_ has quit IRC | 23:47 | |
*** zhongjun_ has joined #openstack-manila | 23:48 | |
*** liyifeng has joined #openstack-manila | 23:57 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!