Thursday, 2019-09-05

openstackgerritMerged openstack/cinder stable/pike: NetApp: Return all iSCSI targets-portals  https://review.opendev.org/66275700:01
openstackgerritMerged openstack/cinder master: Refactor use of encryption/image volume utils  https://review.opendev.org/66994500:01
openstackgerritMerged openstack/cinderlib master: Run functional tests with memory persistence  https://review.opendev.org/64821800:01
openstackgerritMerged openstack/cinder stable/stein: NetApp ONTAP: Fix JSON serialization error on EMS logs  https://review.opendev.org/67810500:01
openstackgerritMerged openstack/cinder stable/rocky: Fix ceph: only close rbd image after snapshot iteration is finished  https://review.opendev.org/67621100:01
openstackgerritMerged openstack/cinder stable/stein: Fix LVM IPv6 target portals  https://review.opendev.org/67860100:01
openstackgerritMerged openstack/cinder master: [api-ref]Fix values of service-status in list-hosts  https://review.opendev.org/67785900:01
openstackgerritMerged openstack/cinder stable/pike: Remove Sheepdog tests from zuul config  https://review.opendev.org/66933400:02
openstackgerritMerged openstack/cinder master: [api-ref]Fix response example file of update_type  https://review.opendev.org/67766400:02
*** trident has joined #openstack-cinder00:03
openstackgerritMerged openstack/cinder master: Add context to cloning snapshots in remotefs driver  https://review.opendev.org/57088500:08
openstackgerritAlan Bishop proposed openstack/cinder stable/pike: Fix NFS volume retype with migrate  https://review.opendev.org/68023700:10
openstackgerritMerged openstack/cinder master: Synology: Fix driver to be compatible with python3  https://review.opendev.org/67970500:14
openstackgerritMerged openstack/cinderlib master: Add pdf build support  https://review.opendev.org/67699700:14
*** markvoelker has joined #openstack-cinder01:00
*** markvoelker has quit IRC01:05
*** senrique_ has quit IRC01:14
openstackgerritzhufl proposed openstack/cinder master: Fix potential NameError of rc_id  https://review.opendev.org/67994401:43
*** markvoelker has joined #openstack-cinder02:01
*** markvoelker has quit IRC02:05
*** markvoelker has joined #openstack-cinder03:31
*** markvoelker has quit IRC03:41
*** mvkr has joined #openstack-cinder03:51
*** gkadam has joined #openstack-cinder03:54
*** gkadam has quit IRC03:54
*** whfnst has joined #openstack-cinder04:07
*** markvoelker has joined #openstack-cinder04:30
*** markvoelker has quit IRC04:35
*** Luzi has joined #openstack-cinder04:55
*** hoonetorg has quit IRC04:56
*** udesale has joined #openstack-cinder05:02
*** hoonetorg has joined #openstack-cinder05:13
*** markvoelker has joined #openstack-cinder05:30
*** markvoelker has quit IRC05:35
*** psachin has joined #openstack-cinder05:36
*** snecker has joined #openstack-cinder06:04
*** snecker has quit IRC06:29
*** psachin has quit IRC06:31
openstackgerritBhaa Shakur proposed openstack/cinder master: Zadara VPSA: Move to API access key authentication  https://review.opendev.org/67071506:36
openstackgerritBhaa Shakur proposed openstack/cinder master: Zadara VPSA: Add Multi-Attach to driver capabilities.  https://review.opendev.org/67956506:36
*** mmethot_ has joined #openstack-cinder06:43
*** mmethot has quit IRC06:45
openstackgerritAbhishek Kekane proposed openstack/cinder master: Support multiple stores of Glance  https://review.opendev.org/66167606:46
*** openstackgerrit has quit IRC06:51
*** udesale has quit IRC06:56
*** raghavendrat has joined #openstack-cinder07:01
*** snecker has joined #openstack-cinder07:03
*** markvoelker has joined #openstack-cinder07:06
*** markvoelker has quit IRC07:10
*** sapd1_x has quit IRC07:10
*** tesseract has joined #openstack-cinder07:15
*** sahid has joined #openstack-cinder07:18
*** trident has quit IRC07:21
*** tosky has joined #openstack-cinder07:23
*** trident has joined #openstack-cinder07:29
*** rcernin has quit IRC07:33
*** trident has quit IRC07:34
*** trident has joined #openstack-cinder07:43
*** openstackgerrit has joined #openstack-cinder07:43
openstackgerritSimon O'Donovan proposed openstack/cinder master: PowerMax Driver - Revert to Snapshot Fix  https://review.opendev.org/67997007:43
*** snecker has quit IRC07:46
*** tkajinam has quit IRC08:05
openstackgerritpengyuesheng proposed openstack/cinder master: Blacklist eventlet 0.25.0  https://review.opendev.org/68031808:18
*** davidsha has joined #openstack-cinder08:26
openstackgerritOpenStack Release Bot proposed openstack/os-brick stable/train: Update .gitreview for stable/train  https://review.opendev.org/68032508:32
openstackgerritOpenStack Release Bot proposed openstack/os-brick stable/train: Update TOX/UPPER_CONSTRAINTS_FILE for stable/train  https://review.opendev.org/68032608:32
openstackgerritOpenStack Release Bot proposed openstack/os-brick master: Update master for stable/train  https://review.opendev.org/68032708:32
*** whfnst has quit IRC08:41
*** e0ne has joined #openstack-cinder08:41
openstackgerritpengyuesheng proposed openstack/os-brick master: Blacklist eventlet 0.25.0  https://review.opendev.org/68033608:42
*** markvoelker has joined #openstack-cinder08:45
*** whfnst has joined #openstack-cinder08:46
*** markvoelker has quit IRC08:50
*** trident has quit IRC09:01
*** trident has joined #openstack-cinder09:09
*** ociuhandu has joined #openstack-cinder09:23
openstackgerritPawel Kaminski proposed openstack/cinder master: target/spdknvmf: Add max_queue_depth configuration parameter  https://review.opendev.org/67206409:35
*** udesale has joined #openstack-cinder10:20
*** udesale has quit IRC10:28
*** udesale has joined #openstack-cinder10:29
openstackgerritNaoki Saito proposed openstack/cinder master: NEC Driver: Support multi-attach  https://review.opendev.org/67527910:41
*** sapd1_x has joined #openstack-cinder10:46
*** markvoelker has joined #openstack-cinder10:46
*** markvoelker has quit IRC10:52
*** carloss has joined #openstack-cinder11:02
*** sapd1_x has quit IRC11:07
*** ociuhandu has quit IRC11:23
hemnadoink11:26
*** lpetrut has joined #openstack-cinder11:27
openstackgerritRajat Dhasmana proposed openstack/cinder master: Untyped to Default Volume Type  https://review.opendev.org/63918011:30
*** rosmaita has left #openstack-cinder11:35
*** ociuhandu has joined #openstack-cinder11:39
*** ociuhandu has quit IRC11:40
*** ociuhandu has joined #openstack-cinder11:41
*** ociuhandu has quit IRC11:41
*** ociuhandu has joined #openstack-cinder11:42
*** ociuhandu has quit IRC11:43
*** ociuhandu has joined #openstack-cinder11:44
*** ociuhandu has quit IRC11:50
*** dr_gogeta86 has joined #openstack-cinder11:53
dr_gogeta86hi guys11:53
dr_gogeta86who can help me increase the timeout between volume allocation and the WWN scan on cinder-volume?11:53
dr_gogeta86cinder wants that particular volume11:54
dr_gogeta86but doesn't find ... and heat stack delete its11:54
dr_gogeta86*it11:54
hemna?11:57
*** markvoelker has joined #openstack-cinder12:02
dr_gogeta86got this latency problem12:03
dr_gogeta86I'm using hitachi cinder driver on os queens12:03
dr_gogeta86it creates volumes like a charm12:03
dr_gogeta86but when a boot from volume is needed12:03
dr_gogeta86there is a bit of delay to see the volume inside cinder-volume machine12:04
dr_gogeta86the copy fails12:04
dr_gogeta86and the volume is deleted due to the failure12:04
*** virendra-sharma has joined #openstack-cinder12:05
hemnahave to look at the logs12:05
hemnato see why it's failing12:05
*** dviroel has joined #openstack-cinder12:05
dr_gogeta86time to gather is12:07
dr_gogeta86something in particular to check?12:07
raghavendrathi hemna: I had a query regarding your review comment on https://review.opendev.org/#/c/677945/12:09
dr_gogeta86http://paste.openstack.org/show/771290/12:09
dr_gogeta86hemna, ^12:11
*** virendra-sharma has quit IRC12:13
*** ociuhandu has joined #openstack-cinder12:27
*** n-saito has quit IRC12:28
hemnaraghavendrat: sup12:28
raghavendratI had written _initialize_connection_common function because similar code was used for primary array & secondary array12:31
*** ociuhandu has quit IRC12:32
raghavendratIf i understand correctly, I have to remove this function & revert to original code flow. Also write a separate function _initialize_connection_replication for secondary array12:34
*** jmlowe has quit IRC12:36
* hemna looks again12:39
hemnaok I see that now12:41
*** eharney has joined #openstack-cinder12:41
hemnathe logging is still way too noisy for info12:41
hemnathat stuff should be debug12:41
raghavendrati will change logging to debug12:41
hemnathe cl name too12:42
raghavendratyes. i will change name12:42
raghavendratmy query was regarding _initialize_connection_common function.12:42
raghavendratthis function was written because similar code was used for primary array & secondary array12:43
raghavendratIf i understand correctly, I have to remove this function & revert to original code flow. Also write a separate function _initialize_connection_replication for secondary array12:43
raghavendratright ?12:43
hemnanah, it's fine as it is I suppose12:44
hemnait's just overly complicated12:44
hemnabut therein lies the problem with cinder supporting replication itself12:45
hemnaat this point, let's focus on the CI12:45
hemnaand get that running12:45
hemnabecause we can't even approve this patch without it.12:46
raghavendratok. my team-mates are working on CI front. will keep everyone posted12:47
raghavendratas regards my patch, I will do two things: [1] change log to debug [2] rename "cl" to "remote_client"12:47
raghavendratand submit another patchset12:47
hemnaok good12:48
raghavendratthanks12:49
openstackgerritRajat Dhasmana proposed openstack/python-cinderclient master: Optional filters parameters should be passed only once  https://review.opendev.org/67852312:57
*** jmlowe has joined #openstack-cinder12:59
dr_gogeta86hemna, did you see the logs?13:03
*** mriedem has joined #openstack-cinder13:07
*** pcaruana has quit IRC13:10
raghavendratI am leaving for the day.13:13
*** ociuhandu has joined #openstack-cinder13:18
openstackgerritSimon O'Donovan proposed openstack/cinder master: PowerMax Driver - Metro Volume Metadata change  https://review.opendev.org/68040613:19
smcginnisdr_gogeta86: That paste doesn't have a lot of details. But the error there is on running multipath. I guess make sure you have multipath installed and configured would be my first suggestion.13:21
dr_gogeta86smcginnis, of course it is13:21
dr_gogeta86but i got so much delay from unmask13:21
*** spatel has joined #openstack-cinder13:22
*** senrique_ has joined #openstack-cinder13:27
*** rosmaita has joined #openstack-cinder13:32
*** lseki has joined #openstack-cinder13:39
*** pcaruana has joined #openstack-cinder13:39
*** spatel has quit IRC13:41
*** Luzi has quit IRC13:46
*** senrique_ has quit IRC13:49
*** enriquetaso has joined #openstack-cinder13:57
openstackgerritEric Harney proposed openstack/cinder master: Rename volume/utils.py to volume/volume_utils.py  https://review.opendev.org/67749214:00
*** raghavendrat has quit IRC14:03
*** ociuhandu has quit IRC14:04
*** lpetrut has quit IRC14:05
*** senrique_ has joined #openstack-cinder14:12
*** pcaruana has quit IRC14:12
openstackgerritMerged openstack/python-cinderclient master: Add custom CA support for get_server_version  https://review.opendev.org/67589114:14
*** enriquetaso has quit IRC14:14
*** ociuhandu has joined #openstack-cinder14:20
*** senrique_ has quit IRC14:20
*** ociuhandu has quit IRC14:25
*** enriquetaso has joined #openstack-cinder14:27
*** enriquetaso has quit IRC14:28
*** enriquetaso has joined #openstack-cinder14:28
*** lpetrut has joined #openstack-cinder14:29
openstackgerritSimon O'Donovan proposed openstack/cinder master: PowerMax Driver - Metro Volume Metadata change  https://review.opendev.org/68040614:32
*** enriquetaso has quit IRC14:42
*** tpsilva has quit IRC14:43
*** udesale has quit IRC14:46
*** lpetrut has quit IRC14:47
*** udesale has joined #openstack-cinder14:47
hemnadr_gogeta86: have to find out why multipath -l call is failing14:57
hemnathat is quite unusual14:57
hemnamy guess is there's more to the log than that14:58
*** e0ne has quit IRC15:05
*** sfernand has joined #openstack-cinder15:05
openstackgerritBrian Rosmaita proposed openstack/cinderlib master: Don't run functional gates on doc-only changes  https://review.opendev.org/68007015:06
Roamer`could anyone do a quick review on two pretty much trivial StorPool driver changes? https://review.opendev.org/679785 (advertise thin provisioning and a couple of other capabilities) and https://review.opendev.org/679676 (mark it as supported again) - thanks in advance!15:14
*** tosky has quit IRC15:15
smcginnisRoamer`: I don't see CINDER_BRANCH being set in the local.conf. Where are you telling devstack to load the current patch to test?15:17
*** enriquetaso has joined #openstack-cinder15:26
dr_gogeta86hemna, my suspicion is15:27
dr_gogeta86the storage manager returns the WWNN15:27
dr_gogeta86but that lun is visible after many minutes15:27
*** ociuhandu has joined #openstack-cinder15:31
*** tpsilva has joined #openstack-cinder15:32
*** ociuhandu has quit IRC15:36
*** thgcorrea has joined #openstack-cinder15:37
hemnaif that volume doesn't show up for minutes on the host, then that seems like a problem between the host and the storage backend15:43
hemnaadding a minutes-long timeout looking for a device will cause cinder api calls and rabbitmq messages to bail15:44
hemnaso that's not going to help you15:44
hemnagotta debug why that volume isn't showing up quickly15:44
*** jmlowe has quit IRC15:47
hemnaso it doesn't look like we have API rate limiting?15:50
hemnahttps://bugs.launchpad.net/cinder/+bug/166263715:50
openstackLaunchpad bug 1662637 in Cinder "Rate limit settings not enforced" [Undecided,New]15:50
*** jmlowe has joined #openstack-cinder16:00
*** altlogbot_0 has quit IRC16:01
Roamer`smcginnis, you mean for the CI? um, is this not what zuul-merger does? from what I can see, when devstack starts, the Ansible jobs have already copied the Cinder repository to /opt/stack/cinder after merging the patch to be tested16:01
*** altlogbot_1 has joined #openstack-cinder16:02
Roamer`smcginnis, right now I have an SSH session to the worker node and cd /opt/stack/cinder and git log shows me a commit from zuul that says "Merge commit 'refs/changes/64/672064/6' of ssh://review.opendev.org:29418/openstack/cinder into HEAD"16:02
Roamer`and that's exactly what https://spfactory.storpool.com/zuul/t/local/status says is being tested right now16:02
*** irclogbot_2 has quit IRC16:02
*** irclogbot_3 has joined #openstack-cinder16:03
Roamer`I mean, fine, it's not called zuul-merger any more, it's part of what zuul-executor does :)16:05
*** irclogbot_3 has quit IRC16:07
*** irclogbot_3 has joined #openstack-cinder16:07
*** gnufied has joined #openstack-cinder16:20
*** e0ne has joined #openstack-cinder16:23
*** e0ne has quit IRC16:31
*** sapd1_x has joined #openstack-cinder16:34
*** zigo has quit IRC16:43
*** spsurya has quit IRC16:48
*** sapd1_x has quit IRC16:48
*** rosmaita has left #openstack-cinder17:01
*** ociuhandu has joined #openstack-cinder17:01
*** markvoelker has quit IRC17:04
*** tesseract has quit IRC17:06
*** zigo has joined #openstack-cinder17:09
openstackgerritEric Harney proposed openstack/cinder stable/stein: Don't allow retype to encrypted+multiattach type  https://review.opendev.org/68047317:09
openstackgerritEric Harney proposed openstack/cinder master: Continue renaming volume_utils (core)  https://review.opendev.org/68047417:10
openstackgerritEric Harney proposed openstack/cinder master: Continue renaming of volume_utils (drivers)  https://review.opendev.org/68047517:10
*** markvoelker has joined #openstack-cinder17:13
*** udesale has quit IRC17:18
*** ociuhandu_ has joined #openstack-cinder17:32
*** ociuhandu has quit IRC17:36
*** ociuhandu_ has quit IRC17:36
hemnaRoamer`: so did you get the storpool lib py3 compatible ?17:39
*** davidsha has quit IRC17:47
openstackgerritSofia Enriquez proposed openstack/cinder master: Allow removing NFS snapshots in error status  https://review.opendev.org/67913817:58
*** sfernand has quit IRC18:05
*** sahid has quit IRC18:05
*** thgcorrea has quit IRC18:07
smcginnisRoamer`: If it is being set, that is good. I'm not aware of how zuul-executor does that, as usually I've just used devstack and I thought it would just check out master if not told to do otherwise.18:22
smcginnisRoamer`: I didn't see a flag in local.conf allowing using unsupported drivers either, so that led me to believe it's not actually running the current code if the driver is marked unsupported. It should fail otherwise.18:23
smcginnisI wonder if devstack or service initialization logs the current commit anywhere.18:23
smcginnisAh, looks like it is applying the storpool patches separately.18:25
*** mvkr has quit IRC18:26
smcginnisBut this is a little concerning - https://spfactory.storpool.com/logs/15/670715/12/check/cinder-storpool-tempest/57b655b/job-output.txt.gz#_2019-09-05_16_47_30_96499418:26
smcginnisSo it's testing master with the outstanding storpool patches applied, regardless of the patch that actually is supposed to be tested?18:27
smcginnisYep, checking out master - https://spfactory.storpool.com/logs/15/670715/12/check/cinder-storpool-tempest/57b655b/job-output.txt.gz#_2019-09-05_16_57_08_74869918:27
*** mriedem has quit IRC18:31
*** mriedem has joined #openstack-cinder18:33
hemnasmcginnis: I saw the cinder.conf that he had that enabled unsupported driver18:37
smcginnisOh, you're right.18:38
hemnahttps://spfactory.storpool.com/logs/85/679785/2/check/cinder-storpool-tempest/105fbc7/controller/logs/etc/cinder/cinder_conf.txt.gz18:38
hemnasmcginnis: did you ever look into this anymore?  https://bugs.launchpad.net/cinder/+bug/166263718:39
openstackLaunchpad bug 1662637 in Cinder "Rate limit settings not enforced" [Undecided,New]18:39
smcginnishemna: No, I never got back to that. I think Michal's suggestion is probably the best.18:40
hemnanova still has their limiter code in place afaik18:40
openstackgerritSean McGinnis proposed openstack/cinder master: Perform CI test  https://review.opendev.org/52314318:40
hemnahttps://github.com/openstack/nova/blob/master/nova/api/openstack/compute/limits.py18:40
hemnaI'm trying to debug it a little bit to see if I can even get it loaded18:40
hemnanot sure the api-paste.ini settings are working18:41
smcginnisThat was part of the confusing bit. We have horrible terminology. That nova limiting I believe is related to quotas. The bug is for API rate limiting.18:41
hemnaok good, I'm not the only one completely confused by it18:41
smcginnisI think when I filed that, it was as I was trying to understand what all that was because it was very confusing at first glance.18:42
hemnaand we have zero documentation on it18:42
hemnaso what has caused me to look into this is an issue when issuing lots of API requests in a short period of time18:42
hemnac-vol crashes when you call lots of deletes18:43
hemnaand18:43
hemnathere are problems with calling create quickly too18:43
hemnaI have a bash script that creates 60 volumes, quickly.18:44
hemna(ceph backend)18:44
smcginnisWow, that's not good. Are we crashing in a specific place, or does it cause general instability?18:44
hemnathe scheduler shits out saying that it can't find any hosts18:44
hemnaas it thinks the ceph backend is full18:44
hemnabut, ceph sends a stats update and it's got lots of space18:44
smcginnisSo only with Ceph? Doesn't happen with LVM?18:45
hemnaI haven't tried with lvm thin yet18:45
hemnaI suppose I can try with lvm18:45
eharneyiirc that space accounting issue is not ceph-specific and has to do with how we keep track of usage in the scheduler18:45
smcginnisIt might be an interesting experiment to see if it's something specific to the Ceph driver or to the volume manager.18:45
hemnahave to create a large thin vg18:45
eharneybut it's been a little bit since i looked at it18:45
hemnaone thing I noticed is that during deletes18:45
hemnasomeone is calling get_volume_stats after every delete18:46
smcginniseharney: So maybe not handling concurrent operations very well?18:46
hemnaand that is what eventually pukes18:46
hemnabut on create, the scheduler relies on the period task to update stats18:46
smcginnisThat's weird, I would have thought up to date stats would be more important for create than delete.18:47
hemnait's pretty damn expensive to call get_volume_stats18:47
smcginnisUnless it's doing that to make sure you can create right away after deleting.18:47
smcginnisYeah.18:47
hemnadepending on the usage and what's been created, etc.18:47
hemnaall of this begs the question of rally jobs18:48
hemnaI guess we have 1 rally job?18:48
hemnacinder-rally-task18:48
hemnahttps://github.com/openstack/cinder/blob/master/rally-jobs/cinder.yaml#L5418:49
eharneysmcginnis: need to find my notes/bug on it18:49
hemnaheh yah, that's not really much of a test18:49
eharneyi think abishop ran into this happening on an LVM backend in some tripleo CI18:51
hemnaso in this case there are 2 separate problems18:51
hemna1 is the scheduler and stats during create18:51
hemnaand the other is lots of deletes w/ ceph18:51
hemnalet me setup lvm in my same vagrant vm and see what happens18:51
*** e0ne has joined #openstack-cinder18:54
abishopyeah, see my comment in the commit message to the patch I submitted for tripleo, https://review.opendev.org/67889418:54
smcginnisWe've had to raise that in devstack periodically too. https://review.opendev.org/#/c/533312/18:56
eharneyi think the trick to this is that consume_from_volume() in the scheduler subtracts away free space in real time upon creation, but nothing does that when you delete a volume18:57
hemnayah that looks familiar18:57
eharneyso you have to wait until get_volume_stats() happens again18:57
smcginnisSo maybe we just need an unconsume_from_volume() call in there.18:57
abishopeharney: precisely18:57
hemnaand that get_volume_stats() goes through rabbitmq a few times before it gets to the capacity filter18:57
hemnathe scheduler has the updated data, but the capacity filter doesn't yet.18:58
hemnano host found.18:58
abishopyup, that's what I observed18:58
smcginnishemna: Maybe a quick test to add back the space with putting a consume_from_volume(-negative_space) call in the delete path?18:58
hemnaso I'm doing lots of creates18:58
hemnathen after that's all done18:58
hemnalots of deletes18:58
hemnanot a mix18:58
hemnathe scheduler runs out of space, even though the backend doesn't18:59
abishopit's the mix that should reveal the issue18:59
hemnayou'll hit it with lots of creates quickly18:59
eharneysmcginnis: i'm not sure the delete path even goes through that same area of code -- i think that's the right concept but i'm not sure what the implementation looks like18:59
abishopsure, if you actually create enough to consume the space18:59
hemnathe scheduler will think it's full, but the backend doesn't18:59
abishopit's the rapid create/delete where space is not accumulated that I found to be the issue18:59
hemnaand that causes the create failures19:00
smcginniseharney: Oh, is that another one that we don't have go through the scheduler yet?19:00
eharneysmcginnis: i think so19:00
abishopscheduler deducts for the creates, but slow feedback to realize the deletes reclaimed the space19:00
smcginnisSo not so quick of a test.19:00
* hemna restacks19:00
hemnait's slow feedback even during the creates19:01
hemnaas the scheduler deducts the absolute, but then the backend eventually says...no I still have unused space19:01
hemnaso if you put a sleep in between create calls, they all work19:02
hemnaI had a pastebin of this last week19:02
smcginnisIf it's not going through the scheduler, then that update stats call after the delete is pretty useless too.19:04
smcginnisSomething has to let the scheduler know that things have changed.19:04
hemnaafter stack is back up I'll run a test and pastebin the logs19:04
smcginnisShort of forcing a stats update immediately before every create call.19:04
hemnathis makes me believe we need to throttle requests19:05
hemnawe aren't a web server serving up e-commerce pages19:05
eharneyso delete in the volume manager calls publish_service_capabilities() which gathers stats... which appears to send it to schedulers via update_service_capabilities, and even says so with a comment19:06
eharneydid we used to send an update for each delete from the volume manager from the scheduler and now that isn't going through for some reason?19:07
eharneybecause it sure reads like it's already trying to solve this problem19:07
*** e0ne has quit IRC19:07
hemnawell, during deletes it does19:07
hemnado 100 creates19:07
hemnait'll puke in there19:07
hemnaeven when the backend has the space for it19:08
hemnawell, ceph in this case19:08
eharneythat sounds like the other problem19:08
hemnaand I think the get_volume_stats is where we crash during the 100 parallel deletes19:08
*** trident has quit IRC19:09
eharneyhmm19:10
eharneywhat kind of error happens there?19:11
*** e0ne has joined #openstack-cinder19:12
hemnahttps://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L48319:13
hemnaI think that's what is nuking it19:13
hemnaif you have 1000 volumes in there and you try and delete 10019:13
*** e0ne has quit IRC19:14
hemnaget_volume_stats() will eventually end up with calls to the rbdproxy to fetch the size of all 1000 volumes individually19:14
hemnaon every delete call19:14
hemna100*19:14
hemna:(19:14
eharneyhmm19:15
hemnawe have an internal bug tracking this19:15
hemnaI wish they would just file this upstream19:15
hemnabut it has customer details in it :(19:15
*** e0ne has joined #openstack-cinder19:16
hemnamaybe put a lock around get_volume_stats() ?19:17
hemnathat would serialize it19:17
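
A minimal sketch of the locking idea suggested above, not Cinder's actual code. With 1000 volumes in the pool and 100 deletes, refreshing stats after every delete means on the order of 100 × 1000 = 100,000 per-volume size lookups. The sketch serializes the refresh behind a lock and, as an added assumption beyond what was proposed in the channel, reuses a recent result so that concurrent callers do not each repeat the expensive scan. `CoalescedStats`, `collect`, and `max_age` are hypothetical names.

```python
import threading
import time


class CoalescedStats:
    """Serialize an expensive stats refresh and reuse recent results.

    Hypothetical helper for illustration only; `collect` stands in for
    whatever expensive call gathers backend stats.
    """

    def __init__(self, collect, max_age=5.0, clock=time.monotonic):
        self._collect = collect      # expensive callable returning stats
        self._max_age = max_age      # seconds a cached result stays valid
        self._clock = clock          # injectable clock, eases testing
        self._lock = threading.Lock()
        self._cached = None
        self._stamp = None           # time of the last refresh, or None

    def get(self):
        # Only one thread at a time may refresh; the others wait on the
        # lock and then pick up the freshly cached value.
        with self._lock:
            now = self._clock()
            if self._stamp is None or now - self._stamp >= self._max_age:
                self._cached = self._collect()
                self._stamp = now
            return self._cached
```

The trade-off is staleness: callers may see stats up to `max_age` seconds old, which may or may not be acceptable for scheduling decisions.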
hemnaboatloads and boatloads of ImageNotFound errors in c-vol19:18
*** e0ne_ has joined #openstack-cinder19:18
hemnaas the volumes disappear between get_volume_stats calls19:18
eharneyhemna: which release is this?19:18
* hemna checks19:18
smcginnisWould still be good to file an upstream bug that just has the failure information without any customer details.19:19
eharneythis sounds like the old stats behavior before we rewrote it a while ago to be much faster19:19
hemnaPike19:19
smcginnisStill happens with master?19:20
*** trident has joined #openstack-cinder19:20
eharneyhemna: setting the rbd_exclusive_cinder_pool option may help a lot: https://review.opendev.org/#/c/607192/219:21
hemnawaiting for devstack to come back up to try it19:21
hemnayah we have done that, that was the main suggestion to the customer19:21
hemnanot sure if they updated to include that19:21
*** e0ne has quit IRC19:21
hemnaok looks like they did set that19:23
hemnaand it helped, but still getting tons of ImageNotFound exceptions19:23
hemnahttps://github.com/openstack/cinder/commit/5e4d7e5e986f7a7076632f1cef2c8195fdcc0824#diff-439c80f9f706c9c6c8fe90266cde5c4019:24
hemnahrmm19:24
hemnathat might help19:24
hemnathe ImageNotFound exceptions are coming from the parallel deletes eventually having a race19:28
hemnabetween the time it fetches the list of images19:29
hemnaand when it calls driver.rbd.Image() in the rbdvolumeproxy19:29
*** jmlowe has quit IRC19:32
hemnamaybe we can not log the ImageNotFound exception during _get_usage_info() time19:33
hemnasince that seems to ignore the ImageNotFound exception anyway19:33
hemnahttps://github.com/openstack/cinder/blob/stable/pike/cinder/volume/drivers/rbd.py#L14419:33
hemnawe log the exception there19:33
hemnabut in the case of _get_usage_info() we just keep going, and in this case we flood the logs with ImageNotFound exceptions19:34
eharneyif you're seeing that then they aren't using the exclusive pool option19:35
eharneybecause it skips that method19:35
hemnaoh I see that19:35
hemnaon line 48919:35
hemnahrmm19:35
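
The race described above (an image is listed, then deleted before its size is read) can be sketched as follows. This is an illustrative stand-in, not the RBD driver's code: `ImageNotFound` here is a local placeholder class, and `total_provisioned_bytes` and `get_size` are hypothetical names. The idea is to count vanished images quietly and report a single summary, rather than logging a traceback for each one.

```python
class ImageNotFound(Exception):
    """Local placeholder for a backend 'image not found' error."""


def total_provisioned_bytes(image_names, get_size):
    """Sum image sizes, tolerating images deleted after they were listed.

    Returns (total_bytes, skipped), where `skipped` counts the images
    that disappeared between listing and the size lookup.
    """
    total = 0
    skipped = 0
    for name in image_names:
        try:
            total += get_size(name)
        except ImageNotFound:
            # Expected during parallel deletes: the image was removed
            # after the listing was taken. Count it and move on.
            skipped += 1
    return total, skipped
```

A caller could emit one debug-level line with the `skipped` count per stats run instead of one exception log per missing image.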
*** rosmaita has joined #openstack-cinder19:42
*** e0ne_ has quit IRC19:49
*** jmlowe has joined #openstack-cinder19:50
*** e0ne has joined #openstack-cinder19:54
hemnadeletes aren't going through the scheduler19:57
*** enriquetaso has quit IRC19:58
*** e0ne_ has joined #openstack-cinder19:59
*** e0ne has quit IRC20:00
*** eharney has quit IRC20:01
hemnalooks like the create issue happens with LVM too20:03
hemnayup20:06
hemnaso the problem is related to the periodic task happening every 60 seconds during non delete calls20:07
hemnaif you have lots of create calls between update_volume_stats(), the scheduler can think you have no allocation space left on the backend20:07
hemnaand it will fail create requests20:08
hemnauntil the next get_volume_stats() happens20:08
hemnaso for scale testing, you can have lots of failures for creates, that really are false failures20:08
hemnamakes me wonder why we are doing get_volume_stats after every delete, but not after every create20:12
hemnaas well as periodic20:12
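
A toy model of the false "no host found" failures summarized above, under loud assumptions: a thin-provisioned backend whose real free space barely changes, and a scheduler that subtracts each requested size from its own view until the next stats report arrives. `consume_from_volume` is the name mentioned in the channel; `BackendView`, `update_from_stats`, `schedule_creates`, and `refresh_every` are hypothetical. The real capacity filter also weighs over-subscription and reserved space, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class BackendView:
    """Scheduler-side view of one backend (simplified, hypothetical)."""
    free_gb: float

    def consume_from_volume(self, size_gb):
        # Optimistic accounting: deduct the full requested size at once.
        self.free_gb -= size_gb

    def update_from_stats(self, reported_free_gb):
        # A stats report from the backend replaces the scheduler's view.
        self.free_gb = reported_free_gb


def schedule_creates(view, sizes_gb, real_free_gb, refresh_every=None):
    """Return how many create requests the scheduler would reject.

    `refresh_every=N` simulates a stats report after every N requests;
    None means no report arrives during the burst.
    """
    rejected = 0
    for i, size in enumerate(sizes_gb, start=1):
        if size > view.free_gb:
            rejected += 1            # scheduler believes backend is full
        else:
            view.consume_from_volume(size)
        if refresh_every and i % refresh_every == 0:
            view.update_from_stats(real_free_gb)
    return rejected


# Burst of 60 x 10 GB creates against a backend with 100 GB really free.
burst = [10] * 60
no_refresh = schedule_creates(BackendView(100), burst, real_free_gb=100)
per_create = schedule_creates(BackendView(100), burst, real_free_gb=100,
                              refresh_every=1)
print(no_refresh, per_create)
```

In this model the burst without any refresh rejects every request after the scheduler's view hits zero, while a refresh after each create rejects none, which matches the observation that a sleep between create calls makes them all succeed.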
*** whfnst has quit IRC20:15
*** e0ne_ has quit IRC20:19
Roamer`smcginnis, you are right that the devstack git_clone function is called with "master" as argument, but it does not really check out the master branch: since RECLONE is not set, it will not forcefully replace an already-checked-out repository20:28
Roamer`smcginnis, but I saw what you did with the "be evil" "break the CI" patchset, we'll see how it goes :)20:28
Roamer`hemna, yes, we fixed the storpool and storpool.spopenstack libraries on PyPI; the Cinder StorPool driver itself did not need any changes20:29
smcginnisRoamer`: OK, great. That sounds right then. Thanks for checking on it.20:29
*** lpetrut has joined #openstack-cinder20:34
*** lpetrut has quit IRC20:38
*** ociuhandu has joined #openstack-cinder20:39
*** ociuhandu has quit IRC20:45
*** ociuhandu has joined #openstack-cinder20:46
hemnaRoamer`: you might want to look at adding the storpool lib to setup.cfg https://github.com/openstack/cinder/blob/master/setup.cfg#L10020:48
hemnaRoamer`: https://github.com/openstack/cinder/commit/4c9ae85ac8e31394c73338920642ef1b4ffa1127#diff-380c6a8ebbbce17d55d50ef17d3cf90620:49
*** ociuhandu has quit IRC20:50
Roamer`hemna, ah, thanks, just found your commit that mentioned that storpool was skipped20:50
Roamer`hemna, I'll do that, thanks a lot!20:50
hemnaso I added publish_service_capabilities() at the bottom end of create_volume() for success20:58
hemnaand the create problem went away20:58
hemnaRoamer`: no problem.  Thanks for getting your CI back up and supporting the driver!20:58
*** markvoelker has quit IRC21:05
*** markvoelker has joined #openstack-cinder21:11
-openstackstatus- NOTICE: Gerrit is being restarted to pick up configuration changes. Should be quick. Sorry for the interruption.21:13
*** markvoelker has quit IRC21:15
Roamer`hemna, BTW is there a reason why purestorage is mentioned twice in setup.cfg's extras section - once for "pure" (where I believe it belongs) and once again for "all"?21:32
openstackgerritMerged openstack/cinder master: Google backup: correct string encoding between py 2 and 3  https://review.opendev.org/67640321:55
openstackgerritPeter Penchev proposed openstack/cinder master: StorPool: update the driver requirements.  https://review.opendev.org/68053022:11
*** carloss has quit IRC22:28
*** kaisers has quit IRC22:37
*** kaisers has joined #openstack-cinder22:38
*** mriedem has quit IRC22:50
*** dviroel has quit IRC23:02
*** tkajinam has joined #openstack-cinder23:02
*** rcernin has joined #openstack-cinder23:23
*** threestrands has joined #openstack-cinder23:30
*** n-saito has joined #openstack-cinder23:33
*** rcernin is now known as rcernin|brb23:52
