fungi | clarkb: the reason we turned off autoreload is that it basically dropped all pending tasks in the queue at reload | 00:00 |
fungi | not sure if latest gerrit still has that behavior, but it was resulting in lots of lost replication tasks and stale repo mirrors | 00:01 |
Clark[m] | I think I saw it say it does similar in the docs. That would explain it. I knew there was a good reason just didn't remember specifics | 00:08 |
fungi | well, at the time we were making much more frequent changes to the replication config. now we hardly change it at all so it might be okay? but ultimately there's still some risk | 02:13 |
*** ykarel_ is now known as ykarel | 03:57 | |
*** ysandeep|away is now known as ysandeep | 05:09 | |
*** bhagyashris|off is now known as bhagyashris | 05:34 | |
*** jpena|off is now known as jpena | 07:28 | |
newopenstack | Need to set up openstack with 6 servers and want to use MAAS and Juju | 07:46 |
newopenstack | please advise. | 07:46 |
newopenstack | and then want to grow the infrastructure to more compute nodes | 07:46 |
newopenstack | also want to use some storage center from dell | 07:46 |
newopenstack | please share some guidelines. | 07:47 |
newopenstack | anyone can help .. please | 07:47 |
*** ykarel__ is now known as ykarel | 08:03 | |
*** ykarel is now known as ykarel|lunch | 08:20 | |
*** ykarel|lunch is now known as ykarel | 09:27 | |
*** odyssey4me is now known as Guest65 | 10:07 | |
opendevreview | Michal Nasiadka proposed opendev/bindep master: Add Rocky Linux support https://review.opendev.org/c/opendev/bindep/+/809362 | 10:11 |
*** ysandeep is now known as ysandeep|brb | 10:52 | |
*** dviroel|out is now known as dviroel | 11:20 | |
*** jpena is now known as jpena|lunch | 11:21 | |
*** ysandeep|brb is now known as ysandeep | 11:52 | |
*** ykarel is now known as ykarel|afk | 11:54 | |
*** jpena|lunch is now known as jpena | 12:21 | |
fungi | newopenstack: sorry, this is the channel where we coordinate the services which make up the opendev collaboratory. you're probably looking for the #openstack channel or more likely the openstack-discuss@lists.openstack.org mailing list | 12:55 |
fungi | newopenstack: though since you mentioned maas and juju (software made by canonical, they're not really part of openstack) you might want to be looking closer at https://ubuntu.com/openstack | 12:56 |
fungi | hope that helps! | 12:56 |
opendevreview | Merged openstack/project-config master: Add openstack-loadbalancer charm and interfaces https://review.opendev.org/c/openstack/project-config/+/807838 | 13:08 |
*** ykarel|afk is now known as ykarel | 13:20 | |
*** slaweq__ is now known as slaweq | 13:23 | |
*** frenzy_friday is now known as anbanerj|ruck | 13:35 | |
*** odyssey4me is now known as Guest74 | 13:42 | |
*** ysandeep is now known as ysandeep|dinner | 14:26 | |
opendevreview | daniel.pawlik proposed opendev/puppet-log_processor master: Add capability with python3; add log request cert verify https://review.opendev.org/c/opendev/puppet-log_processor/+/809424 | 14:55 |
*** ykarel is now known as ykarel|away | 15:01 | |
*** marios is now known as marios|out | 15:33 | |
clarkb | We currently have no leaked replication tasks | 15:41 |
*** ysandeep|dinner is now known as ysandeep|out | 15:41 | |
clarkb | I've just confirmed the inmotion boots continue to fail. Will try and dig into that after some breakfast | 15:44 |
clarkb | I've got tails running against the three different servers' nova api error logs. If that doesn't record anything interesting in the next bit I'll dig in further. I expect this should give me a clue in the next few minutes though | 16:16 |
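For context, a minimal sketch of the log-watching step clarkb describes here; the hostnames and log paths are assumptions, since the real layout depends on how this deployment ships nova's logs.

```shell
# Hypothetical hosts and log paths -- adjust to the deployment's layout.
for host in api1 api2 api3; do
  ssh "$host" tail -F /var/log/nova/nova-api.log &
done
wait   # stream all three logs until interrupted
```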
clarkb | The api was very quiet. Looking at other things I find messages like "Instance f98ce366-90b1-43ba-8513-bf2ea559c931 has allocations against this compute host but is not found in the database." in the nova compute log | 16:28 |
clarkb | I suspect that may be the underlying cause? we're leaking instances that don't exist but count against quota? | 16:28 |
*** jpena is now known as jpena|off | 16:28 | |
clarkb | hrm no, quotas as reported by openstackclient look fine | 16:30 |
clarkb | "Allocations" seems to be what placement does | 16:31 |
fungi | might be a question for #openstack-nova | 16:32 |
clarkb | nova.exception_Remote.NoValidHost_Remote: No valid host was found. <- is what the conductor says | 16:32 |
clarkb | so ya I think what is happening is placement is unable to place, possibly because it has leaky allocations. | 16:33 |
clarkb | https://docs.openstack.org/nova/latest/admin/troubleshooting/orphaned-allocations.html is the indicated solution from the nova channel | 16:39 |
*** ysandeep|out is now known as ysandeep | 16:39 | |
clarkb | thank you melwitt! | 16:39 |
clarkb | I'll have to digest that and dig around and see if I can fix things. | 16:39 |
melwitt | clarkb: lmk if you run into any issues or have questions and I will help | 16:40 |
clarkb | will do | 16:40 |
fungi | yeah, this particular provider is unique in that they give us an automatically deployed turn-key/cookie-cutter openstack environment, but it's mostly us on the hook if it falls over | 16:42 |
clarkb | any idea what provides the openstack resource provider commands to osc? seems my installs don't have that | 16:43 |
melwitt | clarkb: osc-placement is the osc plugin you need | 16:44 |
melwitt | you just install it and then it works | 16:44 |
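A minimal sketch of what melwitt describes, assuming osc lives in a virtualenv; the venv path is hypothetical.

```shell
# osc-placement is a plugin for python-openstackclient; installing it into
# the same environment as osc exposes the "openstack resource provider ..."
# commands.
source ~/osc-venv/bin/activate   # hypothetical venv path
pip install osc-placement

# sanity check that the new subcommands are now available
openstack resource provider list --help
```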
clarkb | thanks | 16:44 |
clarkb | and now I've hit policy problems. I think I need to escalate my privs. I expect the next bit will just be me stumbling around to find the correct incantations :) | 16:45 |
melwitt | clarkb: placement api is defaulted to admin-only | 16:46 |
clarkb | melwitt: I've found the env to administrate the environment and can run the resource provider commands. When I run openstack server list --all-projects only one VM shows up (our mirror). In the doc you shared it showed performing actions for specific VMs but I don't seem to have that here. In this case would I just run the heal command first? | 16:54 |
clarkb | and i guess make note of the allocation for the single VM that is present first | 16:54 |
* melwitt looks | 16:55 | |
clarkb | there also doesn't appear to be a way to list all resource allocations. | 16:57 |
melwitt | clarkb: ok yeah sorry, heal_allocations is when you still have the server and want to "heal" it. but it might still work if you pass the uuid of the server from the error message | 16:58 |
melwitt | if not, we'll want to do allocation deletes directly | 16:58 |
clarkb | melwitt: got it. Do you know if there is a way to list the allocations? I can show the allocations for the uuids in the logs and they show up but I can't seem to do a listing of all of them | 16:58 |
clarkb | but worst case I can parse the log and generate a list to operate on. That should be doable | 16:59 |
melwitt | listing allocations can be done per resource provider by "openstack resource provider show <compute node uuid> --allocations" | 16:59 |
clarkb | aha thanks! | 17:00 |
melwitt | compute node uuid == resource provider uuid | 17:00 |
clarkb | I think I have what I need then. I can list all the allocations. Remove allocation(s) for the mirror VM then iterate over that list deleting the allocations and healing them | 17:01 |
melwitt | yeah you just want to remove allocations for any servers that no longer exist | 17:02 |
melwitt | i.e. "not in the database" | 17:02 |
melwitt | and the "consumer" uuids in placement map to the server uuids in nova | 17:03 |
melwitt | most of the time consumer == nova server/instance | 17:04 |
melwitt | I say "most of the time" because other services/entities can consume resources in placement as well | 17:05 |
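A hedged sketch of the listing and cleanup flow discussed above, using the osc-placement commands; the UUIDs are placeholders.

```shell
# One resource provider per compute node in this deployment:
openstack resource provider list

# Show a provider together with the allocations held against it
# (compute node uuid == resource provider uuid, as noted above):
openstack resource provider show <compute-node-uuid> --allocations

# For a consumer (server) that no longer exists in nova, drop its allocations:
openstack resource provider allocation delete <consumer-uuid>
```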
clarkb | makes sense. in this case I only see allocations that seem to map to nova | 17:06 |
clarkb | their attributes have server-y things like memory and disk and cpus | 17:06 |
melwitt | ah yeah | 17:11 |
melwitt | you are right, those are nova | 17:11 |
clarkb | melwitt: do I need to run the heal command at all if these instances don't exist? I should be able to simply delete the allocations then I am done? Or are there other side effects of the heal that I want? | 17:16 |
melwitt | clarkb: no I think heal is when the instance is still around but has some extra allocations from env "irregularities" during migrations etc. you are good to just delete for these servers that were deleted in the past | 17:16 |
clarkb | melwitt: thanks for confirming | 17:17 |
melwitt | clarkb: ok so sorry but I got the tools mixed up 😓 this is the one I should have told you https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement-audit for this case where you want to delete ones that no longer exist | 17:18 |
clarkb | melwitt: oh thanks | 17:18 |
melwitt | 'nova-manage placement audit --verbose' will iterate over all resource providers and look for orphaned allocations and if you pass --delete it will delete them for you | 17:19 |
clarkb | I'll try that before I manually delete from my list. Though I have to figure out where the nova-manage command is. I think it must be in one of the containers. Does nova-manage talk to the apis like osc and need those credentials or is it more behind the scenes? | 17:19 |
clarkb | looks like it reads configs directly in the install somewhere | 17:20 |
melwitt | yeah was just looking through, it does call the placement api as well but you don't need your own creds for it | 17:22 |
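A sketch of the nova-manage path melwitt points at; nova-manage reads nova.conf directly, so it has to run wherever that config (and the nova code) lives, e.g. inside one of the nova containers in this deployment.

```shell
# Dry run: report orphaned allocations without changing anything.
nova-manage placement audit --verbose

# Same scan, but also delete the orphaned allocations it finds.
nova-manage placement audit --verbose --delete
```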
clarkb | alright it cleaned up 65 allocations and the mirror still shows up with its allocations | 17:22 |
clarkb | now we wait and see if nodepool can launch successfully | 17:23 |
melwitt | ok cool | 17:23 |
clarkb | melwitt: doing it the more difficult way was good because I feel like I learned a bit more :) | 17:24 |
clarkb | but then having easy mode at the end was nice | 17:24 |
melwitt | :) | 17:25 |
clarkb | [node_request: 300-0015441935] [node: 0026535559] Node is ready | 17:25 |
clarkb | I think it is happy now | 17:26 |
melwitt | phew! | 17:26 |
fungi | awesome | 17:30 |
clarkb | https://grafana.opendev.org/d/4sdNjeXGk/nodepool-inmotion?orgId=1 | 17:40 |
*** ysandeep is now known as ysandeep|out | 18:27 | |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Explicit tox_extra_args in zuul-jobs-test-tox https://review.opendev.org/c/zuul/zuul-jobs/+/809456 | 19:01 |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Add tox_config_file rolevar to tox https://review.opendev.org/c/zuul/zuul-jobs/+/806613 | 19:17 |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Support verbose showconfig in tox siblings https://review.opendev.org/c/zuul/zuul-jobs/+/806621 | 19:17 |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Include tox_extra_args in tox siblings tasks https://review.opendev.org/c/zuul/zuul-jobs/+/806612 | 19:17 |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Explicit tox_extra_args in zuul-jobs-test-tox https://review.opendev.org/c/zuul/zuul-jobs/+/809456 | 19:17 |
opendevreview | Jeremy Stanley proposed zuul/zuul-jobs master: Pin protobuf<3.18 for Python<3.6 https://review.opendev.org/c/zuul/zuul-jobs/+/809460 | 19:17 |
fungi | infra-root: bad news, ticket from rackspace says they're planning a block storage maintenance for 2021-10-04 impacting afs01.dfw.opendev.org/main04 | 19:43 |
fungi | i suppose we should attach a new volume, add it as a pv in the main vg on the server, and then pvmove the extents off main04 and delete the volume | 19:44 |
fungi | i'll try to get that going today or tomorrow, it should be hitless for us | 19:45 |
fungi | at least we have a few weeks warning | 19:46 |
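A rough sketch of the volume shuffle fungi outlines above; the device names are hypothetical (the newly attached cinder volume appearing as /dev/xvdf, the doomed main04 volume as /dev/xvde).

```shell
pvcreate /dev/xvdf        # initialize the newly attached volume as an LVM PV
vgextend main /dev/xvdf   # add it to the existing "main" VG
pvmove /dev/xvde          # migrate all extents off the old PV, online
vgreduce main /dev/xvde   # drop the now-empty PV from the VG
pvremove /dev/xvde        # clear LVM metadata before detaching/deleting the volume
```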
fungi | unfortunately, cinder operations in rackspace are a pain because of the need to use the cinder v1 api which osc no longer supports | 19:47 |
*** odyssey4me is now known as Guest93 | 20:05 | |
Clark[m] | fungi: I think the osc in the venv in my homedir on bridge works with rax cinder; you just have to override the API version on the command line to v1 | 20:36 |
opendevreview | Slawek Kaplonski proposed opendev/irc-meetings master: Update Neutron meetings chairs https://review.opendev.org/c/opendev/irc-meetings/+/809478 | 20:48 |
fungi | Clark[m]: i'll give that a try, but i also have cinderclient set up on bridge i can use to do the cinder api bits | 20:49 |
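A hedged sketch of the two options mentioned here; whether a given osc release still accepts the v1 override is exactly the uncertainty being discussed.

```shell
# Clark[m]'s suggestion: force osc to the v1 volume API for this invocation
openstack --os-volume-api-version 1 volume list

# fungi's fallback: use python-cinderclient directly with the v1 API
OS_VOLUME_API_VERSION=1 cinder list
```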
*** dviroel is now known as dviroel|out | 21:00 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Run daily backups of nodepool zk image data https://review.opendev.org/c/opendev/system-config/+/809483 | 21:13 |
clarkb | infra-root ^ that isn't critical to back up but nodepool has grown the ability to do those data dumps so I figure we may as well take advantage of it | 21:13 |
fungi | i just did `curl -XPURGE https://pypi.org/simple/reno` (and a second time with a trailing / just in case) based on the discussion in #openstack-swift about job failures which look like more stale reno indices being served near montreal | 22:11 |
*** odyssey4me is now known as Guest101 | 22:56 | |
fungi | clarkb: i think ianw was able to work out how to extract the cached indices from the fs at one point, but i don't recall how he located samples | 23:29 |
ianw | fungi: ISTR it being an inelegant but ultimately fruitful application of "grep" | 23:31 |
ianw | 2020-09-16 : "pypi stale index issues ... end up finding details by walking mirror caches" is what i have in my notes | 23:32 |
fungi | sounds about right | 23:32 |
fungi | wow, and today's the anniversary! coincidence? | 23:33 |
clarkb | fwiw I did a find /var/cache/apache2/proxy -type f -name \*.header -exec grep reno {} \; | 23:36 |
ianw | https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2020-09-15.log.html#t2020-09-15T20:22:56 | 23:36 |
clarkb | then looked at all the files. It seems that pip explicitly asks for uncached data and that the only version of the file we cached was up to date | 23:37 |
clarkb | for reno's index specifically on the iweb mirror | 23:37 |
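A sketch of the cache-walking approach described above, assuming the stock mod_cache_disk layout where each cached response is stored as a <hash>.header / <hash>.data pair under the proxy cache root.

```shell
# Print the paths of cached entries whose stored headers mention reno:
find /var/cache/apache2/proxy -type f -name '*.header' \
  -exec grep -l reno {} \;

# The cached response body sits alongside each matching header file,
# so a suspect index can then be inspected with e.g.:
#   less /var/cache/apache2/proxy/<subdirs>/<hash>.data
```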
ianw | fungi: haha yes, i guess that's from my timestamp, so happened on the 15th UTC | 23:37 |