Friday, 2022-04-22

opendevreviewDan Smith proposed openstack/devstack master: WIP: Test static perfdata comparisons  https://review.opendev.org/c/openstack/devstack/+/83894700:08
*** pojadhav|out is now known as pojadhav04:59
*** bhagyashris is now known as bhagyashris|ruck05:33
*** pojadhav is now known as pojadhav|lunch06:27
*** jpena|off is now known as jpena06:59
*** pojadhav|lunch is now known as pojadhav07:11
opendevreviewBalazs Gibizer proposed openstack/devstack master: Use proper sed separator for paths  https://review.opendev.org/c/openstack/devstack/+/83903410:05
opendevreviewMerged openstack/devstack master: modify the sample value of LOGDAYS  https://review.opendev.org/c/openstack/devstack/+/83882911:09
sean-k-mooneyclarkb: i have been playing with the virtual env patch actually this morning but in a very different setup11:42
sean-k-mooneyclarkb: it was working pretty well on centos9 for me before11:43
sean-k-mooneythis morning i have been trying to use it to install on arm (m1 macbook air) using debian testing/bookworm 11:43
sean-k-mooneyclarkb: ... privsep12:09
sean-k-mooneyprivsep is not using the same python as everything else12:10
sean-k-mooneyoh i bet this is the console script edge case where the python version is not updated after the initial install12:14
sean-k-mooneyi did originally try to stack with 10 a few weeks ago12:14
sean-k-mooneyi bet that is the issue12:14
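One way to check the suspicion above is to read the shebang line of an installed console script and compare it with the interpreter everything else runs under. The following is a minimal sketch, not something taken from devstack; the helper names are hypothetical, and it assumes the script is a plain text file whose first line is a `#!` line.

```python
import sys
from pathlib import Path


def interpreter_from_shebang(first_line):
    """Return the interpreter named on a shebang line, or None if absent."""
    line = first_line.strip()
    if not line.startswith("#!"):
        return None
    parts = line[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env python3" names the interpreter as the second word.
    if Path(parts[0]).name == "env" and len(parts) > 1:
        return parts[1]
    return parts[0]


def script_interpreter(script_path):
    """Read the first line of a script and return its shebang interpreter."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace")
    return interpreter_from_shebang(first_line)


def uses_current_python(script_path):
    """True if the script's shebang resolves to the running interpreter."""
    interp = script_interpreter(script_path)
    if interp is None:
        return False
    return Path(interp).resolve() == Path(sys.executable).resolve()
```

Calling `script_interpreter()` on the privsep helper entry point inside the venv would show whether it still points at an older Python than the rest of the install.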
*** pojadhav is now known as pojadhav|afk13:23
opendevreviewDan Smith proposed openstack/devstack master: Update our API call counting method  https://review.opendev.org/c/openstack/devstack/+/83906714:00
dansmithclarkb: gmann: I'm starting to realize that some of these "stable" metrics aren't actually that performance-independent14:05
dansmithlike api calls.. on a slower worker, tempest will poll more times for resource completion, inflating that number14:05
dansmiththat's actually not that hard to filter out14:06
dansmithbut, the db queries it generates is14:06
dansmithservice memory footprint seems to also be somewhat unreliable, like by 30% in some cases, which seems odd to me14:07
*** akekane_ is now known as abhishekk15:15
clarkbdansmith: oh interesting15:18
sean-k-mooneydansmith: is that 30% in the same job but from different runs15:23
sean-k-mooneyor stack/unstack on your local system15:23
sean-k-mooneythat does seem higher than i would expect too15:23
dansmithsean-k-mooney: same job, same patch, two subsequent runs15:28
dansmith(in ci, not local)15:28
sean-k-mooneydansmith: did it hit the same provider? i would not think that should affect memory like that but if it was slower etc. maybe caches are expanding or something over time15:29
dansmithsean-k-mooney: nope, different providers, which is likely why it took 10 minutes longer on one, and had more polling15:32
sean-k-mooneyya so we might be buffering the logs or something that would result in larger memory usage15:33
dansmith30% seems like a lot for that difference, but yeah, something15:34
dansmithI was thinking more like the extra polling ended up spawning more horizontal workers or something15:34
sean-k-mooneyya it does15:34
dansmithmore tempest tests waiting and polling all at once or something15:34
sean-k-mooneyperhaps or more queued requests in apache15:34
sean-k-mooneythe way we run, say, nova-api behind apache in uwsgi means that we get no scaling from eventlet15:35
sean-k-mooneyall requests are queued in apache and then each uwsgi python process handles one request at a time15:36
sean-k-mooneyso more polling means more things in the queue and more memory usage 15:36
dansmithwell, the things in the queue are small though, and not charged against the python process until they're dispatched right?15:38
sean-k-mooneyhum ya i guess that would be correct it would be against apache15:38
sean-k-mooneynot the python process15:38
sean-k-mooneyand ya its just the http get which in most cases is tiny15:39
dansmithright15:39
dansmithespecially in the polling case15:39
sean-k-mooneyalthough each request will translate to a db query15:39
sean-k-mooneyand probably memcache lookup15:40
sean-k-mooneyso its probably on the service -> db side that its increasing15:40
dansmithmemcache when?15:42
dansmithnot for things like instance show I wouldn't think15:43
sean-k-mooneyi was thinking the keystone tokens dont we cache that in memcache15:43
dansmithand memcached might go up, but not sure why memcache calls from the python services would15:43
sean-k-mooneyi guess that would not affect the python usage15:44
sean-k-mooneyits an interesting result in any case15:44
sean-k-mooneyi assume os-profiler or the other tools we have would not help narrow down why its increasing15:45
dansmithyeah I dunno, but yes definitely interesting15:45
dansmithI think focusing on the should-be-repeatable metrics first is probably most useful15:46
dansmithdb queries being impacted by polling is troubling,15:46
dansmithso I probably need to separate out  SELECT vs. (everything else) or something15:46
dansmithbut even still, the impetus for this was additional select queries due to rbac, so it would only help for large spikes15:46
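The SELECT vs. everything-else split mentioned above could look roughly like this. It is a sketch only: it assumes the raw SQL statements are already available as strings (how they are captured is not covered here), and `classify_queries` is a hypothetical name rather than anything in the devstack patch.

```python
from collections import Counter


def classify_queries(statements):
    """Count SQL statements as 'select' or 'other' by their leading keyword."""
    counts = Counter()
    for stmt in statements:
        words = stmt.strip().split(None, 1)
        if not words:
            # Skip empty or whitespace-only entries.
            continue
        verb = words[0].upper()
        counts["select" if verb == "SELECT" else "other"] += 1
    return counts
```

Keeping the two buckets separate means a run with extra polling inflates only the `select` count, while writes stay comparable between runs.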
sean-k-mooneyis that select load confined to keystone or across all services 15:48
sean-k-mooneyi would not have expected this to affect other services15:48
dansmithof course it's all services, because it15:49
dansmithis polling for instance -> active or something :)15:49
dansmithwhich makes it.. a mess :)15:49
sean-k-mooneyright but that would increase the http request to keystone from nova15:49
sean-k-mooneysorry from nova to keystone15:49
sean-k-mooneyto validate the tokens15:49
dansmithso keystone goes up as well because of all the token validation, but so does straight up nova db queries, pulling the instance each time15:50
sean-k-mooneyya but that last part is partly unavoidable15:50
sean-k-mooneyalthough i think status is in the non detail endpoint15:50
dansmithit is, and that's the point15:50
sean-k-mooneyso if its just active they should not use the detail one15:50
sean-k-mooneyi know ceilometer fixed that a few years ago15:50
dansmithif we're trying to alert on "why is nova doing a bunch more db queries on this patch" it's hard to separate that from "this is a slow worker and tempest did a bunch more polling on just this run"15:51
sean-k-mooneyya15:51
sean-k-mooneymaybe you could synthesize a metric like queries/jobtime or something that was less affected15:52
dansmithyeah, so I can tell what api calls are tempest and which are inter-service, so maybe figuring out what the proportion is and then apply that to the db queries would help normalize it15:54
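The proportion idea above can be sketched as follows. This is one possible reading, not the method in the proposed patch: it assumes each API call can be attributed to tempest by a substring in its user-agent, and it simply scales the db query count by the non-tempest share of calls. The function names and the `marker` parameter are hypothetical.

```python
def tempest_fraction(user_agents, marker="tempest"):
    """Fraction of API calls whose user-agent contains the marker string."""
    total = len(user_agents)
    if total == 0:
        return 0.0
    hits = sum(1 for ua in user_agents if marker in ua.lower())
    return hits / total


def normalized_db_queries(db_queries, user_agents, marker="tempest"):
    """Scale a db query count by the share of calls NOT made by tempest."""
    return db_queries * (1.0 - tempest_fraction(user_agents, marker))
```

For example, if half of the logged calls carry a tempest user-agent, a raw count of 1000 queries is reported as 500. Whether db queries really scale linearly with API calls is itself an assumption that would need checking against real runs.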
*** jpena is now known as jpena|off15:55
sean-k-mooneydansmith: clarkb https://twitter.com/sean_k_mooney/status/1517537626923929601?s=20&t=arNXrLIXTd_74nKhsAZauA16:17
dansmithnice16:17
sean-k-mooneyi have not got vms booting fully yet. im missing some config for uefi to work properly16:18
sean-k-mooneybut its close16:18
sean-k-mooneyi am also using the global_venv16:18
clarkbnice! when you say natively is it still in a linux vm but running arm not emulated x86?16:19
sean-k-mooneynope16:20
clarkb(I wouldn't expect devstack to run on osx directly, really neat if so)16:20
sean-k-mooneylinux running natively on m1 16:20
sean-k-mooneyno vm16:20
sean-k-mooneythen devstack installed on that16:20
clarkboh woww you are extremely brave :)16:20
clarkbthat is cool though16:20
sean-k-mooneyso debian testing 16:20
dansmithah, gdi, we're using a different log format for tls-proxy.log16:21
sean-k-mooneyhttps://github.com/AsahiLinux/docs/wiki/SW%3AAlternative-Distros im using the debian community installer16:21
dansmithclarkb: gmann frickler: do you know why we're not using the "combined" format for the tls proxy log? we're specifically choosing a log format there, but it's lacking things like user-agent16:23
clarkbI don't16:23
clarkbare we adding port info? that may be why (its a common reason for opendev services to override common format at least)16:24
gmannme too, not sure about it. 16:24
clarkbby default the combined format doesn't show you port info which is necessary to trace connections through a proxy16:24
dansmithit's very barebones16:25
dansmithCustomLog /var/log/apache2/tls-proxy_access.log "%{%Y-%m-%d}t %{%T}t.%{msec_frac}t [%l] %a \"%r\" %>s %b"16:25
dansmithso I'mma switch that to combined like our regular access.log if that's okay16:26
clarkbya if port info isn't there then combined shouldn't be a regression. wfm16:26
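Once the tls-proxy log uses Apache's standard "combined" format (`%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"`), the user-agent can be pulled out of each line. A minimal sketch, assuming no escaped quotes inside the quoted fields; `parse_combined` is a hypothetical helper, not code from the patch.

```python
import re

# Apache "combined" format:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for a combined-format line, or None."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None
```

A line written with the older barebones format shown above does not match this pattern, so `parse_combined` returns `None` for it, which makes mixed logs easy to spot.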
opendevreviewDan Smith proposed openstack/devstack master: Update our API call counting method  https://review.opendev.org/c/openstack/devstack/+/83906716:31
sean-k-mooneydansmith: clarkb  got a vm booted :)17:00
opendevreviewDan Smith proposed openstack/devstack master: Update our API call counting method  https://review.opendev.org/c/openstack/devstack/+/83906717:02
sean-k-mooneyok that might be premature i think it crashed17:08
sean-k-mooneywhich kind of makes sense17:08
dansmithhah17:08
sean-k-mooneythe current kernel im using does not yet support 4k pages17:08
sean-k-mooneyits using 16k pages by default because of iommu issues17:09
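The page size the running kernel reports can be checked from Python. A small sketch using the standard library's `mmap.PAGESIZE`; the helper names are hypothetical, and it says nothing about which page sizes a guest image expects.

```python
import mmap


def page_size_kib():
    """Memory page size reported by the running system, in KiB."""
    return mmap.PAGESIZE // 1024


def is_4k_pages():
    """True when the running system uses 4 KiB pages."""
    return mmap.PAGESIZE == 4096
```

On a kernel built with 16k pages, as described above, `page_size_kib()` would be expected to return 16 and `is_4k_pages()` to return `False`.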
sean-k-mooneyoh yay ... sudo reboot does not work17:10
sean-k-mooneypower button does. i think ill leave it off for a while its had a long day17:10
sean-k-mooneyo/ talk to ye on monday17:11
dansmithshocking that apple even includes a power button anymore17:13
opendevreviewDan Smith proposed openstack/devstack master: Update our API call counting method  https://review.opendev.org/c/openstack/devstack/+/83906718:40
opendevreviewDan Smith proposed openstack/devstack master: WIP: Test static perfdata comparisons  https://review.opendev.org/c/openstack/devstack/+/83894718:40
