opendevreview | Dan Smith proposed openstack/devstack master: WIP: Test static perfdata comparisons https://review.opendev.org/c/openstack/devstack/+/838947 | 00:08 |
---|---|---|
*** pojadhav|out is now known as pojadhav | 04:59 | |
*** bhagyashris is now known as bhagyashris|ruck | 05:33 | |
*** pojadhav is now known as pojadhav|lunch | 06:27 | |
*** jpena|off is now known as jpena | 06:59 | |
*** pojadhav|lunch is now known as pojadhav | 07:11 | |
opendevreview | Balazs Gibizer proposed openstack/devstack master: Use proper sed separator for paths https://review.opendev.org/c/openstack/devstack/+/839034 | 10:05 |
opendevreview | Merged openstack/devstack master: modify the sample value of LOGDAYS https://review.opendev.org/c/openstack/devstack/+/838829 | 11:09 |
sean-k-mooney | clarkb: i have been playing with the virtual env patch actually this morning but in a very different setup | 11:42 |
sean-k-mooney | clarkb: it was working pretty well on centos9 for me before | 11:43 |
sean-k-mooney | this morning i have been trying to use it to install on arm (m1 macbook air) using debian testing/bookworm | 11:43 |
sean-k-mooney | clarkb: ... privsep | 12:09 |
sean-k-mooney | privsep is not using the same python as everything else | 12:10 |
sean-k-mooney | oh i bet this is the console script edge case where the python version is not updated after the initial install | 12:14 |
sean-k-mooney | i did originally try to stack with 10 a few weeks ago | 12:14 |
sean-k-mooney | i bet that is the issue | 12:14 |
*** pojadhav is now known as pojadhav|afk | 13:23 | |
opendevreview | Dan Smith proposed openstack/devstack master: Update our API call counting method https://review.opendev.org/c/openstack/devstack/+/839067 | 14:00 |
dansmith | clarkb: gmann: I'm starting to realize that some of these "stable" metrics aren't actually that performance-independent | 14:05 |
dansmith | like api calls.. on a slower worker, tempest will poll more times for resource completion, inflating that number | 14:05 |
dansmith | that's actually not that hard to filter out | 14:06 |
dansmith | but, the db queries it generates is | 14:06 |
dansmith | service memory footprint seems to also be somewhat unreliable, like by 30% in some cases, which seems odd to me | 14:07 |
*** akekane_ is now known as abhishekk | 15:15 | |
clarkb | dansmith: oh interesting | 15:18 |
sean-k-mooney | dansmith: is that 30% in the same job but from different runs | 15:23 |
sean-k-mooney | or stack/unstack on your local system | 15:23 |
sean-k-mooney | that does seem higher than i would expect too | 15:23 |
dansmith | sean-k-mooney: same job, same patch, two subsequent runs | 15:28 |
dansmith | (in ci, not local) | 15:28 |
sean-k-mooney | dansmith: did it hit the same provider? i would not think that should affect memory like that but if it was slower etc. maybe caches are expanding or something over time | 15:29 |
dansmith | sean-k-mooney: nope, different providers, which is likely why it took 10 minutes longer on one, and had more polling | 15:32 |
sean-k-mooney | ya so we might be buffering the logs or something that would result in larger memory usage | 15:33 |
dansmith | 30% seems like a lot for that difference, but yeah, something | 15:34 |
dansmith | I was thinking more like the extra polling ended up spawning more horizontal workers or something | 15:34 |
sean-k-mooney | ya it does | 15:34 |
dansmith | more tempest tests waiting and polling all at once or something | 15:34 |
sean-k-mooney | perhaps or more queued requests in apache | 15:34 |
sean-k-mooney | the way we run say nova-api behind apache in uwsgi means that we get no scaling from eventlet | 15:35 |
sean-k-mooney | all requests are queued in apache and then each uwsgi python process handles one request at a time | 15:36 |
sean-k-mooney | so more polling means more things in the queue and more memory usage | 15:36 |
dansmith | well, the things in the queue are small though, and not charged against the python process until they're dispatched right? | 15:38 |
sean-k-mooney | hum ya i guess that would be correct it would be against apache | 15:38 |
sean-k-mooney | not the python process | 15:38 |
sean-k-mooney | and ya its just the http get which in most cases is tiny | 15:39 |
dansmith | right | 15:39 |
dansmith | especially in the polling case | 15:39 |
sean-k-mooney | although each request will translate to a db query | 15:39 |
sean-k-mooney | and probably memcache lookup | 15:40 |
sean-k-mooney | so its probably on the service -> db side that its increasing | 15:40 |
dansmith | memcache when? | 15:42 |
dansmith | not for things like instance show I wouldn't think | 15:43 |
sean-k-mooney | i was thinking the keystone tokens dont we cache that in memcache | 15:43 |
dansmith | and memcached might go up, but not sure why memcache calls from the python services would | 15:43 |
sean-k-mooney | i guess that would not affect the python usage | 15:44 |
sean-k-mooney | its an interesting result in any case | 15:44 |
sean-k-mooney | i assume os-profiler or the other tools we have would not help narrow down why its increasing | 15:45 |
dansmith | yeah I dunno, but yes definitely interesting | 15:45 |
dansmith | I think focusing on the should-be-repeatable metrics first is probably most useful | 15:46 |
dansmith | db queries being impacted by polling is troubling, | 15:46 |
dansmith | so I probably need to separate out SELECT vs. (everything else) or something | 15:46 |
dansmith | but even still, the impetus for this was additional select queries due to rbac, so it would only help for large spikes | 15:46 |
sean-k-mooney | is that select load confined to keystone or across all services | 15:48 |
sean-k-mooney | i would not have expected this to affect other services | 15:48 |
dansmith | of course it's all services, because it | 15:49 |
dansmith | is polling for instance -> active or something :) | 15:49 |
dansmith | which makes it.. a mess :) | 15:49 |
sean-k-mooney | right but that would increase the http requests to keystone from nova | 15:49 |
sean-k-mooney | sorry from nova to keystone | 15:49 |
sean-k-mooney | to validate the tokens | 15:49 |
dansmith | so keystone goes up as well because of all the token validation, but so does straight up nova db queries, pulling the instance each time | 15:50 |
sean-k-mooney | ya but that last part is partly unavoidable | 15:50 |
sean-k-mooney | although i think status is in the non detail endpoint | 15:50 |
dansmith | it is, and that's the point | 15:50 |
sean-k-mooney | so if its just active they should not use the detail one | 15:50 |
sean-k-mooney | i know ceilometer fixed that a few years ago | 15:50 |
dansmith | if we're trying to alert on "why is nova doing a bunch more db queries on this patch" it's hard to separate that from "this is a slow worker and tempest did a bunch more polling on just this run" | 15:51 |
sean-k-mooney | ya | 15:51 |
sean-k-mooney | maybe you could synthesize a metric like queries/jobtime or something that was less affected | 15:52 |
dansmith | yeah, so I can tell what api calls are tempest and which are inter-service, so maybe figuring out what the proportion is and then apply that to the db queries would help normalize it | 15:54 |
*** jpena is now known as jpena|off | 15:55 | |
sean-k-mooney | dansmith: clarkb https://twitter.com/sean_k_mooney/status/1517537626923929601?s=20&t=arNXrLIXTd_74nKhsAZauA | 16:17 |
dansmith | nice | 16:17 |
sean-k-mooney | i have not got vms booting fully yet. im missing some config for uefi to work properly | 16:18 |
sean-k-mooney | but its close | 16:18 |
sean-k-mooney | i am also using the global_venv | 16:18 |
clarkb | nice! when you say natively is it still in a linux vm but running arm not emulated x86? | 16:19 |
sean-k-mooney | nope | 16:20 |
clarkb | (I wouldn't expect devstack to run on osx directly, really neat if so) | 16:20 |
sean-k-mooney | linux running natively on m1 | 16:20 |
sean-k-mooney | no vm | 16:20 |
sean-k-mooney | then devstack installed on that | 16:20 |
clarkb | oh woww you are extremely brave :) | 16:20 |
clarkb | that is cool though | 16:20 |
sean-k-mooney | so debian testing | 16:20 |
dansmith | ah, gdi, we're using a different log format for tls-proxy.log | 16:21 |
sean-k-mooney | https://github.com/AsahiLinux/docs/wiki/SW%3AAlternative-Distros im using the debian community installer | 16:21 |
dansmith | clarkb: gmann frickler: do you know why we're not using the "combined" format for the tls proxy log? we're specifically choosing a log format there, but it's lacking things like user-agent | 16:23 |
clarkb | I don't | 16:23 |
clarkb | are we adding port info? that may be why (its a common reason for opendev services to override common format at least) | 16:24 |
gmann | me too, not sure about it. | 16:24 |
clarkb | by default the combined format doesn't show you port info which is necessary to trace connections through a proxy | 16:24 |
dansmith | it's very barebones | 16:25 |
dansmith | CustomLog /var/log/apache2/tls-proxy_access.log "%{%Y-%m-%d}t %{%T}t.%{msec_frac}t [%l] %a \"%r\" %>s %b" | 16:25 |
dansmith | so I'mma switch that to combined like our regular access.log if that's okay | 16:26 |
clarkb | ya if port info isn't there then combined shouldn't be a regression. wfm | 16:26 |
opendevreview | Dan Smith proposed openstack/devstack master: Update our API call counting method https://review.opendev.org/c/openstack/devstack/+/839067 | 16:31 |
sean-k-mooney | dansmith: clarkb got a vm booted :) | 17:00 |
opendevreview | Dan Smith proposed openstack/devstack master: Update our API call counting method https://review.opendev.org/c/openstack/devstack/+/839067 | 17:02 |
sean-k-mooney | ok that might be premature i think it crashed | 17:08 |
sean-k-mooney | which kind of makes sense | 17:08 |
dansmith | hah | 17:08 |
sean-k-mooney | the current kernel im using does not yet support 4k pages | 17:08 |
sean-k-mooney | its using 16k pages by default because of iommu issues | 17:09 |
sean-k-mooney | oh yay ... sudo reboot does not work | 17:10 |
sean-k-mooney | power button does. i think ill leave it off for a while its had a long day | 17:10 |
sean-k-mooney | o/ talk to ye on monday | 17:11 |
dansmith | shocking that apple even includes a power button anymore | 17:13 |
opendevreview | Dan Smith proposed openstack/devstack master: Update our API call counting method https://review.opendev.org/c/openstack/devstack/+/839067 | 18:40 |
opendevreview | Dan Smith proposed openstack/devstack master: WIP: Test static perfdata comparisons https://review.opendev.org/c/openstack/devstack/+/838947 | 18:40 |
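
The normalization idea floated above (queries/jobtime at 15:52, and applying the tempest vs. inter-service call proportion to db queries at 15:54) could be sketched roughly as follows. This is only an illustration, not what the devstack patches actually do. It assumes the access log is in Apache's standard "combined" format, that tempest requests can be recognized by a substring in the User-Agent (the `TEMPEST_MARKER` value is a guess), and the sample lines and helper names are made up for the example.

```python
import re
from collections import Counter

# Apache "combined" format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Assumption: tempest-originated requests carry this substring in User-Agent.
TEMPEST_MARKER = "tempest"


def count_calls(lines):
    """Count API calls per origin (tempest vs. inter-service) by User-Agent."""
    counts = Counter()
    for line in lines:
        match = COMBINED_RE.match(line.strip())
        if match is None:
            counts["unparsed"] += 1
            continue
        agent = match.group("agent").lower()
        origin = "tempest" if TEMPEST_MARKER in agent else "inter_service"
        counts[origin] += 1
    return counts


def scale_db_queries(db_queries, counts):
    """Scale a raw db query count by the inter-service share of API calls.

    One possible reading of "apply the proportion to the db queries":
    discount the part of the load that tracks tempest polling.
    """
    total = counts["tempest"] + counts["inter_service"]
    if total == 0:
        return float(db_queries)
    return db_queries * counts["inter_service"] / total


def queries_per_second(db_queries, job_seconds):
    """The simpler queries/jobtime ratio."""
    return db_queries / job_seconds


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '10.0.0.5 - - [22/Apr/2022:15:50:01 +0000] "GET /compute/v2.1/servers/abc HTTP/1.1" 200 512 "-" "tempest"',
    '10.0.0.5 - - [22/Apr/2022:15:50:03 +0000] "GET /compute/v2.1/servers/abc HTTP/1.1" 200 512 "-" "tempest"',
    '10.0.0.5 - - [22/Apr/2022:15:50:05 +0000] "GET /compute/v2.1/servers/abc HTTP/1.1" 200 512 "-" "tempest"',
    '10.0.0.7 - - [22/Apr/2022:15:50:05 +0000] "GET /identity/v3/auth/tokens HTTP/1.1" 200 2048 "-" "example-service-client"',
]

if __name__ == "__main__":
    sample_counts = count_calls(SAMPLE_LINES)
    print(dict(sample_counts))
    print(scale_db_queries(400, sample_counts))
    print(queries_per_second(400, 200))
```

Neither ratio removes the polling effect entirely, since a slow worker also changes how many inter-service calls each tempest poll triggers; the sketch only shows the mechanics of the two candidate metrics.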