*** ykarel_ is now known as ykarel | 07:01 | |
*** haleyb|out is now known as haleyb | 14:51 | |
sean-k-mooney | hi folks, i know we can query logs via opensearch like this https://tinyurl.com/35dfcfcy | 17:41 |
sean-k-mooney | or we can ask zuul for the build history like this https://zuul.openstack.org/builds?job_name=tempest-integrated-compute&result=TIMED_OUT&skip=0 | 17:41 |
sean-k-mooney | but is there a way we can get the average runtime of a job per provider | 17:42 |
sean-k-mooney | looking at the opensearch results it looks like almost all of the timeouts for tempest-integrated-compute are on rackspace (rax) providers | 17:43 |
sean-k-mooney | so i'm trying to figure out if the successful runs are typically slower on rax vs other providers, and if so by how much | 17:43 |
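For reference, the same build history can be pulled from the Zuul REST API rather than the web UI, which makes it easy to compute an average runtime for the job. A minimal sketch, assuming the builds endpoint returns a JSON list with a `duration` field in seconds (and noting, as discussed below, that it carries no provider/region information):

```python
#!/usr/bin/env python3
"""Sketch: mean/max duration of recent successful tempest-integrated-compute
builds via the Zuul builds API. Assumes each build record has a "duration"
field in seconds; the response contains no provider/region information."""
import json
import statistics
import urllib.request

URL = ("https://zuul.openstack.org/api/tenant/openstack/builds"
       "?job_name=tempest-integrated-compute&result=SUCCESS&limit=50")

req = urllib.request.Request(URL, headers={"Accept": "application/json"})
with urllib.request.urlopen(req) as resp:
    builds = json.load(resp)

durations = [b["duration"] for b in builds if b.get("duration")]
if durations:
    print(f"builds sampled: {len(durations)}")
    print(f"mean runtime:   {statistics.mean(durations) / 60:.1f} min")
    print(f"max runtime:    {max(durations) / 60:.1f} min")
```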
clarkb | with the old elasticsearch system we queried the log info for timing and then aggregated by region. Not sure if still doable | 17:44 |
sean-k-mooney | looks like the build-times api does not actually work | 17:50 |
sean-k-mooney | curl -X GET "https://zuul.openstack.org/api/tenant/{tenant_name}/build-times" -H "accept: */*" | 17:50 |
sean-k-mooney | it probably would not give me the info i want anyway | 17:51 |
clarkb | sean-k-mooney: there is a test for build-times so it should work | 17:51 |
sean-k-mooney | well it's responding with an html snippet | 17:51 |
sean-k-mooney | that does not render | 17:51 |
sean-k-mooney | i get this https://termbin.com/rvs3 | 17:52 |
clarkb | you probably need to send the header for json not */* | 17:52 |
clarkb | ya @cherrypy.tools.json_out(content_type='application/json; charset=utf-8') decorates the build_times method | 17:52 |
sean-k-mooney | perhaps, but that's also what i get back from the openapi client in my browser | 17:52 |
sean-k-mooney | i.e. the one here https://zuul.openstack.org/openapi | 17:53 |
clarkb | you also need to replace tenant_name with a valid tenant name | 17:53 |
clarkb | but this is tested so it should work | 17:53 |
sean-k-mooney | yep, i'm using openstack as the tenant | 17:53 |
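The call above works once the `{tenant_name}` placeholder is replaced and the request explicitly asks for JSON instead of `*/*`. A hedged sketch of the same request (the response format is undocumented and, per the discussion below, the endpoint is still a work in progress, so this just prints whatever comes back):

```python
#!/usr/bin/env python3
"""Sketch: the build-times call with the tenant filled in ("openstack") and an
explicit JSON Accept header. The response shape is undocumented, so this only
dumps the status, content type and the start of the body."""
import urllib.request

url = "https://zuul.openstack.org/api/tenant/openstack/build-times"
req = urllib.request.Request(url, headers={"Accept": "application/json"})
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.headers.get("Content-Type"))
    print(resp.read(8192).decode("utf-8", errors="replace"))
```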
clarkb | I managed to get an empty reply from server. I wonder if it timed out | 17:56 |
sean-k-mooney | it's fine, as i said this probably won't have the provider info in the response anyway | 17:57 |
clarkb | no it would not include that info | 17:57 |
sean-k-mooney | what i'm trying to determine is whether it's reasonable to increase the timeout from 2 hours to 2.5 hours to accommodate the occasional slow node on rax | 17:58 |
sean-k-mooney | and to determine that i kind of need to know what the expected time is for successful runs per provider | 17:59 |
sean-k-mooney | if we see a trend that rax is typically slower than the rest then that supports extending it. if however it's typically in the same range perhaps we should explore something else | 18:00 |
clarkb | ya I would look at the records in elasticsearch | 18:01 |
clarkb | you should be able to collect runtime info out of the log files matching on a specific logline. Then collect them for the regions you care about and dump into a csv file for processing | 18:02 |
sean-k-mooney | the only problem with this is we don't have the time a job took to execute in elasticsearch | 18:02 |
clarkb | no but you have the tox/stestr runtime and also info if it timed out | 18:02 |
clarkb | you also have the devstack runtimes | 18:02 |
opendevreview | James Parker proposed openstack/whitebox-tempest-plugin master: Use sudo when gather compute id information https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/921699 | 18:02 |
sean-k-mooney | that's fair, we could look at the tempest execution time and devstack time for the successful runs | 18:03 |
sean-k-mooney | and use that as a proxy for total job time | 18:03 |
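If the tempest/devstack timings and regions are exported from opensearch into a CSV as suggested above, the per-provider comparison is a few lines of post-processing. A sketch, assuming hypothetical column names `provider` and `runtime_seconds` (the real names depend on how the export is built):

```python
#!/usr/bin/env python3
"""Sketch: per-provider runtime summary from a CSV export of the opensearch
results. The file name and the "provider"/"runtime_seconds" column names are
hypothetical and depend on how the export is generated."""
import csv
import statistics
from collections import defaultdict

runtimes = defaultdict(list)
with open("tempest-integrated-compute.csv", newline="") as f:
    for row in csv.DictReader(f):
        try:
            value = float(row["runtime_seconds"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or malformed timing data
        runtimes[row.get("provider", "unknown")].append(value)

for provider, values in sorted(runtimes.items()):
    p95 = sorted(values)[int(0.95 * (len(values) - 1))]
    print(f"{provider:30s} n={len(values):4d} "
          f"mean={statistics.mean(values) / 60:6.1f} min "
          f"p95={p95 / 60:6.1f} min")
```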
clarkb | sean-k-mooney: I checked with corvus and he says the build-times stuff is still a bit of a work in progress. It was added to support https://review.opendev.org/c/zuul/zuul/+/899767 which hasn't landed yet and the queries involved are optimized for what that page does | 18:03 |
sean-k-mooney | ya i noticed the api docs said "response is not documented yet" | 18:04 |
sean-k-mooney | so i assumed it was not ready for prime time | 18:04 |
sean-k-mooney | oh interesting | 18:05 |
sean-k-mooney | ya that page sounds useful | 18:05 |
sean-k-mooney | it's just a shame we still won't have the rejoin info from the jobs | 18:05 |
sean-k-mooney | it's in the zuul inventory | 18:05 |
clarkb | rejoin info? | 18:05 |
sean-k-mooney | but the region is just not a zuul job output so presumably | 18:06 |
sean-k-mooney | not queryable | 18:06 |
sean-k-mooney | the provider | 18:06 |
sean-k-mooney | i just spell things horribly :) | 18:06 |
clarkb | oh region. Ya the reason for that is all of that info is on the nodepool side of things so doesn't end up in zuul's db. There is effort underway to move nodepool into zuul as a zuul component and then we could probably collect that data more easily | 18:06 |
sean-k-mooney | well it's also in the inventory that zuul creates | 18:07 |
sean-k-mooney | but the inventory is not really part of any zuul api | 18:07 |
sean-k-mooney | and the field we are using is likely custom to our deployment | 18:07 |
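That said, for opendev jobs the rendered inventory does get published with the job logs, so the provider/region can be recovered per build after the fact. A sketch under the assumptions that the file is available at `<log_url>/zuul-info/inventory.yaml`, that each host's vars carry a `nodepool` dict with `provider` and `region` keys, and that PyYAML is installed:

```python
#!/usr/bin/env python3
"""Sketch: recover the nodepool provider/region for a build from its published
inventory. Assumptions: the inventory is served at
<log_url>/zuul-info/inventory.yaml and per-host vars include a "nodepool"
dict with "provider" and "region" keys; requires PyYAML (pip install pyyaml)."""
import sys
import urllib.request

import yaml


def providers_for_build(log_url: str) -> dict:
    """Return {hostname: (provider, region)} for one build's log_url."""
    url = log_url.rstrip("/") + "/zuul-info/inventory.yaml"
    with urllib.request.urlopen(url) as resp:
        inventory = yaml.safe_load(resp.read())
    result = {}
    for name, host in inventory.get("all", {}).get("hosts", {}).items():
        nodepool = host.get("nodepool", {})
        result[name] = (nodepool.get("provider"), nodepool.get("region"))
    return result


if __name__ == "__main__":
    # pass a build's log_url (e.g. taken from the Zuul builds API) as argv[1]
    print(providers_for_build(sys.argv[1]))
```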
sean-k-mooney | ok so this kind of works https://tinyurl.com/996xk2s2 | 18:11 |
sean-k-mooney | that's all the devstack timing from the devstack summary | 18:11 |
opendevreview | James Parker proposed openstack/whitebox-tempest-plugin master: Use sudo when gather compute id information https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/921699 | 18:15 |
opendevreview | Goutham Pacha Ravi proposed openstack/devstack-plugin-ceph master: Standalone nfs-ganesha with cephadm deployment https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/915212 | 23:58 |
opendevreview | Goutham Pacha Ravi proposed openstack/devstack-plugin-ceph master: Delete package-based-installation test jobs https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/918951 | 23:58 |