clarkb | any idea how http://logs.openstack.org/58/671858/3/gate/zuul-tox-docs/b6f9626/job-output.txt.gz#_2019-07-23_16_48_40_674698 happened and if it has been corrected? | 00:14 |
*** ianychoi has quit IRC | 00:18 | |
*** ianychoi has joined #zuul | 00:20 | |
fungi | clarkb: fixed by 672372 maybe? | 00:22 |
clarkb | ah I see that now in scrollback thanks | 00:23 |
openstackgerrit | Merged zuul/zuul master: Fix sphinx error https://review.opendev.org/672372 | 00:33 |
*** mattw4 has quit IRC | 00:37 | |
*** jamesmcarthur has joined #zuul | 00:52 | |
*** jamesmcarthur has quit IRC | 00:58 | |
*** igordc has quit IRC | 01:04 | |
*** jamesmcarthur has joined #zuul | 01:20 | |
*** bhavikdbavishi has joined #zuul | 01:52 | |
*** bhavikdbavishi1 has joined #zuul | 01:55 | |
*** bhavikdbavishi has quit IRC | 01:57 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 01:57 | |
*** jamesmcarthur has quit IRC | 02:16 | |
*** jamesmcarthur has joined #zuul | 02:17 | |
*** jamesmcarthur has quit IRC | 02:21 | |
*** jamesmcarthur has joined #zuul | 02:22 | |
*** jamesmcarthur has quit IRC | 02:24 | |
*** jamesmcarthur has joined #zuul | 02:24 | |
*** jamesmcarthur has quit IRC | 02:32 | |
*** bhavikdbavishi has quit IRC | 02:46 | |
*** mattw4 has joined #zuul | 03:21 | |
*** bhavikdbavishi has joined #zuul | 03:35 | |
*** mattw4 has quit IRC | 03:37 | |
*** mattw4 has joined #zuul | 03:43 | |
*** mattw4 has quit IRC | 03:51 | |
*** igordc has joined #zuul | 03:58 | |
*** igordc has quit IRC | 04:01 | |
*** bolg has joined #zuul | 04:02 | |
*** raukadah is now known as chandankumar | 04:02 | |
*** pcaruana has joined #zuul | 04:27 | |
*** michael-beaver has quit IRC | 04:31 | |
*** bjackman has joined #zuul | 04:43 | |
*** pcaruana has quit IRC | 05:12 | |
*** jangutter has quit IRC | 05:18 | |
*** pcaruana has joined #zuul | 05:25 | |
*** bolg has quit IRC | 05:34 | |
*** bolg has joined #zuul | 05:36 | |
*** bolg has quit IRC | 05:55 | |
*** tosky has joined #zuul | 06:41 | |
*** rlandy has joined #zuul | 06:58 | |
*** jpena|off is now known as jpena | 07:12 | |
*** jangutter has joined #zuul | 07:19 | |
*** jpena is now known as jpena|mtg | 07:19 | |
daniel2 | So with the dockerized zuul setup, how does nodepool build images? Does it still use diskimagebuilder? | 07:21 |
*** jangutter has quit IRC | 07:23 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Enable debug logs for openstack-functional tests https://review.opendev.org/672412 | 07:23 |
*** hashar has joined #zuul | 07:41 | |
*** bolg has joined #zuul | 07:53 | |
*** threestrands has joined #zuul | 08:02 | |
*** rlandy is now known as rlandy|mtg | 08:51 | |
*** altlogbot_2 has quit IRC | 08:51 | |
*** irclogbot_2 has quit IRC | 08:51 | |
*** altlogbot_2 has joined #zuul | 08:53 | |
*** irclogbot_3 has joined #zuul | 08:53 | |
*** hwangbo has quit IRC | 09:04 | |
*** jangutter has joined #zuul | 09:16 | |
*** saneax has joined #zuul | 09:35 | |
*** bhavikdbavishi has quit IRC | 09:38 | |
*** bolg has quit IRC | 09:39 | |
*** bolg has joined #zuul | 09:48 | |
*** jamesmcarthur has joined #zuul | 09:49 | |
*** arxcruz is now known as arxcruz|brb | 09:54 | |
*** jamesmcarthur has quit IRC | 09:58 | |
*** jamesmcarthur_ has joined #zuul | 09:58 | |
*** threestrands has quit IRC | 10:08 | |
*** sshnaidm|afk is now known as sshnaidm | 10:10 | |
*** jamesmcarthur_ has quit IRC | 10:11 | |
zbr_ | Can I add some extra tasks to run on a child job without overriding pre-run from the parent? Is there a way to do this? | 10:13 |
zbr_ | i guess roles would do the trick, but I am not sure how inheritance works with them | 10:14 |
AJaeger | zbr_: if you add a pre-run, it is run *in-addition*, see also https://zuul-ci.org/docs/zuul/user/config.html#job | 10:16 |
AJaeger | zbr_: so, you never override pre-run | 10:16 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Return dependency cycle failure to user https://review.opendev.org/672487 | 10:16 |
zbr_ | AJaeger: super! thanks. | 10:17 |
zbr_ | it totally makes sense to do it like that. | 10:17 |
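As AJaeger explains, pre-run playbooks nest rather than override: the parent's pre-run still executes before the child's. A minimal sketch of this (job and playbook names are hypothetical):

```yaml
- job:
    name: parent-job
    pre-run: playbooks/parent-pre.yaml

- job:
    name: child-job
    parent: parent-job
    # Runs in addition to, and after, the parent's pre-run;
    # it does not replace it.
    pre-run: playbooks/child-pre.yaml
```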
*** hashar has quit IRC | 10:40 | |
*** saneax has quit IRC | 11:32 | |
*** saneax has joined #zuul | 11:33 | |
*** irclogbot_3 has quit IRC | 11:33 | |
*** irclogbot_3 has joined #zuul | 11:36 | |
*** wxy-xiyuan has quit IRC | 11:42 | |
*** hashar has joined #zuul | 11:57 | |
*** bolg has quit IRC | 12:04 | |
*** bhavikdbavishi has joined #zuul | 12:12 | |
*** roman_g has joined #zuul | 12:22 | |
roman_g | Hello team! Users question. I have 2 dependent changes in 2 repos - A and B. Change in repo B has proper Depends-On: xxxxxx pointing to the change in repo A. Code in repo B git-clones repo A internally and it currently clones 'master'. How could I properly utilize Zuul to do git-clone repo A for me, and apply the parent change over it? | 12:28 |
roman_g | *User's | 12:28 |
flaper87 | what ansible version does zuul-executor use by default? | 12:28 |
flaper87 | (when there are multiple versions installed, that is) | 12:28 |
flaper87 | is it the oldest or the latest? | 12:29 |
*** arxcruz|brb is now known as arxcruz | 12:29 | |
Shrews | flaper87: https://zuul-ci.org/docs/zuul/user/config.html#attr-job.ansible-version | 12:29 |
*** bhavikdbavishi has quit IRC | 12:35 | |
*** bhavikdbavishi has joined #zuul | 12:39 | |
flaper87 | Shrews: thanks | 12:39 |
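The attribute Shrews links to lets a job pin its Ansible version explicitly rather than relying on the operator's configured default. A sketch (job name hypothetical; the versions actually available depend on the executor's installation):

```yaml
- job:
    name: my-job
    # Selects which of the executor's installed Ansible versions runs this job.
    ansible-version: "2.8"
```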
*** bhavikdbavishi1 has joined #zuul | 12:42 | |
*** bhavikdbavishi has quit IRC | 12:43 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 12:43 | |
AJaeger | roman_g: Use required-projects - it's a list of repos that we download on disk for you, it's the same branch plus your depends-on changes on it. | 12:44 |
AJaeger | roman_g: So, just use that repo, you get it for free ;) | 12:44 |
AJaeger | I meant: checked out tree - and not entirely free but no magic for you to apply a change | 12:44 |
AJaeger | roman_g: you can pass that dir to your jobs like in https://opendev.org/zuul/nodepool/src/branch/master/playbooks/nodepool-functional-openstack/pre.yaml#L4 | 12:47 |
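Putting AJaeger's advice together: repo B's job declares repo A in required-projects, so Zuul places a checked-out tree with the Depends-On change already applied on the node, and the job points its tooling at that path instead of git-cloning master. A sketch with hypothetical project names and variable:

```yaml
- job:
    name: repo-b-integration
    required-projects:
      # Zuul checks this out on the node with any Depends-On
      # changes already applied.
      - example.org/myorg/repo-a
    vars:
      # Hypothetical variable the job's playbooks could consume
      # instead of cloning master themselves.
      repo_a_src_dir: "{{ ansible_user_dir }}/src/example.org/myorg/repo-a"
```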
*** bolg has joined #zuul | 12:50 | |
*** yoctozepto has joined #zuul | 12:54 | |
roman_g | AJaeger: thank you! Looking into it. | 12:56 |
*** bjackman has quit IRC | 12:59 | |
*** bolg has quit IRC | 12:59 | |
*** bolg has joined #zuul | 13:01 | |
*** bolg has quit IRC | 13:05 | |
*** jku has joined #zuul | 13:06 | |
*** jku has quit IRC | 13:06 | |
*** jank has joined #zuul | 13:07 | |
*** jank has quit IRC | 13:07 | |
tristanC | corvus: about zuul-tests.d, is there a change planned to match such test job when a job definition changes (to avoid adding the full zuul.yaml to the files list)? | 13:09 |
*** jank has joined #zuul | 13:10 | |
AJaeger | tristanC: already merged ;) | 13:11 |
AJaeger | tristanC: see https://review.opendev.org/669752 | 13:11 |
*** jpena|mtg is now known as jpena|off | 13:12 | |
*** jank has quit IRC | 13:14 | |
tristanC | AJaeger: thanks. So a tox-linters-test job would set tox-linters as parent to trigger when the tox-linters definition changes, right? | 13:15 |
*** jank has joined #zuul | 13:15 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Return dependency cycle failure to user https://review.opendev.org/672487 | 13:15 |
AJaeger | tristanC: not sure what happens if parent changes, best check implementation or ask corvus... | 13:16 |
*** bhavikdbavishi has quit IRC | 13:20 | |
tristanC | corvus: it seems like match-on-config-updates implies that a job is able to test itself. Wouldn't it make sense to let another job run when a job definition changes? e.g. tox-linters-test.match-on-job-update: [tox-linters] ? | 13:24 |
*** jpena|off is now known as jpena | 13:25 | |
*** jpena is now known as jpena|mtg | 13:27 | |
*** jamesmcarthur has joined #zuul | 13:55 | |
*** jeliu_ has joined #zuul | 14:07 | |
*** lennyb has quit IRC | 14:18 | |
*** jamesmcarthur has quit IRC | 14:25 | |
*** michael-beaver has joined #zuul | 14:29 | |
*** mattw4 has joined #zuul | 14:31 | |
*** rlandy|mtg has quit IRC | 14:32 | |
fungi | daniel2: depends on what you mean by "dockerized zuul setup" but sure you can run nodepool builders (including diskimage-builder) in a container. nodepool can also talk to container orchestration engines (kubernetes, openshift) to launch containers for jobs to run in, if that's what you're asking | 14:35 |
*** jpena|mtg is now known as jpena|off | 14:36 | |
corvus | tristanC: right, the match-on-config-update feature was designed to eliminate the need for '.zuul.yaml' in files matchers, so that when a job is updated, it is run. most jobs are self-testing. i'm having trouble imagining a job which tests another job without also being descended from that job. in zuul-jobs, we have a lot of test-foo jobs, but they are testing roles. a tox-linters-test job could inherit | 14:39 |
corvus | from tox-linters, and it should match on a job update to either. so i think even that use-case is covered. | 14:39 |
tristanC | corvus: so match-on-config-updates also matches parent job definition change? | 14:42 |
corvus | tristanC: afaik it should | 14:42 |
corvus | tristanC: it's basically implemented as a diff for the finalized job config; if anything about what it's about to run changes, it'll match | 14:43 |
tristanC | corvus: the use-case is to be able to test the pre/post phase of the job | 14:43 |
tristanC | corvus: for example, for a tox job, we might want to be able to set up a tools/tests-setup.sh, which would need to be done before the unittest pre phase | 12:44 |
corvus | tristanC: got it. yeah, that should work | 14:45 |
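The pattern corvus describes for exercising another job's pre/post phases is plain inheritance: because match-on-config-update diffs the finalized job config, a change to either the parent or the child triggers the test job. A sketch (names hypothetical):

```yaml
- job:
    name: tox-linters-test
    # Inheriting pulls the parent's finalized config into this job,
    # so an update to tox-linters also matches and runs this job.
    parent: tox-linters
    run: playbooks/assert-linters-behaviour.yaml
```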
*** igordc has joined #zuul | 14:49 | |
*** AshBullock has joined #zuul | 14:51 | |
*** mattw4 has quit IRC | 14:51 | |
AshBullock | Hey guys, we are trying to hook up kubernetes to nodepool. We see in the documented configs two labels, namespace and pod; we've added both of these, but are seeing "AttributeError: 'NoneType' object has no attribute 'create_namespace'" in the nodepool logs. How are these labels supposed to be set up? Thanks | 14:55 |
tristanC | AJaeger: there should be an exception in the logs about "Couldn't load client from config" | 14:56 |
tristanC | AshBullock: ^ (sorry AJaeger, wrong autocomplete) | 14:56 |
AshBullock | this is our config file for nodepool http://paste.openstack.org/show/754804/ | 14:59 |
*** mattw4 has joined #zuul | 15:00 | |
AshBullock | and the error we get is http://paste.openstack.org/show/754805/ | 15:00 |
AshBullock | and to confirm we do see the error you mentioned 2019-07-24 14:55:57,139 ERROR nodepool.driver.kubernetes.KubernetesProvider: Couldn't load client from config | 15:01 |
*** swest has quit IRC | 15:03 | |
tristanC | corvus: alright, we'll give this a try then. I guess we can re-run the pre and post phase of the parent in the run phase of the test job and do the assert in the child job | 15:03 |
tristanC | s/test job/child job/ | 15:04 |
tristanC | AshBullock: did you setup the ~nodepool/.kube/config file? | 15:05 |
AshBullock | have the kube config added yes | 15:06 |
AshBullock | Now receiving this error: "Failure","message":"namespaces is forbidden: User \"system:anonymous\" cannot create resource \"namespaces\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"namespaces"},"code":403} | 15:07 |
AshBullock | after updating my nodepool config | 15:08 |
AshBullock | to reference the kube context name correctly | 15:08 |
tristanC | AshBullock: perhaps there is something to enable in EKS to enable your service account to list/create resources | 15:09 |
*** jank has quit IRC | 15:09 | |
AshBullock | thanks, I'll look into that now | 15:09 |
tristanC | AshBullock: iirc eks requires an iam token to use the api from outside. is your nodepool-launcher service running in eks? | 15:12 |
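For reference, a nodepool kubernetes provider section with the two label types discussed above might look like the sketch below; the context and names are hypothetical, and the context must resolve to valid credentials in ~nodepool/.kube/config, otherwise the "Couldn't load client from config" error appears:

```yaml
providers:
  - name: k8s-provider
    driver: kubernetes
    context: my-eks-context      # must exist in the nodepool user's kubeconfig
    pools:
      - name: main
        labels:
          - name: kubernetes-namespace
            type: namespace      # hands the job an empty namespace
          - name: pod-fedora
            type: pod            # hands the job a single pod
            image: docker.io/fedora:28
```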
*** roman_g has left #zuul | 15:29 | |
AshBullock | thanks, managed to get it spinning up now, issue was the aws client was out of date and did not have the get-token command | 15:32 |
AshBullock | thanks for all the help | 15:33 |
clarkb | we might want to make note of that requirement in the docs? | 15:34 |
corvus | yeah, any additional info that could help future users would be great :) | 15:37 |
AshBullock | So I've got the containers running but get this error: main | MODULE FAILURE: error: You must be logged in to the server (Unauthorized). After installing kubectl on the executor, I can run kubectl commands as the nodepool user, but I assume the ansible run is using a virtual env. Any ideas how to solve this? | 15:42 |
AshBullock | this is running on the pre.yml tasks | 15:43 |
*** mattw4 has quit IRC | 15:44 | |
*** mattw4 has joined #zuul | 15:44 | |
AshBullock | which runs the zuul roles add-build-sshkey and prepare-workspace | 15:45 |
tristanC | AshBullock: nodepool creates a service account per zuul noderequest like so: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/kubernetes/provider.py#L155 | 15:46 |
tristanC | AshBullock: which may different from the one you had to configure for nodepool... | 15:47 |
AshBullock | so that created service account probably doesn't have access to the credentials? | 15:48 |
AshBullock | so how would I pass them through to the user? | 15:48 |
*** mattw4 has quit IRC | 15:49 | |
tristanC | AshBullock: it shouldn't have access to the credentials, to prevent the job from tampering with another job's resources. | 15:50 |
tristanC | AshBullock: it is meant to be a restricted account for the namespace created per job | 15:51 |
AshBullock | we're targeting hosts: all, should we be targeting hosts: pod as per this guide: https://www.softwarefactory-project.io/tech-preview-using-openshift-as-a-resource-provider.html | 15:51 |
tristanC | "hosts: all" should match the resources given to the job | 15:53 |
tristanC | the blog post uses "pod" instead just to be more explicit | 15:54 |
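A minimal job playbook for a pod-backed node, per the exchange above: "hosts: all" matches whatever resources nodepool attached, so it works for pods as well as VMs (the task content is hypothetical):

```yaml
- hosts: all
  tasks:
    - name: Run a command inside the pod
      command: echo hello-from-the-pod
```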
AshBullock | with containers is there a list of approved zuul roles to use? | 15:56 |
tristanC | AshBullock: most should work, except for the one using the "synchronize" module | 15:57 |
tristanC | AshBullock: here is how we copy the sources to pod: https://review.opendev.org/631402 | 15:58 |
*** AshBullock has quit IRC | 16:04 | |
clarkb | I have rechecked https://review.opendev.org/#/c/671858/3 since sphinx builds were fixed (just a heads up that that should be going in) | 16:06 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: Download-artifact: use the artifact type rather than name https://review.opendev.org/672557 | 16:09 |
*** hashar has quit IRC | 16:10 | |
*** AshBullock has joined #zuul | 16:13 | |
*** pcaruana has quit IRC | 16:13 | |
mordred | corvus: that also looks good - I can go either way | 16:20 |
corvus | mordred: it's a little more work, but ending up with "_type" is going to bother me less, so i'm perfectly happy to do it :) | 16:21 |
mordred | \o/ | 16:21 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add log browsing to build page https://review.opendev.org/671906 | 16:25 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Move artifacts to their own section https://review.opendev.org/672379 | 16:25 |
daniel2 | fungi: So I added the nodepool builder container. I'm not sure if you or anyone active here has done this, but I can't get the images directory created due to permission issues, even if it's put inside of the home /var/lib/nodepool. | 16:27 |
daniel2 | Docker log shows: PermissionError: [Errno 13] Permission denied: '/var/lib/nodepool/images/builder_id.txt' | 16:28 |
clarkb | daniel2: can you check the uid:gid ownership of that file and that of the nodepool-builder process? | 16:30 |
clarkb | the nodepool-builder records that file so that it identifies itself uniquely to zookeeper iirc. It will need to be able to write to that directory and file | 16:31 |
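The builder-id pattern clarkb describes can be sketched as follows. This is an illustrative reimplementation, not nodepool's actual code, but it shows why the builder must be able to write to images-dir on first start:

```python
import os
import tempfile
import uuid


def get_builder_id(images_dir):
    """Return a stable per-builder UUID, creating it on first start."""
    path = os.path.join(images_dir, "builder_id.txt")
    if not os.path.exists(path):
        # First start: this write is what needs permission on images_dir.
        with open(path, "w") as f:
            f.write(str(uuid.uuid4()))
    with open(path) as f:
        return f.read().strip()


images_dir = tempfile.mkdtemp()
first = get_builder_id(images_dir)
second = get_builder_id(images_dir)
print(first == second)
```

Once written, the same id is returned on every subsequent start, which is how the builder identifies itself uniquely to zookeeper.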
daniel2 | clarkb: so when docker runs, it does a whoami, but fails `whoami: cannot find name for user ID 10001` | 16:31 |
daniel2 | This is using the zuul/nodepool-builder image | 16:31 |
daniel2 | That folder is probably not the right permissions but not sure how to change that. I guess I could write a file to run before the command | 16:32 |
clarkb | I expect the nodepool-builder image wants you to mount in a volume or bindmount for that directory so that it persists between docker container instances | 16:32 |
clarkb | and you'll have to set permissions appropriately? | 16:32 |
daniel2 | clarkb: That's what I'm doing. | 16:32 |
daniel2 | using a bind mount | 16:32 |
clarkb | daniel2: you can docker exec a shell into the image to poke around and check things like permissions | 16:32 |
daniel2 | I can't because the container died due to nodepool-builder exiting. | 16:32 |
clarkb | you can run it then and change the command | 16:33 |
daniel2 | dunno why I didnt think of that | 16:33 |
daniel2 | must be the lack of sleep | 16:33 |
clarkb | just double confirming that uid is the one set by the image so that is the expected value | 16:35 |
daniel2 | clarkb: this is just strange, the permissions appear correct. | 16:38 |
daniel2 | Would it possibly be an issue with the host? | 16:38 |
pabelanger | is /var/lib/nodepool/images/ a mount? | 16:39 |
daniel2 | /dev/mapper/area50--vg-root on /var/lib/nodepool/images type ext4 (rw,relatime,errors=remount-ro,data=ordered) | 16:39 |
pabelanger | that is from inside container? | 16:40 |
daniel2 | yes | 16:40 |
pabelanger | and touch /var/lib/nodepool/images/foo.txt works? | 16:40 |
daniel2 | no, gives permission denied | 16:40 |
pabelanger | what does mount outside of container look like | 16:41 |
clarkb | and what are permissions of the directory | 16:41 |
clarkb | (to create a file you need w on the dir iirc) | 16:42 |
daniel2 | drwxrwxr-x 2 ubuntu ubuntu 4.0K Jul 24 04:50 images | 16:42 |
pabelanger | what is the uid of ubuntu? | 16:42 |
clarkb | that would be the issue I think | 16:42 |
daniel2 | 1000 | 16:42 |
pabelanger | that should be the same as user in container | 16:42 |
daniel2 | eheh | 16:42 |
daniel2 | Can I specify the user id in the docker compose file | 16:43 |
clarkb | we may need to modify our entrypoint to chown that | 16:43 |
daniel2 | ah I see | 16:43 |
pabelanger | I don't think we start a nodepool-builder in quickstart do we? | 16:43 |
daniel2 | no we don't. | 16:43 |
clarkb | pabelanger: we don't | 16:43 |
daniel2 | I added that myself. | 16:43 |
Shrews | quickstart uses static nodes so builder isn't necessary | 16:45 |
daniel2 | builder_1 | chown: changing ownership of '/var/lib/nodepool/images': Operation not permitted | 16:45 |
daniel2 | :D | 16:45 |
pabelanger | daniel2: https://opendev.org/windmill/ansible-role-nodepool/src/branch/master/tests/playbooks/templates/etc/systemd/system/nodepool-builder.service.j2 is how I do it directly with docker, -u 1001:1001, for the volumes | 16:46 |
*** mattw4 has joined #zuul | 16:46 | |
pabelanger | not sure how do to it with compose | 16:46 |
clarkb | pabelanger: that presumes you chown outside of docker though right? | 16:46 |
daniel2 | ohhh | 16:46 |
clarkb | pabelanger: it's the same problem in either case whether you change the uid or not | 16:46 |
clarkb | something has to set ownership on that dir an dlooks like the current image is failing to do so | 16:47 |
daniel2 | Well, I guess changing the id wouldn't work in nodepool-builder | 16:47 |
clarkb | daniel2: do you have more log context for the chown failure? | 16:47 |
daniel2 | no thats all it said | 16:47 |
pabelanger | clarkb: I can't remember. I'd have to look at the docs | 16:47 |
pabelanger | but I don't chown anything directly, docker does it | 16:47 |
pabelanger | daniel2: but that is using zuul/nodepool-builder images, so should work | 16:48 |
daniel2 | I could try and bind mount the service file | 16:48 |
daniel2 | oh no | 16:49 |
daniel2 | Thats to start with docker | 16:49 |
pabelanger | yah, this isn't using compose, just docker directly | 16:49 |
pabelanger | you could just try command manually, and see if it works | 16:50 |
daniel2 | https://shafer.cc/paste/view/0c89553c That was what I had in docker-compose.yaml when I tried with chown | 16:50 |
pabelanger | then, work to add it to compose | 16:50 |
clarkb | daniel2: did you chown it to ubuntu:ubuntu then? | 16:51 |
daniel2 | I got past it | 16:51 |
daniel2 | I set user: 1000:1000 in docker-compose section for builder | 16:52 |
clarkb | sure but if you didn't chown it to 1000:1000 in the first place would it have owrked? | 16:52 |
daniel2 | No, I didn't chown it until you guys had mentioned it | 16:52 |
daniel2 | before I wasn't doing anything outside of the norm | 16:52 |
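The fix daniel2 landed on can be expressed in compose like this; the uid/gid must match the owner of the bind-mounted host directory (the paths and values here are illustrative, not his actual file):

```yaml
services:
  nodepool-builder:
    image: zuul/nodepool-builder
    # Run as the host user that owns the bind mount (here ubuntu, uid 1000),
    # instead of the image's default uid 10001.
    user: "1000:1000"
    volumes:
      - /var/lib/nodepool/images:/var/lib/nodepool/images
```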
clarkb | fwiw our entrypoint for the opendev gitea images does an explicit chown of the mounts. I'm not seeing where we might do that for nodepool-builder | 16:52 |
clarkb | daniel2: what is the context of builder_1 | chown: changing ownership of '/var/lib/nodepool/images': Operation not permitted ? was that a chown you ran or a chown that it tried to run on its own? | 16:53 |
daniel2 | it was a chown I ran | 16:53 |
daniel2 | using the command config line in the docker-compose section | 16:53 |
pabelanger | clarkb: Yah, same. I have ansible chown / chmod the /var/lib/nodepool folder to a specific user, which is the uid of the user in the container. I assume compose has the same ability | 16:53 |
clarkb | gotcha so I think we may want to update nodepool/tools/uid_entrypoint.sh to chown things properly | 16:53 |
clarkb | pabelanger: its not compose's job, it is the entrypoint | 16:54 |
clarkb | at least with how we've set up gitea | 16:54 |
Shrews | perhaps the best course of action is to set images-dir in nodepool.yaml to some place that a) it has permission to write to, and b) has space to actually build images | 17:00 |
daniel2 | mkdir: cannot create directory '/var/lib/nodepool/.cache/image-create': Permission denied | 17:00 |
daniel2 | this is getting old D: | 17:00 |
Shrews | that will fix the builder_id.txt issue since it writes to images-dir | 17:00 |
Shrews | an external volume is probably best | 17:01 |
daniel2 | It is. | 17:01 |
daniel2 | I have no name!@7c14b606484d:/$ ls | 17:01 |
daniel2 | Nice hostname | 17:01 |
daniel2 | or username. | 17:01 |
clarkb | Shrews: yes the way it is done with other tools is to have the entrypoint do the chown | 17:01 |
clarkb | Shrews: it would also work to chown outside of the container | 17:01 |
* clarkb gets an example | 17:02 | |
clarkb | Shrews: https://opendev.org/opendev/system-config/src/branch/master/docker/gitea-init/entrypoint.sh#L17-L26 we could do that with nodepool's entrypoint | 17:04 |
Shrews | clarkb: but /var/lib is within the actual container, right? i'm suggesting an external volume mounted at run time | 17:04 |
clarkb | Shrews: yes | 17:04 |
clarkb | same as with /data/ in the gitea example | 17:05 |
*** AshBullock has quit IRC | 17:09 | |
*** igordc has quit IRC | 17:16 | |
*** igordc has joined #zuul | 17:17 | |
daniel2 | So I fixed one problem and created another :) | 17:23 |
openstackgerrit | James E. Blair proposed zuul/zuul-operator master: WIP: testing https://review.opendev.org/672567 | 17:24 |
*** sgw has quit IRC | 17:25 | |
fungi | daniel2: welcome to computers? ;) | 17:25 |
daniel2 | haha right | 17:25 |
fungi | pretty much describes my typical day | 17:25 |
daniel2 | at least I finished enough of the CI setup that we were able to close out that issue for the sprint. | 17:26 |
daniel2 | Thats why I was up so late, wanted to knock that out. | 17:26 |
fungi | awesome~! | 17:26 |
daniel2 | We moved the nodepool stuff to another issue. | 17:26 |
*** igordc has quit IRC | 17:27 | |
yoctozepto | tried asking on infra but maybe here is a better place - any idea why http://zuul.openstack.org/builds?project=openstack%2Fkolla-ansible&pipeline=periodic&pipeline=periodic-stable&branch=master&branch=stable%2Fstein&branch=stable%2Frocky&branch=stable%2Fqueens does not return? | 17:29 |
yoctozepto | it does if you replace kolla-ansible with kolla | 17:29 |
yoctozepto | or remove filter on periodic-stable | 17:30 |
yoctozepto | otherwise it does not work (or would take hours? waited several minutes already) | 17:30 |
fungi | it's not clear to me what that query means | 17:33 |
fungi | are those terms expected to be anded? ored? | 17:33 |
clarkb | fungi: it operates as an AND | 17:33 |
clarkb | (sorry lucene query language all caps habit) | 17:33 |
clarkb | I think that is why you get no results | 17:34 |
clarkb | you can't be in two pipelines | 17:34 |
fungi | are the types compared via and but multiple options within each type compared with or? | 17:34 |
clarkb | fungi: I don't think so | 17:34 |
* clarkb looks at the sql query | 17:34 | |
clarkb | hrm it moved | 17:35 |
AJaeger | clarkb, fungi: http://zuul.openstack.org/builds?project=openstack%2Fkolla-ansible&pipeline=periodic-stable is much simpler and shows the problem that yoctozepto has... | 17:36 |
AJaeger | yoctozepto: keep it simple, please ;) | 17:36 |
fungi | i'm guessing the expectation is that this should query "project:openstack/kolla-ansible AND pipeline:(periodic OR periodic-stable) AND branch:(master OR stable/stein OR stable/rocky OR stable/queens)" | 17:36 |
corvus | if someone wants to implement that, go for it, but that's not how it works :) | 17:37 |
yoctozepto | AJaeger: I did in the other channel, then I discovered this one loads in a couple of minutes | 17:37 |
fungi | ahh, sorry, my network connection here is going in and out, so my responses are lagging somewhat | 17:37 |
yoctozepto | ;-) | 17:37 |
clarkb | as a sanity check periodic-stable pipeline does have the mysql reporter listed | 17:37 |
yoctozepto | guys | 17:38 |
yoctozepto | but it works for kolla | 17:38 |
yoctozepto | magic: http://zuul.openstack.org/builds?project=openstack%2Fkolla&pipeline=periodic&pipeline=periodic-stable&branch=master&branch=stable%2Fstein&branch=stable%2Frocky&branch=stable%2Fqueens | 17:38 |
clarkb | AJaeger: that url works for me | 17:38 |
yoctozepto | hence someone has implemented it | 17:38 |
yoctozepto | but it does not want to work for kolla-ansible | 17:38 |
fungi | or this is undefined behavior | 17:38 |
yoctozepto | for no particular reason | 17:38 |
clarkb | looking at the api code it expects a singular pipeline | 17:38 |
yoctozepto | lolz, but it worked so great until I tried it on k-a | 17:39 |
yoctozepto | ;D | 17:39 |
yoctozepto | http://zuul.openstack.org/builds?project=openstack%2Fnova&pipeline=periodic&pipeline=periodic-stable&branch=master&branch=stable%2Fstein&branch=stable%2Frocky&branch=stable%2Fqueens | 17:39 |
yoctozepto | etc. | 17:39 |
yoctozepto | ;D | 17:39 |
fungi | but yes, the results there do seem to match the pseudoquery i wrote above, so possible it works that way by accident | 17:39 |
yoctozepto | it also does work for check+gate: http://zuul.openstack.org/builds?project=openstack%2Fkolla-ansible&pipeline=gate&pipeline=check&branch=master&branch=stable%2Fstein&branch=stable%2Frocky&branch=stable%2Fqueens | 17:40 |
corvus | if there are multiple values for a parameter, then they are treated as an "in" query | 17:40 |
fungi | i wonder if one of those branches or pipelines has no kolla-ansible matches and that's the difference | 17:40 |
yoctozepto | only not if you sprinkle periodic-stable | 17:41 |
yoctozepto | fungi: good one | 17:41 |
yoctozepto | lemme check | 17:41 |
AJaeger | yoctozepto: so, what's the smallest query that shows the problem? | 17:41 |
*** pcaruana has joined #zuul | 17:41 | |
yoctozepto | AJaeger: I wish I knew | 17:41 |
AJaeger | yoctozepto: yeah, get now results for the one I posted using kolla-ansible - was too impatient last time ;) | 17:41 |
clarkb | q = self.listFilter(q, buildset_table.c.pipeline, pipeline) | 17:42 |
clarkb | corvus: ^ that generates the "in" ? | 17:42 |
*** saneax has quit IRC | 17:42 | |
corvus | clarkb: yep | 17:42 |
corvus | if it's single, it's "==", otherwise it's "in" | 17:42 |
clarkb | ah yup I see the definition of listFilter now | 17:43 |
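The listFilter behaviour corvus describes ("==" for a single value, "in" for several) can be illustrated with a simplified stand-in; this is not Zuul's actual implementation, just the dispatch rule rendered as SQL-ish strings:

```python
def render_filter(column, value):
    """Render one query parameter the way listFilter dispatches it."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        if len(value) > 1:
            # Repeated URL parameters become an IN clause.
            quoted = ", ".join("'%s'" % v for v in value)
            return "%s IN (%s)" % (column, quoted)
        # A single-element list collapses to a plain equality.
        value = value[0]
    return "%s = '%s'" % (column, value)


print(render_filter("pipeline", ["periodic", "periodic-stable"]))
print(render_filter("project", "openstack/kolla-ansible"))
```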
yoctozepto | rocky/queens do not have any periodic-stable | 17:43 |
yoctozepto | but withouth them: http://zuul.openstack.org/builds?project=openstack%2Fkolla-ansible&branch=master&branch=stable%2Fstein&pipeline=periodic&pipeline=periodic-stable | 17:43 |
yoctozepto | still nothing | 17:43 |
*** saneax has joined #zuul | 17:43 | |
*** saneax has quit IRC | 17:44 | |
yoctozepto | AJaeger: you might like this, it's shorter ;p | 17:44 |
yoctozepto | I think I overloaded Zuul now | 17:44 |
yoctozepto | it says Fetching info... | 17:44 |
yoctozepto | for the previously working too now | 17:44 |
fungi | well, it's only zuul-web you're overloading, presumably | 17:45 |
yoctozepto | yeah, thought about that | 17:45 |
fungi | (if you are) | 17:45 |
fungi | separate daemon from the scheduler | 17:45 |
yoctozepto | I'm not well versed in zuul's components yet | 17:45 |
yoctozepto | I see, will have that in mind | 17:45 |
yoctozepto | (the next time I overload it) | 17:45 |
corvus | this is the query that has been running for 11 minutes: http://paste.openstack.org/show/754810/ | 17:46 |
clarkb | is it possible that seraching over (project, branch, pipeline) simply needs to be better indexed? | 17:46 |
clarkb | the query that works is only (project, pipeline) | 17:46 |
yoctozepto | clarkb: it works fast for kolla and nova for example | 17:47 |
* yoctozepto - zuul-web official overloader | 17:47 | |
corvus | wow it does construct the query differently for kolla and kolla-ansible | 17:47 |
* yoctozepto convicted for generating a 11 minute query :-( | 17:48 | |
fungi | huh, that's bizarre | 17:48 |
yoctozepto | maybe due to -? | 17:48 |
corvus | http://paste.openstack.org/show/754811/ | 17:48 |
yoctozepto | I mean: - | 17:48 |
corvus | first one is k-a, second is k | 17:48 |
yoctozepto | ah, on the mysql side | 17:49 |
yoctozepto | Using filesort and Using join buffer (Block Nested Loop) spoil the play :-( | 17:50 |
corvus | mordred, Shrews: ^ i'm stumped as to why the query planner would make that choice | 17:57 |
corvus | kolla-ansible has 11k buildsets in the db, and kolla has 10k. so they should both be in the index | 17:58 |
yoctozepto | corvus: could you try explain after replacing USE INDEX with FORCE INDEX? | 17:58 |
yoctozepto | corvus: you can also compare with nova as it is fine too | 17:58 |
corvus | yoctozepto: force index with k-a shows the same plan | 17:59 |
Shrews | what indexes are available on the zuul_build table? | 18:00 |
yoctozepto | corvus: yeah, was worth trying anyway, it protects agains full scan on the same table though... | 18:00 |
Shrews | it's doing a table scan on that table for both queries | 18:00 |
corvus | https://etherpad.openstack.org/p/Z0cucbdugf | 18:00 |
*** igordc has joined #zuul | 18:01 | |
* fungi is enjoying this impromptu crash course in database mysteries | 18:01 | |
Shrews | corvus: could you get a describe for the other two tables? | 18:02 |
* yoctozepto sharing fungi's enjoyment | 18:03 | |
corvus | done | 18:03 |
Shrews | what's the difference in those two queries? my eyes can't find it | 18:04 |
yoctozepto | Shrews: kolla vs kolla-ansible | 18:04 |
yoctozepto | changing project caused it | 18:04 |
corvus | yeah, only the string constant; no structural change | 18:04 |
Shrews | and they're both slow? | 18:04 |
yoctozepto | exactly | 18:04 |
yoctozepto | nope | 18:05 |
yoctozepto | kolla fast | 18:05 |
yoctozepto | kolla-ansible slow | 18:05 |
yoctozepto | nova fast | 18:05 |
yoctozepto | probably many more fast | 18:05 |
Shrews | oh. figuring out why the optimizer does what it does is dark magic. lemme see if i can spot anything obvious though. my guess is the cardinality is significantly different for the projects | 18:06 |
fungi | (de)optimizer | 18:07 |
Shrews | select count(*) from zuul_buildset where project = "openstack/kolla" <--- that and a similar count for "openstack/kolla-ansible" might be useful | 18:09 |
yoctozepto | and nova | 18:09 |
Shrews | or at least interesting | 18:09 |
fungi | 17:58 <corvus> kolla-ansible has 11k buildsets in the db, and kolla has 10k... | 18:09 |
yoctozepto | and nova has probably many more | 18:10 |
corvus | 46 | 18:10 |
corvus | exact numbers in etherpad now | 18:10 |
yoctozepto | ;D | 18:10 |
yoctozepto | k-a has an even number | 18:11 |
yoctozepto | the others have odd | 18:11 |
corvus | nova has the fast query, nova-specs is slow | 18:12 |
yoctozepto | and count is? | 18:13 |
corvus | yoctozepto: and nova-specs is 2153 -- odd :) | 18:13 |
yoctozepto | :-( | 18:13 |
Shrews | pipeline might make a difference in counts | 18:13 |
corvus | yoctozepto: but that is making me wonder about your '-' theory | 18:13 |
yoctozepto | corvus: yeah but why xD | 18:13 |
corvus | it's a longer form of something that's also in the index | 18:14 |
yoctozepto | I proposed that before you explained it is mysql query optimizer | 18:14 |
yoctozepto | ah, you think this way | 18:14 |
yoctozepto | well, it did try to use project_pipeline_idx | 18:15 |
yoctozepto | and cardinality is different/better | 18:15 |
yoctozepto | because as you observed we have a prefix | 18:15 |
yoctozepto | I would try out some others but zuul-web is still angry at me | 18:16 |
yoctozepto | actually it does not load for me at all atm | 18:16 |
clarkb | ya I think it is angry more globally | 18:16 |
clarkb | scheduler is still running so nothing should be lost | 18:16 |
yoctozepto | I hope so | 18:17 |
corvus | i can probably kill the queries | 18:17 |
clarkb | I checked the scheduler logs and it was busy and happy | 18:17 |
yoctozepto | but if mysql locked tables then it is bad anyway | 18:17 |
corvus | ah, there's only one long query right now | 18:17 |
corvus | running for 1566 seconds | 18:17 |
yoctozepto | kill it anyways | 18:17 |
corvus | just died on its own :) | 18:18 |
yoctozepto | ok ;-) | 18:18 |
yoctozepto | (yeah, right) | 18:18 |
corvus | if we have more questions, i'm happy to run 'explain' commands so we can find out without tying things up | 18:18 |
yoctozepto | zuul-web still angry | 18:18 |
yoctozepto | sure, let's go with more - dash examples | 18:18 |
yoctozepto | something small | 18:19 |
yoctozepto | karbor | 18:19 |
yoctozepto | karbor-dashboard | 18:19 |
yoctozepto | (well, relatively to nova) | 18:19 |
Shrews | so, when the pipeline is considered in the counts, the cardinality of the rows is much more different: 820 rows vs. 3307 | 18:20 |
Shrews | might be enough to cause the optimizer to choose a different plan | 18:20 |
Shrews | sadly, i've forgotten sooooo much about query optimization | 18:20 |
yoctozepto | Shrews: A vs B | 18:21 |
yoctozepto | A = ? B = ? | 18:21 |
Shrews | prepared statements might make sense here | 18:23 |
yoctozepto | Shrews: what cardinalities were those that you posted? | 18:24 |
yoctozepto | k vs k-a? | 18:24 |
yoctozepto | k-a vs k? | 18:24 |
yoctozepto | something vs nova? :D | 18:24 |
yoctozepto | (I can't do that myself, you know) | 18:24 |
Shrews | project name and pipeline counts from zuul_buildsets | 18:24 |
Shrews | select count(*) from zuul_buildset where project = "openstack/kolla-ansible" and pipeline IN ('periodic', 'periodic-stable'); | 18:24 |
Shrews | select count(*) from zuul_buildset where project = "openstack/kolla" and pipeline IN ('periodic', 'periodic-stable'); | 18:25 |
yoctozepto | so kolla simply has more here? | 18:25 |
*** panda has quit IRC | 18:25 | |
Shrews | yes. but nova has 3000+ too. | 18:26 |
yoctozepto | and that's why it is fast | 18:26 |
Shrews | no. just speculation | 18:26 |
yoctozepto | yeah, but quite possible | 18:26 |
yoctozepto | it actually used that index | 18:26 |
yoctozepto | with 820 | 18:27 |
yoctozepto | | 1 | SIMPLE | zuul_buildset | NULL | range | PRIMARY,project_pipeline_idx,project_change_idx | project_pipeline_idx | 1536 | NULL | 820 | 4.00 | Using index condition; Using where; Using temporary; Using filesort | | 18:27 |
yoctozepto | so it liked this one specifically | 18:27 |
yoctozepto | <corvus> yoctozepto: and nova-specs is 2153 -- odd :) | 18:27 |
yoctozepto | Shrews: can you check nova-specs filtered? | 18:27 |
Shrews | i get 0 with nova-specs | 18:28 |
corvus | nova-specs with periodic pipelines is 0 | 18:28 |
corvus | which makes sense | 18:28 |
yoctozepto | and it's slow | 18:28 |
yoctozepto | slowest 0 in history, been there, done that | 18:28 |
Shrews | with the smaller cardinality of that combo, it's picking a poor index (project_pipeline_idx) and we then scan 820 x 7093912 rows, vs just 7093912 | 18:31 |
Shrews | is that a needed index? | 18:31 |
yoctozepto | we can exclude it | 18:32 |
corvus | Shrews: which one, project_pipeline_idx? | 18:32 |
*** panda has joined #zuul | 18:33 | |
Shrews | i'd have to get a mysqldump of that data and then remember a whole bunch of stuff before i could say what a fix is | 18:33 |
Shrews | corvus: yeah | 18:33 |
yoctozepto | IGNORE INDEX (blah) | 18:33 |
corvus | Shrews: i think it's only there to speed up queries like this :) | 18:34 |
Shrews | corvus: seems to do the opposite :) | 18:34 |
yoctozepto | if you don't want to remove it entirely, try whether IGNORE INDEX after that join will help us for now | 18:35 |
corvus | yoctozepto: yeah that seems to switch to the other form | 18:35 |
yoctozepto | YAY | 18:35 |
corvus | 2614 rows in set (9.24 sec) | 18:35 |
yoctozepto | reasonable | 18:35 |
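The IGNORE INDEX experiment above is MySQL-specific, but the effect is easy to reproduce with the stdlib sqlite3 bindings, where the `NOT INDEXED` qualifier plays the same role. A minimal sketch (the table and index names are simplified stand-ins for zuul_buildset and project_pipeline_idx; MySQL's actual spelling is `IGNORE INDEX (project_pipeline_idx)`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE buildset (id INTEGER PRIMARY KEY, project TEXT, pipeline TEXT)"
)
conn.execute("CREATE INDEX project_pipeline_idx ON buildset (project, pipeline)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the access path in their last column
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Left alone, the optimizer walks project_pipeline_idx for the equality match...
indexed = plan("SELECT * FROM buildset WHERE project = 'openstack/kolla-ansible'")
# ...while NOT INDEXED (the SQLite analogue of MySQL's IGNORE INDEX)
# forces it back to a plain table scan
scanned = plan(
    "SELECT * FROM buildset NOT INDEXED WHERE project = 'openstack/kolla-ansible'"
)
```

Flipping between the two plans is exactly what the IGNORE INDEX trial above did, just from the hint's other direction.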
*** hwangbo has joined #zuul | 18:36 | |
yoctozepto | still, could probably be better if things were indexed more optimally for this type of query | 18:36 |
corvus | it also seems that removing the branch terms uses the better query | 18:36 |
corvus | yeah, i think the thinking for the project_pipeline_idx was that it should help this case, because we're giving it a project and a pipeline, so it should be able to get the 820 buildsets that match, then join with the builds. | 18:38 |
yoctozepto | ¯\_(ツ)_/¯ | 18:38 |
yoctozepto | ^ that's mysql to us | 18:38 |
corvus | yoctozepto, Shrews: using a single pipeline helps as well | 18:39 |
corvus | i wonder if indexing the project and pipeline separately would help? | 18:39 |
yoctozepto | corvus: http://zuul.openstack.org/builds?project=openstack%2Fkolla-ansible&pipeline=periodic-stable&branch=master&branch=stable%2Fstein&branch=stable%2Frocky&branch=stable%2Fqueens - seems slow ? | 18:40 |
yoctozepto | ah not that much | 18:40 |
yoctozepto | like 15 s | 18:40 |
yoctozepto | so fine indeed | 18:40 |
yoctozepto | I got impatient waiting minutes for the bad one | 18:40 |
yoctozepto | just like AJaeger did | 18:41 |
corvus | that's actually a 3rd query plan | 18:41 |
corvus | yoctozepto: http://paste.openstack.org/show/754812/ | 18:42 |
corvus | when i switched to 1 pipeline, i did 'periodic' and got the 'fast' one. your switching to 'periodic-stable' got us a new 'medium' one :) | 18:42 |
*** tosky has quit IRC | 18:42 | |
corvus | (it looks sort of like the 'slow' one, but it's a 'ref' rather than a 'range' scan) | 18:42 |
yoctozepto | lol | 18:43 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: WIP: debug zuul-operator-functional-k8s job https://review.opendev.org/672576 | 18:43 |
clarkb | unrelated but do we need to edit zuul's pipeline configs so that rechecked changes go into the gate? | 18:44 |
yoctozepto | I don't get this thingy | 18:45 |
yoctozepto | we have | 18:45 |
yoctozepto | INNER JOIN zuul_buildset ON zuul_buildset.id = zuul_build.buildset_id | 18:45 |
yoctozepto | [ FROM zuul_build USE INDEX (PRIMARY) ] | 18:45 |
tristanC | jeliu_: corvus: we shouldn't need the operator-sdk to test the zuul-operator. It seems like the cli should be used for local dev... I'll have a look why the pod doesn't start now | 18:45 |
yoctozepto | buildset has | 18:45 |
yoctozepto | PRIMARY KEY (`id`), | 18:45 |
corvus | tristanC: agreed, and thanks! i think jeliu_ was also just looking into that too | 18:45 |
yoctozepto | build has | 18:45 |
*** fdegir has quit IRC | 18:45 | |
yoctozepto | PRIMARY KEY (`id`), | 18:45 |
yoctozepto | KEY `buildset_id` (`buildset_id`), | 18:45 |
yoctozepto | yet we get: | 18:45 |
yoctozepto | | 1 | SIMPLE | zuul_build | NULL | ALL | NULL | NULL | NULL | NULL | 7094344 | 0.00 | Using where; Using join buffer (Block Nested Loop) | | 18:45 |
yoctozepto | it's counterintuitive for me atm | 18:46 |
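The counterintuitive plan above is what a pinned index produces: once zuul_build is told to use only its primary key (the `USE INDEX (PRIMARY)` fragment quoted earlier), the buildset_id index is off limits for the join, and the optimizer has nothing left but a full scan. A sketch of the same effect with stdlib sqlite3, where `NOT INDEXED` is the closest analogue of `USE INDEX (PRIMARY)` since it leaves only the rowid usable (table names are simplified stand-ins):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Disable transient auto-indexes so the "hint" truly starves the join
conn.execute("PRAGMA automatic_index = OFF")
conn.execute("CREATE TABLE buildset (id INTEGER PRIMARY KEY, project TEXT)")
conn.execute("CREATE INDEX project_idx ON buildset (project)")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, buildset_id INTEGER)")
conn.execute("CREATE INDEX buildset_id_idx ON build (buildset_id)")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Free choice: the join probes build through its buildset_id index
free = plan(
    "SELECT count(*) FROM buildset JOIN build"
    " ON build.buildset_id = buildset.id"
    " WHERE buildset.project = 'openstack/kolla-ansible'"
)
# Hinted: with secondary indexes forbidden on build, the join degenerates
# into a scan of the whole build table (the 7-million-row scan in the log)
hinted = plan(
    "SELECT count(*) FROM buildset JOIN build NOT INDEXED"
    " ON build.buildset_id = buildset.id"
    " WHERE buildset.project = 'openstack/kolla-ansible'"
)
```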
*** fdegir has joined #zuul | 18:46 | |
yoctozepto | damn lol | 18:46 |
yoctozepto | USE INDEX | 18:46 |
yoctozepto | what happens if drop this bastard? | 18:46 |
yoctozepto | if you* drop | 18:46 |
corvus | yoctozepto: language :) | 18:46 |
yoctozepto | sorry | 18:47 |
yoctozepto | non-native speaker here, I probably feel it differently | 18:47 |
corvus | yoctozepto: that corrected some very bad queries we used to have | 18:47 |
yoctozepto | hmm, do you have examples? | 18:47 |
yoctozepto | did they use buildset_id or job_name_buildset_id_idx? | 18:48 |
yoctozepto | maybe let's add at least buildset_id | 18:48 |
yoctozepto | to the list | 18:48 |
corvus | yoctozepto: https://review.opendev.org/605170 says it was the build list query we were fixing | 18:49 |
corvus | we could probably find more detail in irc logs around that | 18:49 |
yoctozepto | I checked https://dictionary.cambridge.org/pl/dictionary/english/bastard some dictionaries really list that as offensive, sorry | 18:50 |
corvus | yoctozepto: no prob :) | 18:50 |
corvus | yoctozepto: but i think that may have been the query without any project filters | 18:50 |
yoctozepto | fun thing is the best translations to Polish are not offensive language at all, they are actually much politer than what you actually hear | 18:51 |
yoctozepto | "MySQL's query optimizer is choosing a poor index to use when" | 18:51 |
yoctozepto | poor index is not very precise :D | 18:51 |
corvus | yoctozepto: agreed. though i'm not sure we could characterize our current problem with 100% precision :) | 18:51 |
yoctozepto | it's a very generic method anyway | 18:52 |
yoctozepto | this probably hit some things good, some things bad | 18:52 |
yoctozepto | corvus: well, forcibly excluding the best JOIN INDEX | 18:53 |
yoctozepto | is a BAD IDEA | 18:53 |
yoctozepto | (TM) | 18:53 |
corvus | if we drop use index, things get very fast indeed, for k-a, 2614 rows in set (0.99 sec) | 18:53 |
yoctozepto | can you re-explain the queries without this? | 18:54 |
corvus | yoctozepto: line 114 of etherpad | 18:54 |
yoctozepto | can you do that for kolla and nova and nova-specs too? | 18:55 |
yoctozepto | and possibly kolla-ansible in the "medium case"? | 18:55 |
yoctozepto | (interested in how mysql goes about it) | 18:56 |
yoctozepto | | 1 | SIMPLE | zuul_build | NULL | ref | buildset_id | buildset_id | 5 | zuul.zuul_buildset.id | 8 | 100.00 | NULL | | 18:56 |
yoctozepto | ^ this, sir, is pure heaven | 18:56 |
corvus | i'm about 30 mins late for lunch now, maybe Shrews can help? | 18:56 |
yoctozepto | corvus: sorry, I am actually planning my bedtime now | 18:57 |
Shrews | we still may want to consider prepared statement for this if it's executed a lot | 18:57 |
Shrews | if it's rare/on-demand... meh | 18:57 |
corvus | yoctozepto: i'm pretty sure where that breaks down is when we do the same queries without any project, pipeline, or branch selection; ie, what you get when you first hit the builds page without entering any search terms | 18:57 |
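The landing-page case corvus describes can be seen in miniature: reading the newest 50 rows straight off the primary key needs no sort at all, while ordering on anything unindexed materializes and sorts everything first. A sqlite3 sketch with a stand-in schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, result TEXT)")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Newest-first off the primary key: a reverse rowid walk that stops at 50,
# which is why the unfiltered builds page is cheap when driven this way
by_id = plan("SELECT * FROM build ORDER BY id DESC LIMIT 50")
# Ordering on an unindexed column: a full scan plus a temporary sort
by_result = plan("SELECT * FROM build ORDER BY result LIMIT 50")
```

The hint preserved exactly the first shape for the no-filter query; the question in the log is what it costs everywhere else.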
fungi | it's on-demand insofar as folks are querying it via the builds page on the web dashboard | 18:57 |
yoctozepto | on-demand, my case possibly rare but I wanted to propose it as CI health check | 18:57 |
Shrews | glad my suspicion bore us some fruit though | 18:57 |
yoctozepto | corvus: it can be checked easily | 18:58 |
corvus | yep; someone needs to take over for me though; i need food | 18:58 |
corvus | biab | 18:58 |
yoctozepto | Shrews/fungi? | 18:59 |
Shrews | yoctozepto: sorry, got distracted... what do you need? | 19:00 |
yoctozepto | let's have explains for completeness | 19:00 |
yoctozepto | without the use index | 19:00 |
yoctozepto | corvus produced one at L114 | 19:00 |
yoctozepto | would like to look at kolla and nova cases too | 19:01 |
yoctozepto | and maybe the no-WHERE case as corvus suggested? | 19:01 |
Shrews | lemme look | 19:01 |
yoctozepto | (apart from tenant WHERE probably) | 19:01 |
Shrews | yoctozepto: posted nova and kolla explains | 19:05 |
yoctozepto | hmm, you got the same color as I have ;D | 19:06 |
yoctozepto | ok, it did them all in the same way | 19:06 |
yoctozepto | which is... good | 19:06 |
Shrews | and why would we want no WHERE clause? | 19:06 |
yoctozepto | not that 'Using index condition; Using where; Using temporary; Using filesort' is very good but still | 19:07 |
yoctozepto | Shrews: corvus suggested main page was slow without the use index | 19:07 |
yoctozepto | could be* | 19:07 |
yoctozepto | (not was) | 19:07 |
yoctozepto | it should run just the tenant where | 19:07 |
yoctozepto | but not the others | 19:07 |
Shrews | personally, i'm not interested in complex, slow queries without a where clause | 19:08 |
Shrews | we're doing something wrong if we are doing that | 19:08 |
yoctozepto | Shrews: it's the curse of on-demand stuff | 19:09 |
yoctozepto | Shrews: in case you want to take a look: http://zuul.openstack.org/builds | 19:10 |
yoctozepto | this is what we users are filtering | 19:10 |
mordred | Shrews: I'm back from lunch - it seems like this is a topic that someone might reasonably expect me to also look at? | 19:11 |
yoctozepto | mordred: be my guest | 19:11 |
* mordred is still reading scrollback | 19:11 | |
Shrews | mordred: i think the immediate need has dissipated | 19:11 |
mordred | ok. there's a LOT of scrollback | 19:12 |
fungi | we painted an elder sign on the side of the database and it seems to be keeping the great old ones on the other side of the portal now | 19:12 |
mordred | oh thank god | 19:12 |
mordred | I was worried I was going to have to page in the mysql optimizer internals again | 19:13 |
yoctozepto | fungi: fantasy RPG fan? | 19:13 |
yoctozepto | mordred: we kind of found the culprit | 19:13 |
fungi | yoctozepto: or h.p. lovecraft stories. take your pick | 19:13 |
yoctozepto | fungi: or both? | 19:13 |
yoctozepto | mordred: USE INDEX(PRIMARY) kills our best available JOIN index | 19:14 |
Shrews | mordred: it's been over 10 years for both of us. i question any of the knowledge that we've paged out at this point | 19:14 |
mordred | Shrews: yeah. | 19:14 |
mordred | yoctozepto: something was doing an explicit index hint? | 19:14 |
yoctozepto | mordred: yeah, the very generic method | 19:15 |
yoctozepto | https://review.opendev.org/#/c/605170/2/zuul/driver/sql/sqlconnection.py | 19:15 |
yoctozepto | wish the commit message was a bit less enigmatic | 19:15 |
mordred | ah - yeah - I think I remember chatting about that at the time. 9 times out of 10 index hints wind up being the wrong choice - I think at the time it was presenting as the strange case where it was needed - but now things have shifted again (which is usually the issue with index hints - most of the time humans can't keep up with the optimizer in terms of dealing with a changing data set) | 19:17 |
yoctozepto | mordred: optimizer changes too | 19:17 |
yoctozepto | heuristic change | 19:17 |
yoctozepto | data changes | 19:18 |
mordred | http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2018-09-25.log.html#t2018-09-25T16:25:10 | 19:18 |
mordred | there's the original conversation | 19:18 |
fungi | yeah, we're likely running a newer mysql/mariadb in opendev than we were back then | 19:18 |
fungi | so entirely possible the optimizer is no longer the same | 19:18 |
yoctozepto | "doing a full table scan to check the tenant" | 19:19 |
yoctozepto | not happening today | 19:20 |
mordred | well - that's good at least :) | 19:20 |
fungi | ahh, actually we're using a trove instance for the db, so i doubt the version has been updated | 19:21 |
yoctozepto | "a box of Sakila memberberries" | 19:21 |
mordred | yoctozepto: I set them right next to my little glass dolphin sculpture | 19:26 |
mordred | of course, now I can't remember where either of them are | 19:26 |
fungi | in a storage unit somewhere by the airport | 19:26 |
mordred | oh right | 19:26 |
mordred | although it turns out "by" in this case is give or take an hour | 19:27 |
fungi | for atlanta, that's practically next door? | 19:27 |
mordred | we opted for a unit in kennesaw, because it was super cheap and near one of my cousins | 19:27 |
yoctozepto | ah, guys | 19:27 |
yoctozepto | back to real life | 19:27 |
mordred | :) | 19:27 |
* mordred disturbs the real work | 19:27 | |
yoctozepto | what about reverting that one change | 19:28 |
yoctozepto | and observing? | 19:28 |
fungi | we would only need to restart zuul-web with that revert applied, right? | 19:29 |
fungi | if so, doesn't seem terribly risky | 19:29 |
mordred | well - I don't think we should revert, should we? we should only remove the with_hint line? | 19:30 |
mordred | or have I not paged in enough of the backscroll? | 19:30 |
mordred | NEVERMIND | 19:30 |
mordred | the other bits are just formatting change | 19:30 |
fungi | that's what reverting 605170 would be, right | 19:30 |
mordred | yeah | 19:31 |
mordred | want me to propose a revert? | 19:31 |
fungi | whoever wants to push the button, i'm happy to review | 19:31 |
fungi | though i was thinking we could just hand-apply it on zuul.opendev.org and restart zuul-web | 19:33 |
mordred | fungi: that's also a fine choice | 19:33 |
mordred | fungi: we could do those in parallel even | 19:33 |
fungi | and see if it has the anticipated performance impact | 19:33 |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Revert "Speed up build list query under mysql" https://review.opendev.org/672581 | 19:33 |
yoctozepto | but don't forget to change it permanently later | 19:33 |
clarkb | ya the hand revert and restart is a common thing we've done when checking things like memory leaks | 19:34 |
mordred | that gives us a place to discuss | 19:34 |
clarkb | I would do it that way to quickly get results without ending up with revert revert revert revert commits | 19:34 |
yoctozepto | yeah, it needs some time for observations | 19:34 |
fungi | yoctozepto: right, i mean hand-apply the proposed change without approving it straight away | 19:34 |
fungi | still propose it for review so we have a place to better record our observations | 19:34 |
*** hwangbo has quit IRC | 19:34 | |
fungi | which mordred has done | 19:35 |
openstackgerrit | Merged zuul/zuul master: Update xterm to >= 3.14.5 https://review.opendev.org/671858 | 19:39 |
clarkb | I'll keep an eye on ^ | 19:39 |
clarkb | was tested with the check queue results so should be fine | 19:40 |
*** hwangbo has joined #zuul | 19:50 | |
corvus | no way | 19:54 |
fungi | no way what? | 19:54 |
corvus | the solution to the problem is to go back to when things were even worse? | 19:54 |
corvus | i sunk several hours of analysis into this, including why that revert would be a bad idea | 19:55 |
corvus | and i go out for lunch, and come back to find that in my absence, folks have just decided to ignore that? | 19:55 |
fungi | ahh, i misunderstood that was the conclusion | 19:55 |
corvus | 19:31 < mordred> want me to propose a revert? | 19:56 |
corvus | 19:31 < fungi> whoever wants to push the button, i'm happy to review | 19:56 |
fungi | review by testing it, to see if that made it worse/better | 19:57 |
fungi | (push revert button generating the revert, not button to approve it) | 19:58 |
corvus | a year ago, that query took 20 seconds. why would it be faster now? | 19:58 |
corvus | 50 rows in set (1 min 14.12 sec) | 19:59 |
fungi | it sounded like the theory was that changes in relative table sizes have caused the optimizer to choose different indices now than they did in october, but i have likely misunderstood | 19:59 |
corvus | fungi: i tested it ^ | 19:59 |
corvus | that's going to happen everytime someone goes to the build page http://zuul.openstack.org/builds | 20:01 |
fungi | got it, so 672581 will cause the builds page default query to take 20s to load on opendev's deployment | 20:02 |
corvus | fungi: no, it would have taken 20 seconds a year ago, it would take 1 minute 14 seconds now. | 20:02 |
clarkb | 74.12 seconds now looks like | 20:02 |
fungi | well, even 20s would not be great, agreed | 20:03 |
fungi | so yes the revert does seem to not be a solution | 20:03 |
fungi | mordred: yoctozepto: Shrews: ^ | 20:03 |
corvus | it will make one arcane and rarely used query faster at the cost of making the simple query used many times per day extremely slow | 20:04 |
Shrews | i just returned from physical therapy, catching up. what's being reverted? | 20:04 |
corvus | Shrews: nothing -- https://review.opendev.org/672581 | 20:05 |
mordred | nothing. corvus already analyzed it and determined it's a terrible idea | 20:05 |
Shrews | oh yeah. that wasn't the solution | 20:06 |
corvus | with my limited knowledge, i can think of 2 ways to proceed -- 1) repeat the process from a year ago and try to come up with a better set of indexes that works in all cases; or 2) maybe we look into keeping the hint where we aren't filtering by project, etc, but drop it when we are | 20:07 |
corvus | option 2 seems kind of hacky, like we're doubling down on the "try to beat the optimizer" bet. but it also seems like it might be practical. | 20:08 |
corvus | and i agree with mordred in principle; i would love to not try to beat the optimizer | 20:08 |
fungi | but by default the optimizer is selecting a less optimal query for the default page view | 20:09 |
mordred | corvus: I'm still digesting - but I agree - we're already in beat-the-optimizer land, so I don't think 2 is any WORSE - and if we know in code the difference between filtering by project and not, sending two different queries to the db is a fine choice | 20:09 |
mordred | like, it's not uncommon for the answer to hard query optimization to be "put more logic in the app and ask the database different questions" - even though it frequently feels wrong | 20:10 |
corvus | mordred: okay, i think it'd be great if you continued to digest this (https://etherpad.openstack.org/p/Z0cucbdugf has a bunch of queries you can run on the prod zuul db) 'cause you and Shrews are gonna be more likely to synthesize an option 3 than i am. but while you work on that, i can try regression testing a bunch of queries against the use/don't-use hint idea and see if that's feasible. | 20:12 |
mordred | corvus: yeah. I am now reading the etherpad and enjoying it | 20:12 |
corvus | also, if anyone else wants to do this, i'm sure we can put up a copy of the db somewhere; there isn't a whit of sensitive data in it. | 20:13 |
Shrews | corvus: i was about to suggest letting me mysqldump the data and play with it locally. | 20:13 |
Shrews | of course, now i have to remember how to use mysqldump :/ | 20:13 |
corvus | Shrews: ++ | 20:13 |
clarkb | mysqldump -u foo -p databasename | gzip -9 > foo.sql.gz | 20:14 |
clarkb | roughly | 20:14 |
Shrews | hrm, how can i pull that out of the container | 20:14 |
corvus | Shrews: no container on this host | 20:14 |
clarkb | I don't think it is in a container | 20:14 |
Shrews | oh wait | 20:14 |
Shrews | yah | 20:14 |
Shrews | nm | 20:14 |
corvus | least, not yet :) | 20:15 |
corvus | (but also, ftr, we have figured out how to do that; it's a bit wonky with lots of weird shell quoting; we do that for gitea) | 20:15 |
fungi | Shrews: yeah, the database is in a trove instance, so you need to know the hostname along with the credentials | 20:16 |
fungi | but can be found in the zuul configs in /etc | 20:17 |
mordred | corvus: so - just so I can make sure I'm understanding what's going on from the etherpad ... | 20:17 |
mordred | the first query is an example of with the hint but filtering by project | 20:17 |
mordred | which shows the results of 820 * 7093912 rows which == a lot of rows | 20:18 |
Shrews | mordred: the query plan changes based on project name (which caused me to suspect cardinality of that data). the one you reference there is the slow plan | 20:19 |
corvus | mordred: ah, the hint didn't really come into the conversation until really late. line 114. | 20:19 |
mordred | wait - both .. yeah | 20:19 |
mordred | this is why I'm talking out loud :) | 20:19 |
corvus | mordred: so, yes that's true for the first query. but it's also true for the second query. but then what Shrews said -- the only diff between those 2 queries is the project name. | 20:19 |
Shrews | mordred: corvus: anyone else: data dump in my home dir on zuul01 (zuul.sql.gz) | 20:20 |
fungi | thanks Shrews! | 20:20 |
corvus | cool; if anyone without access to that server wants it, i can put it somewhere public | 20:21 |
Shrews | 5.7.18 MySQL Community Server | 20:23 |
Shrews | for those playing along | 20:23 |
Shrews | mordred: we could put everything in ndb | 20:25 |
* Shrews waits for hurled wet critters | 20:25 | |
mordred | Shrews: I was actually thinking mongo | 20:26 |
mordred | (but i could totally kill this with NdbRecord) | 20:26 |
corvus | yeah, this is the kind of data it was made for | 20:31 |
Shrews | oh wow. 5.7.18 is really old | 20:34 |
mordred | yeah. when we start deploying zuul from our container images I TOTALLY want to reassess our db backend | 20:34 |
corvus | yeah, that's why we're tending toward mariadb containers for new things | 20:34 |
*** panda has quit IRC | 20:35 | |
Shrews | 5.7.27 is the closest archive download available. will try that | 20:35 |
*** panda has joined #zuul | 20:37 | |
mordred | Shrews: your user on servers is uncapitalized which confuses me | 20:45 |
Shrews | keeps the feds off my trail | 20:46 |
* mordred decapitalizes | 20:46 | |
mordred | corvus: I can't believe I'm about to say this - but we should change this to be a subquery | 20:54 |
corvus | mordred: feeling feverish? | 20:55 |
mordred | I know. but rewriting the first query as a subquery returns in 0.03 seconds and looks at 1536 * 5 rows | 20:55 |
mordred | I'm still experimenting, mind you - so let's consider that an anecdote | 20:56 |
mordred | for now | 20:56 |
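The shape of the rewrite mordred is experimenting with looks roughly like the sketch below (hedged: the real Zuul query selects many more columns and carries pipeline, branch, and tenant filters). Both forms return the same rows; the point is that they invite different plans from the optimizer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE buildset (id INTEGER PRIMARY KEY, project TEXT);
    CREATE TABLE build (id INTEGER PRIMARY KEY, buildset_id INTEGER);
    INSERT INTO buildset VALUES (1, 'openstack/kolla-ansible'), (2, 'openstack/nova');
    INSERT INTO build VALUES (10, 1), (11, 1), (12, 2);
    """
)

project = "openstack/kolla-ansible"

# Original form: a straight join filtered on the buildset side
joined = conn.execute(
    "SELECT build.id FROM build"
    " JOIN buildset ON buildset.id = build.buildset_id"
    " WHERE buildset.project = ? ORDER BY build.id",
    (project,),
).fetchall()

# Rewritten form: resolve the matching buildset ids first,
# then fetch builds against that small id set
sub = conn.execute(
    "SELECT id FROM build WHERE buildset_id IN"
    " (SELECT id FROM buildset WHERE project = ?) ORDER BY id",
    (project,),
).fetchall()
```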
corvus | mordred, Shrews: i have some initial results from my regression testing on dropping the hint. as expected, so far it's only a problem for the queries which have no project or pipeline conditionals. if you have either of those, it's good. if you have a project, it's excellent (0 seconds). if you have a pipeline without a project, it's okay (1-5 seconds). that makes me wonder if an independent pipeline | 20:58 |
corvus | index would additionally be helpful (we only have (project, pipeline) right now) | 20:58 |
mordred | I doubt it - the pipeline cardinality is going to be super low - it's really only valuable combined with something else | 20:59 |
corvus | oh that's a good point | 21:00 |
mordred | oh - let me check pipeline without project in my subquery experiment | 21:00 |
corvus | i think i should improve this script a little bit to make it more automated, and so we can add in more of the fields we search (job, build, etc), and we can also do apples/apples with mordred's subquery idea | 21:01 |
mordred | 2.8 seconds for pipeline without project. 0.29 seconds for nova | 21:01 |
corvus | mordred: branch or no? | 21:02 |
mordred | yeah - those all have branches | 21:02 |
mordred | or - a list of them | 21:02 |
corvus | mordred: then that's the same time i got (2.35s) | 21:02 |
mordred | awesome | 21:02 |
corvus | mordred: so what's the subquery like with only "zuul_buildset.tenant='openstack'" in the where clause? | 21:03 |
mordred | checking | 21:04 |
*** pcaruana has quit IRC | 21:05 | |
mordred | 8.69 seconds | 21:05 |
mordred | of course, that produces 7.2 million records | 21:06 |
corvus | well, it's better than 74, but not better than 0 | 21:07 |
corvus | mordred: throw a 'limit 50' on the end there :) | 21:07 |
corvus | (that query with the hint and limit 50 is 0.00 seconds) | 21:07 |
mordred | it's no better with limit because there is no support (at least in 5.7) for limit in a subquery | 21:08 |
mordred | oh - although I guess it's actually not correct to limit in the subquery in this case | 21:08 |
mordred | corvus: it's possible we might actually have a collection of different queries that are each better for different use cases | 21:09 |
corvus | mordred: well, also i'm wondering if the subquery is producing pretty similar results to the single query but without the hint | 21:10 |
mordred | it might be - except for the limit 50 case with the hint where that's smoking the subquery | 21:11 |
Shrews | are you both testing on the production db? i just remembered sometimes running ANALYZE on a table makes a difference on funky path decisions. might want to give that a whirl | 21:12 |
Shrews | just got my local db loaded | 21:12 |
mordred | Shrews: I am testing on the production db | 21:12 |
corvus | yep | 21:12 |
mordred | which is a sentence I realize is a crazy sentence to type | 21:12 |
Shrews | how very infra-responsible of you two :) | 21:13 |
corvus | it's not *that* important of a db :) | 21:13 |
* fungi wonders if mariadb has added support for freudian_analyze | 21:13 | |
mordred | corvus, Shrews: I put a couple of subquery examples at the bottom | 21:13 |
fungi | er, i should have said psychoanalyze ;) | 21:14 |
corvus | mordred: is there a case where you think the subquery is better than not? | 21:14 |
mordred | corvus: we could totally get around the limit limitation by performing it as 2 queries - one with a limit on the buildset table to produce a list of ids, then a second query with a constructed in list with its own limit | 21:15 |
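mordred's two-step idea can be sketched like this (hypothetical helper; the real queries carry many more filters and columns): limit on the buildset side first, then expand the resulting ids into a constructed IN list for the build fetch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE buildset (id INTEGER PRIMARY KEY, project TEXT);
    CREATE TABLE build (id INTEGER PRIMARY KEY, buildset_id INTEGER);
    INSERT INTO buildset VALUES (1, 'openstack/kolla'), (2, 'openstack/kolla'), (3, 'openstack/nova');
    INSERT INTO build VALUES (10, 1), (11, 2), (12, 2), (13, 3);
    """
)

def latest_builds(project, limit):
    # Query 1: a cheap, LIMIT-able query against buildset alone,
    # sidestepping the subquery-LIMIT restriction in MySQL 5.7
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM buildset WHERE project = ?"
            " ORDER BY id DESC LIMIT ?",
            (project, limit),
        )
    ]
    if not ids:
        return []
    # Query 2: expand the ids into a constructed IN list for the build fetch
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id FROM build WHERE buildset_id IN ({placeholders})"
        " ORDER BY id DESC",
        ids,
    ).fetchall()
```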
mordred | corvus: I think it's very good on things that filter by project | 21:15 |
corvus | mordred: but i think the single query without the hint is also good on things that filter by project | 21:15 |
mordred | it's not awful at things that don't limit by project | 21:15 |
mordred | yeah | 21:15 |
corvus | given that so far the single query with hint seems to be the only thing that can handle the query with no additional filters, it may not be worth the complexity of adding subquery into the mix if it doesn't have a win over single-query-without-hint | 21:16 |
mordred | corvus: what about with the hint without project and with limit - is that still terrible? | 21:16 |
mordred | agree | 21:17 |
*** armstrongs has joined #zuul | 21:17 | |
corvus | mordred: re your question ^ are you asking about pipeline and branch or no? | 21:17 |
*** mattw4 has quit IRC | 21:18 | |
Shrews | my analyze suggestion does nothing locally, fwiw | 21:19 |
corvus | basically: hint: yes, pipeline: no, project: no, branch: no --> 0.00s (this is the query for the builds landing page) | 21:19 |
corvus | but so far, that's the only thing we'd want to use the hint for | 21:20 |
corvus | oh, sorry -- if we only have branch, we want to use the hint | 21:20 |
mordred | yeah, because branch also has super low cardinality | 21:21 |
corvus | so let me rephrase that: so far, it's looking like "use the hint only if we have neither pipeline nor project; otherwise do not use the hint" with a single query is producing excellent results | 21:21 |
*** armstrongs has quit IRC | 21:22 | |
mordred | wfm | 21:24 |
clarkb | console log streaming still works in the web ui so the xterm update must be working | 21:26 |
mordred | clarkb: woot | 21:28 |
clarkb | mordred: corvus points out we aren't updating the js in production currently | 21:29 |
clarkb | because js tarball has moved | 21:29 |
mordred | clarkb: oh. yeah. so we have not learned that the xterm update is working | 21:30 |
*** jeliu_ has quit IRC | 21:30 | |
*** mattw4 has joined #zuul | 21:31 | |
corvus | mordred: for timing purposes i would like to eliminate the query cache | 21:31 |
corvus | mordred: know of a way to do that? | 21:31 |
corvus | cause i'm starting to get crazy fast times | 21:32 |
mordred | yes - one sec | 21:32 |
mordred | corvus: select SQL_NO_CACHE ... | 21:32 |
mordred | corvus: or SET SESSION query_cache_type = OFF; | 21:33 |
Shrews | corvus: that's a deprecated modifier now. weird | 21:38 |
corvus | good thing we're on an old server | 21:38 |
corvus | okay, i have automated my regression script; running it now | 21:38 |
mordred | yeah - that's the 5.7 thing | 21:38 |
mordred | cool | 21:38 |
Shrews | corvus: it should be deprecated even on 5.7 ('show warnings' after your select will output the deprecation notice) | 21:39 |
Shrews | still deprecated in 8.0 so... whatever | 21:39 |
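[editor's note: the per-query modifier mordred suggests above can be added mechanically when benchmarking against a pre-8.0 server. A minimal sketch, assuming queries are built as plain strings; the helper name is made up, and the modifier only exists on MySQL <= 5.7 (deprecated there, removed along with the query cache in 8.0):]

```python
def no_cache(query):
    """Add MySQL's SQL_NO_CACHE modifier so timings bypass the query cache.

    Only meaningful on MySQL <= 5.7; the query cache (and this modifier)
    were removed in MySQL 8.0.
    """
    # Rewrite only the first SELECT so any subqueries are left untouched.
    return query.replace("SELECT", "SELECT SQL_NO_CACHE", 1)
```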
corvus | mordred, Shrews: http://paste.openstack.org/show/754818/ | 21:39 |
fungi | there's a mysql 8.0? what rock have i been living under? | 21:40 |
corvus | i omitted the queries with no project or pipeline, since i ran them earlier and we know that without the hint they take forever | 21:40 |
corvus | that's measuring execution time in python, so that includes query parsing, fetching data, etc. it's going to be a little more than what the mysql cli would report | 21:40 |
mordred | cool | 21:41 |
Shrews | fungi: went from 5.7 to 8.0, so you haven't been under a rock too long | 21:41 |
fungi | ahh, okay | 21:41 |
fungi | i see, there were 7.x cluster releases | 21:41 |
corvus | i'm running my script with the other search terms now (job, build, result, etc) | 21:52 |
corvus | it's looking like we also don't want to use the hint when we have a change; that's the other thing that's indexed on the buildset table. | 21:58 |
corvus | so far: if not (project or pipeline or change): use hint; otherwise do not use hint | 21:58 |
corvus | http://paste.openstack.org/show/754820/ | 22:01 |
corvus | that held up for that set of things | 22:01 |
mordred | corvus: I'm fascinated that pipeline makes it ok | 22:02 |
mordred | project and change make sense to me - they're the things that are the most specific | 22:02 |
corvus | let me run that narrowly | 22:02 |
mordred | corvus: also - I have meltybrain - can I restate to make sure I'm parsing - if project or pipeline or change: do not use hint ; else: use hint | 22:03 |
corvus | mordred: yep -- except i think you are right to be suspicious of pipeline | 22:04 |
corvus | http://paste.openstack.org/show/754821/ | 22:04 |
corvus | it looks like we *are* better off using the hint with pipeline; it's just that if we don't use the hint with pipeline, it's not terrible | 22:05 |
mordred | corvus: I think if we wanted to generalize, we could do a query at startup something like "select count(distinct(column_name))" to get the number of distinct values - and then define a column threshold over which we use the hint | 22:05 |
mordred | that's super weird - we only have like 10 pipeline names | 22:06 |
mordred | but - empirical data wins | 22:06 |
mordred | but for now - I think just hard-coding the logic as you have defined it seems like a fine choice | 22:06 |
corvus | mordred: yeah, there's a relationship to the indexes here, so it's not *crazy*; just incomprehensible | 22:06 |
mordred | yeah | 22:07 |
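[editor's note: the hard-coded rule the discussion converges on could be sketched as below. Function and argument names are hypothetical, not the actual patch:]

```python
def should_use_hint(project=None, pipeline=None, change=None):
    # Empirical rule from the timing runs: the index hint only wins when
    # none of the selective filters are present.  (Pipeline was borderline:
    # slightly better with the hint, but acceptable without it.)
    return not (project or pipeline or change)
```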
* mordred needs to dinner ... I'll check back post dinnering to see if there are further things to be baffled by | 22:07 | |
corvus | project and change are the most accessible indexes on that table | 22:07 |
corvus | k, i'll probably check a few more things then write this up | 22:07 |
mordred | yeah. project and change are the best things to be looking for | 22:07 |
corvus | sorta weird that build isn't faster; we may want to look into that one | 22:08 |
corvus | oh that's not weird at all | 22:08 |
corvus | we don't have an index on build uuid | 22:08 |
corvus | that's a big oversight | 22:08 |
corvus | Shrews: are you still around? can you try some queries for me before and after adding an index? | 22:09 |
Shrews | corvus: can be. just a sec | 22:09 |
corvus | i'm digging up the q's | 22:09 |
Shrews | k. ready | 22:10 |
corvus | Shrews: http://paste.openstack.org/show/754822/ | 22:10 |
corvus | Shrews: can you run both of those before and after adding an index on zuul_build.uuid ? | 22:11 |
Shrews | yup | 22:11 |
Shrews | corvus: err, the explains, yes? | 22:11 |
Shrews | or do you want the actual data? | 22:11 |
corvus | Shrews: timing | 22:12 |
Shrews | oh | 22:12 |
corvus | (explains could be interesting too) | 22:12 |
*** mattw4 has quit IRC | 22:12 | |
corvus | that's the table with 7m rows, so it may take a minute to build the index | 22:12 |
corvus | also, how long does it take to build the index is good info :) | 22:12 |
Shrews | before index, 1st query is 11.71s, 2nd query is 2.96s | 22:13 |
Shrews | corvus: explains for those are at the end of the etherpad. anything else before i add the index | 22:14 |
corvus | Shrews: not that i can think of now | 22:15 |
Shrews | ok. gimme a sec to recall the create index syntax | 22:15 |
*** mattw4 has joined #zuul | 22:16 | |
Shrews | corvus: index creation data at the end of the etherpad | 22:18 |
corvus | 30s isn't too bad | 22:19 |
Shrews | corvus: that uuid does not exist in my data | 22:19 |
corvus | oh, heh, it's really new | 22:19 |
corvus | i'm sure i have an old one handy, 1 sec | 22:19 |
corvus | Shrews: try 0f4573fc46934b79bec64438d7d63c70 | 22:20 |
Shrews | good news is with the index, both empty result sets returned REALLY quickly | 22:20 |
Shrews | corvus: see end of etherpad. the results of the non-index queries will be slightly skewed since there were no results | 22:23 |
Shrews | but you can consider the timings best case i guess? | 22:23 |
Shrews | 0s and 2.78s for the queries with the index | 22:23 |
Shrews | i can drop the index and retry them with the existing uuid if you like | 22:25 |
*** jeliu_ has joined #zuul | 22:25 | |
corvus | Shrews: can i beg you to drop the index and repeat that with a slight modification? (1 sec and i'll explain) | 22:26 |
Shrews | yep | 22:26 |
corvus | we have a job name index there that includes the buildset id, so i looked up why and found this: https://review.opendev.org/481614 | 22:27 |
corvus | a "covering index" | 22:27 |
corvus | i'm thinking that we'd probably want the same thing for a build uuid index -- so can you create it as (uuid, buildset_id) ? | 22:28 |
Shrews | creating. you want the queries with the existing uuid i assume corvus? | 22:29 |
corvus | yeah | 22:29 |
Shrews | index create about the same, 28.6s | 22:29 |
Shrews | corvus: 0.00s on the first query, 2.84s on the second | 22:31 |
corvus | k, about the same | 22:31 |
Shrews | yeah | 22:31 |
corvus | but theoretically better assuming the 'covering index' thing works like that | 22:32 |
corvus | that seems like a Vocabulary Word so i assume mordred is right about that :) | 22:32 |
Shrews | corvus: the explain actually looks much better | 22:32 |
openstackgerrit | Jeff Liu proposed zuul/zuul-operator master: [WIP] Verify Operator Pod Running https://review.opendev.org/670395 | 22:32 |
Shrews | corvus: see end of etherpad again | 22:33 |
corvus | thx | 22:33 |
Shrews | that actually may have more to do with using an existing uuid | 22:34 |
Shrews | hrm, nope | 22:35 |
Shrews | (uuid, buildset_id) looks like a winner | 22:36 |
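[editor's note: the "covering index" effect visible in the explain can be reproduced with SQLite's planner, which reports when a query is satisfied entirely from the index. The table and index names below are illustrative, not Zuul's real schema:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zuul_build (id INTEGER PRIMARY KEY, uuid TEXT, buildset_id INTEGER)"
)
# Composite index storing both the lookup column and the wanted column.
conn.execute("CREATE INDEX uuid_buildset_idx ON zuul_build (uuid, buildset_id)")

# Because the index already contains buildset_id, this query never has to
# touch the table rows themselves; the plan notes a covering index scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT buildset_id FROM zuul_build WHERE uuid = ?",
    ("0f4573fc46934b79bec64438d7d63c70",),
).fetchall()
```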
corvus | cool, i'll include a schema change with this patch too | 22:36 |
*** jeliu_ has quit IRC | 22:36 | |
corvus | now i'm curious why job_name isn't performing in the same way; maybe it's because the cardinality is too low on something like 'openstack-tox-py27' | 22:36 |
corvus | i'm trying it with a more rare job | 22:36 |
corvus | yep, that's it. if i use 'zuul-operator-functional-k8s' as the job name, it looks like what you got for the build uuid query | 22:38 |
corvus | so, because the uuid is unique, once we add that index, we're always going to have good results with no hint. | 22:39 |
Shrews | corvus: openstack-tox-py27 occurs more than any other value | 22:40 |
corvus | but with the same kind of index for job name, we get better results using the hint with a job with lots of hits like 'openstack-tox-py27', but better results without the hint for a rare job. | 22:40 |
Shrews | i placed the top 20 counts in the etherpad | 22:40 |
corvus | this is where it would be really nice if the optimizer were making the right choices :) | 22:41 |
corvus | http://paste.openstack.org/show/754823/ numbers for those ^ | 22:41 |
Shrews | only 29 for zuul-operator-functional-k8s | 22:42 |
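[editor's note: mordred's earlier idea of choosing the hint from measured cardinality would look roughly like this. The threshold is made up, and SQLite stands in for MySQL so the sketch is runnable:]

```python
import sqlite3

def distinct_count(conn, table, column):
    # Measure cardinality once (e.g. at startup), per mordred's suggestion.
    return conn.execute(
        f"SELECT COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zuul_build (job_name TEXT)")
conn.executemany(
    "INSERT INTO zuul_build (job_name) VALUES (?)",
    [("openstack-tox-py27",)] * 5 + [("zuul-operator-functional-k8s",)],
)

CARDINALITY_THRESHOLD = 100  # hypothetical cutoff
# Low-cardinality columns (like branch or pipeline) are where the hint helps.
use_hint = distinct_count(conn, "zuul_build", "job_name") < CARDINALITY_THRESHOLD
```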
Shrews | corvus: need anything else? food time here | 22:43 |
corvus | Shrews: nope, thanks! | 22:43 |
corvus | i'll flip a coin on whether to use the hint for job name, then write up the patch | 22:43 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Improve SQL query performance in some cases https://review.opendev.org/672606 | 23:13 |
corvus | Shrews, mordred, yoctozepto: ^ whew. there we go. thanks for your help there | 23:14 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Improve SQL query performance in some cases https://review.opendev.org/672606 | 23:14 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Improve SQL query performance in some cases https://review.opendev.org/672606 | 23:15 |
*** michael-beaver has quit IRC | 23:24 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Enable debug logs for openstack-functional tests https://review.opendev.org/672412 | 23:28 |
*** mattw4 has quit IRC | 23:44 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!