opendevreview | Michael Kelly proposed zuul/zuul-jobs master: roles: Add git-submodule-init role https://review.opendev.org/c/zuul/zuul-jobs/+/871539 | 00:07 |
---|---|---|
clarkb | infra-root I've just updated the meeting agenda page. I'll give it another 10 minutes or so before sending it out if anyone has anything else to add | 00:09 |
clarkb | fwiw I'm punting on the rescue image stuff because I just don't have bandwidth for it now | 00:20 |
clarkb | (so removed it from the agenda) | 00:20 |
clarkb | I've dug around in the gitea source and I think https://github.com/go-gitea/gitea/blob/v1.18.3/modules/indexer/code/bleve.go#L339-L348 is where we get a Match query or a "Fuzzy" query. Those map onto prefix and phrase match here: http://blevesearch.com/docs/Query/#match-phrase:8f767fbc41af8ff1ddcf4c60ed8c0fe9 | 00:40 |
clarkb | bleve is capable of doing regex queries: http://blevesearch.com/docs/Query-String-Query/ but I don't think gitea is exposing that | 00:40 |
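For reference, the bleve query constructors involved look like this — a sketch, not gitea's actual code; the linked bleve.go builds only the first two, while the third is what exposing regex search would need:

```go
phrase := bleve.NewMatchPhraseQuery(keyword) // exact phrase, analyzed
match := bleve.NewMatchQuery(keyword)        // per-term match (the "fuzzy" path)
regex := bleve.NewRegexpQuery("zuul_.*")     // supported by bleve, not exposed by gitea
```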
clarkb | ianw: ^ fyi | 00:41 |
fungi | so that might be an easy improvement to propose | 00:42 |
clarkb | possibly via another match type. However, another consideration to make here is that they also support elasticsearch as the indexer and I'm pretty sure it doesn't do regexes. That may make them wary of supporting regexes at all? | 00:43 |
clarkb | heh nevermind elasticsearch does regex queries now | 00:43 |
ianw | interesting | 00:44 |
clarkb | and now to send that meeting agenda I said I'd send before getting nerd sniped on code searches | 00:46 |
clarkb | https://github.com/go-gitea/gitea/blob/v1.18.3/modules/indexer/code/bleve.go#L142 I think this also confirms that fuzzy in this case largely means ignore case | 00:52 |
clarkb | fungi: ^ I think you pointed out that appeared to be the behavior | 00:52 |
fungi | yeah, that seems to match my observations | 01:01 |
fungi | nothing else i could think of which would count as "fuzzy" seemed to work anyway | 01:01 |
opendevreview | Michael Kelly proposed zuul/zuul-jobs master: roles: Add git-submodule-init role https://review.opendev.org/c/zuul/zuul-jobs/+/871539 | 01:11 |
opendevreview | Michael Kelly proposed zuul/zuul-jobs master: roles: Add git-submodule-init role https://review.opendev.org/c/zuul/zuul-jobs/+/871539 | 04:53 |
*** ysandeep|out is now known as ysandeep | 05:05 | |
opendevreview | Michael Kelly proposed zuul/zuul-jobs master: roles: Add git-submodule-init role https://review.opendev.org/c/zuul/zuul-jobs/+/871539 | 06:02 |
ianw | kevinz / others : (kevinz hope you are enjoying new year and do not read this :) | 06:28 |
ianw | i'll send an email, but in the new linaro cloud i basically did what i mentioned in correspondence about resizing things | 06:29 |
ianw | i deleted all the cinder volumes and started again. i made a 150g partition and then made that the cinder vg -- it worked and i recreated the mirror volume on that, reattached it to the mirror, and it seems happy | 06:29 |
ianw | the rest of it (the remaining space after that 150g partition, and the second nvme drive) i put into another vg called "openstack" | 06:30 |
ianw | that's mounted on /opt/openstack. i stopped nova+glance, made a nova + glance directory, copied the old volume data into it, set the relevant datadir_volume in the config, redeployed kolla and ... to my surprise it seems to have "just worked" | 06:32 |
ianw | i've removed the nova_compute and glance volumes from the root partition, everything is still working | 06:32 |
ianw | i'm out of time for today but glance seems to be receiving images from nb04 | 06:36 |
ianw | this is the biggest machine i've played with kolla on. very impressed all round :) | 06:36 |
*** ysandeep is now known as ysandeep|lunch | 07:39 | |
*** ysandeep|lunch is now known as ysandeep | 08:10 | |
*** jpena|off is now known as jpena | 08:35 | |
*** ysandeep is now known as ysandeep|afk | 08:55 | |
*** ysandeep|afk is now known as ysandeep | 09:07 | |
*** cloudnull2 is now known as cloudnull | 09:09 | |
*** rlandy|out is now known as rlandy | 11:09 | |
*** dviroel|out is now known as dviroel | 11:18 | |
*** ysandeep is now known as ysandeep|afk | 11:22 | |
*** ysandeep|afk is now known as ysandeep | 12:11 | |
*** ysandeep is now known as ysandeep|afk | 13:43 | |
*** dasm|off is now known as dasm | 13:58 | |
pojadhav | folks, please update agenda if any @ https://hackmd.io/iraYQWGBT4qPCKH0VNG31A#2023-01-24-Community-Call for today's community call | 14:41 |
fungi | pojadhav: wrong channel? | 15:00 |
pojadhav | fungi, yeah.. sorry | 15:01 |
fungi | np | 15:01 |
*** dviroel is now known as dviroel|lunch | 15:19 | |
*** ysandeep|afk is now known as ysandeep|out | 15:24 | |
clarkb | my tuesday mornings of meetings have somehow become tuesday mornings with a hole I can do other things in | 16:17 |
fungi | mine became openstack security advisory time | 16:24 |
*** dviroel|lunch is now known as dviroel | 16:30 | |
clarkb | ianw: fyi I WIP'd https://review.opendev.org/c/opendev/jeepyb/+/869873 due to your discovery of the RO projects resulting in errors on push. We need to handle those separately so they don't block everything else | 16:35 |
clarkb | Mostly bookkeeping at this point as I'm not sure what the best way to handle that is. Maybe we just need to remove RO projects from projects.yaml? | 16:36 |
clarkb | I'll have to think on this one a bit | 16:36 |
fungi | i wonder if we can do a bit of parsing to detect ahead of time that the ack is read-only and then skip it if so | 16:42 |
fungi | s/ack/acl/ | 16:42 |
clarkb | ya we could search the acl file contents for the read only = true content (or whatever that actually is) | 16:42 |
clarkb | that might be the best way to handle it since it's explicit about the behavior we want | 16:43 |
*** dviroel is now known as dviroel|doc_appt | 16:43 | |
clarkb | rather than trying to parse errors or something which can get confusing | 16:43 |
fungi | oh, though there's a catch-22 with that approach. we'll never actually apply the read-only acl for projects retired in the future because we'll skip them before we do | 16:46 |
clarkb | ah yup. So may need a hybrid. Push and if error check if read only is set | 16:47 |
fungi | yeah, that could be the ticket | 16:47 |
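The acl-content check discussed above could look roughly like this — a minimal sketch; the function name is hypothetical, but `state = read-only` under the `[project]` section is the marker Gerrit's project.config format uses:

```python
import re

def acl_marks_read_only(acl_text):
    """Return True if a Gerrit project.config marks the project read-only.

    Gerrit records retirement as `state = read-only` under [project].
    """
    in_project = False
    for line in acl_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track whether we are inside the [project] section
            in_project = stripped.lower() == "[project]"
        elif in_project and re.match(r"state\s*=\s*read-only\b", stripped, re.IGNORECASE):
            return True
    return False
```

Per the hybrid idea above, this check would run only after a push fails, so newly retired projects still get their read-only ACL applied first.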
*** jpena is now known as jpena|off | 17:21 | |
clarkb | Gerrit is adding pop up alerts for changes on your alert list. People seem really excited about this and it makes me wonder why I've got such a strong reaction in the other direction | 17:51 |
clarkb | you can disable it thankfully | 17:51 |
clarkb | as a user, I don't think it can be turned off globally | 17:52 |
fungi | s/clippy/diffy/ | 17:53 |
opendevreview | Clark Boylan proposed opendev/git-review master: Switch from tox to nox https://review.opendev.org/c/opendev/git-review/+/871652 | 18:17 |
clarkb | I think git-review's tox.ini is nonfunctional too fwiw | 18:17 |
fungi | not surprising | 18:20 |
opendevreview | Clark Boylan proposed opendev/jeepyb master: Switch from tox to nox https://review.opendev.org/c/opendev/jeepyb/+/871653 | 18:33 |
opendevreview | Clark Boylan proposed opendev/git-review master: Switch from tox to nox https://review.opendev.org/c/opendev/git-review/+/871652 | 18:33 |
fungi | clarkb: do you happen to know whether the docker's copy command recurses specified directories? | 18:49 |
fungi | trying to figure out if the 404 errors from screenshotting 869091 is because the files aren't in the assets image or because of some other reason | 18:50 |
clarkb | fungi: `docker cp` or the COPY directive in a Dockerfile? | 18:51 |
fungi | dockerfile copy command | 18:51 |
fungi | https://docs.docker.com/engine/reference/builder/#copy seems to imply i need a glob match on the files | 18:52 |
clarkb | I think it is recursive since we sometimes copy entire git trees in | 18:52 |
fungi | ah, no, it should work | 18:53 |
clarkb | COPY . /tmp/src from zuul's dockerfile for example | 18:53 |
fungi | "If <src> is a directory, the entire contents of the directory are copied, including filesystem metadata." | 18:53 |
fungi | okay, so maybe the files are ending up in the wrong place somehow | 18:53 |
fungi | what's the easiest way to check the resulting file tree for one of our built container images in a check job? | 18:53 |
clarkb | fungi: they should be listed in the artifacts list and you can docker pull then run that image | 18:54 |
clarkb | you can also docker build locally yourself | 18:54 |
fungi | yeah, https://zuul.opendev.org/t/openstack/build/b093f7a18d1c430ca827737e6107c58c/artifacts lists it but i'm not sure how to just download the file and unpack it | 18:55 |
fungi | i'm assuming it's some sort of simple archive i can just inspect with standard tools, but maybe i'm hoping for too much | 18:56 |
clarkb | in this case `docker run insecure-ci-registry.opendev.org:5000/opendevorg/assets:b093f7a18d1c430ca827737e6107c58c_latest bash` is what I would probably do | 18:56 |
clarkb | oh but this is the image that may not have bash in it? | 18:56 |
fungi | right, it's just some files i think. i simply want to know what files are inside it | 18:57 |
clarkb | it's a docker image which has a json manifest of layers then a bunch of layers. It's not simple | 18:57 |
fungi | tempted to add a task to our jobs to just do a find inside each container we build and dump that to a text file | 18:58 |
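One way to list an image's files without running it (and without needing a shell inside the image) is to create a stopped container from it and export the filesystem as a tar stream; the tag here is the one from the artifacts list above:

```console
$ docker create --name tmp-inspect insecure-ci-registry.opendev.org:5000/opendevorg/assets:b093f7a18d1c430ca827737e6107c58c_latest
$ docker export tmp-inspect | tar -tvf -
$ docker rm tmp-inspect
```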
clarkb | where is the dockerfile for this image? | 18:58 |
fungi | https://review.opendev.org/869091 | 18:58 |
fungi | i can see from https://zuul.opendev.org/t/openstack/build/b093f7a18d1c430ca827737e6107c58c/console#3/0/12/ubuntu-jammy that it seems to think it copied the donors directory to the same place it put the other files | 18:59 |
fungi | but maybe those aren't exposed to apache correctly | 18:59 |
fungi | though the vhost config is pretty straightforward | 19:01 |
clarkb | fungi: maybe look at the gitea image build to see how it is copying the images out of the assets container | 19:03 |
clarkb | it might be referring to specific files? I don't recall | 19:03 |
fungi | oh! i didn't realize it didn't simply add the assets as a layer or mount the container somewhere | 19:04 |
fungi | yeah, that may be the problem | 19:07 |
fungi | RUN --mount=type=bind,from=opendevorg/assets,target=/tmp/assets cp /tmp/assets/* /custom/public/img/ | 19:07 |
fungi | https://zuul.opendev.org/t/openstack/build/8911faaf3ab54bebab6c96252a4b7b62/console#3/0/20/ubuntu-jammy | 19:07 |
fungi | i think we're not doing a recursive copy that way | 19:11 |
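A recursive variant of that cp would be one way to pick up the donors/ subdirectory — a sketch only; the log doesn't show which fix was ultimately taken:

```dockerfile
RUN --mount=type=bind,from=opendevorg/assets,target=/tmp/assets \
    cp -r /tmp/assets/. /custom/public/img/
```

Using `/tmp/assets/.` as the source makes cp copy the directory's full contents, including subdirectories, rather than only the top-level files a `*` glob expands to.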
*** dviroel|doc_appt is now known as dviroel | 19:13 | |
corvus | ianw: re https://paste.opendev.org/show/bRhxa0ix8C982EI5jypb/ what host? | 19:15 |
ianw | corvus: nl03, talking to linaro-regionone | 19:16 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 19:18 |
corvus | >>> print(client.get_image('1afe1a96-a572-4c10-b277-7f4505ffd050')) | 19:28 |
corvus | None | 19:28 |
corvus | ianw: ^ i get that in an interactive session; there may be an issue with that image.. | 19:28 |
clarkb | corvus: ianw: could this be caused by the mismatched provider names? | 19:29 |
ianw | openstack --os-cloud=opendevzuul-linaro image list | grep 1afe1a96-a572-4c10-b277-7f4505ffd050 | 19:30 |
ianw | that is interesting, because that image doesn't exist ... | 19:30 |
ianw | ... but ... this is a corner case because this cloud was stuck uploading images as it was out of disk | 19:31 |
clarkb | ah | 19:31 |
corvus | i got the image id from nodepool image-list | 19:31 |
ianw | yesterday i reworked all the storage, which allowed images to start uploading again | 19:31 |
corvus | so nodepool thinks it's a ready image | 19:31 |
corvus | i may not understand the cloud/provider names here | 19:32 |
corvus | so i'm not 100% sure i'm talking to the right cloud | 19:32 |
corvus | in the nodepool image list, i see nodepool provider names 'linaro' 'linaro-regionone' and 'linaro-us-regionone' | 19:33 |
corvus | (the failing server was being launched in linaro-regionone | 19:33 |
ianw | yeah i had forgotten the "-regionone" on the builder config | 19:33 |
fungi | d'oh! | 19:34 |
corvus | okay, provider 'linaro-regionone' uses cloud 'linaro' region 'regionone' | 19:34 |
corvus | so my interactive session was against the correct cloud (the same cloud+region used by the nodepool 'linaro-regionone' provider) so i think my method of verifying that the image does not exist in that provider is correct, and that agrees with ianw's listing. | 19:35 |
corvus | sorry for the detour, just wanted to make sure i got that straight | 19:35 |
corvus | so i think the state is: the launcher is failing to boot the nodes because the image doesn't exist (so no bug there except for maybe we should catch that and return a specific error). | 19:36 |
ianw | yeah, i have dropped the -regionone in one of the configs. we will fix that up once we've removed the old cloud | 19:37 |
ianw | (we being i :) | 19:37 |
corvus | ianw: so is the actual error that the builder has a mismatched provider name and uploaded an image to one cloud but set the provider to a different name? | 19:37 |
ianw | yep, thanks, that makes sense. i was surprised to see a traceback | 19:37 |
ianw | corvus: i'm not sure what state that image was in, because the cloud ran out of disk for glance to upload images. so some were stuck at one point | 19:38 |
corvus | okay cool. yeah, i think we can definitely catch that error explicitly. | 19:38 |
ianw | in this case i think we can delete it; it definitely didn't go missing through normal operation | 19:39 |
corvus | oh interesting. i'd be surprised if nodepool would have marked it as ready without the cloud telling it so, but also, if there was a problem, it would not be the first time a cloud lied to us. | 19:40 |
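The explicit catch corvus mentions might look roughly like this — a sketch; `ImageNotFound` and the surrounding wiring are hypothetical, but openstacksdk's get_image does return None for a missing image, which is what the interactive session above showed:

```python
class ImageNotFound(Exception):
    """A provider image nodepool believes is ready no longer exists in the cloud."""

def get_image_or_raise(client, image_id):
    # openstacksdk returns None rather than raising when the image is gone
    image = client.get_image(image_id)
    if image is None:
        raise ImageNotFound("image %s not found in provider" % image_id)
    return image
```

Raising a specific error here would let the launcher report "image missing" instead of the surprising traceback from the paste above.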
opendevreview | Merged openstack/project-config master: nodepool: empty linaro-us cloud https://review.opendev.org/c/openstack/project-config/+/871220 | 19:44 |
ianw | i've "nodepool image-delete --provider linaro-regionone --build-id 0000058743 --upload-id 0000000001 --image ubuntu-focal-arm64" | 19:45 |
ianw | that is now deleting | 19:48 |
ianw | | 0000058743 | 0000000001 | linaro-regionone | ubuntu-focal-arm64 | ubuntu-focal-arm64-1673833826 | 1afe1a96-a572-4c10-b277-7f4505ffd050 | deleting | 00:00:03:15 | | 19:48 |
ianw | but i wonder if the missing image will hang that? | 19:48 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: doc: docker-image: Add recommended dependency https://review.opendev.org/c/zuul/zuul-jobs/+/871657 | 19:57 |
fungi | i'll go ahead with the mm3 containers restart now | 19:58 |
ianw | 1afe1a96-a572-4c10-b277-7f4505ffd050 | deleting | 00:00:32:01 | | 20:16 |
ianw | i'm guessing this isn't going to go away normally | 20:17 |
fungi | grr, still 404... https://zuul.opendev.org/t/openstack/build/d3ce29c9ee784a4a9af92e06085bb9f6/log/gitea99.opendev.org/apache2/gitea-ssl-access.log#630 | 20:37 |
ianw | maybe donors needs a trailing / or something to copy the whole directory? | 20:42 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 20:42 |
fungi | ianw: not based on my local attempts, but that ^ sets the cp to verbose so i can see if it's missing at the source side or not | 20:42 |
ianw | fungi: they are copied into the root -> https://paste.opendev.org/show/b1vr4yK8eh3OMGHqvVjJ/ | 20:47 |
ianw | from that log it looks like "GET /assets/img/donors/rackspace.jpg" | 20:47 |
fungi | oh, is that what docker means by copy being recursive but only copying files? | 20:48 |
ianw | so maybe drop the donors/ ? | 20:48 |
ianw | maybe? :) it does interesting things with ADD depending on what you specify too, sometimes extracting things that end with .gz and sometimes not, etc. | 20:49 |
fungi | "Note: The directory itself is not copied, just its contents." https://docs.docker.com/engine/reference/builder/#copy | 20:49 |
fungi | i guess we'd need a separate step to create the directory and then copy into it | 20:50 |
ianw | yeah, i just pulled insecure-ci-registry.opendev.org:5000/opendevorg/assets:d7e756c0eaa643a58cf53eef79baf9da_latest ; ran create and exported the result to see | 20:51 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 20:52 |
fungi | not sure if copy will create a target directory if it doesn't exist | 20:52 |
fungi | "If <dest> doesn’t exist, it is created along with all missing directories in its path." | 20:53 |
fungi | okay, hopefully this will do the trick | 20:53 |
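Summarizing the COPY semantics quoted above (paths illustrative):

```dockerfile
# COPY of a directory copies its *contents*, not the directory itself:
COPY donors /custom/assets/          # -> /custom/assets/rackspace.jpg
# to keep the directory name, repeat it in the destination:
COPY donors /custom/assets/donors/   # -> /custom/assets/donors/rackspace.jpg
```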
ianw | " In this case, if <dest> ends with a trailing slash /, it will be considered a directory and the contents of <src> will be written at <dest>/base(<src>)." | 20:55 |
fungi | yeah, not sure how to parse that part ;) | 20:55 |
ianw | me either, and you did have a trailing / ... so | 20:57 |
ianw | Predicted remaining provider quota: {'compute': {'cores': 104, 'instances': 33, 'ram': -6144}} | 21:07 |
fungi | i wonder how well the nodes will run on negative ram | 21:08 |
ianw | yes it's an interesting one | 21:09 |
ianw | # openstack --os-cloud=opendevzuul-linaro quota show | grep ram | 21:10 |
ianw | | ram | 51200 | 21:10 |
ianw | we can put a zero on that | 21:20 |
fungi | that would be very nice | 21:24 |
clarkb | I think if you do a limits show it will show you what it thinks it is using | 21:35 |
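The check clarkb suggests, roughly — cloud name as used earlier in the log; the usage figure is illustrative, chosen only to be consistent with the -6144 remaining seen above:

```console
$ openstack --os-cloud=opendevzuul-linaro limits show --absolute | grep -i ram
| maxTotalRAMSize | 51200 |
| totalRAMUsed    | 57344 |
```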
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: use-buildset-registry: Prepend buildset registry to mirrors https://review.opendev.org/c/zuul/zuul-jobs/+/869760 | 21:40 |
opendevreview | Clark Boylan proposed opendev/jeepyb master: Raise an error if acl pushes fail https://review.opendev.org/c/opendev/jeepyb/+/869873 | 21:41 |
ianw | should i restart the executors with the images with updated skopeo? | 21:54 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: upload-container-image: Add option to stage in separate repository https://review.opendev.org/c/zuul/zuul-jobs/+/871664 | 21:54 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: promote-artifactory-image: Add role https://review.opendev.org/c/zuul/zuul-jobs/+/871665 | 21:54 |
fungi | ianw: what did we need the updated skopeo for? | 22:02 |
ianw | fungi: to upload nodepool images that are made with buildx | 22:02 |
ianw | (and any other images using buildx, but nodepool's the one i'm aware of) | 22:03 |
ianw | i need new nodepool images to fix rocky linux builds in the gate | 22:03 |
fungi | got it | 22:03 |
fungi | yeah, sounds like a good reason for a rolling restart sooner rather than waiting for the weekend | 22:03 |
ianw | i'll take a look in a bit; just trying to sort out linaro launching nodes | 22:04 |
opendevreview | Merged openstack/project-config master: nodepool: drop linaro-us https://review.opendev.org/c/openstack/project-config/+/871196 | 22:06 |
opendevreview | Ian Wienand proposed openstack/project-config master: nodepool: fix new linaro provider name in nb04 https://review.opendev.org/c/openstack/project-config/+/871666 | 22:07 |
ianw | https://zuul.opendev.org/t/zuul/build/942b6fdabbcd4989a44717b0599f3d14 is actually another job that fails, but same reason | 22:29 |
clarkb | I'm reviewing the copyCondition change now. Wondering if openstack should try and standardize this and put it in their central acl inherited by everything | 22:29 |
clarkb | anyway doing that wouldn't be for us to solve just thinking out loud as I go | 22:29 |
ianw | yeah the usage is inconsistent, but i think sometimes not always intended that way | 22:31 |
clarkb | ianw: I +2'd it but didn't approve because i think it may be a good idea to announce the change beforehand (not necessarily with a ton of lead time) just so that users can call out behavior changes if we misinterpreted gerrit docs | 22:35 |
clarkb | (I left the same comment on the change) | 22:35 |
ianw | sure i can send a mail | 22:35 |
ianw | i'm going to let 871196 apply, then manually fix up that bad provider name, and we can merge 871666 | 22:36 |
ianw | if the new cloud still isn't picking up nodes after that, then i'm starting to really be at a loss for what's wrong. afaict nodepool thinks it has enough quota to run vm's, it's just not accepting any node requests | 22:37 |
clarkb | the nodepool logs are pretty good if it is a quota thing (it logs what it thinks the quota is and how much room it has) | 22:39 |
clarkb | that should be able to help rule things out too | 22:40 |
corvus | is there a specific request it should be handling but isn't? | 22:42 |
corvus | (cause looking into "mostly idle provider is idle" is tricky to find interesting log entries :) | 22:43 |
*** dviroel is now known as dviroel|out | 22:44 | |
fungi | yay! http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f88/869091/6/check/system-config-run-gitea/f88ff85/bridge99.opendev.org/screenshots/gitea-main.png | 22:48 |
fungi | now i can clean up my useless debugging and un-wip finally | 22:48 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 22:50 |
*** rlandy is now known as rlandy|out | 23:01 | |
*** dasm is now known as dasm|off | 23:09 | |
ianw | ok, there's a bunch of images like | 23:29 |
ianw | | 0000042108 | 0000000001 | linaro-regionone | centos-8-stream-arm64 | centos-8-stream-arm64-1673834289 | 18cf4707-7156-40eb-b9da-d59cc593eea5 | deleting | 00:00:05:27 | | 23:29 |
ianw | i think that may have come from nb03 when i started this a long time ago | 23:29 |
ianw | oh interesting, they got reaped with the name change | 23:31 |
ianw | nb04 is now uploading with the correct name | 23:31 |
fungi | neat! | 23:31 |
opendevreview | Merged openstack/project-config master: nodepool: fix new linaro provider name in nb04 https://review.opendev.org/c/openstack/project-config/+/871666 | 23:38 |
ianw | so i'm thinking of running zuul_rolling_restart in a root screen on bridge? is that the best way to get the executors restarted? | 23:39 |
ianw | after a zuul_pull | 23:40 |
fungi | that sounds right to me... clarkb ^ ? | 23:54 |
clarkb | ianw: that playbook will do 6 executors at a time | 23:54 |
clarkb | so it is quicker than the reboot playbook. | 23:54 |
clarkb | things are not super busy so that should be fine. Demand on zuul is the biggest consideration there. Also it won't do the scheduler (I don't think you need the scheduler) so you should be good | 23:55 |
ianw | nope just executors | 23:55 |
ianw | ok, it's running in a root screen | 23:56 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!