@bennetefx:matrix.org | Hi Clark, it seems like this issue only affects one of my non-ephemeral static nodes. I guess the service has died on this node and subsequent jobs that run on this node do not restart the service (it's supposed to, right?). | 02:16 |
---|---|---|
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 941620: Use change networks in gerrit driver queries https://review.opendev.org/c/zuul/zuul/+/941620 | 03:20 | |
@jim:acmegating.com | bennetefx: put it in the first pre-run playbook of your base job so that every job will start it if it isn't already running. it noops if it's already running. | 03:21 |
@bennetefx:matrix.org | Yeah I did, added the **start_zuul_console** role as a pre-run of my base job and it is working fine now. Thank you. | 03:38 |
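A minimal sketch of what that setup might look like. The file paths and the `parent: null` base job layout are assumptions for illustration, not taken from the conversation; the role is published in zuul/zuul-jobs under the name `start-zuul-console`.

```yaml
# playbooks/base/pre.yaml (hypothetical path)
# First pre-run playbook of the base job: starts the console log streamer
# on every node; per the advice above it is a no-op if already running.
- hosts: all
  roles:
    - start-zuul-console
```

```yaml
# zuul.d/jobs.yaml (hypothetical path, in a config project)
- job:
    name: base
    parent: null
    pre-run: playbooks/base/pre.yaml
```

Because every job inherits from the base job, a static node whose console service has died gets it restarted by the next job that lands on it.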
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 941626: Fix Gerrit change query retry exceeded https://review.opendev.org/c/zuul/zuul/+/941626 | 06:38 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 941626: Fix Gerrit change query retry exceeded https://review.opendev.org/c/zuul/zuul/+/941626 | 06:42 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 941626: Fix Gerrit change query retry exceeded https://review.opendev.org/c/zuul/zuul/+/941626 | 06:52 | |
@cidlik:matrix.org | Yep, partially. But the key words here are "different jobs". I wouldn't want to multiply similar jobs that differ only in name and nodeset; that's hard to maintain. I would prefer to have one job for multiple nodes, or to run one job several times for different nodes in one buildset. | 07:13 |
@joao15130:matrix.org | Hello, new day new problem. | 09:26 |
We have observed nodepool and our openstack env getting out of sync. | ||
From time to time nodepool lists images while openstack nova doesn't show anything. The images remain available in nodepool and zuul picks something which doesn't really exist. | ||
Even worse, we can see instances created by nodepool get deleted without any reason, while the instances remain visible from nodepool. I wasn't able to see any errors in the openstack logs. | ||
Our env is based upon a devstack AIO installation and we have registered this cloud as a provider. | ||
``` | ||
images-dir: /etc/nodepool/dib | ||
build-log-dir: /etc/nodepool/dib_log | ||
elements-dir: /etc/nodepool/elements | ||
build-log-retention: 7 | ||
webapp: | ||
  port: 8005 | ||
  listen_address: 0.0.0.0 | ||
zookeeper-servers: | ||
  - host: zk | ||
    port: 2281 | ||
zookeeper-tls: | ||
  cert: /var/certs/certs/client.pem | ||
  key: /var/certs/keys/clientkey.pem | ||
  ca: /var/certs/certs/cacert.pem | ||
labels: | ||
  - name: nodepool-jammy | ||
    min-ready: 1 | ||
diskimages: | ||
  - name: nodepool-jammy | ||
    elements: | ||
      - ubuntu-minimal | ||
      - vm | ||
      - simple-init | ||
      - growroot | ||
      - cache-devstack | ||
      - openstack-repos | ||
      - nodepool-base | ||
      - infra-package-needs | ||
      - zuul-worker | ||
    release: jammy | ||
    env-vars: | ||
      GIT_BASE: http://opendev.org | ||
      DIB_DEBIAN_COMPONENTS: 'main,universe' | ||
      DIB_APT_LOCAL_CACHE: '0' | ||
      DIB_TMP: '/opt/dib_tmp' | ||
      DIB_DISABLE_APT_CLEANUP: '1' | ||
      DIB_DEBOOTSTRAP_EXTRA_ARGS: '--no-check-gpg' | ||
      TMPDIR: /root/ | ||
      DIB_CHECKSUM: '1' | ||
      DIB_IMAGE_CACHE: /opt/dib_cache | ||
      DIB_GRUB_TIMEOUT: '0' | ||
      DIB_SHOW_IMAGE_USAGE: '1' | ||
      ZUUL_USER_SSH_PUBLIC_KEY: | | ||
        ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKGWg2C9M/BkwRawt6143wKiO4or18fW42hkXkZ6xuaQ root@9cab794c8bc6 | ||
providers: | ||
  - name: ci-cloud | ||
    driver: openstack | ||
    cloud: ci-cloud | ||
    boot-timeout: 600 | ||
    launch-timeout: 3600 | ||
    launch-retries: 3 | ||
    clean-floating-ips: true | ||
    diskimages: | ||
      - name: nodepool-jammy | ||
        config-drive: true | ||
    pools: | ||
      - name: dell-openstack-ci | ||
        auto-floating-ip: true | ||
        max-servers: 4 | ||
        security-groups: | ||
          - nodepool-secgroup | ||
        networks: | ||
          - nodepool | ||
        labels: | ||
          - name: nodepool-jammy | ||
            diskimage: nodepool-jammy | ||
            flavor-name: m1.large | ||
``` | ||
-@gerrit:opendev.org- Dong Zhang proposed: | 09:35 | |
- [zuul/zuul] 940872: Implement keystore functions for OIDC RS256 https://review.opendev.org/c/zuul/zuul/+/940872 | ||
- [zuul/zuul] 941629: Use ZuulTreeCache for OIDC signing keys https://review.opendev.org/c/zuul/zuul/+/941629 | ||
-@gerrit:opendev.org- Dong Zhang proposed: | 09:38 | |
- [zuul/zuul] 940872: Implement keystore functions for OIDC RS256 https://review.opendev.org/c/zuul/zuul/+/940872 | ||
- [zuul/zuul] 940971: Manage OIDC signing key rotation https://review.opendev.org/c/zuul/zuul/+/940971 | ||
- [zuul/zuul] 941235: Implement command for deleting OIDC signing keys https://review.opendev.org/c/zuul/zuul/+/941235 | ||
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 941235: Implement command for deleting OIDC signing keys https://review.opendev.org/c/zuul/zuul/+/941235 | 09:43 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 941629: Use ZuulTreeCache for OIDC signing keys https://review.opendev.org/c/zuul/zuul/+/941629 | 09:45 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 941235: Implement command for deleting OIDC signing keys https://review.opendev.org/c/zuul/zuul/+/941235 | 09:49 | |
@joao15130:matrix.org | Here's an example: | 12:58 |
``` | ||
root@4ee2d4dc4696:/# nodepool list | ||
+------------+----------+----------------+--------------------------------------+----------------+------+-------+-------------+----------+ | ||
| ID | Provider | Label | Server ID | Public IPv4 | IPv6 | State | Age | Locked | | ||
+------------+----------+----------------+--------------------------------------+----------------+------+-------+-------------+----------+ | ||
| 0000000009 | ci-cloud | nodepool-jammy | 1c0420cf-08f3-4900-a23e-c436eb8cf150 | 10.225.110.105 | | ready | 00:00:03:39 | unlocked | | ||
+------------+----------+----------------+--------------------------------------+----------------+------+-------+-------------+----------+ | ||
``` | ||
But in nova: | ||
``` | ||
stack@ci-cloud-provider:~$ openstack server list --all | ||
``` | ||
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 938463: sqlreporter: ensure build end data is stored even if log URL is too long https://review.opendev.org/c/zuul/zuul/+/938463 | 13:27 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed on behalf of Tristan Cacqueray https://matrix.to/#/@tristanc_:matrix.org: [zuul/zuul-jobs] 927582: Update the set-zuul-log-path-fact scheme to prevent huge url https://review.opendev.org/c/zuul/zuul-jobs/+/927582 | 13:28 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 938128: autohold REST API: add ref filter validation https://review.opendev.org/c/zuul/zuul/+/938128 | 13:29 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed on behalf of Tristan Cacqueray https://matrix.to/#/@tristanc_:matrix.org: [zuul/zuul-jobs] 927600: Fix the upload-logs-s3 test playbook https://review.opendev.org/c/zuul/zuul-jobs/+/927600 | 13:54 | |
@fungicide:matrix.org | joao15130: nodepool tracks the image and node states in zookeeper, so it's possible for things to get deleted directly in the provider without nodepool being certain they're really gone. i would start with trying to debug on the cloud side what's causing them to disappear | 15:08 |
@joao15130:matrix.org | fungi: The only thing I see from the cloud side is some DELETE requests that go to nova: | 15:16 |
``` | ||
ci-cloud-provider devstack@n-api.service[1842]: INFO nova.api.openstack.requestlog [None req-b66e7d61-1147-48da-926d-98512234d3cb nodepool nodepool] 10.228.237.17 "DELETE /compute/v2.1/servers/0a7b9cd9-8829-4ab1-a5e1-59303a662db7" status: 204 len: 0 microversion: 2.96 time: 0.204060 | Feb 14 13:08 | |
ci-cloud-provider devstack@n-api.service[1840]: INFO nova.api.openstack.requestlog [None req-0c20666d-5a27-44f5-9c82-7a4725e745ce nodepool nodepool] 10.228.237.17 "DELETE /compute/v2.1/servers/4adbf124-1aa6-45bf-886f-44b3eefc3ca0" status: 204 len: 0 microversion: 2.96 time: 0.207188 | Feb 14 13:08 | |
``` | ||
@joao15130:matrix.org | weird thing is that it seems we got rid of the problem by renaming the cloud provider | 15:40 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 16:03 | |
- [zuul/zuul] 941620: Use change networks in gerrit driver queries https://review.opendev.org/c/zuul/zuul/+/941620 | ||
- [zuul/zuul] 941683: Change network future collision resolution https://review.opendev.org/c/zuul/zuul/+/941683 | ||
- [zuul/zuul] 941684: Copy query cache from winning network https://review.opendev.org/c/zuul/zuul/+/941684 | ||
@fungicide:matrix.org | joao15130: well, that'll leave a ton of cruft behind. the safe way to rename a provider in nodepool is to evacuate it of resources (set max-servers to 0, clear out the list of images to be uploaded by nodepool, wait for everything to get cleanly deleted normally in the provider, then remove the provider definition in nodepool config and add your new one) | 16:10 |
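The evacuation step described above could look roughly like this for the `ci-cloud` provider from the config posted earlier. This is a sketch, not a tested config: whether an empty `diskimages` list validates as written, and whether pool labels that reference the removed diskimage must also be dropped, are assumptions that depend on the nodepool version and should be checked against the nodepool operations documentation.

```yaml
providers:
  - name: ci-cloud              # old provider, kept in place until it is empty
    driver: openstack
    cloud: ci-cloud
    diskimages: []              # assumption: stops new image uploads to this provider
    pools:
      - name: dell-openstack-ci
        max-servers: 0          # stop launching new nodes in this pool
```

Once `nodepool list` and `nodepool image-list` show nothing left for the provider, its definition can be removed and the renamed provider added.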
@joao15130:matrix.org | Indeed, thanks for the procedure, but we just tried this as a workaround. Surprisingly, no instances are deleted now... | 16:14 |
@fungicide:matrix.org | yeah, i suspect you have a bunch of orphaned data in zookeeper instead, but anyway when you see the problem crop up again check on the cloud side to figure out why things disappeared out from under nodepool or errored when initially asked by nodepool to delete them | 16:25 |
@joao15130:matrix.org | fungi: We did some investigations but never found anything useful except the log from the API. No other errors. | 16:28 |
@mnaser:matrix.org | Is there a decent workaround for a scenario where I am using a third-party role that I want to pass a secret to? In my case it's connecting to Tailscale, so if I do: | 18:58 |
``` | ||
vars: | ||
go_version: 1.23.3 | ||
tailscale_authkey: "{{ tailscale_config_authkey.secret }}" | ||
tailscale_tags: tag:tailscale-config | ||
tailscale_oauth_preauthorized: true | ||
secrets: | ||
- name: tailscale_config_authkey | ||
secret: zuul-config-internal-tailscale-config | ||
``` | ||
That doesn't work because the secret has !!unsafe, so Zuul is protecting me here. Alternatively, I tried: | ||
``` | ||
vars: | ||
go_version: 1.23.3 | ||
tailscale_authkey: "{{ hostvars['localhost'].tailscale_config_authkey.secret }}" | ||
tailscale_tags: tag:tailscale-config | ||
tailscale_oauth_preauthorized: true | ||
secrets: | ||
- name: tailscale_config_authkey | ||
secret: zuul-config-internal-tailscale-config | ||
``` | ||
That didn't do the trick either; it still shows as undefined. It would be nice to be able to use this pre-existing role, so how would I tackle this sort of issue? I can't define a secret where data is just a plain string. | ||
@mnaser:matrix.org | FWIW, the role is github.com/artis3n/ansible-role-tailscale but I don't think it's relevant in this case, this is just the case when you need to use a flat variable | 18:59 |
@mnaser:matrix.org | I'm trying an actual set_fact in the pre-tasks .. maybe that might do it | 19:06 |
@mnaser:matrix.org | Oh, that seems to have done it. I hope these channels are logged so someone someday can ctrl+f and find this | 19:07 |
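For anyone searching later, the `set_fact` workaround might look something like the sketch below. The secret name `tailscale_config_authkey` and the variable `tailscale_authkey` come from the snippets above; the role name `artis3n.tailscale`, the `hosts: all` target, and the use of `no_log` are assumptions added for illustration.

```yaml
# Playbook of the job that declares the secret (hypothetical layout)
- hosts: all
  pre_tasks:
    - name: Copy the Zuul secret into the flat variable the role expects
      ansible.builtin.set_fact:
        tailscale_authkey: "{{ tailscale_config_authkey.secret }}"
      no_log: true   # keep the key out of the job output
  roles:
    - role: artis3n.tailscale   # assumed name for github.com/artis3n/ansible-role-tailscale
```

The remaining role variables (`tailscale_tags`, `tailscale_oauth_preauthorized`) can stay in the job's `vars`, since they do not reference the secret.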
@fungicide:matrix.org | looking at opendev's base job for an example, we seem to supply log upload credentials as secrets and those are plumbed through a post-run playbook into an included role from zuul/zuul-jobs which uses them | 19:10 |
@mnaser:matrix.org | https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/add-build-sshkey/README.rst | 22:44 |
Did I miss something, or are all those overrides basically not usable because they're in vars/? | ||
@clarkb:matrix.org | there are 7 types of variable that have higher precedence than role vars according to https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#understanding-variable-precedence so I guess it depends on how you want to use it | 22:51 |
@clarkb:matrix.org | I'm not sure that was an intentional choice though as this was likely one of the very first roles ever written to bootstrap zuul with ansible | 22:51 |
@clarkb:matrix.org | it's possible that defaults is the better choice; it just wasn't known at the time | 22:51 |
@clarkb:matrix.org | ya the code originates from https://opendev.org/openstack/openstack-zuul-roles/commit/d002b51c1735341c2042040238208aa2f978dd8a | 22:53 |
@clarkb:matrix.org | looking at that i highly doubt it was ever intentional to not use defaults | 22:54 |
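To illustrate the difference being discussed, here is a small sketch with a hypothetical role and variable (`example-role`, `example_key_size`), not the actual contents of add-build-sshkey. In Ansible's precedence order, role defaults sit at the bottom while role vars outrank inventory and play variables, which is where Zuul job variables end up.

```yaml
# roles/example-role/defaults/main.yaml (hypothetical role)
# Role defaults have the lowest precedence, so a job variable can override this.
example_key_size: 3072
```

```yaml
# Job that overrides the default (hypothetical job name)
- job:
    name: example-job
    vars:
      example_key_size: 4096
```

If the same line lived in `vars/main.yaml` instead, the role's value would take precedence over the job variable, which matches the behaviour mnaser describes.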
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 23:23 | |
- [zuul/zuul] 941620: Use change networks in gerrit driver queries https://review.opendev.org/c/zuul/zuul/+/941620 | ||
- [zuul/zuul] 941683: Change network future collision resolution https://review.opendev.org/c/zuul/zuul/+/941683 | ||
- [zuul/zuul] 941684: Copy query cache from winning network https://review.opendev.org/c/zuul/zuul/+/941684 | ||
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: [zuul/zuul] 941435: Make Gerrit event pre-processor multi-threaded https://review.opendev.org/c/zuul/zuul/+/941435 | 23:23 | |
@mnaser:matrix.org | Cool. I’ll try and work on clearing that up | 23:29 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 920212: Use latest patchset with Gerrit cherry-pick https://review.opendev.org/c/zuul/zuul/+/920212 | 23:33 |