*** JayF has joined #openstack-infra | 00:11 | |
*** tosky has quit IRC | 00:28 | |
*** thiago__ has joined #openstack-infra | 00:43 | |
*** tdasilva_ has quit IRC | 00:45 | |
*** tdasilva_ has joined #openstack-infra | 00:48 | |
*** thiago__ has quit IRC | 00:50 | |
*** hamalq has joined #openstack-infra | 01:17 | |
*** yamamoto has quit IRC | 01:18 | |
*** Xuchu has joined #openstack-infra | 01:20 | |
*** Xuchu_ has quit IRC | 01:22 | |
*** yamamoto has joined #openstack-infra | 01:23 | |
*** hamalq has quit IRC | 01:24 | |
*** hamalq has joined #openstack-infra | 01:25 | |
*** yamamoto has quit IRC | 01:28 | |
*** Xuchu_ has joined #openstack-infra | 02:12 | |
*** Xuchu has quit IRC | 02:15 | |
*** yamamoto has joined #openstack-infra | 02:21 | |
*** hamalq has quit IRC | 02:24 | |
*** yamamoto has quit IRC | 02:38 | |
*** dviroel has quit IRC | 03:22 | |
*** dviroel has joined #openstack-infra | 03:25 | |
*** Xuchu has joined #openstack-infra | 03:26 | |
*** Xuchu_ has quit IRC | 03:29 | |
dansmith | Saw this on a grenade job during setup, trying to install packages from apt: | 04:03 |
dansmith | E: You don't have enough free space in /var/cache/apt/archives/. | 04:03 |
dansmith | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c31/774317/2/check/grenade/c31a0f0/job-output.txt | 04:03 |
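[Editorial sketch, not part of the job: the condition apt reported above can be expressed as a simple free-space check on the filesystem holding /var/cache/apt/archives/, where apt downloads packages. The ~400 MB figure is only an illustration echoing the number discussed later in this log, not a value taken from the failed build.]

    # Sketch only: check whether the apt download area has room for a pending
    # install; the ~400 MB threshold is illustrative, echoing the failure above.
    import shutil

    ARCHIVES = "/var/cache/apt/archives"

    def enough_space_for(download_bytes, path=ARCHIVES):
        """True if the filesystem holding the apt cache can fit the download."""
        return shutil.disk_usage(path).free >= download_bytes

    if __name__ == "__main__":
        needed = 400 * 1024 * 1024
        if not enough_space_for(needed):
            print(f"E: You don't have enough free space in {ARCHIVES}/.")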
*** david-lyle has joined #openstack-infra | 05:25 | |
*** Xuchu_ has joined #openstack-infra | 05:25 | |
*** redrobot6 has joined #openstack-infra | 05:27 | |
*** dklyle has quit IRC | 05:28 | |
*** redrobot has quit IRC | 05:28 | |
*** redrobot6 is now known as redrobot | 05:28 | |
*** Xuchu has quit IRC | 05:29 | |
*** mgoddard has quit IRC | 05:31 | |
*** kota_ has quit IRC | 05:31 | |
*** kota_ has joined #openstack-infra | 05:31 | |
*** mgoddard has joined #openstack-infra | 05:31 | |
*** dviroel has quit IRC | 05:42 | |
*** Xuchu_ has quit IRC | 05:46 | |
*** yamamoto has joined #openstack-infra | 06:35 | |
*** yamamoto has quit IRC | 06:39 | |
*** david-lyle has quit IRC | 07:42 | |
*** vesper11 has joined #openstack-infra | 08:35 | |
*** yamamoto has joined #openstack-infra | 08:36 | |
*** yamamoto has quit IRC | 08:41 | |
*** matt_kosut has joined #openstack-infra | 09:19 | |
*** matt_kosut has quit IRC | 09:19 | |
*** xek has joined #openstack-infra | 09:51 | |
*** vesper11 has quit IRC | 09:51 | |
*** paladox has quit IRC | 09:59 | |
*** tosky has joined #openstack-infra | 10:14 | |
*** xek has quit IRC | 10:22 | |
*** yamamoto has joined #openstack-infra | 10:37 | |
*** yamamoto has quit IRC | 10:42 | |
*** yamamoto has joined #openstack-infra | 10:55 | |
*** yamamoto has quit IRC | 11:29 | |
*** yamamoto has joined #openstack-infra | 12:02 | |
*** yamamoto has quit IRC | 12:10 | |
*** tdasilva_ has quit IRC | 12:11 | |
*** tdasilva_ has joined #openstack-infra | 12:12 | |
*** yamamoto has joined #openstack-infra | 14:07 | |
fungi | dansmith: looks like devstack could stand to do an apt clean after each round of things it installs. by default, debian derivatives leave copies of all installed packages in /var/cache/archive, and disk space might be tight in that provider | 14:09 |
fungi | er, in /var/cache/apt/archives i mean | 14:10 |
fungi | `apt-get clean` or `apt clean` will clear them out | 14:10 |
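[A minimal sketch of the clean-up fungi describes, assuming root privileges on a Debian-derived image and not claiming to be devstack's implementation: measure the cached .deb files and reclaim the space with `apt-get clean` after a round of installs.]

    # Sketch: clear the apt package cache after installing packages so cached
    # .deb files do not fill a tight rootfs.
    import subprocess
    from pathlib import Path

    CACHE = Path("/var/cache/apt/archives")

    def cache_size_bytes() -> int:
        """Size of the .deb files apt has left behind in its cache."""
        return sum(p.stat().st_size for p in CACHE.glob("*.deb"))

    if __name__ == "__main__":
        before = cache_size_bytes()
        subprocess.run(["apt-get", "clean"], check=True)  # needs root
        print(f"freed roughly {(before - cache_size_bytes()) / 1e6:.0f} MB")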
*** yamamoto has quit IRC | 14:11 | |
*** paladox has joined #openstack-infra | 14:18 | |
*** dviroel has joined #openstack-infra | 14:23 | |
*** tosky has quit IRC | 16:03 | |
*** maysams has quit IRC | 16:12 | |
*** Tengu has quit IRC | 17:20 | |
*** Tengu has joined #openstack-infra | 17:21 | |
*** ralonsoh has joined #openstack-infra | 17:22 | |
*** xek has joined #openstack-infra | 17:22 | |
*** ralonsoh has quit IRC | 17:25 | |
*** xek has quit IRC | 17:35 | |
*** xek has joined #openstack-infra | 17:38 | |
*** xek has quit IRC | 17:38 | |
*** d34dh0r53 has quit IRC | 18:32 | |
*** tosky has joined #openstack-infra | 19:23 | |
*** slaweq has joined #openstack-infra | 19:48 | |
dansmith | fungi: really? it's complaining about not having 400mb of disk.. are the workers really that tight on space? | 20:02 |
dansmith | the workers get cleaned after each run, so it's not package cache from the previous run right? | 20:02 |
*** yamamoto has joined #openstack-infra | 20:10 | |
*** yamamoto has quit IRC | 20:14 | |
fungi | not sure what you mean by workers, but the job nodes are deleted and booted fresh | 20:15 |
*** slaweq has quit IRC | 20:15 | |
fungi | unfortunately it failed in such a way that the usual devstack log collection didn't happen, so we don't have a df to see what the actual filesystem size was | 20:15 |
fungi | possible something happened when that node booted which caused it not to growpart the rootfs at boot and left it at the nominal image size | 20:16 |
fungi | first time i've seen that, so hard to speculate as to the cause | 20:17 |
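[A hypothetical check along the lines fungi speculates about; the nominal image size below is an assumed figure for illustration, not an opendev value. If the root filesystem is still close to the base image size, growpart probably did not expand it at boot.]

    # Hypothetical sanity check; NOMINAL_IMAGE_GB is an assumption used purely
    # for illustration, not the size of any opendev image.
    import shutil

    NOMINAL_IMAGE_GB = 15

    def rootfs_looks_ungrown(path="/"):
        """True if the rootfs is still roughly the nominal image size."""
        total_gb = shutil.disk_usage(path).total / 1e9
        return total_gb <= NOMINAL_IMAGE_GB + 1

    if __name__ == "__main__":
        if rootfs_looks_ungrown():
            print("rootfs near nominal image size; growpart may not have run")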
fungi | are you finding multiple occurrences? | 20:18 |
dansmith | yeah I saw it a couple times yesterday, always on the grenade job | 20:18 |
dansmith | by workers I mean the thing that we run devstack in.. so yeah, I assumed those get booted fresh, but I thought maybe you were suggesting that we just do a ./clean and re-run of devstack, so wasn't sure | 20:19 |
dansmith | fungi: this doesn't have to be something for a saturday for either of us, it just seemed like maybe something had changed and we were going to see a rash of fails due to disk space coming | 20:19 |
dansmith | I did a bunch of pushes last night before the jobs finished, so I'm not sure many of those actually got reported, but the last round that I let complete last night seemed to finish | 20:21 |
fungi | we might ought to add a df to https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/validate-host/library/zuul_debug_info.py and then it will be included in the zuul-info/zuul-info.*.txt files we collect from the nodes | 20:22 |
fungi | i'll throw up a patch for that now while i'm thinking about it | 20:22 |
fungi | at least that way we'll know what the filesystem sizes and utilization look like at the start of each job, and can speculate a bit better as to what happened | 20:23 |
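[For illustration only: this is not the patch fungi proposes at https://review.opendev.org/774358, just a sketch of the idea of collecting per-mount filesystem size and utilization at job start so it lands in the debug output.]

    # Illustrative sketch of gathering df-style data from Python; the real
    # change lives in zuul-jobs' zuul_debug_info.py and may look different.
    import shutil

    def filesystem_usage():
        """Map each mounted block device's mountpoint to its size/used/free."""
        usage = {}
        with open("/proc/mounts") as mounts:
            for line in mounts:
                device, mountpoint = line.split()[:2]
                if not device.startswith("/dev/"):
                    continue  # skip tmpfs, proc, cgroup and friends
                total, used, free = shutil.disk_usage(mountpoint)
                usage[mountpoint] = {"size": total, "used": used, "free": free}
        return usage

    if __name__ == "__main__":
        for mount, stats in filesystem_usage().items():
            print(f"{mount}: {stats['used'] / 1e9:.1f}G used of "
                  f"{stats['size'] / 1e9:.1f}G ({stats['free'] / 1e9:.1f}G free)")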
dansmith | ack, devstack or zuul could also do it before to see what we start with.. I dunno how big the disks are on those, but 400m seemed like an awfully small margin | 20:23 |
fungi | yep, but that was also after numerous package install rounds earlier in the log | 20:24 |
dansmith | sure sure, but.. 400m :) | 20:24 |
dansmith | if we need to start being more disk conscious then that's a thing I guess, but I'd want to know where it's all going | 20:25 |
fungi | dansmith: https://review.opendev.org/774358 | 20:30 |
fungi | also there was a time when we sparsely fallocated swapfiles on nodes, but more recent linux kernels have required us to preallocate them instead | 20:30 |
fungi | so depending on the swap size set in the job configuration (default in our deployment is 1gb) that can eat away at available space on the rootfs | 20:31 |
fungi | a lot of jobs have it set to 8gb, but even that alone doesn't seem like it should be the cause of the problem in that example | 20:32 |
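[Back-of-envelope arithmetic for the point above, using the swap sizes mentioned (1 GB default, 8 GB in many jobs): a fully preallocated swapfile comes straight out of rootfs free space.]

    # Rough sketch: how much free space remains on the rootfs once a fully
    # preallocated swapfile of the configured size is carved out of it.
    import shutil

    GIB = 1024 ** 3

    def headroom_after_swap(swap_gib, path="/"):
        """Free bytes left on `path` after preallocating swap_gib of swap."""
        return shutil.disk_usage(path).free - swap_gib * GIB

    if __name__ == "__main__":
        for size in (1, 8):  # default and common job-configured swap sizes
            left = headroom_after_swap(size)
            print(f"{left / GIB:.1f} GiB left after a {size} GiB swapfile")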
dansmith | how big are the roots supposed to be? | 20:32 |
dansmith | maybe our fs isn't expanded or something and we have a small margin over the actual size of the disk? | 20:32 |
fungi | i think some providers have a rootfs as small as 20gb and then allocate a larger ephemeral disk which some kinds of jobs (e.g. devstack) mount at /opt | 20:33 |
dansmith | okay | 20:34 |
fungi | i don't recall how small they are for that particular provider from your example, i'd probably have to manually boot or hold a node and investigate | 20:34 |
dansmith | ack, well, anyway, let's not make a saturday out of this.. it was mostly just an FYI in case something has changed lately that was likely to cause a raft of disk space fails | 20:36 |
fungi | yep, totally appreciate the heads up, i'll advocate for the patch to collect initial fs sizes/utilization and see if we can't get a better idea of why we see it sometimes | 20:38 |
dansmith | logstash shows several hits in the last 48 hours btw | 20:41 |
dansmith | so it wasn't just those two | 20:41 |
fungi | same provider each time? | 20:42 |
dansmith | airship-kn1 yeah looks like | 20:43 |
fungi | possible something has changed there, in that case | 20:53 |
*** zzzeek has quit IRC | 21:01 | |
*** zzzeek has joined #openstack-infra | 21:02 | |
fungi | i wonder if they shrunk the disk on the flavor we've been using, for example | 21:07 |
dansmith | I heard there's some k8s malware going around mining bitcoin, maybe we've got an openstack virus on our hands that eats disk :) | 21:08 |
fungi | tasty, tasty disk | 21:08 |
fungi | it could also be something like this has always been the smallest rootfs of all our providers but recently some change merged to grenade which caused it to begin using far more disk, and because we boot so few instances in that provider it's gone unspotted until now | 21:09 |
dansmith | yeah I dunno what has changed really.. could be as simple as the mysql package includes sample databases now or something I guess | 21:14 |
dansmith | the message actually says 100mb is what it has free, but needs 400, which really seems like too close a margin for something not to have changed recently | 21:15 |
jrosser | if it’s focal could it be the delta to 20.04.2 trying to install? that landed on feb 4th | 21:19 |
dansmith | that's a good idea, but I don't see any giant "and all these 300 will come too" package installs in that log | 21:24 |
corvus | dtantsur|afk: hi, it looks like openlab terraform-provider-openstack jobs are failing after feb 2; i looked and do not immediately see the cause. here's the build history: http://status.openlabtesting.org/builds?job_name=terraform-provider-openstack-acceptance-test it's failing on "TASK [install-devstack : Set fact for devstack openrc]" which has a no_log, so i can't see why. do you have any idea? | 21:37 |
corvus | dtantsur|afk: see also https://github.com/theopenlab/openlab/issues/681 and https://github.com/theopenlab/openlab-zuul-jobs/pull/1104 | 22:03 |
corvus | (and i'm totally open to suggestions of a better irc channel, i realize this is not directly TACT related, but there's a community nexus here; sorry) | 22:05 |
*** slaweq has joined #openstack-infra | 22:05 | |
*** zzzeek has quit IRC | 22:06 | |
*** zzzeek has joined #openstack-infra | 22:08 | |
*** yamamoto has joined #openstack-infra | 22:11 | |
*** slaweq has quit IRC | 22:50 | |
*** dviroel has quit IRC | 22:53 | |
*** yamamoto has quit IRC | 23:19 | |
*** tosky has quit IRC | 23:52 |