*** dviroel|rover|afk is now known as dviroel|out | 00:50 | |
*** vishalmanchanda_ is now known as vishalmanchanda | 03:51 | |
*** ysandeep|out is now known as ysandeep | 05:02 | |
*** ykarel_ is now known as ykarel | 05:20 | |
*** bshephar is now known as bshephar|brb | 05:36 | |
*** ysandeep is now known as ysandeep|afk | 06:30 | |
*** ysandeep|afk is now known as ysandeep|trng | 06:58 | |
*** jpena|off is now known as jpena | 07:32 | |
*** ykarel is now known as ykarel|lunch | 08:48 | |
afaranha | fungi, hi, regarding the test timing out in the CI, could you save the node and give access to it? https://zuul.opendev.org/t/openstack/build/8fefe2da3d754c9484f2cdd2090eb484 | 10:12 |
opendevreview | Pierre Riteau proposed openstack/project-config master: [kolla] Preserve Backport-Candidate and Review-Priority scores https://review.opendev.org/c/openstack/project-config/+/814548 | 10:14 |
*** rlandy is now known as rlandy|ruck | 10:34 | |
*** ykarel|lunch is now known as ykarel | 10:34 | |
*** jcapitao is now known as jcapitao_lunch | 10:47 | |
*** dviroel|out is now known as dviroel|rover | 11:06 | |
*** jpena is now known as jpena|lunch | 11:26 | |
*** jcapitao_lunch is now known as jcapitao | 12:10 | |
*** jpena|lunch is now known as jpena | 12:15 | |
*** ysandeep|trng is now known as ysandeep | 12:56 | |
fungi | afaranha: i've set an autohold for that job and rechecked your change | 13:21 |
fungi | once it fails again, let me know the ssh key you want granted access to the job node | 13:21 |
*** bshephar|brb is now known as bshephar | 13:23 | |
afaranha | fungi, sure, thanks :) | 13:24 |
afaranha | we have 2 approaches here: one is to identify the test that timed out, and the second is to increase the size of the file that is created and mounted for the test; I don't quite get the second one yet | 13:27 |
opendevreview | Mark Goddard proposed openstack/project-config master: kolla-cli: end gating for retirement https://review.opendev.org/c/openstack/project-config/+/814580 | 13:27 |
fungi | afaranha: yeah, also the build does archive the subunit stream from the tests, i don't know if you've tried analyzing it | 13:28 |
fungi | it's presumably missing whichever test(s) timed out or otherwise didn't get run | 13:28 |
afaranha | fungi, I was checking the job-output.txt, the tox folder has only the requirements it seems | 13:32 |
afaranha | do you mean something else? | 13:33 |
fungi | the tempfile which is included in the top level logs list is the subunit stream | 13:35 |
fungi | normally it gets postprocessed to create the test report | 13:35 |
fungi | but in cases where the job could not complete it's still available for retrieval and analysis | 13:36 |
afaranha | fungi, so the last entry there was: test: test.functional.s3api.test_object.TestS3ApiObject.test_put_object_underscore_in_metadata | 13:49 |
afaranha | that means this was the last test that ran, so the next one is the one that timed out? | 13:49 |
fungi | not necessarily, the tests are likely run in parallel, so determining a sequence could be hard. you could compare the names of the completed tests to the list of tests you expected to be run and see which ones are missing or don't indicate they finished | 13:54 |
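A minimal sketch of that comparison, assuming python-subunit is installed (it ships the subunit-ls tool) and that the suite's tests can be enumerated with stestr list; the file names are illustrative:

```sh
# List every test id that reported a result in the archived subunit stream
# (the "tempfile" artifact mentioned above), then diff against the full
# expected list; tests absent from the stream never finished, or never ran.
subunit-ls < tempfile | sort > ran.txt
stestr list | sort > expected.txt
diff expected.txt ran.txt
```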
gibi | #nova next Oslo-Nova cross project session from 14:00 | 14:00 |
gibi | #nova now Oslo-Nova cross project session from 14:00 | 14:00 |
gibi | #nova now Oslo-Nova cross project session: oslopolicy-sample-generator extensions | 14:01 |
gibi | #nova next break | 14:02 |
frickler | gibi: EWIN | 14:09 |
gibi | oops | 14:10 |
gibi | sorry | 14:10 |
*** lbragstad_ is now known as lbragstad | 14:15 | |
*** ykarel_ is now known as ykarel | 14:22 | |
afaranha | fungi, thanks, I'll try to isolate it. For the second approach, increasing the size of the file, can you point me to where this is done and how I can change it? | 14:36 |
fungi | afaranha: it's somewhere in swift's job definitions i think, probably search their repo for xattr or xfs or mkfs | 14:38 |
opendevreview | Mark Goddard proposed openstack/project-config master: kolla-cli: enter retirement https://review.opendev.org/c/openstack/project-config/+/814597 | 14:48 |
timburke_ | afaranha, fairly certain it's this: https://opendev.org/openstack/swift/src/branch/master/tools/test-setup.sh#L7-L14 | 14:53 |
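Roughly, that script creates a fixed-size file, formats it as XFS, and loop-mounts it for the tests; a simplified sketch of the pattern (the real values live in the linked test-setup.sh, and the truncate size is the knob afaranha would be increasing):

```sh
# Create a sparse scratch file, put an XFS filesystem on it, and
# loop-mount it where the functional tests expect their storage.
truncate -s 1G /home/zuul/1G_xfs_file
mkfs.xfs /home/zuul/1G_xfs_file
mkdir -p /home/zuul/xfstmp
sudo mount -o loop /home/zuul/1G_xfs_file /home/zuul/xfstmp
```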
timburke_ | it's also worth noting that we *do* run our functional tests serially (because they share a user and do things like upload a few objects then check that account stats match) -- see https://opendev.org/openstack/swift/src/branch/master/.functests#L10 | 14:56 |
opendevreview | Mark Goddard proposed openstack/project-config master: kolla-cli: end gating for retirement https://review.opendev.org/c/openstack/project-config/+/814580 | 15:05 |
opendevreview | Mark Goddard proposed openstack/project-config master: kolla-cli: enter retirement https://review.opendev.org/c/openstack/project-config/+/814597 | 15:05 |
opendevreview | Thiago Paiva Brito proposed openstack/project-config master: Adding gerritreview messages to starlingx channel https://review.opendev.org/c/openstack/project-config/+/814600 | 15:06 |
afaranha | fungi, timburke_ thanks for the help, I'll try to increase the size as soon as I find time today or tomorrow (quite a busy week) | 15:07 |
fungi | yeah, as to why it would run out of space in that file, i have a feeling just increasing the size won't make any difference and that there's some test running away filling it up no matter how large you make it | 15:08 |
afaranha | fungi, right, so I'll have to find and check the timed-out test first to see what it's trying to do | 15:10 |
timburke_ | fwiw, i'd be somewhat shocked if it was really setting the xattrs that pushed it over the edge -- i would've expected the data itself to be doing it. there might be something funky going on with max xattr sizes in fips mode | 15:10 |
fungi | yeah, hopefully once we've got a held node for the failing job, the reasons will become more obvious | 15:11 |
timburke_ | we have some checks for large xattr support (https://opendev.org/openstack/swift/src/branch/master/test/unit/__init__.py#L1265-L1302) but i'm willing to bet we don't have appropriate skips in *all* the right places | 15:14 |
fungi | afaranha: what ssh key do you want granted access? | 15:27 |
afaranha | fungi, https://paste.openstack.org/raw/810081/ | 15:29 |
opendevreview | Thiago Paiva Brito proposed openstack/project-config master: Adding gerritreview messages to starlingx channel https://review.opendev.org/c/openstack/project-config/+/814600 | 15:30 |
fungi | afaranha: ssh root@137.74.28.172 | 15:33 |
timburke_ | speaking of gate jobs, i wouldn't mind getting some thoughts on https://review.opendev.org/c/zuul/zuul-jobs/+/795419 and/or https://review.opendev.org/c/openstack/project-config/+/794351 -- currently, pyeclib's test-release-openstack job keeps failing because it tries to build wheels despite not having all the deps it'd need to build binaries | 15:34 |
fungi | fwiw, i mounted /home/zuul/1G_xfs_file to /home/zuul/xfstmp/ on that node and it's only 5% space used, 1% inodes used | 15:35 |
fungi | so it's not full now, even if it was at some point | 15:35 |
fungi | timburke_: looking | 15:35 |
timburke_ | thanks! seems to me we can either skip building the wheel (it's not been getting uploaded anyway) or install bin-deps -- i don't care much which way it goes | 15:36 |
fungi | timburke_: i think skipping the wheel builds is best. you won't be able to upload platform-specific linux wheels to pypi anyway, you need special build environments for the "manylinux" meta-architectures | 15:37 |
timburke_ | makes sense | 15:40 |
timburke_ | afaranha, you might try cd'ing into the xfs mount and running something like `touch t && xattr -w user.test $(python -c "print(4097*'x')") t` | 15:42 |
opendevreview | Thiago Paiva Brito proposed openstack/project-config master: Adding gerritreview messages to starlingx channel https://review.opendev.org/c/openstack/project-config/+/814600 | 15:47 |
clarkb | fungi: timburke_ note there is some manylinux build work done for cryptography on our zuul installation. Though I think the package itself has to be fairly aware too and statically link libs that aren't part of manylinux | 15:52 |
fungi | right, would need to bundle eclib, presumably | 15:53 |
fungi | (at a minimum) | 15:53 |
timburke_ | yup | 15:53 |
fungi | working toward having manylinux* wheels for pyeclib might be an interesting exercise, but probably best to not have the tarball release uploads broken until that can be added | 15:54 |
timburke_ | it'd be cool to have liberasurecode statically linked, but i don't know that anyone's interested enough to dig into what it would take to do it | 15:54 |
timburke_ | fungi, exactly -- the really frustrating thing is that it's blocking me from even updating the README to reflect the freenode -> OFTC change | 15:54 |
fungi | i think it's just a matter of copying its .so into the right path in the wheel, but i've not tried it | 15:54 |
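That copying is what the auditwheel tool automates; a hedged sketch, assuming auditwheel is available and the build runs in a suitable manylinux environment (the wheel filename pattern is illustrative):

```sh
# Build a plain linux wheel, then have auditwheel vendor the shared
# libraries it links against (liberasurecode among them) and retag
# the wheel with a manylinux platform tag.
pip wheel . -w dist/
auditwheel repair dist/pyeclib-*.whl -w wheelhouse/
```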
fungi | timburke_: yep, i think if we can get tristan to confirm that change isn't breaking software factory, we should be able to go ahead and merge that flag | 15:56 |
fungi | i've asked him in the zuul matrix channel | 15:56 |
afaranha | timburke_, ack, I just need to find the venv to source it | 15:58 |
fungi | afaranha: no need to source a venv for that, just cd to /home/zuul/xfstmp where it's mounted | 16:00 |
afaranha | fungi, yea, but I can't use python | 16:00 |
afaranha | oops, python3 my bad | 16:00 |
afaranha | but no xattr | 16:01 |
afaranha | let me see what it does... | 16:01 |
fungi | might need to yum install it? | 16:02 |
fungi | er, dnf now | 16:02 |
fungi | oh, i see, it is a python-based tool, so yeah it's probably installed in the tox venv | 16:03 |
fungi | afaranha: the tox venv is /home/zuul/src/opendev.org/openstack/swift/.tox/func-encryption-py3 | 16:04 |
fungi | you can just directly invoke /home/zuul/src/opendev.org/openstack/swift/.tox/func-encryption-py3/bin/xattr in that case | 16:04 |
fungi | no need to activate the venv to do that | 16:04 |
afaranha | right, just a minute | 16:05 |
afaranha | fungi, I ran it and no issue happened, but what was I supposed to see? | 16:07 |
fungi | potentially a traceback with "OSError: [Errno 28] No space left on device" like in the job log | 16:08 |
afaranha | no, no issue | 16:08 |
afaranha | I checked the file with xattr -l t, and the attribute is there | 16:08 |
fungi | in /var/log/messages at the same time as the failure in the job output i also see this: | 16:11 |
fungi | Oct 19 13:35:05 centos-8-ovh-gra1-0027003455 wsgi-server[1826]: ERROR Insufficient Storage 127.0.0.1:41647/sdb1 (txn: tx2eef682df47b4737af575-00616ec989) | 16:11 |
fungi | afaranha: https://paste.opendev.org/show/810084 that's the transaction which failed on the insufficient space error, i think | 16:16 |
fungi | includes the syslogged entries from both swift and wsgi-server | 16:17 |
afaranha | fungi, shouldn't we see the sdb1 partition on this server? | 16:17 |
afaranha | "ERROR Insufficient Storage 127.0.0.1:41647/sdb1" | 16:18 |
fungi | i have no idea if that sdb1 is an actual block device partition on the node or something that's been virtualized in the functest suite | 16:18 |
fungi | i'm not super familiar with swift's functional testing | 16:19 |
timburke_ | most likely it's something faked up for the test. i can double check | 16:19 |
timburke_ | yeah, those are basically just made up names: https://opendev.org/openstack/swift/src/branch/master/test/functional/__init__.py#L510-L521 | 16:22 |
clarkb | most of our clouds give us xvd* and vd* type devices | 16:22 |
opendevreview | Merged openstack/project-config master: Adding gerritreview messages to starlingx channel https://review.opendev.org/c/openstack/project-config/+/814600 | 16:25 |
*** jpena is now known as jpena|off | 16:27 | |
*** ysandeep is now known as ysandeep|out | 16:29 | |
timburke_ | afaranha, might also try a higher value than 4097 -- we expect to be able to go as high as 64k: https://opendev.org/openstack/swift/src/branch/master/swift/obj/diskfile.py#L252-L268 | 16:37 |
timburke_ | and the encryption job is definitely going to want to write more metadata per object, which could explain why it fails while the erasure-coding job succeeds | 16:38 |
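One way to probe where the node's real limit sits, extending timburke_'s suggestion: a sketch using setfattr from the attr package, with sizes chosen to bracket swift's 64k expectation:

```sh
# From inside the mounted scratch filesystem, write progressively larger
# xattr values until the filesystem refuses one.
cd /home/zuul/xfstmp && touch t
for size in 4096 8192 16384 32768 65536; do
  setfattr -n user.test -v "$(head -c "$size" /dev/zero | tr '\0' x)" t \
    && echo "ok at $size" || { echo "failed at $size"; break; }
done
```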
clarkb | timburke_: out of curiosity, is the semi-recent update to linux xfs drivers to make xfs Y2038-safe something that is on swift's radar? I have no idea what that actually entails from an operational standpoint. I guess tell everyone to upgrade to linux >=5.10 before 2038 and do whatever xfs update is necessary? | 16:41 |
fungi | the clock is ticking, only 17 years left! | 16:43 |
timburke_ | it's something we'll definitely need to think about, but i don't think anyone's started on it yet. presumably, operators would start formatting all new disks to be Y2038-safe; normal disk failures and hardware deprecation cycles would mostly ensure that you're ready by the time it really matters | 16:48 |
timburke_ | might make some difference in our RAM requirements (those inodes must've gotten a little bigger), but RAM's getting denser and denser anyway | 16:50 |
clarkb | timburke_: ya, and you've got plenty of time. It came up in the context of building centos-9 images because centos-8 and ubuntu focal can't mount the xfs that centos-9 produces by default. Turns out the y2038 fix is the reason. Definitely seems like something worth backporting, but maybe not with that many years ahead of it | 16:50 |
timburke_ | good to know! | 16:51 |
fungi | i suppose that could be a problem for scenarios like rbd or other shared block devices if one of the systems is centos-9 and another is centos-8, like during upgrades | 16:53 |
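For reference, the Y2038 fix being discussed is xfsprogs' bigtime feature; a minimal sketch of formatting a Y2038-safe filesystem, assuming xfsprogs >= 5.10 (the device name is illustrative):

```sh
# Format with 64-bit ("big") timestamps, the default on newer distros
# such as centos-9; kernels without bigtime support cannot mount this.
mkfs.xfs -m bigtime=1 /dev/sdX
```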
-opendevstatus- NOTICE: Both Gerrit and Zuul services are being restarted briefly for minor updates, and should return to service momentarily; all previously running builds will be reenqueued once Zuul is fully started again | 17:00 | |
ade_lee | fungi, timburke_ hey I saw you set an autohold on the swift fips job that was failing. Did you guys figure out what was going on? | 19:39 |
fungi | ade_lee: afaranha was looking into it | 19:40 |
ade_lee | fungi, ok - did he get his keys on there? | 19:40 |
fungi | yep | 19:40 |
ade_lee | ok cool -- I expect I'll hear from him tomorrow then | 19:41 |
fungi | i poked at it a little and confirmed swift and wsgi-server logged that a test-related transaction was rejected with a 503 due to insufficient space somewhere, but it's not entirely clear where | 19:42 |
ade_lee | fungi, ok - hopefully with a system there we'll be able to see what's going on. do you want to discuss how you want to handle https://review.opendev.org/c/zuul/zuul-jobs/+/813253 now? | 19:43 |
ade_lee | dviroel|rover, ^^ | 19:43 |
fungi | ade_lee: the opendev meeting is in progress, and after that i need to work on dinner prep, but can probably start revisiting it | 19:44 |
clarkb | I think you should parent the multinode job and then add in your own enable-fips rather than making the multinode job have a flag for fips | 19:45 |
clarkb | the two don't really have a relationship to each other so keeping them separate will reduce confusion | 19:45 |
fungi | yeah, one of the suggestions i had was to do the fips setup before the multinode setup | 19:45 |
fungi | because the reboot should happen as early as possible in order to not have to worry about stateless setup steps getting undone by the reboot | 19:46 |
* dviroel|rover reads | 19:46 | |
clarkb | fungi: ah yup. But that should be doable. Have a fips job and then layer on top of that | 19:46 |
fungi | since zuul wraps pre-run and post-run playbooks like an onion based on order of inheritance, it implies having some job which uses the multinode bits parented to a job with the fips setup bits | 19:47 |
clarkb | ya I think that is fine too as you can do that to the side. But if you stick it directly in the role it becomes a part of that specific interface which is weird to me | 19:48 |
fungi | or just having one pre-run playbook which includes the fips role first and then the multinode role | 19:48 |
clarkb | basically have a fips job, then a multinode-fips that inherits from fips | 19:48 |
clarkb | but the multinode role interface itself isn't changed | 19:48 |
ade_lee | fungi, clarkb ok - so what I'm hearing is add a fips job and a mutinode fips job to zuul.d/general-jobs.yaml | 19:50 |
ade_lee | where multinode-fips depends on the fips job | 19:50 |
clarkb | ya I think that is better than modifying the multinode role which is intended to be a bit more self contained | 19:50 |
fungi | right | 19:50 |
ade_lee | dviroel|rover, ^^ think this could work for us? | 19:51 |
fungi | or you can have a multinode-fips job which directly includes the fips and then multinode roles in that order, i think | 19:51 |
dviroel|rover | yeah, makes sense to me to have a separate multinode-fips and not mix things | 19:51 |
ade_lee | cool - well that was easy :) | 19:52 |
dviroel|rover | ade_lee: yes, i can work on that in a couple of mins | 19:52 |
ade_lee | cool thanks -- fungi , clarkb - we'll let you know when we have some results | 19:53 |
fungi | great! don't hesitate to ask if you have more questions. happy to answer them when i can, though some will come down to experimentation | 19:54 |
ade_lee | yup will do - especially if it doesn't work :) | 19:56 |
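A hypothetical sketch of the layering settled on above; the job and playbook names are illustrative, not the actual zuul-jobs definitions. Because zuul wraps pre-run playbooks in inheritance order, the parent's FIPS pre-run (with its early reboot) executes before the child's multinode setup:

```yaml
- job:
    name: enable-fips
    pre-run: playbooks/enable-fips/pre.yaml   # reboots into FIPS mode first

- job:
    name: multinode-fips
    parent: enable-fips
    pre-run: playbooks/multinode/pre.yaml     # runs after the parent's pre-run
```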
*** dviroel|rover is now known as dviroel|rover|afk | 20:59 | |
opendevreview | Ian Wienand proposed openstack/project-config master: infra-package-needs: install latest pip https://review.opendev.org/c/openstack/project-config/+/814677 | 22:20 |
opendevreview | Ian Wienand proposed openstack/project-config master: infra-package-needs: install latest pip https://review.opendev.org/c/openstack/project-config/+/814677 | 22:22 |
opendevreview | Ian Wienand proposed openstack/project-config master: infra-package-needs: install latest pip https://review.opendev.org/c/openstack/project-config/+/814677 | 22:56 |
opendevreview | James E. Blair proposed openstack/project-config master: Add ubuntu-bionic-32GB to vexxhost-specific https://review.opendev.org/c/openstack/project-config/+/814683 | 23:11 |
opendevreview | James E. Blair proposed openstack/project-config master: Add ubuntu-bionic-32GB to vexxhost-specific https://review.opendev.org/c/openstack/project-config/+/814683 | 23:13 |
opendevreview | James E. Blair proposed openstack/project-config master: Add ubuntu-bionic-32GB https://review.opendev.org/c/openstack/project-config/+/814683 | 23:17 |