clarkb | I'm guessing both versions of yaml interact with the built in type though possibly differently | 00:00 |
---|---|---|
ianw | so it's looking at | 00:00 |
mordred | and it may just be that yaml does a lot of dict operations, so greater likelihood of triggering it | 00:00 |
clarkb | the reporter found 5 distinct bugs :) | 00:00 |
clarkb | mordred: ya | 00:00 |
ianw | @_ssh_retry | 00:00 |
ianw | def _run(self, cmd, in_data, sudoable=True, checkrc=True): | 00:00 |
ianw | processing that decorator when it fails | 00:00 |
jeblair | that ^ makes me think this is not related to yaml | 00:00 |
ianw | just for added degree of difficulty | 00:00 |
mordred | jeblair: ++ | 00:00 |
jeblair | on account of everything related to names in python being a dict, i don't think we can eliminate dict operations from the suspect list though | 00:01 |
ianw | unless yaml got gc things in a bad state previously | 00:01 |
jeblair | taking a cue from that bug report: | 00:03 |
jeblair | (gdb) print *op | 00:03 |
jeblair | $1 = {ob_refcnt = 5, ob_type = 0x0} | 00:03 |
jeblair | ob_type should be a data structure | 00:04 |
jeblair | clarkb: i don't know enough about python internals to connect your fix to our backtrace; so we may just need to compile a new python and try it as you suggest. | 00:05 |
clarkb | ya I too don't know enough. It's also a fairly complicated piece of C last I looked at it | 00:06 |
mordred | I hear SpamapS knows how to get patches included in stable backport repos for ubuntu ... | 00:07 |
ianw | the python bt doesn't suggest there's much yaml in there : http://paste.openstack.org/show/618729/ | 00:08 |
jeblair | i need to eod; if anyone wants to compile python with that fix, i'll be happy to try it again tomorrow | 00:08 |
jeblair | otherwise, i'll start in on that in the morning | 00:09 |
clarkb | jeblair: you could also run the python tests really quickly to confirm the bug is present at least | 00:09 |
clarkb | before worrying about patching in the fix | 00:09 |
clarkb | but I too need to go make some dinner | 00:09 |
mordred | jeblair: I'm also about at eod - if nobody has when I wake up, i'll try to get a python built | 00:09 |
jeblair | cool, thx. | 00:10 |
mordred | ok. I've got the patch imported in to the quilt stack on top of the xenial python3.5 package | 00:16 |
SpamapS | is this libyaml or pyyaml? | 00:18 |
* SpamapS has been distracted | 00:18 | |
mordred | SpamapS: python | 00:20 |
ianw | module = imp.load_source(full_name, path, module_file) | 00:32 |
ianw | imp -- Access the import internals | 00:32 |
ianw | i'm not surprised this is getting into "interesting" territory | 00:32 |
SpamapS | yeah haven't seen imp go right too often | 01:08 |
ianw | jeblair: heh, looking at a combination of the bt and py-bt & variables I'd like to bet it's got to do with racing and module importing (http://paste.openstack.org/show/618729/) | 01:44 |
ianw | and i see this isn't your first rodeo there :) -> https://github.com/jeblair/ansible/commit/13807d79012c2354a20eb4020021c88f1909bd36 | 01:44 |
ianw | although this is all in the forking & loading path, no threads ... but still ... prime suspect | 01:50 |
*** eventingmonkey has quit IRC | 04:29 | |
*** eventingmonkey has joined #zuul | 04:54 | |
*** isaacb has joined #zuul | 05:27 | |
*** isaacb has quit IRC | 06:15 | |
*** amoralej|off is now known as amoralej | 06:45 | |
*** jkilpatr has quit IRC | 10:46 | |
*** jkilpatr has joined #zuul | 11:07 | |
*** dkranz has joined #zuul | 12:24 | |
*** amoralej is now known as amoralej|off | 13:40 | |
jeblair | mordred: where's your python + patch? | 14:30 |
mordred | jeblair: almost done | 14:33 |
mordred | jeblair: (I built one for 3.5.1, which was what's on my laptop, but we have 3.5.2 on the server because we have backports turned on - so finishing building for 3.5.2 now) | 14:34 |
jeblair | mordred: weird... 3.5.2 is in xenial main: (well, xenial security) https://packages.ubuntu.com/xenial/python3.5 | 14:39 |
mordred | jeblair: it may just be that I don't have security/backports set up for my deb-src lines, so apt-get source failed me | 14:40 |
jeblair | mordred: ah, hope so. (missing security is less than ideal!) | 14:41 |
mordred | jeblair: ok - there's a set of packages - and the patch itself - in /home/mordred/python-backport on ze01 | 14:48 |
jeblair | mordred: cool, i'll install them and restart and recheck | 14:49 |
pabelanger | https://review.openstack.org/494677/ and https://review.openstack.org/494672/ are ready for a discussion. Basically trying to see if we want untrusted or trusted, because I believe both jobs will work the same, regardless of repo they are in | 15:21 |
Shrews | jeblair: mordred: leifmadsen: https://review.openstack.org/493873 | 15:27 |
jeblair | mordred: i installed the new python, restarted executor (no idea why actually -- i guess i didn't need to since it's ansible-playbook doing the deed) and have rechecked all open zuul changes | 16:23 |
SpamapS | does ansible try to fork and continue forward in python without re-execing python? | 16:28 |
SpamapS | or | 16:28 |
SpamapS | I said that wrong.. | 16:28 |
SpamapS | I've always heard that importing after fork was something known to be flaky in python for some reason. | 16:29 |
SpamapS | That if that was possible, the best practice was to make an entry point and re-exec python from the top into that entry point. | 16:29 |
SpamapS | but this is one of those dark evil memories that I can't recall details on. | 16:29 |
mordred | SpamapS: there are definitely imports that happen in ansible-playbook post-fork | 16:32 |
mordred | SpamapS: well- actually - I say that - I may actually be lying about that | 16:33 |
mordred | there are definitely imports that happen in callback plugins, and imports that happen in action plugins | 16:33 |
mordred | I don't know when the callback plugins get imported - I believe action plugins are imported as needed though | 16:34 |
mordred | of course, there is CRAZY import magic that happens in ansible for importing modules and plugins | 16:35 |
SpamapS | a bit more coming back to me.. it had something to do with globals and import side effects | 16:36 |
mordred | we chatted a little with the core team about maybe reworking that in ansible core - they're not opposed to it, but there's such black magic it'll be both a big job and one that needs to be done exceptionally carefully | 16:36 |
*** robled has quit IRC | 16:42 | |
*** robled has joined #zuul | 16:42 | |
*** rcarrill1 is now known as rcarrillocruz | 16:42 | |
jeblair | mordred, SpamapS, clarkb: the first batch of rechecks just completed with no errors. | 16:48 |
jeblair | that did not happen yesterday -- the number of errors we got for each batch was: [4, 1, 5, 6] | 16:49 |
jeblair | i will launch a second batch of rechecks now | 16:50 |
clarkb | that sounds promising | 16:51 |
SpamapS | indeed | 17:05 |
SpamapS | have we had any success boiling down to a smaller reproducer? | 17:05 |
SpamapS | or even tried? | 17:05 |
SpamapS | Like would be nice if we can hand the Ubuntu SRU team a script and say "run this 100 times and if it never fails this is fixed" | 17:06 |
SpamapS | races have an exception though.. if reproducing is hard, they'll just trust you :) | 17:06 |
mordred | SpamapS: yah- there are some reproduction test cases in the python bug | 17:06 |
clarkb | SpamapS: the upstream patch had reproducers for all of the cases they found. So assuming this fixes it for us then ya | 17:07 |
mordred | SpamapS: I've got a local package source repo with the patch applied and whatnot - so I'm more than happy to submit the various things to ubuntu to try to get it accepted - but I might need you to walk me through that process | 17:07 |
mordred | SpamapS: I also don't need credit - so if it's easier for me to make a debdiff and hand it to you than for you to tell me what to do, that's cool too | 17:08 |
clarkb | I think we see a higher rate of fails with py35 + tempest too I wonder if some of that goes away with this | 17:08 |
jeblair | i just verified that the reproducer which matches our traceback fails with xenial python 3.5.2 | 17:11 |
jeblair | it's this one: http://paste.openstack.org/show/618790/ | 17:12 |
SpamapS | mordred: do you have it as a quilt patch? | 17:16 |
mordred | I do | 17:16 |
SpamapS | that should be relatively easy to submit | 17:16 |
mordred | SpamapS: cool. what should my process for that be? | 17:16 |
jeblair | second batch just completed with no segfaults | 17:16 |
SpamapS | https://wiki.ubuntu.com/StableReleaseUpdates <-- that's the instructions | 17:16 |
SpamapS | and it's actually pretty straightforward | 17:17 |
SpamapS | the version # is the hardest part | 17:17 |
SpamapS | jeblair: \o/ | 17:17 |
jeblair | i will run one more batch because 3 is a nice number, but i think we're in pretty safe territory now | 17:17 |
mordred | SpamapS: Check that the bug is fixed in the current development release, and that its bug task is "Fix Released". Equivalently for new upstream releases, this (or a newer) release must be in the development release. It | 17:17 |
pabelanger | great work | 17:18 |
SpamapS | mordred: I can do the bug nomination | 17:18 |
mordred | SpamapS: so do I need to first fix python3.5 in zenity? | 17:18 |
SpamapS | mordred: zenity is on 3.6 by now I think | 17:18 |
SpamapS | not 100% sure | 17:18 |
SpamapS | https://launchpad.net/ubuntu/+source/python3.6 | 17:18 |
SpamapS | yeah, artful was 3.6 too | 17:18 |
*** bhavik1 has joined #zuul | 17:19 | |
SpamapS | zesty is what you meant though | 17:19 |
mordred | there IS a 3.5 in zesty though | 17:19 |
SpamapS | artful is the dev release | 17:19 |
mordred | so should I also submit to that? | 17:19 |
SpamapS | you don't have to fix zesty | 17:19 |
SpamapS | thank god | 17:19 |
mordred | ok. cool | 17:19 |
SpamapS | so the patch isn't even in python mainline yet? | 17:19 |
mordred | also - doko seems to do things in debian and sync - do I need to worry about that? | 17:20 |
mordred | it is | 17:20 |
jeblair | it is in mainline, and it looks like it's in the 3.5.4 release | 17:20 |
SpamapS | Worth pinging doko in #ubuntu-devel | 17:20 |
SpamapS | If it's already in mainline you may be covered | 17:20 |
mordred | SpamapS: https://github.com/python/cpython/commit/2f7f533cf6fb57fcedcbc7bd454ac59fbaf2c655 | 17:20 |
mordred | xenial is only on 3.5.2 and the patch cleanly backports - so should I request they bump xenial to 3.5.4 or just submit the backport patch? | 17:21 |
SpamapS | You could also make the argument that 3.5 isn't in artful ;) | 17:21 |
SpamapS | jeblair: where did you find that it was in 3.5.4? Would be good supporting info | 17:23 |
SpamapS | also I assume it's in a 3.6 release too then | 17:23 |
mordred | SpamapS: http://bugs.python.org/issue27945 is the upstream bug | 17:23 |
mordred | SpamapS: so I do need to file a bug on the xenial python package and that's step one yeah? | 17:24 |
SpamapS | mordred: yeah, it's also possible that bug is already reported and being tracked. Let's see if I remember how to check that on launchpad.. | 17:25 |
*** bhavik1 has quit IRC | 17:25 | |
jeblair | SpamapS, mordred: https://docs.python.org/3.5/whatsnew/changelog.html#core-and-builtins bpo-27945 is in the 3.5.4 release notes | 17:25 |
jeblair | also 3.6.2 | 17:27 |
SpamapS | https://launchpad.net/debian/sid/+source/python2.7/+changelog | 17:28 |
SpamapS | found that.. so 2.7 has it in sid.. | 17:28 |
* SpamapS isn't helping | 17:28 | |
jeblair | SpamapS: 3.5 version of that page: https://launchpad.net/debian/+source/python3.5/+changelog | 17:28 |
SpamapS | https://launchpad.net/ubuntu/+source/python3.5/+changelog | 17:29 |
SpamapS | so 3.5.4 is actually in artful... | 17:29 |
SpamapS | so you can check off that box | 17:29 |
mordred | ok. cool. but there's no existing bug so filing a new one is valuable | 17:29 |
SpamapS | mordred: I'll file one | 17:31 |
SpamapS | unless you already finished | 17:31 |
SpamapS | https://bugs.launchpad.net/ubuntu/+source/python3.5/+bug/1711724 | 17:32 |
openstack | Launchpad bug 1711724 in python3.5 (Ubuntu) "Segfaults with dict" [Undecided,New] | 17:32 |
SpamapS | I'll add the SRU template and nominate it | 17:32 |
mordred | SpamapS: quilt patch attached | 17:35 |
mordred | in case that makes anything easier | 17:35 |
SpamapS | it does | 17:35 |
SpamapS | I don't see any simple reproducers in that upstream bug | 17:35 |
SpamapS | ahh there's one | 17:36 |
mordred | they're up at the top | 17:36 |
jeblair | SpamapS: this is the one that bit us and i verified fails in xenial 3.5.2: http://paste.openstack.org/show/618790/ | 17:37 |
jeblair | (it's from that bug) | 17:37 |
mordred | neat. should we attach that to the bug? | 17:37 |
jeblair | it's poc27 from the bug (note: 27 afaict has nothing to do with python versions -- i think it's just literally script #27 the author wrote) | 17:38 |
mordred | btw - test_fromkeys_operator_modifying_dict_operand is added to the test suite in the patch I posted, and it is the paste jeblair just shared | 17:39 |
mordred | jeblair: you've verified that works with the new python installed? | 17:39 |
mordred | (I mean, the package wouldn't have built if it didn't) | 17:39 |
jeblair | batch 3 completed with no segfaults. that's 0/48 runs and we're now 6 standard deviations away from our expected value (average 25% failure rate, so we should have seen 12 by now). we're well into 'statistically significant' territory. :) | 17:39 |
pabelanger | ++ | 17:40 |
mordred | ok. so that's awesome | 17:43 |
mordred | between now and that being released - how do we want to deal with this? just install those debs on our executors by hand? put it into a local repo and add that to our puppet? | 17:44 |
SpamapS | mordred: ok I'll just make the SRU package from your quilt patch. | 17:44 |
SpamapS | and upload it | 17:44 |
SpamapS | since I still have my ubuntu dev rights | 17:44 |
mordred | SpamapS: sweet! wow, you have power | 17:44 |
SpamapS | so that should get it into the SRU team's hands fairly quickly. | 17:44 |
SpamapS | (I no longer have the power to accept it into xenial-proposed though) | 17:45 |
mordred | SpamapS: do you think we should just install the deb by hand on the executors for now? | 17:45 |
mordred | or I suppose I could upload it to a ppa real quick | 17:45 |
jeblair | eithor of those wfm | 17:46 |
mordred | how about I make a ppa just for this and upload the package to that | 17:46 |
mordred | that way we can add the ppa to our puppet and the situation will be documented and repeatable | 17:47 |
jeblair | (omg why *isn't* it spelled eithOR? that's awesome!) | 17:47 |
jeblair | mordred: i like that; we don't have to worry as much when spinning up new launchers | 17:47 |
jeblair | er executors. whatever they're called | 17:47 |
mordred | ++ | 17:48 |
mordred | pabelanger: we should maybe nudge some friends at RH about this WRT centos/fedora - I think latest fedora has 3.6 so likely not a problem, but whatever python3.5 package is in centos likely wants to get this patch too | 17:48 |
mordred | pabelanger: I have zero clue how to go about doing that though | 17:48 |
pabelanger | mordred: sure, can look into that on pkgdb and see if we need to apply it | 17:49 |
pabelanger | mordred: centos is another story, they are still 2.7 in their repo. But software collection can likely apply this | 17:50 |
mordred | pabelanger: yah - basically whatever thing a person running on a RH-based OS would use so that they could run Zuul v3 | 17:51 |
pabelanger | ++ | 17:51 |
SpamapS | Uploading python3.5_3.5.2-2ubuntu0~16.04.2_source.changes: done. | 17:53 |
SpamapS | Successfully uploaded packages. | 17:53 |
SpamapS | I think putting it in a PPA is a good idea | 17:53 |
mordred | https://launchpad.net/~openstack-ci-core/+archive/ubuntu/python-bpo-27945-backport/+packages | 17:53 |
mordred | it's building | 17:53 |
mordred | ppa:openstack-ci-core/python-bpo-27945-backport is the PPA ... lemme go make a puppet patch | 17:54 |
SpamapS | oh that's interesting, python3.5 was just deleted from artful yesterday :-P | 17:57 |
SpamapS | They _might_ still ask us to update zesty's 3.6.1 .. but I'm hoping they won't | 17:58 |
mordred | SpamapS: yah - since zesty isn't an LTS nor the current development release I hope not | 17:59 |
mordred | SpamapS: but if they do, I can get that done - and we're covered by the PPA in the mean time | 17:59 |
SpamapS | yeah and since they went to the 9 month instead of 12 month support cycle, nobody expects 16.04->16.10 after the 6 month window. | 17:59 |
SpamapS | yakkety's already EOL | 18:00 |
SpamapS | and to wrap it up in a bow, I pinged doko about the SRU upload in #ubuntu-devel | 18:03 |
SpamapS | so at this point, I'd say we can move on and just delete the package from the PPA whenever the SRU lands in xenial-updates | 18:04 |
clarkb | or upgrade to next lts whichever comes first | 18:04 |
clarkb | also this was a bug in 3.6 too; should make sure it's patched in newer ubuntu | 18:05 |
SpamapS | only 8 months away! | 18:05 |
SpamapS | clarkb: see the bug, artful has already got 3.6.2 which has the fix. | 18:05 |
SpamapS | I THINK I got the statuses right on the bug | 18:06 |
clarkb | two lts releases in a row we've caught segfault bugs in python | 18:06 |
mordred | clarkb: we're getting good at that | 18:09 |
mordred | clarkb: I am curious as to whether this is affecting tempest stability | 18:10 |
jeblair | SpamapS: why is the python3.5 overall status 'invalid' ? | 18:11 |
clarkb | mordred: ya it has a higher fail rate and I'm guessing this must have something to do with it | 18:11 |
clarkb | I've approved the change to the executors to use the ppa, so we can wait for daily upgrades to pull it down or we'll need a manual reinstall | 18:11 |
SpamapS | jeblair: because that's the status "in the development release" | 18:16 |
SpamapS | there's no python3.5 in the dev release | 18:16 |
mordred | jeblair: "oldrev = 40 * '0'" - didn't we introduce a constant or a name for that or something somewhere? | 18:17 |
mordred | (the 40 * '0') | 18:17 |
SpamapS | quirk of the way source package names change with upstream python | 18:17 |
clarkb | mordred: null | 18:17 |
mordred | maybe that was just for reporting back to user | 18:17 |
clarkb | mordred: internally it's no longer 40 zeros | 18:17 |
clarkb | it's just None or null | 18:17 |
mordred | but it is coming from gerrit - cool, I thought I remembered something there | 18:17 |
jeblair | mordred, pabelanger: with all the problems of the past 2 days, i lost track of movement on the tarball/publish jobs. sitrep? | 19:05 |
pabelanger | I have 2 patches up for untrusted / trusted version. Just need to see what people prefer, they should both do the same thing. https://review.openstack.org/494672/ and https://review.openstack.org/494677/ | 19:06 |
pabelanger | twine is ready and depending on ^ to some degree. Confirmed pip is working on executor | 19:06 |
pabelanger | gpg still needs to be done, but just need to ensure it exists on executors mostly | 19:07 |
jeblair | pabelanger: we do have site local variables -- but your patch(es?) are still adding TODO comments for mordred to add site local vars | 19:07 |
jeblair | pabelanger: is there a reason not to use those now? | 19:07 |
pabelanger | jeblair: ya, we can move them into site-variables now | 19:08 |
pabelanger | I can propose patches for that also | 19:08 |
jeblair | pabelanger: (though, i don't know why we should add site local variables for things in project-config or openstack-zuul-jobs) | 19:08 |
pabelanger | true, maybe we don't need to | 19:08 |
jeblair | i think that's the position i'm going to adopt. :) | 19:09 |
pabelanger | wfm | 19:09 |
pabelanger | I'm pretty sure we can also swap out our zuulv3-dev ssh keys for production, people just need to audit things if they still believe there is an issue | 19:09 |
jeblair | (the whole point of zuulv3 was to get this stuff *out* of system-config, everything we put back in walks that back, so we should be very reluctant to do so) | 19:09 |
pabelanger | Ya, site-variables are still a little odd, because we manage that with puppet. | 19:10 |
clarkb | is bindep not going to be part of a base job pre run? | 19:12 |
pabelanger | not right now | 19:13 |
jeblair | pabelanger: i don't understand 494677 -- it's adding a second post-run playbook. why? | 19:14 |
clarkb | also looks like you delete the wheels that are built because they are not tar.gz files | 19:14 |
pabelanger | jeblair: right, if we decide on the untrusted approach, I need to collapse those playbooks into a single post playbook | 19:15 |
pabelanger | yes, we don't upload whls on branch tarballs, or do we? | 19:16 |
clarkb | I'm not sure but you are building them | 19:16 |
jeblair | pabelanger: yeah, honestly, i find it hard to understand what that job is doing. i'd prefer one playbook and all the tasks refactored into roles and just have the playbooks of both jobs use the same roles. | 19:17 |
pabelanger | clarkb: yes, this is the same as JJB today, we build wheels but don't upload | 19:17 |
clarkb | huh that seems like a bug | 19:17 |
jeblair | yeah, we certainly don't need to do throwaway work in a post job | 19:17 |
clarkb | ++ | 19:18 |
pabelanger | jeblair: okay, so should I focus on trusted or untrusted playbooks? | 19:18 |
clarkb | pabelanger: maybe just drop the bdist_wheel in your update and when we switch that bug will be fixed | 19:18 |
pabelanger | and we are okay uploading wheels for branch-tarballs? | 19:18 |
clarkb | or we can upload wheels. I think dropping the wheel build is likely fine and will simplify things | 19:18 |
jeblair | pabelanger: mordred never got back to you with a preference for which repo that job should be in? | 19:19 |
pabelanger | clarkb: I'll upload whl, that is actually less work for playbooks | 19:20 |
pabelanger | jeblair: nobody except you | 19:20 |
jeblair | pabelanger: is there a reason you want it in openstack-zuul-jobs as opposed to project-config? | 19:23 |
clarkb | for those of us that haven't been able to keep up with the thinking on ^ might be nice to have a short blurb on the current structure of things | 19:23 |
pabelanger | jeblair: I've just found it easier to test ansible in untrusted projects | 19:24 |
pabelanger | that's really about it | 19:24 |
jeblair | pabelanger: how so? | 19:25 |
jeblair | (if you're able to upload something to tarballs.o.o in a check pipeline, that's a bug) | 19:25 |
pabelanger | right, with 494677 we talked about creating experimental pipeline to allow for testing that | 19:25 |
pabelanger | so we could validate uploads to tarballs.o.o | 19:26 |
jeblair | pabelanger: how would that work? i don't remember this chat. | 19:26 |
pabelanger | I'll have to look back in IRC logs, but let me maybe ask a different question | 19:27 |
pabelanger | today, a project can add a new experimental job to JJB, to ensure things upload to tarballs.o.o. Would we also do that in zuulv3? | 19:27 |
jeblair | pabelanger: we should never approve a job to the experimental pipeline that uploads something to tarballs. experimental is pre-review. | 19:28 |
pabelanger | okay, so we need to fix that, because this is how I was able to test our deb-package work, and how kolla is also testing their publishing things | 19:28 |
pabelanger | so, in this case, it doesn't make sense to have it in openstack-zuul-jobs, as we cannot actually do the publish part | 19:29 |
jeblair | pabelanger: yah, i mean, if that's in place, anyone can propose a change to kolla that uploads whatever they want to tarballs.o.o | 19:29 |
pabelanger | ya, basically | 19:29 |
pabelanger | we usually add the experimental job to validate the upload, then move it right into post | 19:30 |
jeblair | experimental is to validate things before they get moved to check | 19:30 |
pabelanger | okay, that's not how we have been using it for some time | 19:31 |
pabelanger | I'll stop doing that now | 19:31 |
jeblair | "On-demand pipeline for requesting a run against a set of jobs that are not yet gating." | 19:31 |
jeblair | cool, thanks :) | 19:31 |
pabelanger | so, then this will live in project-config | 19:31 |
jeblair | sounds like a plan | 19:31 |
pabelanger | we first need to delete the existing publish-openstack-python-branch-tarball job in zuul, then project-config so we can land 494672 | 19:32 |
jeblair | so move to project config -- and roles so we can share things between the regular and branch versions of the job | 19:32 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Remove publish-openstack-python-branch-tarball job https://review.openstack.org/495418 | 19:34 |
*** jkilpatr has quit IRC | 19:35 | |
jeblair | we could set up openstack-zuul-jobs to shadow project-config to make this easier in the future, but we can do that ^ for now | 19:36 |
pabelanger | sure, if that is better in the long run | 19:37 |
jeblair | pabelanger: are you going to move publish-openstack-python-tarball too? | 19:38 |
pabelanger | yes, looking at that now | 19:38 |
pabelanger | jeblair: 494672 should be that | 19:38 |
pabelanger | Going to make the changes clarkb suggested | 19:39 |
mordred | jeblair, pabelanger: sorry - my brain has had a hard time grokking the scrollback just now - I agree about roles - can I restate a few of the other things said above to make sure I understand what we're talking about? | 19:44 |
pabelanger | I think the confusion comes from experimental pipelines being able to publish to tarballs.o.o | 19:45 |
pabelanger | after that was cleared up, project-config is the right place | 19:45 |
mordred | ok. lemme just make sure I've got it all though - I'm feeling dumb at the moment ... :) | 19:45 |
mordred | openstack-publish-artifacts must be in project-config because it does the actual artifact publication | 19:45 |
pabelanger | yes | 19:46 |
mordred | none of check, gate and experimental pipelines should be able to execute jobs that have openstack-publish-artifacts as a base job | 19:46 |
pabelanger | yes | 19:46 |
leifmadsen | Shrews: great doc change, +1 | 19:47 |
mordred | publish-openstack-python-tarball COULD be in openstack-zuul-jobs with openstack-publish-artifacts as a base, but in either case because it has openstack-publish-artifacts as a base it will not be able to be added to check, gate or experimental | 19:47 |
mordred | (like, it's safe for it to be, but it still wouldn't be able to be run speculatively) | 19:47 |
pabelanger | yes | 19:47 |
mordred | ok. cool. mostly just making sure I had it all in my head straight :) | 19:48 |
*** jkilpatr has joined #zuul | 19:50 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove publish-openstack-python-branch-tarball job https://review.openstack.org/495418 | 19:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role for emitting ara logs on the executor https://review.openstack.org/495422 | 20:07 |
jeblair | mordred: did you want to review 494651 ? | 20:14 |
jeblair | also, is anyone else interested in reviewing my changes to use the cached branch list in dynamic reconfiguration? 494650, 494651, and 494618 | 20:15 |
jeblair | that's a performance improvement we need before going live | 20:15 |
mordred | jeblair: yes! | 20:18 |
mordred | jeblair: I thought I'd reviewed that one already ... | 20:18 |
mordred | oh - I got distracted by the 40 * '0' question | 20:18 |
mordred | jeblair: stack looks great | 20:19 |
mordred | jeblair: https://review.openstack.org/#/c/494343/ is finally green after all the recheck fun too :) | 20:19 |
mordred | pabelanger, SpamapS: ^^ jeblair's patches are more important, but I'd like to get that one in too so that we can use it in the base publish jobs | 20:20 |
jeblair | i re +2d that (lost with my recheck) | 20:22 |
pabelanger | +3 for jeblair stack | 20:23 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: DNM Reparent unittests on base-test https://review.openstack.org/495424 | 20:26 |
mordred | jeblair: I'm not getting web streaming http://zuulv3.openstack.org/static/stream.html?uuid=1874f74d2aea4a2195915e1705d50548&logfile=console.log | 20:28 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add role for emitting ara logs on the executor https://review.openstack.org/495422 | 20:29 |
jeblair | mordred: yes, i observed that yesterday too. finger works though. so maybe a zuul-web issue? | 20:29 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Allow multiple semaphore definitions within a project https://review.openstack.org/494650 | 20:29 |
mordred | jeblair: maybe so - maybe we need to have restarted it with the other things that were going on? | 20:30 |
pabelanger | Hmm, is web streaming not working? | 20:30 |
jeblair | mordred: i can't think of why; it may represent a failure worth examining. | 20:31 |
pabelanger | finger directly still works | 20:31 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Reload configuration when branches are created or deleted https://review.openstack.org/494651 | 20:31 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use cached branch list in dynamic config https://review.openstack.org/494618 | 20:31 |
jeblair | SpamapS: if i add support for using bwrap tmpfs to hold secrets without hitting disk, i'm not sure what the nullwrap driver should do. do you have any thoughts? | 20:32 |
mordred | jeblair: agree | 20:32 |
jeblair | 2017-08-18 20:30:38,190 DEBUG zuul.web.LogStreamingHandler: Connecting to finger server ze01:79 | 20:33 |
jeblair | i rebooted ze01 | 20:33 |
jeblair | it got a new hostname | 20:33 |
jeblair | cat /etc/hostname | 20:34 |
jeblair | ze01 | 20:34 |
jeblair | cat /etc/hostname | 20:34 |
jeblair | ze02.openstack.org | 20:34 |
jeblair | let's move that to -infra | 20:34 |
mordred | jeblair: nullwrap could still use a tmpfs for them | 20:34 |
jeblair | mordred: yeah, but then i'm writing a bunch of code in zuul to manage tmpfs's. i'm not sure anyone is using this code, and i don't know who would... which makes me....unenthused about writing it? | 20:35 |
mordred | jeblair: oh - sorry - what I meant was that in nullwrap you could just ignore it and let /tmp / mktmp() work as normal - and if a person is running with nullwrap they may want to set up a single tmpfs (maybe in /tmp) if they care about secrets hitting disk | 20:36 |
jeblair | mordred: well, the idea i have is to have bwrap both create the tmpfs and write the file contents out to it. so nullwrap would have to implement both of those things. it is slightly easier if i just have nullwrap 'mkdir' and splat the contents. that's probably only a few lines. | 20:37 |
mordred | jeblair: nod. yah- I think having nullwrap just do a mktemp() in the 'make tmpfs' call and just splat to it seems fine to me | 20:39 |
jeblair | mordred: not so much mktemp, but okay, i'll have nullwrap implement this by writing the secrets to disk. | 20:45 |
jeblair | that will be a behavior difference between nullwrap and bubblewrap, and will only reinforce the idea that no one should use nullwrap and everyone should use bubblewrap. | 20:46 |
jeblair | i think with an important behavior/security difference like this, we should either articulate under what circumstances nullwrap should be used, or remove it. | 20:47 |
jeblair | i restarted ze01 with correct hostname | 20:48 |
SpamapS | jeblair: nullwrap should make its own tmpfs? | 20:49 |
jeblair | SpamapS: for feature parity, yes. to be honest, i'm trying to get out of writing that because i'm not sure if there are any users. | 20:50 |
mordred | I'd rather see the simple "just write to disk" answer for nullwrap and put in a doc note/todo that if someone has a usecase for using nullwrap and they are concerned about secrets hitting disk on their executor there is work to do | 20:51 |
jeblair | mordred: but is that right under the note that says "don't use nullwrap because your executor will be pwned?" | 20:52 |
jeblair | like, nullwrap is a foot-cannon, so i'm trying to understand its place | 20:52 |
jeblair | it has no presence in the documentation atm, so to write that note we'd have to say "there's a thing you can do but you probably shouldn't and if you did it's still a bad idea because of this which you should fix" | 20:54 |
jeblair | if we're going to do that, i would like us to understand who the audience is. i don't. | 20:54 |
mordred | jeblair: well - for folks who are running internally or for themselves and don't have a distro that has bubblewrap available but the python stuff otherwise works - I could see them legitimately not wanting to care about bubblewrap/writing secrets to disk | 20:56 |
mordred | "run zuul on my laptop on my own content" ... probably doesn't care about wrapping ansible-playbook calls in bubblewrap | 20:57 |
jeblair | mordred: i agree, however, bubblewrap is the zero-effort* option -- asking someone to make this choice means having to explain the difference. | 20:58 |
jeblair | mordred: * i didn't think bubblewrap in distros was a problem? | 20:58 |
jeblair | (obviously, if your distro doesn't provide bubblewrap, that's not zero-effort) | 20:59 |
mordred | jeblair: well- we add a ppa for it for our xenial server - but it's PRETTY low effort to do that | 21:00 |
jeblair | ah :/ | 21:01 |
SpamapS | I wrote it in large part because I was worried there'd be users who want to use zuul w/o untrusted jobs and without access to bubblewrap. It may have been a bit too forward thinking. | 21:02 |
SpamapS | I truly also didn't realize how dead ubuntu backports was. | 21:02 |
SpamapS | They just sit now. | 21:03 |
jeblair | SpamapS: your hypothetical user sounds pretty close to mordred's | 21:03 |
mordred | yah - but they are hypothetical and there is a ppa - I could be argued either way here :) | 21:04 |
jeblair | my personal preference would be for it to be easy for zuul to be secure by default. we've put a lot of thought into that. so i would love it if we could decide that bwrap is simple enough to just require it in all cases. failing that, i guess my question is: would that user prefer secrets not hit disk? or are they okay with them hitting disk? or would they prefer not to use secrets at all? (this is a new third option) | 21:06 |
jeblair | i'm happy to defer the question of whether we keep nullwrap for a while if we choose "okay with hitting disk" or "okay with no secrets at all". "implement tmpfs support in nullwrap" is enough work that i think we need to justify commitment. :) | 21:08 |
mordred | jeblair: I agree. I'm personally happy to ditch it and say that bubblewrap is easy enough | 21:09 |
pabelanger | so far bubblewrap has been working well | 21:10 |
pabelanger | I don't mind if we default to that and remove nullwrap | 21:11 |
SpamapS | jeblair: we could just make nullwrap explode on use of secrets for now. | 21:13 |
SpamapS | jeblair: that would be one way to find the users. | 21:13 |
jeblair | SpamapS: ok i'll do that | 21:14 |
SpamapS | raise NotImplementedError('If you want secrets with this, come talk to us.') | 21:15 |
SpamapS | we desperately want to understand you :) | 21:15 |
mordred | raise WhatsWrongWithYou("srrsly") | 21:16 |
SpamapS | raise CantEven() | 21:17 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: DNM Reparent unittests on base-test job https://review.openstack.org/495424 | 21:24 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP Add wrapper driver execution context https://review.openstack.org/495439 | 21:42 |
jeblair | mordred, SpamapS: ^ that's not done yet and will fail tests, but i wanted to get the API change up for early feedback. i realized that was a problem as i started thinking about this. | 21:42 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul feature/zuulv3: Document execution_wrapper setting. https://review.openstack.org/495440 | 21:44 |
SpamapS | jeblair: looking. ^^ is my stab at documenting the situation. | 21:44 |
SpamapS | we could also add a "may be removed in the future." | 21:45 |
SpamapS | As that may spark readers who want to use it to ask us not to remove it in the future. ;) | 21:45 |
jeblair | SpamapS: lgtm, and yeah that may be a good idea. | 21:45 |
mordred | jeblair: that approach seems good to me | 21:45 |
SpamapS | oh that's interesting. I hadn't even seen the setMountsMap method | 21:47 |
jeblair | and i'll build on that to add the tmpfs secrets | 21:47 |
SpamapS | or I purged it from memory | 21:47 |
SpamapS | I wonder if we should mention the minimum version of bubblewrap we require | 21:48 |
SpamapS | because I think they only recently added --die-with-parent | 21:48 |
jeblair | SpamapS: yeah, it arrived first to allow custom setting of site-wide ro/rw binds (which was fine as a driver method). then very recently i abused it to only mount the current playbook, without realizing the ramifications of that. | 21:48 |
SpamapS | jeblair: ahhhhh | 21:48 |
SpamapS | yeah this looks good, and I like the use of the term execution context, clarifies what that object is nicely | 21:49 |
jeblair | cool, i'll finish polishing it up | 21:50 |
mordred | SpamapS: if you have a sec, feel like nudging https://review.openstack.org/#/c/494343/ across the line? | 21:57 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add wrapper driver execution context https://review.openstack.org/495439 | 21:59 |
SpamapS | mordred: reading | 22:01 |
SpamapS | mordred: the description of the case was a little mind bending.. and I realized the word 'secret' is really unpleasant to say over and over in your head for some reason... but, +A ;) | 22:05 |
mordred | SpamapS: \o/ | 22:07 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Allow requesting secrets by a different name https://review.openstack.org/494343 | 22:13 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul feature/zuulv3: Document execution_wrapper setting. https://review.openstack.org/495440 | 22:13 |
SpamapS | Added a note about removing it and noted nullwrap with a footnote where appropriate. | 22:13 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 22:22 |
mordred | jeblair: when executor is set in "keep" mode, we still don't save the secrets file do we? | 22:23 |
jeblair | mordred: currently yes, after the change i'm writing, no | 22:34 |
*** maxamillion has quit IRC | 22:39 | |
mordred | well - that's ok - I've figured out this particular issue | 22:40 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Write secrets to tmpfs https://review.openstack.org/495449 | 22:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:02 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:08 |
jeblair | mordred: in 494291, why 'artifact_fileserver_group' instead of 'fileserver_group'? I ask because the add-fileserver role doesn't otherwise use the word artifact, so it feels like it's introducing something new. | 23:15 |
jeblair | (and i guess, similarly the default of 'zuul_artifacts_fileserver' instead of 'zuul_fileserver') | 23:15 |
mordred | jeblair: gah. I missed it | 23:18 |
mordred | I was trying to get those artifact references gone | 23:18 |
jeblair | yay i helped! | 23:18 |
mordred | oh - actually - I was going to just remove that variable - on second thought it doesn't really help | 23:20 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:21 |
mordred | jeblair: ^^ how about that? | 23:22 |
mordred | also - fingers crossed - rechecking the base-test job again | 23:22 |
mordred | GAH :( | 23:25 |
jeblair | mordred: what's the publish role look like when you use it? ie, how to specify the host? | 23:25 |
mordred | jeblair: you do that in the hosts: line of the playbook | 23:25 |
mordred | jeblair: so very similar to this pattern: http://git.openstack.org/cgit/openstack-infra/project-config/tree/playbooks/base-test/post-logs.yaml | 23:26 |
mordred | jeblair: except replace "upload-logs" with "publish-openstack-artifacts" | 23:26 |
jeblair | mordred: gotcha, so just reach into the secret for the fqdn | 23:27 |
jeblair | (lgtm) | 23:27 |
mordred | woot | 23:27 |
pabelanger | working? | 23:27 |
mordred | well - still not working | 23:27 |
pabelanger | soon(tm) | 23:28 |
pabelanger | mordred: Oh, I see the issue. hostvars[groups['zuul_logserver']] no longer exists | 23:31 |
mordred | ah | 23:31 |
mordred | pabelanger: that should just be hostvars[site_logs.fqdn] at this point yeah? | 23:32 |
pabelanger | check syntax, but ya | 23:32 |
pabelanger | ya, that seems right | 23:32 |
mordred | cool. I'll push that up | 23:32 |
*** maxamillion has joined #zuul | 23:33 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add wrapper driver execution context https://review.openstack.org/495439 | 23:36 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Write secrets to tmpfs https://review.openstack.org/495449 | 23:36 |
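
The outcome agreed at 21:13-21:15 (have nullwrap refuse secrets rather than write them to disk) can be sketched roughly as below. This is only an illustration of the idea, not the code in the changes linked above: the class name `NullwrapSketch`, the `start` method, and its `secrets` parameter are all hypothetical and are not taken from Zuul's wrapper driver API. Only the error message comes from the discussion.

```python
class NullwrapSketch:
    """Hypothetical stand-in for a wrapper that provides no isolation."""

    def start(self, command, secrets=None):
        # Without bubblewrap there is no private tmpfs to hold secrets,
        # so refuse outright instead of silently writing them to disk.
        if secrets:
            raise NotImplementedError(
                'If you want secrets with this, come talk to us.')
        # No wrapping is applied: the command is returned unchanged.
        return list(command)
```

Under these assumptions, a job without secrets gets its command back unmodified, while any job that passes secrets fails loudly, which is the "one way to find the users" effect mentioned in the log.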