jeblair | * (not a recommended production configuration) ;) | 00:00 |
---|---|---|
jlk | lol yeah | 00:00 |
jlk | oh I'm getting alarm clock on feature/zuulv3 Interesting. I wasn't before | 00:11 |
*** jamielennox is now known as jamielennox|away | 04:18 | |
tobiash | jeblair: thanks for approving the semaphore patch :) | 05:36 |
tobiash | jeblair: what should be the process for the v2 version of that patch? | 05:37 |
tobiash | jeblair: shall I abandon this now or leave it open for others until the release of v3 and abandon this then? | 05:37 |
tobiash | SpamapS: I've left a comment on https://review.openstack.org/#/c/450704/ | 06:45 |
tobiash | SpamapS: I still think it's cleaner to use unicode but I'm ok with abandoning the patch if wanted | 06:45 |
*** bhavik1 has joined #zuul | 09:09 | |
*** yolanda has quit IRC | 09:19 | |
*** yolanda has joined #zuul | 09:19 | |
*** bhavik1 has quit IRC | 09:29 | |
*** openstackgerrit has quit IRC | 11:18 | |
*** hashar has joined #zuul | 11:35 | |
*** tobiash has quit IRC | 13:12 | |
*** openstackgerrit has joined #zuul | 13:15 | |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Show message if node hold not found https://review.openstack.org/453971 | 13:15 |
eggshell | o/ | 14:15 |
jlk | So I let a bisect run last night, 916acb0a64c3afdde1a098cf0232b829ec68376b was the last hash that I could run tests locally. | 14:27 |
jlk | 91132fbe9155b4482e267b9d3ba703ea62c6eeba is where things started to fail. | 14:27 |
mordred | jlk: if you do a git revert of 14ab6ca01a8827918cb50dcb90a40c293786ea01 (the commit that commit merged) - does the tree work for you again? | 14:30 |
jeblair | jlk: since that's just enabling a single test, the cause may be testtools reshuffling test order in a way which caused more contention. | 14:31 |
mordred | yah- that's why I'm kind of curious if a revert on top of things fixes things again | 14:31 |
mordred | because wow that commit certainly doesn't seem to be the sort of thing that would do anything substantial like break everything | 14:32 |
jeblair | mordred: i was thinking -- there's one other aspect of the config/project repo difference that is sort of hidden by changing the names to 'trusted/untrusted projects' -- config repos are allowed to configure more things (like pipelines), and are branchless (we only read from the master branch). | 14:37 |
jeblair | maybe that's two things. :) | 14:38 |
*** hashar has quit IRC | 14:40 | |
mordred | jeblair: hrm. I agree with you- both that it hides things, and that it hides two things | 14:45 |
Shrews | jeblair: i think the first thing is sort of inherent in the "trusted" thing. the branch thing is not obvious though | 14:46 |
jeblair | Shrews: yeah, the branch thing is more subtle | 14:46 |
Shrews | it's actually a bit counterintuitive that trusted projects are constrained to master, but untrusted aren't | 14:49 |
Shrews | (from a non-zuul-expert POV) | 14:49 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (1/2) https://review.openstack.org/453362 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Replace config/project repos with trusted/untrusted projects https://review.openstack.org/453347 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add hostname to TriggerEvent https://review.openstack.org/452348 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (2/2) https://review.openstack.org/453821 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fully qualify project configuration names https://review.openstack.org/451970 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add source to project and remove unused tenant attrs https://review.openstack.org/451969 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Pass source to project instantiations https://review.openstack.org/451596 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add a project index to Tenant https://review.openstack.org/451597 | 14:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove unused Tenant.getRepo method https://review.openstack.org/451929 | 14:52 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Use new tenant project index for config references https://review.openstack.org/451928 | 14:52 |
mordred | jeblair: maybe we should go back to config repos and project repos and just rename the openstack-infra/project-config repo so that we personally don't go crazy :) | 14:52 |
jeblair | Shrews: yeah, i think fundamentally 'trusted project' says "this is a project of higher status" when it's the other way around -- it's a project that principally serves zuul. | 14:53 |
jeblair | mordred: yeah, maybe... though i do still want to remove the word 'repo' from the tenant config file... | 14:54 |
jeblair | config-project / untrusted-project? config-project / project? | 14:55 |
Shrews | zuul-project / project ? | 14:56 |
Shrews | blue-pill / red-pill ? :) | 14:57 |
jeblair | fred / ginger ? | 14:57 |
mordred | jeblair: naming sucks | 14:58 |
jeblair | ya | 14:58 |
jlk | jeblair: mordred: yeah sorry I had to step away for some domestic duties. I plan to try a variety of things with the tree today to figure out what's going on with tests. | 15:27 |
jlk | this could be a false report from bisect, so more investigation is needed. | 15:29 |
pabelanger | Shrews: so, found an issue with nodepool-launcher, if we remove a provider from configuration, with images still uploaded to cloud, we have no way to remove them from zookeeper | 15:32 |
pabelanger | Shrews: you can see this on nodepool.o.o today, with the tripleo-test-cloud-rh2 provider | 15:32 |
Shrews | pabelanger: out to lunch, but pretty sure jeblair put code in to handle that | 15:33 |
pabelanger | k | 15:33 |
Shrews | iirc. Could be misremembering | 15:34 |
Shrews | Something something obsoleteProvider something | 15:35 |
jeblair | pabelanger: https://docs.openstack.org/infra/nodepool/operation.html#removing-a-provider | 15:35 |
jeblair | there's the wind-down procedure for a provider | 15:35 |
pabelanger | https://review.openstack.org/#/c/451115/1/nodepool/nodepool.yaml | 15:37 |
pabelanger | max-servers was 0 | 15:37 |
pabelanger | that likely explains it | 15:37 |
pabelanger | okay, I'll clean this up | 15:37 |
jeblair | pabelanger: max servers 0 should be fine too as long as all the images are removed. | 15:38 |
pabelanger | jeblair: images: [] in provider section enough for that? | 15:38 |
jeblair | pabelanger: yep. | 15:38 |
pabelanger | k | 15:39 |
jeblair | pabelanger: i think you can even omit the key entirely | 15:39 |
jeblair | mordred, Shrews: i'm leaning toward config-project / untrusted-project. | 15:40 |
mordred | jeblair: thesaurus.com lists "thing" as a synonym for project ... so we could do config-thing and untrusted-thing | 15:42 |
jeblair | (i feel like 'project' needs a modifier for each one) | 15:42 |
jeblair | bwahaha | 15:42 |
mordred | jeblair: also, in the bwahaha section, "baby" shows up as an informal synonym - I think "config-baby" vs "untrusted-thing" would be great | 15:43 |
mordred | or maybe just "baby" and "thing" | 15:43 |
jlk | resource, archive, codex, .... suppository (as opposed to repository) | 15:43 |
jeblair | mordred: is thesaurus.com just a random word generator? | 15:44 |
mordred | yah | 15:44 |
mordred | """ project c.1400, "a plan, draft, scheme," from L. projectum "something thrown forth," """ - in case anyone was curious | 15:45 |
jlk | I have no joke, I just like saying "projectum" | 15:45 |
*** tobiash has joined #zuul | 15:45 | |
jeblair | i think that fairly well describes what we're trying to accomplish here. :) | 15:48 |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Remove ubuntu-precise from dsvm-nodepool jobs https://review.openstack.org/454794 | 16:36 |
jlk | maybe I just need to reboot the VM between each test run :/ | 16:36 |
*** tobiash has quit IRC | 17:14 | |
*** tobiash has joined #zuul | 17:15 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (1/2) https://review.openstack.org/453362 | 17:16 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Replace config/project repos with trusted/untrusted projects https://review.openstack.org/453347 | 17:16 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (2/2) https://review.openstack.org/453821 | 17:16 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fully qualify project configuration names https://review.openstack.org/451970 | 17:16 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Extend test timeout to 120s https://review.openstack.org/454806 | 17:16 |
jeblair | jlk: i'm bumping the test timeout in my stack ^. i'd like to pare it back later, but at the moment, the tests are doing a lot of extra work which will take some effort to clean up. | 17:18 |
jeblair | jlk: that may help | 17:19 |
jlk | I was using a rather long timeout. I think there's just something about repeated runs in the same environment that are making things fail. I rebooted and a commit that had previously failed started working. | 17:20 |
jeblair | jlk: hrm. you might also try removing the .testrepository directory between runs. that should cause testr to avoid "optimizing" the ordering of the tests. | 17:22 |
jlk | oh good call. | 17:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix dynamic reconfiguration https://review.openstack.org/454395 | 17:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP test lookups https://review.openstack.org/454396 | 17:37 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Set jobdir to 0755 before we delete it https://review.openstack.org/454819 | 17:38 |
pabelanger | jeblair: clarkb: ^ os.walk change | 17:44 |
clarkb | was just looking at that, I think this is simpler than the error handler method | 17:46 |
clarkb | wondering if we shouldn't try to unlink directly rather than shutil.rmtree | 17:47 |
clarkb | but that also seems like less simple | 17:48 |
jeblair | i think os.walk is going to be top-down? presumably rmtree is bottom-up? so this is probably the most reliable thing? | 17:49 |
pabelanger | think so | 17:50 |
clarkb | oh good point | 17:50 |
jeblair | (ie, if we unlink on the way down, we may still encounter errors. so we need to chmod all the way down before we can start unlinking on the way up) | 17:50 |
mordred | jeblair: I think you can recurse with os.walk and then do the rms on the way back up the stack? | 17:50 |
clarkb | you can make os.walk do it the other way but ya | 17:50 |
mordred | jeblair: you said words better than I did | 17:50 |
clarkb | I left a comment and looks like pep8 is unhappy | 17:50 |
jeblair | oh, yeah, we run pep8 under py3 now so that needs to be 0o755 now | 17:51 |
clarkb | aha | 17:52 |
pabelanger | great | 17:52 |
pabelanger | let me fix | 17:52 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Set jobdir to 0755 before we delete it https://review.openstack.org/454819 | 17:53 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (1/2) https://review.openstack.org/453362 | 17:54 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Replace config/project repos with trusted/untrusted projects https://review.openstack.org/453347 | 17:54 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (2/2) https://review.openstack.org/453821 | 17:54 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Extend test timeout to 120s https://review.openstack.org/454806 | 17:54 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fully qualify project configuration names https://review.openstack.org/451970 | 17:54 |
pabelanger | tmpreaper, that was the app I was thinking of | 17:55 |
pabelanger | I'll see about adding puppet-tmpreaper to zl.o.o servers | 17:55 |
jeblair | handy but we can't rely on it. an executor can fill up / very quickly. :) | 17:55 |
pabelanger | true | 17:56 |
openstackgerrit | Cullen Taylor proposed openstack-infra/zuul feature/zuulv3: [WIP] Integration: Get static nodes from nodepool https://review.openstack.org/454826 | 17:56 |
mordred | 138709 | 18:01 |
jeblair | that's a very old change number. yes that's what that is. | 18:03 |
openstackgerrit | Merged openstack-infra/zuul master: Set jobdir to 0755 before we delete it https://review.openstack.org/454819 | 18:04 |
*** rcarrill1 has joined #zuul | 18:05 | |
pabelanger | Also, when was the last time we restarted v25 launchers for zuul? | 18:05 |
pabelanger | pretty nice, not having to do that weekly :) | 18:05 |
*** rcarrillocruz has quit IRC | 18:06 | |
SpamapS | jlk: if removing .testrepository works, then you can likely find the misbehaving test(s) with analyze-isolation | 18:52 |
jlk | it doesn't work in isolation. A full reboot seems to have some help? | 18:53 |
SpamapS | analyze-isolation tries to find tests that interact with the others | 18:53 |
SpamapS | but a full reboot suggests .testrepository isn't the problem really.. maybe it's the zookeeper | 18:54 |
jlk | Slowly climbing up the commit tree. I tried skipping a few and hit a failure. I'm kind of doing my own manual bisect. | 18:59 |
clarkb | also if tests don't work when isolated then likely not an inter test interaction | 19:00 |
pabelanger | https://review.openstack.org/#/c/454794/ removes ubuntu-precise from our dsvm job, since it is EOL | 19:02 |
SpamapS | the tests DO work when isolated | 19:07 |
SpamapS | only running the full suite produces the weird timeouts | 19:07 |
clarkb | ah | 19:07 |
SpamapS | which suggests there's an ordering issue | 19:07 |
clarkb | jlk said they don't work in isolation above | 19:07 |
SpamapS | which is exactly what testr can fix | 19:07 |
SpamapS | clarkb: "it" being the full suite | 19:07 |
SpamapS | I took it as anyway. | 19:07 |
SpamapS | Dunno, but I'm having similar issues to jlk | 19:08 |
SpamapS | to the point where I don't even bother trying to run it locally anymore | 19:08 |
jlk | I'd go that route, but my change is failing upstream and I would like to figure out why :/ | 19:10 |
clarkb | just thinking off the top of my head the .testrepository and zk server aren't reused in the gate | 19:10 |
clarkb | the .testrepository not being used means the test order in the gate is naive, but will be "smart" locally if you have at least one test run recorded | 19:11 |
SpamapS | jlk: oh you're getting a timeout fail in the gate too? but a different one? | 19:15 |
SpamapS | that I missed | 19:15 |
openstackgerrit | Merged openstack-infra/nodepool master: Remove ubuntu-precise from dsvm-nodepool jobs https://review.openstack.org/454794 | 19:19 |
*** openstackgerrit has quit IRC | 19:19 | |
jlk | SpamapS: I'm not sure what's happening in the check queue, it's 80~ megs of logs :) I'll wget it at some point. | 19:22 |
clarkb | apparently we need a mysql now? | 19:25 |
* clarkb will get one | 19:25 | |
clarkb | but first lunch | 19:25 |
mordred | clarkb: ya - for tests of the mysql reporter | 19:26 |
jeblair | mordred, pabelanger: i'm trying to put things in order before i leave next week -- | 20:06 |
jeblair | i'm not going to finish https://review.openstack.org/454396 today, so perhaps you can continue working on it next week. i've established a pattern there, so it should be fairly easy to go through the rest of the lookups | 20:07 |
jeblair | in fact, you could just land that patch and then do followups | 20:07 |
jeblair | its parent is already ready to land. | 20:07 |
jeblair | pabelanger: i think we're okay to re-start zuulv3-dev now, and proceed with more jobs when you're ready | 20:08 |
pabelanger | great. Yes, I'll help work on the lookup tests for sure | 20:09 |
jeblair | pabelanger: feel free to just take over that change if you want. you can add new tests directly to it, or drop the WIP, land it, and do followups. | 20:10 |
pabelanger | k | 20:10 |
clarkb | jlk: is test_playbook one that fails for you when run in the full suite? | 20:10 |
jeblair | clarkb, fungi, pabelanger, SpamapS: it looks like the zuulv3 executor security spec is ready, so i've added it to next week's infra meeting agenda for approval even though SpamapS won't be there. i suspect it's been gone over enough to do this async, but of course, if there are objections in the meeting, the chair can just postpone it until next week (that's why we put these on the meeting agenda after all). | 20:13 |
pabelanger | jeblair: ack | 20:22 |
*** rcarrill1 is now known as rcarrillocruz | 20:23 | |
clarkb | it's on my list of changes to review | 20:26 |
clarkb | along with the translation site spec ... | 20:26 |
*** rcarrillocruz has quit IRC | 20:31 | |
*** rcarrillocruz has joined #zuul | 20:32 | |
fungi | wfm | 20:34 |
fungi | i still intend to go through it this weekend as well | 20:34 |
*** openstackgerrit has joined #zuul | 20:37 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (1/2) https://review.openstack.org/453362 | 20:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Replace config/project repos with config/untrusted projects https://review.openstack.org/453347 | 20:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove source from pipelines (2/2) https://review.openstack.org/453821 | 20:37 |
jeblair | mordred, Shrews: ^ that's now 'config-projects' and 'untrusted-projects'. | 20:37 |
jlk | clarkb: that's one I see often, yes. | 21:07 |
jlk | Oooh, I think I have a repeatable failure. | 22:01 |
jlk | 87834afb2a8b70113a91ebe094031d725e2385d9 seems to consistently fail. | 22:01 |
jlk | before it passes. | 22:01 |
jlk | ( 1ba9651166e7d8b6d720a97ae22b0abb08ede594 ) | 22:01 |
clarkb | jlk: SpamapS going off of spamaps idea I seem to only have the sql tests (because no mysql currently) fail if I delete my zk datadir before running tests | 22:13 |
clarkb | so I'm guessing that something is polluting the zk db | 22:13 |
clarkb | it also looks like I leaked a nodepool test chroot | 22:17 |
clarkb | (not sure if related at all yet) | 22:17 |
clarkb | oh that was concurrency=1 too | 22:19 |
clarkb | so now sorting out if concurrency=1 is sufficient | 22:29 |
jlk | I'm reverting the commit that seems to break me to see if things start working. | 22:29 |
clarkb | jlk: it looks like doing tox -epy27 -- --concurrency=1 is sufficient to make things reliable | 22:45 |
clarkb | jlk: so looking like intertest conflict | 22:45 |
clarkb | (also we reliably leak our last used zk chroot. So I'm gonna poke at that) | 22:45 |
clarkb | doesn't look like every test leaks so now to find which one(s) do | 22:52 |
jlk | Hey look at that. | 23:20 |
jlk | I reverted ae04e4ce8fca33872f3677838d6a813d2c378e79 and things keep working. | 23:20 |
jlk | fd0354a094116fab6496e025f7a23f888322058c is the last commit before my revert. | 23:21 |
jesusaur | jlk: womp womp :( | 23:27 |
jesusaur | is it consistent which tests break? | 23:27 |
jlk | jesusaur: heh, yeah, sorry about that.... | 23:27 |
jlk | it's test_playbook for me | 23:27 |
clarkb | jesusaur: it seems like it's intertest conflict because --concurrency=1 works | 23:28 |
clarkb | maybe git repo state is being shared somewhere it shouldn't in the test suite? | 23:28 |
jesusaur | possibly | 23:29 |
jesusaur | several of the test_v3.py tests were giving me trouble | 23:29 |
jesusaur | I think test_dynamic_config is the one that tripped me up the most, then it would pass when re-running tox with --failing | 23:30 |
clarkb | separately I think I have a couple improvements to the kazoo fixture to make debugging things easier (and might have found the reason we leak sometimes) | 23:30 |
clarkb | just waiting for tests to finish so I can push things | 23:30 |
mordred | clarkb: woot | 23:33 |
clarkb | nope that didn't fix the leak but I know which test does it now | 23:43 |
clarkb | heh and it doesn't leak when run on its own | 23:45 |
clarkb | so confused | 23:45 |
clarkb | this is interesting. Subunit file says the leaked test failed, but stdout from testr doesn't say it failed. Though the sql tests did fail | 23:49 |
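
Editor's note on the 17:44 to 17:51 jobdir discussion: the idea of chmod-ing on the way down before unlinking on the way up can be sketched roughly as below. This is only an illustration, not the code from review 454819; the helper name `force_rmtree` is hypothetical. It relies on `os.walk` being top-down by default and uses the `0o755` octal spelling that works under py3.

```python
import os
import shutil


def force_rmtree(root):
    """Hypothetical helper: open up permissions top-down, then delete.

    Assumption: the tree may contain directories whose mode prevents
    listing or unlinking their contents, so a plain rmtree could fail.
    """
    # The root must be listable before os.walk can read it.
    os.chmod(root, 0o755)
    # os.walk is top-down by default: it yields a directory, and only
    # descends into its subdirectories after we resume the generator,
    # so each subdirectory is chmod-ed before it is listed.
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so we never chmod something outside the tree.
            if not os.path.islink(path):
                os.chmod(path, 0o755)
    # Every directory is now writable, so the removal can proceed.
    shutil.rmtree(root)
```

Doing the chmod pass first and the removal second matches the point made in the log: unlinking during the downward pass could still hit permission errors in directories not yet visited.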