*** sgw has quit IRC | 00:01 | |
*** cloudnull has quit IRC | 01:04 | |
*** wxy-xiyuan has joined #zuul | 01:05 | |
*** cloudnull has joined #zuul | 01:05 | |
*** openstackstatus has joined #zuul | 01:17 | |
*** ChanServ sets mode: +v openstackstatus | 01:17 | |
*** jamesmcarthur has joined #zuul | 03:10 | |
*** bhavikdbavishi has joined #zuul | 03:25 | |
*** bhavikdbavishi1 has joined #zuul | 03:28 | |
*** bhavikdbavishi has quit IRC | 03:29 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:29 | |
*** jamesmcarthur has quit IRC | 03:40 | |
*** jamesmcarthur has joined #zuul | 03:44 | |
*** zxiiro has quit IRC | 03:45 | |
*** jamesmcarthur has quit IRC | 03:49 | |
*** sgw has joined #zuul | 03:53 | |
*** sgw has quit IRC | 04:08 | |
*** sgw has joined #zuul | 04:26 | |
*** evrardjp has quit IRC | 05:34 | |
*** evrardjp has joined #zuul | 05:34 | |
*** raukadah is now known as chkumar|rover | 05:44 | |
*** bhavikdbavishi has quit IRC | 06:53 | |
*** bhavikdbavishi has joined #zuul | 06:53 | |
*** dpawlik has joined #zuul | 06:55 | |
*** dpawlik has quit IRC | 07:16 | |
*** dpawlik has joined #zuul | 07:19 | |
*** tosky has joined #zuul | 07:35 | |
*** saneax has joined #zuul | 07:37 | |
reiterative | masterpe I have installed Zuul on bionic, so feel free to ping me if you hit a problem. I also came close to persuading the WIP Gitlab driver to work, but still have some problems to resolve. | 08:17 |
*** carli has joined #zuul | 08:31 | |
*** jpena|off is now known as jpena | 08:54 | |
*** hashar has joined #zuul | 09:19 | |
*** bhavikdbavishi has quit IRC | 09:32 | |
*** sshnaidm has quit IRC | 09:41 | |
*** sshnaidm has joined #zuul | 09:42 | |
zbr | a review on https://review.opendev.org/#/c/705049/ would be really appreciated.. | 09:51 |
zbr | it's very easy to test via the dashboard, low risk, and improves the ability to compare the current run with previous ones. | 09:52 |
*** jpena is now known as jpena|brb | 09:58 | |
*** hashar has quit IRC | 10:20 | |
carli | hello, I'm looking to find some stats about the number of deployments of openstack (if it's possible to see it for specific projects like for tripleo, devstack, kolla-ansible that would be even better) over a long period of time, and I've been looking at the logstash on logstash.openstack.org and checking the information on the zuul pages, does anyone have an idea how I can get that information? | 10:25 |
mnaser | carli: i think what you're looking for is the openstack user survey | 10:29 |
carli | the user survey talks about what various actors use everywhere, it does not display the number of times openstack(in its various iterations) is deployed for CI, or at least I haven't really seen the information (you're talking about https://www.openstack.org/analytics, right?) | 10:31 |
carli | I've seen slides by Monty Taylor where there's some information (on this http://inaugust.com/talks/zuul.html#/openstack-scale), but I'm looking to have a number of deployments done (for integration testing) to use as a statistic (I'm working on a paper with the use case of deploying openstack (kolla-ansible) faster than without our tool and we have the deployment times (as we have both deployed it with | 10:37 |
carli | our tool and without), but to give it some perspective we would have liked to know how often openstack gets deployed over a certain period (for example a year, a month) to be able to give an estimation of a time gain | 10:37 |
*** Defolos has joined #zuul | 10:39 | |
*** jpena|brb is now known as jpena | 10:43 | |
*** hashar has joined #zuul | 10:44 | |
mnaser | carli: oh i see, then you can probably rely on the zuul ci build logs and going from there | 10:48 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Authorization rules: add templating https://review.opendev.org/705193 | 11:14 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: WIP: Store unparsed branch config in Zookeeper https://review.opendev.org/705716 | 11:21 |
*** bhavikdbavishi has joined #zuul | 11:41 | |
*** bhavikdbavishi has quit IRC | 11:46 | |
*** hashar has quit IRC | 11:56 | |
carli | mnaser: i might not understand this correctly, but the zuul ci build logs only indicate current jobs, i don't really see how I can get information over time. I've looked at the Zuul API to see if there was a way to send a request for specifics, but the API only indicates currently available jobs | 12:02 |
*** wxy-xiyuan has quit IRC | 12:05 | |
mnaser | carli: have a look at http://zuul.openstack.org/builds | 12:33 |
mnaser | carli: more specifically http://zuul.openstack.org/openapi for /api/builds | 12:33 |
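For anyone looking for historical numbers the way carli is, the builds endpoint mnaser links above can be queried directly. A minimal sketch, assuming the query parameters shown on the openapi page (job_name, limit) and the usual build record fields; the job name used here is only a placeholder:

```python
# Query Zuul's builds API for recent results of a particular job.
import requests

resp = requests.get(
    "https://zuul.openstack.org/api/builds",
    params={"job_name": "kolla-ansible-deploy-job", "limit": 50},  # placeholder job name
)
resp.raise_for_status()
for build in resp.json():
    print(build["uuid"], build["job_name"], build["result"], build["start_time"])
```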
*** jpena is now known as jpena|lunch | 12:35 | |
*** electrofelix has joined #zuul | 12:44 | |
*** vivobg has joined #zuul | 12:45 | |
*** rlandy has joined #zuul | 12:50 | |
vivobg | Hi, all. Is there a way to flush the current, in-progress job queue in Zuul? | 12:53 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Authorization rules: add templating https://review.opendev.org/705193 | 13:00 |
fungi | vivobg: the easiest way to flush everything is to restart the scheduler | 13:20 |
fungi | vivobg: alternatively, you can take a queue dump and then transform that into a set of corresponding `zuul dequeue ...` rpc client commands | 13:21 |
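A rough sketch of the second option fungi describes: dump the queues from the status API and turn each live change into a `zuul dequeue` command. The tenant name is a placeholder, and the status JSON layout (pipelines → change_queues → heads) should be verified against your own deployment before relying on this:

```python
# Emit one `zuul dequeue ...` command per queued change item.
import requests

tenant = "example-tenant"  # placeholder
status = requests.get(
    f"https://zuul.example.com/api/tenant/{tenant}/status").json()

for pipeline in status["pipelines"]:
    for queue in pipeline["change_queues"]:
        for head in queue["heads"]:
            for item in head:
                if item.get("id"):  # change items look like "12345,6"
                    print(f"zuul dequeue --tenant {tenant}"
                          f" --pipeline {pipeline['name']}"
                          f" --project {item['project']}"
                          f" --change {item['id']}")
```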
*** bhavikdbavishi has joined #zuul | 13:22 | |
*** jpena|lunch is now known as jpena | 13:30 | |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Offload repo reset to processes https://review.opendev.org/706827 | 13:34 |
*** rfolco has joined #zuul | 13:42 | |
*** avass has joined #zuul | 13:44 | |
*** sgw has quit IRC | 13:52 | |
*** webknjaz has joined #zuul | 13:59 | |
*** zxiiro has joined #zuul | 14:08 | |
Shrews | fungi: wonderful article | 14:13 |
fungi | hope i didn't get anything major wrong in it | 14:15 |
fungi | and thanks! | 14:15 |
Shrews | the eavesdrop link was like a flashback! | 14:17 |
Shrews | nice touch | 14:17 |
Shrews | but it did bring up bad java-based memories for me when i looked through it :( | 14:17 |
fungi | heh, indeed | 14:17 |
fungi | it was all a work in progress ;) | 14:18 |
* fungi couldn't help himself, sorry! | 14:18 | |
Shrews | i'm still waiting (apparently) for that zuul option from corvus to randomly drop log entries | 14:18 |
*** Goneri has joined #zuul | 14:35 | |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement basic github checks API workflow https://review.opendev.org/705168 | 14:39 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement basic github checks API workflow https://review.opendev.org/705168 | 14:44 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement basic github checks API workflow https://review.opendev.org/705168 | 14:47 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Cap virtualenv to <20.0.0 https://review.opendev.org/706860 | 14:47 |
tobiash | zuul-maint: virtualenv just broke tests so cap it for now ^ | 14:48 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement basic github checks API workflow https://review.opendev.org/705168 | 14:52 |
vivobg | @fungi Thanks. We tried a scheduler service restart, but that didn't clear the queue. We had a queued job, removed the project from the tenant config, restarted the scheduler and that caused all jobs to get a null reference and zuul was in a bad state. We ended up doing a complete redeploy, with fresh instances to restore service. | 14:56 |
vivobg | we are on 3.10.2 | 14:57 |
fungi | that's... odd. zuul keeps all queued builds in the scheduler's memory, so if the scheduler restarts it should lose track of them entirely | 14:57 |
fungi | unless whatever you're using to restart the scheduler is also re-enqueuing a saved dump of the previously-queued builds | 14:58 |
Shrews | tobiash: oh fun | 14:59 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Uncap virtualenv https://review.opendev.org/706871 | 15:02 |
tobiash | curious if this does the trick ^ | 15:03 |
fungi | tobiash: locally when i create an environment with virtualenv 20 it has setuptools included and i can import pkg_resources just fine: http://paste.openstack.org/show/789376/ | 15:09 |
fungi | i wonder what's different about how we're creating the ansible virtualenvs | 15:12 |
mordred | fungi: we're calling python -m virtualenv ... is that maybe different? | 15:13 |
tobiash | fungi: no idea, it's just a shot into the dark in a first try to avoid a deeper investigation ;) | 15:13 |
fungi | oh! i bet this happens only with distro-packaged python which splits pkg_resources out to a separate distro package | 15:13 |
* fungi tries something | 15:13 | |
mordred | fungi: ugh. distro-packaged-python again | 15:13 |
fungi | this is going to be harder to test since i already have python3-pkg-resources installed on most of my systems | 15:15 |
fungi | and uninstalling it wants to remove lots of other packages which depend on it | 15:15 |
fungi | do we have an example failure? | 15:15 |
* fungi goes hunting | 15:15 | |
tobiash | fungi: e.g. https://review.opendev.org/706827 (all test failures) | 15:21 |
tobiash | looks like in the ansible 2.6 venv even the setup module is broken with virtualenv 20.0.0 | 15:22 |
tobiash | https://95195c19b679153113ec-4f1ab06e880eb7499aa5249ca5a98f04.ssl.cf1.rackcdn.com/706827/1/check/tox-py35/eac5637/testr_results.html | 15:22 |
tobiash | seems like the ara callback needs it | 15:23 |
*** jamesmcarthur has joined #zuul | 15:50 | |
*** carli has quit IRC | 15:56 | |
openstackgerrit | Merged zuul/zuul master: Cap virtualenv to <20.0.0 https://review.opendev.org/706860 | 15:57 |
clarkb | My Project Gating with Zuul talk has been accepted at linux fest northwest | 16:15 |
clarkb | not sure if anyone else will be there (maybe jlk)? | 16:15 |
*** chkumar|rover is now known as raukadah | 16:15 | |
*** avass has quit IRC | 16:20 | |
corvus | clarkb: \o/ | 16:27 |
*** mattw4 has joined #zuul | 16:34 | |
*** jamesmcarthur has quit IRC | 16:40 | |
*** jamesmcarthur has joined #zuul | 16:41 | |
*** saneax has quit IRC | 16:44 | |
*** tosky has quit IRC | 16:48 | |
*** jamesmcarthur has quit IRC | 16:57 | |
*** jamesmcarthur has joined #zuul | 16:57 | |
*** Defolos has quit IRC | 17:22 | |
corvus | i tried to use the k8s module in an untrusted context and it said "Executing local code is prohibited" | 17:29 |
corvus | i don't see the k8s module in zuul/ansible. how did that hit that? | 17:29 |
corvus | oh, is that via the action module ? | 17:33 |
*** evrardjp has quit IRC | 17:34 | |
*** evrardjp has joined #zuul | 17:34 | |
corvus | so we would need to add a handle_k8s ? | 17:34 |
corvus | (the "normal" action module i should say) | 17:35 |
*** sshnaidm is now known as sshnaidm|afk | 17:35 | |
tobiash | corvus: yes | 17:39 |
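For readers following along: the error corvus hit comes from Zuul's restricted "normal" action plugin, which dispatches to per-module handle_* checks before letting a task touch the executor. The sketch below only illustrates that dispatch pattern; the actual class, the real handle_k8s added by the change proposed just below, and the specific arguments it guards live in zuul/ansible and differ from this placeholder:

```python
# Illustrative only: dispatch to a per-module check, refusing argument
# combinations that would execute or read local files on the executor.
from ansible.errors import AnsibleError
from ansible.plugins.action.normal import ActionModule as NormalAction


class ActionModule(NormalAction):
    def run(self, tmp=None, task_vars=None):
        handler = 'handle_%s' % self._task.action
        if hasattr(self, handler):
            getattr(self, handler)()
        return super().run(tmp=tmp, task_vars=task_vars)

    def handle_k8s(self):
        # placeholder check -- the real guard in zuul/ansible is different
        if self._task.args.get('src'):
            raise AnsibleError("Executing local code is prohibited")
```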
openstackgerrit | James E. Blair proposed zuul/zuul master: Allow more k8s actions in untrusted context https://review.opendev.org/706940 | 17:45 |
corvus | tobiash, mordred, tristanC: ^ can you take a look at that and tell me if that seems reasonable/secure? | 17:45 |
*** vivobg has quit IRC | 17:45 | |
mordred | corvus: I think it seems reasonable | 17:47 |
tristanC | corvus: commented | 17:47 |
tristanC | mordred: could you please check this operator specification update: https://review.opendev.org/706639 | 17:48 |
tobiash | corvus: commented | 17:49 |
*** electrofelix has quit IRC | 17:49 | |
tristanC | (i meant to use the spec content to start the zuul-operator documentation) | 17:49 |
corvus | tristanC: that's a good idea, though we didn't do that for the other modules. i don't know if we had a reason or not. maybe we should think about doing that on all of them? it might be easier to maintain? | 17:50 |
corvus | tobiash: good idea, i'll update | 17:50 |
tristanC | corvus: we could re-purpose the beginning of the add_host run function so it can be re-used? though that may be a lot of work to ensure the change is backward compatible... | 17:52 |
mordred | corvus: alternately to whitelisting - we've also discussed dropping the ansible-level exclusions and relying on bubblewrap - should we maybe write up a spec about that? | 17:53 |
tristanC | unless there is an easy way to get the protected modules attributes per ansible version? last time i checked it involved doc parsing... | 17:53 |
tristanC | mordred: it seems like we need to protect some secrets exposed in the bubblewrap, iirc the winrm key | 17:53 |
corvus | maybe we should start a spec just to track the problem, even if we think it's not ready | 17:54 |
tristanC | corvus: yes, good idea | 17:54 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Allow more k8s actions in untrusted context https://review.opendev.org/706940 | 17:55 |
mordred | corvus: ++ | 17:55 |
*** openstackstatus has quit IRC | 17:57 | |
*** openstack has joined #zuul | 18:01 | |
*** ChanServ sets mode: +o openstack | 18:01 | |
corvus | hrm, i think i see why you put job in there -- because it doesn't just bind the volume to the executor, it also adds it to the ro path... | 18:02 |
corvus | or rw path | 18:02 |
corvus | ok. the word "job" confused me slightly at first, but i see its value and i don't have a better suggestion. :) | 18:03 |
mordred | same | 18:04 |
*** jpena is now known as jpena|off | 18:09 | |
jlk | clarkb: I hadn't planned on going to LFNW. I'll be at Deconstruct conference in Seattle the Thursday / Friday before | 18:32 |
*** sgw has joined #zuul | 18:34 | |
*** bhavikdbavishi has quit IRC | 18:40 | |
clarkb | Shrews: what triggers the cleanup of failed image build zk records? | 19:07 |
clarkb | we've had a sad new cloud due to networking problems and that results in our zk having many failed records for image uploads | 19:07 |
openstackgerrit | Merged zuul/zuul master: Allow more k8s actions in untrusted context https://review.opendev.org/706940 | 19:10 |
Shrews | clarkb: i think it's periodic for some period of N | 19:17 |
clarkb | ok some of these records are more than a week old. Not sure how large of an N we use | 19:19 |
Shrews | clarkb: looks like every minute, so it's strange there would be lots of FAILED records | 19:19 |
clarkb | hrm I wonder if this is fallout from switching servers | 19:20 |
clarkb | maybe only the old server could have deleted them (though it should have deleted them by now as they are way more than a minute old) | 19:20 |
Shrews | clarkb: it *might* be possible that if there is a failed record, but no corresponding upload, that the ZK record remains | 19:20 |
Shrews | (just a guess) | 19:21 |
clarkb | there are 7844 such records | 19:21 |
Shrews | oh, there was a server switch? | 19:21 |
clarkb | Shrews: yes, but yesterday and these records are up to a week old | 19:21 |
clarkb | Shrews: the old server couldn't talk to the glance api of the new cloud region so we built a new server that could | 19:22 |
clarkb | all of the failed records (the 7844 of them) seem to have stuck around from the old server that was having api access trouble | 19:22 |
Shrews | clarkb: hrm, can't debug that code path in my head then. i'd have to dig | 19:23 |
Shrews | clarkb: nb03 ? | 19:23 |
clarkb | Shrews: yes | 19:23 |
clarkb | Shrews: note these failures were due to tcp not being able to create a connection for the glance api requests | 19:25 |
Shrews | clarkb: looks like nb03 is super busy deleting atm | 19:25 |
Shrews | oh , no. that was a couple of hours ago | 19:25 |
clarkb | that 7844 number seems pretty stable over the last little while | 19:26 |
clarkb | and I think the only things that need deleting are the failed zk records | 19:26 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Allow template lookup in untrusted context https://review.opendev.org/706963 | 19:27 |
corvus | tobiash, tristanC, mordred: ^ another access barrier i ran into that i think we can open up | 19:27 |
clarkb | is jinja turing complete? | 19:28 |
clarkb | that might be a little bit more dangerous (though I suppose if ansible is evaluating them on the executor side for all jobs then it doesn't matter, and i'm not sure where it does that evaluation) | 19:28 |
Shrews | clarkb: oh, i bet we didn't copy over the builder_id.txt file | 19:28 |
clarkb | Shrews: probably not, but these records should've been deleted by the original builder a week ago if the timeout is a minute | 19:29 |
clarkb | (I don't think the server switch is the problem here) | 19:29 |
Shrews | well, if the old server couldn't delete the uploads, the records wouldn't be deleted | 19:29 |
clarkb | there were no uploads | 19:30 |
clarkb | because the glance api was completely unreachable | 19:30 |
clarkb | (tcp did not work) | 19:30 |
Shrews | then I return to my original suspicion that if there is no upload, the zk record remains | 19:30 |
Shrews | which is something we should probably fix | 19:31 |
clarkb | it is possible the server switch would prevent those records from being deleted now if it was otherwise working. But I think the underlying bug is probably something like ^ and was present in the original server | 19:31 |
corvus | clarkb: i think we already allow templating on the executor, we just don't allow it in a lookup plugin. i think the side-effects that can be performed with a jinja template are limited (ie, i don't think you can read/write arbitrary files, other than using these lookup plugins (filters)) | 19:32 |
*** jamesmcarthur has quit IRC | 19:34 | |
*** igordc has joined #zuul | 19:35 | |
mordred | corvus: yeah - I think you're right about that | 19:41 |
*** jamesmcarthur has joined #zuul | 19:43 | |
mordred | Shrews, clarkb: I could imagine the situation being that there was no upload (because tcp errors) - which is, to nodepool, a retriable condition for the most part. that means it's actually most of the time the _right_ thing for nodepool to do to keep the record | 19:44 |
mordred | except for the case where a builder is deployed that is not able to talk to its clouds over TCP - which isn't a circumstance that tends to emerge under normal operating conditions | 19:44 |
clarkb | mordred: but we get a new record when it retries right | 19:45 |
clarkb | we indexby attempt | 19:45 |
mordred | hrm. yeah - if that's what we're keeping around I can see us fixing that - was more pointing out that nb not being able to delete the thing that's in-flight that needs to go away is a decent reason to keep the record around - so that it knows it needs to delete the remote object | 19:46 |
mordred | or, rather, that our current failure is somewhat pathological - so figuring out the "right" way to deal with it might be ... complicated | 19:47 |
Shrews | did the provider name change? | 19:49 |
Shrews | from linaro-us to linaro-london maybe? | 19:50 |
clarkb | no, they are different regions | 19:50 |
Shrews | k | 19:50 |
clarkb | (and both continue to exist) | 19:51 |
*** jamesmcarthur has quit IRC | 19:52 | |
*** igordc has quit IRC | 19:53 | |
*** jamesmcarthur has joined #zuul | 19:57 | |
Shrews | clarkb: mordred: ok, i think i might have a handle on what happened. the old server was continually failing, generating lots of failed records (which would have automatically been cleaned up later if the upload condition corrected itself). The server rebuild did not carry over the unique nodepool builder id file (https://nb03.openstack.org/images/builder_id.txt), so when the new server goes to cleanup, it sees those records and says (oh, these | 20:00 |
Shrews | aren't my uploads, so i can't safely remove them). I think cleaning up will be a manual process. | 20:00 |
Shrews | i'll code up a script that checks for the old server ID and removes the records | 20:01 |
Shrews | i don't think we can safely program nodepool to do that for us | 20:01 |
mordred | Shrews: so if we'd copied the builder_id.txt file things would have cleaned up naturally after themselves | 20:01 |
Shrews | mordred: yes | 20:01 |
Shrews | i *thought* we had that documented somewhere, but i am failing to find it | 20:02 |
mordred | cool. i agree - I don't think there's a good safe way to update nodepool to do that automatically\ | 20:02 |
Shrews | mordred: now, we *could* have that file contain multiple IDs... | 20:03 |
Shrews | oh, maybe not | 20:03 |
Shrews | b/c then it wouldn't know which one to use going forward... though i suppose it could just pick whichever | 20:03 |
Shrews | or we could tag one as primary, others as alternate | 20:04 |
* Shrews waves arms wildly in air in true mordred fashion | 20:05 | |
Shrews | anyway, will write that script now... | 20:05 |
Shrews | hrm, i may have to take that evaluation back. we also compare hostname | 20:12 |
Shrews | and that's the same. the mystery deepens | 20:12 |
tobiash | corvus: I have a comment on 706963 | 20:15 |
mordred | tobiash: oh - good catch | 20:31 |
corvus | tobiash: wow, thanks, that is indeed what went wrong :) | 20:42 |
openstackgerrit | James E. Blair proposed zuul/zuul master: llow template lookup in untrusted context https://review.opendev.org/706963 | 20:44 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Allow template lookup in untrusted context https://review.opendev.org/706963 | 20:45 |
corvus | let's see if that looks right | 20:45 |
mordred | corvus: lgtm | 20:50 |
mordred | corvus: of course, I missed the symlink thing before, so it's really mostly important that it looks good to tobiash | 20:50 |
*** rh-jelabarre has joined #zuul | 20:51 | |
*** jamesmcarthur has quit IRC | 20:53 | |
ianw | $ nodepool image-list | grep linaro- | grep failed | wc -l | 20:54 |
ianw | 7917 | 20:54 |
ianw | | 0000086134 | 0000000622 | linaro-london | ubuntu-xenial-arm64 | None | None | failed | 07:18:08:51 | | 20:54 |
ianw | e.g. ^ | 20:54 |
ianw | does anyone know how to get rid of these? | 20:54 |
clarkb | ianw: ya was talking to shrews and mordred about it earlier, and I think shrews thinks it may be a bug in nodepool | 20:57 |
clarkb | ianw: the problem is that the failed uploads should be cleaned up once successful ones happen (we now have successful uploads but no cleanup) | 20:57 |
Shrews | ftr, i have absolutely NO idea why these failure records aren't being deleted right now | 20:57 |
ianw | ahh, sorry, ok checking scrollback | 21:00 |
*** jamesmcarthur has joined #zuul | 21:00 | |
Shrews | this is exceedingly frustrating to debug | 21:05 |
Shrews | clarkb: ianw: you know what... i *think* nb03 might be stuck trying to delete an instance | 21:12 |
Shrews | the very last log entry is: 2020-02-10 17:18:21,993 INFO nodepool.builder.CleanupWorker.0: Deleting image build debian-buster-arm64-0000080896 from linaro-london | 21:13 |
mordred | Shrews: and thus can't do the cleanup, beause it's stuck? | 21:13 |
Shrews | clarkb: ianw: and it still shows as 'active' with openstack --os-cloud=linaro-london --os-region-name=London image show c45a9a08-1e75-4647-beeb-5e4a3a74f8c0 (which is the corresponding instance) | 21:13 |
clarkb | Shrews: the london delete failures are a known issue and part of the motivation for allowing us to delete files on disk early (as soon as upload records transition to deleting, before they actually delete) | 21:14 |
Shrews | mordred: yeah | 21:14 |
*** jamesmcarthur_ has joined #zuul | 21:14 | |
clarkb | wouldn't that be handled by a separate thread though as they are in different providers? | 21:14 |
clarkb | (I thought cleanups were per provider, but I could be wrong about that) | 21:14 |
ianw | also ... nb03 only started fresh yesterday; so *it* is having issues deleting in linaro london? i.e. linaro london glance is borked? | 21:15 |
clarkb | ianw: yes, but so did the old nb03 | 21:15 |
Shrews | clarkb: we have a single cleanup thread for all providers | 21:15 |
clarkb | ianw: I've also manually tried to delete these images in the past | 21:15 |
Shrews | and the image delete is not a new thread | 21:15 |
clarkb | ah ok | 21:16 |
ianw | clarkb: ok ... i hadn't considered that. i can loop in with kevinz on that | 21:16 |
Shrews | so no further cleanup can happen if that request gets "stuck" | 21:16 |
clarkb | I've sent email to kevinz about those in the past | 21:16 |
clarkb | let me see if I can find it | 21:16 |
clarkb | hrm did I only mention it on irc? | 21:16 |
*** jamesmcarthur has quit IRC | 21:17 | |
clarkb | ianw: in any case I believe all those stale images in london region can be deleted, nodepool and I have tried in the past but they just won't delete | 21:18 |
clarkb | it does make me wonder if some other tenant is using them as BFV or something | 21:18 |
mordred | clarkb: we didn't list them as public ... so other tenants shouldn't be able to do that | 21:19 |
Shrews | btw, i said "because it can't delete an instance" but meant "image", but that's probably clear now.. sorry | 21:19 |
clarkb | Shrews: yup your pasted example clarified it :) | 21:20 |
clarkb | Shrews: we should probably update nodepool to be greedy there and not short circuit? | 21:20 |
Shrews | clarkb: eh? | 21:21 |
clarkb | Shrews: after the image delete fails for that stale image, continue trying to delete the next things | 21:22 |
clarkb | and eventually it should get to deleting those failed uploads righ? | 21:22 |
Shrews | clarkb: the problem is it is NOT failing | 21:22 |
ianw | clarkb: does it fail, or not fail but also not delete? | 21:22 |
Shrews | if it failed, it would continue on deleting stuff | 21:22 |
clarkb | Shrews: so we only try to delete one thing per cleanup run? | 21:22 |
Shrews | no | 21:23 |
*** rfolco has quit IRC | 21:23 | |
clarkb | then why doesn't it continue to the next upload records? | 21:23 |
mordred | in each run we try to delete the things - but we delete them serially | 21:23 |
mordred | clarkb: because it's hung | 21:23 |
Shrews | each run, it collects all the cruft that needs cleaned up and deletes each in turn. but this delete request is not returning, apparently | 21:23 |
clarkb | hrm when I tried to manually delete those in the past it did return | 21:23 |
clarkb | it just never actually deleted things | 21:23 |
clarkb | maybe osc and shade do it differently though | 21:24 |
mordred | clarkb: there are many ways in which cloud APIs can break | 21:24 |
fungi | that could be openstack's official slogan | 21:24 |
mordred | clarkb: they _definitely_ do things differently. although gtema does have a patch up to switch osc to using sdk for image operations | 21:24 |
mordred | at which point they'll be much more similar | 21:25 |
ianw | http://paste.openstack.org/show/789394/ <- linaro-london's image list right now | 21:25 |
Shrews | yep. there should be a max of 2 per image | 21:26 |
Shrews | that list was my clue something was up :) | 21:26 |
ianw | i'm trying to manual delete 6b195f5f-abb5-445a-a248-1da30ef3a26b | 21:27 |
clarkb | ya we normally see that in the BFV clouds which is why I theorized that could be related here | 21:27 |
ianw | it is, i would say, hanging | 21:27 |
clarkb | because some instance will hang on to the image with its volume preventing the image from deletin | 21:27 |
Shrews | ianw: are you using osc or an sdk script? | 21:29 |
clarkb | unrelated I see the problem with the virtualenv update | 21:30 |
fungi | oh? | 21:31 |
ianw | Shrews: osc | 21:31 |
Shrews | ianw: cool, so broken in both code paths | 21:31 |
clarkb | fungi: https://virtualenv.pypa.io/en/latest/cli_interface.html#section-seeder seems new virtualenv is more conservative about installing new packages? | 21:32 |
clarkb | hrm actually rereading that it is not going to download newer setuptools but should bundle a version? | 21:32 |
fungi | i mean, when i use virtualenv 20 to create and environment, it preinstalls setuptools (45.x) | 21:33 |
ianw | Shrews: yep, running with debug it's just a "curl -g -i -X DELETE -H 'b'Content-Type': b'application/octet-stream'' -H 'b'X-Auth-Token'" that is never returning. we'll have to take it up with the cloud i guess | 21:33 |
fungi | er, create AN environment | 21:33 |
Shrews | all: fwiw, i am taking the rest of the week off in a use-it-or-lose-it vacation scheme i seem to be mixed up in | 21:33 |
clarkb | Shrews: seems like you should use it then | 21:33 |
mordred | Shrews: are you going to do anything fun? | 21:34 |
fungi | vacation ponzi schemes are the worst | 21:34 |
Shrews | mordred: as soon as i requested the days, the weather decided that it would rain for the entirety of the duration.... so, who knows | 21:34 |
clarkb | fungi: ya I think I misread the don't download default as meaning no install but that just means don't update by default but use the bundled setuptools instead | 21:34 |
clarkb | fungi: do you have a pkg_resources when you do that? | 21:35 |
fungi | it's part of stdlib | 21:35 |
fungi | but yes | 21:35 |
mordred | fungi: go get 5 people to give you some of their vacation, and give me a small percentage of what you get- then get those people to do the same - and by the time the scheme has worked its magic, you'll have permanent vacation! | 21:35 |
mordred | Shrews: "yay"! | 21:35 |
corvus | mordred, tristanC, tobiash, paladox|UKInEU: gerrit's zuul is self-deploying! https://gerrit-zuul.inaugust.com/t/gerrit/build/ec3af27521ba4c759b9c99c142074905/console | 21:35 |
* fungi is on permanent vacation, aerosmith style | 21:35 | |
clarkb | fungi: pkg_resources is part of setuptools not stdlib | 21:35 |
clarkb | fungi: thinking that maybe the bundled setuptools lacks that potentially | 21:36 |
mordred | corvus: \o/ | 21:36 |
fungi | clarkb: oh, but installed as a separate top-level module from the same package? | 21:36 |
paladox|UKInEU | corvus \o\o/ | 21:36 |
clarkb | fungi: ya | 21:36 |
*** rfolco has joined #zuul | 21:37 | |
clarkb | I am able to import pkg_resources on a virtualenv 20.0.1 installed venv though | 21:37 |
fungi | indeed, lib/python3.8/site-packages/pkg_resources within the virtualenv i created is a symlink to ~/.local/share/virtualenv/seed-v1/3.8/image/SymlinkPipInstall/setuptools-45.1.0-py3-none-any/pkg_resources | 21:41 |
clarkb | fungi: mine too, could this be a simple permissions thing? | 21:44 |
fungi | mayhaps | 21:44 |
clarkb | ya I bet that is it because bwrap | 21:44 |
clarkb | if we have symlinks in the bwrap context to things that aren't bind mounted in they disappear | 21:44 |
fungi | i wouldn't be shocked to discover that's the case anyway | 21:44 |
clarkb | we could run it with allows copy set to true then I bet it would work | 21:45 |
mordred | oh - yeah | 21:45 |
*** jamesmcarthur_ has quit IRC | 21:45 | |
*** Goneri has quit IRC | 21:46 | |
clarkb | I think new virtualenv is trying to be smart and dedup common libs like setuptools and pkg_resources, but when we mount with bwrap we break that | 21:46 |
fungi | pip's going to want to use ~/.local for caching as well, so getting that working might help performance anyway | 21:46 |
clarkb | fungi: well the idea here is that we preinstall everything so you don't do any of those installs at job run time | 21:46 |
fungi | unless that opens a security risk | 21:46 |
fungi | ahh, yeah | 21:47 |
mordred | mind blown at the virtualenv deduping | 21:47 |
fungi | we definitely don't want to share data cached by one build in another | 21:47 |
fungi | that would be a serious cache poisoning hole | 21:48 |
mordred | that actually _completely_ violates assumptions I operate under on how virtualenv works - glad I now know | 21:48 |
* mordred edits system libraries with print statements inside of virtualenvs because it's "safe" to do so | 21:48 | |
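A small illustration of what clarkb and fungi just pieced together: virtualenv 20's default app-data seeder leaves symlinks in the venv that point into ~/.local/share/virtualenv, and those targets are absent inside the bubblewrap sandbox unless they are bind-mounted. The venv path below is a placeholder:

```python
# Report venv entries whose symlink targets live outside the venv; those
# would dangle inside bwrap without an explicit bind mount.
import glob
import os

venv = os.path.expanduser("~/ansible-venv")  # placeholder path
for entry in glob.glob(os.path.join(venv, "lib/python*/site-packages/*")):
    if os.path.islink(entry):
        target = os.path.realpath(entry)
        if not target.startswith(venv):
            print(f"{entry} -> {target}  (missing inside the sandbox)")
```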
clarkb | that poses two new problems for us. The first is that old virtualenv and new virtualenv may not have the same configuration/arguments so we will need to detect that potentially. The other is that we probably do want symlinks of python itself but not the libs | 21:49 |
fungi | mordred: i just `python3 -m venv` normally and have only touched virtualenv locally for python2.7 things since ages | 21:49 |
mordred | at least it only does that for the pip/setuptools basics | 21:49 |
fungi | the venv module does not symlink foo/lib/python3.8/site-packages/pkg_resources to anywhere special, i just confirmed | 21:50 |
clarkb | fungi: ok so maybe we just use the venv module. I don't remember why it wasn't used initially. I seem to recall there was a reason though | 21:50 |
mordred | yeah - there's some $reason right? | 21:51 |
clarkb | ya I remember it coming up in reviews | 21:51 |
clarkb | I want to say it had to do with distros like ubuntu not actually packaging it properly | 21:51 |
openstackgerrit | Merged zuul/zuul master: Allow template lookup in untrusted context https://review.opendev.org/706963 | 21:51 |
clarkb | but that memory is fuzzy | 21:51 |
clarkb | I wonder if we set it to download those libs whether they would go in the common lib dir or in the virtualenv directly | 21:52 |
mordred | yeah - I think that's right - like, distro python excludes venv or something | 21:52 |
fungi | clarkb: python2.7 maybe? | 21:52 |
fungi | and yes, ubuntu needs python3-venv installed because they separate it out (because debian does) | 21:53 |
clarkb | fungi: well zuul requires python3. Ansible can run under both. I thought venv can create a python2 venv it just needs to be started from python3 | 21:53 |
fungi | same for python3-pip | 21:53 |
mordred | yup | 21:53 |
fungi | and python3-pkg-resources | 21:53 |
fungi | which they separate out of python3-setuptools | 21:53 |
fungi | anyway, it looks to me like virtualenv is only symlinking easy_install, pip, pkg_resources, setuptools and wheel | 21:54 |
fungi | at least by default | 21:54 |
clarkb | ya its the seed packages | 21:54 |
mordred | yah | 21:54 |
mordred | so it's a known set | 21:55 |
fungi | i expect if you specify other stuff with --seed-packages those will get the same treatment | 21:55 |
clarkb | there are two --seeder options too | 21:55 |
clarkb | default is app-data the other is pip | 21:55 |
clarkb | I wonder if we used pip if it would install normally | 21:55 |
clarkb | because pip wouldn't know about the weird bundling and symlinking | 21:55 |
* clarkb tests | 21:55 | |
fungi | indeed, that solves it | 21:56 |
fungi | --seeder=pip gets rid of the symlinks | 21:56 |
clarkb | yup confirmed locally | 21:56 |
fungi | --seeder=app-data makes them come back | 21:56 |
fungi | so that or python3 -m venv may be the simplest workarounds | 21:57 |
clarkb | ya | 21:57 |
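Both workarounds mentioned above, in roughly the command-line form the executor already uses (python -m virtualenv ...); the target path is only an example:

```python
# Option 1: keep virtualenv 20 but seed via pip, so seed packages are
# real copies rather than app-data symlinks.  Option 2: stdlib venv.
import subprocess
import sys

venv_path = "/tmp/zuul-ansible-venv"  # example path

subprocess.run(
    [sys.executable, "-m", "virtualenv", "--seeder", "pip", venv_path],
    check=True)

# subprocess.run([sys.executable, "-m", "venv", venv_path], check=True)
```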
fungi | except you need to detect whether virtualenv has a --seeder option | 21:57 |
clarkb | old virtualenv has no seeder option | 21:57 |
clarkb | jinx | 21:57 |
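One possible way to handle the detection fungi raises, without parsing the two different --version output formats: look for the option in the --help output. This is only a sketch of the idea, not what the eventual change did:

```python
# True on virtualenv 20.x (which has --seeder), False on 16.x.
import subprocess
import sys

help_text = subprocess.run(
    [sys.executable, "-m", "virtualenv", "--help"],
    capture_output=True, text=True, check=True,
).stdout
print("supports --seeder:", "--seeder" in help_text)
```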
mordred | I think I like --seeder=pip of the two | 21:57 |
mordred | JOY | 21:57 |
clarkb | we can assert we need current virtualenv to deal with that | 21:58 |
mordred | I mean - we can just specify a min in requirements yeah? | 21:58 |
mordred | clarkb: jinx | 21:58 |
fungi | wfm | 21:58 |
fungi | i don't think anyone's doing distro packaging of zuul yet, right? | 21:58 |
clarkb | fedora people may be? | 21:58 |
fungi | though maybe they're using zuul on platforms where they want to provide virtualenv via system packages | 21:58 |
mordred | I'm pretty sure SF are deploying from packages they build | 21:59 |
fungi | in which case they're going to need to package a bleeding-edge virtualenv (or switch to managing ansible environments manually?) | 21:59 |
mordred | I'm not sure how that interacts with multi-python virtualenv | 21:59 |
clarkb | virtualenv --version is valid in both new and old virtualenv | 22:00 |
fungi | at least zuul doesn't insist on doing that for you if you want to supply your own | 22:00 |
clarkb | of course the format is different in both :) | 22:00 |
mordred | s/multi-python virtualenv/multi-ansible virtualenv/ | 22:00 |
clarkb | but we could switch on that | 22:00 |
mordred | clarkb: I tried virtualenv --version a little bit ago and it did not work | 22:00 |
clarkb | mordred: it is working for me | 22:00 |
clarkb | mordred: what version of virtualenv do you have? mine are 20.0.1 and 16.7.5 | 22:01 |
mordred | oh - no - I misspelled it | 22:01 |
mordred | apparently --verison doesn't work | 22:01 |
fungi | ooh i know, we could write a little python utility to ask pkg_resources to return the virtualenv version and parse that... just need a virtualenv to install it in | 22:01 |
fungi | mmm, --venison | 22:01 |
clarkb | fungi: pkg_resources is really slow too | 22:01 |
mordred | how about we start with a requirements pin and --seeder=pip | 22:01 |
mordred | since that's the most direct expression of what we _want_ to have happen here | 22:01 |
fungi | clarkb: sure, i was mostly creating a catch-22 scenario | 22:01 |
clarkb | mordred: that seems reasonable | 22:02 |
clarkb | should I update tobiash's change? | 22:02 |
mordred | but let's wait for the SF folks to review it and tell us if it makes things unworkable | 22:02 |
mordred | before landing | 22:02 |
fungi | yep, wfm | 22:02 |
fungi | tristanC: ^ when you're around | 22:02 |
clarkb | https://review.opendev.org/#/c/706871/1 is the change. I'll push up a new ps | 22:02 |
mordred | clarkb: and yeah - go ahead | 22:02 |
corvus | ftr, i have caught up on this issue in scrollback and. wow. | 22:03 |
mordred | corvus: ikr? | 22:04 |
fungi | it is quite wow-inducing | 22:05 |
fungi | to be fair, the virtualenv maintainers announced a 20 beta a couple weeks ago complete with release notes and asked for folks to test it out | 22:06 |
fungi | i didn't consider at the time the fact that zuul was using it to manage ansible environments | 22:07 |
fungi | though i still feel like eventually switching to the stdlib venv module might be a friendlier solution | 22:07 |
openstackgerrit | Clark Boylan proposed zuul/zuul master: Uncap virtualenv https://review.opendev.org/706871 | 22:07 |
clarkb | I've tried to capture what the issue is in the commit message there | 22:08 |
corvus | zuul-maint: remote: https://review.opendev.org/706989 Add gcp-authdaemon to Zuul | 22:11 |
*** dpawlik has quit IRC | 22:14 | |
*** Defolos has joined #zuul | 22:15 | |
*** dmsimard is now known as dmsimard|off | 22:17 | |
corvus | mordred: i'd like to install 'kubectl' and 'oc' in the zuul-executor image (because i think [especially since it's a *container* image] those commands are very likely to prove useful to people using the image). what do you think is the best way to do that? | 22:27 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Install kubectl/oc into executor container image https://review.opendev.org/706995 | 22:44 |
corvus | mordred, tristanC: ^ | 22:44 |
*** openstackgerrit has quit IRC | 22:46 | |
*** jamesmcarthur has joined #zuul | 23:22 | |
*** openstackgerrit has joined #zuul | 23:29 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: web: link to index.html if index_links is set https://review.opendev.org/705585 | 23:29 |
*** rh-jelabarre has quit IRC | 23:37 | |
*** jamesmcarthur has quit IRC | 23:39 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Install kubectl/oc into executor container image https://review.opendev.org/706995 | 23:43 |
*** jamesmcarthur has joined #zuul | 23:50 | |
*** jamesmcarthur has quit IRC | 23:52 | |
*** Defolos has quit IRC | 23:53 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Install kubectl/oc into executor container image https://review.opendev.org/706995 | 23:53 |
*** sgw has quit IRC | 23:56 |