*** bhavikdbavishi has joined #softwarefactory | 03:13 | |
*** bhavikdbavishi has quit IRC | 04:15 | |
*** bhavikdbavishi has joined #softwarefactory | 04:36 | |
*** bhavikdbavishi has quit IRC | 05:11 | |
*** bhavikdbavishi has joined #softwarefactory | 12:56 | |
*** bhavikdbavishi has quit IRC | 15:54 | |
*** bhavikdbavishi has joined #softwarefactory | 16:10 | |
*** bhavikdbavishi has quit IRC | 16:47 | |
*** bhavikdbavishi has joined #softwarefactory | 16:48 | |
*** bhavikdbavishi has quit IRC | 18:04 | |
*** bhavikdbavishi has joined #softwarefactory | 18:04 | |
*** bhavikdbavishi has quit IRC | 18:29 | |
*** bhavikdbavishi has joined #softwarefactory | 18:39 | |
*** bhavikdbavishi has quit IRC | 18:59 | |
pabelanger | Hmm, seems we have a stuck periodic job: https://ansible-network.softwarefactory-project.io/zuul/status | 22:24 |
---|---|---|
pabelanger | tristanC: any chance you can look at zuul logs and see what is happening? Job should be running directly from executors. | 22:25 |
tristanC | pabelanger: hello, looking now | 23:08 |
tristanC | pabelanger: mmh, no "windmill-config-deploy" or "2853fd9ea4494f70a7d7379e3efc666a" in zuul-executors' log | 23:14 |
pabelanger | odd | 23:15 |
pabelanger | wonder if anything in scheduler | 23:15 |
tristanC | here are the logs for post event: https://softwarefactory-project.io/paste/show/1401/ | 23:16 |
tristanC | and for the periodic: https://softwarefactory-project.io/paste/show/1402/ | 23:17 |
tristanC | looking at logrotated file now | 23:19 |
pabelanger | Hmm, the job is trying to stream logs, for that to happen, I think the executor needs to accept the job | 23:19 |
pabelanger | k | 23:19 |
pabelanger | it almost looks like merger failed, from looking at UI | 23:19 |
tristanC | here are yesterday's logs: https://softwarefactory-project.io/paste/show/1403/ | 23:21 |
tristanC | pabelanger: the executor's log from yesterday: https://softwarefactory-project.io/paste/show/1404/ | 23:25 |
tristanC | seems like there was a network issue | 23:25 |
pabelanger | 2019-01-26 04:00:01,873 DEBUG zuul.ExecutorClient: Received handle b'H:192.168.240.15:961' for <Build 0706a2527db14efa973f9177f06daa59 of windmill-config-deploy voting:True on <Worker Unknown>> | 23:25 |
pabelanger | not sure why Worker Unknown is there | 23:26 |
pabelanger | tristanC: ah, if that is the case, I think there is a use case where we can lose jobs, and they get stuck | 23:26 |
pabelanger | wonder if we can dequeue it | 23:28 |
tristanC | pabelanger: let me try to dequeue | 23:28 |
pabelanger | hopefully so, otherwise we need to restart scheduler | 23:28 |
pabelanger | Yay | 23:29 |
tristanC | debugging zuul is really difficult :( | 23:30 |
pabelanger | +1 | 23:30 |
pabelanger | going to chalk this up to a networking issue. But I also think we should discuss upstream in #zuul, maybe we need some sort of dequeue / requeue after X mins to avoid blocked jobs | 23:32 |
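
Editor's sketch of the "dequeue / requeue after X mins" idea from the last message. This is not Zuul code and does not use any Zuul API. The `QueuedBuild` record, the `find_stuck_builds` helper, and the `max_wait` parameter are all hypothetical names invented for illustration. The only detail taken from the log is that a build which no executor has accepted shows up with worker "Unknown". A real implementation would have to decide how to dequeue and requeue the builds it finds, which this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class QueuedBuild:
    """Hypothetical record of a build waiting in a pipeline (not a Zuul class)."""
    uuid: str
    job_name: str
    enqueued_at: datetime
    # Assumption: stays "Unknown" until an executor accepts the build.
    worker: str = "Unknown"


def find_stuck_builds(
    builds: Iterable[QueuedBuild],
    now: datetime,
    max_wait: timedelta,
) -> List[QueuedBuild]:
    """Return builds that no worker has accepted within max_wait."""
    return [
        build
        for build in builds
        if build.worker == "Unknown" and now - build.enqueued_at > max_wait
    ]
```

A periodic task could call `find_stuck_builds` with something like `max_wait=timedelta(minutes=30)` and hand the result to whatever dequeue / requeue mechanism is agreed on upstream.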