*** myoung|ruck|afk is now known as myoung|ruck | 00:14 | |
*** rlandy is now known as rlandy|bbl | 00:16 | |
pabelanger | dmsimard|off: why does ARA need to be installed for ara-report when ara_report_type == database? | 00:27 |
---|---|---|
pabelanger | install on executor? | 00:27 |
pabelanger | as I understand it, the log server would run ARA against the database when somebody hits the web right? | 00:28 |
pabelanger | oh, It is so the database is generated to start with, when we run ansible-playbook | 00:29 |
pabelanger | :) | 00:30 |
pabelanger | kk, sorry for the noise | 00:30 |
dmsimard|off | pabelanger: yeah, the executors need to have the callback enabled to have the database. The callback will be split into another python module in 1.0 so it's less of a pain. | 00:35 |
pabelanger | yah, thanks. I forgot that step. I do think ara-report could only check to make sure the database was found, since ara-report doesn't actually need to run ara at that point | 00:40 |
pabelanger | will test tomorrow and find out, ara is in a virtualenv, and not sure if the role will find it | 00:40 |
*** elyezer has quit IRC | 00:58 | |
*** elyezer has joined #zuul | 01:10 | |
*** elyezer_ has joined #zuul | 01:12 | |
*** elyezer has quit IRC | 01:16 | |
*** ssbarnea_ has quit IRC | 01:34 | |
*** rlandy|bbl has quit IRC | 02:16 | |
*** snapiri has joined #zuul | 05:32 | |
*** smyers has quit IRC | 05:44 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Support merged as requirement in github driver https://review.openstack.org/568488 | 05:44 |
*** smyers has joined #zuul | 05:45 | |
*** AJaeger has quit IRC | 06:38 | |
*** AJaeger has joined #zuul | 06:43 | |
*** AJaeger has quit IRC | 07:27 | |
*** AJaeger has joined #zuul | 07:29 | |
*** gtema has joined #zuul | 07:42 | |
*** jpena|off is now known as jpena | 07:50 | |
*** dims has quit IRC | 07:59 | |
*** dims has joined #zuul | 08:02 | |
*** dims has quit IRC | 08:07 | |
*** dims has joined #zuul | 08:07 | |
*** ssbarnea_ has joined #zuul | 08:34 | |
*** ekan is now known as johanssone | 09:12 | |
*** corvus has quit IRC | 10:29 | |
*** corvus has joined #zuul | 10:30 | |
*** hashar has joined #zuul | 10:33 | |
*** hashar has quit IRC | 11:02 | |
*** ssbarnea_ has quit IRC | 11:14 | |
*** jpena is now known as jpena|lunch | 11:47 | |
*** ssbarnea_ has joined #zuul | 11:47 | |
*** rlandy has joined #zuul | 12:29 | |
*** jpena|lunch is now known as jpena | 12:43 | |
*** elyezer_ has quit IRC | 13:30 | |
*** gtema has quit IRC | 13:36 | |
*** elyezer_ has joined #zuul | 13:43 | |
*** acozine1 has joined #zuul | 13:43 | |
*** gtema has joined #zuul | 14:20 | |
*** dkranz has quit IRC | 15:07 | |
*** dkranz has joined #zuul | 15:09 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Replace use of aiohttp with cherrypy https://review.openstack.org/567959 | 15:17 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Convert streaming unit test to ws4py and remove aiohttp https://review.openstack.org/568335 | 15:17 |
*** gtema has quit IRC | 16:15 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Replace use of aiohttp with cherrypy https://review.openstack.org/567959 | 16:30 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Convert streaming unit test to ws4py and remove aiohttp https://review.openstack.org/568335 | 16:30 |
mordred | corvus: 568028 has 3 +2s - should we hold off on landing it until the other two are ready? | 16:32 |
corvus | mordred: yeah, i've miped it for now. also we should probably land the mqtt change first; i think it might conflict | 16:33 |
corvus | that's https://review.openstack.org/535543 | 16:33 |
corvus | i think it's ready to go, but i didn't want to land a major change since i'm so distracted; but if others are around and want to, i think that's fine. | 16:34 |
mordred | corvus: I'm going to run to the store, but I can maybe land it and keep an eye on things when I get back ... do we have a patch anywhere to connect it to firehose? | 16:37 |
corvus | mordred: i don't think so | 16:42 |
*** jpena is now known as jpena|off | 17:11 | |
*** sshnaidm|rover is now known as sshnaidm|off | 17:29 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Status branch protection checking for github https://review.openstack.org/535680 | 17:39 |
tobiash | finally added tests to ^ :) | 17:42 |
SpamapS | Looks like the split of zk so far is preventing all the nodepool waiting | 17:47 |
SpamapS | and in fact with that happening, now the executor gets load up to the governor | 17:47 |
tobiash | SpamapS: so happy end now :) | 17:48 |
pabelanger | woot | 17:48 |
pabelanger | SpamapS: how larger was your shared zookeeper before the move? | 17:48 |
pabelanger | err | 17:49 |
pabelanger | smaller* | 17:49 |
SpamapS | pabelanger: so I have everything on a single 16GB 8vcpu VM that tops out at around 400 IO/s | 17:50 |
SpamapS | moved Zookeeper to an 8GB 4vcpu VM on its own | 17:50 |
pabelanger | SpamapS: good info, I too am testing a single VM, but don't really have a lot of jobs currently | 17:51 |
SpamapS | My executor hit the governor load of 20 for the first time a few minutes ago. | 17:52 |
SpamapS | There were probably 25 - 30 concurrent playbooks running | 17:52 |
SpamapS | I should get a graphite/statsd set up so I can tell | 17:53 |
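The "governor load of 20" above can be read as a load-average cap on the executor. The following is only a sketch of that kind of check, not Zuul's actual code: it assumes the cap is a multiplier times the CPU count, and the default of 2.5 is picked purely so that the 8 vCPU VM mentioned above yields 20. The function names are hypothetical.

```python
import os


def over_load_limit(load_avg, cpu_count, multiplier=2.5):
    """Return True when the load average exceeds cpu_count * multiplier.

    The multiplier is an assumption for illustration; 8 CPUs * 2.5 = 20,
    matching the load figure quoted in the conversation.
    """
    return load_avg > cpu_count * multiplier


def host_is_over_limit(multiplier=2.5):
    """Apply the same check to the local host's 1-minute load average (Unix only)."""
    load_1min = os.getloadavg()[0]
    cpus = os.cpu_count() or 1
    return over_load_limit(load_1min, cpus, multiplier)
```

A process that hands out work could call `host_is_over_limit()` before accepting a new job and pause intake while it returns True.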
corvus | SpamapS: i wonder if there's a metric or something we could log to help identify this problem? | 17:53 |
SpamapS | corvus: zookeeper was warning me | 17:54 |
SpamapS | Apr 19 14:50:49 zuul.cloud.phx3.gdg zookeeper[2178]: 2018-04-19 14:50:49,268 - WARN [SyncThread:0:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:0 took 3451ms which will adv | 17:54 |
SpamapS | corvus: we should tell people to watch out for that | 17:54 |
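One way to "watch out for that" is to scan the ZooKeeper log for the fsync warning quoted above and pull out the reported duration. This is a minimal sketch: the regular expression is derived only from the single log line pasted in this conversation, and the function name and the 1000 ms default threshold are hypothetical.

```python
import re

# Pattern based on the one warning line quoted in the log above.
FSYNC_WARNING = re.compile(
    r"fsync-ing the write ahead log in (?P<thread>\S+) took (?P<ms>\d+)ms"
)


def slow_fsyncs(lines, threshold_ms=1000):
    """Yield (thread, milliseconds) for each fsync warning at or above threshold_ms."""
    for line in lines:
        match = FSYNC_WARNING.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            yield match.group("thread"), int(match.group("ms"))


# The warning as pasted in the channel (truncated in the original).
sample = (
    "2018-04-19 14:50:49,268 - WARN [SyncThread:0:FileTxnLog@338] - "
    "fsync-ing the write ahead log in SyncThread:0 took 3451ms which will adv"
)

for thread, ms in slow_fsyncs([sample]):
    print(f"{thread}: fsync took {ms} ms")
```

Feeding it an open log file (any iterable of lines) instead of the sample list would give a rough count of how often the disk is stalling ZooKeeper.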
tobiash | SpamapS: did you also enable autopurge.purgeInterval? | 17:55 |
tobiash | that makes sure that you don't fill up all your space with snapshots | 17:55 |
tobiash | (or in my case it filled its data tmpfs with snapshots and oomed once a week) | 17:56 |
SpamapS | tobiash: no I just run the cleanout script | 17:57 |
SpamapS | but putting that on tmpfs would probably make it pretty fast. :) | 17:57 |
tobiash | ok, that's the other option ;) | 17:57 |
tobiash | SpamapS: yes but then you should spread it to several vms | 17:58 |
tobiash | I'm running it with 5 replicas on tmpfs | 17:59 |
*** electrofelix has quit IRC | 18:01 | |
SpamapS | tobiash: IIRC that does not improve write load | 18:02 |
tobiash | SpamapS: I know, more zk are actually slower but I have 5 to reduce the risk of data loss and having to rebuild all images | 18:03 |
fungi | tmpfs (in linux anyway) is really just the kernel's filesystem cache layer divorced from any underlying physical block layer and granted the ability to page out to swap | 18:03 |
tobiash | I think the recommendation was 3, 5 or at max 7 | 18:03 |
fungi | so when you've got available ram, filesystem caching is roughly as performant as tmpfs, and when you don't have available ram tmpfs is using swap which makes it about the same performance as any actual block-backed fs | 18:05 |
SpamapS | tobiash: ah yeah, I should do that. ;) | 18:05 |
SpamapS | Loss of nodepool data would mostly mean that you have to clean up all the ready nodes and images. | 18:06 |
tobiash | fungi: zk does many fsync calls afaik and my nodes don't have swap ;) | 18:06 |
SpamapS | fungi: in the past swapping was far less performant than filesystem flushing. | 18:06 |
Shrews | SpamapS: awesome. i suspected the issue had to be environmental, but glad you confirmed. zk gets a LOT of traffic | 18:09 |
SpamapS | yeah, it's just weird because it wasn't showing as much io wait (15-20 percent) so was kinda hidden. | 18:10 |
*** elyezer_ has quit IRC | 18:10 | |
*** elyezer has joined #zuul | 18:12 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Replace use of aiohttp with cherrypy https://review.openstack.org/567959 | 18:12 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Convert streaming unit test to ws4py and remove aiohttp https://review.openstack.org/568335 | 18:12 |
Shrews | SpamapS: iirc, we have nodepool code that would clean up leaked instances (due to zk data loss) | 18:30 |
Shrews | i can't remember if there is an images equivalent | 18:30 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Simplify driver API https://review.openstack.org/568704 | 18:32 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Simplify driver API https://review.openstack.org/568704 | 18:34 |
tobiash | Shrews: afaik there is no image equivalent | 18:36 |
tobiash | but maybe that would make sense | 18:36 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Simplify driver API https://review.openstack.org/568704 | 18:37 |
Shrews | tobiash: yeah, maybe | 18:37 |
tobiash | I think we would need to add a nodepool_provider_name property also to the images and add a cleanup thread | 18:37 |
Shrews | yeah, we'd need something to say "we own this" | 18:37 |
tobiash | we already have upload and build id but that doesn't tell us that information | 18:38 |
Shrews | tristanC (and any EasyStack folks who may silently linger here): I know 568704 is totally going to cause havoc on your driver proposals, but I think that is going to be a much easier interface, if you find time to take a quick peek. | 18:42 |
Shrews | no rush, obviously | 18:42 |
Shrews | i still need to get to the end of what i'm changing to validate that's actually going to work for us. more changes may come | 18:44 |
*** gtema has joined #zuul | 19:00 | |
*** gtema has quit IRC | 19:07 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: WIP: Cleanup leaked images https://review.openstack.org/568937 | 19:09 |
tobiash | Shrews: that's a first ugly wip ^ | 19:09 |
tobiash | but I think we might want to move that into the provider and add a cleanupLeakedImages function to the driver api? | 19:10 |
Shrews | tobiash: it should be in the builder, not launcher | 19:15 |
tobiash | oh right, there is also a cleanup worker | 19:18 |
corvus | mordred: the cherrypy change is ready -- it passes tests locally, however, it's hitting process-returncode failures in the gate with no output. do you think the stestr change would help illuminate the problem? | 19:47 |
mordred | corvus: MAYBE? | 19:48 |
mordred | corvus: I mean, it's worth a depends-on | 19:50 |
corvus | mordred: any idea where that ended up? :) | 19:51 |
corvus | mordred: oh, i think it was merged and reverted, and there's no unrevert | 19:51 |
corvus | 536882 was original | 19:51 |
corvus | i'll push up an unrevert and stack on it | 19:52 |
mordred | cool | 19:52 |
mordred | iirc, I think there was an issue with the original too that we uncovered that we were going to fix when we unreverted - although it might have just been a documentation issue | 19:53 |
corvus | yeah, i wish i had been more verbose in the revert commit :| | 19:53 |
corvus | i know one of the errors i ran into was my fault; i can't recall the others | 19:54 |
mordred | me either | 19:54 |
mordred | however - I can help re-diagnose them as soon as you encounter them | 19:54 |
corvus | yeah, as long as we aren't in a rush and can take some time to poke at it over the next couple weeks i'm sure we can sort it out | 19:55 |
mordred | ++ | 19:55 |
corvus | we just switched nodepool to stestr | 19:55 |
corvus | and i think Shrews may have ironed out some things there? | 19:55 |
corvus | so probably worth refreshing the zuul stestr change to account for any differences that ended up in the nodepool one | 19:56 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Revert "Revert "Switch to stestr"" https://review.openstack.org/568949 | 20:03 |
Shrews | only issue i had with the nodepool testr change was being able to run ttrun | 20:08 |
Shrews | which mordred fixed | 20:08 |
Shrews | s/testr/stestr | 20:09 |
fungi | corvus: any reason to hold off approving the mqtt publisher? lgtm but since it's accumulating +2s i didn't know if there was a reason to wait, seeking more feedback, something | 20:14 |
mordred | fungi: I think it's mostly making sure there's someone to watch it since it's a big change - I was gonna do it after lunch but got sidetracked | 20:15 |
fungi | okay, cool | 20:15 |
fungi | i'm already on the hook for watching the storyboard update unfold | 20:15 |
fungi | which looks like it's going to apply any moment now | 20:16 |
corvus | yeah, i just didn't want to accidentally make more work for someone since i'm running around with very small timeslices right now :) | 20:20 |
corvus | should be totally fine | 20:20 |
corvus | mordred: ft1.2: tests.unit.test_streaming.TestStreaming.test_decode_boundaries_StringException | 20:49 |
mordred | corvus: that doesn't seem to be much more helpful | 20:50 |
corvus | still looks like alarm clock timeouts leave us with no data | 20:50 |
mordred | corvus: "StringException" isn't clear to you? | 20:51 |
corvus | mordred: oh... hrm... i wonder if this actually has given us the answer... | 20:52 |
mordred | corvus: oh -like, tests.unit.test_streaming.TestStreaming.test_decode_boundaries was the test that bonged? | 20:52 |
corvus | yeah, i think so | 20:52 |
corvus | *maybe* stestr is better at reporting the actual failing test, whereas under testr the failure sometimes gets allocated to the wrong test? | 20:53 |
corvus | i don't know if that's actually the case, or maybe it's just the case that the random number generator happened to run the failing test last this time or something :) | 20:54 |
corvus | but it definitely hangs locally | 20:54 |
mordred | \o/ | 20:56 |
corvus | er, hrm. no actually that only hangs with 568335. when i run it under 567959 it works locally | 20:56 |
*** acozine1 has quit IRC | 21:04 | |
*** dkranz has quit IRC | 21:09 | |
ianw | corvus: I noticed this doing some debugging of aiohttp stuff as well. i removed all the OS_CAPTURE stuff from .testr.conf and then it started spitting out exceptions for me | 21:14 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Replace use of aiohttp with cherrypy https://review.openstack.org/567959 | 21:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Convert streaming unit test to ws4py and remove aiohttp https://review.openstack.org/568335 | 21:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Revert "Revert "Switch to stestr"" https://review.openstack.org/568949 | 21:22 |
corvus | ianw: yeah, if i run out of options, i may have to do that and try to capture the console output to see if i can catch the issue. it's just that in that case we get all of the output, which is enormous, and it's all interleaved :/ | 21:23 |
ianw | what was weird was that i didn't get all the debugging logs, etc. just the exception output | 21:25 |
dims | corvus : mordred : is there a badge functionality in zuul? (like the "build|passing" in travis - https://docs.travis-ci.com/user/status-images/) | 21:34 |
corvus | dims: afaik zuul sends the right things to github for build status to show up... | 21:35 |
mordred | yah - but not a badge like travis has | 21:35 |
mordred | but that's because with zuul there is no other state for a tree to be in | 21:36 |
corvus | oh, that | 21:36 |
mordred | so there could just be a static image badge "gated by zuul" or something | 21:36 |
corvus | yeah, just link to a green box :) | 21:36 |
mordred | we should totally do that | 21:36 |
corvus | ++ | 21:36 |
dims | right!! | 21:36 |
mordred | dims: consider your feature request accepted :) | 21:36 |
dims | w00t | 21:36 |
SpamapS | :-D | 22:10 |
SpamapS | Should just have a badge which is the zuul logo and the number of days since zuul first merged something: "It has been __ Days since the last failed build." | 22:12 |
*** ssbarnea_ has quit IRC | 22:14 | |
*** pabelanger has quit IRC | 22:15 | |
*** _ari_ has quit IRC | 22:16 | |
*** mhu has quit IRC | 22:16 | |
*** myoung|ruck has quit IRC | 22:17 | |
*** weshay has quit IRC | 22:17 | |
*** weshay has joined #zuul | 22:22 | |
*** weshay has quit IRC | 22:26 | |
*** rlandy is now known as rlandy|bbl | 22:28 | |
*** andreaf has quit IRC | 22:29 | |
*** andreaf has joined #zuul | 22:29 | |
*** weshay has joined #zuul | 22:34 | |
*** pabelanger has joined #zuul | 22:34 | |
*** _ari_ has joined #zuul | 22:35 | |
*** myoung has joined #zuul | 22:35 | |
*** mhu has joined #zuul | 22:36 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-website master: Add a "zuul: gated" status badge https://review.openstack.org/568975 | 22:40 |
mordred | SpamapS: ++ | 22:40 |
mordred | dims: ^^ how's that? | 22:40 |
mordred | it'll be at https://zuul-ci.org/gated.png if/when that lands | 22:40 |
corvus | the preview will be ready in just a minute | 22:41 |
mordred | SpamapS: I like your idea too - I sort of feel like we should have a collection of fun/snarky badges people can use | 22:41 |
corvus | http://logs.openstack.org/75/568975/1/check/zuul-website-build/328c5ec/html/gated.png | 22:45 |
SpamapS | I kind of want it to be a chunk of javascript, not a .png | 22:48 |
SpamapS | even if it just displays a png now | 22:48 |
SpamapS | eventually... | 22:48 |
SpamapS | there's some real fun we can have. | 22:48 |
SpamapS | Like we could make it an homage to the McDonalds 1,000,032 burgers served sign... or count how many gate fails there have been and be like "Protected against 45 bad patches. You're welcome." | 22:49 |
mordred | heh | 22:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Add documentation about using the badge https://review.openstack.org/568978 | 22:59 |
fungi | if there's one thing i learned from scouts, is that people love collecting badges | 23:01 |
corvus | badge jokes? we don't need no stinkin' badge jokes! | 23:02 |
fungi | wow! ;) | 23:02 |
mordred | yay for jokes in comments | 23:03 |
*** pabelanger has quit IRC | 23:17 | |
*** mhu has quit IRC | 23:17 | |
*** weshay has quit IRC | 23:17 | |
*** myoung has quit IRC | 23:17 | |
*** _ari_ has quit IRC | 23:18 | |
*** pabelanger has joined #zuul | 23:20 | |
*** weshay has joined #zuul | 23:21 | |
*** _ari_ has joined #zuul | 23:21 | |
*** myoung has joined #zuul | 23:23 | |
ianw | nodepool@nl01:~$ nodepool list | grep arm64 | 23:24 |
ianw | 2018-05-16 23:23:39,227 WARNING kazoo.client: Connection dropped: socket connection error: Permission denied | 23:24 |
*** mhu has joined #zuul | 23:24 | |
ianw | why do i always see that ^ | 23:24 |
*** _ari_ has quit IRC | 23:28 | |
*** _ari_ has joined #zuul | 23:28 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Add documentation about using the badge https://review.openstack.org/568978 | 23:53 |
*** myoung is now known as myoung|ruck | 23:56 |