Friday, 2020-09-04

hasharmy ISP is probably too picky since that works fine from a gmail address00:00
hasharI tried with hashar@free.fr00:00
hasharso maybe there is a misconfiguration in your config or it is just my ISP being annoying00:00
corvushashar: we got this from your mail server: "SMTP error from remote mail server after end of data: 550 spam detected"00:01
hasharah yeah00:01
hasharthat is my isp ;]00:01
hasharthank you for checking it!00:01
corvushashar: np, sorry :(00:01
hasharit is one of the largest isp in France and they went with a few hammers when it comes to dealing with inbound spam00:01
hasharthe reason was to ask about the status of opendev/gear since it has a lot of small patches that could use review00:02
*** cloudnull has joined #opendev00:02
corvushashar: ah, i can try to take a pass through those soon.  it mostly "just works" so i haven't really been looking00:04
hasharan idea I had was to write down a mail listing the patches and giving a brief overview for each of them00:05
hasharthat might be less intimidating  / easier to process them in bulk00:05
* hashar mailed the postmaster00:24
*** tkajinam has quit IRC00:59
*** tkajinam has joined #opendev00:59
*** elod has quit IRC01:35
*** elod has joined #opendev01:37
*** hashar has quit IRC01:49
*** euclidsun has joined #opendev02:53
*** euclidsun has left #opendev02:58
*** zbr4 has joined #opendev05:03
*** zbr has quit IRC05:06
*** zbr4 is now known as zbr05:06
*** ysandeep|away is now known as ysandeep05:06
*** qchris has quit IRC06:20
*** qchris has joined #opendev06:33
*** Gyuseok_Jung has quit IRC06:51
yoctozeptohow can infra help us (kolla) deal with the rate-limiting problem of - could we set up a caching docker registry?07:17
*** tosky has joined #opendev07:25
*** andrewbonney has joined #opendev07:40
*** moppy has quit IRC08:01
*** moppy has joined #opendev08:03
*** hashar has joined #opendev08:19
*** pushparajkvp has joined #opendev08:19
*** dtantsur|afk is now known as dtantsur08:24
*** moppy has quit IRC08:28
*** moppy has joined #opendev08:28
*** moppiner has joined #opendev08:32
*** moppy has quit IRC08:33
*** DSpider has joined #opendev08:51
*** pushparajkvp has quit IRC08:54
*** xiaolin has joined #opendev09:15
*** xiaolin has quit IRC09:28
*** stephenfin has quit IRC10:27
*** hashar is now known as hasharAway11:43
*** redrobot has quit IRC12:10
*** Eighth_Doctor has quit IRC12:21
*** mordred has quit IRC12:22
*** mordred has joined #opendev12:30
*** hasharAway has quit IRC12:45
*** Eighth_Doctor has joined #opendev13:00
*** lpetrut has joined #opendev13:21
fungiyoctozepto: we were talking about that yesterday (either in here or #openstack-infra, maybe both)13:50
fungiyoctozepto: docker has promised to publish recommendations for operators of ci systems as to how best to solve the problem, so we're mostly holding out for that13:52
fungithough if running a proxy registry does wind up being their recommended solution, i wonder if we should do a double-layered solution where we used a proxy registry to cache images somewhere centrally, and then pointed our current caching http proxies in each provider at that instead of at dockerhub. that way you get images cached near the nodes, but also have the caches hitting a registry which doesn't rate13:54
fungilimit them (we could even restrict access to it so only our http proxies were allowed to make requests if we needed to mitigate abuse)13:54
fricklerfungi: yoctozepto: the docker blog says "To apply for an open source plan, please complete the short form here.", did anyone do that? Not sure whether they'll announce more details only to those that leave their data there, instead of publically13:56
*** ysandeep is now known as ysandeep|away13:56
fricklerthe form starts by asking for personal data including a docker id13:57
frickleroh, the form even says "Please complete our survey to get more information about how Docker can support your open source project on Docker Hub." at the top14:01
fungii think we also assumed that it would require some sort of authentication to make use of a special plan14:07
*** qchris has quit IRC14:13
*** qchris has joined #opendev14:14
clarkbto be clear we do already cache. The specific issue is we cache blobs not manifests. The old rate limits were based on blob fetches because they are the actual data but docker changed the rate limiting to be based on manifests because people found blob limits confusing14:24
clarkbit is unfortunate because we were doing the right thing for the previous situation14:25
clarkband ya they promised a blog post specifically related to CI14:25
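To illustrate the blob versus manifest distinction discussed above, here is a minimal sketch that classifies registry request paths. The path shapes follow the Docker Registry HTTP API v2; the sample paths and the function names are hypothetical and not taken from the opendev proxy configuration.

```python
import re
from collections import Counter

# Path shapes from the Docker Registry HTTP API v2.
MANIFEST_RE = re.compile(r"^/v2/(?P<name>.+)/manifests/(?P<ref>[^/]+)$")
BLOB_RE = re.compile(r"^/v2/(?P<name>.+)/blobs/(?P<digest>[^/]+)$")


def classify(path):
    """Return 'manifest', 'blob', or 'other' for a registry request path."""
    if MANIFEST_RE.match(path):
        return "manifest"
    if BLOB_RE.match(path):
        return "blob"
    return "other"


def summarize(paths):
    """Count request paths by kind."""
    return Counter(classify(p) for p in paths)


# Hypothetical request paths for a single image pull.
sample = [
    "/v2/",
    "/v2/library/ubuntu/manifests/latest",
    "/v2/library/ubuntu/blobs/sha256:aaaa",
    "/v2/library/ubuntu/blobs/sha256:bbbb",
]
counts = summarize(sample)
print(counts["manifest"], counts["blob"])
```

A proxy that caches only the blob requests still forwards every manifest request upstream, which is the request type the new limits are described as counting.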
funginew yesterday, "Deprecate distutils module"14:35
fungi(in the ongoing setuptools/distutils saga)14:35
fungier, i meant to link but that's the discussion on their discourse14:35
fungimaybe this will finally force distros who want to be able to split files under their package management from those managed by pip et cetera to better come to a compromise with the upstream python devs and package ecosystem14:37
fungisince "just patch distutils" will cease to be an option14:37
fungi"Code that imports distutils will no longer work from Python 3.12."14:39
fungithat's going to be un14:39
fungialso fun14:39
clarkband setuptools is vendoring distutils because it does import distutils?14:42
*** priteau has joined #opendev14:43
clarkbfor docker I expect we have two simple options in the short term. First is stop using our caches, then the requests will be distributed across many more IPs14:57
clarkbSecond is set up per project accounts and then use those with the mirrors so that manifest fetches are associated to accounts not IPs but we still get blob caching for reliability (and perhaps speed)14:58
*** lpetrut has quit IRC15:06
fungisetuptools is vendoring distutils because it needs new distutils features and doesn't want to have to maintain backward compatibility with whatever the implementations in various 5-year-old stdlib might be15:12
fungiand also as indicated by pep 632, the python stdlib maintainers would like to be able to stop maintaining it themselves (it's currently used for building the stdlib modules, but they're looking to switch to using makefiles directly like the interpreter does)15:14
*** mlavalle has joined #opendev15:14
*** rpittau is now known as rpittau|afk15:18
openstackgerritNate Johnston proposed openstack/project-config master: Make the Backport-Candidate field in Octavia reviews persist
*** hashar has joined #opendev15:22
fungihrm, the old afs02.dfw cinder volumes i cleaned up went into error_deleting state for some reason15:24
fungii don't think afs01.dfw's did that15:25
fungi#status log all four cinder volumes for afs02.dfw have been replaced and cleaned up15:26
openstackstatusfungi: finished logging15:26
fungii'll get to work on the dfw mirror server's volume shortly15:26
clarkbinfra-root I intend to catch up on email and any review response, then pop out for a bike ride. When I get back I plan to try booting a server in linaro-us which can serve as our new dockerized nodepool builder for arm15:37
clarkbI do wonder if I should boot a instead to avoid the hostname conflicts but iirc we fixed that in nodepool15:40
clarkband maybe this is a good test of that15:40
clarkbfungi: re error deleting volumes, did they successfully detach from the VM at least?15:48
*** pushparajkvp has joined #opendev15:57
*** dtantsur is now known as dtantsur|afk16:06
fungiyep, or at least cinder thought they did16:12
fungiit reported them as "available" rather than "in-use"16:12
fungiick, the dfw mirror recorded a page allocation failure in xenwatch the same time i attached its new volume16:16
fungiinfra-root: ^ should we turn down that region temporarily and reboot the mirror?16:17
clarkbis it persistently unhappy?16:18
clarkbpage allocation failures would indicate some type of OOM?16:18
fungii have a feeling the new volume wasn't successfully hot-added16:18
fungithat was the first entry in dmesg in roughly a month16:19
fungiand it didn't claim to be out of memory, or even close16:19
clarkbya unless you want to quickly write a partition table and mkfs and do some tests a reboot sounds practical (to be clear I'm saying reboot is probably simpler and easier than the alternative)16:21
yoctozeptofungi, frickler, clarkb: I haven't done that form for sure; glad to know there is plan to have some cache; it might be beneficial in general - dockerhub likes to go awry; please ping me wherever you discuss docker issues, I won't mind but rather be thankful :-)16:23
openstackgerritJeremy Stanley proposed openstack/project-config master: Temporarily disable rax-dfw for mirror reboot
openstackgerritJeremy Stanley proposed openstack/project-config master: Revert "Temporarily disable rax-dfw for mirror reboot"
fungii'll wip the revert for now16:25
openstackgerritNate Johnston proposed openstack/project-config master: Allow copyAnyScore in gerrit ACLs
openstackgerritNate Johnston proposed openstack/project-config master: Make the Backport-Candidate field in Octavia reviews persist
fungiwhile we wait for a safe mirror reboot, i'll work on etherpad, and then maybe gerrit after that since hopefully activity level will be dropping headed into the weekend so if there is any (unlikely) disruption from the pvmove there it won't be too painful16:41
*** hashar has quit IRC17:11
openstackgerritMerged openstack/project-config master: Temporarily disable rax-dfw for mirror reboot
fungietherpad pvmove is in progress under a root screen session now17:13
fungishouldn't take long, it's a 50gb ssd17:14
funginot like the afs servers where we have 4tb attached17:14
fungialready 10% complete17:14
fungi#status log cinder volume for etherpad01 has been replaced and cleaned up17:28
openstackstatusfungi: finished logging17:28
fungiclarkb: if we're worried about i/o performance, should we give review.o.o an ssd volume instead of sata?17:34
fungieasy enough to do while i'm replacing anyway17:34
fungiit might help with the upgrade17:34
fungilooks like the rax-dfw max-servers was zeroed at 17:25z, so in use counts there are dwindling18:07
fungionce they bottom out i'll reboot it and then may as well do its pvmove before bringing it back into service18:08
fungidemand's not that high now anyway18:08
fungiand after that maybe we'll have an idea of whether we want to make changes to the volume for review.o.o18:08
*** pushparajkvp has quit IRC18:11
*** andrewbonney has quit IRC18:22
clarkbfungi: I thought it was an ssd volume already18:42
clarkbbut yes I think we want an ssd volume for the upgrade process18:43
fungiit's a 200gb sata volume right now. i could make it a... 256gb? ssd18:43
fungiit's around half-used currently, but i figure we've got db content moving into it too18:44
fungiwhen the notedb migration happens that is18:44
clarkbya one of the things we'll need to sort out is how much extra disk we need18:44
clarkbone reason the current volume is so full is we've got a bit of old stuff in /home/gerrit218:44
clarkbI cleaned up some of that recently though18:44
clarkb~250gb seems fine and if we need more we can always attach another18:45
fungior migrate to a larger volume, yep18:46
fungidown to 7 nodes in-use for rax-dfw18:47
fungithough we have a bunch of nodes there which look to be stuck in a "deleting" state according to grafana/graphite/nodepool18:48
fungithe cinder volume for review.o.o is undergoing pvmove to a 256gb ssd in a root screen session18:50
fungihopefully should be done within the hour18:50
fungiit's already 3% complete18:50
clarkbit looks like the cinder volume on isn't lvm'd18:51
fungiby then we ought to be clear to do the mirror18:51
clarkbI think I'll lvm the new server and if that's wrong because arm64 we'll sort that out18:51
fungiyeah, having the lvm layer in place allows us to do stuff like this ;)18:51
*** zbr0 has joined #opendev18:52
fungionce the pvmove on review.o.o is done and cleaned up, we can either do the logical volume and filesystem resize, or just wait and remember we want to do that during the next gerrit maintenance18:53
fungibut risk for online resize is low in my opinion18:53
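As a rough sketch of the kind of volume replacement described here, the following builds the usual LVM command sequence for moving a volume group onto a new physical volume and then growing a logical volume and its filesystem. The volume group, logical volume, and device names are hypothetical, and this is not the exact procedure that was run on these servers.

```python
import shlex


def pvmove_plan(vg, old_pv, new_pv, lv=None):
    """Return the LVM commands to migrate vg from old_pv to new_pv.

    If lv is given, a final step grows that logical volume and its
    filesystem into whatever free space the new device adds.
    """
    cmds = [
        ["pvcreate", new_pv],            # initialise the new device for LVM
        ["vgextend", vg, new_pv],        # add it to the volume group
        ["pvmove", old_pv, new_pv],      # move extents online, old -> new
        ["vgreduce", vg, old_pv],        # drop the old device from the group
        ["pvremove", old_pv],            # wipe the LVM label before detaching
    ]
    if lv is not None:
        # -r asks lvextend to resize the filesystem along with the volume.
        cmds.append(["lvextend", "-r", "-l", "+100%FREE", f"/dev/{vg}/{lv}"])
    return cmds


# Hypothetical names, for illustration only.
for cmd in pvmove_plan("main", "/dev/xvdb1", "/dev/xvdc1", lv="gerrit"):
    print(shlex.join(cmd))
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere; the long-running pvmove step is the one worth putting in a screen session.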
*** zbr has quit IRC18:53
*** zbr0 is now known as zbr18:53
*** priteau has quit IRC19:00
clarkblaunch node is running (I had to specify a network to make it work but otherwise smooth sailing so far)19:02
clarkbhrm ping6 to wiki failed19:04
clarkbI can do --ignore_ipv6 which maybe is necessary in this cloud (it is for ovh)19:04
clarkbI'll see where that gets me I guess19:05
fungiaha, yeah ping6 in ovh will likely fail out of the gate19:05
fungiwe should see if we can think up a round-trip mechanism to push a working v6 config to the instances there19:05
clarkbwe can configure it statically with launch node there19:06
fungiought to be able to query the nova api and then generate something to scp onto it19:06
clarkb(still not sure what is up with linaro-us ipv6 but we can always scrap this server and start over if necessary. Figure some progress is better than none)19:06
fungireview pvmove is 60% complete19:09
fungi0 nodes in use for rax-dfw. i'll reboot it now19:11
clarkbnow I seem to have a conflict between make swap script and mount volume script19:12
fungi#status log rebooted mirror01.dfw.rax to resolve a page allocation failure during volume attach19:13
openstackstatusfungi: finished logging19:13
clarkbthe issue is that /dev/vdb is the device and assumes it owns that, then tries to use the same device and fails19:14
clarkbInstead of using launch_node to mount and configure the volume I'll do that after I've booted the instance and launch node is happy19:14
clarkbI need to pop out for a few minutes and finish lunch I'll try again without automated volume mounting after19:17
fungithe old sata volume for review.o.o went into error_deleting too, but it's migrated19:31
fungi#status log cinder volume for review.o.o has been replaced, upgraded from 200gb sata to 256gb ssd, and cleaned up19:32
openstackstatusfungi: finished logging19:32
fungii went ahead and resized the logical volume and fs to fill it19:35
fungiactivity seems pretty low at the moment19:35
fungiinfra-root: ^ just in case anyone spots anything amiss19:36
fungithe /home/gerrit2 fs is now 36% used19:36
fungiso we've got lots of headroom19:36
fungipvmove for the mirror01.dfw.rax volume is underway in a root screen session on the server19:40
clarkbthank you for keeping us up to date19:42
clarkbI've just started a new launch without volume automation to get around make swap and mount volume fighting19:42
clarkbserver booted successfully this time. And it got its ipv6 address. It appears that the RAs may have a delay which is why it failed before20:00
clarkbwe may want launch node to have some reasonable timeout for global ipv6 addr to show up via RAs but will worry about that once I've got this all sorted20:01
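A minimal sketch of the timeout idea, assuming a caller-supplied `get_addrs` function that returns the addresses currently configured on the instance. The function names, defaults, and message text are hypothetical; this is not the launch node change itself.

```python
import ipaddress
import time


def has_global_ipv6(addrs):
    """True if any address in addrs is a globally routable IPv6 address."""
    for addr in addrs:
        # Drop any zone id ("%eth0") or prefix length ("/64") first.
        ip = ipaddress.ip_address(addr.split("%")[0].split("/")[0])
        if ip.version == 6 and ip.is_global:
            return True
    return False


def wait_for_global_ipv6(get_addrs, timeout=60.0, interval=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_addrs() until a global IPv6 address shows up or we time out."""
    deadline = clock() + timeout
    while True:
        if has_global_ipv6(get_addrs()):
            return True
        if clock() >= deadline:
            return False
        # Say why we are pausing so the delay is not mistaken for a hang.
        print("waiting for a global IPv6 address from router advertisements")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is only there to make the loop easy to exercise without real delays.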
clarkbfungi: re uuids and devfs its the /dev/disk/by-id path but its not strictly 1:120:04
fungiyeah, i think that makes sense (so long as we output a message saying that's the delay)20:04
clarkbyou get a entry there that is virtio-truncateduuid20:04
fungioh, fun20:04
clarkbthis is on kvm though20:04
fungiyep, in rackspace (maybe because of xen?) we get no uuid for raw disk, only for partitions20:04
fungiat least according to the kernel20:05
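For the kvm case, a small sketch of how a cinder volume UUID could be matched to its /dev/disk/by-id entry. It assumes the virtio serial is limited to 20 characters, so the entry name is `virtio-` plus a truncated UUID; the helper names and sample values are hypothetical.

```python
VIRTIO_PREFIX = "virtio-"


def virtio_by_id_path(volume_uuid, serial_len=20):
    """Expected by-id path, assuming the serial is the truncated UUID."""
    return "/dev/disk/by-id/" + VIRTIO_PREFIX + volume_uuid[:serial_len]


def match_volume(entries, volume_uuid):
    """Return the by-id entry names whose serial is a prefix of the UUID.

    Partition links (for example "...-part1") do not match, and more
    than one entry can match if two UUIDs share the truncated prefix,
    which is why the mapping is not strictly one to one.
    """
    matches = []
    for name in entries:
        if not name.startswith(VIRTIO_PREFIX):
            continue
        serial = name[len(VIRTIO_PREFIX):]
        if serial and volume_uuid.startswith(serial):
            matches.append(name)
    return matches


# Hypothetical UUID and directory listing.
uuid = "3f2b7c1e-9a4d-4c6b-8e21-5d7a9b0c1e2f"
listing = ["virtio-3f2b7c1e-9a4d-4c6b-8", "virtio-3f2b7c1e-9a4d-4c6b-8-part1"]
print(virtio_by_id_path(uuid))
print(match_volume(listing, uuid))
```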
fungipvmove on the rax-dfw mirror is about half done20:05
clarkbok I think the server is all good now with a volume attached and part of lvm and fstab. Working on the changes needed to turn it into a builder now20:08
openstackgerritClark Boylan proposed opendev/ master: Add to DNS
openstackgerritClark Boylan proposed opendev/system-config master: Remove nodepool builder puppetry and
openstackgerritClark Boylan proposed opendev/system-config master: Add
clarkbthere is a whole depends on chain in there to ensure we don't configure the nb03 with the wrong config20:23
clarkbif we do that I think it may delete images in all the clouds (because of our default config?)20:23
clarkband I'll WIP the last one in the stack which is the removal of the puppetry20:24
fungithat got ugly last time20:24
clarkb(we should only do that once we are happy with the new server)20:24
clarkbIt is a holiday on monday20:49
clarkbchances are I'll end up being around but maybe not as much as a typical monday20:49
clarkbinfra-root is a useful launch node change from ianw to add sshfp info to the output of the script21:05
clarkbI had to manually figure that out (based on what the script does)21:05
fungiyeah, i think christine doesn't expect me to be on the computer much monday, but i'll still try to be around for emergencies21:24
openstackgerritClark Boylan proposed opendev/system-config master: Wait for ipv6 addrs when launching nodes
clarkbthat change feels hacky but should do the job21:27
fungi#status log cinder volume for mirror01.dfw.rax.o.o has been replaced and cleaned up21:32
openstackstatusfungi: finished logging21:32
fungiand i've approved the revert of its max-servers zeroing21:33
fungiafter quickly browsing around it and making sure it's not obviously broken21:33
openstackgerritMerged openstack/project-config master: Revert "Temporarily disable rax-dfw for mirror reboot"
*** mlavalle has quit IRC22:59
*** tosky has quit IRC23:07
*** DSpider has quit IRC23:41
