Nick | Message | Time |
---|---|---|
openstackgerrit | Ian Wienand proposed opendev/system-config master: Make haproxy role more generic https://review.opendev.org/677903 | 00:36 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: borg-backup: use unique mark in .ssh/config https://review.opendev.org/758879 | 00:43 |
kevinz | ianw: just FYI, I've been notified by the colo facility that Linaro-US is under PDU maintenance, and the servers will be restarted. The affected time will be narrowed to about 2 hours | 00:45 |
ianw | kevinz: ok, thanks | 00:47 |
ianw | kevinz: we've still had a few instances of the servers in the control plane shutting down unexpectedly | 00:47 |
kevinz | let me check | 00:51 |
ianw | maybe if you can scroll back through the logs for nb03 and see if anything jumps out? if you give me timestamps i might be able to correlate | 00:52 |
kevinz | ianw: yes I saw the log, looks like all the instances have been scheduled to 1 host, and all instances fail to launch | 01:04 |
kevinz | I will change the scheduler rules to see if it becomes better | 01:04 |
kevinz | several nodes got no instances, while several nodes are running a lot of instances | 01:05 |
clarkb | looks like openstackid02 and 03 still aren't in cacti for some reason. It's way late for me to debug that further (previous issue was the LE stuff iirc) | 01:50 |
*** mlavalle has quit IRC | 01:50 | |
clarkb | ianw: ^ if that is something you have time for that would be great otherwise I'll try to look tomorrow if fires are quiet :) | 01:50 |
*** mlavalle has joined #opendev | 01:51 | |
ianw | clarkb: ok, i don't think ansible got re-enabled yet? | 01:57 |
clarkb | I removed the DISABLE-ANSIBLE file a while back | 01:57 |
ianw | ahh, ok, will poke then | 01:57 |
clarkb | at least I thought I did, worth double checking | 01:58 |
ianw | yeah it's gone | 02:00 |
ianw | there's a bunch of stuff stuck on logstash-worker12.openstack.org to start | 02:01 |
ianw | looks like our standard stuck for undetermined reasons; rebooting it | 02:03 |
clarkb | fungi ianw this look good ? #status notice We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev | 03:22 |
ianw | lgtm | 03:22 |
clarkb | #status notice We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev | 03:23 |
openstackstatus | clarkb: sending notice | 03:23 |
-openstackstatus- NOTICE: We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev | 03:23 | |
openstackstatus | clarkb: finished sending notice | 03:26 |
*** lamt has joined #opendev | 03:32 | |
*** aprice has joined #opendev | 03:38 | |
*** fressi has joined #opendev | 04:21 | |
clarkb | #status notice We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. | 04:28 |
openstackstatus | clarkb: sending notice | 04:28 |
-openstackstatus- NOTICE: We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. | 04:28 | |
openstackstatus | clarkb: finished sending notice | 04:31 |
*** amotoki has quit IRC | 04:35 | |
*** amotoki has joined #opendev | 04:37 | |
*** lpetrut has joined #opendev | 04:51 | |
*** lpetrut has quit IRC | 04:51 | |
*** fressi has quit IRC | 05:35 | |
*** brinzhang has joined #opendev | 05:49 | |
*** mkalcok has joined #opendev | 06:02 | |
*** gnuoy has joined #opendev | 06:03 | |
*** whoami-rajat__ has joined #opendev | 06:08 | |
whoami-rajat__ | hi #opendev team, is there an estimate as to when review.opendev.org will be active again? | 06:08 |
*** ralonsoh has joined #opendev | 06:12 | |
*** eolivare has joined #opendev | 06:26 | |
*** user_19173783170 has joined #opendev | 06:27 | |
*** sshnaidm|afk is now known as sshnaidm | 06:44 | |
*** chandankumar has joined #opendev | 06:47 | |
*** qchris has quit IRC | 07:02 | |
*** rpittau|afk is now known as rpittau | 07:04 | |
yoctozepto | morning infra! how bad is it? | 07:04 |
*** qchris has joined #opendev | 07:06 | |
*** sboyron has joined #opendev | 07:08 | |
ttx | yoctozepto: just got up, but last I heard they were comparing backups to see if anything was compromised | 07:10 |
*** andrewbonney has joined #opendev | 07:10 | |
*** slaweq has joined #opendev | 07:12 | |
ianw | that's more or less it. there's no catastrophe but we want to make sure things are good before turning anything back on | 07:12 |
yoctozepto | ttx, ianw: thanks! understood | 07:13 |
Tengu | does gerrit supports cryptographic signature for the commits? That might help preventing issues.. | 07:22 |
Tengu | *support | 07:23 |
*** tosky has joined #opendev | 07:43 | |
*** tobias-urdin has joined #opendev | 07:45 | |
*** mgoddard has joined #opendev | 07:50 | |
*** yoshito-ito has joined #opendev | 07:51 | |
*** sean-k-mooney has joined #opendev | 07:51 | |
*** ricolin has joined #opendev | 07:52 | |
*** jlibosva has joined #opendev | 07:56 | |
*** fressi has joined #opendev | 08:00 | |
whoami-rajat__ | can anyone provide an ETA for the same? | 08:11 |
*** lajoskatona has joined #opendev | 08:21 | |
*** mhu has joined #opendev | 08:23 | |
AJaeger | whoami-rajat__: Not yet, please be patient. | 08:24 |
*** hashar has joined #opendev | 08:24 | |
*** yoshito-ito has quit IRC | 08:28 | |
*** insected has joined #opendev | 08:32 | |
-openstackstatus- NOTICE: We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. | 08:34 | |
*** ChanServ changes topic to "We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able." | 08:34 | |
mhu | Morning, is the gerrit vulnerability documented somewhere? We're running a few gerrits too and we would like to audit our instances if the problem is severe | 08:37 |
ttx | mhu: it's unclear that the vulnerability is in gerrit at this point | 08:40 |
ttx | (or what the vulnerability would be) | 08:40 |
*** ysandeep|away is now known as ysandeep | 08:45 | |
*** priteau has joined #opendev | 08:45 | |
*** insected has quit IRC | 08:51 | |
*** jcapitao has joined #opendev | 09:11 | |
*** ysandeep is now known as ysandeep|lunch | 09:12 | |
*** Eighth_Doctor has quit IRC | 09:24 | |
*** mordred has quit IRC | 09:24 | |
*** ysandeep|lunch is now known as ysandeep | 09:34 | |
*** mordred has joined #opendev | 09:34 | |
*** sshnaidm is now known as sshnaidm|afk | 09:48 | |
*** Eighth_Doctor has joined #opendev | 09:58 | |
*** user_19173783170 has quit IRC | 09:59 | |
*** fressi has quit IRC | 10:07 | |
*** fressi has joined #opendev | 10:08 | |
*** elod is now known as elod_afk | 10:20 | |
*** DSpider has joined #opendev | 10:57 | |
AJaeger | #status alert Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours. | 11:02 |
openstackstatus | AJaeger: sending alert | 11:02 |
-openstackstatus- NOTICE: Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours. | 11:03 | |
*** ChanServ changes topic to "Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours." | 11:03 | |
openstackstatus | AJaeger: finished sending alert | 11:08 |
mhu | Thanks for the update AJaeger - looking forward to a full report if possible, this was intriguing to say the least | 11:11 |
*** insected has joined #opendev | 11:13 | |
*** jcapitao is now known as jcapitao_lunch | 11:15 | |
*** eolivare has quit IRC | 11:17 | |
*** gnuoy has quit IRC | 11:23 | |
*** gnuoy has joined #opendev | 11:24 | |
*** lyarwood has joined #opendev | 11:34 | |
mnaser | Do we have any ETA? | 12:00 |
AJaeger | mnaser: see recent alert "approx. 2hours", so in another hour from now on. But that is a promise for an update... | 12:04 |
mnaser | Oh, right. I missed that. Thanks AJaeger | 12:05 |
*** insected has quit IRC | 12:12 | |
*** elod_afk is now known as elod | 12:13 | |
*** weshay|ruck has joined #opendev | 12:14 | |
*** sshnaidm|afk is now known as sshnaidm | 12:20 | |
*** fnordahl has joined #opendev | 12:23 | |
*** eolivare has joined #opendev | 12:26 | |
*** jcapitao_lunch is now known as jcapitao | 12:28 | |
fungi | does status alert when there's already an alert in progress just update the message? | 13:27 |
clarkb | I'm not sure | 13:27 |
fungi | i'm also worried about leaving this in alert for any extended period because if there's a netsplit we'll end up having to manually fix >100 channel topics | 13:28 |
fungi | that's why we were initially using status notice | 13:28 |
AJaeger | fungi: AFAIK it does | 13:29 |
AJaeger | fungi: hope I didn't create chaos already ;( | 13:29 |
fungi | okay, i'll cross my fingers | 13:30 |
fungi | #status alert We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours. | 13:30 |
openstackstatus | fungi: sending alert | 13:30 |
-openstackstatus- NOTICE: We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours. | 13:31 | |
*** ChanServ changes topic to "We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours." | 13:31 | |
openstackstatus | fungi: finished sending alert | 13:37 |
*** mattd01 has joined #opendev | 13:47 | |
*** sshnaidm is now known as sshnaidm|afk | 13:52 | |
sean-k-mooney | so currently we use ubuntu one for gerrit login | 14:11 |
sean-k-mooney | someone downstream made a comment about two factor auth | 14:11 |
sean-k-mooney | which that does not support | 14:11 |
clarkb | it actually does but its convoluted | 14:11 |
*** sshnaidm|afk is now known as sshnaidm | 14:12 | |
sean-k-mooney | im wondering how feasible it would be to swap to the openinfraid system and maybe enable twofactor | 14:12 |
sean-k-mooney | oh really i dont see it in the ui | 14:12 |
clarkb | ya its not in the ui you have to join an lp group or something | 14:12 |
clarkb | if you google it you'll get the page with details | 14:12 |
*** clayg has joined #opendev | 14:12 | |
sean-k-mooney | https://help.ubuntu.com/community/SSO/FAQs/2FA | 14:12 |
clarkb | we also have a spec on improving the authentication system | 14:12 |
sean-k-mooney | you join launchpad.net/~sso-2f-testers | 14:13 |
sean-k-mooney | someday we will be able to log into everything using ssh key pairs i hope | 14:14 |
*** slittle1 has quit IRC | 14:45 | |
*** slittle1 has joined #opendev | 14:46 | |
*** hjensas has joined #opendev | 15:20 | |
mnaser | sean-k-mooney: yeah i use 2fa for ubuntu one | 15:23 |
mnaser | it's tricky to set up but it works ok for me | 15:23 |
sean-k-mooney | via that faq | 15:24 |
mnaser | i dont even think you need to be in the sso-2f-testers group (anymore) | 15:24 |
sean-k-mooney | e.g. adding yourself to the group | 15:24 |
sean-k-mooney | oh ok | 15:24 |
mnaser | or maybe you do | 15:24 |
mnaser | it looks like i am in it i guess | 15:24 |
sean-k-mooney | ya dont remove yourself in case it breaks | 15:25 |
sean-k-mooney | i was just wondering if we could swap in the openstack/openinfra openid/oauth2 provider as an alternative login with gerrit | 15:26 |
sean-k-mooney | assuming that could support 2 factor | 15:26 |
sean-k-mooney | i know gerrit can be configured to use multiple login providers but we dont | 15:27 |
mnaser | there's a spec up to work on this sean-k-mooney -- i'd link but i cant :) | 15:27 |
sean-k-mooney | oh ok cool | 15:27 |
sean-k-mooney | no worries i can find it myself some other time | 15:27 |
sean-k-mooney | when things are less on fire | 15:28 |
clarkb | #status alert Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so. | 15:38 |
openstackstatus | clarkb: sending alert | 15:38 |
-openstackstatus- NOTICE: Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so. | 15:38 | |
*** ChanServ changes topic to "Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so." | 15:39 | |
openstackstatus | clarkb: finished sending alert | 15:44 |
*** fressi has quit IRC | 15:50 | |
*** jlibosva has quit IRC | 15:58 | |
clayg | thanks for the updates! keep up the good work. We appreciate y'all! | 16:00 |
*** mlavalle has quit IRC | 16:00 | |
*** hamalq has joined #opendev | 16:00 | |
*** hamalq has quit IRC | 16:03 | |
*** hamalq has joined #opendev | 16:04 | |
*** hamalq has quit IRC | 16:05 | |
*** hamalq has joined #opendev | 16:05 | |
*** mlavalle has joined #opendev | 16:06 | |
*** jcapitao has quit IRC | 16:16 | |
*** slittle1 has quit IRC | 16:21 | |
*** lajoskatona has quit IRC | 16:32 | |
*** rpittau is now known as rpittau|afk | 16:38 | |
*** mattd01 has quit IRC | 16:52 | |
*** hashar has quit IRC | 17:06 | |
*** eolivare has quit IRC | 17:11 | |
*** insected has joined #opendev | 17:39 | |
*** insected has quit IRC | 17:40 | |
*** mattd01 has joined #opendev | 17:44 | |
clarkb | frickler: fungi corvus how about #status ok Not actually ok, but resetting topics in order to reduce IRC spam. | 17:52 |
clarkb | then #status notice please refer to https://review.opendev.org/maintenance.html or #opendev for further updates. We're trying to cut back on IRC spam. | 17:52 |
fungi | s/resetting/restoring/ maybe | 17:53 |
clarkb | ++ | 17:53 |
corvus | clarkb: why not one more update to the alert with the link then leave that in place? | 17:53 |
corvus | (ie, just stop spamming, but leave the alert until it's really back up?) | 17:53 |
clarkb | corvus: there is concern that a netsplit will leave us having to manually reset 100 or so channel topics when we are done | 17:53 |
clarkb | fungi: frickler ^ I believe that was coming from you | 17:53 |
corvus | if that happens, we can semi-automate that since they're logged | 17:54 |
clarkb | ok in that case #status alert We'll stop sending alerts every couple hours. Instead please refer to https://review.opendev.org/maintenance.html or #opendev for the latest. | 17:55 |
clarkb | how does that look? | 17:56 |
corvus | i'd do: status alert "Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates." | 17:56 |
clarkb | ++ thats more informative | 17:56 |
fungi | i raised that concern, but mostly out of already being overwhelmed with things we need to fix and not wanting to add everyone's irc channel topics to the growing pile | 17:56 |
clarkb | #status alert Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates. | 17:58 |
openstackstatus | clarkb: sending alert | 17:58 |
-openstackstatus- NOTICE: Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates. | 17:59 | |
*** ChanServ changes topic to "Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates." | 17:59 | |
openstackstatus | clarkb: finished sending alert | 18:04 |
*** Tengu has quit IRC | 18:06 | |
*** Tengu has joined #opendev | 18:08 | |
*** qizhangapp has joined #opendev | 18:09 | |
*** andrewbonney has quit IRC | 18:09 | |
*** Tengu has quit IRC | 18:21 | |
*** hashar has joined #opendev | 18:24 | |
*** qizhangapp has quit IRC | 18:27 | |
*** Tengu has joined #opendev | 18:27 | |
dmsimard | hug ops <3 | 18:29 |
*** portdirect has joined #opendev | 18:31 | |
clayg | hugs ops <3 | 18:35 |
clayg | the plan on the maintenance.html is extensive - it sounds reasonable - BUT an audit of every change in every repo across distributed teams going back to 10/6 will take time - we need to download and review WIP ASAP (or temporarily move work elsewhere) | 18:37 |
clayg | can you provide any more estimates - i'm sure folks who have been at this will eventually need a break! | 18:38 |
clarkb | we've got work split up right now so we're hoping that we can make good progress but I'm wary of putting an eta on it | 18:40 |
clarkb | since we don't know what we'll find yet | 18:40 |
mhu | good luck folks, sounds like going through 2 weeks' worth of activity on opendev's gerrit won't be fun | 18:43 |
avass | clarkb: good luck! I hope everything goes well | 18:44 |
yoctozepto | hey infra; keeping fingers crossed for you | 18:47 |
yoctozepto | meanwhile having a question | 18:47 |
yoctozepto | is it possible to fetch proposed change contents if I know its id but gerrit is down? | 18:48 |
clarkb | yes, they are in gitea | 18:48 |
clarkb | refs/changes/XY/ABCXY/ | 18:48 |
yoctozepto | as in two digits? | 18:48 |
yoctozepto | thanks, I'll try | 18:48 |
clarkb | ya they shard by the last two digits in the change number | 18:48 |
clarkb | then follow it with the full change number | 18:49 |
yoctozepto | and can also include patchset number? | 18:49 |
clarkb | oh ya the patchset comes next in the path iirc | 18:49 |
clarkb | so xy/abcxy/1 | 18:49 |
yoctozepto | trying then | 18:49 |
yoctozepto | yup, it forces me to guess the patchset number | 18:51 |
yoctozepto | but works overall so better than nothing ;D | 18:54 |
yoctozepto | thanks clarkb | 18:54 |
yoctozepto | I wonder what kind of suspicious activity it was that you discovered | 18:55 |
*** tbogue has joined #opendev | 19:04 | |
*** priteau has quit IRC | 19:16 | |
*** mattd01 has quit IRC | 19:21 | |
*** tosky has quit IRC | 19:54 | |
*** xavpaice has joined #opendev | 20:16 | |
*** timburke has joined #opendev | 20:23 | |
*** sboyron has quit IRC | 21:04 | |
*** slaweq has quit IRC | 21:12 | |
*** slaweq has joined #opendev | 21:17 | |
*** slaweq has quit IRC | 21:26 | |
*** hashar has quit IRC | 21:45 | |
*** mlavalle has quit IRC | 21:45 | |
*** mlavalle has joined #opendev | 21:47 | |
*** ralonsoh has quit IRC | 22:12 | |
*** qchris has quit IRC | 22:41 | |
*** guilhermesp has joined #opendev | 22:52 | |
*** qchris has joined #opendev | 22:54 | |
*** rchurch has joined #opendev | 22:58 | |
*** mlavalle has quit IRC | 22:59 | |
*** mattmceuen has joined #opendev | 23:08 | |
*** DjeufackZane has joined #opendev | 23:10 | |
*** DjeufackZane has quit IRC | 23:18 | |
*** whoami-rajat__ has quit IRC | 23:56 |
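
The ref layout clarkb describes at 18:48 (shard by the last two digits of the change number, then the full change number, then the patchset) can be sketched as a small helper. This is only an illustration: the function name `change_ref` is made up here, and the zero-padding of single-digit shards follows Gerrit's usual ref naming rather than anything stated in the log.

```python
def change_ref(change_number: int, patchset: int) -> str:
    """Build the refs/changes/XY/ABCXY/N path for one patchset of a change.

    change_ref is a hypothetical helper name, not part of Gerrit or Gitea.
    """
    if change_number < 1 or patchset < 1:
        raise ValueError("change and patchset numbers start at 1")
    # The shard directory is the last two digits of the change number,
    # zero-padded so that e.g. change 677903 lands under "03".
    shard = change_number % 100
    return f"refs/changes/{shard:02d}/{change_number}/{patchset}"


if __name__ == "__main__":
    # Change 758879 is the borg-backup change proposed at 00:43.
    print(change_ref(758879, 1))
```

The resulting ref can be passed to `git fetch <repository URL> <ref>` followed by `git checkout FETCH_HEAD`. The repository URL on the Gitea side is left as a placeholder because the log does not give one, and, as yoctozepto notes, the patchset number has to be known or guessed.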