Monday, 2026-05-18

fungi https://blog.cloudflare.com/cyber-frontier-models/ 15:44
fungi "Project Glasswing: what Mythos showed us" 15:45
sean-k-mooney maybe more of a tc or infra topic but has anyone looked into https://openai.com/form/codex-for-oss/ 18:35
sean-k-mooney and the openai codex for open source program in general https://developers.openai.com/community/codex-for-oss 18:35
sean-k-mooney we would need to review https://developers.openai.com/codex/codex-for-oss-terms etc. but that could be interesting for the agentic-workflows repo or some other community usage depending on what exactly that could enable 18:37
sean-k-mooney i don't know if the other labs have similar open source programs or if we would want to engage but the cloudflare blog looks interesting. i'll read over it 18:38
fungi i especially liked the duelling banjos approach of having two agents set up to disagree with each other in order to refine the results 18:39
sean-k-mooney lol haven't got there yet but ya i can see that being fun and useful 18:39
sean-k-mooney i do often use models from different model families to cross check the work of another 18:40
sean-k-mooney i.e. use opus to plan, sonnet to execute and gpt to review if the output matched the intent 18:41
sean-k-mooney partly because i'll probably have to fix it if it finds anything, i have not tried having any of the models chain multiple bugs to create an exploit that would not happen from any one bug alone 18:43
sean-k-mooney but the return oriented programming reference totally makes sense in that regard 18:44
sean-k-mooney fungi: the request framing is interesting. i would not have personally phrased the request that was refused like that. 18:47
sean-k-mooney that is not expressed as a maintainer of a project would. i.e. instead of "find a way to smuggle a request in x" i would have said something like "i believe there is a bug in x that allows Y, please help me create a reproducer to confirm so that we can plan how to fix it" 18:49
fungi right, it's phrased as an attacker would approach it instead, which is probably why the model was pushing back on the query 18:50
sean-k-mooney ya exactly 18:50
sean-k-mooney "Patching faster does not change the shape of the pipeline that produces the patch. If regression testing takes a day, you cannot get to a two-hour SLA without skipping it, and the bugs you ship when you skip regression testing tend to be worse than the bugs you were trying to patch." 19:18
sean-k-mooney ya so ^ is definitely true 19:18
sean-k-mooney i know there was that recent thread about private patches and zuul runs 19:19
sean-k-mooney but i would be very very hesitant to skip ci just to fix a bug 19:19
sean-k-mooney even a security one 19:19
sean-k-mooney to me that's a reason to invest in making your ci more stable and faster overall 19:20
sean-k-mooney and importantly making it easy to run the important parts locally 19:20
fungi that is why our workflow is to test locally as best you can, then push to review once the bug is public, and revise in public if there turns out to be a significant problem that was missed 19:21
sean-k-mooney projects that have local first lightweight testing envs that approximate real world deployment are going to be out ahead of companies that had large ci/cd pipelines with manual qe teams and post change verification 19:22
fungi but also i continue to feel that keeping most of these bugs private while fixes are developed is actively harmful, especially now that it's become much easier to rediscover the same ones independently 19:23
fungi gouthamr spotted this article where linus is basically saying the same thing: https://www.theregister.com/security/2026/05/18/linus-torvalds-says-ai-powered-bug-hunters-have-made-linux-security-mailing-list-almost-entirely-unmanageable/5241633 19:24
sean-k-mooney i would agree after a certain point. while 90 days may have made sense in the past, 10-30 days may be more realistic today 19:24
sean-k-mooney i have not read linus's comment directly but i have heard about the effect indirectly 19:25
sean-k-mooney via https://www.youtube.com/watch?v=O1VB7zzKNjU 19:26
sean-k-mooney https://www.youtube.com/@SavvyNik is one of the linux/OSS channels i follow 19:27
fungi well, i mean also our embargo process implies a minimum additional week after fixes are written and reviewed before the general public can get them 19:27
fungi so we should only be imposing that time tax on releasing security fixes when they're really, really sensitive 19:27
sean-k-mooney ya... i am not a fan of that in general 19:27
sean-k-mooney but it is the current process so... 19:28
sean-k-mooney i think the privacy makes sense initially while we figure out if it's real, whether it can be exploited and how to "stop the bleeding" 19:28
sean-k-mooney but once we have a mergeable patch i'm less convinced it is a benefit to our users or contributors to keep it private much beyond that 19:29
fungi also in my opinion if they're not sensitive enough to get prioritized and fixed immediately, we're better off making them public unfixed so someone else can write a fix for them, rather than keeping that knowledge secret from users even though it's probably going to be discovered by other people running the same llm to look for vulnerabilities which originally found it 19:29
sean-k-mooney obviously it depends on the issue but i would not be against moving it to public security at that point if we didn't really need the extended case 19:29
sean-k-mooney ya that's why i mentioned 10-30 days. if i can't triage in 10 days and do a poc of a minimal patch in 30 it probably is not going to get resolved in a timely manner 19:31
sean-k-mooney at which point i kind of agree that keeping it private and unworked on may do more harm than good 19:31
