Wednesday, 2025-11-05

opendevreviewMerged openstack/governance master: Propose jlarriba to Telemetry PTL  https://review.opendev.org/c/openstack/governance/+/964516 06:55
slaweqgouthamr: gmaan I was using script https://github.com/slawqo/tools/blob/master/user_survey_analyzer.py to have a bit of help with analyzing data. IIRC I had to simply copy just TC related questions and answers from the overall survey results and then save them in the csv file which this script could understand08:33
slaweqbut it was a really long time ago when I last did it so I may be wrong there08:33
slaweqI just checked and it still works with a csv file08:37
slaweqyou can simply export results from https://docs.google.com/spreadsheets/d/1bte_k0R3_rVzFF9ND5zTZV0j9liXI1PxPNo-XmFnFPs/edit?gid=1181187970#gid=1181187970 to csv and run this script08:37
slaweqjust install prettytable module first as it is needed by that script08:38
slaweqI hope it will help you a bit08:38
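[editor's note] The workflow slaweq describes (export the survey answers to CSV, run a small script to tally them) can be sketched roughly like this. The column names and CSV layout below are invented for illustration only; the real user_survey_analyzer.py expects the survey's own export format and renders its results with the third-party prettytable module.

```python
# Rough sketch of the CSV-tallying idea behind user_survey_analyzer.py.
# The question/column names here are hypothetical; the actual script has
# its own expected layout and prints tables via prettytable.
import csv
import io
from collections import Counter

def tally_answers(csv_text, question):
    """Count how often each answer to `question` appears in the exported CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[question] for row in reader if row.get(question))

sample = (
    "respondent,How do you rate the TC?\n"
    "1,Good\n"
    "2,Neutral\n"
    "3,Good\n"
)
for answer, count in tally_answers(sample, "How do you rate the TC?").most_common():
    print(f"{answer}: {count}")
```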
gmaanslaweq: ++16:03
gmaanmaybe it would be good to add it to the governance repo. surely it will help in the future too. slaweq gouthamr what do you say?16:03
slaweqI think I asked about it a long time ago and then we decided that we don't need it there16:04
slaweqbut if you think it could be there, feel free to take it, make it a more "production ready" tool and propose it - I'm fine with this16:04
slaweqfor now it is just a simple script which helped me do that analysis, nothing fancy really :)16:05
gmaanok, we have similar scripts (project health etc) in this directory and adding this one makes sense to me https://github.com/openstack/governance/tree/master/tools16:08
cardoeSo I didn't have a chance to bring it up during the TC session yesterday but some feedback I got during the PTG was around our usage of AI. Teams were looking to formalize a bit more clarity around what the contributor had the AI do. e.g. write the tests, write the code, write the docs.17:46
cardoeThere's still some hesitation from projects to accept code that's written by AI due to the "code written by AI cannot compete against this AI" clause in the license that someone mentioned during one of the sessions (I forget who but if you see this maybe you can speak up and elaborate more).17:47
sean-k-mooneycardoe: personally i don't want to have to go into a lot of detail about what ai did or did not do in each commit message17:52
sean-k-mooneyi really really don't want it to end up in the code17:53
sean-k-mooneyi.e. as a comment in any way17:53
cardoeI understand. I'm just trying to bring forward something that I heard in 2 different team's PTG sessions about the usage of AI.17:53
sean-k-mooneyyep it's totally valid to discuss17:54
cardoeMy PTG schedule was a total hot mess and I was jumping around a lot. I even entirely missed the OpenStack Helm sessions and I'm a core over there. :/17:54
cardoeSo I don't even remember what teams said it.17:54
cardoeBut I just wanted to make the TC aware that it was a topic of concern among a few projects.17:54
sean-k-mooneysame, it's harder to do this virtually i have found, compared to in person, when you work cross project17:54
sean-k-mooney"this" being attending, and remembering to attend, all the sessions you planned to17:55
cardoeAt one point 3 different sessions in 3 different rooms scheduled a topic I wanted to bring up at the same 30 minute window.17:56
sean-k-mooneydid any of them start or end the topic in that 30 min window :)17:57
sean-k-mooneyi was in sessions from 13:00-18:30 tuesday-friday and i only didn't attend on monday because it was a public holiday. if it was in person it would have been less compressed17:59
sean-k-mooneymore work to get to/from the location but also more time when there.17:59
sean-k-mooneycardoe: back to your topic, most of the ai terms i have seen are phrased as "you can't use the output of the llm to train an llm that competes with the service we provide"18:00
clarkbcardoe: i brought up the claude tos issue.18:01
sean-k-mooneywhich is actually a eula restriction in your contract with them, not a limitation on the actual code generated18:01
sean-k-mooneyso anthropic could ban my account for breach of terms of use18:01
cardoeclarkb: perfect. Glad you're here cause you're more knowledgeable than I am.18:01
sean-k-mooneybut it should not impact my code contribution18:01
cardoesean-k-mooney: okay. well that was different than what clarkb's interpretation was. So if the foundation's lawyers have an agreed-upon interpretation I think that would be good. TheJulia we've had them weigh in before, yes?18:03
sean-k-mooneywell i'm obviously not a lawyer so i'll defer to them18:03
cardoeEither way. I'm not asking the TC or projects to take any actions. I more wanted to bring awareness to teams feeling ambiguity and an interest in improved guidance.18:03
TheJuliaI'm on a call at the moment which requires my focus, give me a little bit18:03
sean-k-mooneybut that is the interpretation i have gathered18:03
clarkbcardoe: to be fair my interpretation is "its ambiguous to me" not that there is a clear line either way18:03
clarkbthe existing foundation policy basically says you need to use models that are compatible with open source software development. I don't know if a model that limits field of endeavor for its users is compatible18:05
clarkbmy understanding is that claude limits you to both legal activities and no competition with their service18:05
clarkbbut whether or not that only applies to the user account or the produced code that ends up in a code base is ambiguous to me18:05
clarkband I'm not a lawyer18:05
JayFI'll note I have a friend at Anthropic who I passed on the gist of these concerns to and he raised it internally. I don't expect any official response, but I suspect their intention is not to tamp down OSS use. This is not a comment on if I agree or disagree the clause is problematic legally :)18:07
clarkbI think if I received code from a human that said "you may not use this to compete with me or perform illegal actions" then I could not in good faith sign the dco when submitting that code18:08
sean-k-mooneycardoe: clarkb you're referring to section 3.2 of https://www.anthropic.com/legal/consumer-terms correct?18:08
TheJuliaokay, off that call18:08
TheJuliabeginning to read/digest18:08
sean-k-mooney"""To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models or resell the Services"""18:09
clarkbsean-k-mooney: "3. Use of our Services" has a giant list of restrictions18:09
sean-k-mooneyyep18:09
sean-k-mooneyi didn't think any of them applied to how i use it for contributions18:09
sean-k-mooneybut obviously there are limits as part of the usage agreement18:10
clarkbthe way I model it in my own head is similar to vendoring code from one code base to another. I can do that if the source code base is licensed in a manner that is compatible with the destination license. Eg no additional infringement of rights.18:11
sean-k-mooneythat is not how i think of it but i can see that perspective18:11
clarkbbut I'm not a lawyer so have no idea if that is an appropriate way of understanding this18:12
clarkband as mentioned above the other way I think about it is if a human handed me the code and listed those limitations would I accept it to the code base (and again when considered this way no I would not)18:13
fungialso keep in mind that the first versions of the openinfra ai policy, and the lf's policy on which it was partly based, came about before osi began to formalize the osaid, and i believe some of the intent was to encourage use of open source ai once we know what that means, because the terms of use and intellectual property risks are somewhat mitigated18:14
sean-k-mooneythe usage policy does not assert any ownership or licence over the code. the output content generated by the llm is not subject to a software licence itself.18:15
* TheJulia is still reading18:15
clarkbthe third way I look at it (and this is why it's ambiguous to me) is claude and the claude user have an agreement to interact with each other in a certain way. That agreement is not necessarily transitive, so the claude user deciding to interact with an open source project differently may be fine18:16
clarkbbut the DCO is transitive aiui18:16
sean-k-mooneyright, i read it as the terms only govern my interaction with the service they provide and do not transitively apply to the generated code18:17
fungi"Contributions to OpenInfra Foundation projects are distributed under open source software licenses (Apache 2.0 or other OSI approved licenses), so code or content included in a contribution must be compatible with those licenses. The license of a contribution does not need to be exactly the same as the project's license, being compatible means that the contribution's license18:17
fungigrants sufficient rights to allow everything the project's license allows (or allows more), and imposes similar restrictions (or fewer restrictions). Many open source licenses are compatible with other open source licenses, and code or content in the public domain is compatible with all open source licenses. Contributors need to verify they have the right to contribute output18:17
fungifrom AI tools, just like they do for their own original work, work owned by their employer, work copied or modified from another open source project, or work submitted on behalf of a third party." https://openinfra.org/legal/ai-policy18:17
fungithat's the relevant text from the current policy18:17
fungiwhich means you at least do need to know what the ip situation is for the output of the tools you use18:18
fungithe policy doesn't make assertions about what that situation is for various popular tools18:18
fungiand "i assume it'll be fine" doesn't really cut it, i don't think18:19
fungihopefully the tools you use do make their own assertions about the ownership of their output18:19
sean-k-mooneywell there is also https://iclg.com/news/22400-us-court-confirms-ai-generated-art-cannot-be-copyrighted to consider18:20
fungieither in a terms of use agreement or similar18:20
TheJuliaSo, chiming in on ambiguity item 3.2: it can be taken many different ways, but from a plain reading standpoint it is definitely a generalized catch-all which should give folks pause, in so much as using a tool to make a derivative work. You can't control what someone does beyond you and they are not forward bound, really (and also, the general legal ecosystem is still up in the air as sean just noted)18:20
sean-k-mooneythis is slightly different of course, as i'm not asserting there was no human input in an ai-driven contribution18:21
TheJuliaPartly the policy for OIF was created to recognize "we know, no matter what we say, people are going to play with the shiny tools. Let's minimally get them to acknowledge it and track it through the tags on commit messages as well. Thus, should the legal landscape shift away from permissive, then we have a starting point to do the needful unwind/analysis."18:22
fungiyes, i personally don't use llm-type tools to help me write anything, precisely because i am unsure what the actual legal situation is with their output. it's a very, very new (in terms of ip law) situation which the courts are going to need to sort out over the course of years, and i'm okay waiting for them to do that18:22
clarkbclaude says this about outputs: "Subject to your compliance with our Terms, we assign to you all of our right, title, and interest—if any—in Outputs."18:22
sean-k-mooneyto be clear, there has been ai-generated code in openstack since at least bobcat, well before the policy even existed18:23
fungibut yes, i don't have control over what other people do except insofar as i might downvote/block a proposed change in review18:23
clarkbso basically your rights to the outputs are dependent on adherence to the tos18:23
clarkb(if I read that correctly as a layman)18:23
sean-k-mooneyas to me there was technically nothing that required the policy to exist in order to contribute, based on our prior terms18:23
fungialso is revocation of those rights retroactive to prior "infringing" use?18:23
sean-k-mooneyi think until anthropic takes someone to court and tests their terms of use in multiple jurisdictions it's just going to be a maybe?18:24
clarkbright, I'm not saying you shouldn't use ai, or that I think it isn't going to happen anyway. I'm merely trying to explain why reviewers might have pause when asked to accept changes from people generating code with llms whose terms require a legal degree to understand18:24
sean-k-mooneyi generally try to stay pretty clear of their boundaries18:25
clarkbI think that there are some models out there that try to address this more directly. unfortunately various models don't always perform the same so people don't necessarily want to use those models18:25
fungii.e. if claude decides you were in violation of their tos, then are your rights to things you've already contributed to openstack no longer yours, and they need to be removed?18:25
clarkbclaude is popular because it performs well aiui. It also has a complicated tos (in my opinion)18:25
sean-k-mooneywell for one it's not clear that they have any rights to the output either18:26
sean-k-mooneyto be able to revoke them18:26
fungithat's something i personally would pay a lawyer to advise me on, or just avoid entirely if i can't justify paying counsel18:26
clarkbsean-k-mooney: that's a fair point. The tos can assert things they don't necessarily have a right to assert either18:26
sean-k-mooneyagain, my reading is the service terms are a contract between me as the human and them as the service provider, but not applicable to the output of the service18:26
TheJuliasean-k-mooney: my plain reading/interpretation is the same as yours it seems18:27
sean-k-mooney"As between you and Anthropic, and to the extent permitted by applicable law, you retain any right, title, and interest that you have in such Inputs. Subject to your compliance with our Terms, we assign to you all our right, title, and interest (if any) in Outputs."18:27
sean-k-mooneythey also say that explicitly in the rights and responsibilities section18:27
clarkbsean-k-mooney: yup but that implies if you breach the tos that you no longer have rights to the outputs.18:28
clarkband if I want to be super cynical there is an argument there that if you breach the tos in a manner completely unrelated to openstack, they would consider all of your outputs in openstack to be no longer valid18:28
clarkb(but I'm not a lawyer so I have no idea how that maps onto existing legal interpretations)18:28
TheJuliaThere is also a question of at what point the resulting output has been sufficiently changed to become a derivative work18:29
TheJuliaThat is a whole un-litigated area related to AI, but in terms of the resulting output the bar is really kind of low... but of course open to interpretation on a case by case basis18:29
clarkbI think where the current policy fails code reviewers at least is guidelines on which models are considered compatible18:30
TheJuliai.e. a bulk rename doesn't make something a derivative work, really.18:30
clarkbbecause I don't want to waste time (which I've already done with claude) trying to sort that out for myself18:30
sean-k-mooneyclarkb: i'm not sure we should try to answer that question18:30
fungialso wrt the article linked above, it looks like the court's finding in that case would place ai-generated output in the public domain within the usa (not all signatories of the berne and buenos aires conventions have a public domain concept though), so if that's the interpretation then the contributions would need to be declared public domain or at least not result in the18:30
fungicontributor making copyright claims on them18:30
sean-k-mooneyas i don't know that you could really prove any of them are18:30
fungii don't think we can legally answer that question. only courts can18:31
clarkbsean-k-mooney: I don't think we need to prove any of them in court as much as have a general guideline that says we expect x y z are ok and contributions can be accepted from them18:31
clarkbcurrently the policy leaves that up to the reader which is me the code reviewer18:32
sean-k-mooneyclarkb: right, but depending on what judgment you read and if they hold up, the output of an llm has been ruled in the US as transformative18:32
TheJuliaThe thing we explicitly wanted to avoid was going so deep into that level of detail out of the gate. Maybe time has come to visit that, but realistically I'm not sure folks will also have *that* much interest before the new year.18:33
sean-k-mooneyhttps://www.whitecase.com/insight-alert/two-california-district-judges-rule-using-books-train-ai-fair-use18:33
clarkbTheJulia: when the policy was developed I was expecting more of a "trained on open source for open source development" sort of implementation as the real world application of the policy18:33
fungithe conservative stance for an ai contribution policy would be that projects don't accept such contributions because their legal status is not known and there's no better guidance at this time, but obviously that's not going to work because ai is already "too big to fail" in this industry and so we have to be pragmatic18:33
sean-k-mooneythat's from the recent case where anthropic won on the fair use grounds and still lost on the piracy grounds18:33
clarkbTheJulia: the reality today is no one is doing that and everyone is using claude18:33
TheJuliaclarkb: yeah, not many folks had that interest, unfortunately. The landscape also shifted from shiny tool to useful tool as well during that time.18:34
clarkband I think that is the underlying source of the disconnect between the policy work and what we're running into as code reviewers today18:35
sean-k-mooneyto be fair i did try using open weight models and i still do from time to time18:35
fungireminds me of when i was in school and we weren't allowed to use calculators in math classes. or the generation after me being told that internet sources aren't appropriate to use for research papers18:36
clarkbas I said I would not personally accept any code from a human under those terms nor would I sign the DCO under those terms if a human was the source. So I'm not sure why I'd accept it from an llm. sean-k-mooney points out that humans produce copyrighted work. LLMs don't necessarily and that may be one reason18:36
sean-k-mooneybut there is a very very very big delta in many of the capabilities18:36
TheJuliaThe other issue is that it is a constantly shifting ecosystem. Who is to say every tool doesn't change their ToS next week in entirely incompatible or compatible ways as well.18:36
sean-k-mooneyclarkb: oh i know you're talking from your perspective18:36
sean-k-mooneyi just was commenting on the fact that many18:36
sean-k-mooneypurely open-trained models are quite a lot behind the commercial ones today18:37
sean-k-mooneyalthough that is changing18:37
fungiclarkb: to play devil's advocate, usa government employees also don't produce copyrighted work (it's explicitly placed in the public domain), but we still accept their contributions18:37
clarkbsean-k-mooney: yup I get it. There are a lot of factors to consider and it's easy to overlook things, so it's good to try to be thorough before dismissing or accepting something out of hand18:37
TheJuliaAlso, entirely different use cases/purposes as well.18:37
clarkbfungi: but they have special dispensation to do so18:37
clarkbfungi: via the special cla iirc18:37
clarkb(not sure if that went away with dco probably it did?)18:37
fungiwell, did, now they just use the dco18:37
sean-k-mooneyclarkb: on a related note i claim no copyright over the llm output of my code review job https://zuul.teim.app/t/main/build/68ab63e0ffa64ce2b095525587bd9785/log/code-review/review-report.md18:38
sean-k-mooneyclarkb: i got that working using claude code as the client (also opencode) with glm 4.6 hosted by z.ai18:38
sean-k-mooneystep 2 is making it pretty and readable now that it can work18:39
clarkbfungi: but also government employee humans providing things under the public domain are basically saying there are no limitations whatsoever, which is different from an llm service with a complicated tos18:39
clarkbin any case I think this is where the friction arises at least as I see it when trying to review and accept generated contributions18:40
clarkbthe policy punts that work to the code reviewer. I think we had hoped the landscape would be more clear when applying the policy but unfortunately the opposite seems to have happened18:40
fungiyes, ideally the llm operator's tos would make a similar statement, but instead it seems like the user is going to make some personal judgement about it based on their non-lawyer (in most cases) understanding of the inapplicability of some tos clauses18:41
sean-k-mooneyyep, at the end of the day it still relies on a human in the loop to make a judgement call at some level18:45
TheJuliaTo be clear, this problem is wholly worse without labeling of changes' commit messages. Going back to my prior statement, we know some folks are always going to try to take the most expedient path forward; the purpose of the labeling is to highlight the consideration, and then the secondary reality check on the reviewer's side is "is this consistent with what I expect" and "is there anything else I need to be aware of".18:48
TheJuliaThe general plain reading of the ToS at least doesn't feel like a concern to me today, but if Anthropic decided to expand service areas, then that obviously would begin to be very very problematic, and hopefully nobody is explicitly violating terms of service. Similarly, installing some tools which you may barely be conscious of on a host might mean you cannot measure or publish any results. It goes back to the plain18:48
TheJuliareading of the person doing that, because maybe you're doing something else which is entirely unrelated, or loosely related, but still not in violation of the ToS you agreed to with that binary package from a specific hardware vendor18:48
clarkbTheJulia: I don't think the actual service areas matter too much if there is any limitation on field of endeavor fwiw18:49
clarkbat least for open source software dev. Either its ok to have that in the tos and contribute to open source projects or it isn't18:49
TheJuliayeah, and that is the key question18:49
TheJuliasort of like how in the US, the courts try to focus the question. Here, with this, we need to do the same.18:50
clarkbre people doing it secretly as a code reviewer I feel like the blame is squarely on them if I miss it. But if I'm reviewing code that is properly annotated and I let something in that I shouldn't then I own that18:51
clarkbin the first scenario I have done nothing wrong. In the second I have18:51
TheJuliaclarkb: do you mean secretly as a contributor?18:51
TheJuliafor the first part of that ?18:51
clarkbTheJulia: yes someone generating a contribution and not annotating it that way and contributing it up stream is what I mean for the first case18:52
TheJuliaYeah, fair. I totally agree.18:52
TheJuliaI think the other challenge is context and drawing the line, going back to the question and ultimately a fair and general reading of the text18:53
* TheJulia goes and gets the other laptop with the account to email staff folks18:55
clarkbif claude didn't have this in the tos: "Outputs may not always be accurate and may contain material inaccuracies even if they appear accurate because of their level of detail or specificity." we could just ask claude :)18:57
TheJuliathe standard disclaimer! :)18:59
*** vhari_ is now known as vhari19:10
spotz[m]hehe19:58
JayFTheJulia: clarkb: The thing I keep thinking about is this: you know what else isn't yet well litigated across the world? OSS licenses themselves. 20:08
JayFTheJulia: clarkb: Any attempt by me, a software engineer with lots of experience and ideas but not an expert in the direction of legality, to understand legalese is going to be misleading as hell. This is 100000% a problem I need to outsource if it matters. I also am not 100% sure, but would be shocked if there aren't similarly restrictive clauses in other tools one might use as a swdev, that just aren't looked at in this detail. 20:10
TheJuliaThat reminds me, there was a case pending in the UK I need to go lookup.20:10
JayFe.g. https://www.wired.com/story/open-source-license-requires-users-do-no-harm/20:11
JayFhttps://archive.is/2xgsS is a paywall bypass for that article20:11
fungithere have been lots of "do no harm" licenses in the past that the osi rejected instantly because it's a restriction on fields of endeavor20:14
JayFYep, my point is more asking if we've done this level of diligence to ensure *those* tools weren't used in creation of OpenStack? My entire premise is that this class of issue always existed, it's just amplified by the uptake and discourse around AI.20:18
fungioh sure, i think in the past it's been uncommon for developer tools to claim or revoke ownership over anything you produce using them20:21
fungiit's just further complicated by the fact that in most cases you don't know exactly what material an llm was trained on, and whether it might regurgitate something similar enough to some of its training input to be considered derivative20:24
JayFThe interesting thing is, the way I prompt it to use/copy/model-based-on other openstack code in Apache 2.0, it almost doesn't even make sense to ask that question20:25
clarkbfungi: right, I think one of the subtleties here is that the llm creates an output artifact that is meaningful in one way or another, and the llm operator says you can only use that output artifact if you follow their rules. Compiler outputs or other more deterministic content generation tools are generally understood to have no ownership by the tool creator with all of that belonging18:31
clarkbto the user20:31

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!