| opendevreview | Merged openstack/governance master: Propose jlarriba to Telemetry PTL https://review.opendev.org/c/openstack/governance/+/964516 | 06:55 |
|---|---|---|
| slaweq | gouthamr: gmaan I was using script https://github.com/slawqo/tools/blob/master/user_survey_analyzer.py to have a bit of help with analyzing data. IIRC I had to simply copy just TC related questions and answers from the overall survey results and then save them in the csv file which this script could understand | 08:33 |
| slaweq | but it was really long time ago when I last did it so I may be wrong there | 08:33 |
| slaweq | I just checked and it still works with a csv file | 08:37 |
| slaweq | you can simply export results from https://docs.google.com/spreadsheets/d/1bte_k0R3_rVzFF9ND5zTZV0j9liXI1PxPNo-XmFnFPs/edit?gid=1181187970#gid=1181187970 to csv and run this script | 08:37 |
| slaweq | just install prettytable module first as it is needed by that script | 08:38 |
| slaweq | I hope it will help you a bit | 08:38 |
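For context, the core of such an analyzer can be sketched as below. This is a hypothetical minimal version, not slaweq's actual script; the function names and CSV layout are illustrative. It tallies answer frequencies per question from the exported CSV and prints one summary table per question, with a plain-text fallback when prettytable is not installed.

```python
# Hypothetical sketch of a survey CSV analyzer (NOT slaweq's actual script).
# It reads a CSV exported from the survey spreadsheet, counts how often each
# answer appears per question, and prints one summary table per question.
import csv
import io
import sys
from collections import Counter

try:
    from prettytable import PrettyTable  # pip install prettytable
except ImportError:  # fall back to plain text if prettytable is missing
    PrettyTable = None


def tally_answers(csv_text):
    """Return {question: Counter(answer -> count)} from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        question: Counter(row[question] for row in rows if row.get(question))
        for question in (reader.fieldnames or [])
    }


def print_summary(csv_text):
    for question, counts in tally_answers(csv_text).items():
        print(question)
        if PrettyTable is not None:
            table = PrettyTable(["Answer", "Count"])
            for answer, count in counts.most_common():
                table.add_row([answer, count])
            print(table)
        else:
            for answer, count in counts.most_common():
                print(f"  {answer}: {count}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g. python analyzer.py tc_survey_results.csv
    with open(sys.argv[1], newline="", encoding="utf-8") as f:
        print_summary(f.read())
```

The real script presumably does more (e.g. selecting only the TC-related questions), but the core is just per-question frequency counting.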
| gmaan | slaweq: ++ | 16:03 |
| gmaan | maybe it will be good to add it in the governance repo. surely it will help in the future too, slaweq gouthamr what do you say? | 16:03 |
| slaweq | I think I was asking about it a long time ago and then we decided that we don't need it there | 16:04 |
| slaweq | but if you think it could be there, feel free to take it, make it a more "production ready" tool and propose it - I'm fine with this | 16:04 |
| slaweq | for now it is just a simple script which helped me do that analysis, nothing fancy really :) | 16:05 |
| gmaan | ok, we have such scripts (project health etc) in this directory and adding this makes sense to me https://github.com/openstack/governance/tree/master/tools | 16:08 |
| cardoe | So I didn't have a chance to bring it up during the TC session yesterday but some feedback I got during the PTG was around our usage of AI. Teams were looking to formalize a bit more clarity around what the contributor had the AI do. e.g. write the tests, write the code, write the docs. | 17:46 |
| cardoe | There's still some hesitation from projects to accept code that's written by AI due to the "code written by AI cannot compete against this AI" clause in the license that someone mentioned during one of the sessions (I forget who but if you see this maybe you can speak up and elaborate more). | 17:47 |
| sean-k-mooney | cardoe: personally i don't want to have to go into a lot of detail about what ai did or did not do in each commit message | 17:52 |
| sean-k-mooney | i really really don't want it to end up in the code | 17:53 |
| sean-k-mooney | i.e. as a comment in any way | 17:53 |
| cardoe | I understand. I'm just trying to bring forward something that I heard in 2 different team's PTG sessions about the usage of AI. | 17:53 |
| sean-k-mooney | yep it's totally valid to discuss | 17:54 |
| cardoe | My PTG schedule was a total hot mess and I was jumping around a lot. I even entirely missed the OpenStack Helm sessions and I'm a core over there. :/ | 17:54 |
| cardoe | So I don't even remember what teams said it. | 17:54 |
| cardoe | But I just wanted to make the TC aware that it was a topic of concern among a few projects. | 17:54 |
| sean-k-mooney | same, i have found it's harder to do this virtually instead of in person when you work cross-project | 17:54 |
| sean-k-mooney | this being attending, and remembering to attend, all the sessions you planned to | 17:55 |
| cardoe | At one point 3 different sessions in 3 different rooms scheduled a topic I wanted to bring up at the same 30 minute window. | 17:56 |
| sean-k-mooney | did any of them start or end the topic in that 30 min window :) | 17:57 |
| sean-k-mooney | i was in sessions from 13:00-18:30 tuesday-friday and i only didn't attend on monday because it was a public holiday. if it was in person it would have been less compressed | 17:59 |
| sean-k-mooney | more work to get to/from the location but also more time when there. | 17:59 |
| sean-k-mooney | cardoe: back to your topic, most of the ai terms i have seen are phrased as "you can't use the output of the llm to train an llm that competes with the service we provide" | 18:00 |
| clarkb | cardoe: i brought up the claude tos issue. | 18:01 |
| sean-k-mooney | which is actually a eula restriction in your contract with them, not a limitation on the actual code generated | 18:01 |
| sean-k-mooney | so anthropic could ban my account for breach of terms of use | 18:01 |
| cardoe | clarkb: perfect. Glad you're here cause you're more knowledgeable than I am. | 18:01 |
| sean-k-mooney | but it should not impact my code contribution | 18:01 |
| cardoe | sean-k-mooney: okay. well that was different than clarkb's interpretation. So if the foundation's lawyers have an agreed-upon interpretation I think that would be good. TheJulia we've had them weigh in before, yes? | 18:03 |
| sean-k-mooney | well i'm obviously not a lawyer so i'll defer to them | 18:03 |
| cardoe | Either way. I'm not asking the TC or projects to take any actions. I more wanted to bring awareness that teams are feeling ambiguity and have an interest in improved guidance. | 18:03 |
| TheJulia | I'm on a call at the moment which requires my focus, give me a little bit | 18:03 |
| sean-k-mooney | but that is the interpretation i have gathered | 18:03 |
| clarkb | cardoe: to be fair my interpretation is "its ambiguous to me" not that there is a clear line either way | 18:03 |
| clarkb | the existing foundation policy basically says you need to use models that are compatible with open source software development. I don't know if a model that limits field of endeavor for its users is compatible | 18:05 |
| clarkb | my understanding is that claude limits you to legal activities and forbids competition with their service | 18:05 |
| clarkb | but whether or not that only applies to the user account or the produced code that ends up in a code base is ambiguous to me | 18:05 |
| clarkb | and I'm not a lawyer | 18:05 |
| JayF | I'll note I have a friend at Anthropic who I passed on the gist of these concerns to and he raised it internally. I don't expect any official response, but I suspect their intention is not to tamp down OSS use. This is not a comment on if I agree or disagree the clause is problematic legally :) | 18:07 |
| clarkb | I think if I received code from a human that said "you may not use this to compete with me or perform illegal actions" then I could not in good faith sign the dco when submitting that code | 18:08 |
| sean-k-mooney | cardoe: clarkb you're referring to section 3.2 of https://www.anthropic.com/legal/consumer-terms correct | 18:08 |
| TheJulia | okay, off that call | 18:08 |
| TheJulia | beginning to read/digest | 18:08 |
| sean-k-mooney | """To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models or resell the Services""" | 18:09 |
| clarkb | sean-k-mooney: 3. "Use of our Services" has a giant list of restrictions | 18:09 |
| sean-k-mooney | yep | 18:09 |
| sean-k-mooney | i didn't think any of them applied to how i use it for contributions | 18:09 |
| sean-k-mooney | but obviously there are limits as part of the usage agreement | 18:10 |
| clarkb | the way I model it in my own head is similar to vendoring code from one code base to another. I can do that if the source code base is licensed in a manner that is compatible with the destination license. Eg no additional infringement of rights. | 18:11 |
| sean-k-mooney | that is not how i think of it but i can see that perspective | 18:11 |
| clarkb | but I'm not a lawyer so have no idea if that is an appropriate way of understanding this | 18:12 |
| clarkb | and as mentioned above the other way I think about it is if a human handed me the code and listed those limitations would I accept it to the code base (and again when considered this way no I would not) | 18:13 |
| fungi | also keep in mind that the first versions of the openinfra ai policy, and the lf's policy on which it was partly based, came about before osi began to formalize the osaid, and i believe some of the intent was to encourage use of open source ai once we know what that means, because the terms of use and intellectual property risks are somewhat mitigated | 18:14 |
| sean-k-mooney | the usage policy does not assert any ownership or licence over the code. the output content generated by the llm is not subject to a software licence itself. | 18:15 |
| * | TheJulia is still reading | 18:15 |
| clarkb | the third way I look at it (and this is why it's ambiguous to me) is claude and the claude user have an agreement to interact with each other in a certain way. That agreement is not necessarily transitive, so the claude user deciding to interact with an open source project differently may be fine | 18:16 |
| clarkb | but the DCO is transitive aiui | 18:16 |
| sean-k-mooney | right, i read it as the terms only governing my interaction with the service they provide, and not transitively applying to the generated code | 18:17 |
| fungi | "Contributions to OpenInfra Foundation projects are distributed under open source software licenses (Apache 2.0 or other OSI approved licenses), so code or content included in a contribution must be compatible with those licenses. The license of a contribution does not need to be exactly the same as the project's license, being compatible means that the contribution's license | 18:17 |
| fungi | grants sufficient rights to allow everything the project's license allows (or allows more), and imposes similar restrictions (or fewer restrictions). Many open source licenses are compatible with other open source licenses, and code or content in the public domain is compatible with all open source licenses. Contributors need to verify they have the right to contribute output | 18:17 |
| fungi | from AI tools, just like they do for their own original work, work owned by their employer, work copied or modified from another open source project, or work submitted on behalf of a third party." https://openinfra.org/legal/ai-policy | 18:17 |
| fungi | that's the relevant text from the current policy | 18:17 |
| fungi | which means you at least do need to know what the ip situation is for the output of the tools you use | 18:18 |
| fungi | the policy doesn't make assertions about what that situation is for various popular tools | 18:18 |
| fungi | and "i assume it'll be fine" doesn't really cut it, i don't think | 18:19 |
| fungi | hopefully the tools you use do make their own assertions about the ownership of their output | 18:19 |
| sean-k-mooney | well there is also https://iclg.com/news/22400-us-court-confirms-ai-generated-art-cannot-be-copyrighted to consider | 18:20 |
| fungi | either in a terms of use agreement or similar | 18:20 |
| TheJulia | So, chiming in on ambiguity item 3.2: it can be taken many different ways, but from a plain reading standpoint it is definitely a generalized catch-all which should give folks pause, in so much as using a tool to make a derivative work. You can't control what someone does beyond you and they are not really forward bound (and also, the general legal ecosystem is still up in the air as sean just noted) | 18:20 |
| sean-k-mooney | this is slightly different of course as i'm not asserting there was no human input in an ai driven contribution | 18:21 |
| TheJulia | Partly the policy for OIF was created to recognize "we know, no matter what we say, people are going to play with the shiny tools. Lets minimally get them to acknowledge it and track it through the tags on commit messages as well. Thus, should the legal landscape shift away from permissive, then we have a starting point to do the needful unwind/analysis." | 18:22 |
| fungi | yes, i personally don't use llm-type tools to help me write anything, precisely because i am unsure what the actual legal situation is with their output. it's a very, very new (in terms of ip law) situation which the courts are going to need to sort out over the course of years, and i'm okay waiting for them to do that | 18:22 |
| clarkb | claude says this about outputs: "Subject to your compliance with our Terms, we assign to you all of our right, title, and interest—if any—in Outputs." | 18:22 |
| sean-k-mooney | to be clear there has been ai generated code in openstack since at least bobcat, well before the policy even existed | 18:23 |
| fungi | but yes, i don't have control over what other people do except insofar as i might downvote/block a proposed change in review | 18:23 |
| clarkb | so basically your rights to the outputs are dependent on adherence to the tos | 18:23 |
| clarkb | (if I read that correctly as a layman) | 18:23 |
| sean-k-mooney | as to me there was technically nothing that required the policy to exist to contribute, based on our prior terms | 18:23 |
| fungi | also is revocation of those rights retroactive to prior "infringing" use? | 18:23 |
| sean-k-mooney | i think until anthropic takes someone to court and tests their terms of use in multiple jurisdictions it's just going to be a maybe? | 18:24 |
| clarkb | right, I'm not saying you shouldn't use ai or that I don't think it isn't going to happen anyway. I'm merely trying to explain why reviewers might have pause when asked to accept changes from people generating code with llms whose terms require a legal degree to understand | 18:24 |
| sean-k-mooney | i generally try to stay pretty clear of their boundaries | 18:25 |
| clarkb | I think that there are some models out there that try to address this more directly. unfortunately various models don't always perform the same so people don't necessarily want to use those models | 18:25 |
| fungi | i.e. if claude decides you were in violation of their tos, then are your rights to things you've already contributed to openstack no longer yours, and they need to be removed? | 18:25 |
| clarkb | claude is popular because it performs well aiui. It also has a complicated tos (in my opinion) | 18:25 |
| sean-k-mooney | well for one its not clear that they have any rights to the output either | 18:26 |
| sean-k-mooney | to be able to revoke them | 18:26 |
| fungi | that's something i personally would pay a lawyer to advise me on, or just avoid entirely if i can't justify paying counsel | 18:26 |
| clarkb | sean-k-mooney: thats a fair point. The tos can assert things they don't necessarily have a right to assert either | 18:26 |
| sean-k-mooney | again my reading is the service terms are a contract between me as the human and them as the service provider, but not applicable to the output of the service | 18:26 |
| TheJulia | sean-k-mooney: my plain reading/interpretation is the same as yours it seems | 18:27 |
| sean-k-mooney | "As between you and Anthropic, and to the extent permitted by applicable law, you retain any right, title, and interest that you have in such Inputs. Subject to your compliance with our Terms, we assign to you all our right, title, and interest (if any) in Outputs." | 18:27 |
| sean-k-mooney | they also say that explicitly in the rights and responsibilities section | 18:27 |
| clarkb | sean-k-mooney: yup but that implies if you breach the tos that you no longer have rights to the outputs. | 18:28 |
| clarkb | and if I want to be super cynical there is an argument there that if you breach the tos in a manner completely unrelated to openstack, they would consider all of your outputs in openstack to be no longer valid | 18:28 |
| clarkb | (but I'm not a lawyer so I have no idea how that maps onto existing legal interpretations) | 18:28 |
| TheJulia | There is also a question of at what point the resulting output has been sufficiently changed to become a derivative work | 18:29 |
| TheJulia | That is a whole un-litigated area related to AI, but in terms of the resulting output the bar is really kind of low... but of course open to interpretation on a case by case basis | 18:29 |
| clarkb | I think where the current policy fails code reviewers at least is guidelines on which models are considered compatible | 18:30 |
| TheJulia | i.e. a bulk rename doesn't make something a derivative work, really. | 18:30 |
| clarkb | because I don't want to waste time (which I've already done with claude) trying to sort that out for myself | 18:30 |
| sean-k-mooney | clarkb: i'm not sure we should try to answer that question | 18:30 |
| fungi | also wrt the article linked above, it looks like the court's finding in that case would place ai-generated output in the public domain within the usa (not all signatories of the berne and buenos aires conventions have a public domain concept though), so if that's the interpretation then the contributions would need to be declared public domain or at least not result in the | 18:30 |
| fungi | contributor making copyright claims on them | 18:30 |
| sean-k-mooney | as i don't know that you could really prove any of them are | 18:30 |
| fungi | i don't think we can legally answer that question. only courts can | 18:31 |
| clarkb | sean-k-mooney: I don't think we need to prove any of them in court so much as have a general guideline that says we expect x y z are ok and contributions can be accepted from them | 18:31 |
| clarkb | currently the policy leaves that up to the reader which is me the code reviewer | 18:32 |
| sean-k-mooney | clarkb: right, but depending on what judgment you read and whether they hold up, the output of an llm has been ruled in the us as transformative | 18:32 |
| TheJulia | The thing we explicitly wanted to avoid was going so deep into that level of detail out of the gate. Maybe time has come to visit that, but realistically I'm not sure folks will also have *that* much interest before the new year. | 18:33 |
| sean-k-mooney | https://www.whitecase.com/insight-alert/two-california-district-judges-rule-using-books-train-ai-fair-use | 18:33 |
| clarkb | TheJulia: when the policy was developed I was expecting more of a "trained on open source for open source development" sort of implementation as the real world application of the policy | 18:33 |
| fungi | the conservative stance for an ai contribution policy would be that projects don't accept such contributions because their legal status is not known and there's no better guidance at this time, but obviously that's not going to work because ai is already "too big to fail" in this industry and so we have to be pragmatic | 18:33 |
| sean-k-mooney | that's from the recent case where anthropic won on the fair use grounds and still lost on the piracy grounds | 18:33 |
| clarkb | TheJulia: the reality today is no one is doing that and everyone is using claude | 18:33 |
| TheJulia | clarkb: yeah, not many folks had that interest, unfortunately. The landscape also shifted from shiny tool to useful tool as well during that time. | 18:34 |
| clarkb | and I think that is the underlying source of the disconnect between the policy work and what we're running into as code reviewers today | 18:35 |
| sean-k-mooney | to be fair i did try using open weight models and i still do from time to time | 18:35 |
| fungi | reminds me of when i was in school and we weren't allowed to use calculators in math classes. or the generation after me being told that internet sources aren't appropriate to use for research papers | 18:36 |
| clarkb | as I said I would not personally accept any code from a human under those terms nor would I sign the DCO under those terms if a human was the source. So I'm not sure why I'd accept it from an llm. sean-k-mooney points out that humans produce copyrighted work. LLMs don't necessarily and that may be one reason | 18:36 |
| sean-k-mooney | but there is a very very very big delta in many of the capabilities | 18:36 |
| TheJulia | The other issue is that it is a constantly shifting ecosystem. Who is to say every tool doesn't change their ToS next week in entirely incompatible or compatible ways as well. | 18:36 |
| sean-k-mooney | clarkb: oh i know you're talking from your perspective | 18:36 |
| sean-k-mooney | i just was commenting on the fact that many | 18:36 |
| sean-k-mooney | purely open-trained models are quite a lot behind the commercial ones today | 18:37 |
| sean-k-mooney | although that is changing | 18:37 |
| fungi | clarkb: to play devil's advocate, usa government employees also don't produce copyrighted work (it's explicitly placed in the public domain), but we still accept their contributions | 18:37 |
| clarkb | sean-k-mooney: yup I get it. There are a lot of factors to consider and it's easy to overlook things, so it's good to try to be thorough before dismissing or accepting something out of hand | 18:37 |
| TheJulia | Also, entirely different use cases/purposes as well. | 18:37 |
| clarkb | fungi: but they have special dispensation to do so | 18:37 |
| clarkb | fungi: via the special cla iirc | 18:37 |
| clarkb | (not sure if that went away with dco probably it did?) | 18:37 |
| fungi | well, did, now they just use the dco | 18:37 |
| sean-k-mooney | clarkb: on a related note i claim no copyright over the llm output of my code review job https://zuul.teim.app/t/main/build/68ab63e0ffa64ce2b095525587bd9785/log/code-review/review-report.md | 18:38 |
| sean-k-mooney | clarkb: i got that working using claude code as the client (also opencode) with glm 4.6 hosted by z.ai | 18:38 |
| sean-k-mooney | step 2 is to make it pretty and readable now that it can work | 18:39 |
| clarkb | fungi: but also government employee humans providing things under the public domain are basically saying there are no limitations whatsoever, which is different than an llm service with a complicated tos | 18:39 |
| clarkb | in any case I think this is where the friction arises at least as I see it when trying to review and accept generated contributions | 18:40 |
| clarkb | the policy punts that work to the code reviewer. I think we had hoped the landscape would be more clear when applying the policy but unfortunately the opposite seems to have happened | 18:40 |
| fungi | yes, ideally the llm operator's tos would make a similar statement, but instead it seems like the user is going to make some personal judgement about it based on their non-lawyer (in most cases) understanding of the inapplicability of some tos clauses | 18:41 |
| sean-k-mooney | yep at the end of the day it still relies on a human in the loop to make a judgement call at some level | 18:45 |
| TheJulia | To be clear, this problem is wholly worse without labeling of changes' commit messages. Going back to my prior statement, we know some folks are always going to try and take the most expedient path forward; the purpose of the labeling is to highlight the consideration, and then the secondary reality check on the human reviewer side is "is this consistent with what I expect" and "is there anything else I need to be aware of". | 18:48 |
| TheJulia | The general plain reading of the ToS at least doesn't feel like a concern to me today, but if Anthropic decided to expand service areas then that obviously would begin to be very very problematic, and hopefully nobody is explicitly violating terms of service. Similarly, installing some tools which you may barely be conscious of on a host might mean you cannot measure or publish any results. It goes back to what is the plain | 18:48 |
| TheJulia | reading of the person doing that, because maybe you're doing something else which is entirely unrelated, or loosely related, but still not in violation of the ToS you agreed to with that binary package from a specific hardware vendor | 18:48 |
| clarkb | TheJulia: I don't think the actual service areas matter too much if there is any limitation on field of endeavor fwiw | 18:49 |
| clarkb | at least for open source software dev. Either its ok to have that in the tos and contribute to open source projects or it isn't | 18:49 |
| TheJulia | yeah, and that is the key question | 18:49 |
| TheJulia | sort of like how in the US, the courts try to focus the question. Here, with this, we need to do the same. | 18:50 |
| clarkb | re people doing it secretly as a code reviewer I feel like the blame is squarely on them if I miss it. But if I'm reviewing code that is properly annotated and I let something in that I shouldn't then I own that | 18:51 |
| clarkb | in the first scenario I have done nothing wrong. In the second I have | 18:51 |
| TheJulia | clarkb: do you mean secretly as a contributor? | 18:51 |
| TheJulia | for the first part of that? | 18:51 |
| clarkb | TheJulia: yes someone generating a contribution and not annotating it that way and contributing it up stream is what I mean for the first case | 18:52 |
| TheJulia | Yeah, fair. I totally agree. | 18:52 |
| TheJulia | I think the other challenge is context and drawing the line, going back to the question and ultimately fair and general reading of the text | 18:53 |
| * | TheJulia goes and gets the other laptop with the account to email staff folks | 18:55 |
| clarkb | if claude didn't have this in the tos: "Outputs may not always be accurate and may contain material inaccuracies even if they appear accurate because of their level of detail or specificity." we could just ask claude :) | 18:57 |
| TheJulia | the standard disclaimer! :) | 18:59 |
| *** | vhari_ is now known as vhari | 19:10 |
| spotz[m] | hehe | 19:58 |
| JayF | TheJulia: clarkb: The thing I keep thinking about is this: you know what else isn't yet well litigated across the world? OSS licenses themselves. | 20:08 |
| JayF | TheJulia: clarkb: Any attempt by me, a software engineer with lots of experience and ideas but not an expert in the direction of legality, to understand legalese is going to be misleading as hell. This is 100000% a problem I need to outsource if it matters. I also am not 100% sure, but would be shocked if there aren't similarly restrictive clauses in other tools one might use as a swdev, that just aren't looked at in this detail. | 20:10 |
| TheJulia | That reminds me, there was a case pending in the UK I need to go lookup. | 20:10 |
| JayF | e.g. https://www.wired.com/story/open-source-license-requires-users-do-no-harm/ | 20:11 |
| JayF | https://archive.is/2xgsS is a paywall bypass for that article | 20:11 |
| fungi | there have been lots of "do no harm" licenses in the past that the osi rejected instantly because it's a restriction on fields of endeavor | 20:14 |
| JayF | Yep, my point is more asking if we've done this level of diligence to ensure *those* tools weren't used in creation of OpenStack? My entire premise is that this class of issue always existed, it's just amplified by the uptake and discourse around AI. | 20:18 |
| fungi | oh sure, i think in the past it's been uncommon for developer tools to claim or revoke ownership over anything you produce using them | 20:21 |
| fungi | it's just further complicated by the fact that in most cases you don't know exactly what material an llm was trained on, and whether it might regurgitate something similar enough to some of its training input to be considered derivative | 20:24 |
| JayF | The interesting thing is, the way I prompt it to use/copy/model-based-on other openstack code in Apache 2.0, it almost doesn't even make sense to ask that question | 20:25 |
| clarkb | fungi: right, I think one of the subtleties here is that the llm creates an output artifact that is meaningful in one way or another, and the llm operator says you can only use that output artifact if you follow their rules. Compiler outputs or other more deterministic content generation tools are generally understood to have no ownership by the tool creator with all of that belonging | 20:31 |
| clarkb | to the user | 20:31 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!