LW - My takes on SB-1047 by leogao

The Nonlinear Library



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My takes on SB-1047, published by leogao on September 9, 2024 on LessWrong.
I recently decided to sign a letter of support for SB 1047. Before deciding whether to do so, I felt it was important for me to develop an independent opinion on whether the bill was good, as opposed to deferring to the opinions of those around me, so I read through the full text of SB 1047.
After forming my opinion, I checked my understanding of tort law basics (definitions of "reasonable care" and "materially contribute") with a law professor who was recommended to me by one of the SB 1047 sponsors, but who was not directly involved in drafting or lobbying for the bill. Ideally I would have consulted a completely independent lawyer, but this would have been prohibitively expensive and difficult on a tight timeline. This post outlines my current understanding.
It is not legal advice.
My main impression of the final version of SB 1047 is that it is quite mild. Its obligations only cover models trained with $100M+ of compute, or finetuned with $10M+ of compute. [1] If a developer is training a covered model, they have to write a safety and security protocol (SSP) that explains why they believe it is not possible to use the model (or a post-train/finetune of the model costing <$10M of compute) to cause critical harm ($500M+ in damage or mass casualties).
This would involve running evals, doing red teaming, etc. The SSP also has to describe what circumstances would cause the developer to decide to shut down training and any copies of the model that the developer controls, and how they will ensure that they can actually do so if needed. Finally, a redacted copy of the SSP must be made available to the public (and an unredacted copy filed with the Attorney General).
This doesn't seem super burdensome, and is very similar to what labs are already doing voluntarily, but it seems good to codify these things because otherwise labs could stop doing them in the future. Also, current SSPs don't make hard commitments about when to actually stop training, so it would be good to have that.
If a critical harm happens, then the question for determining penalties is whether the developer met their duty to exercise "reasonable care" to prevent models from "materially contributing" to the critical harm. This is determined by looking at how good the SSP was (both in an absolute sense and when compared to other developers) and how closely it was adhered to in practice.
Reasonable care is a well-established concept in tort law that basically means you did the cost-benefit analysis that a reasonable person would have done. Importantly, it doesn't mean the developer has to be absolutely certain that nothing bad can happen.
For example, suppose you release an open source model after doing dangerous capabilities evals to make sure it can't make a bioweapon, but then a few years later a breakthrough in scaffolding methods happens and someone makes a bioweapon using your model. As long as you were thorough in your dangerous capabilities evals, you would not be liable, because it would not have been reasonable for you to anticipate that someone would make a breakthrough that invalidates your evaluations.
Also, if mitigating the risk would be too costly, and the benefit of releasing the model far outweighs the risks of release, this is also a valid reason not to mitigate the risk under the standard of reasonable care (e.g., the benefits of driving a car at a normal speed far outweigh the costs of car accidents, so reasonable care doesn't require driving at 2 mph to fully mitigate the risk of car accidents).
My personal opinion is that the reasonable care standard is too weak to prevent AI from killing everyone. However, this also means that I think people opposing the current version of the bill because of the reasonable care requireme...
