Aaron Scher
@aaronscher.bsky.social
Technical AI Governance Research at MIRI
Views are my own
- Reflection on how this is hard but we should try: bsky.app/profile/aaro...
- Mechanism highlight: FlexHEGs: bsky.app/profile/aaro...
One family of mechanisms that seems promising is Flexible Hardware-Enabled Guarantees (FlexHEGs) and other on-chip approaches. These could potentially be used to securely carry out a wide range of governance operations on AI chips, without leaking sensitive information. (1/3)
I’ll be posting some snippets of interesting ideas from this giant verification report in other threads, linked here! (12/12)
December 5, 2024 at 8:02 PM
Some versions of FlexHEGs could be designed and implemented in only a couple of years and retrofitted to existing chips! Designing more secure chips like this could unlock many AI governance options that aren’t currently available! (3/3)
December 5, 2024 at 8:01 PM
These have been discussed previously (e.g., yoshuabengio.org/wp-content/u..., by @yoshuabengio.bsky.social), so we don’t explain them in much depth in the report, but they are widely useful! (2/3)
December 5, 2024 at 8:01 PM
- Inspectors could be viable: bsky.app/profile/aaro...
- Reflection on US/China conflict: bsky.app/profile/aaro...
- Mechanism highlight: Signatures of High-Level Chip Measures: bsky.app/profile/aaro...
One mechanism that seems promising: Signatures of High-Level Chip Measures. Classify workloads (e.g., training vs. inference) based on high-level chip measures like power draw, using ‘signatures’ of these measures built from temporary code access. (1/6)
December 4, 2024 at 8:03 PM
- Substituting high-tech low-access with low-tech high-access: bsky.app/profile/aaro...
- Distributed training causes problems: bsky.app/profile/aaro...
- Mechanism highlight: Networking Equipment Interconnect Limits: bsky.app/profile/aaro...
Distributed training (i.e., geographically distributed, decentralized training) could pose major problems for many AI governance plans. In the default case, large AI training happens in a small number of big data centers, so monitoring training can focus on those data centers. (1/10)
December 4, 2024 at 8:02 PM
Conceptually, this could be thought of as having a ‘signature’ of approved activity (e.g., inference, or finetuning models you’re allowed to finetune) that monitored chips’ measured behavior has to stay sufficiently close to. (6/6)
December 4, 2024 at 7:57 PM
But it’s probably much easier: you no longer have a huge distribution shift (e.g., new algorithms, maybe different types of chips), because you included labeled data from the monitored country in your classifier’s training set. (5/6)
December 4, 2024 at 7:57 PM
So when deployed, this mechanism looks like similar classification systems: you measure, e.g., the power draw or inter-chip network activity of a bunch of chips and try to detect any prohibited activity (a toy sketch follows this thread). (4/6)
December 4, 2024 at 7:57 PM
But this classification problem gets much easier if you have access to labeled data for chip activities that are approved by the treaty. You could get this by giving inspectors temporary code access. (3/6)
December 4, 2024 at 7:57 PM
Classifying chip activity has been researched previously, but it’s not clear it will be robust enough in the international verification context: highly competent adversaries who can afford to waste some compute could potentially spoof these classifiers. (2/6)
December 4, 2024 at 7:57 PM
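A minimal toy sketch of the ‘signature’ idea in the thread above, with synthetic power-draw traces standing in for real chip telemetry. The features, tolerance, and numbers are illustrative assumptions, not anything from the report; real measures (and real adversaries) would be far more complex.

```python
# Toy sketch (illustrative assumptions only): build a 'signature' of approved
# chip activity from labeled power-draw traces, then flag chips whose measured
# behavior drifts too far from it.
import numpy as np

rng = np.random.default_rng(0)

def features(trace):
    """Summarize a power-draw trace (watts sampled over time)."""
    return np.array([trace.mean(), trace.std(), np.abs(np.diff(trace)).mean()])

# Labeled traces of approved activity (e.g., inference), which inspectors could
# obtain via temporary code access. Here: synthetic stand-ins.
approved_traces = [rng.normal(300, 20, 1_000) for _ in range(50)]
X = np.stack([features(t) for t in approved_traces])

signature = X.mean(axis=0)          # the approved-activity 'signature'
spread = X.std(axis=0) + 1e-9       # per-feature variability among approved traces
TOLERANCE = 4.0                     # assumed tolerance, in units of that variability

def flag(trace):
    """True if a chip's measured trace deviates too far from the signature."""
    z = np.abs((features(trace) - signature) / spread)
    return bool(z.max() > TOLERANCE)

print(flag(rng.normal(300, 20, 1_000)))   # approved-looking workload -> False
print(flag(rng.normal(450, 60, 1_000)))   # heavier, training-like profile -> True
```

The same structure covers the framing in (6/6): the check is a distance to a profile of approved behavior, rather than a generic trained classifier.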
In the report we give example calculations for inter-pod bandwidth limits, discuss distributed training, note various issues, and generally flesh out this idea. Kulp et al. discuss this idea in the context of manufacturing new chips, but in our case manufacturing new chips isn’t strictly necessary. (8/8)
Mechanisms to Verify International Agreements About AI Development — MIRI Technical Governance Team
In this research report we provide an in-depth overview of the mechanisms that could be used to verify adherence to international agreements about AI development.
techgov.intelligence.org
December 4, 2024 at 7:55 PM
This mechanism is promising because it can be implemented with physical networking equipment and security cameras but no code access. This means it poses minimal security risk and could be implemented quickly. (7/8)
December 4, 2024 at 7:55 PM
This gap is what interconnect bandwidth limits exploit: with enough inter-pod bandwidth for inference but not for training, a data center can verifiably claim that these AI chips are not participating in a large training run. (6/8)
December 4, 2024 at 7:55 PM
Back-of-the-envelope calculations indicate a ~1.5-million-fold bandwidth difference between data-parallel training and inference (a rough worked version follows this thread)! Distributed training methods close this gap substantially, but there is likely still a gap after such adjustments. (5/8)
December 4, 2024 at 7:55 PM
Inference only requires tokens to move in and out of a pod, which is very little data for text, on the order of 100 KB/s. For training, the communication is typically much larger: activations (tensor parallelism) or gradients (data parallelism). (4/8)
December 4, 2024 at 7:55 PM
If between-pod bandwidth is set correctly, a pod could conduct inference but couldn’t efficiently participate in a larger training run. This is because training has higher between-pod communication requirements than inference. (3/8)
December 4, 2024 at 7:55 PM
AI chips can be physically arranged to have high-bandwidth communication with only a small number of other chips, e.g., 128 (a “pod”), and very low-bandwidth communication with chips outside this pod. (2/8)
December 4, 2024 at 7:55 PM
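Putting the thread’s numbers together, here is a rough sketch of the back-of-the-envelope comparison. The model size, gradient precision, and step time are assumptions chosen for illustration, not the report’s exact parameters; with these assumptions the gap lands near the ~1.5 million times figure quoted above.

```python
# Rough, illustrative comparison of inter-pod bandwidth needs for data-parallel
# training vs. inference serving. All parameters below are assumptions.

PARAMS = 1.5e12        # assumed model size (parameters)
BYTES_PER_PARAM = 2    # assumed gradient precision (bf16)
STEP_TIME_S = 20       # assumed wall-clock time per optimizer step

# Data parallelism: each optimizer step, a pod exchanges a full set of gradients.
train_bw = PARAMS * BYTES_PER_PARAM / STEP_TIME_S   # bytes/s across the pod boundary

# Inference: only tokens cross the pod boundary; text is tiny (order 100 KB/s).
infer_bw = 100e3

print(f"training (data parallel): {train_bw / 1e9:8.1f} GB/s")
print(f"inference serving       : {infer_bw / 1e3:8.1f} KB/s")
print(f"ratio                   : {train_bw / infer_bw:,.0f}x")
```

As the thread notes, bandwidth-efficient distributed training methods close this gap substantially, so the naive data-parallel number overstates the margin, but a large gap likely remains.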
Whistleblower programs and interviews could be relatively straightforward to implement, requiring no especially novel tech, and they could make it very difficult to keep violations hidden, given the ~hundreds of people involved. (12/12)
December 4, 2024 at 7:51 PM
Interviews could also be key: e.g., people working on AI projects could be made available for interviews conducted by international regulators, structured specifically around detecting violations of international agreements. (11/12)
December 4, 2024 at 7:51 PM