David Grogan
david-grogan.bsky.social
Software developer. Dublin
... As for impersonation specifically: that would be bad, and if it's not covered by existing copyright law I'd support new laws to beef up protections there. But that's compatible with everything I've said already, which is all about judging the output, not the training.
December 16, 2024 at 7:54 PM
I'm not being obtuse, I've covered the jobs aspect already in another reply to another guy in this thread. I genuinely don't consider that to be a real harm at all, just the natural cost of technology advancing. Society will adapt as it always does...
December 16, 2024 at 7:54 PM
I don't like hearing people talk about what technologies we 'allow' people to use. Outside of stuff that can literally kill and maim people, we need to err on the side of freedom. Of course people can choose what media they want to support, and are free to criticise as they see fit.
December 16, 2024 at 7:43 PM
Ok, would you agree that the end result here (some abstract data patterns), taken by itself, isn't in any way problematic? If so, the only important thing ethically would be how the data was learned, was it learned via forced child labour camps, or via some maths that harmed nobody?
December 16, 2024 at 7:32 PM
Wut? Your little scenario is not even vaguely similar to what's going on here, learning patterns from music data via calculus is not in any way comparable to forced child labour. lmao indeed.
December 16, 2024 at 6:54 PM
In terms of a legal standard, it's either similar enough to be treated the same legally, or it's completely different, in which case it's not covered at all by current law anyway.
December 16, 2024 at 6:39 PM
In terms of an ethical standard, I think the similarities in the end product are enough that you really can treat them the same; the main differences are all in the learning process, and that's just not relevant...
December 16, 2024 at 6:39 PM
Based on the current technology, for producing completed works, probably not, it's kind of shit; at best vaguely amusing for producing memes. Though as it stands it works well as a musical tool (either producing short samples, or creating stems from regular samples, or other forms of processing).
December 16, 2024 at 6:27 PM
This has always happened in human society when new technologies come along; nothing new here. Trying to prevent technological progress is not only a waste of time but counterproductive too. What we should be doing is using the fruits of these advances to improve society for everyone, e.g. UBI.
December 16, 2024 at 6:19 PM
The open source models are already here, the genie is out of the bottle. Another possibility, also currently completely possible, would be to train a large model entirely on public domain music, and then individuals could fine-tune the model on whatever data they wanted, including copyrighted music.
December 16, 2024 at 6:14 PM
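[Editor's note] The two-stage workflow in the post above (pre-train on one dataset, then fine-tune on another) can be sketched with a deliberately tiny linear model. Everything here is illustrative: real music models are vastly larger, but the mechanic of "continue gradient descent from the base weights instead of from scratch" is the same.

```python
# Toy sketch: pre-train a "base model" on one dataset, then fine-tune a copy
# of its weights on a much smaller personal dataset. All names and data are
# hypothetical; the model is a single linear map y = W x.
import numpy as np

def train(X, Y, W, steps, lr=0.1):
    """Plain gradient descent on mean squared error for y = W x."""
    for _ in range(steps):
        W = W - lr * (X @ W.T - Y).T @ X / len(X)
    return W

rng = np.random.default_rng(0)

# Stage 1: pre-training on a large "public domain" dataset.
X_public = rng.normal(size=(200, 2))
W_base_true = np.array([[1.0, 0.0], [0.0, 1.0]])      # the base task
base = train(X_public, X_public @ W_base_true.T, np.zeros((2, 2)), steps=300)

# Stage 2: an individual fine-tunes a copy of the base weights on their own,
# much smaller dataset — starting from `base`, not from zero.
X_user = rng.normal(size=(20, 2))
W_user_true = np.array([[1.2, 0.1], [-0.1, 0.9]])     # close to the base task
tuned = train(X_user, X_user @ W_user_true.T, base.copy(), steps=50)
```

Because the user's task is close to the base task, a few fine-tuning steps on a small dataset are enough to move the weights most of the way there.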
... It's basically just an incredibly efficient form of data compression. Also, as the technology develops we'll probably start to see the training process resemble meat-based learning more and more.
December 16, 2024 at 6:08 PM
You're correct that the process by which a neural network is trained is completely different to the process by which a human learns things. The end result is surprisingly similar though: we store reproducible patterns in the strengths of neural connections, ML models store them in matrices...
December 16, 2024 at 6:08 PM
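[Editor's note] "Patterns stored in matrices" can be made concrete with a toy example (nothing like a real music or language model): fit a single weight matrix by gradient descent, and the learned pattern ends up living entirely in that matrix's numbers — the training data itself can then be thrown away.

```python
# Toy sketch: a "pattern" stored in a weight matrix. We fit a linear layer
# y = W x by gradient descent on mean squared error; after training, W alone
# reproduces the pattern that was present in the data.
import numpy as np

rng = np.random.default_rng(0)
W_true = np.array([[2.0, -1.0], [0.5, 3.0]])    # the pattern to be learned
X = rng.normal(size=(100, 2))                    # "training data"
Y = X @ W_true.T

W = np.zeros((2, 2))                             # model weights, initially blank
for _ in range(500):
    grad = (X @ W.T - Y).T @ X / len(X)          # gradient of the squared error
    W -= 0.1 * grad                              # gradient descent step

print(np.allclose(W, W_true, atol=1e-3))         # → True: pattern now lives in W
```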
..., and not just because of the existence of open source models, but because we're going to see training efficiency rapidly increase over the coming years, this whole thing will become rapidly democratised.
December 16, 2024 at 5:54 PM
...(and good luck getting them enacted in the current climate); and trying to frame this in anti-capitalist terms is also a bit silly, because a) capitalism is great, and b) these technologies are not going to be limited to people who can afford giant supercomputers for long...
December 16, 2024 at 5:54 PM
It's a philosophical question that gets to the core of whether this is a moral problem at all though, and I'm sceptical that this represents any kind of harm at all. As for criticising the legal argument, well, you'll have to propose some new laws if you want to stop people using this technology...
December 16, 2024 at 5:54 PM
...produces something novel, which is not recognisable as a copyrighted piece of music, then that would not violate copyright.
December 16, 2024 at 4:19 PM
...be with a human artist.

So, if a model spits out something which is recognisably similar to a copyrighted piece of music, and someone tries to monetise that output, then that's what's subject to copyright.

But if a model combines various things it has learned from multiple sources and...
December 16, 2024 at 4:19 PM
Even the notion of training on copyrighted music being a violation is dubious. It's no different in principle from a human learning how to make music by listening to copyrighted music. Where it would fall under copyright protection is when the output is used for commercial ends, just as it would...
December 16, 2024 at 4:19 PM
Wrong, there are loads of emergent behaviours (this has been the subject of much research). Not nearly as many as in a human brain, though we don't understand consciousness well enough to rule even that out (for all we know rocks are conscious); I would guess it requires a lot more complexity, though.
December 15, 2024 at 8:11 PM
Yes, they do. You're correct that cognition is not reducible to word associations, but frontier models are not just word associators: for one, the embedding space is multi-modal, and that is only one component; MLP/attention layers allow for modularity, transformations, filtering, etc.
December 15, 2024 at 7:26 PM
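[Editor's note] The attention mechanism referenced above can be sketched in a few lines of NumPy. This is a single head of scaled dot-product attention only; real transformer layers add multiple heads, causal masking, residual connections, and per-layer learned projections.

```python
# Minimal single-head scaled dot-product attention over a token sequence.
# Illustrative sketch: weight matrices are random here rather than learned.
import numpy as np

def attention(X, Wq, Wk, Wv):
    """Each output row is a weighted mix of value vectors, with weights
    determined by query-key similarity (softmax over the keys)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # token-to-token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

d = 4
rng = np.random.default_rng(1)
X = rng.normal(size=(3, d))                          # 3 tokens, d-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = attention(X, Wq, Wk, Wv)
print(out.shape)                                     # → (3, 4)
```

This filtering-and-mixing step is what lets each token's representation be transformed based on the other tokens in context, rather than looked up from a fixed table.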
That's just the emergent behaviour of the unthinking process, the tiny man isn't conscious, it's the room that's conscious.
December 15, 2024 at 7:06 PM
You are a tiny man in a windowless room pattern matching in a language you don't actually speak.
December 15, 2024 at 6:18 PM
So complex that they contain generalised symbolic representations, world modelling and other features, just like our brains do. This is well established in the research now.
December 15, 2024 at 6:08 PM
LLMs are not simple word-probability tables.
December 15, 2024 at 6:01 PM
..., and the ability to learn while reasoning (which current statically trained models completely lack, probably the biggest hurdle).
December 15, 2024 at 6:01 PM