Full paper with no paywall that I can see in private browsing: https://arxiv.org/html/2408.10234v2
Totally agree.
I’ll watch whatever Tawny Newsome does next. Lower Decks earned a ton of goodwill. It was a breath of fresh air.
I agree that the models themselves are clearly transformative. That doesn’t mean it’s legal for Meta to pirate everything on earth to use for training. THAT’S where the infringement is. And they admitted they used pirated material: https://www.techspot.com/news/101507-meta-admits-using-pirated-books-train-ai-but.html
You want to use the same bullshit tactics and unreasonable math that the RIAA used in their court cases?
I would enjoy seeing megacorps held to at least the same standards as individuals. I would prefer those standards to be reasonable across the board, but that’s not really on the table here.
I guess the idea is that the models themselves are not infringing copyright, but the training process DID. Some of the big players have admitted to using pirated material in training data. The rest obviously did even if they haven’t admitted it.
While language models have the capacity to produce infringing output, I don’t think the models themselves are infringing (though there are probably exceptions). I mean, gzip can reproduce infringing material too with the correct input. If producing infringing work requires both the algorithm AND specific, intentional user input, then I don’t think you should put the blame solely on the algorithm.
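To make the gzip comparison concrete, here’s a trivial Python sketch (the text is a stand-in for any copyrighted work):

```python
import gzip

# gzip is a general-purpose algorithm; the "infringement" lives in the
# specific input it was given, not in the algorithm itself.
original = b"Pretend this is a copyrighted passage."
blob = gzip.compress(original)

# Given the right input, it reproduces the work verbatim:
assert gzip.decompress(blob) == original
```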
Either way, I don’t think existing legal frameworks are suitable to answer these questions, so I think it’s more important to think about what the law should be rather than what it currently is.
I remember stories about the RIAA suing individuals for many thousands of dollars per mp3 they downloaded. If you applied that logic to OpenAI — maximum fine for every individual work used — it’d instantly bankrupt them. Honestly, I’d love to see it. But I don’t think any copyright holder has the balls to try that against someone who can afford lawyers. They’re just bullies.
Laptops are a crapshoot, so I’d recommend sticking with distros that are known to support your specific model.
Desktops should, in general, just work.
That said, I’ve never personally had a seamless experience. There’s always something I need to struggle to configure. Usually it’s because I’m very picky and I like things to work MY way. The alternative on Windows would not be that it works my way; it would be that there’d be no way to do that, so I’d just have to deal with it. If you’re willing to just roll with the defaults, then yeah, most basic things should just work.
The biggest gotcha is GPU drivers. Not all distros ship with recent kernel versions with modern drivers. You should be pretty safe with Fedora and derivatives.
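If you want to confirm what your system is actually using, here’s a quick sketch (assumes lspci is available, i.e. pciutils is installed; output format varies by distro):

```python
import subprocess

# Kernel version: older kernels often lack drivers for recent GPUs.
print(subprocess.run(["uname", "-r"], capture_output=True, text=True).stdout)

# Which kernel driver is bound to each PCI device, including the GPU.
print(subprocess.run(["lspci", "-k"], capture_output=True, text=True).stdout)
```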
Why? This cannot possibly have any legal weight. Some adults look young. Some kids look old. The very idea is broken from the outset.
I can’t tell if this is incompetence or malice.
This is probably the best solution I’ve found so far.
Unfortunately, even this is no match for the user-hostile design of, say, Microsoft Copilot, which hides content that is scrolled off screen so it’s invisible in the output. That’s no fault of this extension; it actually DOES capture the data. The web site just intentionally obscures itself. Funnily enough, if I open the resulting HTML file in Lynx, I can read the hidden text, no problem. LOL.
Thanks for the info. I was not aware that Bluesky had public, shareable block lists. That is indeed a great feature.
For anyone else like me who was not aware, I found this site with an index of a lot of public block lists: https://blueskydirectory.com/lists . I was not able to load some of them, but others did load successfully. Maybe some were deleted or are not public? I’m not sure.
I’ve never been heavily invested in microblogging, so my first-hand experience is limited and mostly academic. I have accounts on Mastodon and Bluesky, though. I would not have realized this feature was available in Bluesky if you hadn’t mentioned it and I didn’t find that index site in a web search. It doesn’t seem easily discoverable within Bluesky’s own UI.
Edit: I agree, of course, that there is a larger systemic problem at the society level. I recently read this excellent piece (very long but worth it!) that talks a bit about how that relates to social media: https://www.wrecka.ge/against-the-dark-forest/ . Here’s a relevant excerpt:
If this truly is the case—if the only way to improve our public internet is to convert all humans one by one to a state of greater enlightenment—then a full retreat into the bushes is the only reasonable course.
But it isn’t the case. Because yes, the existence of dipshits is indeed unfixable, but building arrays of Dipshit Accelerators that allow a small number of bad actors to build destructive empires defended by Dipshit Armies is a choice. The refusal to genuinely remodel that machinery when its harms first appear is another choice. Mega-platform executives, themselves frequently dipshits, who make these choices, lie about them to governments and ordinary people, and refuse to materially alter them.
Do you think this is a systemic problem, or just the happenstance of today? Is there something about Bluesky’s architecture or governance that makes it more resilient against that (particularly in the long term)? Or will they have all the same problems as they gain more users and enable more federation with other servers?
I’d rather have something like a “code grammar checker” that highlights potential errors for my examination than something that generates code from scratch itself.
Agreed. The other good use case I’ve found is as a faster reference for simple things. LLMs are absolutely great for one-liners and generating troublesome (but logically simple) things like complex xpath queries. But I still haven’t seen one generate a good script of even moderate complexity without hand-holding. In some cases I’ve been able to get usable output with a few shots, saving me a bit of time compared to if I’d written the whole darned thing from scratch.
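For instance, here’s a made-up example (Python with lxml) of the kind of query I mean; the HTML and the “result” class are invented for illustration:

```python
from lxml import html  # third-party: pip install lxml

# The fiddly-but-logically-simple kind of query an LLM nails in one shot:
# "the href of every link in the second cell of rows classed 'result'".
page = html.fromstring("""
<table>
  <tr class="result"><td>A</td><td><a href="/a">a</a></td></tr>
  <tr class="other"><td>B</td><td><a href="/b">b</a></td></tr>
</table>
""")
print(page.xpath("//tr[contains(@class, 'result')]/td[2]//a/@href"))
# -> ['/a']
```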
I’ve found LLMs very useful for coding, but they aren’t replacing my actual coding, per se. They replace looking things up, like through man pages, language references, or StackOverflow. Something like ffmpeg, for example, has a million options and it is always a little annoying to sift through the docs manually when I just need to do one specific task.
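A concrete example of the kind of one-off task I mean: extracting the audio track from a video without re-encoding it (filenames made up; assumes ffmpeg is on your PATH):

```python
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "input.mp4",   # source video
    "-vn",               # drop the video stream
    "-acodec", "copy",   # pass the audio through untouched
    "output.m4a",
], check=True)
```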
I’m sure it’ll happen sooner or later. I’m not naive enough to claim that “computers will never be able to do $THING” anymore. I’ll say “not in the next year”, though.
Right, not an IDE. The BB stands for “bare bones”, but it has a robust feature set as far as general text editing goes. Autocomplete is minimal so I tend to use an IDE for more complex coding tasks.
Sorry, I misspoke. CUPS itself is not deprecated, only most of its old functionality regarding drivers.
From man cups:
CUPS printer drivers, backends, and PPD files are deprecated and will no longer be supported in a future feature release of CUPS
I don’t. I have installed Firefox manually for many years across several distros now, albeit for different reasons. For example:
- Debian only has Firefox ESR in the Bookworm repo. I want the latest mainline version.
- Bazzite only offers it via Flatpak, which breaks functionality I need such as native messaging.
I see no problem installing it manually. It keeps itself updated and has caused me zero problems.
BBEdit.
It makes every other GUI text editor look like a joke.
macOS still uses CUPS. It’s deprecated but still functional. The alternative is to use AirPrint or get fucked.
Just marketing nonsense. There are three ways to present AI features:
1. A generational improvement on things that have been available for 20+ years. This is not sexy and does not make for good advertising. For example: grammar checking, natural-speech processing (Siri), automatic photo tagging/sorting.
2. A new type of usage that nobody cares about because they’ve lived without it just fine up to now.
3. Straight-up lying to people about what it can do, using just enough weasel words to keep yourself out of jail.
Which backend are you using to run it, and does that backend have an option to adjust context size?
I noticed in LM Studio, for example, that the default context size is much smaller than the maximum that the model supports. Qwen should certainly support more than 2000 tokens. I’d try setting it to 32k if you can.
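For example, if the backend is llama.cpp under the hood, context length is a load-time parameter. A rough sketch via llama-cpp-python (the model filename here is made up):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# n_ctx is the knob that matters: many frontends default to 2k-4k even
# when the model itself supports a much larger window.
llm = Llama(model_path="qwen2.5-7b-instruct-q4_k_m.gguf", n_ctx=32768)
```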
I’d be surprised if it were anything else. No way in hell OpenAI is going to develop their own browser engine from scratch. Mayyyyybe they go with Gecko? Might make sense if OpenAI is trying to eat Google’s lunch long-term.
I never thought about it that way, but yeah. Spot-on.
I don’t hate it in Enterprise to be honest, because there is the context of “humanity not yet at its best”.
My experience might be a bit outdated, but I remember finding the default Mac OS X Terminal extremely slow. A few years back I ran an output-heavy command, and displaying the output in the terminal was orders of magnitude slower than writing it to a file. The same task on my Linux system was much, much faster. I’m not sure how much of that was due specifically to rendering vs. memory management or something else, though.
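If anyone wants to try reproducing it, here’s a rough version of the comparison (seq is just a stand-in for any output-heavy command; numbers will vary wildly by terminal):

```python
import subprocess
import time

cmd = ["seq", "1", "1000000"]

start = time.time()
subprocess.run(cmd)                  # output rendered live by the terminal
mid = time.time()
with open("/tmp/seq.out", "w") as out:
    subprocess.run(cmd, stdout=out)  # same output written straight to a file
end = time.time()

print(f"terminal: {mid - start:.2f}s  file: {end - mid:.2f}s")
```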
I might see if I can still reproduce this in Sequoia and if Ghostty is faster on Mac.