Decentralizing the World Wide Web

“The services are kind of creepy in how much they know about you,” says Brewster Kahle, the founder of the Internet Archive.
This is coming from an individual who has made it his life’s goal to archive the entire internet, regardless of the wishes of the people whose work he’s backing up.
posted by NoxAeternum at 12:45 PM on September 12 [6 favorites]
My favourite quote: “Our laptops have become just screens. They cannot do anything useful without the cloud”. I’d suggest not buying a Chromebook next time…
posted by pipeski at 12:52 PM on September 12 [4 favorites]
It basically comes across as “we should be the ones running the internet”. Again, Kahle’s hypocrisy illustrates the point – he’s complaining about how much data the big services hoover up, while engaging in much the same conduct himself. Not to mention that their comments on harassment and illegal activity leave a lot to be desired.
Ultimately, they’re trying to create a technological solution to a social problem, one that they refuse to try to understand.
posted by NoxAeternum at 1:04 PM on September 12 [2 favorites]
>…he’s complaining about how much data the big services hoover up, while engaging in much the same conduct himself
I don’t think this comparison is accurate. The Archive records the public-facing web, not “privately” (for lack of a better term) posted thoughts and images. The Archive doesn’t then deep-analyze this data and use it to target ads. It’s a nonprofit public service with huge benefits for millions.
The simple fact that it records large amounts of information doesn’t put it in the same class as companies like Google and Facebook, which surreptitiously track, store, and collate behaviors of individuals with the primary purpose of advertising to them or selling them a product. There’s a huge difference here and conflating them to me seems disingenuous.
Any project that scrapes the web is going to have some privacy implications and will collect data that people may not have intended to be stored elsewhere. But that’s a byproduct of the public, decentralized web as it originally operated. Putting something on the publicly accessible internet is a form of consent to reading and storing (your computer does it automatically for everything you see online), and the Archive does that in a systematic way that’s important for the web, not to mention for journalists, developers, and curious people.
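To make the “reading is storing” point concrete, here’s a toy Python sketch (the URL and filename are placeholders, and this is not the Archive’s actual pipeline): fetching a page and keeping the bytes is all a basic archive snapshot amounts to.

```python
# Toy sketch: viewing a public page already entails storing a copy of it.
# The URL and output filename are placeholders for illustration only.
import urllib.request

url = "https://example.com/"
with urllib.request.urlopen(url) as resp:
    html = resp.read()  # your browser does the equivalent for every page you view

with open("snapshot.html", "wb") as f:
    f.write(html)  # an archive just makes that copy systematic and durable
```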
Certainly we need feedback mechanisms and ways for takedown and copyright requests to be handled properly. The Archive doesn’t have a compliance team like YouTube’s, which builds AI to automatically detect and take down content. It doesn’t have a 20,000-person moderation team like Facebook (which outsources most of it, of course). And even those don’t work: the problems of unwanted duplication and bad content remain unsolved even by the largest and most advanced tech companies in the world.
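For a flavor of why naive automation falls short, consider the simplest possible scheme, an exact-hash blocklist: a one-byte change to the content defeats it. A toy sketch (the content and blocklist here are made up, and no real platform works this simply):

```python
# Toy sketch: exact-hash blocklisting of known-bad content, and why it's
# brittle. The "bad" content and blocklist are made up for illustration.
import hashlib

blocklist = {hashlib.sha256(b"known bad content").hexdigest()}

def is_blocked(data: bytes) -> bool:
    return hashlib.sha256(data).hexdigest() in blocklist

print(is_blocked(b"known bad content"))   # True: an exact copy is caught
print(is_blocked(b"known bad content!"))  # False: one added byte evades it
```

Real systems use perceptual hashes and ML classifiers to tolerate such changes, and as noted above, even those don’t fully work.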
posted by BlackLeotardFront at 1:16 PM on September 12 [14 favorites]
Of course, those big centralized services made it easy for people to share information even if they weren’t technically proficient enough to set up a website of their own. A return to a more decentralized approach, while generally good, might therefore leave a lot of people feeling left out.
Basically, like NoxAeternum said, it’s a social problem far more than it is a technical one.
posted by asnider at 1:21 PM on September 12 [1 favorite]
>…regardless of the wishes of the people whose work he’s backing up
I don’t think that’s accurate. The Internet Archive respects robots.txt exclusions, and you can apparently also request that your content be removed from it. There were some articles from 2017 about the Archive planning to stop respecting robots.txt directives generally, but that must not have come to pass, since I’ve personally run into robots.txt blocks very recently when trying to archive some new articles from a media website.
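For the curious, this is easy to check from code. A minimal sketch using Python’s standard library (example.com stands in for a real site, and “ia_archiver” is the user-agent string the Archive’s crawler has historically used):

```python
# Minimal sketch: a polite crawler consults robots.txt before fetching.
# example.com is a placeholder; "ia_archiver" is the user agent the
# Internet Archive's crawler has historically identified itself as.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # download and parse the site's robots.txt

url = "https://example.com/articles/latest.html"
if rp.can_fetch("ia_archiver", url):
    print("robots.txt permits archiving:", url)
else:
    print("robots.txt blocks archiving:", url)
```

Real crawlers also cache robots.txt and honor crawl delays, but the gate itself is this simple.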
posted by cosmic.osmo at 1:30 PM on September 12 [5 favorites]
Decentralization is nice (hey, I still have my own domain and website!), but there’s a reason people went to the big sites: ‘cos all their friends were there. I don’t see anything in the article that prevents this happening again.
posted by zompist at 2:12 PM on September 12 [2 favorites]
If this story is any good, the take-home message is that unvetted information sources are shredding the social fabric. We can kiss Ben Franklin’s aphorisms goodbye.
The appropriate precedent for study is the rise of the pamphleteer in the 18th century.
I gather this was an era of unparalleled opinionating and quackery, both boosted by readily available print shops.
I am also reminded of Daniel Boorstin’s study of advertising, and how once-strict standards were eroded over the course of some decades.
On the other hand, of course, there’s also the stranglehold of corporate TV and radio in midcentury, where centralised information control gave itself a really bad name.
Coming back to the decentralised web, the idea sounds fine in principle, and is already doable through small, loose networks. But I’m struck by the likelihood of those cells morphing into closed, conspiratorial communities feeding their own neuroses and biases.
The extent to which that already happens on Facebook is evidence enough. The atomisation of the web might mean that giant propaganda machines cannot operate at scale, but it might also mean those machines can be the biggest fish in a planet of small ponds.
With the print media, pamphleteers and advertising, I think the worst behaviour was addressed through regulatory schemes, enforced by a government that had some basis in a civic-minded citizenry. Or something.
posted by rustipi at 3:17 PM on September 12
Original Source
posted by el io at 12:41 PM on September 12 [8 favorites]