Does ActivityPub send those to other instances, or does ActivityPub only send the original post and the rest (upvotes, downvotes, replies) are stored only on the original server where the post was made?
Haven’t worked with AP yet, but as a webdev I’m certain it’s original server only. Syncing upvotes between nodes would be an insane data volume, and hell to keep properly in sync to begin with.
They are synced. There is an insane data volume, yes. It is hell.
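To make the volume concrete, here is a rough sketch of what a federated vote looks like and why the numbers blow up. It assumes votes travel as ActivityStreams `Like` / `Dislike` activities; the actor and post URLs, the `make_vote` and `deliveries` helpers, and the 500 votes / 200 instances figures are made up for illustration, and real software wraps and signs these activities in its own way.

```python
import json


def make_vote(actor: str, object_id: str, upvote: bool = True) -> dict:
    """Build a minimal ActivityStreams vote activity (Like or Dislike)."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Like" if upvote else "Dislike",
        "actor": actor,        # hypothetical user URL
        "object": object_id,   # hypothetical post URL
    }


def deliveries(votes: int, subscribed_instances: int) -> int:
    """Rough count of inbox POSTs if every vote reaches every subscribed instance."""
    return votes * subscribed_instances


vote = make_vote("https://example.social/u/alice", "https://example.social/post/1")
print(json.dumps(vote, indent=2))

# Illustrative numbers only: 500 votes on one post, 200 subscribed instances.
print(deliveries(votes=500, subscribed_instances=200))  # 100000
```

Each of those deliveries is a separate signed HTTP request that the receiving server has to verify and store, which is where the CPU and network load comes from.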
no way, that’s a massive oof o.O
Yeah. A lot of hand-wringing has gone on about it, e.g. https://gist.github.com/jdarcy/60107fe4e653819138396257df302eef. I’ll post this and then show you a video of server activity that results.
Demo post
Here is a screencast of what happens to my 2 core server when I post something - https://kglitch.social/activitypub_cpu_and_net.mp4.
I run a single user instance, more or less, so there is little chance of some other user causing this load.
Some of it will be due to the way Kbin is built but I believe any software using ActivityPub to communicate will run into similar issues sooner or later, especially with network traffic usage.
Completely off-topic, but what’s that dope af htop replacement?
Looks like btop
My instance has 800 users, is 4 months old, and the database alone is over 30GB. It is an insane amount of data.
There is a postgres command to show the size of each table. Most likely it is from activity tables which can be cleared out to save space.
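The comment above does not name the command, so this is only one common way to do it: a query over `pg_statio_user_tables` using PostgreSQL's `pg_total_relation_size` and `pg_size_pretty`. The Python wrapper just builds a `psql` invocation; the `psql_command` helper and the database and user name `lemmy` are assumptions, so substitute your own.

```python
# Lists the ten largest tables, including their indexes and TOAST data.
TABLE_SIZES_SQL = """
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
"""


def psql_command(database: str, user: str) -> list[str]:
    """Argument list for running the query through the psql CLI."""
    return ["psql", "-U", user, "-d", database, "-c", TABLE_SIZES_SQL.strip()]


# "lemmy" as database and user name is an assumption, adjust to your setup.
cmd = psql_command("lemmy", "lemmy")
print(" ".join(cmd[:5]))
```

Pass the list to `subprocess.run(cmd)` on the database host, or paste the SQL straight into a `psql` session.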
After the second-to-last update the database shrank and I was under the impression there was some automatic removal happening. Was this not the case?
It’s helpful info for others but personally I’m not that worried about the database size. The size of the pictrs cache is much more of a concern, and as I understand it there isn’t an easy way to identify and remove cache images without accidentally taking out user image uploads.
Yes there is automatic removal so if you have enough disk space, no need to worry about it.
The pictrs storage only consists of uploads from local users, and thumbnails for both local and remote posts. Thumbnails for remote posts could theoretically be wiped and loaded from the other instance, but they shouldn’t take much space anyway.
What triggers this? My DB was about 30GB, then the update shrank it down to 5GB, then it grew back to 30GB.
I’d be pretty confident that the 140GB of pictrs cache I have is mostly cache. There are occasionally users uploading images, but we don’t have that many active users. I’d be surprised if there was more than a few GB of image uploads in total out of that 140GB. We just aren’t that big of a server.
The pictrs volume also grows consistently at a little under 1GB per day. I just went and had a look, in the files directory there are 6 directories from today (the day only has a couple of hours left), and these sum to almost 700MB of images and almost 6000 files, or a little over 100KB each.
The instance has had just 27 active users today (though of course users not posting will still generate thumbnails).
While the cached images may be small, it adds up really quick.
As far as I can tell there is no cache pruning, as the cache goes up pretty consistently each day.
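The numbers above can be checked with a quick back-of-the-envelope calculation. This is only arithmetic on the figures quoted in the comment (700MB, 6000 files, roughly 1GB per day, 140GB total); the helper names are made up, and the daily growth is rounded up from "a little under 1GB".

```python
def avg_file_kb(total_mb: float, file_count: int) -> float:
    """Average file size in KB, using 1 MB = 1000 KB."""
    return total_mb * 1000 / file_count


def days_to_reach(target_gb: float, growth_gb_per_day: float) -> float:
    """Days of steady growth needed to reach a given cache size."""
    return target_gb / growth_gb_per_day


print(round(avg_file_kb(700, 6000)))  # 117
print(days_to_reach(140, 1.0))        # 140.0
```

So about 117KB per cached file, and around 140 days of unpruned growth to reach 140GB, which fits a cache that is never cleaned up.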
The activities table is cleared out automatically every week; items older than 3 months are deleted. During the update only a smaller number of rows was migrated so the db temporarily was slower. You can manually clear older items in sent_activity and received_activity to free more space.
Actually I’m wrong about images, turns out that all remote images are mirrored locally in order to generate thumbnails. 0.19 will have an option to disable that. This could use more improvements, the whole image handling is rather confusing now.
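For the manual cleanup of sent_activity and received_activity, a sketch along these lines might help. The table names come from the comment above, but the timestamp column name `published`, the `prune_statement` helper, and the 90-day cutoff are assumptions; check the actual schema with `\d sent_activity` in `psql` and take a backup before deleting anything.

```python
ALLOWED_TABLES = {"sent_activity", "received_activity"}


def prune_statement(table: str, keep_days: int, ts_column: str = "published") -> str:
    """Build a DELETE statement for rows older than keep_days.

    ts_column defaults to "published", which is an assumption about the schema.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    if keep_days < 1:
        raise ValueError("keep_days must be at least 1")
    if not ts_column.isidentifier():
        raise ValueError(f"invalid column name: {ts_column}")
    return f"DELETE FROM {table} WHERE {ts_column} < now() - interval '{keep_days} days';"


for name in sorted(ALLOWED_TABLES):
    print(prune_statement(name, keep_days=90))
```

Note that PostgreSQL does not hand disk space back to the OS right after a DELETE; the space becomes reusable after vacuuming, and only `VACUUM FULL` shrinks the files themselves.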
Thanks for the info! For performance reasons it would be nice to have a way to configure how long the cache is kept rather than disable it completely, but I understand you probably have other priorities.
Would disabling the cache remove images cached up to that point?
You will have to wait for 0.19 to disable it. Pictrs 0.5 will also add a way to clear old images. See the issue: https://github.com/LemmyNet/lemmy/issues/4053