{"id":2436,"date":"2024-12-03T21:03:34","date_gmt":"2024-12-04T01:03:34","guid":{"rendered":"https:\/\/lowtek.ca\/roo\/?p=2436"},"modified":"2024-12-03T21:03:34","modified_gmt":"2024-12-04T01:03:34","slug":"hoarder-a-self-hosted-link-collection-and-web-archive","status":"publish","type":"post","link":"https:\/\/lowtek.ca\/roo\/2024\/hoarder-a-self-hosted-link-collection-and-web-archive\/","title":{"rendered":"Hoarder &#8211; a self hosted link collection and web archive"},"content":{"rendered":"<p>I found out about <a href=\"https:\/\/hoarder.app\/\">Hoarder<\/a> via the <a href=\"https:\/\/selfhosted.show\/\">self-hosted podcast<\/a>. While I don&#8217;t always agree with the opinions of the hosts, they&#8217;ve helped me discover useful things a few times. I&#8217;d certainly recommend checking the podcast out.<\/p>\n<div class=\"mt-4 w-full space-y-6 text-center\">\n<blockquote>\n<h1 class=\"text-center text-3xl font-bold sm:text-6xl\">The <span class=\"bg-gradient-to-r from-purple-600 to-red-600 bg-clip-text text-transparent\">Bookmark Everything<\/span> App<\/h1>\n<\/blockquote>\n<div class=\"mx-auto w-full gap-2 text-base md:w-3\/6\">\n<blockquote>\n<p class=\"text-center text-gray-600\">Quickly save links, notes, and images and hoarder will automatically tag them for you using AI for faster retrieval. Built for the data hoarders out there!<\/p>\n<\/blockquote>\n<p>The <a href=\"https:\/\/docs.hoarder.app\/Installation\/docker\">install<\/a> is Docker-friendly and based on Compose. It takes just 3 simple steps to get a test instance set up.<\/p>\n<ol>\n<li>Download the compose yaml.<\/li>\n<li>Create a\u00a0<span class=\"codespan__pre-wrap\"><code>.env<\/code><\/span>\u00a0file with a few values<\/li>\n<li>Then\u00a0<span class=\"codespan__pre-wrap\"><code>docker compose up<\/code><\/span><\/li>\n<\/ol>\n<p>It seems to support &#8220;sign up&#8221; &#8211; if you host this where it is externally visible you may get some spammy sign-ups.. 
this may be something you can disable.. (yes, you can disable this, as I find out below)<\/p>\n<p>After you have created a user &#8211; you are greeted with this blank canvas<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2437\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-500x284.png\" alt=\"\" width=\"500\" height=\"284\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-500x284.png 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-1024x581.png 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-768x436.png 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-1536x871.png 1536w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-2048x1162.png 2048w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder1-1200x681.png 1200w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a>I currently run <a href=\"https:\/\/wallabag.org\/\">Wallabag<\/a> &#8211; which I landed on after trying a few other choices. It was the best choice for my needs at the time, but also super basic. Wallabag has a mobile app which I find useful &#8211; as it makes it easy to share links I find on mobile to my Wallabag install.<\/p>\n<p>Wallabag often struggles to capture a page &#8211; but it at least keeps the link. One example is <a href=\"https:\/\/www.thekitchn.com\/\">this website<\/a> &#8211; which has some sort of scraper blocker. 
You get a page that indicates it is protected by <a href=\"https:\/\/www.perimeterx.com\/whywasiblocked\">this<\/a>.<\/p>\n<p>Ok &#8211; so how does Hoarder do with a link <code>https:\/\/www.thekitchn.com\/instant-pot-bo-kho-recipe-23242169<\/code>?<\/p>\n<p>For comparison &#8211; this is what Wallabag got..<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2438\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-500x242.png\" alt=\"\" width=\"500\" height=\"242\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-500x242.png 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-1024x495.png 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-768x371.png 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-1536x742.png 1536w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-2048x990.png 2048w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder2-1200x580.png 1200w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><\/p>\n<p>The capture in Hoarder takes a bit of time &#8211; not long, but it renders a mostly blank card immediately and then the image fills in.<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2439\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-500x235.png\" alt=\"\" width=\"500\" height=\"235\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-500x235.png 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-1024x482.png 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-768x362.png 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-1536x723.png 1536w, 
https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-2048x964.png 2048w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder3-1200x565.png 1200w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><br \/>\nLet&#8217;s take a closer look at the tile that it created for me<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2440\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4-500x447.png\" alt=\"\" width=\"500\" height=\"447\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4-500x447.png 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4-1024x916.png 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4-768x687.png 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4-1200x1073.png 1200w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder4.png 1371w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><\/p>\n<p>The top of the tile is a picture and link to the original URL (1). 
The link (1) also goes to the same destination.<br \/>\nThe date (2) and expansion arrows (2) both take you to a larger locally hosted view.<br \/>\n(3) is a menu of options<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2441 size-thumbnail\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder5-150x150.png\" alt=\"\" width=\"150\" height=\"150\" \/><\/a><\/p>\n<p>Let&#8217;s dive deeper into the expanded (locally hosted) view<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-scaled.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2442\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-500x282.jpg\" alt=\"\" width=\"500\" height=\"282\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-500x282.jpg 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-1024x578.jpg 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-768x434.jpg 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-1536x868.jpg 1536w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-2048x1157.jpg 2048w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder6-1200x678.jpg 1200w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><\/p>\n<p>The overall capture\/rendering of the page from the local version is pretty good. 
Links in the text haven&#8217;t been re-written, but that&#8217;s both expected and generally useful.<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2443\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7-500x483.png\" alt=\"\" width=\"500\" height=\"483\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7-500x483.png 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7-1024x989.png 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7-768x742.png 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7-1200x1159.png 1200w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder7.png 1445w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><\/p>\n<p>This view also offers the option to view a screenshot &#8211; which is as you expect.<\/p>\n<p><a href=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2444\" src=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-500x367.jpg\" alt=\"\" width=\"500\" height=\"367\" srcset=\"https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-500x367.jpg 500w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-1024x752.jpg 1024w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-768x564.jpg 768w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-1536x1129.jpg 1536w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8-1200x882.jpg 1200w, https:\/\/lowtek.ca\/roo\/wp-content\/uploads\/2024\/12\/hoarder8.jpg 1799w\" sizes=\"auto, (max-width: 500px) 85vw, 500px\" \/><\/a><\/p>\n<p>Since I didn&#8217;t provide an OpenAI key nor did I configure Ollama the fancy &#8220;Summarize with AI&#8221; button just gives 
me an error.<\/p>\n<p>Looking closer &#8211; it seems this sets up 3 separate containers<\/p>\n<ul>\n<li>ghcr.io\/hoarder-app\/hoarder:release<\/li>\n<li>getmeili\/meilisearch:v1.11.1<\/li>\n<li>gcr.io\/zenika-hub\/alpine-chrome:123<\/li>\n<\/ul>\n<p>but.. I&#8217;m not seeing any storage on the host &#8211; this is probably bad, because at least one of these containers must be stateful (and looking at the compose file &#8211; there are two named data volumes)<\/p>\n<pre class=\"lang:default decode:true \">volumes:\r\n  meilisearch:\r\n  data:\r\n<\/pre>\n<p>I prefer storing my data on the host filesystem as a volume mapping&#8230; maybe I&#8217;ll warm up to the whole docker volume thing, but it always feels like a big hack. (Read on and you&#8217;ll find that there is a way to avoid the storage concerns that I have here).<\/p>\n<p>The search appears limited to the title only (boo) &#8211; tags are supported in search too.. but there is no deep searching within the text of the articles.<\/p>\n<p>Looking more at the <a href=\"https:\/\/docs.hoarder.app\/configuration\">doc<\/a> &#8211; persistence is something you can configure<\/p>\n<p><code>DATA_DIR<\/code> &#8211; &#8220;The path for the persistent data directory. This is where the db and the uploaded assets live.&#8221;<\/p>\n<p>and it does appear you can stop signups from happening<\/p>\n<p><code>DISABLE_SIGNUPS<\/code> &#8211; &#8220;If enabled, no new signups will be allowed and the signup button will be disabled in the UI&#8221;<\/p>\n<p>Interesting options for the crawler (disabled by default)<\/p>\n<p><code>CRAWLER_FULL_PAGE_SCREENSHOT<\/code> &#8211; &#8220;Whether to store a screenshot of the full page or not. Disabled by default, as it can lead to much higher disk usage. If disabled, the screenshot will only include the visible part of the page&#8221;<\/p>\n<p><code>CRAWLER_FULL_PAGE_ARCHIVE<\/code> &#8211; &#8220;Whether to store a full local copy of the page or not. 
Disabled by default, as it can lead to much higher disk usage. If disabled, only the readable text of the page is archived.&#8221;<\/p>\n<p><code>CRAWLER_VIDEO_DOWNLOAD<\/code> &#8211; &#8220;Whether to download videos from the page or not (using yt-dlp)&#8221;<\/p>\n<p>Overall &#8211; I&#8217;m pretty impressed. I&#8217;m not sure I&#8217;m quite ready to dump wallabag, but this might become a project I tackle during the holiday break. That stew recipe is pretty amazing, absolutely worth trying.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I found out about Hoarder via the self-hosted podcast. While I don&#8217;t always agree with the opinions of the hosts, they&#8217;ve helped me discover useful things a few times. I&#8217;d certainly recommend checking the podcast out. The Bookmark Everything App Quickly save links, notes, and images and hoarder will automatically tag them for you using &hellip; <a href=\"https:\/\/lowtek.ca\/roo\/2024\/hoarder-a-self-hosted-link-collection-and-web-archive\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Hoarder &#8211; a self hosted link collection and web 
archive&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,9],"tags":[],"class_list":["post-2436","post","type-post","status-publish","format-standard","hentry","category-computing","category-reviews"],"_links":{"self":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2436","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/comments?post=2436"}],"version-history":[{"count":3,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2436\/revisions"}],"predecessor-version":[{"id":2447,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2436\/revisions\/2447"}],"wp:attachment":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/media?parent=2436"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/categories?post=2436"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/tags?post=2436"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}