Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family.
Kofi Annan
Recently I was asked what I do in my free time, and among other things, I named my love for what is known as “data hoarding.” This piqued the interest of the person who asked, and I went on to explain what data hoarding entails. Once they understood, they were confused about its appeal. The definition and motivations of data hoarding differ from person to person. Some say they hoard data so that they can have all of their content in one place, whilst others believe in a grander purpose: the preservation and protection of information for future generations.
As we march ever further into the digital age, astronomical quantities of information are created each and every day. Millions of songs, videos, forms, documents, papers, essays, books, pictures, and more pour onto our digital landscape, waiting for someone to gaze upon them. The beauty of this is that we have more information on any given subject than any society in the entirety of human history. We have substantial knowledge not only of our current day but also of centuries past. All of this is chaotically strewn about, loosely organized by web crawlers and gathered in indexes. This chaos leads to a lack of protection: every day, as more data is added, more is lost. It isn’t uncommon for services to purge large quantities of data without giving a thought to its significance. Services go offline, servers go down, content is removed, data is lost.
Although we have such a large amount of information, it can be difficult, if not impossible, to preserve all of it. Of course, like all things, information is lost. We can’t take a perfect snapshot of our current time representing all facets of the world scene. Instead, we will always have little holes in our knowledge of history. Even so, the digital age has allowed us to document and archive nearly all facets of our day. This is the motivation: preservation and subsequent protection of a wide variety of information. What could be viewed as “useless” information today could prove pivotal to the study of our day and age. Yes, data hoarding is not an exercise in selfishness but an exercise in gift-giving, the gifts being neatly wrapped packages of data for our descendants and peers. Have you ever looked for a video and realized it was removed, but found it reuploaded elsewhere? Have you ever tried to find a site, realized it was down, and found a snapshot of it on the Internet Archive? These are examples of archivists and data hoarders at work.
Information is power. But like all power, there are those who want to keep it for themselves.
Aaron Swartz
With this being said, let us take a broader view. Though the main motivation of data hoarding is to protect data, there is another that is equally important: the decentralization of information. Knowledge is the human extension of information. In other words, knowledge is merely our internalization of information. It is safe to assume, then, that the information we are exposed to directly influences our knowledge. A lack of information results in a lack of knowledge. In every example of the great farces of history, from authoritarians and fascists to tyrants and dictators, control of the people is the ultimate goal. What is the best way of controlling people? Ignorance. The control of information is the most effective way of controlling knowledge and keeping the general population in check. Things that go against the status quo are expunged or manipulated. This isn’t just something from science fiction, a 1984 scenario; we see the control of information in many countries today. China and North Korea, for example, control most (if not all) of the information coming in and out of their nations, and as a result many resources are completely blocked off from access. This limits information, limits knowledge, and grants power.
Evidently, control of information, and thus control of knowledge, is power. My love for data hoarding comes from the act of preservation and empowerment, both for future generations and the present. The decentralization of information helps ensure it a longer life span, guarding against both natural losses, such as the service hosting it going down, and outside factors, such as attacks on the source. This idea of decentralization is integral to data hoarding: the more copies that exist, the better the chance that the information won’t be lost, expunged, or manipulated. Don’t put all your eggs in one basket.
In conclusion, data hoarding is a fun hobby and something that I am happy I have gotten into. From aggregating the data to distributing it, each step of the process is enjoyable, but the most enjoyable thing is knowing that you are helping preserve a piece of history. In the unfortunate event that something is lost, you can fill in that missing puzzle piece. This is a great feeling and reminds you why you’re doing this. If you want to learn more about data hoarding and join a community of like-minded individuals, I would suggest heading over to r/DataHoarder. There is plenty to read and learn about when it comes to data hoarding, and getting involved is one of the best ways to learn fast.