As the world becomes more digital, the amount of data we generate continues to grow at an exponential rate. To keep up with this influx of information, new technologies are constantly being developed. One of the most popular of these is MongoDB, a powerful and flexible NoSQL database system designed to handle massive amounts of data while remaining highly scalable and reliable. In this blog post, we will explore what MongoDB is, analyze the MDB stock forecast for 2021, and provide you with the latest news on this technology. Whether you are a seasoned developer or just getting started in the world of data management, this post is for you. So, let's dive in and explore the world of MongoDB. If you have any questions, please don't hesitate to ask.

Every bit of data has been gathered through official APIs and is, as far as I can tell, within the technical limits of acceptable use, but still in a morally gray area. I realize the high level of creepiness involved in archiving other people's photos without their express consent, which is one of the reasons why I wanted to end this project. I'm not a professional; I only did this as a hobby.

Other download methods may be added in the future; torrents just seem to make the most sense to me at this time.

Side note: I used to store everything in sqlite3 exclusively, which worked amazingly well considering the size of the dataset, but it had issues with multi-threaded access and made parallelization difficult. That is why I jumped to mongodb, which I had very little experience with (another opportunity to learn!). It also explains why I have multiple collections: they are direct copies of the sqlite table schema that I used and aren't necessarily the perfect design for mongodb. Maybe somebody would like to fix this, but I'd rather not spend any more time on this project.
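To illustrate what "direct copies of the sqlite table schema" can look like in practice, here is a minimal sketch that turns every row of one sqlite table into a plain dictionary, one per would-be MongoDB document. The function name `table_to_documents` and the example `author` table with its columns are assumptions for illustration only; the original migration code is not shown in the post.

```python
import sqlite3


def table_to_documents(conn, table):
    """Return every row of `table` as a dict (one dict per future document).

    Hypothetical helper: it only mirrors the table's columns as keys,
    which is the 'one collection per table' layout described above.
    """
    # Table names cannot be bound as SQL parameters, so check the name
    # against the tables that actually exist before interpolating it.
    known = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")

    cur = conn.cursor()
    cur.row_factory = sqlite3.Row  # rows become addressable by column name
    cur.execute(f'SELECT * FROM "{table}"')
    return [dict(row) for row in cur]


if __name__ == "__main__":
    # Tiny in-memory example; the schema here is made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO author (name) VALUES ('example_user')")
    for doc in table_to_documents(conn, "author"):
        print(doc)
    # With pymongo installed, a list like this could then be passed to a
    # collection's insert_many(); that step is left out of this sketch.
```

A layout produced this way keeps the relational shape (references between collections instead of embedded documents), which is likely why the author calls it not necessarily the perfect design for mongodb.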
Hey everyone, a long time ago someone named ihadp (or something like that) offered a huge archive of gonewild posts. The idea of data archival, scraping and just the scope of his project impressed and inspired me to work on my own crawler in python. It has been running for about a year and a half and has archived/scraped 500GB worth of image data from several gonewild'ish subreddits. That is more than enough to satisfy my curiosity, and enough to make working with such a huge blob of data really unwieldy on my comparatively undersized workstation and laptop - I draw the line when stuff takes an hour or two to move via gigabit.

It has been a really good learning experience, but now I don't know what to do with all that data. Deleting it seems like the wrong thing to do, so I thought I'd offer it to this community.

- mongodump export / BSON format, GridFS file storage
- Ran once an hour, scraping the following subreddits:
- Supported downloading images and videos from i.reddit, imgur, gfycat, vidible
- Contains two collections (4 if you count the GridFS collections):
  - author_file, containing a reference to the author document, the filename, the image or video data itself and possibly a thumbnail (if one has been created on-the-fly by the flask application I wrote to browse this dataset)
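As a rough picture of the author_file record described above, the sketch below models one entry and the "thumbnail created on-the-fly" behaviour: the thumbnail is generated the first time it is requested and reused afterwards. All field names (`author_id`, `data_id`, `thumbnail_id`) and the `thumbnail_for` helper are hypothetical; the post does not give the real field names or the flask code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AuthorFile:
    """One author_file entry (field names are assumptions, not the real schema)."""
    author_id: str                       # reference to the author document
    filename: str                        # original file name
    data_id: str                         # id of the stored image/video data
    thumbnail_id: Optional[str] = None   # set only once a thumbnail exists


def thumbnail_for(record: AuthorFile,
                  make_thumbnail: Callable[[AuthorFile], str]) -> str:
    """Return the record's thumbnail id, creating the thumbnail on first use."""
    if record.thumbnail_id is None:
        # First request: generate the thumbnail and remember its id.
        record.thumbnail_id = make_thumbnail(record)
    return record.thumbnail_id


if __name__ == "__main__":
    def fake_thumbnailer(rec: AuthorFile) -> str:
        # Stand-in for real image processing; returns a made-up id.
        return "thumb-of-" + rec.data_id

    rec = AuthorFile(author_id="author-1", filename="a.jpg", data_id="file-1")
    print(thumbnail_for(rec, fake_thumbnailer))
    print(thumbnail_for(rec, fake_thumbnailer))  # reuses the stored id
```

In a real setup the two ids would presumably point at GridFS files, and the updated `thumbnail_id` would need to be written back to the database after the first request.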