Google Drive does a surprise rollout of file limits, locking out some users

(arstechnica.com)

365 points | by stereoradonc 772 days ago

37 comments

chrisbolt 772 days ago
Previous:
https://news.ycombinator.com/item?id=35329135
https://news.ycombinator.com/item?id=35395001
[-]
- dang 772 days ago
  Thanks! Macroexpanded:
  Please delete 2M files to continue using your Google Drive account - https://news.ycombinator.com/item?id=35395001 - March 2023 (109 comments)
  5M item limit for Google Drive: File unable to generate or upload due to 403 - https://news.ycombinator.com/item?id=35329135 - March 2023 (133 comments)
mcherm 772 days ago
Having a limit of 5 million files is perfectly reasonable. Failing to document that such a limit exists and refusing to publicly confirm it (which apparently is STILL the case) is extraordinarily poor customer service/communication.
Google KEEPS setting new records for poor customer communication, to the point where I (and much of the HN crowd) now expect it. Android developer banned from the app store? There is no meaningful way to appeal but you'll probably never be able to find out why. Your best hope is to post on HN and hope someone with power at Google notices.
Leadership at Google ought to recognize this; they ought to make an effort to improve the channels by which "customers" can communicate with Google. But I see no signs that they are even aware of the issue; I see no effort to change anything.
I would try to tell them but... there's no communication channel. Maybe I should post about it on HN.
[-]
- djha-skin 772 days ago
  Google could have (and many expected them to) eat Amazon's lunch for cloud compute back when it was new. However, Amazon actually had infrastructure around customer support while Google famously really doesn't. From what I gather Google cloud is technologically better, but no one wants to use a cloud from someone who doesn't do customer support as well.
  [-]
  - tedivm 772 days ago
    I've got a tale of two clouds. I complained about an issue with AWS on Twitter and two hours later was on a call with the EKS product manager who connected me with someone on their team who documented the issue, gave me a workaround, and then pushed out a fix to everyone a few days later. It was super impressive!
    I complained about a separate issue with GCP. The product manager found my tweet and told me it was my fault for using a service marked as "preview". He then told me I should have used a different service, which was also marked as "preview". They were rude and defensive and made no attempt to resolve my issue. We decided to just not use GCP for the project as a result.
    Neither of these are isolated stories. I have multiple examples of GCP staff just being super rude for no real reason, while AWS regularly goes above and beyond with customer service.
    [-]
    - acdha 772 days ago
      This matches my experience. GCP was like IBM a decade earlier: they assumed the name alone would close the deal and we didn’t feel like paying more for the privilege of doing their jobs for them.
    - eitland 772 days ago
      I've experienced both:
      One time two very competent guys shows up on site and let me reproduce the issue, agrees it is a serious bug and promises to take care of it.
      Next time I get connected with what seems to be two part time students or something who doesn't seem to have a clue about the product.
      Then in a third case some technical architect or something looks into my stack overflow question, verifies there is a known bug and it will be fixed in next monthly rollout, but I never saw a fix.
      [-]
      - blueyoda 771 days ago
        What's better is when their community forum "expert" redirects you to the first Stackoverflow link on Google - even though it is a band-aid solution that doesn't even solve your problem.
    - berniedurfee 772 days ago
      I had a similar experience, though it was years ago. Found a minor bug in Elastic Beanstalk and opened a ticket. They confirmed the bug and rolled out a fix in the afternoon. I was very impressed.
    - blueyoda 771 days ago
      I've had very similar experiences with Google products. Their "support" for many products tends to be primarily community forums. And on those forums, "if you don't like it then don't use it" seems to be the standard type of answer.
  - lowercased 772 days ago
    > From what I gather Google cloud is technologically better, but no one wants to use a cloud from someone who doesn't do customer support as well
    I have a hard time using any of google's stuff - cloud, whatever. Have had to deal with 'google search console' recently. "Give us your sitemap!". OK... "Could not fetch". The explanation for "could not fetch" might be "couldn't fetch" or "hasn't fetched yet" or "read but not processed" or "error reading" or.. anything else. On March 9, I had a sitemap entry listed as "couldn't fetch" updated to "last fetched date" of March 10 (9am UTC-4, so not even close to march 10 UTC time). It's just.... buggy. Colleagues currently moving stuff to google cloud (by edict of cto) are encountering bug after bug after slowness after flakiness. Google "support", to the extent they answer, says "we don't know".
    Might it have been 'technically better' on day one? perhaps. But if it's buggy/flaky to deal with, and they have no support, not sure how you'd even verify the "technically better" part. How would you trust any numbers you see reported in their own tools?
    Had to use GCP stuff about 6 years ago. It was flaky/slow and relatively unsupported then. Watching colleagues go through stuff today, in 2023, it seems no better.
    [-]
    - SOLAR_FIELDS 772 days ago
      I use both in my day to day work. My takes right now:
      Roles/Permissions in GCP is just done better. The whole system of having to switch roles to be able to see stuff across multiple accounts in AWS is opaque and confusing. TrustPolicies are powerful but feel unnecessarily complex for almost every use case. Google has its own warts around permissioning (doing things often requires an opaque role that you have to look up, the error message is often unhelpful). However, it’s better than unraveling the series of permissions needed to, for instance, have an app pull from an S3 bucket in AWS.
      AWS sucks at naming things. Everything is some nonsensical acronym that only makes sense to salespeople at Amazon. When you wonder what Google’s load balancer product is called, you look it up and it’s perhaps unsurprisingly called Elastic Load Balancer.
      Another plus for Google: Having IAP as a first class citizen is a nice way to avoid having to set up bastions etc when prototyping.
      On the other hand, we just spun up a Karpenter instance in EKS, and according to my colleague it’s much better than Google’s Autopilot product.
      Also there is a whole industry around getting support from Google, lol. We use DoIT at my place of work, which is a company whose entire business is to pool together customer accounts for volume pricing and white glove support from Google. Interestingly, the cost savings from volume pricing are so significant there’s no cost to end users for using DoIT.
      [-]
      - paledot 772 days ago
        Elastic Load Balancer is in fact an AWS product. I believe GCP's equivalent is called Cloud Load Balancing.
        Your point is valid and something I've felt many times as well (Route 53 means DNS somehow???), you just happened to choose literally the only AWS product that is well named.
        [-]
        lwf 770 days ago
        DNS servers listen on port 53 :)
        [-]
        SOLAR_FIELDS 768 days ago
        True. But if you asked me what Route 53 does I wouldn’t be able to tell you without checking. Cloud DNS, OTOH, is a pretty descriptive product name. I’m not saying you disagree, just reiterating the point that AWS naming is just a little too silly and clever for its own good ;)
  - antiterra 772 days ago
    I interviewed with Google Cloud a few years back and they straight up told me they were behind both AWS & Azure as a product and were lacking in user support. They said they were trying to build out an org to address it.
  - throwawaaarrgh 772 days ago
    > From what I gather Google cloud is technologically better
    It's a shitshow. Their design is ridiculous, overcomplicated and opinionated. It's clear they are run by engineers because no sane human would choose to use GCP knowing how bizarre and painful it is.
    Creating an IAM role isn't strongly consistent. You have to build your own polling logic to wait for it to be done to attach a policy to it. Even using their own Terraform provider. If I worked there I'd be embarrassed.
  - silisili 772 days ago
    As someone now in a position to make such decisions on a larger scale, GCP isn't even on my radar for this reason. I've had enough experiences with Google's lack of support when needed that it would take years of turnaround to undo that black mark.
    AWS and Amazon as a whole has always had stellar support. And they'll continue to receive my business in both as long as that holds true, even though we rarely use it.
  - kccqzy 772 days ago
    Google doesn't dogfood its own cloud. None of the popular services like Search, Gmail, or Drive actually use GCP.
    [-]
    - koolba 772 days ago
      Of course they don’t use layer upon any of their own services.
      Otherwise what would they do when they inevitably get discontinued?
    - forty 772 days ago
      Not so long ago (a few years), Amazon retail was not using AWS either (I think they do, at least partially, now).
      [-]
      - kccqzy 771 days ago
        Well, the problem with Google is that AFAIK they have no plans to onboard their important services like Search onto GCP. I believe a few years ago they have tried but found the GCP tools too lackluster for internal use.
    - amf12 772 days ago
      This is true and false. Google services, including GCP, run on its internal cloud. GCP is just a product on top.
- UniverseHacker 772 days ago
  I think it’s basically the attitude “customer support is expensive and annoying, and since we’re Google and can do whatever we want, let’s skip it.” This kind of made sense when they had a massive tech advantage such that their products also needed less support.
  Now they seem to have lost that edge and are just another competitor offering products that come with poor support.
  [-]
  - jollofricepeas 772 days ago
    I’d like to draw parallels to the old cable titans with local monopolies who also had the same customer service tactics as well.
    1. Offer a cheap (free) service
    2. Gain significant market penetration
    3. Provide little to no customer service
    4. Raise prices
    5. Let the product/services degrade until significant public outcry or regulator involvement
  - dingusdew 772 days ago
    [dead]
- tlogan 772 days ago
  The common practice among other companies is this:
  - Give old users approaching or exceeding limit with an additional 10% or 20% buffer.
  - Ensure to notify all users about the change
  - Update relevant support documentation.
  Clear communication is crucial in addressing new storage limitation. For example, organizations might be archiving all their emails in a single Google Drive for legal discovery, AI training, or similar purposes. These companies may quickly reach the 5 million limit, but the situation can be resolved by dividing the content across multiple Google Drives or utilizing alternative storage solutions like Amazon S3 or Dropbox. No big deal.
  The crux of the matter lies in maintaining clear communication.
- kmbfjr 772 days ago
  It is not reasonable at all if you sell the service by N gigabytes per month. They materially changed the service without telling anyone.
  Google does this all the time, anyone who uses GCP, particularly BigQuery, knows all too well about stealth changes that break things.
  [-]
  - thebytefairy 772 days ago
    What BigQuery changes have affected you? I haven't noticed any in years of regular use.
    [-]
    - victortroz 772 days ago
      I believe it's related to the price changes they announced this week, it could double some use cases cost.
- wankle 772 days ago
  > Having a limit of 5 million files is perfectly reasonable.
  Is it? On this desktop I have 1.5 million files, "df -i" says I've used 1.1 million inodes and I have 61 million inodes free.
  The 5 million file limit on Google Drive seems to be excessively low.
  Note to Google, consider re-imagining your PhD's and MBA's as food service personnel.
  [-]
  - messe 772 days ago
    Agreed, it's absurdly low. My rootfs is ~1 million inodes, with ~61 million free too; I only reinstalled Gentoo a couple of months ago, so I'm only using 180GB / 1TB.
    EDIT: s/files/inodes
  - Karellen 772 days ago
    > On this desktop I have 1.5 million files, "df -i" says I've used 1.1 million inodes
    You have 400,000 files with 2 hard links? Or 100,000 files with 5 hard links each?
    How? Why?
    [-]
    - wankle 771 days ago
      I'm not sure what you're asking? I'm happy to try and oblige, do you have a command I can run to answer your questions?
- throwawaaarrgh 772 days ago
  > Leadership at Google ought to recognize this
  They're fully aware of how bad it is. They do not care. The Ad gravy train is still here, and they'll milk it until they can't anymore and then retire to their multiple homes (or go off and do it again to some other company).
  I don't know why people expect leadership to do anything about a shitty company. They have literally no incentive to change. They'll always be hired somewhere else. Business as usual.
- jrootabega 772 days ago
  Customers? Them thinking of us as peasants, or even cattle, might better explain these attitudes.
  [-]
  - hedora 772 days ago
    The term of art is "Chattle", as in "Trespass on Chattle". e.g., start by reading section IIa of this:
    https://digital.sandiego.edu/cgi/viewcontent.cgi?article=316...
    Essentially, Intel employees are 'chattle', and while communicating with them doesn't transfer ownership or control of them away from Intel, it does interfere with Intel's ongoing use of them.
    [-]
    - jrootabega 772 days ago
      The spirit is the same, but I don't think we're talking about the same thing. It's "chattel", and according to wikipedia, for whatever that's worth, it's "trespass to chattels".
      I didn't see any reference there about treating people in a customer relationship as the "chattel" referred to in this law.
      [-]
      - hedora 771 days ago
        It wasn’t originally meant to be applied to employees. That’s a modern interpretation. It seems inevitable that some court will extend the interpretation to customers too.
- pid-1 772 days ago
  Weird they don't consider how that sort of bad behavior impacts their other business.
  I started a company in 2019 -> didn't choose G Suite, because I don't trust Google. Same for GCP.
  [-]
  - kjellsbells 772 days ago
    I wonder if this message is getting through to Google, because the evidence suggests not. Any corp considering Google services has to seriously evaluate the risk that one day google will just decide to whack the service. Even in GCP, things that you might have thought would be solid for the long term, like IoT, can go away. Once you lose that trust, it's very hard to get it back. Azure and AWS I'm sure do their share of dumb things, but it really sticks to Google because everyone knows that they are infamous for killing products.
- dylan604 772 days ago
  >Having a limit of 5 million files is perfectly reasonable
  For ordinary users, this would pretty much equate to an unlimited account. Much like ATT's old unlimited bandwidth accounts. It's not until power users start hitting these ceilings to realize unlimited doesn't mean what they think it means. I don't know if Googs every used that kind of phrasing or not, but you can see where some dev never thought that limit would be achievable so there's no reason to really mention it.
- newZWhoDis 772 days ago
  > Leadership at Google ought to recognize this
  It’s not a recognition problem, there’s no “aha” moment that will come from the right person just explaining it to them.
  They simply don’t care, and have no incentive to ever change.
- criley2 772 days ago
  For the record, Google does run a full communications channel for cross-Google support for subscribers of Google One.
- acomjean 772 days ago
  Even my bog cheap hosting lists storage limits and, "inode" limits on its dashboard. We're doing fine on both but way worse on the number of files.
- colordrops 772 days ago
  How about just not using google? There are viable alternatives to almost everything they offer. I haven't been using them for anything for a year now.
  [-]
  - colordrops 772 days ago
    I find that my most often downvoted comments here are those that shed negative light on Google. Curious.
- prepend 772 days ago
  I don’t think it’s reasonable. I think it’s poor engineering.
  Imagine if a hard drive had a limit on the number of files, not just the size. Would that be reasonable.
  Or imagine if someone designed a storage protocol and submitted it to IETF with such a limitation saying “it doesn’t affect the vast majority of users so we set this arbitrary limit to make our lives easier.” It would not make it through review.
  [-]
  - viraptor 772 days ago
    https://en.wikipedia.org/wiki/Comparison_of_file_systems almost every filesystem has a limit. Some document it, some not. Some limit it per directory rather than globally. There are also path limits which are effectively a file number limits as well. But the limit itself is very normal and would make it through reviews without issues.
    [-]
    - Kwpolska 772 days ago
      NTFS and ext4 have a 2^32 limit according to this table. The Reddit poster mentioned in the article [0] has 2 TB worth of space. I’m not sure if that’s decimal or binary TB, but if we assume binary, you get an average of 512 bytes per file, which is pretty reasonable (and on a real drive, you might be limited by sector size).
      But even if you had a 20 TB hard drive, and you wanted a lot of files smaller than 5120 bytes, you can just create multiple partitions. Maybe a little inconvenient, but you can still use your drive to the fullest. Not so much with Google’s 5 million limit (for a 2 TB Google Drive, the minimum average file size is around 430 KB [2])
      [0] https://www.reddit.com/r/google/comments/123fjx8/google_has_...
      [1] https://www.wolframalpha.com/input?i=%282%2F2%5E32%29+TiB+in...
      [2] https://www.wolframalpha.com/input?i=%282%2F5000000%29+TiB+i...
    - yjftsjthsd-h 772 days ago
      Sure, everything has some limit. But the lowest limit I see on that page is 2^32, which is just a touch higher than 5 million. (/s) And far more importantly, none of those filesystems just decided one day to reduce the limit, which is kind of a big deal. (Although funny, in a horrible way; imagine booting up your machine one day and it tells you that your home filesystem needs to drop a few million files before it can mount. Can you imagine?)
      [-]
      - viraptor 772 days ago
        > none of those filesystems just decided one day to reduce the limit
        Yes and no. This works differently from one filesystem to another, but search for "running out of inodes" to find many ways this can happen in reality, way below the technical limit of the filesystem itself. You can end up in situations where you can't create a new file and need to remove more than one to solve the situation. (Or even where removing many files doesn't seem to make a difference)
        [-]
        Dylan16807 772 days ago
        The thing about running out of inodes is that the format command scales them with the size of the drive, and the default is 1 inode per 16KB. So for a standard google account at 2TB or maybe 5TB you'd have 125 million or 312 million inodes.
        They've basically gone and set it up like "largefile" mode, 1 inode per 1MB, forced upon all users. That's a bad default and an even worse mandate.
  - Thaxll 772 days ago
    It's poor engineering to not understand that everything has limits.
    [-]
    - prepend 772 days ago
      Everything has limits. The limits should be based on hardware limitations, not arbitrary limits. The limits should increase over time.
  - dboreham 772 days ago
    > Imagine if a hard drive had a limit on the number of files
    NPM developer?
    <ducks>
    [-]
    - victortroz 772 days ago
      pnpm helped me a lot with this.
  - Kye 772 days ago
    File systems do have a limit on number of files, though as far as I know they're all quite a bit higher than anything in the millions.
    [-]
    - Wowfunhappy 772 days ago
      Of course, with a physical disk there are also things you can do to get around those limitations, like switch to a different filesystem or use multiple filesystems on one disk.
      Using the cloud leaves you exposed to the whims of the cloud provider. They select and switch technologies based on their needs instead of yours.
      [-]
      - Kye 772 days ago
        If you manage to hit five billion files (which seems to be the standard limit), you might be in the market for object storage rather than new frontiers in file systems.
        [-]
        yjftsjthsd-h 772 days ago
        > five billion files (which seems to be the standard limit),
        Article says 5 million? Still a big number, but smaller than 5 billion.
        [-]
        Kye 772 days ago
        We're talking about file systems.
        [-]
        yjftsjthsd-h 772 days ago
        Then where are we getting 5 billion? The most common filesystem limits listed seem to be 2^32, which is a touch over 4 billion, and 2^64, which is Big.
        (i.e. I can see now what I think you meant, but "5 billion" is a much closer match to "5 million" than 2^32)
        [-]
        Kye 772 days ago
        My brain doesn't do numbers so good, so 4,294,967,295 became 4,924,967,295 and I rounded up instead of down.
        [-]
        yjftsjthsd-h 772 days ago
        Ah, that makes sense. Completely understandable:)
      - amf12 772 days ago
        > Of course, with a physical disk there are also things you can do to get around those limitations, like switch to a different filesystem or use multiple filesystems on one disk.
        So, use another Google account? If someone's workflow depends on one Google account, sounds like they need to understand the concept of sharding.
    - prepend 772 days ago
      Of course they have limits, but the limits should be based on the design. And cloud should have a phenomenally high level.
  - IggleSniggle 772 days ago
    Without a limit, you could design a storage system using empty files on google drive with limitless storage potential.
    [-]
    - jjnoakes 772 days ago
      Why couldn't Google just count the file name length and perhaps a small constant metadata size against your quota?
      [-]
      - acdha 772 days ago
        That is a limit. It’s a different one but it tacitly accepts that some check is needed.
        [-]
        Kwpolska 772 days ago
        Not a limit as small as Google set, and not a limit added overnight without any previous announcement or grace period. If you want to limit storing data in names of blank files, you could say “blank files, files smaller than 1024 bytes, and folders are counted as 1024 bytes for quota purposes”. A minimum billable size seems fair to me, and there is prior art (e.g. AWS S3 has minimum sizes for some storage classes).
        aflag 772 days ago
        They already had a total size limit. The complaint is that they introduced a new one without really telling anyone.
    - dwaite 772 days ago
      on real filesystems, directories are effectively files and take up space. I see no reason for them not to have zero byte files take from the storage limits due to metadata.
  - mindslight 772 days ago
    Said like someone who has never run out of inodes! There's a reason `df -i` exists.
bigiain 772 days ago
Ha!
“ a safeguard to prevent misuse of our system in a way that might impact the stability and safety of the system."
Google: We have identified modern web development as a threat to our systems, and have taken measures to ensure npm users cannot store their npm_modules directories on GoogleDrive. Please consider rewriting your nodejs projects in Go.
[-]
- onion2k 772 days ago
  Google: We have identified modern web development as a threat to our systems, and have taken measures to ensure npm users cannot store their npm_modules directories on GoogleDrive. Please consider rewriting your nodejs projects in Go.
  That's a bit unfair on Node and NPM. The modules directory rarely grows to more than 4 million files.
  [-]
  - yjftsjthsd-h 772 days ago
    > The modules directory rarely grows to more than 4 million files.
    The article says the limit is 5 million files, so... 2 NPM projects? That's not really better...
    [-]
    - hnfong 771 days ago
      I think the GP is sarcasm, but I only have like 70% confidence.
- paulddraper 772 days ago
  https://preview.redd.it/tfugj4n3l6ez.png?auto=webp&v=enabled...
- kube-system 772 days ago
  Why do people put their NPM modules in google drive?
  [-]
  - thunderbong 772 days ago
    The reason this happens is Google Drive is marketed as a backup solution, so people add their work folders to it which consists of all these files.
    [-]
    - esperent 772 days ago
      Exactly, I fell into this trap back when I was starting out as a developer. GDrive is a terrible backup solution and it'll overwrite and corrupt files there if you're actively working on them. As of a few years ago anyway.
      It also doesn't help that there's no way to exclude a folder with wildcards so you can't blacklist all node_modules folders.
    - Takennickname 772 days ago
      wouldn't package.json be enough?
      [-]
      - codetrotter 772 days ago
        They probably have a folder that auto syncs with Google Drive https://support.google.com/drive/answer/10838124?hl=en
        Same way that Dropbox works
        So they are not manually choosing what to sync, they just put all of the work and other files they want to automatically sync inside of that folder. This ends up including things like node modules and things like that
        [-]
        gumby 772 days ago
        > a folder that auto syncs with Google Drive...Same way that Dropbox works
        I’m on a Mac and find that google drive’s Mac client is so unreliable as to be useless. It frequently crashes mysteriously, and even when running it doesn’t sync reliably. It feels like nobody is working on the software, and after all the layoffs that could well be true.
        Say what you want about Dropbox, but I’ve never had a problem with it.
        [-]
        kcartlidge 772 days ago
        > I’m on a Mac and find that google drive’s Mac client is so unreliable as to be useless
        True.
        I used Dropbox for many years without issue and moved because of the cost and the issue a few years back with not allowing it on certain Linux file systems.
        Went to Google Drive and performance was shocking in comparison. At the time (don't know about now) it also felt like they had a deliberate policy to avoid showing cumulative file/folder sizes, meaning it was easier to pay for more storage than to find out what took the space and clean it up.
        Since then I've spent a few years using PCloud, which was almost as fast as DropBox and more reliable at syncing than Google. Then I lost the ability to connect the client on a Mac for a couple of days, had a deeper look, and noticed a few files had had silent collisions between changes on Mac and Windows (not at the same time) leaving multiple versions (names suffixed with the machine they came from).
        So I left there and went to OneDrive. To be honest I've found it okay other than one issue on a Mac where if I delete a file for some reason then when it vanishes to the recycle bin it reappears in the OneDrive root on the laptop. delete-immediately gets around that.
        I'm now at the point where I use OneDrive for convenience/reliability but despite the cost have been considering a return to DropBox as cloud syncing seems to be impossible for anyone else to do properly. Is it still as reliable as it used to be?
        [-]
        gumby 772 days ago
        I tried OneDrive for a while but it wouldn’t sync certain files due to characters in their names. The error message wouldn’t tell me which.
        But much better than google which just silently fails.
        NoZebra120vClip 772 days ago
        I don't sync folders. I mount my accounts as network drives.
        As for corruption or overwriting files, I have rarely encountered issues, but Drive has file revision histories as well as a Trash can, so if you notice the problem and it's not spread over 5M files, you can roll back to last-known-good versions.
        [-]
        codetrotter 772 days ago
        > I don't sync folders. I mount my accounts as network drives.
        Implementation-wise under the hood is it really much different?
        In my simplistic, probably overly naive idea it seems that a “network drive” built on top of Google Drive would still cache the files in memory, and improper shutdown or loss of network access could still leave your files in an inconsistent state. Probably they have a lot of advanced logic on top, that could roll back partial updates, or keep track of things so that the updates are not applied before sync is complete.
        But anyways from this point of view sync vs network mounted drive feels like sync is a kind of network mounted drive that flushes to local disk in addition to working with the remote.
        One of these days I want to build such things from scratch myself so that I can experience first-hand the horrors that lie underneath sync / network disk service setups modelled on my idea of Dropbox / Google Drive etc
      - johannes1234321 772 days ago
        Not saying that the approach taken via G Drive is smart, but to your question: For a full backup of your application just storing package.json/package-lock.json is not enough. In case of things going south this might leave you in an unrecoverable state - imagine a package is removed for whatever reason from npm or npm becomes unreachable.
        Now in your risk assessment the question of course is how likely it is that you lose the copy alone your dev system, CI system and productive version at the same time that npm loses it, but the first three happening isn't as unlikely as it seems: You work on a major refactoring and while deploying you run into a major problem and want to roll back, then artifacts from the old version might be gone. Of course there are ways to prevent that. But that's the risk assessment and prevention you have to make. A full backup including node_modules is one way of dealing with that.
      - csomar 772 days ago
        More like your lock file. That being said, if I am paying for storage, that's because I want an exact replica to what I have on disk.
  - kbumsik 772 days ago
    I mean, why not?
    I use both git and Dropbox, because I have multiple devices to sync: laptops, a desktop, and linux servers.
    If you use a desktop and a laptop back and forth, git is quite inconvenient. You don't need to commit and push from the desktop to switch to the laptop with a folder level realtime sync solution.
    Also, git only protects files that are staged, but Dropbox can restore any files to any point of file event (not periodic snapshot, but it keeps all file events). So combining both you are safe from any kinds of mistakes.
    I use Dropbox because Google Drive sync wasn't stable enough for this kind of workflow but I am not sure if it has been improved lately.
    [-]
    - stinos 772 days ago
      I mean, why not?
      Because git combined with Dropbox and the likes can be slow, lead to broken repositories, mad sync conflicts and so on? At least that is what I've witnessed with colleagues. This was years ago though, and I assume you dont have these issues or maybe a different workflow, so this is more like a word of warning for others: test this properly first before just diving into it.
      [-]
      - kbumsik 772 days ago
        Hmm I haven't have significant issues for almost 10 years. And a few years ago Dropbox introduced a folder-level rewind feature so you can easily restore any broken folder.
        Anyway, YMMV.
        [-]
        youngtaff 772 days ago
        Yup, I’ve been running this config for personal projects for over 10 years too
      - Dylan16807 772 days ago
        Maybe if you shut a laptop while it's syncing and then start editing the same things on a different computer without considering that you should finish the sync, but that seems pretty niche.
        Are you sure those people were using their own personal dropbox folders? If they were sharing with each other that's a different thing entirely and requires a lot more care to avoid problems.
        [-]
        stinos 772 days ago
        Are you sure those people were using their own personal dropbox folders?
        Yes, one single account.
        while it's syncing and then start editing the same things on a different computer
        IIRC this was one of the issues. Not necessarily because of shutting down, but just when using multiple computers one after the other and Dropbox not having synced in between. Not that niche depending on line of work, think lab-like setups where one thing gets deployed to multiple machines and deployment then gets tested. In one company the 'oh but you have to wait for dropbox to sync before doing that' and 'how?' responded by 'urm, yeah, check the timestamp on all files in that directory' was a rather common thing, often leading to stupid issues. Put git in the picture there and things won't get any better.
        [-]
        Dylan16807 772 days ago
        Oh, using it for deployment, so you're moving directly from one computer to another right after making a change and then using those files? I could see how that causes problems sometimes. I think the use case of "here is my checkout, I edit it on multiple computers, I don't use it as anything else" is a lot more reliable.
        [-]
        dreamcompiler 772 days ago
        I think you're on to something here. It might be both safe and convenient to use git with Dropbox iff you create a bare git repo that is not in a Dropbox folder, and then specify a separate worktree that is in a Dropbox folder. That might give you the best of both worlds, but I haven't tested the idea.
        [-]
        Dylan16807 772 days ago
        There aren't any fundamental issues with putting the git repo in Dropbox. Under normal circumstances, to cause a conflict you'd have to do something that writes to the git folder, then switch computers, pull up the same checkout, and do something else that writes to the git folder, all within less than half a minute.
        And even then it's just something like having disagreement about an index file. Easy to resolve.
        [-]
        dreamcompiler 772 days ago
        The fundamental issue is that Dropbox changes are non-atomic w.r.t. what git considers "atoms." See my other comment in this thread. You're right that there's usually not a problem with a single contributor, but if you ever have two or more contributors sharing a Dropbox-contained repo it will be bad news.
        [-]
        Dylan16807 772 days ago
        Sure, don't share. But there's a good use case for putting a single-user checkout into Dropbox, and that's because it adds convenience and gives you continuous backups.
        Though outside of garbage collecting, git doesn't have many files that actually get changed. Mainly the ref pointers which are easy to resolve and the logs which aren't very important.
      - dreamcompiler 772 days ago
        I've done this in the past but stopped. The problem is there are no atomic transactions in Dropbox. If two people are pushing changes to a repo at the same time using only git, it's not a problem because git understands atomic transactions. But if that repo happens to be in Dropbox all hell can break loose because the various files inside the .git directory are not being synched atomically.
        A git repo in Dropbox can appear to work okay with a single contributor where that contributor is only working on one computer at a time. But as soon as you add a second contributor, you very much should take Dropbox out of the process.
  - drusepth 772 days ago
    There's a surprisingly large number of groups of people that use Google Drive, Dropbox, and similar for "source control" and/or collaboration on projects instead of real VCS.
    [-]
    - bigiain 772 days ago
      I have worked with colleagues who expect to be able to keep their current work in Dropbox/GoogleDrive, so they don’t need to take their work laptop home every day but still have easy access to their current work from home.
      “Hey, why can’t I get Dropbox to work on this laptop? I need it so can do emergency bug fixes from my personal laptop at home!” “Yeah, nah. Remember that security policy you signed as part of your employment contract? The one that says exfiltrating company or client data or code to non corporate systems is a fireable offence?”
      [-]
      - lenkite 772 days ago
        Google Drive many-times IS the enterprise approved cloud storage/sharing solution.
        [-]
        justeleblanc 772 days ago
        Writing the files to your home hdd, even using the approved solution, is exflitration of data.
        [-]
        bigiain 771 days ago
        I want to say that the sort of place that had GDrive as their "approved cloud solution" is unlikely to also be the sort of place that has the sort of data the required firing-offence rules for exfiltration to non company systems, but I know that's not the world we actually live in...
        I'm reminded that Lastpass got popped via an employee running a well out of date version of Plex with known RCE exploits and getting a keylogger installed:
        “This was accomplished by targeting the DevOps engineer’s home computer and exploiting a vulnerable third-party media software package, which enabled remote code execution capability and allowed the threat actor to implant keylogger malware,” LastPass officials wrote. “The threat actor was able to capture the employee’s master password as it was entered, after the employee authenticated with MFA, and gain access to the DevOps engineer’s LastPass corporate vault.”
        The hacked DevOps engineer was one of only four LastPass employees with access to the corporate vault.
        I mean, I know exactly why. But why the fuck was the corporate Lasspass vault available to a staff member's home machine running Plex? Is the expense of a corporate vpn-locked laptop and a pair of Yubi keys too much for a fucking _Password Vault_ senior developer??? "Should we spend a couple of grand buying a work laptop for this guy who literally has the keys to our kingdom? Or do we save a few bucks and get him to work on the machine he already owns, the same one he runs outdated media server software and probably bittorrents all his porn with? What could possibly go wrong?"
    - bdcravens 772 days ago
      I don't do that, but I was once bitten by iCloud. I always put my code in ~/code and copy it over when I get a new machine. I once had the idea to just put it in ~/Documents/code, thinking it'd be automatically synced. Turns out that iCloud doesn't sync dotfiles.
      [-]
      - brycewray 772 days ago
        https://stackoverflow.com/questions/35853139/can-git-and-icl...
    - cableshaft 772 days ago
      I use Git for version control but for some of my personal projects I also have a copy in Dropbox, so I can make changes on my laptop downstairs, not be ready to commit it, save it, go upstairs, and continue work from my desktop pretty much seamlessly.
      Some of my projects aren't set up this way, and it can get annoying to have made a few small changes and be forced to commit them in order to pull them from upstairs.
    - lazide 772 days ago
      That seems really illadvised.
      [-]
      - bigiain 772 days ago
        Cynical view, a lot about npm seems really ill advised. One additional bad decision after that choice seems easy enough to go along with… ;-)
        [-]
        tomjen3 772 days ago
        A lot of it is. And yet, it works so well it conquered the world.
        [-]
        bigiain 772 days ago
        So did Mc Donald’s and Starbucks.
        I’m not entirely happy with the idea of living in a world conquered by the lowest common denominator solutions.
        [-]
        tomjen3 772 days ago
        If we somehow changed the rules so that Mc Donald and Starbucks were illegal, we would not end up with local artisinal coffeeshops on every corner, we would likely have a lot fewer places to get good coffee.
        [-]
        raverbashing 772 days ago
        True, it's certainly not that intuitive
        Brian_K_White 772 days ago
        McDonalds is my favorite answer to the popularity argument. Popularity is a data point to factor, but it's only one and it's meaning and value are entirely context dependant. Basically means nothing by itself.
        listenallyall 772 days ago
        Get over yourself. McDonald's is far from "lowest common denominator." Go travel the world a bit and see how real people live and what they actually eat every day.
        [-]
        wincy 772 days ago
        We had a drive thru grass fed burger place open up across the street from McDonald’s and it was gross. It lasted maybe a year. McDonald’s makes pretty good burgers, not what I want every day for sure but in a pinch they aren’t bad.
        throwaway50601 772 days ago
        Speak for yourself, I find it great. Much better than the alternatives like pip or nuget. And the ecosystem has endless amount of solutions ready.
        [-]
        heleninboodler 772 days ago
        And I love big macs, but they may not actually be a good thing.
        [-]
        throwaway50601 772 days ago
        In this case it is a good thing. And as I said, still much better than the alternatives.
        yjftsjthsd-h 772 days ago
        What do you think npm does better than pip?
      - nickphx 772 days ago
        everyone is differently abled.
  - RobotToaster 772 days ago
    I'm guessing, but github desktop by default uses the documents folder. By default Google drive backs up the documents folder.
  - NoZebra120vClip 772 days ago
    I use Google Workspaces at work, and I have configured my notebook more like a netbook. Every possible file and folder gets stashed in Drive, and the agent runs to integrate with the Windows filesystem. This is especially important for Company Data and Confidential Materials.
    I have modest storage needs in my role, but imagine a developer, Big Data user, or video editor used it. As long as they could tolerate the latencies, they might stash quite a bit in there.
  - _v7gu 772 days ago
    I don’t like to do it, but I like to sync my projects folder and google drive does not offer me an “exclude folders named node_modules” option
    [-]
    - uallo 772 days ago
      tresorit can do that:
      https://support.tresorit.com/hc/en-us/articles/217103697-Exc...
    - rootw0rm 772 days ago
      i highly recommend rclone
  - tomjen3 772 days ago
    If I was teaching anybody programming I would teach them to copy the folder over every time they added a new feature, rather than teach them git at the same time.
    With some C++ libraries or node_modules that can run up fast.
    [-]
    - q7xvh97o2pDhNrh 772 days ago
      What? Why would you teach someone to do things the wrong way?
      [-]
      - Brian_K_White 772 days ago
        It's not teaching them the wrong way. It's teaching them a simple version of the right way.
        You can't do everything at once and git is counter-productive on day one, and the most important thing was they find programming remotely interesting at all.
        The next most important thing is always working on a new copy.
        How exactly you make those new copies barely matters at all compared to just having them at all.
        Using something arcane and unforgiving and inscrutable and complex like git to accomplish those copies is way way down on the list.
        The new student is barely interested in the actual coding to make a game if you're lucky. They would rather mow lawns than learn administriva like git.
        It could even be argued as irresponsible to hand git to babies and expect everything to be fine. Just because we all have about 5 git commands we use 99.9% of the time, and have no problem 99% of the time because we learned how to "hold it right" and just always carefully do things in the right order, does not mean git can actually be made safe and simple. Those 5 or so commands aren't actually enough.
        [-]
        q7xvh97o2pDhNrh 771 days ago
        I thought about it more, and I'd say we actually agree on the principle of teaching simple and building up. It's more a question of how different teaching techniques present the value of source control.
        The hard thing about source control is not the how — using the git CLI is only slightly more complicated than copying a folder. Git turns that into a very straightforward directed graph, and the CLI gives you a few commands to move around the graph, sprinkling some more edges and nodes wherever you like. It takes about an afternoon to figure out once you're onboard with the why.
        The why of source control is the thing — and our industry is already full of people who don't understand it. That's how we get people who treat git as a "save" mechanism to be invoked whenever it's been a while since the last commit, or branches with one sloppy commit message after another [1], or entire repos that are just spaghettified carelessly-merged hairballs of commits.
        Mastering the why of source control means learning when to cut a commit, what shape that commit should be, and why that is.
        And I think that's more my objection to the teaching approach outlined here — without any other context, it reads to me like it's not quite focusing on the right things. At worst, it's inducting the student into the industry-standard cargo-cult approach to source control. [2] If the why isn't learned, then it doesn't really matter whether the specific motion is copying folders or typing in git CLI commands.
        Ideally I think source-control techniques could be introduced right when they're going to be interesting or fun — when it's time to build something together with someone else. Then that set of lessons could start by copying folders and eventually building up to git — and each lesson could show how good, disciplined use of source control concretely improves the collaboration process.
        [1]: https://tbaggery.com/2008/04/19/a-note-about-git-commit-mess...
        [2]: https://xkcd.com/1597/
        [-]
        Brian_K_White 769 days ago
        "using the git CLI is only slightly more complicated than copying a folder"
        happy path
        em-bee 772 days ago
        i don't entirely agree with not teaching git to a new programmer, but if we look at the extreme it makes more sense. how would i teach a nine year old? exactly like that: copy the project.
        actually, a good way to think of it is releases. even with git, i keep a copy of each project release, that itself is not stored in git.
        [-]
        hombre_fatal 772 days ago
        A new programmer can just use a git UI like the Github app. It abstracts everything away and leaves you with the obvious benefit of being able to commit a succession of snapshots that represent logical groupings of changes.
        I haven’t used cli git since uni. GUI is better for basically everything except for a few rare and advanced commands.
        [-]
        fbdab103 772 days ago
        Yet that is still just one more impediment to getting someone off the ground. You are already throwing programming concepts (variables, and loops, and conditionals, oh my!) + big scary IDE + likely first-time command line, etc onto their plate. Invoking yet another weirdo program into the paradigm is not helping a beginner.
        Jeremy Howard (of Fastai fame) has a good analogy that we do not make people sit in a classroom for a semester to learn all of the theory of baseball. Instead, we give them a bat and a ball, and layer on the instructions of how the game works. You can get a good approximation of the game with just a couple minutes of instruction and refine the understanding from there.
        [-]
        em-bee 772 days ago
        if someone is at that level they are learning the language and do tutorial or student projects where version control doesn't yet matter. which is fine. once they have done a few of those, it's time to introduce basic version control.
        [-]
        fbdab103 772 days ago
        I managed to finish college without a real VCS. Would it have made some things easier? Sure, but again, add it to the pile of things I needed to learn and did not have enough time to study.
        [-]
        em-bee 771 days ago
        as far as college is expected to prepare students for work, version control should absolutely be part of that.
        i don't expect a fresh graduate to be a skilled git user but if the nature of development at work is substantially different from what the student learned at school, then they will have a much harder time adjusting.
        they should at least be familiar with all the concepts that they will encounter in a junior job and not be faced with having to relearn a completely different development process.
      - Xylakant 772 days ago
        You can’t teach all of the things at the same time. People will just freak out, be confused and learn nothing. There’s a lot of the things that you’ll teach “wrong” at the beginning. You’ll never talk about CI/CD at that stage, deployment, packaging, etc. either, despite a lot of this being a requirement for a lot of software engineering projects.
      - tomjen3 772 days ago
        As others have pointed out, it is not teaching others wrong things: copying the directory is (modulo optimization) what Git does. However if you dump people straight into everything at once they will become extremely confused and won't learn anything.
        What I want to do is take one step at a time, and build on what they already know.
        Then when they can program, you can show them how to use Git and the superpowers that comes with it.
        [-]
        q7xvh97o2pDhNrh 771 days ago
        Woops, just saw I replied to a higher comment in this tree instead of to you as the asker of the original question.
        FWIW, my thoughts are basically that it's a question of sequencing the lessons and framing the overall purpose: https://news.ycombinator.com/item?id=35411405
      - renewiltord 772 days ago
        I must apologize. We have purposely trained them wrong. As a joke.
      - jbverschoor 772 days ago
        Why would you have nodemodules in every project? It shouldn’t be a local directory anyway
      - ikiris 772 days ago
        They’re probably a math teacher and want them to learn about punch cards before introducing complex concepts like a floppy disk.
    - charlieyu1 772 days ago
      I always like the Odin Project way of teaching - you start with setting up virtual machines, install IDE, learn git before you get to write your first line of code.
      [-]
      - Brian_K_White 772 days ago
        This sounds a bit like Lockhart's Lament about how we teach math is why everyone hates math.
        The analogy was imagine if we taught music that way, where you spent years learning fundamentals and theory before ever being allowed to touch an instrument.
        [-]
        rzzzt 772 days ago
        Kodály had reasonable success with this approach across the world: https://www.classical-music.com/features/musical-terms/kodal...
        [-]
        blep-arsh 772 days ago
        His idea was to begin with one's voice as a musical instrument, so one would interact with music directly at first, learning the theory and the notation along the way.
        [-]
        rzzzt 772 days ago
        The lady quoted in the article agrees with you but also seems to support why I thought of Kodály in the first place:
        > "Eleven years of piano lessons taught me something about playing the piano > but almost nothing about music," she has said. "I was skating on the > surface. If a child is shown a written crotchet they have no physical > understanding what’s behind that. Kodály musicianship puts petrol in the > tank in that it gives them a profound experience of music-making, through > the voice, building up a repertoire of songs and giving them the > unconscious knowledge of pitch-matching, walking the pulse, rhythm, > phrasing and improvising – before making it conscious."
        So you will not be touching the piano before all of this takes place and you will be learning the fundamentals and theory, even if in a playful way.
        ipaddr 772 days ago
        Playing a classical instruction you would normally do that. Maybe you learn how to hold, change reids, do scales first but without theory you can't read notes or understand what to do with them. Without reading notes you can't play songs off of sheet music.
  - bluedino 772 days ago
    I worked with a guy that put all of our Rails apps in his Dropbox
  - throwaway689236 770 days ago
    I even exclude them from my TimeMachine backup, they're pinned.
- inamberclad 772 days ago
  Out of curiosity, I checked how many files were in my node based autocompleter:
  $ find .vim/plugged/coc.nvim/node_modules/ | wc
```
  12715   12715  952711
```
  [-]
  - epilys 772 days ago
    Should be wc -l
    [-]
    - switch007 772 days ago
      The first column of the output is the lines
    - ElectricalUnion 772 days ago
      Without parameters, wc prints `newlines`, `words` and `bytes`.
  - rovr138 772 days ago
```
    ~/Development   master ±  find . | wc
     1029821 1237285 118990227
```
    ~/Development contains 3 projects I worked on this past week. Only 3 because this is a new laptop. I'd hate to think what this looks like on my old laptop. But I'll run it later.
mort96 772 days ago
Hmm, there was a HN thread about this a few days ago [1] where everyone seemed to attack people for even considering the idea of storing 5M files in a cloud storage solution, going so far as to argue that even disclosing such a limit would be unreasonable to expect.
In this thread, the prevailing thought seems to be that having a 5M file limit is unreasonable and adding it without disclosing it is egregious.
Just a curious thing I noticed.
[1]: https://news.ycombinator.com/item?id=35329135
[-]
- listenallyall 772 days ago
  The two threads are talking past each other, not really debating the same thing. This thread is pretty much ignoring the potential unreasonableness of storing 5 million files on a free/cheap cloud service without other backups, which is key point over there. The other thread is mostly ignoring the fact that Google seems to have imposed this limit arbitrarily and without notice, possibly causing data loss, which is the key point here.
  Shows the power of framing -- focusing on different components of the same situation and facts can often lead to entirely opposite opinions or conclusions.
  [-]
  - tacker2000 772 days ago
    Also shows the “power” of whatever the top/first post was about, that topic will be framing the conversation.
    [-]
    - listenallyall 772 days ago
      Very true, I agree - but before there's a comment, there's a headline.
      "5M item limit for Google Drive: File unable to generate or upload due to 503" (mostly passive, it's a user problem) vs "Google Drive does a surprise rollout of file limits, locking out some users" (Evil Google secretly screwing customers over)
    - foundart 772 days ago
      I agree, to the point that very frequently I just collapse the top comment immediately upon arriving on the HN discussion page because I want to get more perspective.
- kmos17 772 days ago
  Interesting observation, I have noticed too that contradictory opinions are very frequent. My completely subjective impression has been that there is a loud contingent that just likes to have a contradictory opinion, a form of mansplaining that is frequent in engineering circles. Or maybe it’s just that people in general are driven to give their opinion on things they disagree with.
  [-]
  - ben_w 772 days ago
    > people in general are driven to give their opinion on things they disagree with
    I’m definitely more likely to reply when I disagree than when I agree.
    I do try to make a habit of giving positive replies also (and not just building on what was said before, as I’m doing here, nor just upvote silently), as constant contrarianism is… kinda dull, especially from the outside.
  - drstewart 772 days ago
    What does gender have to do with here?
    [-]
    - PopAlongKid 772 days ago
      Try "condescend".
      [-]
      - drstewart 772 days ago
        Eurosplaining
- Wowfunhappy 772 days ago
  Hacker News isn't a hive mind and different people have different opinions.
- midland_trucker 772 days ago
  Goes to show that nailing down the 'HN crowd' opinion is more difficult than it seems!
- rsync 772 days ago
  5M files. Pfft.
  If your cloud storage account can’t gracefully handle that and keep that handling totally transparent to you you should move on.
  I’m sure there must be some provider out there that can handle that uninteresting workload …
- layer8 772 days ago
  Different time zones maybe, or weekday vs. weekend.
- CatWChainsaw 772 days ago
  This would suggest that HN is not immune from addiction to outrage culture.
- theknocker 772 days ago
  [dead]
throwaway_ab 772 days ago
I pay for 5 TB and planned to use the drive to store a copy of my data.
Things I store that have lots of files:
- The frames for my Timelapse videos = 400,000 files
- The files in my Eagle app photo database = 400,000 files
- Other image files, my programming repositories, documents, music, stable diffusion Deforum frames = 400,000 files
80% of these files I've accumulated in the last 12 months and can see myself easily hitting this 5,000,000 file limit well before I run out of TB's
So now I know I will never be able to use all the space I'm paying for, I'm going to stop uploading my files and instead search for a proper backup service, something I should of researched in the first place.
Anyone here have any recommendations for a backup service?
[-]
- unixhero 772 days ago
  https://rsync.net
  https://backblaze.com
  https://jottacloud.com
  [-]
  - kuratkull 772 days ago
    +1 for rsync. Have used them for a couple of years. I set up a script to backup my stuff there a couple of times a day. Their daily snapshots have also saved me from my own mistakes on several occasions.
  - wooptoo 772 days ago
    This is the first time I hear about Jottacloud and I'm pleasantly surprised. Their combination of bucket-like storage + mobile app + web app + rclone support seems to tick all the boxes for a Google Drive replacement.
  - sickcodebruh 772 days ago
    Dropbox’s omission is interesting. I’m using a large paid account to backup a few TB of data spanning many years. Is there something I should know? Losing it would be devastating.
    [-]
    - unixhero 771 days ago
      Dropbox is an American company. Most hackers does not trust American hosting companies because of the NSA leaks confirming everyones paranoia. I wouldn't trust Dropbox with a ten foot pole.
    - orhmeh09 772 days ago
      I don't think Dropbox is a backup service per se.
      [-]
      - plasticsoprano 772 days ago
        Dropbox offers a backup[1] service and has since, I believe, last year.
        [1]: https://dropbox.com/backup
- n3storm 772 days ago
  https://www.hetzner.com/storage/storage-box
  This is europe, maybe from america it gets a bit slow though.
  [-]
  - smileybarry 772 days ago
    I tried using this for a media streaming project, but I couldn’t get good enough bandwidth to stream high-bitrate media. I ended up paying more to stream files from a storage provider that caches on CDNs.
    I do recommend it for other storage though: snapshots, SMB means you can mount it directly on Windows (and iOS!), cheap, and I personally like Hetzner.
    Their Storage Share service (running on Nextcloud) is also good if you want managed syncing, a-la Dropbox. Same server as a Storage Box but with managed Nextcloud on top.
  - ungamedplayer 772 days ago
    And a bit funky with legal things.
    [-]
    - lars_francke 772 days ago
      Can you expand on that?
      [-]
      - ungamedplayer 772 days ago
        There are laws and legal agreements in place that prevents certain sensitive data being stored in different legal jurisdictions.
        Any gov or military contract data usually has specific requirements stating data storage requirements.
        [-]
        dym_sh 772 days ago
        if you shoping in HN comment about where to backup your USA-GOV contracts, you might need to review your priorities
        [-]
        ungamedplayer 768 days ago
        Some people here might have side gigs with private data, often without even knowing.
        There are other data laws in other countries that are not USA gov that explicitly talk about customers data storage and it's handling.
- Dalewyn 772 days ago
  >Anyone here have any recommendations for a backup service?
  Two or more NASes, rotated out on a regular basis to satisfy the 3-2-1 Backup Rule.
- wankle 772 days ago
  I was thinking the same thing but I tend to create a large compressed backup file and upload that instead. My backup file is currently under 100 gigs, the most critical files not my whole system's files. If I backed up my gaming account (I don't need to since I use Steam Cloud) and compressed it, it'd probably only be another 50 to 100 gigs. This way I should never hit the 5 million file limit.
  Having said that, I feel 5 million files is too low for a lot of people.
- rsync 772 days ago
  Just email [email protected].
  We have a HN discount and we’ll get you up and running this weekend.
  You can even direct transfer from google drive to your new account here - no reason to use your own bandwidth.
- shitlord 772 days ago
  I've used Backblaze for years and never had a problem. There's no upload file/data limit, but downloading it back can be slow. There's an option for them to mail you a hard drive but I never tried it.
  [-]
  - axelsvensson 772 days ago
    Backblaze is terrible, but you will likely only find out when you need to restore.
    When backup excluded some files, they blamed antivirus software and recommended to go without.
    They actively deleted their only backup because the client couldn't read the original files (which was due to disk error).
    They admitted that there isn't any guarantee or even effort to make the backup consistent - after asking me to explain the concept.
    The only reason I still have my data is because of the very expensive cleanroom disk rescue that they of course refused to pay for - because why try to do what you can to compensate for failing at your only job?
    [-]
    - lasftew 772 days ago
      It sounds like these problems are related to their end-user backup solution - can't comment on that as I've never used it.
      However, when referring to Backblaze, I think most people here refer to their nice (and cheap) S3-like cloud storage solution, which works perfectly with the likes of restic, rclone and friends. That's probably what you should use if you care about control.
  - nikanj 772 days ago
    At least 2-3 times per year Backblaze goes into a safety freeze, and the only way to unclog the works is reinstalling+inheriting the backup. Which involves re-uploading every single file.
    Would not choose Backblaze if I was choosing today.
    [-]
    - massysett 772 days ago
      Due to the safety freezes and the high expense of Backblaze when backing up multiple machines, I have switched to iDrive. So far it seems to be working ok.
  - sib 772 days ago
    I've used it for many years. About 18 months ago had a complete motherboard failure on a MBP and had to restore a bit over 2TB to a new computer. I used the "mail me a drive" option and it worked fine. Only challenge was that it took about 1 week for them to create the disk from their backups.
  - shortlived 772 days ago
    I’ve used backblaze for a decade on Mac and windows. I’ve done selective restores and also used their mail you a hard drive option.
- barsonme 772 days ago
  tarsnap!
  [-]
  - shortlived 772 days ago
    Yes but read about restore time first - https://github.com/Tarsnap/tarsnap/issues/333
- bobbean 772 days ago
  Could you just zip them?
- notRobot 772 days ago
  Backblaze seems decent
  [-]
  - axelsvensson 772 days ago
    But they're anything but. https://news.ycombinator.com/item?id=35397752
  - mthoms 772 days ago
    I've had a good experience with Backblaze on MacOS. But I'd still consider it as a secondary backup (a backup-backup?) to use in addition to a local backup like Time Machine.
  - je42 772 days ago
    Drive was decent until now.
- karmasimida 772 days ago
  Just your files that is it.
- speedgoose 772 days ago
  Tar.
- MagicMoonlight 772 days ago
  Why are you storing a video as 400,000 individual frames?
  [-]
  - dumpster_fire 772 days ago
    Time lapse videos are created from photos taken over a period of time. If you create a 30fps 10 minute video, that's already 18000 photos.
    It's quite a pain in the ass to import, color grade, deflicker etc. So one would usually wait till there are a few projects are taken, and process them all at once. If this is your hobby, you can easily hit 5000000 photos in a couple of years. Especially if you're doing something like an ultra long time lapse of say, the construction of a building using two cameras.
    If the question is "why don't you delete the photos after creating the time lapse video?" It's called photo hoarding haha.
    Anyway that's irrelevant, because if I paid for a certain storage size to backup my hoarding habits, I expect to be able to use the capacity, and not have some random ass limits.
- TedDoesntTalk 772 days ago
  Backupsy beats all others I’ve tried including on price.
- AstixAndBelix 772 days ago
  Why don't you look into AWS Glacier?
- bboygravity 772 days ago
  Anything Synology is pretty great IMO.
fencepost 772 days ago
If the number of users affected is as 'vanishingly small' as a Google spokesman indicated then you'd think they'd be able to contact them - at least the paying customers?
[-]
- lazide 772 days ago
  Abusive behavior is almost always long tail (unless it becomes well known that folks can get away with it, then it ‘fattens up’).
  They almost certainly did contact those specific customers, but are sending a warning out publicly for folks who would be a problem so they go somewhere else/don’t do it to begin with.
  It’s the ‘police press release’ method of community moderation, like announcing a speed trap.
  Cloud services like G Drive depend on oversubscription anyway. If every customer hit the limit, it would be a problem for them.
  [-]
  - mynameisvlad 772 days ago
    This is Google; I don't know why you'd be giving them the benefit of the doubt in terms of communicating with their customers. We all know they don't.
    The fact that threads like these even exist indicate that Google did not, in fact, tell their users:
    - https://issuetracker.google.com/issues/268606830?pli=1
    - https://forum.rclone.org/t/new-limit-unlocked-on-google-driv...
- MikeDelta 772 days ago
  Makes me wonder: if the amount of users affected is vanishingly small, then why implement it, as it sounds like a vanishingly small problem to me?
  [-]
  - hennell 772 days ago
    Faulty logic. Just because the user count is small doesn't mean the problem is.
    Banning the letter ö from names on hacker news would affect a vanishingly small amount of users. But if threads needed 100 times more processing and were much slower to load when a user with a ö posted, the problem itself is quite large.
    Without knowing the full circumstances of why they're doing this, what problems it solves you can't really say if it's a good solution or not. We can however absolutely complain about the "no notice, no advice, no assistance" way it was implemented.
  - _a9 772 days ago
    Probably a very small amount of people abusing it somehow (How a lot of files can be abused i dont know), like how a small amount of people abused the old gsuite unlimited plan to store petabytes for $12/m
    [-]
    - alpaca128 772 days ago
      Then why doesn't Google use their standard solution and just ban those vanishingly few users, pointing to an ambiguous "terms of service violation" and then ignoring all their support requests for a lifetime?
      [-]
      - Gigachad 772 days ago
        How is that better than just limiting their account until they delete the files?
        [-]
        alpaca128 772 days ago
        It's not better, but based on past occurrences it seems to be the favorite method applied by Google and <insert any multi-billion IT corporation>.
- eviks 772 days ago
  That's what they are doing - via an article in a reputable news organization!
hedora 772 days ago
“In practice, the number of impacted users here is vanishingly small."
Well, yeah, I imagine they’re moving elsewhere.
Seriously though, do people actually trust them not to randomly intentionally break stuff at this point?
[-]
- andyp-kw 772 days ago
  Google have a history of acting now and communicating to customers later.
  Just look up the number of issues developers have with the Play Store.
rsync 772 days ago
I see it speculated, downthread, that this is a response to modern web-dev and node (?) creating millions of files, etc.
I can’t comment on that but I do know that modern, encrypted, archive tools such as duplicity and borg and restic “chunk” your source files into thousands (potentially millions) of little files.
We see tens of billions of files on our zpools and have “normal” customers breaking 1B … and the cause is typically duplicity or borg.
[-]
- jeffbee 772 days ago
  At a previous company the biggest cause of files in a particular context was that "oh my zsh" was installed for each of thousands of engineers by default. It often seems as though "modern framework" just means it is an abusive piece of junk.
- Dylan16807 772 days ago
  Billions seems weird for normal customers. Duplicity appears to put its chunks into 25MB files, I know for sure that restic puts them into 4MB or more recently 16MB files, and Borg looks like it puts them into 500MB files.
  [-]
  - rsync 771 days ago
    Not average, but normal.
    What I mean is, they just have a plain old rsync.net account and pay no premium nor incur any penalty for using all of those inodes.
    You are correct: the average customer does not use billions of inodes.
    [-]
    - Dylan16807 771 days ago
      Well I interpreted "normal" to mean "does not have many petabytes of data", and if those tools are reaching a billion files with smaller amounts of data then I'm very confused about how it's happening.
greatgib 772 days ago
Good remainder again that "the cloud is just someone's else computer"!
In my experience, GDrive is a piece of crap with a lot of weird behaviors and easy ways to lose your data if you sync your computer with it.
The worse here, as said by multiple persons, is not to have a limit. A limit on their service is fair. It is that this limit is undocumented, and that their key selling point is to shout everywhere that if you pay you will have "unlimited" storage. And that it will scale more easily than using your own "not cloud" backups.
[-]
- Gigachad 772 days ago
  Do they advertise unlimited storage? When I looked, the top plan was 30tb. To hit the file cap, you have to have 30tb of files averaging no more than 6mb.
  [-]
  - greatgib 771 days ago
    I don't know if it is still the case but it used to be advertised like that. For exemple :
    https://support.google.com/a/answer/6034782?hl=en Unlimited storage With G Suite Business, each user in your organization can store unlimited Gmail messages, Google Photos, and files in Drive.
    And I did not search for example, but I'm quite sure that they also advertised google one like that. Something like no need to keep files in your phone anymore ever, unlimited storage with the cloud.
  - ElectricalUnion 772 days ago
    In theory, if name metadata is "free", you can make a infinite ammount of 0 byte files with very large paths and tags.
    [-]
    - andromeduck 772 days ago
      Then just charge for metadata over a certian file count ot size?! Can be a token amount.
      [-]
      - Gigachad 772 days ago
        Drive is a consumer product. Having to explain to a user why they were charged an extra 5 cent because of a feature they don't even understand is not viable. If you want no limits granular pricing. Use S3 or the Google version of it.
        They provide exactly the product you want, it just isn't Drive.
        [-]
        andromeduck 772 days ago
        Just count it towards the storage limit as metadata or some shit.
    - Gigachad 772 days ago
      Whats the chance someone created some FUSE mount for storing data in 0 byte files.
      [-]
      - lazide 772 days ago
        100% I imagine.
        GDrive does have API rate limiting which either way is going to make it slow/useless for REALLY large data storage.
        [-]
        Gigachad 772 days ago
        I've noticed over the years they have been cutting off all the unlimited stuff. Docs used to be unmeasured and unlimited, now they actually count the doc size to your storage, which is still effectively unlimited under non abuse use cases.
exabrial 772 days ago
Once again: Don’t use Google for anything crucial or critical. Not Google Cloud, Google Docs, Google Drive, even Gmail is becoming a liability.
Real Engineering involves developing forward looking designs and maintaining backwards compatibility. It involves a release schedule. It involves communication channels and releases notes. It’s hard. It’s unsexy.
Google treats their product lineup with the seriousness of a social media platform. They don’t care about your puny business; even if it means the world to you, it means nothing to them.
Beldin 772 days ago
"Vanishingly small": a number of users small enough to be downplayed, but large enough so that neither an individual approach to the problem would work, nor that the problems could be ignored. Suspected to be a complex number.
[-]
- anjel 772 days ago
  "affecting" is quite the euphemism, here
stinos 772 days ago
Anyone knows how this works legally? You buy a service, suddenly without notice the services changes features. Does the small print allow for that? And how is this 'ok' in software but probably not anywhere else (pretty sure a service contract for an elevator doesn;t allow the service company to just say "we're going to limit the ?amount of times your elevator goes up and down to 100 times a day now")
[-]
- halestock 772 days ago
  > Does the small print allow for that?
  Almost always, yes.
- nikanj 772 days ago
  It’s the ”we have infinite money for lawyers” loophole, commonly used by large corporations
jmyeet 772 days ago
Some people will let technical limitations define a product. Others will have the product dictate the technical design. This, to me, is an example of the former.
I don't know the serverside implementation of Google Drive but imagine the files on your Drive correspond to files on something like an ext4 filesystem. In this scenario, each file has a cost (eg in the inode table) and there is wastage depending on what your block size is. Whatever the case, Drive seems to treat files as first-class objects.
Compare this to something like Git. In Git (and other DVCSs like Mercurial), you need to track changes between files so the base unit is not files, it's the repo itself. It maps to your local filesystem as files but Git is really tracking the repo as a whole.
So if you were designing Google Drive, you could seamlessly detect a directory full of small files and track that directory as one "unit" if you really wanted to. That would be the way you make the product dictate the design.
fsh 772 days ago
Very interesting that google chose to do this instead of fixing the software that caused the limitation. No wonder that their products are seen as a joke in the business world.
[-]
- pleb_nz 772 days ago
  Blows my mind people still trust them TBH. How many years of poor behaviors and bad decisions does it take for people in the know to move on. Understand there are a lot of non tech savvy users who wouldn't know, but I would expect hackernews users to be better informed on average and would have moved on
  [-]
  - Gigachad 772 days ago
    I’ve been using drive since it came out and I’m yet to run in to a single problem with it. It’s been the most reliable storage system I’ve ever had.
    [-]
    - pleb_nz 772 days ago
      I'm not talking about drive specifically, although now it seems it too can be added to the lists of shitty giggle practices and decisions.
- karmasimida 772 days ago
  Fixing software for like 0.01% of their whole user base, who plays flat fee anyway seems bad investment
  I have to agree with Google in this case, 5 million files for an account is borderline abuse for object storage system.
harshaw 772 days ago
The challenge with running cloud storage is that you have to think around the corners for usage and shape customer behavior with pricing. Seems like google didn't want to do this or was too lazy (sorry). Millions of files will always be a problem. The metadata costs more for these users, it's impossible to manage, hard to cleanup, etc.
The problem with google is if they fuckup their service they make it the customers problem. Other places if they fuckup, its more viewed as a one way door. You can sunset old products with (in this case, unlimited files), but you never put in a new restriction.
burnished 772 days ago
The limit is 5 million files. The article feels a little overblown in its reaction.
[-]
- Dylan16807 772 days ago
  It's easy to hit that many files, and they made the change without warning.
  If you're on the $10 2TB plan and your files average 100KB, 5 million files means you can only use a quarter of the space you're paying for.
  And before you call that unrealistic, my system drive averages 200KB per file and my main personal data drive is close to 300KB per file. Both would hit the limit if I wasn't using something fancy to pack files together.
  And then there's the $18 5TB plan with the same limit on file count. Even completely ignoring the option to pay for extra terabytes.
  [-]
  - burnished 772 days ago
    ..could you not split it up over multiple, smaller drives then?
    I get that this could be frustrating (being surprised specifically), but it seems a pretty reasonable restriction so it seems odd to criticize it like this.
    [-]
    - numpad0 772 days ago
      Google Drive is not a device, it’s a cloud storage service. There’s no metaphorized “drive device” either, it’s all under the same virtual root folder.
    - dragonwriter 772 days ago
      > could you not split it up over multiple, smaller drives then?
      Well, you could create multiple Google Accounts and create all the files in the same Drive with sharing enabled, because the limit isn’t on what is in one user’s Drive, but what one user can create across all Drives.
- mardifoufs 772 days ago
  The problem is that some people actually were well above the 5m files limit, and this sudden update ends up functionally locking them out of their account. They'd have to delete millions of files in some cases if they just want to add more files.
  That can be extremely disruptive especially for small businesses, who would be the most likely to depend on a Drive-based workflow.
- orangecat 772 days ago
  That's not really a lot. Imagine a script that runs daily and processes demographic data, creating one output file per US zip code. That's over 40k files per run, so you'd hit 5 million files after just a few months.
- anothernewdude 772 days ago
  If I'm paying for storage, then I don't want to give a fuck about stupid shit like this.
- ungamedplayer 772 days ago
  Numbers add up quickly.
topicseed 772 days ago
We need a .googledriveignore file then.
ourmandave 772 days ago
Initial thought is the ones who are surprised are the ones who sending a notification email to wouldn't have been noticed.
Like my dad has 300+ unread emails with who knows how many gigs of attachments.
lopkeny12ko 772 days ago
I wonder if you could create a block-level virtual filesystem backed by Google Drive so that you could store many small logical files in one physical remote "block" (file).
[-]
- smileybarry 772 days ago
  Maybe, but most storage providers don’t expose a seek/range API for their consumer stuff, so you’re doomed to fetch & upload the whole block every time.
  For about a decade, Dropbox was the only one to offer delta sync, OneDrive added that a few years ago, and they’re the only two IIRC that do delta syncing. The rest basically redownload & reupload.
Reptur 772 days ago
Seems like an Engineering issue more than a User issue. They could just take the node_modules folders and zip them up behind the scenes without changing the user interaction.
squokko 772 days ago
This is why despite G Suite being in many ways a superior product, it's made almost no inroads in Corporate America vs. Microsoft Office. Enterprises need to be able to specify a business workflow and depend on it, and if there are nasty surprises it fucks with their money.
Microsoft software is much worse than many competitors but it's documented, the behavior doesn't change suddenly, and it's backwards compatible.
nickcw 772 days ago
Rclone users noticed this new limit back in February.
Here is a thread discussing it on the rclone forum:
https://forum.rclone.org/t/new-limit-unlocked-on-google-driv...
It would be nice to have official confirmation of the limit rather than relying on speculation.
[-]
- SSLy 772 days ago
  I'm afraid the limit might have been put there exactly because of rclone users.
- Havoc 772 days ago
  Article now has confirmation at bottom
SMAAART 772 days ago
"Don't be evil" ---> You either die a hero, or live long enough to see yourself become the villain.
LightBug1 772 days ago
I know Google employees are reading this ... don't you recognise this sh!tshow and communicate it internally?
I'll never understand how such a large organisation can let this kind of stuff happen.
[-]
- marcinzm 772 days ago
  Probably the usual corporate reason that the nail that sticks out gets hammered down and that the messenger is often shot. Especially with all the layoffs it's better to not be the one in that position. And even if you do convince them it's a bad idea and don't get punished for it what do you gain? A half sentence on your performance review? Risk not worth the potential reward.
thrdbndndn 772 days ago
Wait, is it new?
I believe Google Drive for Workspace always have a file count limit, and IIRC it's as low as 500k or something, despite having "unlimited" capacity.
To be totally fair to Google, I know this precisely because there are communities of data hoarders that actively abuse various cloud storages. In Google Drive's case, They have ways to create "free" Google Workspace accounts via registration exploitation from various institutions. People use them to store PB-level data.
(For the interested, there are also ways to apply free MS developer accounts that are supposed to expire in 3 months but can be re-refresh indefinitely. This comes with 5TB "free" cloud storage x 5 (10?) separate sub-accounts.)
[-]
- Kwpolska 772 days ago
  What if Microsoft changes their mind and you wake up one day with your cloud backup being nuked due to the account expiring or due to your use being non-development-related?
  [-]
  - thrdbndndn 772 days ago
    At this point people are just hoarding for the sake of hoarding. The data itself isn't really that important: collecting the entire Netflix catalog is pretty neat, but if it's gone it's fine, you can basically re-find anything in piracy world.
    And they usually have multiple backups.
    [-]
    - cableshaft 772 days ago
      I have multiple backups, but they're all on drives in my house. If my house burns down, there go all my backups. That's why I keep all the important stuff on a cloud backup service as well.
SergeAx 772 days ago
I wonder what jury-rigged solution may lead to breaking into 5M limit? I can't believe it was just digital hoarding. In the end, hoarders know better to keep things in zip-archives.
nashashmi 772 days ago
I wish there were a "zip" standard for storing a million small files in one package. NPM and other open source programs badly need this.
[-]
- Xamayon 772 days ago
  I do exactly that for storing the result thumbnails for some of the dbs in my reverse image search engine (SauceNAO). Non compressed zip files allow quickly/easily seeking to and accessing component files without extraction. A few tens to hundreds of thousands per zip file works great. Millions would probably not be too different, but would use more resources/take more time when loading the zip file index.
  [-]
  - mthoms 772 days ago
    Interesting. Have you ever considered SQLite file storage? I'm wondering how it would compare.
    https://www.sqlite.org/sqlar.html
    [-]
    - Xamayon 772 days ago
      Haven't looked into it, but it sounds like it would work similarly (with some nice benefits such as also being able to easily store other metadata/etc). Feasibility would depend on how quickly the indexes and such load, and the resource consumption associated with opening/closing dozens of them at a time 24/7. In my screwy case there are hundreds of thousands of zip files which are randomly accessed on the fly to grab one or two thumbnails at a time. The random access speed on unloaded files is critical, and for zip files it's extremely quick.
- ElectricalUnion 772 days ago
  .jar/.apk (internally a zip archive) comes to mind.
  AppImage (internally is either a ISO 9660 Rock Ridge or a SquashFS filesystem), .deb (internally ar archive), .rpm (internally a cpio archive) are I think relevant examples too.
- eviks 772 days ago
  Exactly. Though yarn has this feature, and it works with zip files as mini-fs, so you don't need to unpack them to disk (generally)
- Gigachad 772 days ago
  Isn’t that just tar?
Overtonwindow 772 days ago
Oh good lord, I have so many files spread across a dozen drive accounts. It's free storage, I'm going to use it.
anothernewdude 772 days ago
Google: "We are too fucking useless or lazy to make file count not an issue since we already limit you by space."
[-]
- tyingq 772 days ago
  Well, also too lazy to provide exclude patterns or some other tool to let you work within our limit.
eviks 772 days ago
On a tangent and out of curiosity, which filesystems are great at working with a gazillion of tiny files?
[-]
- Vecr 772 days ago
  BTRFS, at least technically. It can get really fragmented and slow though.
  [-]
  - eviks 771 days ago
    How is "slow" great?
- cvccvroomvroom 772 days ago
  The Google Search and StackOverflow filesystems.
slackfan 772 days ago
Just another unlimited(*) with the asterisk in giant bolf and the explanation text in 2point type service.
Yay.
benhurmarcel 772 days ago
I wonder how that works for companies using Google Workspace. My company has Workspace users close to 6 digits I believe, I'd think we collectively store way more than a few million files.
[-]
- danpalmer 772 days ago
  It's a per-user limit.
  [-]
  - benhurmarcel 772 days ago
    And how does that work with Shared Drives then? They don’t belong to 1 user.
    [-]
    - danpalmer 770 days ago
      I believe each user can create up to 5 million files in the shared drive.
      I work at a company with many more than 5 million files overall where this has not been an issue.
jeron 772 days ago
Timely reminder of this great GitHub repo:
https://github.com/awesome-selfhosted/awesome-selfhosted
sqldba 772 days ago
So just don’t be one of the vanishingly small paid users who this affects. Easy.
I wonder what vanishingly small is. 0.001% of a million is still thousands.
[-]
- anonn77 772 days ago
  0.001% of 1 million is 10
  [-]
  - Proven 772 days ago
    [dead]
- burnished 772 days ago
  What if vanishingly small is like.. 5?
nathants 772 days ago
hello rclone and s3.
[-]
- knorker 772 days ago
  Not fair to compare Drive with s3. Completely different products. The Google equivalent of s3 is GCS.
  [-]
  - nathants 772 days ago
    hello rclone and gcs! almost as good.
    [-]
    - Ygg2 772 days ago
      I don't have much faith that GCS will exist in 10 years. Or maybe it will allow only bytes ending with 0xBEEF.
      So yeah, S3 for the win.