Google Drive Backups for disaster recovery - How are you handling it?

Hello Community

I was wondering how you deal with backing up Google Drive files. By default, Google Workspace provides these mechanisms (AFAIK):

- Files go to trash for 30 days
- Google Vault

These are clearly not enough, since emptying the trash can be forced. After that, you only have 25 days to recover files.

Also, Google Vault isn't very practical since it doesn't preserve the folder structure, and I can't imagine what it would be like to recover a whole organization's dozens or hundreds of terabytes of data via Vault. I don't think that is the purpose of Vault.

How are you handling this? Any good products you would recommend?

Cheers!


You are correct when you say Vault is not built for this purpose. It's an eDiscovery tool, not a backup tool.

The simple answer is a third-party backup solution. If you need a true backup with easy click-to-restore that can handle restoring large volumes of data, third party is the way to go. Spanning and Backupify are two options that have been around for a while.

If third party is out of the question, do what you're doing today.  User data restore within 25 days.  Vault as a fail safe.   Then create an activity alert for file deleted.  

If alerted you can restore as it's within 25 days.  Vault as backup. 
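
For anyone wanting to track that window, here is a minimal sketch of the deadline arithmetic. It assumes the 25-day figure quoted in this thread (check it against current Google documentation), and the names `RESTORE_WINDOW`, `restore_deadline` and `can_still_restore` are made up for illustration; nothing here calls a Google API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 25-day admin restore window mentioned in this thread.
RESTORE_WINDOW = timedelta(days=25)


def restore_deadline(deleted_at: datetime) -> datetime:
    """Return the last moment at which the data could still be restored."""
    return deleted_at + RESTORE_WINDOW


def can_still_restore(deleted_at: datetime, now: datetime) -> bool:
    """True if 'now' is still inside the restore window."""
    return now <= restore_deadline(deleted_at)


if __name__ == "__main__":
    deleted = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(restore_deadline(deleted).date())
    print(can_still_restore(deleted, datetime(2024, 1, 20, tzinfo=timezone.utc)))
```

Feeding the timestamp from a deletion alert into something like this gives a concrete date by which the restore has to happen.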

 

Thanks a lot.

For the activity alert, I guess it will have to have a threshold right? Like 1000 files deleted in less than 2 minutes or similar. 

For Spanning, I have heard bad stories. Will check Backupify. 

 

 

You might also add Afi.ai to your list of potential providers. While we have not had any real need for the backups (thankfully), our occasional test restores of email and Drive data have worked fine.

 

Hi Jason

Many thanks! Afi.ai is definitely on my radar now.

Aren't you worried about giving a 3rd party vendor full access to all your domain drives, therefore having a single point of failure for your live data + backup data?

However many certifications and audits they might have, a small company can get hacked more easily than Google.

This is definitely a concern, and one I cannot completely address. Your situation may require more than one backup provider but that also broadens the attack target even if you would have more than one backup repository to draw on.

Perhaps a backup of your backup is a fallback? Afi.ai supports backing up to your own cloud storage, which could then be backed up by another mechanism that does not have access to your live Workspace data and is not accessible by Afi.ai. Of course this brings increased cost and complexity. You may also lose backup fidelity such as IDs, sharing information, etc. But if you can restore a snapshot of the Afi.ai backup data for it to use, then this may not be an issue. I cannot say.

Perhaps @christiannewman has some expertise to add in regards to your backup vendor having access to everything?

 

Excellent reasoning, ideas and much appreciated. Thanks Jason!

I'm sure there are other backup solutions as well. Those two have just been around a long time.  

For the activity rules, you can create whatever threshold you like (Help Article). As for the time frame, though, it can only be a 24-hour period or a 1-hour period. In your case, I don't think immediate notification is an issue. If they deleted it, it's gone and you have 25 days. 1 or 24 hours should be fine.
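
To make the threshold idea concrete, here is a rough sketch of the counting logic such a rule applies: flag any user who deletes at least N files within a time window. This is only an illustration in plain Python over a list of `(user, timestamp)` pairs. It is not how the Admin console rule is implemented, it does not call any Google API, and the function name `users_over_threshold` and its parameters are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def users_over_threshold(events, window=timedelta(hours=1), threshold=1000):
    """Return the set of users with >= threshold deletions inside any window.

    events: iterable of (user, datetime) pairs, one per deleted file.
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)

    flagged = set()
    for user, stamps in by_user.items():
        stamps.sort()
        start = 0
        # Sliding window: stamps[start..end] all lie within 'window' of each other.
        for end, ts in enumerate(stamps):
            while ts - stamps[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return flagged


if __name__ == "__main__":
    base = datetime(2024, 1, 1, 9, 0)
    # Hypothetical data: "a" deletes 5 files in under a minute, "b" one file every 2 hours.
    events = [("a", base + timedelta(seconds=10 * i)) for i in range(5)]
    events += [("b", base + timedelta(hours=2 * i)) for i in range(5)]
    print(users_over_threshold(events, threshold=5))
```

With a 1-hour window, a burst of deletions in two minutes is still caught; the alert just may arrive later than it would with a shorter window, which matters little given the 25-day restore period.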

 

 

Vault ≠ backup

Backup ≠ vault

Both serve unique but equally important functions.

Vault

Enables you to retain (or purge) data according to business requirements and regulatory compliance (notwithstanding end-user retention/deletion), search your data for a specific matter, put a hold on a matter, and share/export a matter for use by legal counsel and other stakeholders for the purposes of investigation or legal defence. Vault does not help you 'put files back' so you can continue collaborating/working with them.

Vault considerations vs your current system:

  • Does the current system allow admins to set varying default retention policies (ranging from 1 day to indefinite) per data type (Gmail messages, Google Drive files, Google Groups conversation history, Google Chat history, Google Meet meetings, Google Sites, Google Voice call history, and Google Calendar events) - even if data is permanently deleted by users?
  • Does the current system allow admins to set varying default retention policies per organizational unit - such that different teams are subject to different retention policies - even if data is permanently deleted by users?
  • Does the current system allow admins to set custom retention policies per data type and organizational unit, and based on a specific search query - such that very specific types of data can be retained longer, or purged earlier, in order to satisfy business requirements and/or regulatory compliance?
  • Does the current system allow administrators to create a matter - searching all data types and users by keywords or search criteria to immediately discover all data pertaining to that matter (such as a legal challenge, HR investigation, highly-sensitive project), and placing a hold on it to ensure it is retained even if permanently deleted by users and/or retention rules with shorter retention periods?
  • Does the current system allow administrators to share such matters with required personnel (e.g. HR, inside legal counsel, senior executives) for the purposes of monitoring ongoing investigations?
  • What is the cost of the inability (or incomplete ability) to quickly discover and hold onto all data pertaining to a dispute, investigation or legal matter in which you are required to defend your organization?

Backup

Enables you to retain copies of your current data so that you can put it back (notwithstanding accidental or malicious user deletion, encryption or infection), and continue collaborating on/using your data.

Backup considerations vs your current system:

  • Is Google Workspace data backed up in full fidelity, such that restoring it places the content back in its original form, preserving file ID (and therefore links to other files),  metadata (original creation date and all subsequent edits), sharing permissions, complete version history and more? Most backup solutions not specialized for Google Workspace download a copy of the data in a standard file format, and reupload upon restore (essentially a new file, losing much or all of the fidelity above), resulting in significant data loss and significantly slowing disaster recovery.
  • Does the Google Workspace data backed up include all core Google Workspace data: Google Drive (My Drive), Google Drive (Shared Drives), Gmail, Calendar, Contacts, Google Chat?
  • Can restore functions be performed by end users - such that if they need to recover a single email or file, they can self-serve without requiring IT support or resources? Most backup solutions not specialized for Google Workspace require IT support and sometimes third-party resources to restore. 
  • Does the restore function include full-text search - such that selectively restoring data can be done much faster by more easily locating the content in question?
  • Does the backup system contain a ransomware detection engine that detects changes to your data, including mass-deletion and mass-encryption operations, triggering preemptive backups, marking last-known unaffected versions of files and notifying administrators?
  • Does the backup system automatically back up the data at a reasonable frequency, e.g. 3x daily?
  • What is the cost of the inability (or incomplete ability) to recover from a disaster such as malicious deletion, encryption or infection?

Does this help, @Marcus1?

 

 

 

Hello Christian

Many, many thanks for the thorough explanation. It does help indirectly, as it is very well summarized.

However, I am still wondering how other companies and Google Workspace admins deal with a basic problem: if you use a 3rd party vendor for your GW backups, you are creating a single point of failure. If the vendor were hacked or acted maliciously, they would have access to all your live data + backups.

I get quite anxious when I think about it. Is it just me?

In today's landscape it's normal to be anxious. Talking about worst-case scenarios can be intense.

If you do decide to go with a backup solution, you have to think about it like any application with access to your company's data. This is obviously more data than most apps would have access to, so more due diligence may be needed. There is always a layer of trust, and the company you choose ideally meets enough criteria to make you comfortable: things like history (hacked, stolen data) and compliance (SOC, ISO, etc.).

 

Thanks! Glad I am not the only one having some kind of anxiety over it 🙂

Yes, indeed. For instance, for cloud backup solutions, I find CloudAlly to be a potential best bet since it is part of a very big company (OpenText, Canada).