Updating a large number of related records - approximately 50 000

Hi Community, I have retro fitted 2 columns to an app that I need to be populated. The records are invoice details, the invoice details are already reference connected to products. I have added the product category field to the invoice details table. This value can be dereferenced from the products ref connection. However I now need to update all 50 000 records in the table to have the categories populated.
I have used a chunking script that combines the AppSheet API, Automations & Apps Script to update the table. However it is really slow.
I am passing 10 Invoice IDs at a time to the AppSheet API, invoking an action which writes a value into the automation file dof the Invoice record. These records are held in a slice which then triggers the ref update action on the Related Invoice Details records.
It works, but it is painfully slow & I have limited the test run to 5 loops, retrieving 10 Invoice IDs at a time & porcessing them. On a successful response form the API it grabs the next 10 Invoice IDs. 
Is there a quicker way to do this?
Thanks for any tips.

Solved Solved
2 15 568
1 ACCEPTED SOLUTION

You may want to share more details. It sounds that your requirement is as follows

Suvrutt_Gurjar_0-1697685953326.png

If so, you may want to give it a try by pure AppSheet API webhook to update the [Product Category] column in the "Invoice details" table.

I believe you could try an AppSheet API webhook with following details

Webhook1 .png

Webhook 2.PNG

The webhook body code can be something like 

{
"Action": "Edit",
"Properties": {
"Locale": "en-US",
"Location": "47.623098, -122.330184",
"Timezone": "Pacific Standard Time"
},
"Rows": [ <<Start: SELECT(Invoice details[ Invoice Details Key], [_ROWNUMBER] <10000)>>
{ "Invoice Details Key" : "<<[Invoice Details key]>>",
"Product Category column in Invoice Details Column" : "<<[Ref column referencing product table].[Product Category column in Products table]>>",

}
<< END >>

]
}

Please change / omit highlighted code suitably and change column names as per actual column names.

The Start: SELECT(Invoice details[ Invoice Details Key], [_ROWNUMBER] <10000)>> expression could be modified suitably to run in batches of say anywhere between 500 - 10000 records or so with some experimentation. Since the webhook has to update only one column, I believe , it will run on substantial number of rows at a time. 

I believe this webhook approach will be significantly less time consuming.

 

View solution in original post

15 REPLIES 15

I would

1) make the column a VC unless you ABSOLUTELY need it to be a normal column

2) populate the col directly inside the database since your incremental rows should not pose performance issues

Just FYI who are interested..

A simple LOOKUP in a Google Spreadsheet with 49000 invoice rows and 300 unique products takes only a few seconds!

API is powerful but when there is a far easier way to accomplish the same thing I would choose to do it that way. (It does not involve VC recalculation until you actually load data into AppSheet which happens anyway)

You may want to share more details. It sounds that your requirement is as follows

Suvrutt_Gurjar_0-1697685953326.png

If so, you may want to give it a try by pure AppSheet API webhook to update the [Product Category] column in the "Invoice details" table.

I believe you could try an AppSheet API webhook with following details

Webhook1 .png

Webhook 2.PNG

The webhook body code can be something like 

{
"Action": "Edit",
"Properties": {
"Locale": "en-US",
"Location": "47.623098, -122.330184",
"Timezone": "Pacific Standard Time"
},
"Rows": [ <<Start: SELECT(Invoice details[ Invoice Details Key], [_ROWNUMBER] <10000)>>
{ "Invoice Details Key" : "<<[Invoice Details key]>>",
"Product Category column in Invoice Details Column" : "<<[Ref column referencing product table].[Product Category column in Products table]>>",

}
<< END >>

]
}

Please change / omit highlighted code suitably and change column names as per actual column names.

The Start: SELECT(Invoice details[ Invoice Details Key], [_ROWNUMBER] <10000)>> expression could be modified suitably to run in batches of say anywhere between 500 - 10000 records or so with some experimentation. Since the webhook has to update only one column, I believe , it will run on substantial number of rows at a time. 

I believe this webhook approach will be significantly less time consuming.

 

I will give this a go, its been running in the background for about 5 hours & only completed about 600 rows! Hasn't timed out or failed, just painfully slow! Thanks for chiming in ๐Ÿ™‚

 


@bradlegassick wrote:

Hasn't timed out or failed, just painfully slow!


I think you are using reference actions means it will take time . This is so because,  I believe the referenced action updates row by row in the referenced table records.

Yeah they are reference actions triggered by an automation that is kickstarted via the API call. All a bit messy to be honest.

Yes, I understand. That is why suggested the above alternative approach.

You my friend are a lifesaver! I have tested with 100 rows & it was super fast. Thanks for coming up with a really good option! Cheers

You are welcome. I guesstimate you could test with even 1000 or 5000 rows in one go.

Thank you for @Suvrutt_Gurjar@MultiTech (insane AI helper called AppSter) for a practical solution to a problem. I must add that I did encounter too many timeouts trying to update via the Webhook in large row chunks. If anyone is interested here is the solution I implemented that has worked well. (suggested by Matt's @MultiTech Appster AI bot!)
1) Apps Script AppSheet API call triggering the automated Webhook - this is on a 5 minute timer. 
2) Helper table - AutomationState - keeps track of the rows I am processing
3) Reduced the rows to 250
4) Set break expressions to ensure the Max rows of the table being processed are less than the Upper Row Limit in the helper table (these are incremented by 250 per run)
My first attempt using ref actions took hours to process about 3000 rows. Using this method I process 250 rows in about a minute (including AppScript & the AppSheet API runs). Equivalent of 12 minutes for 3000 rows!
Thanks for the help & a real time saving solution! Cheers

Just one thought.. if you change the Ref column to Enum base of Ref, and you don't have any other virtual column in a Child table, the AppSheet API / webhook should do the job within 10..20 secs.. and I mean all 50k rows at once. I have tested this earlier with 40k rows and it was working just fine within that timeframe. BUT.. if you have at least one virtual column, it changes the whole situation because then it needs to calculate it row by row.

 virtual columns, I have 2 - so for now this approach is chinking away & getting it done. I will definitely remember your tip here. Thanks Aleksi.

Buy if you need it only as a temporary solution, you can delete them and then set them back after the update is done. Just a thought..

@bradlegassick ,

I think Aleksi's suggestion makes a lot of sense. VCs will recalculate in a single sync for all the rows as soon as you reintroduce them. So I believe , one can easily remove them while the BOT runs.


some@TeeSee1 wrote:

API is powerful but when there is a far easier way to accomplish the same thing I would choose to do it that way.


I agree.  AppSheet is fantastic, but it's not the tool for everything.

Since we live in the world of Ai, and getting some custom bespoke code to do some one-off operation is just a prompt or two away, I would work with GPT4 to come up with an apps script that would accomplish this retrofitting of data into my database.

A script would accomplish this in seconds.

I spent 5 minutes with GPT4 and got this far: 
https://chat.openai.com/share/ddd05874-e68e-4b94-b122-92a0b4350de4

// Configuration options
var sheetId = "your_spreadsheet_id_here"; // Replace with your Google Sheet ID
var invoiceTableName = "InvoiceTable"; // Replace with your Invoice table name
var productsTableName = "ProductsTable"; // Replace with your Products table name
var idColumnName = "ID"; // Replace with your ID column name
var productColumnName = "ProductID"; // Replace with your Product ID column name
var categoryColumnName = "Category"; // Replace with your Category column name

function updateInvoiceCategories() {
  // Open Google Sheets once for all operations
  var ss = SpreadsheetApp.openById(sheetId);
  
  // Fetch product-category mapping into an object for quick lookup
  var productCategoryMap = getProductCategoryMap(ss, productsTableName, productColumnName, categoryColumnName);
  
  // Update Invoice Table
  updateTable(ss, invoiceTableName, idColumnName, productColumnName, categoryColumnName, productCategoryMap);
}

function getProductCategoryMap(ss, tableName, productColumnName, categoryColumnName) {
  var sheet = ss.getSheetByName(tableName);
  var data = sheet.getDataRange().getValues();
  var headers = data.shift();
  var productColumnIndex = headers.indexOf(productColumnName);
  var categoryColumnIndex = headers.indexOf(categoryColumnName);
  
  // Create a map for quick category lookup
  var map = {};
  for (var i = 0; i < data.length; i++) {
    map[data[i][productColumnIndex]] = data[i][categoryColumnIndex];
  }
  
  return map;
}

function updateTable(ss, tableName, idColumnName, productColumnName, categoryColumnName, productCategoryMap) {
  var sheet = ss.getSheetByName(tableName);
  var data = sheet.getDataRange().getValues();
  var headers = data.shift();
  var idColumnIndex = headers.indexOf(idColumnName);
  var productColumnIndex = headers.indexOf(productColumnName);
  var categoryColumnIndex = headers.indexOf(categoryColumnName);
  
  // Update data in memory
  for (var i = 0; i < data.length; i++) {
    var row = data[i];
    var productID = row[productColumnIndex];
    row[categoryColumnIndex] = productCategoryMap[productID] || "";  // Use empty string if category not found
  }
  
  // Write updated data back to sheet in a single operation
  sheet.getRange(2, categoryColumnIndex + 1, data.length, 1).setValues(data.map(function(row) { return [row[categoryColumnIndex]]; }));
}

 ________________________________________________________________________________

Special thanks to @bradlegassick for his continued use and experimentation with Appster; it's power users like him that enable me to continuously improve the system that will eventually replace me! haha (^_^)

Top Labels in this Space