Improving performance by getting rid of unnecessary virtual columns

Hi @Hyman_van_Zyl it is very unlikely that the auto-generated virtual columns are where your sync time is going. It may look that way in the performance profiler, but that is a flaw in how the data is presented. Let me explain. Let’s say your app has 10 tables. Our server will only fetch a few of them in parallel (the exact number depends on your subscription and the current load, but let’s say that number is 3). So off it goes to fetch tables 1, 2, and 3 in parallel. When any of them is fetched and ready, that thread goes off to fetch table 4, and so on.
But now here’s the tricky part … what happens with virtual columns? The virtual columns for table 1 are computed right after table 1 is fetched. But let’s say one of them requires table 8. So that has to be fetched too. In the performance logs, this will count towards the compute time for the virtual column. But in reality, table 8 had to be fetched anyway at some point during the sync. In fact, you’ll see this recorded if you drill down into the virtual column computation in the profiler.
The virtual columns that take a long time to run are those that involve inefficient multi-level SELECT formulas. The “reverse reference” formulas are usually really fast — they just need one pass through the data. And we do this on our server during sync so that we don’t have to do it on the devices (slower and more of a battery drain on a device).
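To make the profiler-attribution point concrete, here is a toy simulation in plain Python. All the numbers and the scheduling logic are hypothetical stand-ins, not the actual server implementation: it just shows how the fetch time of a dependency (table 8) can get charged to a virtual column even though that table had to be fetched anyway.

```python
# Toy model: 10 tables, each taking 1 second to fetch (hypothetical numbers).
FETCH_TIME = {t: 1.0 for t in range(1, 11)}

def sync(parallelism=3, vc_dependency={1: 8}):
    """Simulate fetching tables with limited parallelism.

    vc_dependency maps a table to another table that its virtual
    columns need. If that dependency has not been fetched yet, its
    fetch time is *attributed* to the virtual column's compute time,
    even though the table would have been fetched anyway.
    """
    fetched = set()
    vc_attributed = 0.0
    queue = list(FETCH_TIME)
    while queue:
        batch, queue = queue[:parallelism], queue[parallelism:]
        for t in batch:
            fetched.add(t)
        for t in batch:
            dep = vc_dependency.get(t)
            if dep is not None and dep not in fetched:
                fetched.add(dep)
                queue.remove(dep)
                vc_attributed += FETCH_TIME[dep]  # shows up under the VC
    return vc_attributed

# Table 8's 1-second fetch is charged to table 1's virtual column,
# even though table 8 was going to be fetched during the sync anyway.
print(sync())  # → 1.0
```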


Is there an advantage, if I don’t need the REFROWS, in using the enum basetype REF maneuver?

In general, I would recommend not trying to hack the REFROWS thing. Just treat it as an essential column added by the system. In no performance investigation have I ever found it to be an actual cause of slow performance.



Thanks Praveen,
Any suggestions where I can learn about ways to avoid inefficient multi-level SELECT formulas? Like what parameters determine if a formula becomes heavy? How do I know if my app is too slow? What is the recommended number of tables? Those kinds of things.

I guess we should publish some general rules of thumb (adding @Derek FYI).

I’d say, assuming a PRO plan, and painting in very broad strokes here (there are many differences between SQL and google sheets and excel and …):

  1. If you have N tables, assume N/3 seconds of baseline latency
  2. Take your largest table in terms of cells (rows * columns) and add 1 second for every 5000 cells
  3. In #2, if you have security filters, and they are simple (ANDs, equality or IN conditions), and you use a SQL database, then count only the rows that pass the security filters … otherwise count all rows
  4. Avoid virtual columns with a SELECT and anything but an equality or IN filter condition

So that’s a good rule of thumb. Syncs should be under 10 seconds. Maybe 15 max. Otherwise, you most likely have (a) too many tables — break up your app into sub apps with a launcher, (b) too much data — use security filters, (c) inefficient SELECTs in the security filters, or (d) inefficient SELECTs in the virtual columns.
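As a rough sketch, the rules of thumb above can be turned into a back-of-the-envelope calculator. This is plain Python with made-up example numbers, not anything official — just the arithmetic from rules #1–#3:

```python
def estimate_sync_seconds(num_tables, largest_table_rows,
                          largest_table_cols, filtered_rows=None):
    """Back-of-the-envelope sync estimate from the rules of thumb above.

    filtered_rows: rows passing a simple security filter on a SQL
    backend; when given, it replaces the raw row count (rule #3).
    """
    baseline = num_tables / 3                       # rule #1: N/3 seconds
    rows = filtered_rows if filtered_rows is not None else largest_table_rows
    cells = rows * largest_table_cols
    return baseline + cells / 5000                  # rule #2: 1s per 5000 cells

# Example: 12 tables, largest table 2000 rows x 20 columns,
# with a simple SQL security filter passing 500 rows:
print(estimate_sync_seconds(12, 2000, 20, filtered_rows=500))  # → 6.0
```

The same app without the security filter would count all 2000 × 20 = 40,000 cells, landing it well past the 10-second guideline.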


So me always trying to get them sub 5sec is silly…


How do complex UX formatting rules factor into sync times?

Thanks for your input on this topic by the way!

@Grant_Stead it is possible to be sub 5 second, but when using sheets, there is latency imposed by Google (or Dropbox or whoever) that isn’t necessarily related to the number of rows.

UX formatting rules have no impact on sync time but have tremendous impact on display/rendering time. We render rows dynamically — as you scroll up and down, values are shown or not. Now for every one of them, formatting rules have to be run, and if they involve complicated expressions, things can get really laggy.
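A small sketch of why this multiplies: the rule’s expression runs once per rendered row, every time rows come into view. Plain Python with hypothetical screen sizes, just counting evaluations:

```python
# Count how often a format rule's expression runs while scrolling.
evals = 0

def format_rule(row):        # stands in for the rule's expression
    global evals
    evals += 1
    return row % 2 == 0      # e.g. highlight even rows

def render(visible_rows):
    return [format_rule(r) for r in visible_rows]

# Scrolling through 3 screens of 10 rows runs the rule 30 times;
# an expensive expression here makes every scroll feel laggy.
for screen in range(3):
    render(range(screen * 10, screen * 10 + 10))
print(evals)  # → 30
```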


These are very valuable tips. Just to be clear on point 4, so equality and IN conditions are ok and any other kind of condition should be avoided as far as possible?
Does it affect sync time if you make a table read only?
Thanks a lot

You can do AND([Col1] = {something that evaluates to a constant}, IN([Col2], {something that evaluates to a list}), etc.).

Also, a very common pattern is to use a simple filter like [Email] = USEREMAIL() on TableA, and then to use TableA itself as a filter on TableB, like this in the security filter of TableB: IN([ARef], TableA[Key])

For example, if you have WorkOrders and WorkOrderDetails, you can filter WorkOrders based on the Email of the assigned user and filter WorkOrderDetails based on the WorkOrder ids.
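Here is the same two-level pattern sketched in plain Python, with made-up rows (in AppSheet the filters are the expressions shown above; the field names here are illustrative only):

```python
# Hypothetical data standing in for the WorkOrders / WorkOrderDetails tables.
work_orders = [
    {"id": "WO1", "email": "alice@example.com"},
    {"id": "WO2", "email": "bob@example.com"},
]
details = [
    {"id": "D1", "work_order": "WO1"},
    {"id": "D2", "work_order": "WO2"},
    {"id": "D3", "work_order": "WO1"},
]

user = "alice@example.com"

# Level 1 — like [Email] = USEREMAIL() on WorkOrders:
my_orders = [w for w in work_orders if w["email"] == user]

# Level 2 — like IN([ARef], WorkOrders[Key]) on WorkOrderDetails:
order_ids = {w["id"] for w in my_orders}
my_details = [d for d in details if d["work_order"] in order_ids]

print([d["id"] for d in my_details])  # → ['D1', 'D3']
```

Both levels are a single equality or IN membership test, which is why this pattern stays fast even as the tables grow.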


So an AND([Col1] = [_THISROW].[Col2], …) would be ok?

So make a slice of TableA which you filter and then use that slice in Table B?

Does it help to make tables read only where possible?


@praveen This would indeed be a good article to open up the backend.


I agree @Aleksi

In terms of AppSheet performance, it’s a bit of a black box at the moment. We have to create stuff and see what the results are.


Please consider dedicating an office hours session and/or short YouTube video tutorials to performance concepts and examples (e.g. app examples with an inefficient setup and how they can be modified to be more efficient).


Can you please describe, with an example, what you mean by “multi-level SELECT formula”?

It’s a SELECT() formula inside of another SELECT() formula.
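A quick sketch of why nesting matters: the inner SELECT re-scans its table once per row of the outer query, while a single-pass “reverse reference” style grouping touches each row once. Plain Python with hypothetical data, just to illustrate the cost difference:

```python
# Hypothetical: 9 orders spread across 3 customers.
orders = [{"id": i, "customer": i % 3} for i in range(9)]

def orders_per_customer_nested(customers):
    """Multi-level style: one full scan of `orders` per customer (O(n*m))."""
    return {c: [o["id"] for o in orders if o["customer"] == c]
            for c in customers}

def orders_per_customer_one_pass():
    """Reverse-reference style: one scan, grouping as we go (O(n))."""
    out = {}
    for o in orders:
        out.setdefault(o["customer"], []).append(o["id"])
    return out

# Same answer, very different amount of work as the tables grow.
assert orders_per_customer_nested(range(3)) == orders_per_customer_one_pass()
```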


Are the following remarks about using multiple SELECTs (in particular its variations) in a statement/formula correct in this context?

> Beware of LOOKUP(). This is a compact equivalent of ANY(SELECT(Customers[Phone], [Name] = "John Doe")) where the filter is an equality condition.

> Expressions in format rules should be simple — avoid SELECT or its derivatives (like FILTER and MAXROW)

The use of any of the following in a virtual column app formula or in a format rule can hurt performance in direct proportion to your data set size (the more data, the more performance suffers):

All of them use SELECT() behind the scenes.