Improving performance by getting rid of unnecessary virtual columns

Hey guys,
I am trying to tweek out some performance by working through the details in the performance profile. I do not use many of the virtual columns created by appsheet automatically but if I delete them they are regenerated on save. Some of them take up a second or so performance time. Is there a way to disable them?
Another question, is there a way to quickly identify the virtual columns that are not used in your app?
Thanks
Hyman

1 Like

If you are using a Ref field and you don’t change that column type, it will regenerate the virtual list column after saving.

Hey Aleksi,
Is there anyway to kind of disable that virtual list so it does not get processed during app sync?
Thanks

I’m not aware of such a workaround.

Ok, Thanks

Hi @Hyman_van_Zyl it is very unlikely that the auto-generated virtual columns are where your sync time is going. However, you may think so when you look at the performance profiler, but that is a flaw in how the data is being presented. Let me explain. Let’s say your app has 10 tables. Our server will only fetch a few of them in parallel (the exact number depends on your subscription and the current load, but let’s say that number is 3). So off it goes to fetch tables 1, 2, and 3 in parallel. When any of them is fetched and ready, that thread goes off to fetch 4, and so on.
But now here’s the tricky part … what happens with virtual columns? The virtual columns for table 1 are computed right after table 1 is fetched. But let’s say one of them requires table 8. So that has to be fetched too. In the performance logs, this will count towards the compute time for the virtual column. But in reality, table 8 had to be fetched anyway at some point during the sync. In fact, you’ll see this recorded if you drill down into the virtual column computation in the profiler.
The virtual columns that take a long time to run are those that involve inefficient multi-level SELECT formulas. The “reverse reference” formulas are usually really fast — they just need one pass through the data. And we do this in our server during sync so that we don’t have to do them on the devices (slower and more of a battery drain on a device)

9 Likes

Is there advantage, if I don’t need the REFROWS, in using the enum basetype REF maneuver?

In general, I would recommend not trying to hack the REFROWS thing. Just treat it as an essential column added by the system. In no performance investigation have I ever found it to be an actual cause of slow performance.

1 Like

Deal.

Thanks Praveen,
Any suggestions where I can learn about ways to avoid inefficient multi-level SELECT formulas? Like what parameters determine if a formula becomes heavy? How do I know if my app is too slow? What is the recommended amount of tables? Those kind of things.
Thanks
Hyman

I guess we should publish some general rules of thumb (adding @Derek FYI).

I’d say, assuming a PRO plan, and painting in very broad strokes here (there are many differences between SQL and google sheets and excel and …):

  1. If you have N tables, assume N/3 seconds of baseline latency
  2. Take your largest table in terms of cells (rows * columns) and add 1 second for every 5000 cells
  3. In #2, If you have security filters, and they are simple (ANDs, equality or IN conditions) and you use a SQL database, then you count the rows that pass the security filters only … otherwise you count all rows
  4. Avoid virtual columns with a SELECT and anything but an equality or IN filter condition

So that’s a good rule of thumb. Syncs should be under 10 seconds. Maybe 15 max. Otherwise, you most likely have (a) too many tables — break up your app into sub apps with a launcher, (b) too much data — use security filters, © inefficient SELECTS in the security filters, or (d) inefficient SELECTS in the virtual columns.

11 Likes

So me always trying to get them sub 5sec is silly…
LOL

@praveen

How do complex UX formatting rules factor into sync times?

Thanks for your input on this topic by the way!

@Grant_Stead it is possible to be sub 5 second, but when using sheets, there is latency imposed by Google (or Dropbox or whoever) that isn’t necessarily related to the number of rows.

UX formatting rules have no impact on sync time but have tremendous impact on display/rendering time. We render rows dynamically – as you scroll up and down, values are shown or not. Now for every one of them, formatting rules have to be run, and if they involve complicated expressions, things can get really laggy

3 Likes

@praveen
These are very valuable tips. Just to be clear on point 4, so equality and IN conditions are ok and any other kind of condition should be avoided as far as possible?
Does it affect sync time if you make a table read only?
Thanks lots
Hyman

You can do AND([Col1] = {something that evauates to a constant}, IN([Col2], {something that evaluates to a list}), etc).

Also, a very common pattern is to use a simple filter [Email] = UserEmail on TableA and then to use TableA itself as a filter on TableB like this in the security filter of TableB IN([ARef], TableA[Key])

For example, if you have WorkOrders and WorkOrderDetails, you can filter WorkOrders based on the Email of the assigned user and filter WorkOrderDetails based on the WorkOrder ids.

3 Likes

So a AND([Col1]=[THISROW].[Col2],…) would be ok?

So make a slice of TableA which you filter and then use that slice in Table B?

Does it help to make tables read only where possible?

Thanks

@praveen This would a be good article indeed to open the backend.

3 Likes

I agree @Aleksi

In terms of Appsheet performance, its a bit like a black box model at the moment. We have to create stuff and see what the results are

1 Like