App Performance: Estimated speed increase if data ...

Sean_Lim · 03-26-2020 08:03 PM

If you haven’t checked out Performance Profile, do check it out to pinpoint any slow processes and Virtual Columns that may make your sync speeds very slow. If I had known how to use this earlier, I would have been in much less of a world of hurt. Find it via: Manage → Monitor → Performance Profile. You’ll see a table that has a Binoculars icon. Click on it and you can march down on what particularly is slowing you down.

I have a fairly complex app (it basically contains 2 years worth of medical school with all the connections and associations students need to learn to be good future doctors…). The database is Google Sheets, so when I run the Performance Profile, I get the following results for one of the tables:

1.901 seconds to read 44274 rows
1.263 seconds to update 1 row (as in Edit actions)

My question is, if I migrate to an SQL, how much of a speed improvement can I expect? Will it be half the time, quarter of the time, etc.

Koichi_Tsuji

There is no quantified stats, how much the migration to SQL data base will improve the performance, but I would say it will based on my experience. Reading whole tables probably will not make a huge different, but SQL should perform better.

If we have expression like Select, it is construed as equivalent sql statement select * from tablename where xxxx
On the SQL, we will get the better performance if you INDEXing the fields you wish to put into the part of select where conditions.

The problem (it is long running for me) using SQL is when we have read only table.

On the spreadsheet, reading table of read-only, sync setting server-cache turning on will massively improve the performance, as appsheet server should cache data without calling to the backend services.
It works fine when we hold the data on googlespread sheet.
But when we store data (read only table) on the SQL, Appsheet is always read whole table on the backend data base of SQL…
So we have to be bit careful.

@Harry Any comments you can top up? in terms of SQL in general?
And is there any workaround and fix was found in terms of the problem on the last paragraph I mentioned above?

Sean_Lim

That’s helpful thanks for describing the potential benefits and risks. Can you help me understand a little bit more about how INDEXing the fields would improve the performance of SELECT expressions?

Table reads, according to the performance analysis, seems to make up much of the time required to compute a Virtual Column, which for me contains a SELECT expression. I was hoping the SQL would reduce these times significantly.

Koichi_Tsuji

In my case, I use MS Azure SQL as data source, so I do “indexing” through SSIS.

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/clustered-and-nonclustered-indexes...

If using other SQL service, My SQL other, you need to follow their own setting to do “indexing” on the table and target row accordingly.

Harry2

@Sean_Lim Hi Sean, less than 2 seconds to read more than 44,000 rows in Google sheets is actually not bad. If you migrate to a SQL database, there might be some small performance improvement, but you’ll need to take into consideration the additional cost of using SQL.

Based on your description of the problem, I believe the real bottleneck is the time taken to compute virtual columns. You mentioned using the SELECT expression, which can take a very long time to compute. To understand why, consider this scenario.

Suppose you have a table A with 44,000 rows, and this table has a virtual column with a SELECT expression. In the selection condition of the expression, you compare a column in table A to a column in table B. Suppose table B has 21,000 rows (I’m just basing these numbers off the screenshots you shared). Because you need to compare every value in table A against every value in table B, this would result in a total of 44,000 x 21,000 comparisons. If every comparison takes 1 microsecond to be performed, that would result in a total computation time of 924 seconds, or 15.4 minutes.

Of course, the sync time of your app is much smaller than 15 minutes, but you get the idea. Based on this, the biggest improvement in your app’s performance would come from either simplifying your virtual columns, or from reducing the number of rows that are used in these virtual columns. If possible, avoid using expensive expressions like SELECT or LOOKUP with large tables. It’s also a good idea to archive old data that is no longer used. You can do this either by removing old data from your spreadsheets directly, or you can filter data using security filters in AppSheet. For Google sheets, security filters will not reduce the amount of data that our server has to read from the sheets, but they do reduce the amount of data that will be used to compute virtual columns, so they can be a big help. If you use SQL databases, security filters will also reduce the amount of data that our server reads directly from your SQL tables.

I don’t know which plan you’re on, but if you’re on a business plan, you should certainly consider using data partitioning. You can find more information about data partitioning here:

More information about how to scale your app in general can be found here:

Sean_Lim

Thank you for the very thorough explanation! Yes, I suppose if I cut down on my SELECT expressions, I would have a shorter Sync time.

This is concerning to me because, although table A will remain at 44,000 rows, table B will keep on growing due to user row creation (this is for a social-networking like app).

Actually, syncing times for me (~30 seconds) is OK for the time being. I’m concerned about performance because at times, my app slows down and does a force sync while I use it!

I wonder if Slices are an issue. @Harry, is it not a good idea to define a slice based on a Table with a constraint indexed on a column that constantly changes based on user input?

Harry2

Hi @Sean_Lim, slices won’t help improve the app’s performance in this case. Both slices and security filters can be used to filter data. However, unlike security filters, slices cannot be used to reduce the amount of data that the app has to read.

When you create a security filter for a table, you’re telling the app to only read a subset of the table’s data from AppSheet servers. As a result, the security filter will reduce the amount of data that is read during sync.

When you create a slice for a table, the app will still read the entire table from AppSheet. Once all of the table’s data has been loaded into the app, the slice will then be used to display a subset of the table’s data in certain views.

The fact that security filters can prevent data from being loaded into the user’s device means that there’s no way the user can access this data even if they hack the device, hence the word “security” in “security filters”. One the other hand, you can think of slices as a way to simply “hide” data from the user. With a slice, the data is still loaded into the user’s device, even if it’s only hidden from them in certain view. In theory, it is therefore possible for the user to hack the app or the device to extract the hidden data.

Sean_Lim

Understood, thanks.

I’m wondering if Slices can actually worsen the performance of the app. Suppose the app has background syncing turned on. Would it take a lot of the device’s own memory to re-compute a Slice if that Slice had a new row that passes the condition set by the Slice?

Harry2

In general adding a new row to a slice shouldn’t degrade the performance of the slice in any noticeable way. We have implemented a number of optimizations in the app, such as indexing, to make computations involving finding and filtering rows faster, whether by slices or expressions.

App Performance: Estimated speed increase if data moved to SQL?