Multirows Functions and Expressions- Processing Sequence and Sync Time Optimization

We all try to optimize the efficiency of multi row functions /expressions such as SELECT() and other derivatives of SELECT() , MAXROW(), MINROW(), FILTER(), LOOKUP() etc.

May we know how the expression processing in actual table takes place when AppSheet processes it. Is it evaluated sequentially row-wise to see if the conditions are met?

As an example if an expression is say SELECT( Orders[Order ID], AND( [Order Statue]= “Open”, [Order Year]= 2016))

Will the condition for the year and status be evaluated sequentially from row 2 to the last row of the table?
If the condition is evaluated sequentially for rows, then if the app creator knows that say all year 2016 records are before row number 500, then will putting a condition such as say [_ROWNUMBER}<=500 will help in the SELECT() function ignoring remaining rows of the table in processing?

In summary, if there is any sequence that the multirow expressions follow in processing, is it possible that it can be made use of in improving sync time optimization?

2 Likes

I can only speculate.

As I understand it, the AND() and OR() functions do NOT short-circuit, so the order of the subexpressions has ZERO effect on performance. In fact, using AND() and OR() would therefore always deliver worst-case performance.

Conversely (as I understand it), IF(), IFS(), and SWITCH() can be used to short-circuit evaluation, and therefore COULD have a performance benefit.

Therefore this:

AND(
  expr1,
  expr2,
  ...,
  exprN
)

might be better expressed as:

IFS(
  NOT( expr1 ), FALSE,
  NOT( expr2 ), FALSE,
  ...,
  NOT( exprN ), FALSE,
  TRUE, TRUE
)

And this:

OR(
  expr1,
  expr2,
  ...,
  exprN
)

might be better expressed as:

IFS(
  expr1, TRUE,
  expr2, TRUE,
  ...,
  exprN, TRUE,
  TRUE, FALSE
)

If my understanding is correct, then there is a benefit to be gained by writing the expression like this:

SELECT(
  Orders[Order ID],
  IFS(
    NOT([Order Statue] = “Open”), FALSE,
    NOT([Order Year] = 2016), FALSE,
    TRUE, TRUE
  )
)

With the benefit growing with the number of rows in the table.

In this simple case, I wouldn’t think so.


The BIG wins come when your subexpressions include queries themselves, because a short-circuit before the query avoids the query entirely. Consider this:

OR(
  ISBLANK(value),
  (value = LOOKUP(...)
)

With OR(), BOTH expressions are evaluated, even if the ISBLANK() subexpression is TRUE. If ISBLANK(...) is true, the OR() will be TRUE regardless the result of the LOOKUP() subexpression, so there’s no win in doing it, but a loss in the performance hit.

Alternatively with IFS():

IFS(
  ISBLANK(value), TRUE,
  (value = LOOKUP(...), TRUE,
  TRUE, FALSE
)

the LOOKUP() isn’t performed at all if the ISBLANK() is TRUE.

3 Likes

Thank you very much @Steve for your very insightful response. I appreciate. This really helps.

I will try to run some tests in the coming days based on your guidelines and share my observations if any.

4 Likes

I would be very interested in those results!!!

2 Likes

Sure @Steve.

2 Likes