Expression using REF_ROWS VC is not returning rows...

WillowMobileSys · 11-01-2020 07:19 AM

I am creating an expression as follows:

SELECT(Parts[Part ID], IN([Kit], SELECT([Related Ordered Kits][Kit], [Processed?] <> true)))

I am trying to leverage already existing REF_ROWS for better performance.

When using this in an action on Form Save, it doesn’t appear to return any rows. So I began testing with the aid of Virtual Column (VC).

The VC is defined as a List of Ref Type with the base table = “Parts”. Part ID is the Parts table key column.

In the Detail view, the VC shows as below named “Test Kit Count” - indicates 4 rows but no rows are displayed

Clicking on the “View” button, the correct Parts_Inline view is being used but again no rows displayed. This same view is used elsewhere with a REF_ROWS column without issue so I don’t think it is a view problem.

If I change VC List base type from Ref to text, the 4 Part ID values are properly listed in the view.

However, if I click the Edit button to go into the Form and simply click Save, when the app returns to the Detail view the VC (base type as text) is blank initially. Waiting for few seconds, without touching anything else, the VC will update to the 4 values again as shown above.

What am I missing?

Why is the initial expression, when used in an action returning no rows?

Why, when placing the expression in a VC does it show no rows in the inline view?

And why, when the base type is text, does the VC blank out after a Save and then update again with the correct values?

Steve

I’m not a fan of the Related … columns because their behavior is seemingly inconsistent, just as you’re seeing.

Remember, the Related … columns are virtual, so only update within a form view, when the row is updated by an action, or during a sync. Somehow, IsPartOf can also trigger an update, but it’s not clear to me how that works.

When you see that initial empty value, then a correction after a few moments, that says to me a sync happened behind the scenes.

I just avoid using Related … columns.

MultiTech

If a child record of the table that has this marked as “True” is updated, then a re-calc of the parent level is triggered - or it’s supposed to.

This level of integration is buggy.

I would never do anything like this…

I see what you’re trying to do, but you need to split things into different chunks.

For instance:

Take your de-reference list and make that it’s own column
[Related Order Kits][Kit]

But, since you want only the ones that are not processed - first, create a slice that represents these, then make your ref_rows use that instead of the root table.

This then leaves you with a list of all the [Related Incomplete Kits][Kit] in a single column - or in other words: “a list of all the [kit] values for each incomplete kit.”

To get your actual result, a list of all the [Part ID]s for the kits that have yet to be processed…

If you create a slice for these Incomplete kits, you can easily pull data from that subset.
Inside your Kit table, create a VC of all the remaining parts that need to be claimed (or whatever).

Then you can use the following Incomplete_Kits[Remaining_Parts] to pull a list of all the “lists of parts” - which you need to specially deal with by using SPLIT() to transform that into a regular “list of part IDs.”

SPLIT(Incomplete_Kits[Remaining_Parts], " , ")

That will give you a LIVE list of all the remaining part IDs from all the records inside the slice.

WillowMobileSys

@Steve @MultiTech_Visions

First, I want to thank both of you for your input. It is always appreciated and you both have given me some great insight to something I have struggled with for a while but just never put time into trying to dig deeper until now. Thank you!!

I ended up changing the SELECT to operate against the table directly.

Is this only because its not supported well? The parser passes the expression which implies it’s supported. You CAN create a VC with the desired dereferenced column that proves to me the implementation is supported. The two should work together.

It seems to me that AppSheet has provided the capabilities to use Related lists in expressions and since they are just lists why not be able to use them, and any implemented Related list dereference functions, anywhere lists are supported.

My SoapBox

My main goal here is to increase performance. Related lists are a needed list that many apps will need to operate on frequently. They are the most efficient way to get at that data because, well, they are already there with desired items. It would be greatly beneficial to be able to use them anywhere lists can be or are expected.

Adding more VC’s and Slices and writing more SELECT statements not only decrease overall app performance but also increase develop maintenance.

Steve

That inference is entirely wrong.

WillowMobileSys

How so?

Steve

The expression parser will “validate” plenty of expressions that won’t behave as intended. The parser is not purpose-built for AppSheet. It’s an Excel formula parser that’s been tweaked to recognize AppSheet-isms, and it’s not all that great at it.

The reason all the documentation I write quotes text values is because the parser recognizes Excel functions and reserved words and will try to validate their syntax, producing an error if the expression doesn’t match Excel’s expectations.

Date and Time are Excel functions, but also common column names. When an unquoted column of either of those names is used with (e.g.) LOOKUP(), the parser will complain because the Excel functions must have arguments, even though LOOKUP() isn’t even expecting an expression.

If I recall correctly, the parser recognizes and validates both of the Excel functions N() and T(), neither of which exist in AppSheet.

There are several situations where extraneous or missing punctuation is entirely overlooked by the parser. In some of those cases, a huge chunk of the expression is just silently ignored.

WillowMobileSys

Got it! The AppSheet syntax is fairly unique (at least to anything I’ve seen) I thought it was a custom parser they had full control over.

Grant_Stead

They do have control of it, but it’s not perfect.

Austin

The parser will throw errors where there are none and will validate expressions that don’t work for any number of reasons. My favorite is setting a ref column it expects the result to be type ref but you grab the base type of the key for that reference table. Throws an error but is completely accepted and functions just fine.
Dont know if that bug still exist but it did a few weeks ago.

MultiTech

When you see this, some piece of meta-data is off; typically it will be that at some point the “ref” status of the data is lost.

this might be due to the fact that you’re trying to do too much in one column.

If I were you: from here I do what I’m suggesting and split things out into separate columns.

I usually find it’s better to put a dereference-list into it’s own column, then use that in whatever formula I’m trying to get working.

MultiTech

This is a bug dealing with the “background final sync after all the syncs” - hard to explain.

Basically, after all the records that need to be pushed into the cloud are pushed, the app goes through a final sync to finalize all the values.

Then, if you’re not paying attention you’ll miss it, another sync happens (you can spot this by watching the old sync button (if you have it on)). It’s this background sync that has been causing apps on the more complicated level to spawn errors of some kind.

It’s been the bane of an app I built for a research institute: things are pretty complex in that app, and this is a regular issue we have.

@Joshua_Aldrich

WillowMobileSys

I understand. I guess the bug is that the VC is physically blanked out instead of simply being overwritten - even with the same values(s)?

MultiTech

I can completely understand where you’re coming from @WillowMobileSystems, and you’re thinking about things in the right way - now we need to add in a whole-nother layer: the calculation steps.

This is where the real meat and potatoes of your app’s sync time comes in.

If you’ve got the following in a VC:

Select(Parts[PartID], in([PartID], Select([Related Records][Column_A], [Column_B] = “Some Value”)))
the following will happen:

First the list of related records needs to be created
- this is relatively fast, the system is design that way, but it’s still essentially another version of the following:
Select(ChildTable[ChildID], [Parent_Ref] = [_Thisrow].[ParentID])
- So for each parent record, a complete search through every child record happens to find the ones that should be in the list.
  - Meaning: If I’ve got 10 parent records and 10 child records, then we’ve got 100 records to parse through and sort into the appropriate lists
  - More realistically, my parent table might have 500 rows and the child 1500
    - 750,000 rows that your app is having to manage…just right there.
- Thankfully, AppSheet engineers designed an indexing system in the background that indexes this sh!t out of your data for you - so while this sounds daunting, the system handles it really fast.
  - So we can pretty much ignore this part of the steps (though they do still happen, they’re just really really fast).

Now that the system has the base child lists, next we need to process your actual formula. The first part we need is the de-reference list of [Column_A] values - and since the de-ref list is included in a select formula, the additional conditions from that select are applied to this next step.
- Another select happens…
Select(Child_Table[Column_A], and(
in([ChildID], [_Thisrow].[Related Records]),
[Column_B] = “Some Value”
))
- How many child records were there again?? 1500 - so that’s 1500 record-calucation/evaluation-steps just to find that portion of your formula - for one row.
  - Since we’ve got 500 parent records, that’s 750,000 steps for the whole table…

Now we have a list of all the [Column_A] values from the [Related Records] column, where [Column_B] = “Some value.”

The final step is to take that list, and then apply that to the Parts table. This is where the comes in…

How many records in the parts table???
That’s a bit much… let’s say 2000.
- For each record in the parts table, ALL of the previous steps happen.

For a grand total of… 2000 X 750,000

1,500,000,000 rows to parse.

That’s how many records the system is having to deal with, just in that one VC.

If you dig deep into the audit logs, you can find the times for each and every one of these steps, and while many of them are times like 0.00235 they still add up.

I should note: I’m not sayin that the system is having to make that many calculations…
what I’m saying is that the 1.5 billion number is the RECORD COUNT of how many times the system will read data from your tables.

To my suggestion of the slice:

If you move the [Column_B] = “Some Value” into a slice, instead of that being calculated for each record, instead it only processes once - when the slices are calculated - then that list is stored in the slice.

To the point of moving the de-reference list into it’s own column:

that’s because it’s been my experience that the system works better when de-reference lists are put into their own column;
- this way they contain all the necessary meta data, that sometimes you won’t be able to set due to too many layers

Hope I made some sense in there; this is a really complicated and dense topic. I’m putting together a video now about variable handling, should be out in a few weeks.

WillowMobileSys

Great write-up! Kudos for taking the time. I do totally get what you are driving at but I think there is a fundamental mis-understanding in how the table scanning works with regards to Ref values - at least to the best of my understanding.

First, just for clarification, this kind of analysis is usually measured in “operations” using Big O notation. There is a whole section of study within the Computer Science industry determining the performance of code and coding algorithms. They use Big O notation to compare one algorithm to another to find the most performant.

Now, how does table scanning work with regards to Ref/key values?

Tables are typically indexed by the key value. Simply put, this means that a table row can be found within a single operation when the row key is known - no full scan required. When there is a list of Ref/key values, you can think of each row as being retrieved directly from the host table - when needed.

This is why Ref columns REQUIRE the usage of row keys in the constructed list.

It is also why I started this post…because using Related lists, and even Ref columns, is MUCH more efficient than having to rely on tables scans.

There is a caveat to all this, it all depends on how AppSheet has implemented their logic. I am fairly confident in saying that a lot of their implementation does indeed take advantage of indexing otherwise our apps would be tremendously slow.

Good discussion!

Fabian_Weller

Hi @MultiTech_Visions did you have the time to do this video?

MultiTech

@Fabian no
It’s been on my list this whole time, and I’ve actually tried a few times - but every time I try and come up with some sort of way to communicate all this cohesively in a video… it just doesn’t come across.

It’s still on my list, thanks for the reminder - I’ll take another crack at it.