Is there a better way to do this query

What I've got below works but I wondered if there was a more elegant way to do it and what the performance might be like.  I'm a novice at data driven queries but it seems pretty clunky and I am concerned about how it might scale.

A Students table is related to a Classes table through a link table, Attendees.

I wanted to create a slice of the Classes table, Least Attended Classes, containing those classes with the least number of students attending, whatever that might be.  So if we had three classes of 30, 20 and 20 Attendees then Least Attended Classes would contain a slice with two classes, each of 20 Attendees.

First I created a virtual column, [Number of Attendees], in Classes which counted the number of Attendees:
COUNT([Related Attendees])

Then I created a filter condition on the Least Attended Classes slice:
[_THISROW].[Number of Attendees] =
MIN(
   SELECT(
      Classes[Number of Attendees],
      TRUE,
      TRUE
   )
)

Is there a simpler, more elegant way to achieve the same result (a list of the least attended classes)?

My first thought was to try to do it without the virtual column and getting the MIN of a list of COUNTs but I couldn't figure that out.

Will what I have here scale in Classes?  For example, a query optimiser could run the right hand side of the equality above just once, in which case, I assume, it should scale fine.  However if the SELECT is run once for each row this is not so good.

Any tips and advice most welcome.

By way of background, the purpose behind the list of least attended classes is to recommend one of the least well attended classes to students.  I'm more interested in tips on the solution as chosen rather than better ways of solving the underlying problem so this information is provided for background more than as part of the question.  I don't want to waste anyone's time worrying about the underlying problem I'm trying to solve as it is a bit of 'can of worms'.

Thanks in advance

0 6 143
6 REPLIES 6

There may be other techniques to consider, but even for the technique you're applying your slice filter condition could probably be reduced.

[Number of Attendees] = MIN(Classes[Number of Attendees])

In terms of your underlying goal, you could probably get a little fancier if you wanted to return, say, the 3 least attended classes or the 5% least attended classes. To do that, you could incorporate functions like SORT/ORDERBY and INDEX or TOP into the expression. Here's one example.

[Number of Attendees] <= INDEX(SORT(Classes[Number of Attendees]), 3)

 

Thanks @dbaum , do you know if there is a performance advantage using MIN without SELECT.  I've changed to it anyway.  I like the alternate approach to the problem too. Thanks again.

I don't know about performance, but would guess that if either is more performant, it's MIN. FWIW, Re: FAQ: FILTER(), LOOKUP(), MAXROW(), MINROW(), R... - Google Cloud Community  explains various functions that under the hood are equivalent to SELECT and it doesn't mention MIN. You might be able to compare performance of alternate approaches using the Monitor app performance using Performance Profile - AppSheet Help. You could even include the INDEX alternative in your comparison: [...] = INDEX(SORT(....), 1) should be functionally equivalent to MIN.

Quick unrelated question...  How to you choose these boxes

[Number of Attendees] <= INDEX(SORT(Classes[Number of Attendees]), 3)

For the life of me I cannot work out how to get them to show ๐Ÿ™ƒ

I think you're asking about the (confusingly named) "Preformatted" paragraph style:

dbaum_0-1659652143823.png

 

test

test123

 

Awesome, thats been bugging me for a while!

Top Labels in this Space