@tony (I hope it is OK to ping you directly - you seem to be the ML guy?)
1: Does training continue if I log out? I uploaded the Iris dataset (150 rows, 4 columns) an hour or so ago and it is still training. I can imagine that a larger dataset will take many days (of course depending on many variables)
2: Is there a public doc describing training optimization available? E.g., one that shows the options/benefits of feature scaling/normalization?
3: Would you describe the algorithm under the hood as production-ready ML? Or is it more of a "novelty" implementation demonstrating how other Google Cloud services can be integrated with AppSheet?
4: I'm excited by the possibilities that "democratized" ML apps can bring to businesses - especially integration with other Cloud ML services in the future. Would you say that the current implementation is "fast enough" for training? Or would it be better for a dev to train their own models on another (faster?) platform and then go down the traditional code-app path?
5: I'm an ML newbie so I don't have any specific use cases in mind. I imagine that once I get my head around the tech I will be seeing opportunities everywhere.
Thanks and cheers
Hi @Derek_N.
Training should be relatively quick (a minute or two at most). The fact that the Iris dataset is not training suggests that there's a bug. Could you create an isolated app that reproduces the issue and send the details to support@appsheet.com? We can take a look.
The platform takes care of scaling and normalization. Right now, we keep the details of the training algorithm and model hidden from the app creator. We're considering future designs that include the ability to compare model alternatives, so your feedback is welcome.
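For anyone wondering what "scaling and normalization" means in practice, here's a minimal sketch of z-score standardization in plain Python. AppSheet's actual preprocessing internals aren't documented, so this is only an illustration of the general technique, not the platform's code:

```python
# Illustrative sketch of z-score standardization: rescale a feature so it
# has mean 0 and standard deviation 1. This is a common normalization
# technique, shown here purely as an example (not AppSheet internals).
import statistics

def standardize(values):
    """Return z-scores for a list of numeric feature values."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / stdev for v in values]

# First five Iris sepal lengths, before and after scaling.
sepal_lengths = [5.1, 4.9, 4.7, 4.6, 5.0]
scaled = standardize(sepal_lengths)
print(scaled)
```

After scaling, `statistics.mean(scaled)` is 0 and `statistics.pstdev(scaled)` is 1 (up to floating-point error), which keeps any one feature from dominating the others just because of its units.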
Right now the performance of the system is below that of Google AutoML Tables, but we're investing more in the ML features, so improvements should be coming.
Unless you're dealing with truly big data, the training time won't be the bottleneck. The biggest bottleneck is developer time, which we hope to reduce by providing no-code features for ML.
Please share any ideas/questions you have on the community; it's very helpful for us to learn about the use cases and scenarios you're interested in.
Thanks Tony.
1: I have reproduced the issue and have sent an email to support with the app url.
2: Roger. May I suggest making the model feature importance visible as well? As I understand it (very, very poorly), model and local feature importance are analogous to "multivariate principal components analysis" and would also allow the developer to engineer/verify that the "design of experiment"/model features are necessary and sufficient?
3: Thanks for the AutoML Tables heads-up. Very helpful. I'm guessing the models are currently assigned from the "column to predict" types. E.g.:
Yes/No - Logistic Regression
Enum - Logistic Regression (multiclass)
Ref - Logistic Regression (multiclass) (is this correct? Not sure what Ref means)
Price - Linear Regression
Decimal - Linear Regression
Number - Linear Regression
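That guessed mapping could be written down as a simple lookup table. To be clear, everything here is an assumption from this thread (both the grouping and the model names), not documented AppSheet behavior:

```python
# Hypothetical lookup table for the guessed mapping of AppSheet
# "column to predict" types to model families. The grouping is an
# assumption from this thread, not documented platform internals.
MODEL_BY_COLUMN_TYPE = {
    "Yes/No": "logistic regression (binary classifier)",
    "Enum": "multi-class logistic regression",
    "Ref": "multi-class logistic regression",
    "Price": "linear regression",
    "Decimal": "linear regression",
    "Number": "linear regression",
}

def model_for(column_type):
    """Return the guessed model family for a column type."""
    return MODEL_BY_COLUMN_TYPE.get(column_type, "unsupported column type")

print(model_for("Enum"))  # multi-class logistic regression
```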
4: Roger.
5: Will do.
Cheers
Feature importance is shown after training in the editor.
There are three kinds of models that get trained, depending on the type of the output column: binary classifiers (Yes/No columns), multi-class classifiers (Enum columns), and regressions (numeric columns). "Ref" means a reference column.
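To make the multi-class case concrete, here's a self-contained toy example on a few Iris-style rows. It uses a nearest-centroid classifier purely for illustration; it is not the (hidden) algorithm AppSheet actually trains:

```python
# Toy multi-class classifier on Iris-style rows, to illustrate what
# "enum column -> multi-class classifier" means. Nearest-centroid is
# used only as a simple stand-in; AppSheet's real model is not public.
import math
from collections import defaultdict

ROWS = [  # (sepal_len, sepal_wid, petal_len, petal_wid, species)
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

def train(rows):
    """Compute the mean feature vector (centroid) for each class."""
    sums = defaultdict(lambda: [0.0] * 4)
    counts = defaultdict(int)
    for *features, label in rows:
        counts[label] += 1
        for i, value in enumerate(features):
            sums[label][i] += value
    return {label: [total / counts[label] for total in sums[label]]
            for label in sums}

def predict(centroids, features):
    """Return the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids,
               key=lambda label: math.dist(features, centroids[label]))

model = train(ROWS)
print(predict(model, (5.0, 3.4, 1.5, 0.2)))  # setosa
```

The Yes/No case is the same idea with two classes, and the numeric case replaces "pick the nearest class" with "predict a number".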