Hi,
I have a question about the function ExtractNumbers. The function extracts the numbers found in an OCR text and returns a list.
Do you think the following behavior is intended?
extractnumbers(โ41โ) : (41) ok
extractnumbers(โ41Bโ) : (4100000000) nok, should in my view be 41
extractnumbers("B464/116) : (116) nok, should in my view be a list (464 , 116)
Do you have an idea how I could get the wanted behavior?
I tried also to substitute each character but digits by one space. Interestingly the last example then returned 464.116
It was ok, when I substituted each character with 2 spaces.
Thanks for any advice
Solved! Go to Solution.
Yeah, it sucks having to nest 26 SUBSTITUTE()s just to filter out letters, and then there are still special characters to consider. But I think thatโs the best we can do with the current functions in AppSheet.
We could REALLY use REGEX, or at least a more powerful SUBSTITUTE() with optional sets of parameters, or something like FileMakerโs FILTER().
Another option is saving the raw value to the sheet, and then using a Google Sheet formula with REGEX to extract the numbers to a new sheet column.
Since ExtractNumbers() is focused on numbers, itโs going to always struggle with handling any other characters.
In order to get this to work, you might have to first clear out all the letters and things - this way the ExtractNumbers() is given a clean input.
Clearing out all the letters is not a โprettyโ looking process; it requires nesting a series of SUBSTITUE() statements, one for each letter/character you with to replace/remove.
substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(
TRIM([ENTER_YOUR_COLUMN_HERE]),
" , ", โ"),
", ", "โ),
" โ, โ"),
โ-โ, "โ),
left(โโ โ, 1), โโ),
left(โโ ', 1), โโ),
โ/โ, โโ),
โ.โ, โ_โ),
โ,โ, โโ),
โโ, โโ),
โ:โ, โโ),
โ;โ, โโ),
โ[โ, โโ),
โ]โ, โโ),
โ(โ, โโ),
โ)โ, โโ),
โ{โ, โโ),
โ}โ, โโ),
โ!โ, โโ),
โ@โ, โโ),
โ#โ, โโ),
โ$โ, โโ),
โ%โ, โโ),
โ^โ, โโ),
โ&โ, โโ),
โ*โ, โโ),
โ+โ, โโ),
โ?โ, โโ),
โ|โ, โโ),
โ<โ, โโ),
โ>โ, โโ)
My general rule of thumb for keys is โText/Hidden/UNIQUEID()โ ALWAYS! But sometimes a sensical ID can help debugging things later.
If you examine the formula above, you can see that there is a dedicated SUBSTITUTE() for each character (or series of characters) that I want to remove/replace.
So you would start with something like the following:
SUBSTITUTE([OCR Result], โAโ, โโ)
Then wrap another around this for B:
SUBSTITUTE(SUBSTITUTE([OCR Result], โAโ, โโ), โBโ, โโ)
Etc.
Itโs not pretty, but it gets the job done. Might be the only way to get ExtractNumbers() to play nicely.
Or donโt use EXTRACTNUMBERS().
Do you have another proposal how to identify the numbers?
Perhaps you could train the OCR model to ignore the letters?
Thanks for the proposal. I tried this and it works in principal. To be honest I do not like the solution, but it is working.
Yeah, it sucks having to nest 26 SUBSTITUTE()s just to filter out letters, and then there are still special characters to consider. But I think thatโs the best we can do with the current functions in AppSheet.
We could REALLY use REGEX, or at least a more powerful SUBSTITUTE() with optional sets of parameters, or something like FileMakerโs FILTER().
Another option is saving the raw value to the sheet, and then using a Google Sheet formula with REGEX to extract the numbers to a new sheet column.
Thanks for detailed explanation and the hint for the featue requests. I like the REGEX capabilities as well. Voted!
User | Count |
---|---|
32 | |
31 | |
30 | |
18 | |
17 |