EXTRACTNUMBERS without spaces

Hi,
I have a question about the function ExtractNumbers. The function extracts the numbers found in an OCR text and returns a list.

Do you think the following behavior is intended?
extractnumbers(“41”) : (41) ok
extractnumbers(“41B”) : (4100000000) nok, should in my view be 41
extractnumbers("B464/116) : (116) nok, should in my view be a list (464 , 116)

Do you have an idea how I could get the wanted behavior?

I tried also to substitute each character but digits by one space. Interestingly the last example then returned 464.116
It was ok, when I substituted each character with 2 spaces.

Thanks for any advice

Solved Solved
0 7 470
1 ACCEPTED SOLUTION

Yeah, it sucks having to nest 26 SUBSTITUTE()s just to filter out letters, and then there are still special characters to consider. But I think that’s the best we can do with the current functions in AppSheet.

We could REALLY use REGEX, or at least a more powerful SUBSTITUTE() with optional sets of parameters, or something like FileMaker’s FILTER().

Another option is saving the raw value to the sheet, and then using a Google Sheet formula with REGEX to extract the numbers to a new sheet column.

View solution in original post

7 REPLIES 7

Hey @Christoph_Bucker

Since ExtractNumbers() is focused on numbers, it’s going to always struggle with handling any other characters.

In order to get this to work, you might have to first clear out all the letters and things - this way the ExtractNumbers() is given a clean input.

Clearing out all the letters is not a “pretty” looking process; it requires nesting a series of SUBSTITUE() statements, one for each letter/character you with to replace/remove.

Here's an example of a formula that I've been testing out for converting the text of a column into a sensical ID (to use as the Key value).

substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(

TRIM([ENTER_YOUR_COLUMN_HERE]),

" , ", “"),
", ", "
”),
" “, “"),
“-”, "
”),
left(”’ “, 1), “”),
left(’” ', 1), “”),
“/”, “”),
“.”, “_”),
“,”, “”),
“”, “”),
“:”, “”),
“;”, “”),
“[”, “”),
“]”, “”),
“(”, “”),
“)”, “”),
“{”, “”),
“}”, “”),
“!”, “”),
“@”, “”),
“#”, “”),
“$”, “”),
“%”, “”),
“^”, “”),
“&”, “”),
“*”, “”),
“+”, “”),
“?”, “”),
“|”, “”),
“<”, “”),
“>”, “”)

My general rule of thumb for keys is “Text/Hidden/UNIQUEID()” ALWAYS! But sometimes a sensical ID can help debugging things later.


If you examine the formula above, you can see that there is a dedicated SUBSTITUTE() for each character (or series of characters) that I want to remove/replace.

So you would start with something like the following:

SUBSTITUTE([OCR Result], “A”, “”)

Then wrap another around this for B:

SUBSTITUTE(SUBSTITUTE([OCR Result], “A”, “”), “B”, “”)

Etc.

It’s not pretty, but it gets the job done. Might be the only way to get ExtractNumbers() to play nicely.

Or don’t use EXTRACTNUMBERS().

Do you have another proposal how to identify the numbers?

Perhaps you could train the OCR model to ignore the letters?

Thanks for the proposal. I tried this and it works in principal. To be honest I do not like the solution, but it is working.

Yeah, it sucks having to nest 26 SUBSTITUTE()s just to filter out letters, and then there are still special characters to consider. But I think that’s the best we can do with the current functions in AppSheet.

We could REALLY use REGEX, or at least a more powerful SUBSTITUTE() with optional sets of parameters, or something like FileMaker’s FILTER().

Another option is saving the raw value to the sheet, and then using a Google Sheet formula with REGEX to extract the numbers to a new sheet column.

Thanks for detailed explanation and the hint for the featue requests. I like the REGEX capabilities as well. Voted!

Top Labels in this Space