How do I use the EXTRACTHASHTAGS() function with non-English tags?

Hello everyone!
My experiments have shown that EXTRACTHASHTAGS() only works with English characters.

@raketa,

This is not a solution, only some observations. I tested with different languages, mainly ones that use different scripts, and it seems to work with most scripts. I used Google Translate to translate the same sentence from English into other languages and then applied the EXTRACTHASHTAGS() expression. Of course, it may not work with a few scripts or languages.

2 Likes

@Suvrutt_Gurjar, thanks for the solution, but it doesn’t work for me :(
My hashtags contain character sets that are not translated.
In general, I’m trying to solve a more global problem.
I need to get a list of all matches against a tentative list of possible options.
This has to happen while the row is being edited, i.e. the user must see the list at that moment, so a group-action solution that recalculates items one at a time is not suitable.

Okay, thank you for the update. Your exact requirement is not clear. There may not be a solution, but you may wish to share your requirement with a few more examples so that the community can understand it and share any insights.

2 Likes

In short, the task is as follows:

  1. There is a Table A with operators.
  2. There is a Table B with text that contains the operators from Table A.
  3. When adding or editing a record in Table B, you need to see a list of the operators from Table A that are contained in the text field of Table B.

An example of an operator: “{operator_char1_1}”. (There are several similar operators in Table A).

Example text: “Beginning of text {operator_char1_1} continuation of text {operator_char2_1} still continuation of text {operator_char1_3} end of text {operator_char3_2}.”

At the output, you need an Enumlist field with the following content:
{operator_char1_1},
{operator_char2_1},
{operator_char1_3},
{operator_char3_2}

My plan was to do this by substituting the characters “{” → “#” and “}” → “space” to separate the new hashtags from the text, and then use the EXTRACTHASHTAGS() function, but this function does not work with non-English characters.

Perhaps there is a more elegant way to extract the list I want.
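
For illustration only, here is a minimal Python sketch of the extraction logic described above, pulling the “{...}” operators out of the text directly instead of converting them to hashtags first. It is not an AppSheet expression; the names `OPERATOR_PATTERN` and `extract_operators` are hypothetical, and it assumes an operator is any run of non-brace characters between “{” and “}”.

```python
import re
from typing import List

# Hypothetical pattern: "{" + one or more characters that are not braces + "}".
# It does not depend on the script or language used inside the braces.
OPERATOR_PATTERN = re.compile(r"\{[^{}]+\}")


def extract_operators(text: str) -> List[str]:
    """Return each distinct {operator} in order of first appearance."""
    seen = set()
    result = []
    for match in OPERATOR_PATTERN.findall(text):
        if match not in seen:
            seen.add(match)
            result.append(match)
    return result


text = (
    "Beginning of text {operator_char1_1} continuation of text "
    "{operator_char2_1} still continuation of text {operator_char1_3} "
    "end of text {operator_char3_2}."
)
print(extract_operators(text))
```

With the example text from this post, the sketch prints the four operators in the order they appear, which is the content wanted for the EnumList field.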

Thank you. Your requirement is much clearer now. In general your approach also looks good.

However, a couple of concrete examples of the text, showing where and how it fails, would have been helpful for making further suggestions. Anyway, someone else may have a suggestion for you based on the information shared so far.

2 Likes

A hashtag consists of the initial hash sign (#) followed immediately by one or more non-space, non-punctuation characters.

There is no such thing as “English characters”.
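
As a rough illustration of that definition (not AppSheet’s actual implementation, which is not shown in this thread), the Python sketch below approximates “non-space, non-punctuation” with the regex class `\w`, which is Unicode-aware for Python 3 strings. The name `extract_hashtags` is hypothetical. One known limit of this approximation: `\w` also accepts the underscore and does not match combining marks, so tags in scripts that rely on such marks could be cut short.

```python
import re
from typing import List

# "#" followed immediately by one or more word characters.
# For str patterns in Python 3, \w covers letters and digits from any script.
HASHTAG_PATTERN = re.compile(r"#\w+")


def extract_hashtags(text: str) -> List[str]:
    """Return every hashtag found in the text, in order of appearance."""
    return HASHTAG_PATTERN.findall(text)


print(extract_hashtags("Hello #world, привет #мир and #東京!"))
# prints ['#world', '#мир', '#東京']
```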

3 Likes

Hi @Steve!
Thank you for your attention to my task. At first I also thought it was nonsense: surely the platform cannot process hashtags differently depending on the language.
I am attaching the video below to confirm my problem.
I hope you can understand why it doesn’t work as expected.

1 Like

Hi @Steve!
Did you manage to work out what could be going wrong with detecting the hashtags in the text? Were you able to see my video (Monosnap) confirming the problem?

1 Like

I am unable to view the video. Could you post it to YouTube instead?

1 Like

@Steve, I made a GIF animation:
2021-08-28_23-43-24

2 Likes

Thank you for the GIF. I have no explanation for the behavior you’re experiencing. I’d expect it to work, so I’d call this a bug (or I don’t fully understand how hashtags are defined by AppSheet). Please contact Support for help with this, and let us know what you find out!

https://www.appsheet.com/Support/Contact

2 Likes

@Steve, yes, I sent it to support.
Once I get clarification, I will post the answer here.

2 Likes

Hi @raketa,

Thank you. AppSheet’s EXTRACTHASHTAGS() function extracts hashtagged words from a text. At least from the GIF you have shared, it appears that you are doing the reverse: you enter some words, the upper field in turn attaches hashtags to those words, and for some of them that process does not work.

Could you elaborate, just for our understanding? Maybe I am missing something.

@Suvrutt_Gurjar, yes of course, here are the details of my process.
I take the LongText field and write any text content in it, inserting specific operators like {any_operator_1}. Next, I need to extract these operators into a separate list so that I can then compare them with the database of available operators. The example in the GIF showed only a “piece” of this process.
The point is that the user can make a mistake, so this is a kind of validation within a specific process.
Below I show how it works with a regular hashtag insertion:
2021-08-30_19-46-12
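
The comparison step described in this post can be sketched in Python as follows. This is an illustration of the logic only, not an AppSheet expression; `find_unknown_operators` is a hypothetical name, the `known_operators` list stands in for Table A, and `{operator_char9_9}` is a made-up example of a user typo.

```python
from typing import Iterable, List


def find_unknown_operators(extracted: Iterable[str], known: Iterable[str]) -> List[str]:
    """Return the extracted operators that are missing from the known list."""
    known_set = set(known)
    return [op for op in extracted if op not in known_set]


# Stand-in for the operators stored in Table A.
known_operators = ["{operator_char1_1}", "{operator_char2_1}"]

# Stand-in for the operators pulled from the text; the second one is a typo.
extracted = ["{operator_char1_1}", "{operator_char9_9}"]

print(find_unknown_operators(extracted, known_operators))
# prints ['{operator_char9_9}']
```

An empty result would mean every operator in the text exists in Table A.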

1 Like

Oh okay. Thank you very much @raketa.

Will look forward to what response you get from the support team.

2 Likes