Extracting specific words from a string using a list

Evening 

Is this at all possible?

Id like to extract the keywords from a long string in column 'Data' using keywords from a list 'Keywords'

does that make sense?

Many thanks 

 

Solved Solved
0 10 243
1 ACCEPTED SOLUTION

SELECT( Keyword[Allergen] , CONTAINS( [_THISROW].[Ingredients] , [Allergen] ) )

View solution in original post

10 REPLIES 10

All is possible.. 

Use an Apps Script task. 

You should give examples of the data and the desired result.

So basically I want to extract allergens from ingredients.

So I want to be able to enter ingredients in column 'ingredients' and the allergen keywords show in column 'allergens' 

The keywords are in a sheet called Keyword allergens

Example

Flour ( WHEAT flour calcium carbonate thiamin niacin iron baking powder ) sugar margarine ( Vegetable Oils ( Palm Rapeseed ) Water Salt ( 2 % ) Emulsifier : Polyglycerol Esters of Fatty Acids Colours : Curcumin Annatto Bixin Flavouring ) butter ( MILK ) EGGS icing sugar ( maize starch refined sugar cane ) ALMOND flavour ( water sunflower oil thickener : xanthan gum; acidity regulator : citric acid; preservative : potassium sorbate ) dark cherry filling ( black cherrys ( 38 % ) water sugar modified starch acidity regulator : citric acid natural flavouring preservative : potassium sorbate ) flaked ALMONDS

Result

Wheat, Milk, Eggs, Almond, Almonds

There is an expression that might help you:
EXTRACT()
https://support.google.com/appsheet/answer/10107889?hl=en&sjid=730445981769874683-EU

 

I might be wrong but doesn't this only work with the specific arguments: Prices, Datetime etc I don't think it can be used with a custom list of words. 

Willing to be corrected on this as it would be the perfect solution ๐Ÿ‘

The examples in the documentation use one text to search for it in a longer one, so it can be used with texts.
I don't know how you have the list of words to search for.
AppSheet has expressions that sometimes surprise you. (I just found out from jose_seddik that if a text expression is applied to a list, it automatically turns the list into chained text.)
Try to find a solution by combining the expressions:

Table[Column] - Get list with content of Column
SPLIT(text, delimiter) - Converts a text to a list
INTERSEC(list1, list2) - Returns a list with the common elements.
UPPER(text) - If instead of putting a text you put a List, it turns the List into a Text chaining its elements (separating them by , ) before converting everything to upper case

When you are editing a formula you can press the text button and check the result in each record.

JSO_0-1690099124441.png

JSO_1-1690099272407.png

Suppose the words are in the Table Keybord allergens and the column is called Allergen
With Keybord allergens[Allergen] you will get a list with the words to search for.
If the ingredients are in the [ingredients] column.
With SPLIT([ingredients]," ") you will obtain a list with all the words of the formula.
If you do INTERSEC(Keybord allergens[Allergen], SPLIT([ingredients]," ")) you will get a list with the elements that are in both,
If you do

UPPER(INTERSEC(Keybord allergens[Allergen], SPLIT([ingredients]," ")))

, it will convert the list to a text string with its components separated by commas and change it to uppercase.
(You may want to make sure with UPPER() of both lists that they are put in the same format to avoid comparison problems.)

 

SELECT( Keyword[Allergen] , CONTAINS( [_THISROW].[Ingredients] , [Allergen] ) )

thank you @Marc_Dillon  I managed to do it using

UPPER(intersect(split([Ingredients]),Keywords[Allergen]) & if(contains([Ingredients],Sulphur Dioxide),", SULPHUR DIOXIDE","")) thanks to @JSO 

But as you can see I had to use a contains to look for a two worded allergen, but your way works perfect thank you 

The results show as MILK,WHEAT,EGG how do i change this to show as MILK, WHEAT, EGG? Cheers

Sorted it with a SPLIT 

Thanks again

Top Labels in this Space