Extract text from QR Code

Hi,

I have been trying to extract text from a QR scan of a Contact vCard. I would like to be able to create a contact record based on the QR data. I didn’t think there is an existing way to do this so I’ve dived into trying to do this myself.

I’ve copied the “raw” data below.

QR Code

BEGIN:VCARD
VERSION:3.0
N:Jordan;Michael
FN:Michael Jordan
ADR:;;;Chicago;Illinois;;United States
TEL;CELL:12345678
EMAIL;WORK;INTERNET:airjordan@gmail.com
END:VCARD

QR Code 2

BEGIN:VCARD
VERSION:3.0
N:Johnson;Magic
ORG:LA Lakers
EMAIL;TYPE=INTERNET:mj@gmail.com
TEL;TYPE=CELL:021 1234567
TEL:03 1234567
ADR:;;Hollywood Ave;Los Angeles;9010;USA
END:VCARD

I have been trying INDEX, SPLIT, LEFT, FIND expression but the SPLIT function seem to be based on a sequence / order and created an error when I tried a different QR codes (that used a different order).
What’s more, I could not come to grips with the LEFT, RIGHT, FIND syntax so am calling out for some help / guidance.

What I have noticed with the two different QR codes is they use similar “labels”. To me a label is anything preceding the colon (:). The N, FN, ADR, ORG, CELL… etc and the data to the right of the label is the text I would like to capture. Excuse me if I have the wrong terminology Label.

Is there an easy way to extract the data based on say the “Label”?

Thanks in advance,

Regards
Te Arawa

Try to separate each line with the SPLIT() like…

SPLIT([QR],CONCATENATE("
"))

Then you should be able to read for example the 3rd value like…
INDEX(SPLIT([QR],CONCATENATE("
")),3)

2 Likes

Good Morning (from New Zealand) and thanks for your quick reply.

That syntax does well in separating data into their own lines. I did have something similar (but not as simple and tidy as what you’ve provided and I’m guessing didn’t take you hours to figure out (that is beside the point).

The syntax I’m sad the say doesn’t help (that I can envisage anyway) because line 4 for the two data sets above have different values (QR Code 1 line 4 = FN:Michael Jordan. QR Code 2 line 4 = ORG: LA Lakers) . So when I’m looking to work with FN (Full Name), some codes will be showing be the ORG data.

Similarly, the line where EMAIL Address is located is different. The data could be in any order hence but the labels seem consistent so makes sense to try to base the search on the labels first.

Thanks again for your assistance.

Hi @Te_Arawa_71
Morena . I am not an expert but in the past have used regex extract in a google sheet to sort out imported csv data that was coming in in non consistent formats.

2 Likes

As you’ve found, AppSheet cannot easily do what you want. Support for VCARD is a good idea, though. I encourage you to submit a feature request!

AppSheet is in no way designed to easily carve up arbitrary text, so the only way you can approach this is with tricks. It’s not clear whether you want everything in the VCARD pulled out and recorded, or whether you want to be able to extract specific lines.

Suppose you want the N (name) line:

MID(
  ANY(
    SORT(
      SPLIT(
        SUBSTITUTE(
          SPLIT(
            [QR Code],
            "
"
          ),
          " , N:",
          " , 000"
        ),
        " , "
      )
      + LIST("999(no name)")
    )
  ),
  4,
  999999
)

The VCARD data is multiple lines, so we have to work with newlines, which makes indenting the expression for clarity difficult. There is one line above that consists solely of a double-quote (") at the very beginning of the line. That line must be exactly that or the expression will fail.

The idea is to mark the line we want with a prefix that sorts alphanumerically before everything else. Then, we convert the VCARD data to a list, sort it, and hopefully find our line at the top. It’s possible, though, the line we want is not included, so we also want to add a line with a default value that sorts after the line we want but before everything else. If the line we want isn’t there, our default value will be the top line.

  1. SPLIT([QR Code], ...) splits the multi-line original into a list, one line per list element. The list is split around the newlines in the original, which is what makes this expression particularly delicate. This gets rid of the newlines.

  2. SUBSTITUTE(..., " , N:", " , 000") locates the list element corresponding to the name (N) label. List elements are separated buy the character sequence, space-comma-space (,). The label (N) follows immediately, terminated by a colon (:). We replace that entire sequence with space-comma-space (,)–to preserve the list element–and the digits, 000. In alphanumeric sorting, numbers sort before letters, so when the list is sorted, this numeric prefix will occur before any labels (which presumably all start with a letter).

  3. SPLIT(..., " , ") takes the textual output produced by SUBSTITUTE() and splits it back into a list.

  4. ... + LIST("999(no name)" adds the default value to the list. 999 sorts after 000 but before any label that starts with a letter. (no name) is the default value. You can make the default value whatever you want, or put nothing after 999 for a blank default value.

  5. SORT(...) sorts the resulting list in alphanumeric order, low-to-high, numbers before letters.

  6. ANY(...) pulls out the first item of the sorted list.

  7. MID(..., 4, 999999) starts at the fourth character of the first item of the sorted list and grabs the first 999,999 characters, or up to however many there are. 4 corresponds to one more than the length of the numeric prefixes we used (000 & 999). 999,999 is just a large number that should cover every likelihood.

See also:







5 Likes

Morena Lynn

Thanks for your suggestion. I haven’t discounted doing calculations from within Google Sheets but my preference would be to try and do everything (or as much as possible) from within AppSheet at this stage.

Cheers

2 Likes

Steve… you’re a MAGICIAN. You mentioned trickery to address my question and the solution you offered works. I’m excited again (I can move forward with my project). This issue was slowly draining the life out of me.

FYI, I am only after the relevant contact information - Name, Address, Organisation, Email and “all” Phone numbers and I do want to record the values. I will also place a feature request for support for vCards.

THANK YOU for your help and for also taking the time to explain your solution. I will admit I would have NEVER come up with that solution in a million years but I’m sure many in the community will find it helpful.

Regards

4 Likes

Interesting case… another option:

INDEX(SPLIT(INDEX(SPLIT([QR COde],“INTERNET:”),2),"
"),1)

4 Likes

Another superb solution Aleksi. Again, I would not have thought of combining two INDEX SPLIT in such a way. I’m glad you and Steve are extremely forthcoming with help and advice.

Much appreciated.

1 Like

Nice!

Doesn’t work:

INDEX(SPLIT(INDEX(SPLIT([QR COde],“N:”),2),"
"),1)

INDEX(SPLIT(INDEX(SPLIT([QR COde],“
N:”),2),"
"),1)

?

2 Likes

There ya go. :slight_smile:

1 Like

I would also prefer this Expression :clap: :slight_smile:

1 Like

Props to @Aleksi for the beautifully simple expression, flops to me for over-engineering a solution.

3 Likes

@Steve There are situations where we need your solution. I’ve again learned from you :slight_smile:

2 Likes

Best flop I ever seen

1 Like