double byte vs single byte characters

NCD

Does the difference between double-byte and single-byte characters matter when using Google Sheets as your data source?

I'm creating an app for a Japanese company, and it got me thinking: do double-byte characters take up more storage than single-byte characters?

And if I create an enum column and use an expression to prohibit people from adding values that already exist, would it consider ＴＨＩＳ and THIS to be the same value?

ACCEPTED SOLUTION

I agree that it's an interesting question.  I found that when using the AppSheet search function, the two are considered to be identical:

Screen Shot 2022-03-15 at 21.10.40.png

But, as @takuya_miyai pointed out, if you write an expression like "ＴＨＩＳ"="THIS", it will be false.

It would be nice if we had the capability of converting such characters within AppSheet. A text function like SINGLEBYTE() would be nice. Then, SINGLEBYTE("ＴＨＩＳ")=SINGLEBYTE("THIS") would be true.

I think there's a strong possibility that users in Japan will type in both double-byte and single-byte characters, so some way to work around such user behavior would be nice.
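To illustrate what a SINGLEBYTE()-style conversion could look like outside AppSheet (for example, in a script that cleans the sheet data), here is a minimal Python sketch. It assumes Unicode NFKC normalization is an acceptable stand-in for "convert to single byte"; the helper name `to_canonical_width` is hypothetical, and nothing here is an AppSheet feature.

```python
import unicodedata


def to_canonical_width(text: str) -> str:
    """Hypothetical helper: fold width variants using Unicode NFKC.

    NFKC maps full-width Latin letters (U+FF21 and up) to their ASCII forms.
    """
    return unicodedata.normalize("NFKC", text)


fullwidth = "\uff34\uff28\uff29\uff33"  # full-width "ＴＨＩＳ"
ascii_text = "THIS"

# A plain comparison treats the two strings as different.
print(fullwidth == ascii_text)  # False

# After normalization, both sides compare equal.
print(to_canonical_width(fullwidth) == to_canonical_width(ascii_text))  # True
```

Note that NFKC folds more than width (for example, it also rewrites some compatibility symbols), so it may be broader than what a dedicated SINGLEBYTE() function would do.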

6 REPLIES

Quite interesting question.

I'm gonna tag some people that might have experience with this.

@Koichi_Tsuji @takuya_miyai @Kirk_Masden 

Hi @NCD 

That depends on the data source.
For example, in a spreadsheet, a single-byte character and a multi-byte character each count as one character.
SQL Server, however, naturally treats them differently.

> And if I create an enum column and use an expression to prohibit people from adding values that already exist, would it consider ＴＨＩＳ and THIS to be the same value?

You can see the answer by trying it:

2022-03-15_07h01_03.png

@SkrOYC, thank you for the suggestion. 🤙

Regarding this part:

« do double byte characters take up more storage than single byte characters? »

Negligible, both in terms of storage space and memory usage. As for storage, the data will be highly compressed, since text data naturally has a high repetition rate. As for memory, each cell would be allocated a fixed-length register regardless of the length of its contents.
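For a sense of the raw sizes involved, the sketch below prints how many bytes a few characters take in some common encodings. It says nothing about how Google Sheets stores data internally (the thread does not establish that); the sample characters and the `encoded_sizes` helper are just illustrative. "Double byte" is a term from legacy encodings such as Shift_JIS, and the byte counts differ in UTF-8 and UTF-16.

```python
# Sample characters, written as escapes so the source stays ASCII-only.
samples = {
    "ASCII T": "T",
    "full-width T": "\uff34",
    "full-width katakana KO": "\u30b3",
    "half-width katakana KO": "\uff7a",
}


def encoded_sizes(ch: str) -> dict:
    """Return the encoded byte length of ch in a few common encodings."""
    encodings = ("utf-8", "utf-16-le", "shift_jis")
    return {enc: len(ch.encode(enc)) for enc in encodings}


for label, ch in samples.items():
    # len(ch) is the number of code points: 1 for every sample here.
    print(label, len(ch), encoded_sizes(ch))
```

Every sample is a single character as far as character counts go, which matches the remark above that a spreadsheet counts both kinds as one character. Only the encoded byte length varies.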

I found out that while "ＴＨＩＳ" and "THIS" are considered to be identical,

"コレ" and "ｺﾚ" are considered to be different 😞

Thanks! Interesting. I assume that you are referring to searching, right? Still, it would be great to have tools to change one to the other.
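As a rough idea of what such a tool could do for the original enum question, here is a Python sketch of a duplicate check that treats full-width and half-width forms as the same value. It would have to run outside AppSheet (for example, in a data-cleaning script); `dedupe_key` and `is_duplicate` are hypothetical names, and using NFKC normalization as the comparison rule is an assumption, not AppSheet behavior.

```python
import unicodedata


def dedupe_key(value: str) -> str:
    """Hypothetical comparison key: NFKC width folding plus whitespace trimming."""
    return unicodedata.normalize("NFKC", value).strip()


def is_duplicate(candidate: str, existing: list[str]) -> bool:
    """Return True if candidate matches an existing value after normalization."""
    key = dedupe_key(candidate)
    return any(dedupe_key(item) == key for item in existing)


# Existing enum values: "THIS" and full-width katakana "コレ".
existing_values = ["THIS", "\u30b3\u30ec"]

print(is_duplicate("\uff7a\uff9a", existing_values))              # half-width "ｺﾚ": True
print(is_duplicate("\uff34\uff28\uff29\uff33", existing_values))  # full-width "ＴＨＩＳ": True
print(is_duplicate("THAT", existing_values))                      # False
```

NFKC also recomposes half-width katakana that carry a separate voicing mark, so the two-character "ｶﾞ" becomes the single character "ガ". This means the normalized string can be shorter than the input.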
