A magical little tool that turns a URL into "oooooooooo"

2023.04.28


I came across a very creative little website, shown in the cover image. Its function is simple: it converts a URL into one that looks like ooooooooo, and visiting the converted address takes you back to the original one. The process is shown in the figure below. The idea resembles a URL shortener, except that it makes your URL very, very long, and the result looks like nothing but a run of o's. I was curious how this works, so I read the source code. This article explains the core logic, which is an interesting and clever piece of design.

[Figure: the conversion process between the original URL and the ooooooooo URL]

Prerequisites

Before getting started, here are a few things you need to know. Because both directions are really just conversions between strings, a few encoding and decoding techniques are used.

"Convert a string to a UTF-8 array": the string is encoded as UTF-8 bytes, where each ASCII character takes one byte and other characters take two to four. For example, http becomes [104, 116, 116, 112].

toUTF8Array(str) {
        var utf8 = [];
        for (var i = 0; i < str.length; i++) {
            var charcode = str.charCodeAt(i);
            // 1 byte: ASCII
            if (charcode < 0x80) utf8.push(charcode);
            else if (charcode < 0x800) {
                // 2 bytes
                utf8.push(0xc0 | (charcode >> 6),
                    0x80 | (charcode & 0x3f));
            }
            else if (charcode < 0xd800 || charcode >= 0xe000) {
                // 3 bytes: the rest of the BMP, excluding surrogates
                utf8.push(0xe0 | (charcode >> 12),
                    0x80 | ((charcode >> 6) & 0x3f),
                    0x80 | (charcode & 0x3f));
            }
            else {
                // 4 bytes: combine the surrogate pair into one code point.
                // The 0x10000 offset is required; without it the bytes are wrong.
                i++;
                charcode = 0x10000 + (((charcode & 0x3ff) << 10) | (str.charCodeAt(i) & 0x3ff));
                utf8.push(0xf0 | (charcode >> 18),
                    0x80 | ((charcode >> 12) & 0x3f),
                    0x80 | ((charcode >> 6) & 0x3f),
                    0x80 | (charcode & 0x3f));
            }
        }
        return utf8;
    }

That was encoding; the matching decoding step is "convert a UTF-8 array to a string". For example, [99, 111, 109] decodes to com.

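Utf8ArrayToStr(array) {
        var out, i, len, c;
        var char2, char3, char4;

        out = "";
        len = array.length;
        i = 0;
        while (i < len) {
            c = array[i++];
            switch (c >> 4) {
                case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                    // 0xxxxxxx
                    out += String.fromCharCode(c);
                    break;
                case 12: case 13:
                    // 110x xxxx   10xx xxxx
                    char2 = array[i++];
                    out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
                    break;
                case 14:
                    // 1110 xxxx  10xx xxxx  10xx xxxx
                    char2 = array[i++];
                    char3 = array[i++];
                    out += String.fromCharCode(((c & 0x0F) << 12) |
                        ((char2 & 0x3F) << 6) |
                        ((char3 & 0x3F) << 0));
                    break;
                case 15:
                    // 1111 0xxx  10xx xxxx  10xx xxxx  10xx xxxx
                    // 4-byte sequence; without this case such characters are silently dropped
                    char2 = array[i++];
                    char3 = array[i++];
                    char4 = array[i++];
                    out += String.fromCodePoint(((c & 0x07) << 18) |
                        ((char2 & 0x3F) << 12) |
                        ((char3 & 0x3F) << 6) |
                        (char4 & 0x3F));
                    break;
            }
        }

        return out;
    }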
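
As a cross-check, modern browsers and Node.js provide the standard TextEncoder and TextDecoder APIs, which perform the same UTF-8 conversion. The sketch below is not part of the tool's source; the helper names encodeUtf8 and decodeUtf8 are made up for illustration.

```javascript
// Hypothetical helpers (not from the tool's source) built on the standard
// TextEncoder / TextDecoder APIs, which use UTF-8 by default.
const encodeUtf8 = (str) => Array.from(new TextEncoder().encode(str));
const decodeUtf8 = (bytes) => new TextDecoder().decode(new Uint8Array(bytes));

console.log(encodeUtf8("http"));         // [ 104, 116, 116, 112 ]
console.log(decodeUtf8([99, 111, 109])); // com

// A character outside the Basic Multilingual Plane takes four bytes.
console.log(encodeUtf8("\u{1F600}"));    // [ 240, 159, 152, 128 ]
```

Comparing the output of these helpers with the hand-written functions above is a quick way to test them on non-ASCII input.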

"Represent a Number as a base-4 string": toString is used all the time, but rarely with an argument. Its optional radix parameter specifies the base to convert to, in the range 2 to 36. If it is omitted, base 10 is used.

n.toString(4)

"Pad the left side of a string with a given character until it reaches a given length": the basic syntax is str.padStart(targetLength [, padString]).

  • targetLength: required. The minimum length of the resulting string. If the current string is shorter, padString is added on the left until the string reaches this length.
  • padString: optional. The string used for padding; the default is " " (a space).
str.padStart(4, '0')
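
Combining the two calls shows why the padding matters for this tool. This is a small illustrative sketch; the helper name toBase4 is hypothetical and does not appear in the original source.

```javascript
// Hypothetical helper: one byte (0-255) as exactly four base-4 digits.
const toBase4 = (n) => n.toString(4).padStart(4, "0");

console.log((104).toString(4)); // 1220 (already four digits, "h")
console.log((46).toString(4));  // 232  (only three digits, ".")
console.log(toBase4(46));       // 0232
console.log(toBase4(255));      // 3333 (the largest byte still fits)
```

Without the padding, the digits of "." would run into those of the next byte and the decoder could not tell where one byte ends.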

URL encoding/decoding

Now for the URL encoding itself. The core steps are:

  • Convert the URL to a UTF-8 array
  • Convert each byte to base 4 and left-pad it with 0 to 4 digits
  • Split the result into an array of single characters
  • Map each digit to a different form of o
  • Join everything back into a string, which is the converted URL
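// Get the UTF-8 array
let unversioned = this.toUTF8Array(url)
    // Convert to base-4 strings
    // padStart is very important! Otherwise leading zeros are lost
    .map(n => n.toString(4).padStart(4, "0"))
    // Convert to an array of characters
    .join("").split("")
    // Map to the different forms of o
    .map(x => this.enc[parseInt(x)])
    // Join into a single string
    .join("")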
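
The pipeline above relies on this.toUTF8Array and this.enc from the surrounding class. A self-contained sketch of the same steps might look like the following; the function name encodeUrl is hypothetical, and the standard TextEncoder is used here in place of toUTF8Array.

```javascript
// The four look-alike characters, written as escapes so they can be told apart:
// Latin o, Greek omicron (U+03BF), Cyrillic o (U+043E), small capital o (U+1D0F).
const enc = ["o", "\u03bf", "\u043e", "\u1d0f"];

// Hypothetical standalone encoder; TextEncoder stands in for toUTF8Array.
function encodeUrl(url) {
  return Array.from(new TextEncoder().encode(url))
    .map((n) => n.toString(4).padStart(4, "0")) // byte -> four base-4 digits
    .join("")
    .split("")
    .map((digit) => enc[parseInt(digit, 4)])    // digit -> one form of o
    .join("");
}

const encoded = encodeUrl("http");
console.log(encoded.length); // 16, four characters per input byte
```

Every input byte becomes exactly four characters, which is why the converted address is so much longer than the original.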

Two points above need explaining. First, what does "map to different forms of o" mean? The converted o's are not one character but four different ones that look nearly identical to the naked eye. Passing them through encodeURI makes the difference visible.

encodeURI('o-ο-о-ᴏ')
// o-%CE%BF-%D0%BE-%E1%B4%8F

This also explains the second point: why each byte is converted to base 4 and left-padded with 0 to four digits. There are only four kinds of "o", and base 4 produces only the digits 0, 1, 2, and 3, so each digit maps to exactly one of these special "o" characters. A byte is at most 255, which is 3333 in base 4, so four digits always suffice, and the fixed width lets the decoder split the string back into bytes. The this.enc used in the code above is defined as follows.

enc = ["o", "ο", "о", "ᴏ"]
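
To see which characters these really are, their code points can be printed. In the sketch below the array is rewritten with Unicode escapes, an assumption based on the encodeURI output shown earlier: Latin small letter o, Greek small letter omicron, Cyrillic small letter o, and Latin letter small capital O.

```javascript
// Same four characters as enc, written as escapes for clarity.
const enc = ["o", "\u03bf", "\u043e", "\u1d0f"];

// Format each character's code point in the usual U+XXXX notation.
const codePoints = enc.map(
  (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);
console.log(codePoints); // [ 'U+006F', 'U+03BF', 'U+043E', 'U+1D0F' ]
```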

A complete example, converting the string http:

  • Convert to a UTF-8 array: [104, 116, 116, 112]
  • Convert to base 4 and left-pad with 0 to 4 digits: ['1220', '1310', '1310', '1300']
  • Split into an array of single characters: ['1', '2', '2', '0', '1', '3', '1', '0', '1', '3', '1', '0', '1', '3', '0', '0']
  • Map to the different forms of o: ['ο', 'о', 'о', 'o', 'ο', 'ᴏ', 'ο', 'o', 'ο', 'ᴏ', 'ο', 'o', 'ο', 'ᴏ', 'o', 'o']
  • Join back into a string, which is the converted URL: οооoοᴏοoοᴏοoοᴏoo

That completes the encoding process. Quite a neat design, isn't it? After encoding comes decoding, which reverses the steps above to restore the original URL. Note that 4 characters are parsed at a time, and that parseInt is called with radix 4.

// Get the base-4 string representation of the URL
let b4str = ooo.split("").map(x => this.dec[x]).join("")

let utf8arr = []
// Parse 4 characters at a time
// Remember the leading-zero padding added during encoding
for (let i = 0; i < b4str.length; i += 4)
    utf8arr.push(parseInt(b4str.substring(i, i + 4), 4))
// Return the decoded string
return this.Utf8ArrayToStr(utf8arr)
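
The snippet above uses a lookup table this.dec that is not shown in the article. A plausible reading is that it is simply the inverse of enc. The sketch below assumes that, with a hypothetical function name decodeUrl and the standard TextDecoder in place of Utf8ArrayToStr.

```javascript
// Same four characters as enc, written as escapes for clarity.
const enc = ["o", "\u03bf", "\u043e", "\u1d0f"];

// Assumed shape of the lookup table: each form of o -> its base-4 digit.
const dec = Object.fromEntries(enc.map((ch, digit) => [ch, String(digit)]));

// Hypothetical standalone decoder; TextDecoder stands in for Utf8ArrayToStr.
function decodeUrl(ooo) {
  const b4str = ooo.split("").map((ch) => dec[ch]).join("");
  const bytes = [];
  // Every group of four base-4 digits is one UTF-8 byte.
  for (let i = 0; i < b4str.length; i += 4) {
    bytes.push(parseInt(b4str.substring(i, i + 4), 4));
  }
  return new TextDecoder().decode(new Uint8Array(bytes));
}

const sample = "\u03bf\u043e\u043eo\u03bf\u1d0f\u03bfo\u03bf\u1d0f\u03bfo\u03bf\u1d0foo";
console.log(decodeUrl(sample)); // http
```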

Finally

That wraps up the core implementation. As you can see, it is not very complicated. The same design could be extended to other sets of look-alike characters, so give it a try if you are interested. Sharing a converted address with friends is sure to surprise them. Below is the address of an AI gadget I converted. Click it to see the effect~