Add supplementary content.

Signed-off-by: rick.chan <chenyang@autoai.com>
rick.chan 2020-10-13 14:26:49 +08:00
parent c4de5dbd88
commit 3b9b03f684

@ -1,22 +1,42 @@
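# Converting between Python string and bytes

In short, bytes and string are related as follows:

bytes ---decode--> string
bytes <--encode--- string

Some common encodings and what they look like:

* utf8: looks like \xe4\xbb\x8a\xe5\xa4
* unicode: looks like \u4eca\u5929\u5929\u6c14\u4e0d\u9519

If "\" turns into "\\", the original string was already in its escaped (encoded) form; the "\\" appears because the string was converted to bytes, whose printed form escapes each literal backslash.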
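
As a minimal sketch of the two forms listed above, the following encodes the sample text 今天 both ways and decodes it back. The variable names are illustrative and do not come from the original note.

```python
text = "今天"

# UTF-8 form: each of these characters becomes three bytes.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)        # b'\xe4\xbb\x8a\xe5\xa4\xa9'

# Unicode-escape form: each character becomes a literal \uXXXX sequence,
# so the printed bytes show a doubled backslash.
escaped_bytes = text.encode("unicode_escape")
print(escaped_bytes)     # b'\\u4eca\\u5929'

# Decoding with the matching codec restores the original string.
assert utf8_bytes.decode("utf-8") == text
assert escaped_bytes.decode("unicode_escape") == text
```
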
## 1. string to bytes

```python
s = "abc"                        # string
b = s.encode()                   # bytes; encode() defaults to utf-8
print(b)                         # b'abc'

# Or
s = "abc"                        # string
b = bytes(s, encoding="utf8")    # bytes
```
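
The two spellings should produce the same bytes. A small check with non-ASCII text, using illustrative variable names, also shows that the byte length can differ from the character count:

```python
s = "今天"

via_method = s.encode()                      # UTF-8 by default
via_constructor = bytes(s, encoding="utf8")

assert via_method == via_constructor

# len() counts characters for a string and bytes for a bytes object.
print(len(s), len(via_method))               # 2 6
```
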
## 2. bytes to string

```python
b = b"abc"                       # bytes
s = b.decode()                   # string; decode() defaults to utf-8
r = str(b"")                     # "b''": without an encoding, str() returns the repr, not decoded text

# Or
b = b"abc"                       # bytes
s = str(b, encoding="utf8")      # string
```
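
Decoding can fail when the bytes are not complete UTF-8. The sketch below assumes a truncated input (the character 今 followed by only the first two bytes of 天) and shows one possible way to handle it with the standard `errors` argument; the variable names are illustrative.

```python
data = b"\xe4\xbb\x8a\xe5\xa4"   # UTF-8 for 今 plus an incomplete sequence

# Strict decoding (the default) raises on the incomplete trailing bytes.
try:
    data.decode("utf8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)                 # False

# errors="replace" substitutes U+FFFD for the bytes that cannot be decoded.
lossy = data.decode("utf8", errors="replace")
print(lossy)
```
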
## 3. Printing Chinese text from Unicode escapes held as bytes

```python
s = '\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'    # escaped form of: 今天天气不错
new_s = s.encode().decode('unicode_escape')         # new_s is: 今天天气不错
```
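
One caveat, offered as a sketch rather than as part of the original note: `s.encode().decode('unicode_escape')` works when the string holds only ASCII escapes, but it garbles any real non-ASCII characters mixed in, because the `unicode_escape` decoder reads raw bytes as Latin-1. The helper below (`unescape_unicode` is a hypothetical name) is one possible workaround.

```python
def unescape_unicode(text: str) -> str:
    """Hypothetical helper: turn literal \\uXXXX sequences into characters."""
    # backslashreplace rewrites characters outside Latin-1 as \uXXXX escapes,
    # so the unicode_escape decoder only sees bytes it interprets correctly.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")

mixed = "\\u4eca\\u5929 天气"        # escaped 今天 followed by real characters
print(unescape_unicode(mixed))       # 今天 天气

# The shorter encode().decode('unicode_escape') form garbles the real characters.
assert mixed.encode().decode("unicode_escape") != "今天 天气"
```
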