240604_rust_String_utf8_simple01
link
Source
Rust handles strings as UTF-8.
Internally, a String is a Vec<u8> whose contents are guaranteed to be valid UTF-8.
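To make the Vec<u8> view of a string concrete, here is a minimal sketch that moves between a string and its raw bytes. The helpers `to_bytes` and `from_bytes` are hypothetical names introduced only for this illustration; the standard-library calls they wrap are `str::as_bytes` and `String::from_utf8`.

```rust
/// Hypothetical helper: copy a string's UTF-8 bytes into an owned Vec<u8>.
fn to_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// Hypothetical helper: rebuild a String from bytes.
/// Returns None when the bytes are not valid UTF-8.
fn from_bytes(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

fn main() {
    let s = "안녕";

    // len() counts bytes, chars().count() counts Unicode scalar values.
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());

    // Round trip: string -> bytes -> string.
    let bytes = to_bytes(s);
    println!("{:?}", bytes);
    println!("{:?}", from_bytes(bytes));

    // 0xFF never occurs in valid UTF-8, so this conversion is rejected.
    println!("{:?}", from_bytes(vec![0xFF]));
}
```

Because the bytes must always be valid UTF-8, going from bytes back to a String needs a validity check, while borrowing the bytes of an existing string is free.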
fn main() {
    let my_str = "안녕하세요.";

    // Print each character together with its Unicode code point.
    for ch in my_str.chars() {
        let code_point = ch as u32;
        println!("char '{}': U+{:04X}", ch, code_point);
    }

    // UTF-8 encoding: a &str already stores UTF-8, so this only borrows the bytes.
    let utf8_encoded = my_str.as_bytes();
    println!("UTF-8 encoded: {:?}", utf8_encoded);

    // UTF-16 encoding
    let utf16_encoded: Vec<u16> = my_str.encode_utf16().collect();
    println!("UTF-16 encoded: {:?}", utf16_encoded);

    // UTF-32 encoding: each code unit is the code point itself.
    let utf32_encoded: Vec<u32> = my_str.chars().map(|c| c as u32).collect();
    println!("UTF-32 encoded: {:?}", utf32_encoded);

    println!();

    let my_str02 = "테스트";
    println!("테스트 : {:?}", my_str02.as_bytes());
}

- result

char '안': U+C548
char '녕': U+B155
char '하': U+D558
char '세': U+C138
char '요': U+C694
char '.': U+002E
UTF-8 encoded: [236, 149, 136, 235, 133, 149, 237, 149, 152, 236, 132, 184, 236, 154, 148, 46]
UTF-16 encoded: [50504, 45397, 54616, 49464, 50836, 46]
UTF-32 encoded: [50504, 45397, 54616, 49464, 50836, 46]

테스트 : [237, 133, 140, 236, 138, 164, 237, 138, 184]
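The UTF-16 and UTF-32 results above are identical because every character in the sample lies in the Basic Multilingual Plane (at or below U+FFFF), where one UTF-16 unit holds the code point directly. The sketch below compares that case with a character outside the plane. The helper `unit_counts` is a hypothetical name for this illustration, and U+1F600 is chosen only as an example of a code point above U+FFFF.

```rust
/// Hypothetical helper: how many code units a string needs in
/// UTF-8 (bytes), UTF-16 (u16 units) and UTF-32 (one unit per char).
fn unit_counts(s: &str) -> (usize, usize, usize) {
    (s.len(), s.encode_utf16().count(), s.chars().count())
}

fn main() {
    // Hangul syllables: 3 bytes in UTF-8, 1 unit in UTF-16, 1 unit in UTF-32.
    let hangul = "안녕하세요.";
    println!("{:?}", unit_counts(hangul));

    // U+1F600 is above U+FFFF, so UTF-16 has to split it into a surrogate pair.
    let emoji = "\u{1F600}";
    let utf16: Vec<u16> = emoji.encode_utf16().collect();
    println!("{:X?}", utf16);
    println!("{:?}", unit_counts(emoji));
}
```

For text like this sample, the UTF-16 and UTF-32 vectors only start to differ once a character needs a surrogate pair.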
https://younghakim7.github.io/blog/posts/240604_rust_string_utf8_simple01/