{"id":286,"date":"2008-09-15T06:18:25","date_gmt":"2008-09-15T05:18:25","guid":{"rendered":"http:\/\/www.malcolmhardie.com\/weblogs\/angus\/?p=286"},"modified":"2025-02-01T03:04:15","modified_gmt":"2025-02-01T03:04:15","slug":"ucs-2-vs-utf-16","status":"publish","type":"post","link":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/2008\/09\/15\/ucs-2-vs-utf-16\/","title":{"rendered":"UCS-2 vs UTF-16"},"content":{"rendered":"<p>Since I got confused by this one the other day:<\/p>\n<p><a href=\"http:\/\/unicode.org\/faq\/basic_q.html#14\">http:\/\/unicode.org\/faq\/basic_q.html#14<\/a><\/p>\n<blockquote><p><strong>Q: What is the difference between UCS-2 and UTF-16?<\/strong><\/p>\n<p class=\"a\">A: UCS-2 is what a Unicode implementation was up to Unicode 1.1, <em>before<\/em> surrogate code points and UTF-16 were added as concepts to Version 2.0 of the standard. This term should now be avoided.<\/p>\n<p class=\"a\">When interpreting what people have meant by &#8220;UCS-2&#8221; in past usage, it is best thought of as not a data format, but as an indication that an implementation does not interpret any supplementary characters. In particular, for the purposes of data exchange, UCS-2 and UTF-16 are identical formats. Both are 16-bit, and have exactly the same code unit representation.<\/p>\n<p class=\"a\">The effective difference between UCS-2 and UTF-16 lies at a different level, when one is interpreting a sequence of code units as code points or as characters. In that case, a UCS-2 implementation would not handle processing like character properties, codepoint boundaries, collation, etc. for supplementary characters. 
<\/p>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Since I got confused by this one the other day: http:\/\/unicode.org\/faq\/basic_q.html#14 Q: What is the difference between UCS-2 and UTF-16? A: UCS-2 is what a Unicode implementation was up to Unicode 1.1, before surrogate code points and UTF-16 were added as concepts to Version 2.0 of the standard. This term should now be avoided. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-286","post","type-post","status-publish","format-standard","hentry","category-general"],"_links":{"self":[{"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/posts\/286","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/comments?post=286"}],"version-history":[{"count":1,"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/posts\/286\/revisions"}],"predecessor-version":[{"id":1427,"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/posts\/286\/revisions\/1427"}],"wp:attachment":[{"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/media?parent=286"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.malcolmhardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/categories?post=286"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.malcolm
hardie.com\/weblogs\/angus\/wp-json\/wp\/v2\/tags?post=286"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}