An issue exists in the Bidirectional Algorithm in the Unicode Specification up to and including 14.0. It permits the visual reordering of characters via control sequences, which can be used to craft source code that renders different logic than the logical ordering of tokens ingested by compilers and interpreters. Adversaries can leverage this to encode source code for compilers accepting Unicode such that targeted vulnerabilities are introduced invisibly to human reviewers. NOTE: the Unicode Consortium offers the following alternative approach to presenting this concern. An issue is noted in the nature of international text that can affect applications that implement support for The Unicode Standard and the Unicode Bidirectional Algorithm (all versions). Due to text display behavior when text includes left-to-right and right-to-left characters, the visual order of tokens may be different from their logical order. Additionally, control characters needed to fully support the requirements of bidirectional text can further obfuscate the logical order of tokens. Unless mitigated, an adversary could craft source code such that the ordering of tokens perceived by human reviewers does not match what will be processed by a compiler/interpreter/etc. The Unicode Consortium has documented this class of vulnerability in its document, Unicode Technical Report #36, Unicode Security Considerations. The Unicode Consortium also provides guidance on mitigations for this class of issues in Unicode Technical Standard #39, Unicode Security Mechanisms, and in Unicode Standard Annex #31, Unicode Identifier and Pattern Syntax. Also, the BIDI specification allows applications to tailor the implementation in ways that can mitigate misleading visual reordering in program text; see HL4 in Unicode Standard Annex #9, Unicode Bidirectional Algorithm.
Vulnerable Product | Search on Vulmon | Subscribe to Product |
---|---|---|
unicode unicode |
||
fedoraproject fedora 33 |
||
fedoraproject fedora 34 |
||
fedoraproject fedora 35 |
||
starwindsoftware starwind virtual san v8r13 |
Get our weekly newsletter Bi-directional character attack – simple and nightmarish
The way Unicode's UTF-8 text encoding handles different languages could be misused to write malicious code that says one thing to humans and another to compilers, academics are warning. "What if it were possible to trick compilers into emitting binaries that did not match the logic visible in source code?" ask Cambridge student Nicholas Boucher and Professor Ross Anderson in a paper published today. They say it is possible, and outlined a new threat [PDF] that could be deployed by future supply ...