File:Non-PrintingCharacters.svg

Original file (SVG file, nominally 1,809 × 608 pixels, file size: 212 KB)

Summary

Description

Overlapping terms relating to non-printing characters. They are not always used consistently ("non-printing" and "control" may or may not be treated as synonyms, Space may or may not be treated as a control character, etc.), but they are illustrated here in the following senses:

  • Control function: something in the scope of the likes of ECMA-48 (as seen in its title, "Control Functions for Coded Character Sets"). Some of the control functions listed are defined by ECMA-35, ECMA-37, JIS X 0207 or DIN 31626 rather than ECMA-48.
  • Format effector: semantically, these are control characters and control functions that constitute part of the text itself (i.e. neither an in-band command interspersed with the text nor a switch of text encoding). Some otherwise behave much like an editor function: for example, CUD behaves similarly to LF, and CUP similarly to HVP. However, LF and HVP are semantically part of the text itself, whereas CUD and CUP are commands to move the caret, transmitted in-band with the text, as might be emitted for arrow keys.
    • For characters with behaviour defined by Unicode itself (i.e. not category Cc), this is General Category Cf (F standing for Format). The sui generis single-character general categories Zl and Zp are also included because they are (if and where they are supported) mandatory line breaks, like some of the ASCII format effectors.
    • For ASCII C0 controls, this is FE0 through FE5 (Format Effector Zero through Format Effector Five, respectively better known as BS, HT, LF, VT, FF and CR).
    • For functions defined by ECMA-48, this includes the characters listed as format effectors in 8.2.4, as well as those affected by the Format Effector Action Mode (in the current June 1998 printing, which is probably perpetual since it is almost as old as I am, these are listed in section 7.2.5). However, a few others are absent from that list but should clearly be included for one reason or another: for instance, it lists BPH but not its counterpart NBH, although each has semantics identical to those of a Unicode format effector (ZWSP and WJ respectively).
    • For anything else, this is by analogy.
  • Control code: one of the 65 bytes set apart by ECMA-35 for use for control functions, either specified by a higher-level protocol or designated using ECMA-35 mechanisms (with two, ESC and DEL, being fixed), which became code points in Unicode when it inherited them from ISO 8859. In Unicode, these make up General Category Cc, and retain certain extraordinary dispensations, such as not having primary names and having semantics defined mainly by or through higher-level protocols rather than by Unicode itself.
  • Blank: can have an advance, but does not draw a glyph.
  • Whitespace: a Unicode property in its own right; roughly equivalent to General Category Z but also including some control codes.
  • Escape sequence: structure and usage defined in ECMA-35. In vernacular, CSI sequences where the CSI is in escape-sequence form are sometimes called "escape sequences"; structurally, however, only the CSI itself is an escape sequence, the rest being the remainder of the control sequence.
  • Control sequence, control string and subtypes of the latter: structure defined in ECMA-48, although for control strings in particular it is not always followed (e.g. de facto, OSC permits termination with BEL instead of ST, and permits inclusion of non-ASCII characters which the standard only permits in SOS).
  • DLE sequence: one of a few different things DLE has been used in. It was historically in vogue (at least amongst the standardisers) for additional transmission controls (i.e. analogues to the likes of ACK, ENQ, EOT etc. that are used by a transmission protocol itself), as opposed to additional non-transmission controls, which would be achieved through C1 codes or through sequences with CSI or CEX. The standards for these have generally been withdrawn, although ECMA thankfully makes scans of past standards freely available, with ECMA-37 being the main one (though not the sole one) related to this.
  • CEX sequence: JIS X 0207 defined these, as opposed to the newer JIS X 0211, which was synced with ECMA-48 and defines CSI sequences. JIS X 0207 has since been withdrawn, and I only have very limited information about it. Any pointers on how to obtain a copy would be appreciated.
  • Control character: as defined by vernacular and in software interfaces. Considerably broader than control codes, in that it includes format effectors defined by Unicode itself.
  • Non-printing character: a character (not a sequence) which has any property that precludes treating it as a (spacing or combining) printing character (i.e. a blank and/or whitespace and/or control character).
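
As a rough illustration, some of the senses above can be approximated from Unicode character properties. The Python sketch below is only that: the function names and the hard-coded White_Space table are assumptions made for this example, and `unicodedata.category` is the only library call it relies on. Blanks are not modelled (whether a character draws a glyph is not exposed by `unicodedata`), the format effectors among the ECMA-48 C1 controls are left out, and "control character" is assumed to mean control codes plus the Unicode-defined format effectors.

```python
import unicodedata

# FE0 through FE5 in the ASCII C0 set: BS, HT, LF, VT, FF, CR.
ASCII_FORMAT_EFFECTORS = frozenset("\x08\t\n\x0b\x0c\r")

# Code points with the Unicode White_Space property, hard-coded here
# because the unicodedata module does not expose that property directly.
_WHITE_SPACE_RANGES = [
    (0x0009, 0x000D), (0x0020, 0x0020), (0x0085, 0x0085), (0x00A0, 0x00A0),
    (0x1680, 0x1680), (0x2000, 0x200A), (0x2028, 0x2029), (0x202F, 0x202F),
    (0x205F, 0x205F), (0x3000, 0x3000),
]
WHITE_SPACE = frozenset(
    chr(cp) for lo, hi in _WHITE_SPACE_RANGES for cp in range(lo, hi + 1)
)


def is_control_code(ch: str) -> bool:
    """Control code: General Category Cc (C0, C1 and DEL)."""
    return unicodedata.category(ch) == "Cc"


def is_format_effector(ch: str) -> bool:
    """Format effector: Cf, Zl, Zp, or one of the ASCII FE0..FE5 controls.

    Assumption: C1 format effectors from ECMA-48 are not covered.
    """
    return (
        unicodedata.category(ch) in {"Cf", "Zl", "Zp"}
        or ch in ASCII_FORMAT_EFFECTORS
    )


def is_whitespace(ch: str) -> bool:
    """Whitespace: the White_Space property, per the table above."""
    return ch in WHITE_SPACE


def is_control_character(ch: str) -> bool:
    """Control character (assumed reading): control codes plus the
    format effectors defined by Unicode itself."""
    return unicodedata.category(ch) in {"Cc", "Cf", "Zl", "Zp"}


def is_non_printing(ch: str) -> bool:
    """Non-printing character: whitespace and/or control character.

    Assumption: blanks that are neither of those are not detected.
    """
    return is_whitespace(ch) or is_control_character(ch)


def senses(ch: str) -> list[str]:
    """List the senses (in the order of the description) that apply to ch."""
    checks = [
        ("format effector", is_format_effector),
        ("control code", is_control_code),
        ("whitespace", is_whitespace),
        ("control character", is_control_character),
        ("non-printing character", is_non_printing),
    ]
    return [label for label, check in checks if check(ch)]
```

Under these assumptions, `senses(" ")` reports Space as whitespace and as a non-printing character but not as a control character, which is one of the readings mentioned at the top of the description.
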
Date
Source Own work
Author HarJIT
Permission
I, the copyright holder of this work, hereby publish it under the following licenses:
Copyfree
This work is licensed under the General Attribution License:

Copyright the Author.

This work is provided "as is", without any express or implied warranties, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. In no event will the authors or contributors be held liable for any direct, indirect, incidental, special, exemplary, or consequential damages however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise), arising in any way out of the use of this work, even if advised of the possibility of such damage.

Permission is granted to anyone to use this work for any purpose, including commercial applications, and to alter and distribute it freely in any form, provided that the following conditions are met:

  1. The origin of this work must not be misrepresented; you must not claim that you authored the original work. If you use this work in a product, an acknowledgment in the product documentation would be appreciated but is not required.
  2. Altered versions in any form may not be misrepresented as being the original work, and neither the name of the copyright holder nor the names of authors or contributors may be used to endorse or promote products derived from this work without specific prior written permission.
  3. The text of this notice must be included, unaltered, with any distribution.
This file is licensed under the Creative Commons Attribution 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
You may select the license of your choice.

File history

Date/Time: current, 14:34, 10 November 2021
Dimensions: 1,809 × 608 (212 KB)
User: HarJIT
Comment: {{Information |Description=Overlapping terms relating to non-printing characters. While they are not always used consistently ("non-printing" and "control" may or may not be treated as synonyms, Space may or may not be treated as a control character, etc), they are illustrated here in the following senses: * Control function: something in the scope of the likes of ECMA-48 (as seen in its title, "Control Functions for Coded Character Sets"). * Format effector: semantically, these are control c...
