C Character Set and Tokens

Character Set

  • It is the set of valid characters recognized by the C language. They are:
  • Letters: A-Z, a-z.
  • Digits: 0-9.
  • Special characters: ~ ! @ # $ % ^ & * ( ) _ - + / \ : , . ; ' " = < > ? { } [ ] |
  • White spaces
  • Other characters: non-graphic characters (backspace, horizontal tab, carriage return) and the remaining ASCII characters.

Tokens

  • They are the smallest individual units of a program.
  • They play the same role as words in English and other languages. There are five types of tokens:
    1. Keywords
    2. Identifiers
    3. Literals
    4. Punctuators
    5. Operators

Keywords

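  • They are reserved words of the C language, each used for a special, predefined purpose.
  • The original C standard (C89) defines 32 keywords; later standards such as C99 and C11 add a few more. The 32 keywords are: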
    auto double int struct break else long switch
    case enum register typedef char extern return union
    const float short unsigned continue for signed void
    default goto sizeof volatile do if static while

Identifiers

  • They are names made up of letters, digits and underscores, used to identify memory locations (variables), functions, labels etc.
  • They must follow the rules given below.
    1. It should be unique and must not be a keyword.
    2. The first character should be a letter or an underscore.
    3. White space and special characters are not allowed.
    4. The only exception is the underscore ( _ ), which may be used.
    5. Uppercase and lowercase letters are treated as different, so total and Total are two separate identifiers.

Literals

  • They represent data items whose values never change during a program run.
  • They are also called constants.
  • They may be numeric literals, character literals, string literals or escape sequences.
    1. Numeric literals: An integer or floating-point literal. (125, 9876.3455)
    2. Character literals: A single character enclosed in single quotes. ('d', 's')
    3. String literals: A group of characters enclosed in double quotes. ("Haritha", "Good Batch")
    4. Escape sequences: Non-graphic characters such as backspace, horizontal tab and carriage return are represented by escape sequences, which consist of a backslash (\) followed by one or more characters. (\b, \t, \r)

Punctuators

  • They are symbols used as punctuation marks in the syntax of the language and as separators that make a program easier to read.
  • They are: parentheses ( ), brackets [ ], braces { }, and the symbols , ; : * = # (# begins a preprocessor directive).

Operators

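  • They are symbols that are used in operations such as computation, comparison, assignment etc. Some examples are + - * / % << >> ++.
  • In C, << and >> are the bitwise shift operators; input and output are done with library functions such as printf and scanf.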