C Programming/stdio.h/scanf

scanf is a function that reads formatted data from the standard input stream. It originated in the C programming language and is present in many other programming languages.

The scanf function prototype is: int scanf(const char *format, ...);

The function returns the total number of items successfully matched, which can be less than the number requested. If the input stream is exhausted or reading from it otherwise fails before any items are matched, EOF is returned.

So far as is traceable, "scanf" stands for "scan format", because it scans the input for valid tokens and parses them according to a specified format.

Usage
The scanf function is found in C, where it reads numbers and other data types from standard input (often a command-line interface or a similar text user interface).

The following shows code in C that reads a variable number of unformatted decimal integers from the standard input and prints out each of them on a separate line:
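The original listing is not reproduced here. A minimal sketch of such a loop, assuming a helper named echo_integers (a hypothetical name) that stops at the first token that is not an integer or at end of input, might look like this:

```c
#include <stdio.h>

/* Hypothetical helper: reads whitespace-separated decimal integers from
   standard input and prints each one on its own line.
   Returns how many integers were printed. */
int echo_integers(void)
{
    int n;
    int count = 0;

    /* scanf returns 1 while exactly one item was matched and stored;
       it returns 0 on a mismatch and EOF at end of input. */
    while (scanf("%d", &n) == 1) {
        printf("%d\n", n);
        count++;
    }
    return count;
}
```

Calling echo_integers() from main is enough to turn the sketch into a complete program.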

After being processed by the program above, a messy list of integers such as

456 123 789    456 12 456 1       2378

will appear neatly as:

456
123
789
456
12
456
1
2378

To print out a word:
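The original listing is missing at this point. As a sketch only, a hypothetical helper echo_word could read one word into a fixed-size buffer and print it. The buffer size of 20 and the matching width of 19 are assumptions chosen for the example:

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word from standard
   input and prints it. Returns 1 on success, 0 if no word could be read. */
int echo_word(void)
{
    char word[20];

    /* The width 19 leaves room for the terminating null character.
       word is an array, so it already decays to a pointer: no & is needed. */
    if (scanf("%19s", word) != 1)
        return 0;
    printf("%s\n", word);
    return 1;
}
```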

Whatever data type the program is meant to read, the arguments after the format string (such as &n for an int variable n) must be pointers to memory where the result can be stored. If a value is passed instead of a pointer, scanf treats that value as an address and writes to the wrong section of memory rather than to the variable intended to receive the input.

As scanf is designed to read only from standard input, many programming languages with interfaces, such as PHP, have derivatives such as sscanf and fscanf but not scanf itself.

Derivatives
Depending on the actual source of input, programmers can use different derivatives of scanf. Two common examples are sscanf and fscanf.

fscanf
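The fscanf derivative reads input from a specified file stream. The prototypes are as follows:

int fscanf(FILE *stream, const char *format, ...);
(C or C++)

fscanf(resource $stream, string $format, mixed &...$vars): array|int|false|null
(PHP)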

The fscanf derivative works like the original scanf function: parts of the input, once read, will not be read again unless the stream is repositioned or the file is closed and reopened.
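The following sketch illustrates that behaviour. The helper name sum_integers is hypothetical, and the stream is assumed to have been opened by the caller:

```c
#include <stdio.h>

/* Hypothetical helper: sums the integers found in an already opened stream.
   Each fscanf call continues where the previous one stopped. */
long sum_integers(FILE *stream)
{
    int value;
    long total = 0;

    while (fscanf(stream, "%d", &value) == 1)
        total += value;
    return total;
}
```

A second call on the same stream finds nothing left to read and returns 0 unless the caller first repositions the stream, for example with rewind.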

sscanf
The sscanf derivative reads input from a character string passed as the first argument. One important difference from fscanf is that the string is read from the beginning every time the function is called. There is no 'pointer' that is incremented upon a successful read operation. The prototypes are as follows:

int sscanf(const char *str, const char *format, ...);
(C or C++)

sscanf(string $string, string $format, mixed &...$vars): array|int|null
(PHP)
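Because sscanf keeps no position between calls, a caller that wants to walk through a string has to advance the pointer itself. One common way, sketched here with a hypothetical helper sum_in_string, is the %n conversion, which stores the number of characters consumed so far and does not count towards the return value:

```c
#include <stdio.h>

/* Hypothetical helper: sums all integers in a string.
   sscanf keeps no position between calls, so the caller advances the
   pointer using the character count reported by %n. */
int sum_in_string(const char *text)
{
    int value;
    int used;
    int total = 0;

    while (sscanf(text, "%d%n", &value, &used) == 1) {
        total += value;
        text += used;   /* skip what this call consumed */
    }
    return total;
}
```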

vscanf, vsscanf, and vfscanf
These are like the same functions without the v prefix, except that they take their arguments from a va_list. (See stdarg.h.) These variants may be used in variable-argument functions.
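As an illustration, a variable-argument wrapper can forward its arguments to vsscanf (available since C99). The wrapper name scan_expect and its expected parameter are hypothetical, introduced only for this sketch:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: parses text like sscanf, but returns 1 only if
   exactly `expected` items were converted, and 0 otherwise. */
int scan_expect(const char *text, int expected, const char *format, ...)
{
    va_list args;
    int matched;

    va_start(args, format);
    matched = vsscanf(text, format, args);  /* forwards the va_list */
    va_end(args);

    return matched == expected;
}
```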

Format string specifications
The formatting placeholders in scanf are more or less the same as those in printf, its reverse function.

There are rarely constants (i.e. characters that are not formatting placeholders) in a format string, mainly because a program is usually not designed to read known data. When a constant does appear, it must match the next input character exactly, or scanning stops. The exception is one or more whitespace characters in the format, which discard all whitespace characters at that point in the input.
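A short sketch of a format string that does contain a constant: the hypothetical helper parse_time below expects a colon between two integers and tolerates whitespace around it.

```c
#include <stdio.h>

/* Hypothetical helper: parses a time written as HH:MM.
   Returns 1 if both fields were read, 0 otherwise. */
int parse_time(const char *text, int *hours, int *minutes)
{
    /* The space before ':' allows optional whitespace there; the ':'
       itself must appear literally. %d skips leading whitespace itself. */
    return sscanf(text, "%d :%d", hours, minutes) == 2;
}
```

With this format, "12:34" and "12 : 34" are both accepted, while "12-34" stops at the mismatching character after the first number.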

Some of the most commonly used placeholders follow:
 * %d : Scan an integer as a signed decimal number.
 * %i : Scan an integer as a signed number. Similar to %d, but interprets the number as hexadecimal when preceded by 0x and octal when preceded by 0. For example, the string 031 would be read as 31 using %d, and 25 using %i. The flag h in %hi indicates conversion to a short and hh conversion to a char.
 * %u : Scan for a decimal unsigned int. (Note that in the C99 standard a minus sign in the input is permitted, so if a negative number is read, no error arises and the result is the two's complement of the value. See strtoul().) Correspondingly, %hu scans for an unsigned short and %hhu for an unsigned char.
 * %f : Scan a floating-point number into a float. Both normal (fixed-point) and exponential notation are accepted.
 * %g, %G : Scan a floating-point number in either normal or exponential notation. On input these behave the same as %f, and the lower-case and upper-case forms are equivalent.
 * %x, %X : Scan an integer as an unsigned hexadecimal number.
 * %o : Scan an integer as an octal number.
 * %s : Scan a character string. Leading whitespace is skipped and the scan terminates at the next whitespace. A null character is stored at the end of the string, which means that the buffer supplied must be at least one character longer than the specified input length.
 * %c : Scan a character (char). No null character is added, and leading whitespace is not skipped.
 * (space) : A space in the format scans for whitespace characters, matching any amount of whitespace in the input, including none.
 * %lf : Scan as a double floating-point number.
 * %Lf : Scan as a long double floating-point number.
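The difference between some of these placeholders can be seen in a small sketch. All helper names are hypothetical, and the -1 return value is an assumption that only works as an error marker for non-negative input:

```c
#include <stdio.h>

/* Hypothetical helpers contrasting %d and %i on the same text.
   Each returns the converted value, or -1 if nothing matched. */
int scan_decimal(const char *text)
{
    int value;
    return sscanf(text, "%d", &value) == 1 ? value : -1;
}

int scan_auto_base(const char *text)
{
    int value;
    return sscanf(text, "%i", &value) == 1 ? value : -1;
}

/* %x and %o store into unsigned int; %lf stores into double.
   Returns the number of items stored. */
int scan_mixed(const char *text, unsigned *hex, unsigned *oct, double *real)
{
    return sscanf(text, "%x %o %lf", hex, oct, real);
}
```

For the string 031, scan_decimal yields 31 and scan_auto_base yields 25, matching the description of %i in the list.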

The above can be combined with numeric modifiers and with the l and L modifiers, which stand for "long", placed between the percent symbol and the letter. A numeric value between the percent symbol and the letter, preceding the long modifiers if any, specifies the maximum number of characters to be scanned. An optional asterisk (*) right after the percent symbol denotes that the datum read by this format specifier is not to be stored in a variable. No argument should be supplied after the format string for such a dropped datum.
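A sketch combining a field width with the asterisk: the hypothetical helper parse_date reads a date such as 2024-01-15, limiting each number to a fixed number of characters and discarding the separators.

```c
#include <stdio.h>

/* Hypothetical helper: parses a date written as YYYY-MM-DD (or with any
   other single separator character). Returns 1 on success, 0 otherwise. */
int parse_date(const char *text, int *year, int *month, int *day)
{
    /* %4d and %2d read at most 4 and 2 characters.
       %*c reads one character and discards it, so it has no argument. */
    return sscanf(text, "%4d%*c%2d%*c%2d", year, month, day) == 3;
}
```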

The  modifier in printf is not present in scanf, causing differences between modes of input and output. The ll and hh modifiers are not present in the C90 standard, but are present in the C99 standard.

An example of a format string is "%7d%s %c%lf"

The above format string scans up to the first seven characters as a decimal integer, then reads the remaining characters as a string until a space, newline, or tab is found, then scans the first non-whitespace character that follows, and finally scans a double-precision floating-point number.
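To see the description in action, the sketch below applies a bounded variant of that format string to a character string. The helper name scan_record and the width of 15 on the string conversion are assumptions added for safety, not part of the format being described:

```c
#include <stdio.h>

/* Hypothetical helper: applies a bounded variant of the format string
   described above. word must have room for at least 16 characters.
   Returns the number of items stored (4 when every field was read). */
int scan_record(const char *text, int *number, char *word,
                char *letter, double *real)
{
    return sscanf(text, "%7d%15s %c%lf", number, word, letter, real);
}
```

For the input 123456789abc  x 3.5, the integer receives 1234567, the string receives 89abc, the character receives x, and the double receives 3.5.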

Error handling
scanf is usually used in situations when the program cannot guarantee that the input is in the expected format. Therefore a robust program must check whether the scanf call succeeded and take appropriate action. If the input was not in the correct format, the erroneous data will still be on the input stream and must be read and discarded before new input can be read. An alternative method of reading input, which avoids this, is to use fgets and then examine the string read in. The last step can be done by sscanf, for example.
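A minimal sketch of the line-based alternative follows. The helper name read_int_line and the 128-character line buffer are assumptions; a longer line would be split across several calls.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` and tries to parse an int.
   Returns 1 on success, 0 if the line did not start with a number,
   and EOF at end of input. A bad line is consumed whole, so it cannot
   block later reads. */
int read_int_line(FILE *in, int *out)
{
    char line[128];

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    return sscanf(line, "%d", out) == 1;
}
```

Called with stdin, a malformed line simply yields 0 and the next call starts on a fresh line, so no separate discard step is needed.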

Security
Like printf, scanf is vulnerable to format string attacks. Great care should be taken to ensure that the format string includes limits for string and array sizes. In most cases the size of the input string from a user is arbitrary; it cannot be determined before the scanf function is executed. This means that uses of %s placeholders without length specifiers are inherently insecure and exploitable for buffer overflows. Another potential problem is allowing dynamic format strings, for example format strings stored in configuration files or other user-controlled files. In this case the allowed input length cannot be guaranteed unless the format string is checked beforehand and limits are enforced. Related to this are additional or mismatched formatting placeholders that do not match the actual vararg list. The arguments for such placeholders may be taken from the stack or elsewhere and treated as pointers to undesirable or even insecure locations, depending on the particular implementation of varargs.
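As a sketch of a length-limited string conversion, the hypothetical helper read_name below uses a width one less than the buffer size, so that the terminating null character always fits. The buffer size of 8 is an arbitrary choice for the example:

```c
#include <stdio.h>

#define NAME_SIZE 8

/* Hypothetical helper: copies the first word of `input` into `name`,
   which must have room for NAME_SIZE characters. The width in the format
   is NAME_SIZE - 1 so the terminating null always fits.
   Returns 1 if a word was read, 0 otherwise. */
int read_name(const char *input, char name[NAME_SIZE])
{
    return sscanf(input, "%7s", name) == 1;
}
```

Input longer than seven characters is truncated rather than written past the end of the buffer. The width has to be kept in step with NAME_SIZE by hand, because the format string cannot take it from the macro directly.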

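Another use, which reportedly works only on some special compilers, is:

scanf("Please enter a value %d",&n);

On those compilers it is said to print the string in quotes and stop to accept input at the indicated % signs. In standard C this does not happen: the text in the format string is matched against the input rather than printed, so the call stores nothing unless the user types that text before the number.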