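ASCII uses 7 bits to represent characters, giving 128 possible characters, which is enough for standard English letters, digits, and punctuation. Each character is usually stored in an 8-bit byte, and extended 8-bit character sets use the extra bit to allow 256 characters. For example, the letter A is 01000001 (65 in decimal). Unicode covers far more characters: its original 16-bit design allowed 65,536, and the current standard has room for over a million code points, stored using encodings such as UTF-8 and UTF-16.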
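As a rough illustration of how characters map to binary, the Python sketch below prints the bit pattern for a character. The helper name `to_bits` is hypothetical and chosen only for this example; `ord`, `format`, and `str.encode` are standard Python built-ins.

```python
def to_bits(ch: str, width: int = 8) -> str:
    """Return the code point of a single character as a zero-padded binary string."""
    return format(ord(ch), f"0{width}b")


# The letter A has code 65, which is 01000001 when written as 8 bits.
print(ord("A"))      # 65
print(to_bits("A"))  # 01000001

# The euro sign lies outside the ASCII range, so it needs a Unicode encoding.
# Its code point is 8364, and UTF-8 stores it in 3 bytes.
print(ord("€"))                  # 8364
print(len("€".encode("utf-8")))  # 3
```

Characters in the ASCII range keep the same numeric codes in Unicode, which is why the letter A gives the same result either way.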
Images are made up of pixels. In a black and white image, each pixel needs only 1 bit: 1 represents a black pixel and 0 represents a white pixel. In a color image, each pixel's color is represented by the RGB color system, with each color channel (red, green, blue) using a byte, so each pixel takes 24 bits. The resolution of the image (its width and height in pixels) determines the number of pixels, and together with the number of bits per pixel it determines the uncompressed file size.
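The link between resolution, bits per pixel, and file size can be sketched in Python as below. This assumes an uncompressed image with no header or metadata; the function name `uncompressed_size_bytes` and the 800 x 600 example resolution are hypothetical choices made for illustration.

```python
def uncompressed_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data in bytes (assumes no compression or file header)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole number of bytes


# Black and white: 1 bit per pixel.
print(uncompressed_size_bytes(800, 600, 1))   # 60000

# RGB color: one byte per channel, so 24 bits per pixel.
print(uncompressed_size_bytes(800, 600, 24))  # 1440000

# A single pure-red RGB pixel written out as its three bytes in binary.
red, green, blue = 255, 0, 0
print(format(red, "08b"), format(green, "08b"), format(blue, "08b"))
# 11111111 00000000 00000000
```

Real image files are usually smaller than this because formats such as PNG and JPEG compress the pixel data.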
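Sound is recorded by a microphone as a continuous signal, which is sampled at set time intervals; each sample is then converted to binary. The sampling rate is the number of samples taken per second and is measured in hertz (Hz). For example, a telephone call is sampled at 8,000 Hz, and a CD is sampled at 44,100 Hz. The closer together the samples are (that is, the higher the sampling rate), the more accurately the original sound is captured, but the larger the resulting file.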
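Sampling can be sketched in Python by measuring a sine wave at regular intervals and storing each measurement as a whole number. This is only an illustration: the function name `sample_sine`, the 440 Hz test tone, and the 8 bits per sample are assumptions made for the example and do not describe any particular telephone or CD format.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: int, duration_s: float, bits: int = 8) -> list[int]:
    """Sample a sine wave and quantize each sample to an integer level (assumed bit depth)."""
    num_samples = int(sample_rate_hz * duration_s)
    max_level = 2 ** bits - 1
    samples = []
    for i in range(num_samples):
        t = i / sample_rate_hz                            # time of this sample in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # value between -1 and 1
        level = round((amplitude + 1) / 2 * max_level)    # map to 0..max_level
        samples.append(level)
    return samples


phone = sample_sine(440, 8000, 1.0)
cd = sample_sine(440, 44100, 1.0)

# One second of sound yields as many samples as the sampling rate.
print(len(phone))  # 8000
print(len(cd))     # 44100

# Each sample is stored in binary, here as 8 bits.
print([format(s, "08b") for s in phone[:4]])
```

The higher rate produces more than five times as many samples for the same second of sound, which is where both the extra quality and the extra storage come from.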