A classic cluster is a number of computers grouped together so that they can share infrastructure, such as disk space, and work together by sharing program data while those programs are running. This definition is accurate, but it does not capture the full capability of a modern cluster system, because it leaves out a very important concept: the scheduling system, which has become the core of clustering in general. The functional purpose of the scheduling system is to eliminate the need to know what individual computers are doing.
When presented with multiple computers, you do not know what they are doing without individually checking them. Anything could be running on them, started by anybody who has access to them. If you want to run a program, you have to check each computer to see which, if any, has enough available resources (disk space, processors, memory) to run it. Not only is it inconvenient to check each computer manually, but if none of them has the resources available, you are forced to check again, manually, at a later time.
A scheduling system removes this need. By aggregating data and monitoring the cluster, a scheduler keeps an accurate, up-to-date picture of what resources are available and where. Beyond tracking resources, a scheduler also lets you submit instructions for running your program, and then runs the program on your behalf once the necessary resources are available.
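The idea can be made concrete with a small sketch. The Python below is a toy model only, not the interface of any real scheduler: the names `Node`, `Job`, `ToyScheduler`, `submit`, `dispatch`, and `finish` are all hypothetical, and the placement policy (first node with enough free processors and memory) is an assumption chosen for brevity. It shows the two roles described above: keeping a picture of free resources, and holding submitted jobs until they can run.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    """One computer in the cluster and the resources it currently has free."""
    name: str
    free_cpus: int
    free_mem_gb: int


@dataclass
class Job:
    """A request to run a program, stated as the resources it needs."""
    name: str
    cpus: int
    mem_gb: int


class ToyScheduler:
    """Hypothetical, simplified scheduler used for illustration only."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.queue = deque()   # submitted jobs that have not started yet
        self.placements = {}   # running job name -> node name

    def submit(self, job):
        """Accept a job; the user does not need to inspect any node."""
        self.queue.append(job)

    def _find_node(self, job):
        """Return the first node with enough free resources, or None."""
        for node in self.nodes:
            if node.free_cpus >= job.cpus and node.free_mem_gb >= job.mem_gb:
                return node
        return None

    def dispatch(self):
        """Start every queued job that fits somewhere; keep the rest waiting."""
        still_waiting = deque()
        while self.queue:
            job = self.queue.popleft()
            node = self._find_node(job)
            if node is None:
                still_waiting.append(job)
                continue
            # Reserve the resources so the picture of the cluster stays current.
            node.free_cpus -= job.cpus
            node.free_mem_gb -= job.mem_gb
            self.placements[job.name] = node.name
        self.queue = still_waiting

    def finish(self, job):
        """Release a finished job's resources back to its node."""
        node_name = self.placements.pop(job.name)
        node = next(n for n in self.nodes if n.name == node_name)
        node.free_cpus += job.cpus
        node.free_mem_gb += job.mem_gb


# Example: two nodes, three jobs, not enough room for all three at once.
sched = ToyScheduler([Node("node1", 4, 16), Node("node2", 2, 8)])
job_a = Job("a", cpus=4, mem_gb=8)
job_b = Job("b", cpus=2, mem_gb=8)
job_c = Job("c", cpus=4, mem_gb=8)
for job in (job_a, job_b, job_c):
    sched.submit(job)

sched.dispatch()
first_round = dict(sched.placements)
waiting_after_first = [job.name for job in sched.queue]

sched.finish(job_a)   # job "a" completes and frees node1
sched.dispatch()
second_round = dict(sched.placements)
```

In this run, jobs "a" and "b" start immediately on node1 and node2, while "c" waits in the queue. Once "a" finishes and its resources are released, the next dispatch starts "c" on node1 without anyone having to check the nodes by hand. Real schedulers add much more (priorities, time limits, fair sharing between users), none of which is modeled here.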