The real treatment variable here is lots of exercise (the treatment group) versus not very much exercise (the control group), but it was OK to just say conductors (T) versus drivers (C). The outcome variable is whether or not the person got heart disease. It's an observational study because the subjects themselves, not the experimenters, decided who was going to be a driver and who was going to be a conductor. You cannot run a controlled experiment with a setup like this, since you can't force people to exercise a lot or not exercise a lot (equivalently, you can't force them to become conductors instead of drivers, or vice versa) just to see whether or not they get heart disease.

To be a potential confounding factor (PCF), a variable has to satisfy two criteria: it has to be capable of affecting the outcome by itself (in other words, an association between the PCF and the outcome has to be plausible), and it has to be possible for the treatment and control groups to differ on average with respect to the PCF (in other words, an association between the PCF and the treatment variable has to be plausible also). Age satisfies both criteria: as age goes up, the incidence of heart disease tends to increase, and it is certainly possible for the age distributions of the drivers and conductors to have been noticeably different. If the drivers had been a lot older than the conductors, for example, that could have gone a long way right there toward explaining why the control group exhibited more heart disease than the treatment group.
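
The age argument above can be made concrete with a small sketch. The counts below are invented purely for illustration (they are not data from the study), and the names `counts`, `crude_rate` and `stratum_rate` are hypothetical. The sketch assumes that within each age group drivers and conductors have exactly the same heart disease rate, but that the drivers are mostly older; the overall (crude) comparison then still makes the drivers look worse.

```python
# Hypothetical counts, made up for illustration only (NOT data from the study).
# Key: (age group, job) -> (number of people, number who got heart disease)
counts = {
    ("young", "conductor"): (800, 40),
    ("young", "driver"): (200, 10),
    ("old", "conductor"): (200, 40),
    ("old", "driver"): (800, 160),
}


def crude_rate(job):
    """Heart disease rate for one job, ignoring age entirely."""
    people = sum(n for (_, j), (n, _) in counts.items() if j == job)
    cases = sum(c for (_, j), (_, c) in counts.items() if j == job)
    return cases / people


def stratum_rate(age, job):
    """Heart disease rate for one job within a single age group."""
    people, cases = counts[(age, job)]
    return cases / people


# Comparison that ignores the PCF (age)
for job in ("conductor", "driver"):
    print(f"{job:9s} crude rate: {crude_rate(job):.2f}")

# Comparison that holds the PCF (age) fixed
for age in ("young", "old"):
    for job in ("conductor", "driver"):
        print(f"{age:5s} {job:9s} rate: {stratum_rate(age, job):.2f}")
```

In this made-up example the drivers' crude rate is more than double the conductors', yet within each age group the two jobs have identical rates, so the entire crude difference comes from age rather than from exercise. Comparing the groups within levels of a PCF like this is one standard way to check how much of an observed association the PCF could explain.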

Yes, an association between exercise and heart disease was established by this study; all you have to do to demonstrate an association is show that the group that got more exercise had less heart disease. But that doesn't mean that anything causal has been shown. Prior health status is a big PCF here -- maybe the kind of people who choose to be drivers do so because they think they would not be up to the rigors of the conductor job. Another PCF some people mentioned was job conditions -- if driving the bus was a lot more stressful than walking up and down the aisle punching tickets, for instance, and stress causes heart disease, how are we to figure out with this design whether it's the stress or the lack of exercise that was causal? (In fact, the investigators forgot to take a baseline health measure, but some data bearing indirectly on initial health status was later found: the uniform sizes of the two groups of people at the time they were hired. The drivers were noticeably heavier than the conductors to begin with, which suggests that the two groups already differed in health before any difference in on-the-job exercise could have had an effect.)