CNNs are used mainly on image data (though they have other applications), and RNNs are used for sequence learning.
A sequence of related images is a video. So a general answer would be: for video processing tasks.
But combining two such networks, each of which has very complex dynamics of its own, may bring out some beautiful traits that could be applied to other tasks. Since deep learning is still largely an empirical field, the only way to know is to try things out.
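
For the video case, one common way to wire the two together is to run the CNN on every frame and feed the resulting per-frame features to the RNN. Below is a minimal sketch of that idea in plain Python, assuming one convolution layer with global average pooling and a simple Elman-style RNN. The weights are hand-picked and untrained, and every name here (conv2d_valid, cnn_features, rnn_step, encode_video, make_frame) is hypothetical, chosen only for illustration.

```python
import math


def conv2d_valid(frame, kernel):
    """'Valid' 2D cross-correlation of a 2D list with a small 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(frame) - kh + 1
    out_w = len(frame[0]) - kw + 1
    return [
        [
            sum(frame[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def cnn_features(frame, kernels):
    """Conv layer + ReLU + global average pooling: one number per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        activations = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(activations) / len(activations))
    return feats


def rnn_step(h, x, w_xh, w_hh):
    """Elman RNN update (no bias): h_new = tanh(W_xh x + W_hh h)."""
    return [
        math.tanh(
            sum(w_xh[k][i] * x[i] for i in range(len(x)))
            + sum(w_hh[k][j] * h[j] for j in range(len(h)))
        )
        for k in range(len(h))
    ]


def encode_video(frames, kernels, w_xh, w_hh):
    """Run the CNN on each frame, then the RNN over the feature sequence."""
    h = [0.0] * len(w_hh)  # initial hidden state
    for frame in frames:
        h = rnn_step(h, cnn_features(frame, kernels), w_xh, w_hh)
    return h  # final hidden state summarizes the whole clip


def make_frame(edge_col, size=4):
    """Toy grayscale frame: dark on the left, bright from edge_col onward."""
    return [[1.0 if c >= edge_col else 0.0 for c in range(size)]
            for _ in range(size)]


# Toy "video": a vertical edge moving to the right over three frames.
frames = [make_frame(c) for c in (1, 2, 3)]

# Illustrative, untrained weights (assumptions, not learned values).
kernels = [
    [[-1.0, 1.0], [-1.0, 1.0]],    # responds to dark-to-bright vertical edges
    [[0.25, 0.25], [0.25, 0.25]],  # local average brightness
]
w_xh = [[0.5, -0.3], [0.2, 0.8], [-0.4, 0.1]]               # 3 hidden x 2 inputs
w_hh = [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0], [0.3, 0.0, 0.1]]  # 3 hidden x 3 hidden

video_code = encode_video(frames, kernels, w_xh, w_hh)
print(video_code)
```

The CNN part never sees more than one frame and the RNN part never sees pixels, so the final hidden state depends on the order of the frames only through the recurrence. In practice both parts would be trained jointly in a deep learning framework rather than written by hand like this.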