In this work, a schematic model for human activity recognition based on multiple cues is introduced. First, a sequence of temporal silhouettes of the moving human body parts is extracted from a video clip (i.e., an action snippet). Next, each action snippet is temporally split into several time-slices represented by fuzzy intervals. A variety of shape descriptors, both boundary-based (Fourier descriptors, curvature features) and region-based (moments, moment-based features), are then extracted from the silhouettes at each time-slice. Finally, a Naïve Bayes (NB) classifier is learned in the feature space for activity classification. The method was evaluated on the KTH dataset, and the results are encouraging, showing that accuracy on par with or exceeding that of existing methods is achievable. Furthermore, the simplicity and computational efficiency of the employed features allow the method to run in real time, making it suitable for latency-sensitive applications.
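To illustrate one of the boundary-based descriptors mentioned above, the following is a minimal sketch (not the authors' implementation; the function name, coefficient count, and normalization choices are our own assumptions) of computing translation- and scale-invariant Fourier descriptors from a silhouette contour:

```python
# Hypothetical sketch of boundary-based Fourier descriptors; not the paper's code.
import cmath
import math

def fourier_descriptors(boundary, n_coeffs=10):
    """Return magnitude-only Fourier descriptors of a closed contour.

    boundary -- list of (x, y) points tracing the silhouette outline.
    """
    # Encode each boundary point as a complex number x + iy.
    z = [complex(x, y) for x, y in boundary]
    N = len(z)
    # Discrete Fourier transform of the complex contour signal.
    c = [sum(z[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)) / N
         for k in range(n_coeffs + 1)]
    # c[0] encodes translation (dropped); dividing by |c[1]| removes scale.
    # Taking magnitudes discards phase, giving start-point/rotation invariance.
    return [abs(ck) / abs(c[1]) for ck in c[2:]]

# Toy usage: a circle sampled at 32 points. All higher-order descriptors
# of a perfect circle are numerically zero, and a scaled/translated copy
# yields the same descriptor vector.
circle = [(math.cos(2 * math.pi * n / 32), math.sin(2 * math.pi * n / 32))
          for n in range(32)]
shifted = [(3 * x + 5, 3 * y - 2) for x, y in circle]
```

In a full pipeline along the lines described, such descriptor vectors (concatenated with the region-based moment features per time-slice) would form the feature space in which the NB classifier is trained.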