T-PCCE: Twitter Personality based Communicative Communities Extraction System for Big Data

Source of Publication

IEEE Transactions on Knowledge and Data Engineering


© 1989-2012 IEEE. Identifying social media communities has recently become a major concern, since users who participate in such communities can contribute to viral marketing campaigns. In this work, we focus on users' communication and treat personality as a key characteristic for identifying communicative networks, i.e., networks with high information flows. We describe the Twitter Personality based Communicative Communities Extraction (T-PCCE) system, which identifies the most communicative communities in a Twitter network graph by taking users' personality into account. We extend existing approaches to personality extraction by using machine learning techniques to aggregate data that represent several aspects of user behavior. We take an existing modularity-based community detection algorithm and extend it with a post-processing step that eliminates graph edges based on users' personality. We demonstrate the effectiveness of our approach by sampling the Twitter graph and comparing the communication strength of the extracted communities with and without the personality factor. We define several metrics to quantify the strength of communication within each community. Our algorithmic framework and its implementation run on cloud infrastructure and use the MapReduce programming environment. Our results show that the T-PCCE system extracts the most communicative communities.
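The post-processing idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the T-PCCE method itself: the abstract does not state the pruning rule or the metrics, so the sketch uses a single hypothetical personality score per user, a hypothetical threshold, and a hypothetical "messages per member" strength metric. The input community is assumed to come from some modularity-based detection step that is not shown, and the code runs in plain Python rather than MapReduce.

```python
from collections import defaultdict


def prune_edges(edges, personality, threshold):
    """Keep an edge only if both endpoints reach the threshold.

    Hypothetical rule: the abstract only says edges are eliminated
    based on users' personality, not how.
    """
    return [(u, v) for u, v in edges
            if personality[u] >= threshold and personality[v] >= threshold]


def split_community(community, kept_edges):
    """Return the connected parts of a community after pruning.

    Members left without any edge are dropped from the result.
    """
    adj = defaultdict(set)
    for u, v in kept_edges:
        if u in community and v in community:
            adj[u].add(v)
            adj[v].add(u)
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(adj[node] - part)
        seen |= part
        parts.append(part)
    return parts


def communication_strength(members, messages):
    """Hypothetical metric: messages exchanged inside the group per member."""
    if not members:
        return 0.0
    total = sum(count for (u, v), count in messages.items()
                if u in members and v in members)
    return total / len(members)


# Toy data (invented for illustration only).
edges = [("a", "b"), ("b", "c"), ("c", "d")]
personality = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.7}
messages = {("a", "b"): 10, ("b", "c"): 1, ("c", "d"): 0}
community = {"a", "b", "c", "d"}  # assumed output of the detection step

kept = prune_edges(edges, personality, threshold=0.5)
parts = split_community(community, kept)
before = communication_strength(community, messages)
after = [communication_strength(p, messages) for p in parts]
```

On this toy input, pruning leaves a smaller group whose per-member message count is higher than that of the original community, which mirrors the with/without comparison the abstract describes.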
