Designed and developed the architecture for the SocialGist.com product, capable of handling millions of streamed records from various social media websites, including Chinese social networks such as Sina Weibo, Tencent, and RenRen. Once the data is fetched, it passes through enrichment processes such as Named Entity Recognition and Sentiment Analysis, after which it is sent to an ElasticSearch cluster.
SocialGist customers can download all the data in bulk or stream it in near real time. They can also choose between the raw, unprocessed feed and the enriched feed.
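
The fetch, enrich, and deliver flow described above can be illustrated with a minimal Python sketch. It is not the production code: the names Post, tag_entities, score_sentiment, and enrich are hypothetical, and the two enrichment stages are toy stand-ins for real Named Entity Recognition and Sentiment Analysis models. The Kafka and ElasticSearch hops are omitted so the example runs with the standard library only.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Iterable, Iterator, List, Optional, Sequence


@dataclass
class Post:
    """One fetched item; the enrichment fields stay empty on the raw feed."""
    source: str
    text: str
    entities: List[str] = field(default_factory=list)
    sentiment: Optional[str] = None


# Tiny illustrative word lists (assumption: a real system would use a model).
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def tag_entities(post: Post) -> Post:
    # Toy stand-in for NER: treat every capitalized token as an entity.
    tokens = (w.strip(".,!?") for w in post.text.split())
    post.entities = [t for t in tokens if t[:1].isupper()]
    return post


def score_sentiment(post: Post) -> Post:
    # Toy stand-in for sentiment analysis: keyword lookup.
    words = {w.strip(".,!?").lower() for w in post.text.split()}
    if words & POSITIVE:
        post.sentiment = "positive"
    elif words & NEGATIVE:
        post.sentiment = "negative"
    else:
        post.sentiment = "neutral"
    return post


Stage = Callable[[Post], Post]


def enrich(posts: Iterable[Post],
           stages: Sequence[Stage] = (tag_entities, score_sentiment)) -> Iterator[Post]:
    """Lazily run each post through every stage, leaving the raw post untouched."""
    for post in posts:
        enriched = replace(post)  # copy, so the raw feed can still be served as-is
        for stage in stages:
            enriched = stage(enriched)
        yield enriched
```

Because enrich is a generator that works on copies, the same input can back both a raw feed and an enriched feed, and a consumer can read it item by item (streaming) or collect it into a list (bulk).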
Tech: Python, Kafka, Cassandra, ElasticSearch, Solr.