Description
- To build a solution to analyze the government and public attitude towards COVID governance.
- To analyze influencer score and use it to improve tweet ranking
- To enhance knowledge of building an end-to-end IR system, including relevancy model and analytics.
- To build a search engine and analytic web UI to present useful insights
- Based on Project 1, each student should have at least 40K tweets, so the total dataset size for each group would be 80K – 120K tweets
- You are free to collect more data.
Requirement 1 – Social Network Analysis
- Calculate Influencer Score: The idea is to generate a score for each tweet, based on its potential for influence. This can be achieved in a direct or an indirect way:
- Direct approach: Use the number of retweets or likes for each tweet as a proxy for the unnormalised influence score.
- Indirect approach: You can collect the number of followers that a person has, and assign equal weights to all his/her tweets, as a proxy for the influence score.
- Suggestions on using Influencer Score
- You can weigh your KPIs based on the normalised influence score.
- You can create a social network amongst all the actors (people) that you have in your dataset, treat the influencer score as the starting score, and calculate a PageRank score for each actor.
- You can use the normalised influence score or PageRank score to reorder the documents that are retrieved by your search engine in the UI.
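As an illustration of the suggestions above, the following is a minimal sketch (not a prescribed implementation) of min-max normalisation and a PageRank computed by power iteration, seeded with the influencer scores. The function names, the edge direction (an edge points from the actor who retweets or mentions to the actor being retweeted or mentioned), the damping factor of 0.85 and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def normalise(scores):
    """Min-max normalise a dict of raw scores into the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def pagerank(edges, start, damping=0.85, iterations=50):
    """PageRank by power iteration, seeded with starting scores.

    edges: list of (source, target) pairs (assumed: source retweets target)
    start: dict mapping actor -> non-negative starting score
    """
    nodes = set(start) | {node for edge in edges for node in edge}
    total = sum(start.get(node, 0.0) for node in nodes)
    # The teleport distribution is proportional to the starting scores;
    # it falls back to uniform if every starting score is zero.
    if total > 0:
        teleport = {node: start.get(node, 0.0) / total for node in nodes}
    else:
        teleport = {node: 1.0 / len(nodes) for node in nodes}

    out_links = defaultdict(list)
    for source, target in edges:
        out_links[source].append(target)

    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by actors without outgoing edges is redistributed
        # according to the teleport distribution.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - damping + damping * dangling) * teleport[node]
            for node in nodes
        }
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical follower counts (indirect approach) and retweet edges.
followers = {"a": 100, "b": 1000, "c": 10}
start_scores = normalise(followers)
ranks = pagerank([("a", "b"), ("c", "b"), ("b", "a")], start_scores)
```

The resulting ranks sum to 1 and can be attached to each actor's tweets as a reordering signal.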
Requirement 2 – Content/Topic Analysis
- Compare the number of COVID-related and non-COVID-related tweets made by the POIs of each country, and correlate it with the COVID curve in that country. Is there any correlation between what POIs are tweeting and the COVID curve in the country?
- For each country, perform topic analysis on all the tweets to extract the main topics people are concerned about
- Use your own creativity to come up with more high-level analyses
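One possible way to approach the correlation question above is sketched below: count COVID-related tweets per day and compute a Pearson correlation against the daily case counts. The keyword list, the tweet fields "date" and "text", and the shape of the case data are assumptions made for this example; a keyword match is only a crude stand-in for a proper COVID/non-COVID classifier.

```python
from collections import Counter
from math import sqrt

# Illustrative keyword list only; a real project would need a broader one.
COVID_KEYWORDS = ("covid", "coronavirus", "vaccine", "lockdown", "quarantine")


def is_covid(text):
    """Crude COVID detector: true if any keyword occurs as a substring."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in COVID_KEYWORDS)


def daily_covid_counts(tweets):
    """Count COVID-related tweets per date.

    tweets: iterable of dicts with (assumed) keys "date" and "text".
    """
    counts = Counter()
    for tweet in tweets:
        if is_covid(tweet["text"]):
            counts[tweet["date"]] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def correlate_with_curve(tweets, daily_cases):
    """Correlate daily COVID tweet counts with a dict of date -> new cases."""
    counts = daily_covid_counts(tweets)
    dates = sorted(daily_cases)
    # Dates with no COVID tweets count as zero.
    return pearson([counts[d] for d in dates], [daily_cases[d] for d in dates])
```

Running correlate_with_curve once per country gives one coefficient per country. A lagged version (shifting the tweet series by a few days) may be more informative, since tweets and case counts need not move on the same day.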
Requirement 3 – Insights/Analytics
- Main purpose is to show insights based on the outputs of requirements 1 and 2
- You can do additional processing such as sentiment analysis, location analysis, keyword analysis, etc.
- You can ingest additional data such as news articles or YouTube videos. E.g., extract news articles that discuss incidents that could be related to the POIs' tweets on COVID.
- Decide on appropriate visualizations (charts, graphs, maps)
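As one small example of the keyword analysis mentioned above, the sketch below counts the most frequent hashtags per country; the output could feed a bar chart or word cloud. The tweet fields "country" and "text" and the function name are assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(tweets, k=3):
    """Return the k most frequent hashtags (lower-cased) for each country.

    tweets: iterable of dicts with (assumed) keys "country" and "text".
    """
    per_country = defaultdict(Counter)
    for tweet in tweets:
        tags = (tag.lower() for tag in HASHTAG.findall(tweet["text"]))
        per_country[tweet["country"]].update(tags)
    return {country: counts.most_common(k) for country, counts in per_country.items()}
```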
Requirement 4 – Faceted Search
- Create a webpage to perform search operations on your indexed data
- Ideally, the left side of the web page should render faceted search functionality. There should also be a search bar at the top of the page, like Google search, where you can search your dataset by keyword.
- In order to show facets, you may need to do named entity tagging or topic generation
- You may also implement ranking based on Influencer Score
- You are encouraged to implement more search-based functionality and demo various interesting searches
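To make the interplay of keyword search, facets and influencer-based ranking concrete, here is a minimal in-memory sketch. In a real system the search engine would handle matching and facet counting; the document fields ("text", "country", "lang", "influence"), the term-count relevance measure and the mixing weight alpha are assumptions made for this example.

```python
from collections import Counter

FACET_FIELDS = ("country", "lang")  # hypothetical facet fields


def search(docs, query, filters=None, alpha=0.7):
    """Keyword search with facet filters, reranked by influence.

    docs: list of dicts with (assumed) keys "text", "country", "lang" and
          "influence" (a normalised score in [0, 1]).
    filters: optional dict of facet field -> required value.
    alpha: weight of text relevance versus influence in the final score.
    Returns (ranked documents, facet counts over the results).
    """
    filters = filters or {}
    terms = query.lower().split()
    hits = []
    for doc in docs:
        if any(doc.get(field) != value for field, value in filters.items()):
            continue
        text = doc["text"].lower()
        relevance = sum(text.count(term) for term in terms)
        if relevance > 0:
            hits.append((doc, relevance))
    if not hits:
        return [], {}

    max_relevance = max(relevance for _, relevance in hits)

    def score(hit):
        doc, relevance = hit
        return alpha * relevance / max_relevance + (1 - alpha) * doc["influence"]

    results = [doc for doc, _ in sorted(hits, key=score, reverse=True)]
    facets = {field: Counter(doc[field] for doc in results) for field in FACET_FIELDS}
    return results, facets
```

The facet counts returned alongside the results are what the left-hand panel would display; clicking a facet value corresponds to calling search again with that value added to filters.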



