Dr. Polo Chau is an Assistant Professor at Georgia Tech’s School of Computational Science and Engineering, and an Associate Director of the MS Analytics program. He holds a PhD in Machine Learning and a Master's in human-computer interaction (HCI). His PhD thesis won Carnegie Mellon’s Computer Science Dissertation Award, Honorable Mention. His research group at Georgia Tech bridges data mining and HCI, innovating at their intersection to synthesize scalable, interactive tools that help people understand and interact with big data. He leads the popular annual IDEA workshop, which catalyzes cross-pollination across HCI and data mining. He served as general chair for ACM IUI 2015 and is a steering committee member of the conference. His research group has created novel detection technologies for malware (patented with Symantec, protects 120M+ people), auction fraud (WSJ, CNN, MSN), comment spam (patented & deployed with Yahoo), fake reviews (SDM’14 Best Student Paper), insider trading (SEC), and unauthorized mobile device access (Wired, Engadget), as well as fire risk prediction (KDD’16 Best Student Paper, runner up). He received faculty awards from Google, Yahoo, and LexisNexis. He also received the Raytheon Faculty Fellowship, the Edenfield Faculty Fellowship, and the Outstanding Junior Faculty Award. He is the only two-time Symantec fellow.
At the Polo Club of Data Science, we innovate at the intersection of data mining and human-computer interaction (HCI), combining the best from both worlds to synthesize scalable interactive tools for making sense of billion-scale graph data. I will present some of our latest systems:
(1) Visage: an interactive visual graph querying approach that empowers users to construct expressive queries, without writing complex code (e.g., finding money laundering rings of bankers and business owners).
(2) Facets & Apolo: combine machine inference and visualization to guide the user to interactively explore large graphs. The user gives examples of relevant nodes, and the systems recommend interesting and surprising areas the user may want to see next.
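To make the "give examples, get recommendations" idea concrete, here is a minimal sketch that ranks nodes by their proximity to user-chosen example nodes. It uses random walk with restart, a generic proximity measure chosen here for illustration; it is not claimed to be the inference method used by Facets or Apolo. The function name `rank_by_examples` and its parameters are hypothetical.

```python
from collections import defaultdict


def rank_by_examples(edges, examples, restart=0.15, iterations=50):
    """Rank non-example nodes by proximity to the user's example nodes.

    Illustrative only: random walk with restart, computed by power iteration.
    """
    # Build an undirected adjacency list from the edge list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = list(neighbors)

    # The walker restarts uniformly at one of the example nodes.
    restart_mass = {
        n: (1.0 / len(examples) if n in examples else 0.0) for n in nodes
    }
    score = dict(restart_mass)

    for _ in range(iterations):
        nxt = {n: restart * restart_mass[n] for n in nodes}
        for n in nodes:
            # Spread the remaining probability evenly over the neighbors.
            share = (1.0 - restart) * score[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        score = nxt

    # Most relevant candidates first; the examples themselves are excluded.
    candidates = [n for n in nodes if n not in examples]
    return sorted(candidates, key=lambda n: -score[n])


# Usage: on a simple path graph, nodes closer to the example rank higher.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
ranking = rank_by_examples(path, examples={"a"})
```

In an interactive tool, the user would add or remove examples and the ranking would be recomputed, which is what keeps the human in the loop.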
Notes: (Feel free to add questions, links or ideas below)
Polo Chau presentation: analysis of big data as well as small data to understand the challenges. Although we have access to a large amount of data, the question remains: how do we understand all of it? Our approach is to integrate data mining with human-computer interaction. Both fields develop methods for understanding data, and we work to combine them to get the best of both worlds: scalable and interactive tools.
Human-in-the-Loop Graph Analysis
Detecting Fake Yelp Reviews: 6+ common venues within days
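The note above is terse, so the following is only one possible reading of it: flag pairs of reviewers who reviewed at least six of the same venues within a few days of each other. The sketch below is an assumption-laden illustration, not the method presented in the talk; the function name `suspicious_pairs`, the `(reviewer, venue, day)` input format, and the 7-day default window are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def suspicious_pairs(reviews, min_common=6, window_days=7):
    """Find reviewer pairs sharing at least `min_common` venues.

    `reviews` is an iterable of (reviewer, venue, day) tuples, where `day`
    is an integer day index. A venue counts as shared when both reviewers
    reviewed it within `window_days` of each other (assumed threshold).
    """
    # Group reviews by venue: venue -> list of (reviewer, day).
    by_venue = defaultdict(list)
    for reviewer, venue, day in reviews:
        by_venue[venue].append((reviewer, day))

    # (reviewer_a, reviewer_b) -> set of venues both reviewed in the window.
    common = defaultdict(set)
    for venue, entries in by_venue.items():
        for (r1, d1), (r2, d2) in combinations(entries, 2):
            if r1 != r2 and abs(d1 - d2) <= window_days:
                common[tuple(sorted((r1, r2)))].add(venue)

    return {
        pair: venues
        for pair, venues in common.items()
        if len(venues) >= min_common
    }


# Usage with made-up data: alice and bob co-review six venues a day apart.
reviews = []
for i in range(6):
    reviews.append(("alice", f"venue{i}", i))
    reviews.append(("bob", f"venue{i}", i + 1))
reviews.append(("carol", "venue0", 0))
flagged = suspicious_pairs(reviews)
```

Expressed as a graph pattern, this is the kind of query (two reviewers connected through six or more shared venues) that is tedious to write by hand, which motivates the question in the next note.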
But, is it easy to build graph queries?
Query formulation is an incremental process.
VISAGE uses graph autocomplete, which helps prevent over-specification.
Comparison: VISAGE (faster) vs. Cypher
Leading to finding more relevant nodes: Apolo (machine learning + interactive vis)
Audio Data Analysis | Thursday, January 19th, 1:00-2:00 pm EST
Join this open panel discussion if you are a researcher or company working on audio data, in any sector. We will be discussing topics like 3D audio, sonification, speech recognition, and more. It is a two-way discussion, so tell us about yourself.
Text Data Analysis | Thursday, December 8th, 1:00-2:00 pm EST
Join this open panel discussion if you are a researcher or company working on text mining, in any sector. We will be discussing topics like web scraping, the semantic web, analysis tools in R and Python, and the benefits of open source search engines such as Solr and Elasticsearch, as well as current industry search options. It is a two-way discussion, so tell us about yourself.