The South Hub community is a place for members of the Hub to share their research and initiatives with the community, and to learn about regional and national initiatives in smart, connected and resilient cities and communities.
If you would like to give a presentation to the group about your work in smart and connected communities / cities, or if you would like to volunteer to help us organize this community, please contact firstname.lastname@example.org and note "Smart Cities Working Group" in the subject line.
To see or contribute resources about Smart Cities projects, please add to our list here: SMART CITY RESOURCES
The Data Sharing and Infrastructure working group has enlisted the help of members from the South region and is working in collaboration with the Midwest, West, and NE Big Data Hubs, including representatives from the National Data Service, XSEDE, DataNet Federation, and iRODS Consortium. The WG will conduct a requirements analysis of Hub spokes and members, map existing assets, schedule demos of key components for a federated system, and, through a testbed, demonstrate integration of iRODS, NDS Labs, XSEDE, and Discovery Environment. The WG meets every other week on Fridays from 3:00 – 4:30 PM EDT.
April 28: Presenter: Vas Vasiliadis/Globus: Simplifying Research Data Management via SaaS
April 28: Presenter: Vani Mandava & Jeff Prosise/Azure for Research: Bringing the Power of the Cloud to Big Data
Vas Vasiliadis: Globus: Simplifying Research Data Management via SaaS | Globus is software-as-a-service for research data management. Our goal is to make it easy for researchers to manage their data throughout its lifecycle, using just a web browser to move, share, and publish data, directly from your own storage systems. Globus provides secure, reliable, high-performance file transfer, the ability to share files with collaborators, and flexible workflows for identifying, describing, curating, and publishing data sets. Since its launch at SC10, the service has been deployed at hundreds of research institutions across the US and abroad. In this talk, we will provide an introductory overview and demonstration of Globus, and describe recent enhancements that bring additional capabilities to both researchers and research computing administrators.
Vas Vasiliadis is Director of Products at the Computation Institute, a joint initiative of the University of Chicago and Argonne National Laboratory, and is the Chief Customer Officer for Globus. In this capacity, Vas works with the research computing community to promote and deliver the use of Globus services for research data management. Vas is also a lecturer in the Masters Program in Computer Science where he teaches courses on Cloud Computing and Product Management. Prior to joining the University Vas held various executive management positions in software companies and large IT consultancies.
Vani Mandava & Jeff Prosise: Azure for Research: Bringing the Power of the Cloud to Big Data | In the early days of cloud computing, the “cloud” was mostly a place to park data and spin up virtual machines. Today it is much more. In addition to providing access to traditional storage and compute resources on Linux as well as Windows, Microsoft Azure includes extensive infrastructure options and dozens of services that help researchers build intelligent services and applications from large analytical workloads. These include GPU-equipped HPC clusters with InfiniBand networking, Spark and Hadoop clusters containing thousands of cores, Data Lake Store and analytical services that support R/Python-based massively parallel data transformation, machine learning and stream processing as a service, native Docker container clusters, and cognitive APIs for extracting information from images, text, and other media. Learn what Azure is, how it is being used in the research community, and how to apply for Azure credits to bolster your research.
Vani Mandava is Director of the Data Science for Research effort at Microsoft Research at Redmond. She has over a decade of experience designing and shipping software projects and features that are in use by millions of users across the world. Her role in the Microsoft Research Outreach team is to enable academic researchers and institutions to develop technologies that fuel data-intensive scientific research using advanced techniques in data management and data mining. She has enabled the adoption of data mining best practices in various Version1 products across Microsoft client, server and services in the productivity (Office) and online advertising (Bing Ads) space. She holds patents in service infrastructure and, after spending the bulk of her career in engineering, finds it fascinating to work alongside researchers and scientists.
Jeff Prosise is a co-founder of Wintellect and a software developer who has written nine books and hundreds of magazine articles, trained thousands of developers at Microsoft, and spoken at some of the world’s largest software conferences. In his former life as a mechanical and aerospace engineer, Jeff worked at Oak Ridge National Lab and Lawrence Livermore National Lab, where, among other things, he developed software that combined thermal and structural finite-element methods to model optical systems for high-power laser beams. Today he is thrilled (and somewhat amazed) that with a few button clicks in the Azure Portal, he can spin up an HPC cluster almost as powerful as the Crays he used to work on.
Participants [Please add your name and email]:
Le Song is an assistant professor in the Department of Computational Science and Engineering, College of Computing, Georgia Institute of Technology. He received his Ph.D. in Machine Learning from University of Sydney and NICTA in 2008, and then conducted his post-doctoral research in the Department of Machine Learning, Carnegie Mellon University, between 2008 and 2011. Before he joined Georgia Institute of Technology in 2011, he was briefly a research scientist at Google. His principal research direction is machine learning, especially kernel methods and probabilistic graphical models for large scale and complex problems, arising from artificial intelligence, network analysis, computational biology and other interdisciplinary domains. He is the recipient of the Recsys’16 Deep Learning Workshop Best Paper Award, AISTATS'16 Best Student Paper Award, IPDPS'15 Best Paper Award, NSF CAREER Award’14, NIPS’13 Outstanding Paper Award, and ICML’10 Best Paper Award. He has also served as an area chair or senior program committee member for many leading machine learning and AI conferences such as ICML, NIPS, AISTATS and AAAI, and as an action editor for JMLR.
Structured data, such as sequences, trees, graphs and hypergraphs, are prevalent in a number of real world applications such as social network analysis, recommendation systems and knowledge base reasoning. The availability of large amounts of such structured data has posed great challenges for the machine learning community. How can we represent such data to capture their similarities or differences? How can we learn predictive models from a large amount of such data, and do so efficiently? How can we learn to generate structured data de novo given certain desired properties?
In this talk, I will present a structure embedding framework (Structure2Vec), an effective and scalable approach for representing structured data based on the idea of embedding latent variable models into a feature space, and learning such a feature space using discriminative information. Interestingly, Structure2Vec extracts features by performing a sequence of nested nonlinear operations in a way similar to graphical model inference procedures, such as mean field (or convolution over graph) and belief propagation. In large scale applications involving materials design, recommendation systems and knowledge reasoning, Structure2Vec consistently produces state-of-the-art predictive performance. In some cases, Structure2Vec is able to produce a more accurate model that is also 10,000 times smaller.
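As a rough illustration of the "sequence of nested nonlinear operations" described in the abstract, here is a minimal sketch of a mean-field-style embedding over a graph. It is not the speaker's implementation: the function names, the ReLU nonlinearity, the fixed number of rounds, and the hand-set weight matrices are all assumptions chosen for brevity, and in practice the weights would be learned from labels.

```python
import numpy as np

def mean_field_embed(adj, feats, w_node, w_nbr, rounds=3):
    """Hypothetical helper: one embedding row per node.

    adj    : (n, n) adjacency matrix of the graph
    feats  : (n, d) raw node features
    w_node : (d, k) weights applied to a node's own features (assumed given)
    w_nbr  : (k, k) weights applied to the summed neighbor embeddings (assumed given)
    """
    n = adj.shape[0]
    mu = np.zeros((n, w_node.shape[1]))
    for _ in range(rounds):
        # Each round nests one more nonlinear step: a node combines its own
        # features with the sum of its neighbors' current embeddings.
        mu = np.maximum(0.0, feats @ w_node + adj @ mu @ w_nbr)
    return mu

def graph_embedding(adj, feats, w_node, w_nbr, rounds=3):
    """Sum node embeddings into a single vector for the whole structure."""
    return mean_field_embed(adj, feats, w_node, w_nbr, rounds).sum(axis=0)

if __name__ == "__main__":
    # A three-node path graph with one scalar feature per node.
    adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
    feats = np.array([[1.0], [2.0], [3.0]])
    w_node = np.array([[1.0, 0.5]])
    w_nbr = 0.5 * np.eye(2)
    print(graph_embedding(adj, feats, w_node, w_nbr))
```

Because the final step sums over nodes, the resulting vector does not depend on how the nodes are ordered, which is one reason this style of embedding suits graphs and other structured inputs.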
Leading to finding more relevant nodes: Apolo (Machine learning + interactive Vis)
Le Song: Machine Learning in Recommendation Systems, Knowledge Graphs, and Materials Research. Can the same approach be used in all three?
Yes, using graph vectorization as an alternative to matrix vectorization
Resource List, White Paper, and Video Archive Just Released! Links Below!
Resources List for Materials Informatics
This GitHub repository was created after the Materials and Advanced Manufacturing Workshop, from combined participant input. Courtesy of Andrew Medford, Mark Jack and Jason Hattrick-Simpers. If you know of resources for materials informatics, feel free to become a contributor.
High Impact Applications of Data Science for Materials & Manufacturing
This white paper summarizes expert opinions from industry, academic, and government partners of the South Big Data Hub. It focuses on the impact of addressing data challenges in the design of materials and in the process of advanced manufacturing.
The Big Data in Health theme community is an open forum to discuss data solutions and issues facing modern healthcare. This pad text is synchronized as you type, so that everyone viewing this page sees the same text. Please feel free to add to the discussion.
If you would like to give a presentation to the group about your work in health analytics, health disparities, or health economy, or if you would like to volunteer to help organize this community, please contact Renata Rawlings-Goss at email@example.com
Applications of Analytics and Machine Learning in Energy Industry-Academia Workshop
The goal is to connect industry partners with academic researchers in the Energy domains (Power, Smart Grid, etc.) as well as in Big Data and Data Science. Speakers will be specifically selected to share their perspective on high-impact applications or challenges surrounding the use of data science, analytics, informatics, and machine learning in the Energy space. Attendees will come from academic research institutions across the 16 states that comprise the South Big Data Innovation Hub and from industrial partners across the country. Participants will engage in active scoping and round-table discussions in order to build partnerships across high-impact application verticals.
Georgia Institute of Technology, Klaus Advanced Computing Building, Atlanta, GA
The South Big Data Innovation Hub accelerates partnerships among people in business, academia, and government that apply data science and analytics to societal and economic challenges important to the region. The South Hub is part of a national network of four Big Data Innovation Hubs in the United States, and individually includes more than 500 members from both the private and public sectors.
Video Now Available for Each Session! Click Link Below: